[SGVLUG] Polling Web Sites

bb.odenthal at gmail.com
Sat Nov 10 22:30:33 PST 2007


John,

I may be oversimplifying this, but a web "search" is usually just a GET or POST request submitted by an HTML form.  If you can take a packet trace of the transaction (assuming it's not SSL), it's easy to discover the URL format and method for the search.  If it turns out to be a GET, running "lynx -dump" on that URL under "watch" every 120 seconds could be helpful (assuming a text-only version of the web page would be of any use to you):

$ watch -n 120 "lynx -dump 'http://foo.com/search?bar=san_gabriel_valley'"
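
If a full text dump every two minutes is more than you want to look at, the same polling idea can be sketched in Python so that it only speaks up when the page changes.  This is a rough sketch, not a tested tool: the URL and the "bar" parameter are the placeholders from the command above, and build_search_url, fingerprint and poll are names I made up for illustration.  It also assumes the page body stays byte-for-byte identical between real changes; a page that embeds a clock or a session token would look "changed" on every fetch.

```python
import hashlib
import time
import urllib.error
import urllib.parse
import urllib.request


def build_search_url(base, params):
    """Append URL-encoded query parameters to a base URL (a GET search)."""
    return base + "?" + urllib.parse.urlencode(params)


def fingerprint(body):
    """Return a SHA-256 hex digest of the page body (bytes)."""
    return hashlib.sha256(body).hexdigest()


def poll(url, interval=120):
    """Fetch url every `interval` seconds and print a note when it changes."""
    last = None
    while True:
        try:
            with urllib.request.urlopen(url, timeout=30) as resp:
                current = fingerprint(resp.read())
        except urllib.error.URLError as err:
            # Network hiccup: report it and try again on the next round.
            print(time.strftime("%H:%M:%S"), "fetch failed:", err)
        else:
            if last is not None and current != last:
                print(time.strftime("%H:%M:%S"), "page changed:", url)
            last = current
        time.sleep(interval)
```

Calling poll(build_search_url("http://foo.com/search", {"bar": "san_gabriel_valley"})) would start the loop; it runs until interrupted with Ctrl-C.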

If the site requires more interaction than that (login, password, clicking on a few links, filling out a form) or requires cookies, then I suggest using a Perl script, maybe with WWW::Mechanize for some simple HTML form automation.
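
To give a feel for what "login, cookies, form" automation involves, here is a rough Python standard-library analogue.  It is not WWW::Mechanize and does much less (it doesn't parse forms or follow links by name); it only shows a cookie-keeping session that POSTs two forms in a row.  Everything specific here is hypothetical: the function names, the URLs you would pass in, and the form field names, which would have to be read out of the real page's HTML source.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Build an opener that stores and resends cookies, like a browser session."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def encode_form(fields):
    """URL-encode form fields into the bytes body of a POST request."""
    return urllib.parse.urlencode(fields).encode("ascii")


def login_and_search(opener, login_url, credentials, search_url, query):
    """POST the login form, then POST the search form in the same session."""
    # Passing data= makes urllib send a POST instead of a GET.
    with opener.open(login_url, data=encode_form(credentials), timeout=30) as resp:
        resp.read()  # any session cookie set here is kept in the jar
    with opener.open(search_url, data=encode_form(query), timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

A call might look like login_and_search(opener, "https://example.com/login", {"user": "...", "password": "..."}, "https://example.com/search", {"district": "..."}), with the real URLs and field names substituted in.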

**I'm putting on my Nomex jacket**

Or... just spend $30 on http://www.newdigitalsoft.com/airobot/ or similar and use a Windows box?  It IS an option.

-bb
-----Original Message-----
From: juanslayton at dslextreme.com

Date: Sat, 10 Nov 2007 21:37:04 
To:sgvlug at sgvlug.net
Subject: [SGVLUG] Polling Web Sites



     Got a little project here that I could use some help on.  El Monte
City School District uses a program called Aesop to post daily
openings for substitute teachers.  All I have to do is go to their
web site and click on the search button and I can see who has
currently called in to be absent.  Trouble is, if someone calls in
sick just after I've checked, I won't find out about it until the
next time I check.  And I have better things to do than sit and click
on the search button all evening.
     So I began to figure out ways to poll that site automatically.  The
current approach works like this:  A timing program (written in C)
runs in the background on a virtual terminal and produces a negative
pulse on data line 1 of the parallel port every few minutes.  I 'hot
wired' the left click switch (high, pull-down side) of a USB mouse to
that data line (through a diode to protect the port in case someone
physically clicks the mouse).  By leaving the cursor on the search
button, the background program electronically clicks that button
every few minutes.  All I have to do as I go about my business is
glance at the screen every now and then to see if anything new has
come up.
     But this is over-complicated.  There ought to be a simple way to poll
that page programmatically without messing with the hardware.  Say, by
using the usb event mechanisms?  Like as not somebody somewhere has
already written code to do it.  I'd appreciate anyone who could point
me in the right direction.

John

***************************************************************************************
If the mind is not constrained by walls and fences, where is the need for
Windows and Gates?

