[SGVLUG] Polling Web Sites

Matthew Gallizzi matthew.gallizzi at gmail.com
Sun Nov 11 01:32:13 PST 2007


I would use curl. If the site requires a login and password, you can use
curl to POST the credentials to the login URL, save the resulting cookie to
a file, and then fetch the page you want to look at using that cookie. If I
recall correctly, the man page covers all of this fairly well.
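
A rough sketch of that sequence, in case it helps. The URLs, the form field
names (username, password), the cookie file name, and the POLL_USER and
POLL_PASS variables are all placeholders I made up; the real ones have to
come from the site's login form. Only the curl options themselves
(--cookie-jar, --cookie, --data-urlencode) are standard.

```shell
#!/bin/sh
# Sketch only: log in with curl, keep the session cookie, poll a page.
# Every URL, field name, and file name here is a hypothetical placeholder.

LOGIN_URL="https://example.com/login"      # assumed login form target
SEARCH_URL="https://example.com/search"    # assumed page to poll
COOKIE_JAR="${TMPDIR:-/tmp}/poll_cookies.txt"

# POST the credentials and save any cookies the server sets.
login() {
    curl --silent --show-error --cookie-jar "$COOKIE_JAR" \
         --data-urlencode "username=$1" \
         --data-urlencode "password=$2" \
         --output /dev/null "$LOGIN_URL"
}

# Fetch the page, sending the saved cookies back to the server.
fetch_page() {
    curl --silent --show-error --cookie "$COOKIE_JAR" "$SEARCH_URL"
}

# Succeed (exit status 0) only when the two page snapshots differ.
page_changed() {
    [ "$1" != "$2" ]
}

# Log in once, then fetch every $1 seconds and print the page when it changes.
poll() {
    interval="$1"
    login "$POLL_USER" "$POLL_PASS" || return 1
    previous=""
    while :; do
        # On a failed fetch, keep the old snapshot so nothing is reported.
        current=$(fetch_page) || current="$previous"
        if page_changed "$previous" "$current"; then
            printf '%s\n' "--- page changed at $(date) ---"
            printf '%s\n' "$current"
            previous="$current"
        fi
        sleep "$interval"
    done
}
```

To try it, set POLL_USER and POLL_PASS in the environment, source the file,
and run "poll 120". The first pass always prints the page, since there is
nothing to compare against yet. It does not log in again if the session
expires, and a page that embeds a timestamp will look "changed" every time,
so the comparison may need to be narrowed to the part you care about.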

Good luck!

On 11/10/07, bb.odenthal at gmail.com <bb.odenthal at gmail.com> wrote:
>
> John,
>
> I may be oversimplifying this, but a web "search" is usually just a POST
> or GET action on an HTML form.  If you can take a packet trace of the
> transaction (assuming it's not SSL), then it's easy to discover the URL
> format and method for the search.  A simple "lynx -dump" of that URL, run
> by "watch" every 120 seconds, could be helpful (assuming that a text-only
> version of the web page would be of any use to you):
>
> $ watch -n 120 "lynx -dump 'http://foo.com/search?bar=san_gabriel_valley'"
>
> If the site requires more interaction than that (login, password, click on
> a few links, fill out a form) or requires cookies then I suggest using a
> Perl script.  Maybe WWW::Mechanize for some simple HTML form automation.
>
> **I'm putting on my Nomex jacket**
>
> Or... just spend $30 on http://www.newdigitalsoft.com/airobot/ or similar
> and use a Windows box?  It IS an option.
>
> -bb
> -----Original Message-----
> From: juanslayton at dslextreme.com
>
> Date: Sat, 10 Nov 2007 21:37:04
> To:sgvlug at sgvlug.net
> Subject: [SGVLUG] Polling Web Sites
>
>
>
>      Got a little project here that I could use some help on.  El Monte
> City School District uses a program called Aesop to post daily
> openings for substitute teachers.  All I have to do is go to their
> web site and click on the search button and I can see who has
> currently called in to be absent.  Trouble is, if someone calls in
> sick just after I've checked, I won't find out about it until the
> next time I check.  And I have better things to do than sit and click
> on the search button all evening.
>      So I began to figure out ways to poll that site automatically.  The
> current approach works like this:  A timing program (written in C)
> runs in the background on a virtual terminal and produces a negative
> pulse on data line 1 of the parallel port every few minutes.  I 'hot
> wired' the left click switch (high, pull-down side) of a USB mouse to
> that data line (through a diode to protect the port in case someone
> physically clicks the mouse).  By leaving the cursor on the search
> button, the background program electronically clicks that button
> every few minutes.  All I have to do as I go about my business is
> glance at the screen every now and then to see if anything new has
> come up.
>      But this is over-complicated.  There ought to be a simple way to poll
> that page programmatically without messing with the hardware.  Say, by
> using the USB event mechanisms?  Like as not somebody somewhere has
> already written code to do it.  I'd appreciate it if anyone could point
> me in the right direction.
>
> John
>
>
> ***************************************************************************************
> If the mind is not constrained by walls and fences, where is the need for
> Windows and Gates?
>