[SGVLUG] Robots.txt (was: Paging Greg Stark...)
Matt Campbell
dvdmatt at gmail.com
Tue Mar 25 19:52:31 PST 2008
Hi Tom,
You mentioned a couple of all-or-nothing approaches. If we strip out the
message headers, the information can be archived, searched, and used without
compromising the anonymity of the list members. If the list archive is a
simple text file, this should be fairly trivial to accomplish, if the list
wills it so. Is there a way our moderator can take a straw poll?
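
To give a rough idea of what "strip out message headers" could look like,
here is a minimal Python sketch. It assumes each archived message is
available as plain RFC 822-style text with a single plain-text body; the
function name strip_headers and the choice to keep only Subject and Date
are illustrative assumptions, not anything the list has decided on.

```python
from email import message_from_string

# Assumption: only these headers are considered safe to keep in the archive.
KEEP_HEADERS = ("Subject", "Date")


def strip_headers(raw_message, keep=KEEP_HEADERS):
    """Return the message text with every header dropped except those in `keep`.

    Hypothetical helper: assumes a single-part, plain-text message.
    """
    msg = message_from_string(raw_message)
    # Rebuild a small header block from the allowed headers only.
    kept = [f"{name}: {msg[name]}" for name in keep if msg[name] is not None]
    body = msg.get_payload()
    if not isinstance(body, str):
        # Multipart messages are out of scope for this sketch.
        raise ValueError("multipart messages are not handled here")
    return "\n".join(kept) + "\n\n" + body
```

Note that this only touches the headers. Names and addresses that show up
in signatures or in quoted "On ... wrote:" lines inside the body would
still be there.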
Matt
From: sgvlug-bounces at sgvlug.net [mailto:sgvlug-bounces at sgvlug.net] On Behalf
Of Emerson, Tom (*IC)
Sent: Tuesday, March 25, 2008 7:34 PM
To: SGVLUG Discussion List.
Subject: [SGVLUG] Robots.txt (was: Paging Greg Stark...)
-----Original Message-----
From: Matt Campbell
What's involved in writing a robot to strip out the headers for all the
messages in our archive? That way it would be less invasive to have
everything available through Google.
It is not a robot on our side, but rather a set of instructions to /Google's/
robot (or Yahoo's, AltaVista's, or any of the gazillion search engines out
there). Basically, robots.txt is a simple text file that lists the
directories that are "off limits" to web spiders or "robots". It is placed
in a known/common location (the root of the web site, as /robots.txt), and
all "robots" are /supposed/ to abide by it.
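
As a small illustration of how a well-behaved robot reads such a file,
here is a sketch using Python's standard urllib.robotparser module. The
robots.txt content, the /pipermail/ path, the bot name "ExampleBot" and the
helper is_allowed are all assumptions made up for the example; they are not
the list's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: tell every robot to stay out of the archive directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pipermail/
"""


def is_allowed(url, user_agent="ExampleBot"):
    """Return True if the rules above permit `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)
```

With these rules, is_allowed("http://www.sgvlug.net/pipermail/sgvlug/")
comes back False, while a page outside that directory is allowed. Keep in
mind this is purely advisory: nothing forces a robot to run a check like
this before crawling.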
As far as "should our e-mail archive be indexed by the big guys?", I know
there are campers on both sides of this issue, and I'm generally on the
"pro" indexing side of the fence for a simple reason (or two) -- if someone
solves a particularly involved Linux problem on the list, the next person
with that same or similar problem WON'T find the answer if they aren't a
member of our group/list (and even if they are, they have to THINK about
searching our archives in the first place).
(the secondary reason is that it increases exposure of our group in
particular -- take your case as a prime example: if you found a suitable
solution to your hard drive problems solely by searching "the net" and
finding our archive AND seeing that we were "local", chances are you would
consider stopping in for a meeting or two, right?)
On the "anti" side are folks worried about how they may appear to the rest
of the world should one of their sgvlug posts appear in wider circulation
than just this list (ummm... "shouldn't have posted it in the first place"
is usually the counter argument, but even really good things can be taken
"out of context" and seem rather disparaging...)
Then there are a few who actively protect their anonymity while
online, and a global (or even local) index kind of defeats that purpose
(for that, there is the "x-no-archive" header you can set in your e-mail
client -- instructions for such are on our site -- but that doesn't stop
manual archiving by packrats like me ;) )
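
For anyone who sends mail from a script rather than a regular client, the
header can be set along these lines. This is only a sketch: build_post is a
hypothetical helper, the addresses are placeholders, and "yes" is the
conventional value for the header. Whether a given archiver actually honors
it is an assumption you would need to check.

```python
from email.message import EmailMessage


def build_post(sender, list_address, subject, body):
    """Build a list post that asks archivers not to keep a copy."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = list_address
    msg["Subject"] = subject
    # Request (not a guarantee) that archiving software skip this message.
    msg["X-No-Archive"] = "yes"
    msg.set_content(body)
    return msg
```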