[SGVLUG] Robots.txt (was: Paging Greg Stark...)

Chris Louden chris at chrislouden.com
Tue Mar 25 18:44:17 PST 2008


I recommend http://www.mail-archive.com/

On Tue, Mar 25, 2008 at 7:34 PM, Emerson, Tom (*IC)
<Tom.Emerson at wbconsultant.com> wrote:
>
>
> -----Original Message----- Matt Campbell
>
>
> What's involved in writing a robot to strip out the headers for all the
> messages in our archive?  That way it would be less invasive to have
> everything available through Google.
>
> It is not a robot on our side, but rather instructions to /Google's/ robot
> (or Yahoo's, AltaVista's, or any of the gazillion search engines out there).
> Basically, it is a simple text file named robots.txt that lists the
> directories that are "off limits" to web spiders or "robots".  It is placed
> in a known location (the root of the web site), and all "robots" are
> /supposed/ to abide by it.
>
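
For anyone who has not seen one, here is a minimal sketch of such a file and of how a well-behaved robot would consult it, using Python's standard urllib.robotparser. The /pipermail/ path and the example.org host are assumptions for illustration only, not our actual setup, and is_allowed is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask every robot to stay out of the archive
# directory. The /pipermail/ path is an assumed location.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pipermail/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.org/index.html",
                "http://example.org/pipermail/2008-March/000001.html"):
        print(url, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

Note that this only models what a polite robot does. Nothing in the file stops a robot that chooses to ignore it.
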
> As far as "should our e-mail archive be indexed by the big guys?", I know
> there are camps on both sides of this issue, and I'm generally on the
> "pro" indexing side of the fence for a simple reason (or two) -- if someone
> solves a particularly involved Linux problem on the list, the next person
> with that same or similar problem WON'T find the answer if they aren't a
> member of our group/list (and even if they are, they have to THINK about
> searching our archives in the first place).
>
> (the secondary reason is that it increases exposure of our group in
> particular -- take your case as a prime example: if you found a suitable
> solution to your hard drive problems solely by searching "the net" and
> finding our archive AND seeing that we were "local", chances are you would
> consider stopping in for a meeting or two, right?)
>
> On the "anti" side are folks worried about how they may appear to the rest
> of the world should one of their sgvlug posts appear in wider circulation
> than just this list (ummm... "shouldn't have posted it in the first place"
> is usually the counter argument, but even really good things can be taken
> "out of context" and seem rather disparaging...)
>
> Then there are a few that actively protect their anonymity while
> online, and a global (or even local) index kind of defeats that purpose
> (for that, there is the "X-No-Archive" header you can set in your e-mail
> client -- instructions for such are on our site -- but that doesn't stop
> manual archiving by packrats like me ;) )
>
>
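
On the X-No-Archive point, a rough sketch of what setting that header looks like when a message is built with Python's standard email package. The addresses and the build_unarchived_message name are made up for illustration, and whether a given archive actually honors the header is up to the archiving software.

```python
from email.message import EmailMessage


def build_unarchived_message(sender, recipient, subject, body):
    """Build a message that asks archivers not to store it."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # The request itself: an extra header that cooperating archivers check.
    msg["X-No-Archive"] = "yes"
    msg.set_content(body)
    return msg


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    message = build_unarchived_message(
        "someone@example.org",
        "list@example.org",
        "Hard drive question",
        "Please keep this one out of the archive.\n",
    )
    print(message.as_string())
```

Most mail clients let you add a custom header like this in their settings, so no code is needed in everyday use.
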

