I know they can, especially when inserting xml code into the file. My questions are:
1- Is that file located on our III server?
2- How secure is it?
3- How can we edit it, given the "Standard for Robot Exclusion" described at (as Steve mentioned):
http://www.robotstxt.org/orig.html
4- Is there a standard that forces search engine robots and others to respect our file?
5- How do we know if our file has been mined?
Thanks all and sorry for any typos.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Said Shafik
Library Systems Manager
Drexel University Libraries
Philadelphia, PA 19104-2875
Phone: 215.895.1832
Fax: 215.895.2070
http://library.drexel.edu
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
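
On question 5, one place to start is the web server's own access log, since every fetch of robots.txt is recorded there like any other request. The sketch below is only an illustration: it assumes the log is in the common Apache-style "combined" format, and the sample lines, addresses, and the function name `robots_txt_requests` are made up for the example. It shows who requested the file and when; it cannot show what a client did with the contents afterwards.

```python
import re

# Assumed layout: Apache-style common/combined log format.
# Adjust the pattern if your server writes its logs differently.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def robots_txt_requests(lines):
    """Return (host, time, user_agent) for every request for /robots.txt."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        # Ignore any query string when comparing the requested path.
        if match.group("path").split("?")[0] == "/robots.txt":
            hits.append(
                (match.group("host"), match.group("time"), match.group("agent") or "")
            )
    return hits


# Made-up sample lines (documentation-only IP addresses).
SAMPLE_LOG = [
    '192.0.2.10 - - [08/Aug/2008:12:03:57 -0700] "GET /robots.txt HTTP/1.1" 200 120 "-" "ExampleBot/1.0"',
    '192.0.2.11 - - [08/Aug/2008:12:04:10 -0700] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

for host, when, agent in robots_txt_requests(SAMPLE_LOG):
    print(host, when, agent)
```

A client that reads robots.txt and then requests the very paths it disallows is worth a closer look; comparing these hits against later requests from the same host is a reasonable next step.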
> Date: Fri, 8 Aug 2008 12:03:57 -0700
> From: jamesp at multcolib dot org
> To: innopac at innopacusers dot org
> Subject: Re: [IUG] Web Logs Analysis
>
> And the *really* bad guys will mine it, using it as a list of things to
> take a look into.
>
> James Price
> Senior Development Analyst
> Multnomah County Library
>
> -----Original Message-----
> From: innopac-bounces at innopacusers dot org
> [mailto:innopac-bounces at innopacusers dot org] On Behalf Of Steve Lindemann
> Sent: Thursday, August 07, 2008 9:25 AM
> To: IUG INNOPAC List
> Subject: Re: [IUG] Web Logs Analysis
>
> said shafik wrote:
> >
> > Using WebLogExpert to analyze our Web Log Report, found a file called:
> > robots.text which allows what pages to be indexed. Do we have a
> > control over what to be indexed or not?
> >
>
> yes and no... if you have put robots.txt into place you can indicate
> whether you wished to be indexed and by whom, but that only works with
> well behaved spiders and bots and such. The legitimate search sites
> should honor robots.txt, but less scrupulous sites and bad guys in
> general will simply ignore it.
>
> and fyi... it's robots.txt, not robots.text - the first works, the
> second does nothing.
>
> ref:
> http://www.robotstxt.org/
> http://en.wikipedia.org/wiki/Robots.txt
> http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=40360
>
> and you can google robots.txt for more
> --
> Steve Lindemann                         __
> Network Administrator                  //\\  ASCII Ribbon Campaign
> Marmot Library Network, Inc.           \\//  against HTML/RTF email,
> url:   http://www.marmot.org           //\\  vCards & M$ attachments
> email: mailto:steve at marmot dot org
> voice: +1.970.242.3331 ext 116
> fax:   +1.970.245.7854
>
> --
> This message was distributed through the Innovative Users Group INNOPAC list
> Public replies: INNOPAC at innopacusers dot org
> Update your subscription options:
> http://innopacusers.org/mailman/listinfo/innopac
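
To make Steve's point concrete, here is a minimal sketch of how a well-behaved crawler reads robots.txt, using Python's standard `urllib.robotparser` module. The rules, paths, and bot names below are hypothetical and chosen only for illustration; they are not the contents of any real server's file. The check happens entirely on the crawler's side, which is why a client that chooses not to run it is not stopped by anything.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only.
EXAMPLE_ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /search
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text the way a compliant crawler would."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(EXAMPLE_ROBOTS_TXT)

# A compliant crawler asks before each fetch whether the path is allowed.
print(parser.can_fetch("Googlebot", "/search/results"))  # covered by "Disallow: /search"
print(parser.can_fetch("Googlebot", "/help.html"))       # no rule matches this path
print(parser.can_fetch("BadBot", "/help.html"))          # BadBot is disallowed everywhere
```

Note that the file must be served as plain text at the site root under the exact name robots.txt, and that anyone can read it, so it is a poor place to list paths you would rather keep unnoticed.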