
Re: [Full-disclosure] Google's robots.txt handling



United States law is opt-in for Fortune 500 companies.

2012/12/14 Jeffrey Walton <noloader@xxxxxxxxx>

> On Thu, Dec 13, 2012 at 7:52 AM, Philip Whitehouse <philip@xxxxxxxxx> wrote:
> > I restate my email's second point.
> >
> > Google is indexing robots.txt because (from all the examples I can see)
> > robots.txt doesn't contain a line to disallow indexing of robots.txt.
> >
> > It is possible that some web sites provide actual content in a file that
> > happens to be called robots.txt (e.g. a website concerned with AI
> > development).
> >
> > Could Google do better by removing the file? Sure. But as webmasters haven't
> > told them not to, even though they have provided other files not to index,
> > Google is doing exactly what they were asked.
> >
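
The point quoted above can be checked mechanically. What follows is only a
minimal sketch using Python's standard urllib.robotparser; the two rule sets,
the /private/ path, the "Googlebot" agent string, and the helper name
allows_robots_txt are illustrative assumptions, not taken from any real site.
It shows that a file which disallows other paths still permits fetching
/robots.txt unless it names that path itself. Whether a given crawler honours
such a line for robots.txt is a separate question that this does not answer.

```python
from urllib.robotparser import RobotFileParser


def allows_robots_txt(robots_lines, agent="Googlebot"):
    """Return True if these rules let `agent` fetch /robots.txt.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse in-memory lines, no network access
    return parser.can_fetch(agent, "/robots.txt")


# Assumed example: other content is disallowed, robots.txt is not mentioned.
typical = [
    "User-agent: *",
    "Disallow: /private/",
]

# Same rules plus an explicit line naming robots.txt itself.
explicit = typical + ["Disallow: /robots.txt"]

print(allows_robots_txt(typical))
print(allows_robots_txt(explicit))
```

Under these assumed rules the first file leaves /robots.txt fetchable and the
second does not, which is the distinction being argued over.
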
> Webmasters don't have to in the US - the Computer Fraud and Abuse Act
> (CFAA) means Google (et al) must operate within the authority granted
> by the webmasters. If that means the webmasters decide they don't want
> their site crawled, then Google (et al) has exceeded its authority and
> broken US Federal law. Just ask Weev.
>
> This system needs a submission-based whitelist.
>
> Jeff
>
> _______________________________________________
> Full-Disclosure - We believe in it.
> Charter: http://lists.grok.org.uk/full-disclosure-charter.html
> Hosted and sponsored by Secunia - http://secunia.com/
>