Re: [Full-disclosure] Google's robots.txt handling
- To: Mario Vilas <mvilas@xxxxxxxxx>
- Subject: Re: [Full-disclosure] Google's robots.txt handling
- From: Philip Whitehouse <philip@xxxxxxxxx>
- Date: Thu, 13 Dec 2012 12:52:27 +0000
To restate the second point of my email:
Google is indexing robots.txt because (in all the examples I can see)
robots.txt contains no line disallowing the indexing of robots.txt itself.
It is possible that some web sites provide actual content in a file that
happens to be called robots.txt (e.g. a website concerned with AI development).
Could Google do better by removing the file? Sure. But as webmasters haven't
told them not to index it, even though they have listed other files not to
index, Google is doing exactly what it was asked.
Maybe the Robots Exclusion Standard (R.E.S.) should state that a valid
robots.txt should not be indexed.
Incidentally Bing shows the same behaviour - in fact the Google file is the 4th
hit even without any of the file type classifiers.
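
To make the point concrete, here is a rough sketch using Python's standard
urllib.robotparser. The paths /admin/ and /public/ and the helper name
is_blocked are made up for illustration, and whether any given crawler
honours a rule that names robots.txt itself is an assumption, not something
the standard promises. It compares a typical file, one that also excludes
itself, and the allow-list style file suggested in the quoted thread below.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt: str, path: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text disallows `path` for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, path)


# A typical file: it hides a (hypothetical) directory but says nothing
# about robots.txt itself.
TYPICAL = """\
User-agent: *
Disallow: /admin/
"""

# The same file with one extra line excluding robots.txt itself.
SELF_EXCLUDING = TYPICAL + "Disallow: /robots.txt\n"

# Allow-list style: permit one (hypothetical) public tree, deny the rest.
# No sensitive path is named, and robots.txt falls under "Disallow: /".
ALLOWLIST = """\
User-agent: *
Allow: /public/
Disallow: /
"""

if __name__ == "__main__":
    for name, text in [("typical", TYPICAL),
                       ("self-excluding", SELF_EXCLUDING),
                       ("allow-list", ALLOWLIST)]:
        print(name, "blocks /robots.txt:", is_blocked(text, "/robots.txt"))
```

Note that this only shows what a parser following the file would conclude;
it says nothing about what Google or Bing actually do with such a rule.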
Philip Whitehouse
On 13 Dec 2012, at 11:40, Mario Vilas <mvilas@xxxxxxxxx> wrote:
> That paragraph says pretty much the exact opposite of what you understood.
>
> Also, could we please stop refuting points nobody even made in the first
> place? OP never claimed this to be a vulnerability, nor ever said robots.txt
> is a proper security mechanism to hide files in public web directories.
>
> All OP said was the way robots.txt is indexed allows for some Google dorks to
> be made, and it may be a good idea to avoid that. Clearly it's not the
> discovery of the century, but it seems fairly reasonable to me... I don't get
> what all this fuss is about.
>
> On Wed, Dec 12, 2012 at 12:18 PM, Christoph Gruber <list@xxxxxxx> wrote:
>> On 12.12.2012 at 00:23 "Lehman, Jim" <jim.lehman@xxxxxxxxxxxxxxxxxxx> wrote:
>>
>> > It is possible to use whitelisting for robots.txt. Allow what you want
>> > Google to index and deny everything else. That way Google doesn't make you
>> > a Google dork target and someone browsing to your robots.txt file doesn't
>> > glean any sensitive files or folders. But this will not stop directory
>> > brute-forcing to discover your publicly exposed sensitive data, which
>> > probably should not be exposed to the web in the first place.
>>
>> Maybe I misunderstood something, but do you really think that "sensitive"
>> data can be hidden in "secret" directories on publicly reachable web servers?
>> --
>> Christoph Gruber
>> By not reading this email you don't agree you're not in any way affiliated
>> with any government, police, ANTI- Piracy Group, RIAA, MPAA, or any other
>> related group, and that means that you CANNOT read this email.
>> By reading you are not agreeing to these terms and you are violating code
>> 431.322.12 of the Internet Privacy Act signed by Bill Clinton in 1995.
>> (which doesn't exist)
>>
>> _______________________________________________
>> Full-Disclosure - We believe in it.
>> Charter: http://lists.grok.org.uk/full-disclosure-charter.html
>> Hosted and sponsored by Secunia - http://secunia.com/
>
> --
> “There's a reason we separate military and the police: one fights the enemy
> of the state, the other serves and protects the people. When the military
> becomes both, then the enemies of the state tend to become the people.”