Re: [Full-disclosure] Google's robot.txt handling



On Tue, Dec 11, 2012 at 4:11 PM, Mario Vilas <mvilas@xxxxxxxxx> wrote:
> I think we can all agree this is not a vulnerability. Still, I have yet to
> see an argument saying why what the OP is proposing is a bad idea. It may be
> a good idea to stop indexing robots.txt to mitigate the faults of lazy or
> incompetent admins (Google already does this for many specific search
> queries) and there's not much point in indexing the robots.txt file for
> legitimate uses anyway.
I kind of agree here. The information is valuable for the
reconnaissance phase of an attack, but it's not a vulnerability per
se. But what is to stop the attacker from fetching it himself/herself,
since it's at a known location on every site? In this case, Google
would only be removing the aggregated search results (which means the
attacker would have to compile them himself/herself).
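
To illustrate how little effort that takes, here is a minimal Python
sketch, not a finished tool. The function names disallowed_paths and
fetch_robots are made up for this example, and somesite.tld is only a
placeholder host. It collects every Disallow path regardless of which
User-agent group it belongs to, which is a simplification of the
actual robots.txt rules.

```python
import urllib.request
from urllib.parse import urljoin


def disallowed_paths(robots_txt):
    """Return the paths named in Disallow lines, in order, without duplicates."""
    paths = []
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        # Skip anything that is not a "Disallow: ..." line.
        if not sep or field.strip().lower() != "disallow":
            continue
        value = value.strip()
        # An empty Disallow means "nothing is disallowed", so skip it.
        if value and value not in paths:
            paths.append(value)
    return paths


def fetch_robots(site):
    """Fetch /robots.txt from a site root (hypothetical helper, untested here)."""
    url = urljoin(site, "/robots.txt")
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling disallowed_paths(fetch_robots("http://somesite.tld/")) against
a site you are authorized to test would list the paths its admin asked
crawlers to skip, with no search engine involved.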

Google has already removed other interesting searches, such as those
for social security numbers and credit card numbers (or at least does
not provide them to the general public).

Jeff

> On Tue, Dec 11, 2012 at 2:01 PM, Scott Ferguson
> <scott.ferguson.it.consulting@xxxxxxxxx> wrote:
>>
>> > If I understand the OP correctly, he is not stating that listing
>> > something
>> > in robots.txt would make it inaccessible, but rather that Google indexes
>> > the robots.txt files themselves,
>>
>> <snipped>
>>
>> Well, um, yeah - I got that.
>>
>> So you are what, proposing that moving an open door back a few
>> centimetres solves the (non) problem?
>>
>> Take your proposal to its logical extension and stop all search engines
>> (especially the ones that don't respect robots.txt) from indexing
>> robots.txt. Now what do you do about Nutch or even some Perl script that
>> anyone can whip up in 2 minutes?
>>
>> Security through obscurity is fine when coupled with actual security -
>> but relying on it alone is just daft.
>>
>> Expecting the world to change so bad habits have no consequences is
>> dangerously naive.
>>
>> I suspect you're looking too hard at finding fault with Google - who are
>> complying with robots.txt. Read the spec - it's about not following
>> the listed directories, not about not listing robots.txt itself. Next
>> you'll want laws against bad weather and furniture with sharp corners.
>>
>> Don't put things you don't want seen in places that can be seen.
>>
>> >
>> >
>> > On Mon, Dec 10, 2012 at 8:19 PM, Scott Ferguson <
>> > scott.ferguson.it.consulting () gmail com> wrote:
>> >
>> >
>> >     /From/: Hurgel Bumpf <l0rd_lunatic () yahoo com>
>> >     /Date/: Mon, 10 Dec 2012 19:25:39 +0000 (GMT)
>> >
>> > ------------------------------------------------------------------------
>> >     Hi list,
>> >
>> >
>> >     I tried to contact Google, but as they didn't answer my email, I am
>> >     forwarding this to FD.
>> >
>> >     This "security" feature is not clearly a Google vulnerability, but
>> >     it exposes website information that is not really intended to be
>> >     public.
>> >
>> >     Conan the bavarian
>> >
>> > Your point eludes me - Google is indexing something which is publicly
>> > available, e.g.: curl http://somesite.tld/robots.txt
>> > So it seems the solution to the "question" you raise is, um,
>> > nonsensical.
>> >
>> > If you don't want something exposed on your web server *don't publish
>> > references to it*.
>> >
>> > The solution, which should be blindingly obvious, is: don't create the
>> > problem in the first place. Password-protect sensitive directories
>> > (htpasswd) - then they don't have to be excluded from search engines
>> > (because listing the inaccessible in robots.txt is redundant). You must
>> > have missed the first day of web school.

_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.grok.org.uk/full-disclosure-charter.html
Hosted and sponsored by Secunia - http://secunia.com/