While playing with the Google Webmaster tools, I came across the “Sitemap” XML protocol, which is used to inform search engines about pages on your website that are available for crawling. The protocol spec is at https://www.google.com/webmasters/sitemaps/docs/en/protocol.html

Think of this as the anti-robots.txt - instead of URLs with Disallow: tags, you have URLs for which the web administrator is saying “Index me.” Sitemap makes anti-forensics Google hacking more productive. It’s only a matter of time before enumeration tools like Wikto use it the same way that they use robots.txt to locate files.

There are two interesting security-related issues with Sitemap, one significantly more interesting than the other.

First, you can find pages with it that aren’t indexed by Google. The Sitemap protocol spec says “Using this protocol does not guarantee that your webpages will be included in search indexes. (Note that using this protocol will not influence the way your pages are ranked by Google.)” This is the lesser of the interesting points.

Far more interesting - you can find pages in the sitemap.xml which would not be indexed if it weren’t for the Sitemap protocol. You can find some interesting stuff by querying for Sitemap files:

".htaccess" inurl:sitemap filetype:xml
"global.asa" inurl:sitemap filetype:xml

Whew. There are a LOT of automagic-generation Sitemap scripts out there which create Sitemap.xml files not by spidering a site, as they should, but by reading the contents of directories inside the web root from the local filesystem and creating the Sitemap.xml file from that. Anything sitting in those directories ends up listed, whether or not a page on the site links to it. Ouch.

I don’t blame Google - I think the Sitemap protocol is a pretty good idea, even though the guys at the Black Hat SEO forum think it doesn't help you get better rankings. It tells the Google search engine where to find pages which otherwise might not get indexed.
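
If you want to check whether your own generator has leaked this kind of file, something along these lines might be a starting point. It is only a rough sketch: the watch list of file names and suffixes is my own guess at what is worth flagging, the function names are made up for illustration, and the sample Sitemap uses the sitemaps.org 0.9 namespace with example.com URLs. The parser ignores the namespace, so older Sitemap files should be read the same way.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical watch list (an assumption, not from any spec); extend as needed.
SUSPICIOUS_NAMES = {".htaccess", ".htpasswd", "global.asa", "web.config"}
SUSPICIOUS_SUFFIXES = (".bak", ".inc", ".old", ".sql")


def sitemap_urls(xml_text):
    """Return every <loc> value in a Sitemap document, ignoring the namespace."""
    root = ET.fromstring(xml_text)
    urls = []
    for elem in root.iter():
        # Tags look like "{namespace}loc"; compare only the local name.
        if elem.tag.rsplit("}", 1)[-1] == "loc" and elem.text:
            urls.append(elem.text.strip())
    return urls


def suspicious_urls(xml_text):
    """Return the listed URLs whose file name matches the watch list."""
    flagged = []
    for url in sitemap_urls(xml_text):
        name = urlparse(url).path.rsplit("/", 1)[-1].lower()
        if name in SUSPICIOUS_NAMES or name.endswith(SUSPICIOUS_SUFFIXES):
            flagged.append(url)
    return flagged


# Made-up sample resembling what a filesystem-walking generator might emit.
EXAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/index.html</loc></url>
  <url><loc>http://www.example.com/.htaccess</loc></url>
  <url><loc>http://www.example.com/global.asa</loc></url>
  <url><loc>http://www.example.com/db/backup.sql</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in suspicious_urls(EXAMPLE):
        print(url)
```

Matching on file names only catches the obvious cases. The real fix is still to build the Sitemap by spidering the site rather than by walking the filesystem.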
Due to a plethora of rotten Sitemap.xml generation scripts, this is a directory and file enumeration issue that is going to be with us for a long, long time to come.

Adam Muntner, CISSP | Partner | QuietMove, Inc. | w: http://www.quietmove.com
Securing the Nexus Between People, Technology, and Information. ((Q))
_______________________________________________ Full-Disclosure - We believe in it. Charter: http://lists.grok.org.uk/full-disclosure-charter.html Hosted and sponsored by Secunia - http://secunia.com/