FORTUNE — The National Security Agency has declassified its training manual for using common search engines as a research tool.
Written by Robyn Winder and Charlie Speight and published in 2007 by the NSA’s Center for Digital Content, Untangling the Web: An Introduction to Internet Research is a 643-page guide to everything from the very basics of web research to finding confidential information that has accidentally slipped into the public domain. The document became available as a result of an April Freedom of Information Act request by MuckRock, a service provider for journalists and researchers.
At George R. R. Martin length, the document is thorough to say the least. The introduction alone is filled with references to 10th-century Persia, Jorge Luis Borges, Sigmund Freud, and the Minotaur in the Labyrinth. As
pointed out, the chapter titled “Google Hacking” is getting the most immediate play. (Showing the document’s age, perhaps, there are also sections on Yahoo Search, Windows Live Search, and Ask.com.) “Nothing I am going to describe to you is illegal, nor does it in any way involve accessing unauthorized data,” the authors write. Instead, it “involves using publicly available search engines to access publicly available information that almost certainly was not intended for public distribution.”
The book is replete with tips and tricks, ranging from undocumented filetypes Google (GOOG) can look for, to how-to’s on running searches that include all the synonyms of a given term (a.k.a. use the magic ~). The entire document is available here, but here are the three hacks getting the most attention:
1. Find Passwords: The authors suggest the following search term to look for Russian spreadsheets that may contain login credentials: “filetype:xls site:ru login.” The filetype operator tells the search engine to look for Microsoft (MSFT) Excel spreadsheets, the site operator restricts results to Russian domain names, and “login” is the keyword, chosen because “login” and “password” are often written in English even in foreign countries.
2. Find Confidential Spreadsheets: Again, a term like “filetype:xls site:za confidential” will pull up spreadsheets marked confidential that have been accidentally posted in public, in this case in South Africa (the .za country domain).
3. Find Misconfigured Web Servers: Web servers “that list the contents of directories not intended to be on the web often offer a rich load of information to Google hackers,” the document states. To find them, it suggests searching for: “intitle:‘index of’ site:kr password.”
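
All three searches follow the same pattern: one or more operators (filetype, site, intitle) followed by a plain keyword. As a rough illustration only, here is a short Python sketch that assembles such query strings. The function name build_query and its parameters are hypothetical and do not come from the NSA document; the sketch only builds the text you would type into a search box and does not contact any search engine.

```python
def build_query(keywords, filetype=None, site=None, intitle=None):
    """Assemble a search query string from operator values and keywords.

    Hypothetical helper for illustration; not taken from the NSA manual.
    """
    parts = []
    if intitle:
        # Quote multi-word titles so the whole phrase stays attached to the operator.
        value = f'"{intitle}"' if " " in intitle else intitle
        parts.append(f"intitle:{value}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    if site:
        parts.append(f"site:{site}")
    # Plain keywords go last, separated by spaces.
    parts.extend(keywords)
    return " ".join(parts)


# The three searches described in the list above.
queries = [
    build_query(["login"], filetype="xls", site="ru"),
    build_query(["confidential"], filetype="xls", site="za"),
    build_query(["password"], intitle="index of", site="kr"),
]

for query in queries:
    print(query)
```

Swapping the site value for another country code, or the filetype for another extension, produces a variation on the same search.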