Monday, June 04, 2007
Inside Google's Secret Search Algorithms

For many webmasters and bloggers, getting to the top of Google's search results page for a popular search term is like becoming the next American Idol.
Not surprisingly, the algorithms that Google uses to rank websites are closely guarded, but Saul Hansell of The New York Times managed to get an inside look at how Google serves up search results.
In case you don't want to read the whole piece, here are the highlights:
- Any Google employee can report search problems to the search quality team using a homegrown "Buganizer" system. This happens about 100 times a day.
- The search results page balances freshness, authority, and diversity of websites in deciding which sites make the cut.
- QDF, short for "query deserves freshness," is a newly developed algorithm that Google uses to determine whether a user is searching for up-to-the-minute news on a subject or wants older, more authoritative pages. How "hot" a topic is factors into the decision.
- The number of search queries is one factor used to quantify "hotness" along with how many people are writing about a topic on blogs and other sites.
- 200 pieces of information, or signals, are used to rank pages. The infamous PageRank is just one signal.
- Signals are passed on to classifiers that try to determine what a user is looking for: a product, information about a place, a brand name, etc. If you're familiar with pattern recognition, you'll see some parallels.
- The signals and classifiers are used to calculate several key measures of a page's relevancy, which are then combined into a single relevancy score.
- It isn't enough to get the highest relevancy score: the 10 sites on the first page of search results must offer a diversity of opinion. For instance, if you search for a product, you might see a blog review of it, the manufacturer's page, and a shopping site where you can purchase it.
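
To make the last few points concrete, here is a toy sketch in Python of how signals, "hotness," and diversity could fit together. Everything in it is an assumption made up for illustration: the signal names, the weights, the `hotness` formula, and the one-page-per-kind diversity rule are hypothetical, and none of it is Google's actual method, which isn't public.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    kind: str      # hypothetical category: "review", "manufacturer", "shop", ...
    signals: dict  # hypothetical signal name -> value between 0 and 1


# Made-up weights for three made-up signals (the article mentions about 200).
WEIGHTS = {"pagerank": 0.5, "text_match": 0.3, "freshness": 0.2}


def hotness(query_volume, blog_mentions):
    """Toy 'hotness' between 0 and 1, from two inputs that are each between 0 and 1."""
    return min(1.0, 0.5 * query_volume + 0.5 * blog_mentions)


def relevancy(page, hot):
    """Weighted average of the signals; a hotter topic makes freshness count for more."""
    weights = dict(WEIGHTS)
    weights["freshness"] += 0.5 * hot
    total = sum(weights.values())
    return sum(w * page.signals.get(name, 0.0) for name, w in weights.items()) / total


def pick_results(pages, hot, limit=10, max_per_kind=1):
    """Take the best-scoring pages first, but cap how many of each kind make the cut."""
    ranked = sorted(pages, key=lambda p: relevancy(p, hot), reverse=True)
    chosen, per_kind = [], {}
    for page in ranked:
        if per_kind.get(page.kind, 0) < max_per_kind:
            chosen.append(page)
            per_kind[page.kind] = per_kind.get(page.kind, 0) + 1
        if len(chosen) == limit:
            break
    return chosen


if __name__ == "__main__":
    sample = [
        Page("shop-a.example", "shop", {"pagerank": 0.9, "text_match": 0.9, "freshness": 0.1}),
        Page("shop-b.example", "shop", {"pagerank": 0.8, "text_match": 0.8, "freshness": 0.1}),
        Page("review.example", "review", {"pagerank": 0.4, "text_match": 0.6, "freshness": 0.9}),
        Page("maker.example", "manufacturer", {"pagerank": 0.7, "text_match": 0.5, "freshness": 0.2}),
    ]
    for hot in (0.0, hotness(1.0, 1.0)):
        print(hot, [p.url for p in pick_results(sample, hot, limit=3)])
```

In this sketch the second shopping site is dropped even though it outscores the review and the manufacturer's page, and a "hot" topic pushes the fresh review to the top. That mirrors the idea from the article, not the real implementation.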