Google Patent – Duplicate Document Detection
On 12/1/2009, Google was granted a patent by the US Patent Office detailing how duplicate documents are detected in a web crawler system. The patent describes how Google detects duplicate documents and then either filters them out or determines which version is the “more important” one, so that search results stay unique.
Details