Friday, November 4, 2011

Google Webmaster Tools Launches New Message Alert For Duplicate Content!

Google Webmaster Tools has launched a new message alert that informs webmasters whenever one of their URLs is seen by Google as a duplicate of another domain's URL and consequently doesn’t appear in search results.

How Google Identifies Duplicate Content:

Along with introducing the Webmaster Tools alert, Google also shed light on how it identifies a cluster of duplicate content and then selects a canonical version from that cluster to display in search results.

Google explains: “When we discover a group of pages with duplicate content, Google uses algorithms to select one representative URL for that content. A group of pages may contain URLs from the same site or from different sites. When the representative URL is selected from a group with different sites the selection is called a cross-domain URL selection.”

Google says that webmasters mostly use rel="canonical" elements or 301 redirects to indicate the preferred URL to Google, which is why the choice made by the algorithms is usually guided by the webmaster’s intent. For example, when multiple URLs carry the same content (because of infrastructure configuration, optional parameters, or internationalization), webmasters can use either option to tell Google which version is canonical.
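
To make the two signals concrete, here is a minimal Python sketch of what each one looks like on the wire. The function names and the example.com URLs are hypothetical, chosen only for illustration; how the tag or redirect actually gets emitted depends on your server or CMS.

```python
from html import escape
from typing import Dict, Tuple


def canonical_link_tag(preferred_url: str) -> str:
    """Build the <link> element that a duplicate page places in its <head>
    to name the preferred (canonical) URL."""
    return '<link rel="canonical" href="{}">'.format(escape(preferred_url, quote=True))


def permanent_redirect(preferred_url: str) -> Tuple[int, Dict[str, str]]:
    """Return the status code and header a server sends to redirect a
    duplicate URL permanently to the preferred one."""
    return 301, {"Location": preferred_url}


if __name__ == "__main__":
    # Hypothetical preferred URL, for illustration only.
    preferred = "https://www.example.com/widgets"
    print(canonical_link_tag(preferred))
    print(permanent_redirect(preferred))
```

The tag leaves the duplicate URL reachable and only hints at the preferred version, while the 301 sends visitors and crawlers to the preferred URL directly.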

But if these preferences are not defined, or if Google selects a different version than the one the site owner specified, this alert comes into play. The new feature alerts site owners when Google’s algorithms select an external URL rather than one from their own website. As Google says, “To be transparent about cross-domain URL selection decisions, we’re launching new Webmaster Tools messages that will attempt to notify webmasters when our algorithms select an external URL instead of one from their website.”

Why Google Makes The Cross Domain URL Selections:

When webmasters have specified a URL and a cross-domain URL is still selected, Google points to these common reasons:

  • Regional sites- Duplicate content across multiple regional sites is a common cause, for example the same content on a .com and a .ca site. In this case Google may group the pages with duplicate content together and display them according to their relevance to the query.
  • Incorrect canonicalization- If webmasters have misconfigured their canonicalization techniques so that they point to URLs on an external website, then Google's algorithms may select the external URLs. The same applies to misconfigured content management systems or CMS plugins installed by the webmaster.
  • Misconfigured server- In the case of shared hosting, a hosting misconfiguration can lead to duplicate content being displayed across two different domains.
  • Hacked site- If a site is hacked, code may be introduced that leads to unwanted canonicalization. For instance, malicious code may cause the website to return an HTTP 301 redirect.
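
The incorrect canonicalization and hacked-site cases in the list above both come down to a page signalling a URL on a different host. The Python sketch below shows one way a webmaster might audit their own pages for that. It is only an illustration under stated assumptions, not a description of Google's algorithms: the helper names are hypothetical, the page HTML and response status are assumed to have been fetched already, and "www." and bare hostnames are treated as the same site.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin, urlparse


class _CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> element."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.href = attr["href"]


def _host(url: str) -> str:
    """Lowercased hostname with a leading 'www.' dropped (an assumption)."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def external_canonical(page_url: str, html: str) -> Optional[str]:
    """Return the canonical URL if it points to another host, else None."""
    finder = _CanonicalFinder()
    finder.feed(html)
    if finder.href is None:
        return None
    target = urljoin(page_url, finder.href)  # resolve relative hrefs
    return target if _host(target) != _host(page_url) else None


def external_redirect(page_url: str, status: int, location: Optional[str]) -> Optional[str]:
    """Return the 301 redirect target if it points to another host, else None."""
    if status != 301 or not location:
        return None
    target = urljoin(page_url, location)
    return target if _host(target) != _host(page_url) else None
```

Any non-None result is worth a manual look: it may be an intentional cross-domain setup, a misconfigured plugin, or injected code.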

Another situation where Google may select an incorrect URL is when it displays a site that has scraped content from your site. In this case webmasters can file a complaint. Google says, “If you believe that another site is duplicating your content in violation of copyright law, you may contact the site’s host to request removal. In addition, you can request that Google remove the infringing page from our search results by filing a request under the Digital Millennium Copyright Act.”

When And Where Will You Get The Alert:

Google has specified that this alert will be available to webmasters in the Message Center. They will only see it if their site has a duplicate content issue, and Google is currently reporting only on URLs from the Top Pages report.

As a webmaster, do you think this alert will come in handy for figuring out why a certain page never shows up in search results? For further clarification, you can refer to Google's in-depth help topic, and do leave your comments on this feature.
