Even with help, Bing struggles with duplicate content

February 3, 2012 / Search Engine Optimization / By Kathy Alice

Fair warning, this one is for the SEO geeks out there.
Checking indexation and duplicate content issues on Bing is sometimes overlooked with all the focus on Google. But it’s always good to check once in a while to see how your site’s SEO is doing on Bing. Check out my basic overview of Bing Webmaster Tools if it is new to you.

In this case, I had a client with a number of very similar pages where the rel canonical tag had not been fully applied as a solution. Eric Enge’s interview with Duane Forrester included Duane’s statement, “If we are finding your pages, but not keeping them in the index, there is a reason for that.” Based on that, I decided to see what Bing Webmaster’s Index Explorer had indexed as “quality content” as a data point.

Bing Webmaster Indexation Stats: the spike in the middle was due to some crawl errors.
The answer was: apparently everything and more. Index Explorer listed more than 70,000 URLs. To put this in perspective, Google’s indexation of this site is about 2,000 pages. While there are still duplicate content issues with this site (true duplicate content, not just similar pages), particularly with a URL parameter used as a tracking parameter, this seemed extreme. An examination of the URLs in the Index Explorer showed that the parameter was indeed the main culprit.
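As a rough illustration of that kind of URL examination, here is a minimal Python sketch that tallies which query parameters appear across a list of indexed URLs. It assumes you have the URLs available as plain strings; the sample URLs and the parameter names `trk` and `page` are hypothetical and do not come from the client site.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def count_query_parameters(urls):
    """Count how many URLs carry each query parameter name."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        # keep_blank_values=True so "?trk=" still counts as carrying "trk".
        # A set is used so a parameter repeated within one URL counts once.
        names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
        counts.update(names)
    return counts


# Hypothetical sample; real input would be the list of indexed URLs.
sample_urls = [
    "http://example.com/widgets?trk=email",
    "http://example.com/widgets?trk=banner",
    "http://example.com/widgets",
    "http://example.com/gadgets?trk=email&page=2",
]

param_counts = count_query_parameters(sample_urls)
for name, count in param_counts.most_common():
    print(f"{name}: {count} URLs")
```

A parameter that shows up on a large share of the indexed URLs is a good candidate for the main source of duplication.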

Google introduced the URL parameters settings in Webmaster Tools so you could help Google decide whether to crawl pages with a given parameter. With Google this is working fairly well. Bing has recently introduced a similar feature called “URL Normalization”. The feature is confusing. You have two choices: “Enabled” and “Disabled”. Counterintuitively, “Disabled” means that Bing doesn’t do any URL normalization, so if you want the search engine to do the normalization, you enable it. What that means is not entirely clear, but the instructions on the page say “use this page to specify the parameter “abc” be ignored …” So apparently “enabled” means “ignored”. Clear? Maybe to you. It hurts my head.
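To make “ignoring a parameter” concrete, here is a minimal Python sketch of what that kind of normalization could look like. This is only an illustration of the idea, not a description of how Bing or Google implement it; the function name `normalize_url` and the example URLs are assumptions, and `abc` is borrowed from the instruction text quoted above.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def normalize_url(url, ignored_params):
    """Return the URL with every parameter named in ignored_params removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in ignored_params
    ]
    # urlencode re-encodes the remaining pairs; order is preserved.
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Hypothetical URLs that differ only in the ignored parameter "abc".
urls = [
    "http://example.com/page?abc=123&id=7",
    "http://example.com/page?abc=456&id=7",
    "http://example.com/page?id=7",
]

unique_urls = {normalize_url(u, {"abc"}) for u in urls}
print(unique_urls)
```

With the parameter ignored, the three sample URLs collapse to a single address, which is the outcome you would hope to see reflected in the index.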

In any case, it doesn’t work. Two weeks later, Bing Webmaster tells me that its indexation is still north of 75,000 pages. A site: command at bing.com still returns more than 11,000 pages. And annoyingly, the pages we do have canonical tags on are showing up multiple times in the site: SERPs. I have heard through the grapevine that Bing is having problems processing the rel canonical tag properly, and it appears this is true. Of course, to be fair, SEOs complain that Google doesn’t always remove canonicalized duplicate content either.
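Before blaming the search engine, it is worth confirming that the canonical tag really is present on the pages in question. The following is a minimal sketch using only the Python standard library. It assumes you already have each page’s HTML as a string; the class and function names and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated values, or be missing entirely.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical page markup for illustration.
page = (
    '<html><head><link rel="canonical" '
    'href="http://example.com/widgets"></head><body></body></html>'
)
print(find_canonical(page))
```

Running a check like this across the duplicate URLs shows which ones declare a canonical and which were missed.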

So far the tools at my disposal have proven ineffectual in addressing the duplicate content in Bing’s index. Of course, the search engines take the position that rel canonical is only a hint and that a 301 redirect is preferred, but a 301 is not an option here.

And even more sadly, I didn’t even get to a strategy for the similar pages.

About the Author Kathy Alice

Kathy Alice Brown is an SEO expert specializing in Technical SEO and Content. In her spare time she loves to get outside.


    1. I have found that Bing can be slow to update its index. Make sure you have an RSS feed on your site, as Bing likes them, and set up a Bing Webmaster Tools account and submit your XML sitemap. Those two things might help.

  1. Kathy,
    Do you have any updates on this issue? I have a client with a similar issue across his multi-language site. Any further tips or lessons learned would be great.

    1. Good question, thanks for reminding me! Keep in mind that Bing Webmaster Tools has been completely revamped since I wrote this post, so the Index Explorer presents data differently, so much so that I am not sure I can confidently compare what I see with my previous findings. However, by one stat it is reporting far fewer pages (9K), and the site: command also returns thousands fewer. We did get pretty aggressive on the parameter settings, adding quite a few more to ignore about 60 days ago. So it does appear that either Bing is heeding the canonical better, or the parameter settings really kicked in over the last month or so. It couldn’t hurt to do both. Regarding multi-language sites, that is another conversation. Have you checked into the hreflang attribute?
