Why Have A Supplemental Index?

On Tuesday the Google Webmaster Central Blog announced that the supplemental tag would be removed from search results. While not unexpected, the news is disappointing, and the reasons given for the tag’s removal still strike me as hollow and little more than supplemental spin.

Danny Sullivan had a nice write-up of the supplemental index and the news about dumping the results label. Barry Schwartz added some thoughts as well.

From Danny’s post:

By and large, it was less important pages that ended up in the supplemental index. If Google’s main index — containing what it considered the best or most important pages — didn’t seem to have a match, then Google would look to the supplemental index for listings.

and

Pages in the supplemental index could, did and do rank — but only generally if there’s not much to be found in the primary index.

As much as Google would like us to believe otherwise, a page in the supplemental index will get less traffic from Google than it would if that same page were in the main index. Knowing which pages are in the supplemental index allows webmasters to identify potential problems with those pages and sometimes with the site as a whole.

If it were simply a matter of not enough links pointing to a page, this wouldn’t be such a big deal. We all know, or should know, that links help pages rank, and we should be looking to bring links into our pages and our site as a whole. But sometimes the problem has nothing to do with links. As I’ve mentioned on several occasions, last summer this entire blog went supplemental, due mostly to Google indexing the feed of each post along with the HTML version of the post. Admittedly most of the posts at that time had very few links, but a quick fix to my robots.txt file pulled all of the posts out of the supplemental index and placed them back in the main index. Traffic to the site increased about 1500% in less than a month.
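
For anyone wondering what that kind of fix looks like, here’s a minimal robots.txt sketch. The Googlebot directive and WordPress-style feed path below are illustrative assumptions rather than a copy of my actual file; the exact paths depend on your permalink structure.

    # Illustrative only: keep per-post feed URLs out of Google's crawl so the
    # HTML version of each post is the only one left to index.
    # Adjust the path pattern to match your own permalink structure.
    User-agent: Googlebot
    Disallow: /*/feed/

With the feed URLs blocked, the HTML post is the only version left for Google to crawl, so there’s no near-duplicate competing with it for a spot in the main index.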

The problem was more than links or a lack of them. Seeing the supplemental label is what led me to understand the nature of the problem and find the solution. As recently as today, Michael Gray posted about WordPress comments being SEO unfriendly, and the issue and solution there are much the same. Content goes supplemental in ways that might not be apparent at first, and the label in the results helped in the discovery of potential problems.

Google, though, has chosen to remove the information. Their stated reasoning:

The distinction between the main and the supplemental index is therefore continuing to narrow. Given all the progress that we’ve been able to make so far, and thinking ahead to future improvements, we’ve decided to stop labeling these URLs as “Supplemental Results.”

Narrowing the gap is an admission that there is a gap. It’s also vague, since it’s neither quantifiable nor particularly meaningful. If you and I are currently 10,000 miles apart and we each take a step toward the other, we’ve narrowed the gap between us. We’re hardly any closer to each other.

Google would like us to believe the goal is to bring the two indexes closer to each other in the future until the distinction between them is virtually non-existent. If that’s the case then why have the supplemental index at all? What purpose would it be serving?

We’re being told the reason for dropping the label is that it’s not important for us to know, and that Matt Cutts doesn’t want people fixating on their supplemental pages. More likely, the labels are going away because Google wants to hide the fact that there is, and will continue to be, a difference between the two indexes.

In all fairness to Google, it’s still possible they’ll give us a way to determine supplemental pages in some fashion through Webmaster Tools. And even if Google doesn’t, people will still find a way. At the moment there’s another simple query you can use to find your supplemental pages, though it will probably stop working before long. Jim Boykin has ideas on another way to find supplemental results, and Google sent Danny a way to discover supplemental pages as well.

First, get a list of all of your pages. Next, go to the webmaster console [Google Webmaster Central] and export a list of all of your links. Make sure that you get both external and internal links, and concatenate the files.

Now, compare your list of all your pages with your list of internal+external backlinks. If you know a page exists, but you don’t see that page in the list of sites with backlinks, that deserves investigation. Pages with very few backlinks (either from other sites or internally) are also worth checking out.

I agree with Danny that it would be better to simply select the “export supplemental URLs” option, but I’m sure more ways will follow to determine or at least approximate supplemental pages.
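
Since the comparison is the only step that takes real work, it’s easy enough to script. Here’s a rough sketch in Python; it assumes two plain text files with one URL per line: all_pages.txt holding every URL on the site (from a sitemap, say) and linked_pages.txt holding the concatenated internal and external link exports. Both filenames are placeholders of my own, not anything Google or Danny specified.

    # Sketch of the page-vs-backlinks comparison described above.
    # all_pages.txt    - every URL on the site, one per line (e.g. from a sitemap)
    # linked_pages.txt - concatenated internal + external link exports
    # Both filenames are assumptions; rename to match your own exports.

    def load_urls(path):
        """Read a file of URLs, one per line, into a lightly normalized set."""
        with open(path) as f:
            return {line.strip().rstrip('/') for line in f if line.strip()}

    all_pages = load_urls('all_pages.txt')
    linked_pages = load_urls('linked_pages.txt')

    # Pages that appear in no link report at all are the first ones worth checking.
    unlinked = sorted(all_pages - linked_pages)

    print('%d of %d pages have no recorded backlinks:' % (len(unlinked), len(all_pages)))
    for url in unlinked:
        print(url)

That only catches pages with zero recorded backlinks. Per the note above, pages with just a handful of links deserve a look too, which you could get by counting how often each URL shows up in the link export instead of taking a straight set difference.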

The best solution is to return things as they were. Failing that, the next best alternative would be a quick way to see supplemental pages through Webmaster Central. Regardless of what Google would have us believe, the information about which pages of a site are supplemental is helpful and worth knowing.

Download a free sample from my book, Design Fundamentals.

3 comments

  1. Oh, that was interesting. We have so many pages in the supplemental index, it is not funny. We use Joomla for blogging. Joomla was not designed with SEO in mind. There are so many ways you can navigate to get to the same post with different URLs. Of course Google thinks that these entries are duplicates of each other.

    I have been thinking more and more of moving our blog back to WP; at least that way I can have a little more control over this.

    Khalid

  2. I agree it doesn’t really make sense to have a supplemental index, at least from the outside … but I don’t understand what Google has to gain from making us believe it doesn’t matter which index a page winds up in?

  3. Khalid, I think the situation with Joomla happens with most CMS-driven sites. WordPress is generally SEO friendly, but it still has its duplicate content issues. Some of those issues do lead to the supplemental index, and knowing that your posts have gone supplemental helps in fixing a few things.

    Forrest, the supplemental index originated so that Google could still index pages it considered less important. They claim they’re making it much more like the main index, but if there’s no difference it seems silly to maintain two separate indexes.

    My guess is Google wants to keep us from reverse engineering things, or maybe people were figuring out how to manipulate rankings based on supplemental results. Since supplemental pages don’t pass link juice, maybe Google wants to keep hidden which sites it’s figured out are buying and selling links.

    It’s also been suggested that people clicked less on links that were labeled supplemental so perhaps this is simply a way to increase CTR.

    In the end we’ll still be able to figure out, or guess well, which pages of a site are supplemental, but it will be harder, and it’s another step in Google holding back useful information. If they do add something to Webmaster Central it might not be so bad.

