Google: We Do Not Index Parts Of A Page Independently
Some may speculate that Google doesn’t just index a URL and its content as a whole, but may also index parts of a page independently. That is not true. Google’s John Mueller said on Twitter, “We don’t index parts of a page independently, we index the pages as a whole, and try to understand the context of the content there.”
We don’t index parts of a page independently, we index the pages as a whole, and try to understand the context of the content there. Scrolling to a part of a page when we know that’s where the snippet was from makes a lot of sense regardless of indexing.
— 🍌 John 🍌 (@JohnMu) August 7, 2019
This came up when Glenn pointed out a section of a Google hangout in which John Mueller said it is super rare for Google to index content after the hash in a URL as unique. John said Google might index URLs with a hash if it knows they lead to unique content, but that is super rare.
Here is what John said:
Does Google ignore the URL parameters that come after the hash? In other words, to Google, is slash category, hash some attribute the same?
Yes. In general we ignore everything after hash.
There are two exceptions. One is the hash bang, kind of the hash and then the exclamation mark, which is what was used in the old AJAX crawling scheme, which we kind of separate out individually and treat as unique URLs. And the other exception is for a very very tiny number of sites we’ve recognized that URLs with the hash lead to unique content, so it’s not just going up and down within the page but actually leading to unique content, and there we do sometimes index those URLs with the hash as well.
But that’s extremely rare and that’s not something that I would rely on. So if you’re using the hash to change the content of your page, I would assume that we will crawl and index the URL without the hash for the most part. If you’re using the hash to jump up and down within your page’s content, then that’s perfectly fine. We tend to ignore everything after the hash. So things like links to the site and the indexing, all of that will be based on the non-hash URL. And if there are any links to the hashed URL, then we will fold them up into the non-hash URL.
So two exceptions:
(1) The old AJAX crawling scheme format, which I thought they stopped supporting completely, but maybe not?
(2) A very very tiny number of sites we’ve recognized that URLs with the hash lead to unique content.
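To make the rule concrete, here is a minimal Python sketch of the behavior John describes: drop everything after the hash, except when the fragment starts with an exclamation mark (the hash bang case). This is only an illustration of the quote, not Google's actual code. The function names `index_key` and `fold_links` and the example.com URLs are made up for this example, and the second exception (the tiny number of sites Google has recognized individually) is deliberately not modeled because nothing in the quote says how Google detects it.

```python
from urllib.parse import urldefrag


def index_key(url: str) -> str:
    """Return the URL a link would be grouped under, per the quoted rules.

    Hypothetical helper for illustration only; not Google's implementation.
    """
    base, fragment = urldefrag(url)
    # Exception 1: hash bang (#!) URLs are treated as unique URLs.
    if fragment.startswith("!"):
        return url
    # Default: ignore everything after the hash.
    # (Exception 2, the rare recognized sites, is not modeled here.)
    return base


def fold_links(links):
    """Count links per URL after folding hashed URLs into the non-hash URL."""
    counts = {}
    for link in links:
        key = index_key(link)
        counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    links = [
        "https://example.com/category",
        "https://example.com/category#color",
        "https://example.com/category#size",
        "https://example.com/app#!/products",
    ]
    for url, count in fold_links(links).items():
        print(count, url)
```

In this sketch, the three /category links all count toward the non-hash URL, while the hash bang URL stays separate.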
But back to Google indexing sections of a page separately: that is not true, according to Google’s John Mueller. Google may appear to have the # URLs indexed, but that probably isn’t what is in the Google index; it is what Google might be linking to. Google can and has anchored searchers down to portions of content before. That does not mean Google indexed that part separately, but rather that it understands the content on the page.