If you’ve ever worked in SEO, it’s likely there have been many instances where you’ve felt at the mercy of Google. You’re sure that you’ve done everything right, you’ve gone through Google’s guidelines with a fine-tooth comb, and yet it’s almost as if Google takes some kind of sadistic pleasure in confining your website to the darkest corners of its search results.
Let’s start by clarifying something: Google wants you to succeed, provided that you put in the work. This can be seen in how it makes the inner workings of its processes more transparent, particularly when it comes to how it crawls and indexes websites. In 2018, Google introduced the URL Inspection Tool, which in essence breaks down Google’s indexing criteria and helps you pinpoint any issues that may be affecting your website’s search engine visibility.
What is Google indexing?
What distinguished Google from other search engines pretty early on is that, rather than periodically scanning the world wide web, its bots are constantly out there crawling the furthest corners of the internet. Once a webpage is crawled it is filed into an ever-expanding library known as Google’s ‘index’. This index is constantly refreshing and refining itself so that it can consistently deliver the most relevant and highest quality web content as simply and quickly as possible. This process of Google crawling webpages, retrieving all the salient information and filing it away in its library is known as ‘indexing’.
Sounds pretty straightforward, right? Well, not exactly. A crucial part of SEO is making it as easy as possible for Google to crawl and index websites effectively. Part of this involves identifying a range of common technical issues that act as roadblocks for all of those little bots scuttling around the finer details of your website.
What are some examples of common indexing issues?
Broken Links
These are links that are, in essence, dead ends. A broken link could be caused by an incorrectly entered URL, an error in a webpage’s HTML code, or the fact that the page simply no longer exists.
Canonical URLs
A canonical URL is the assigned URL for a page and acts as a signal to Google that it should be prioritised over other variants (e.g. HTTP vs. HTTPS). This minimises the risks associated with duplicate content, which is when two pages are so alike that Google doesn’t know which one to prioritise in search results.
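To make this concrete, a canonical URL is usually declared with a `<link rel="canonical">` element in the page’s `<head>`. The sketch below (hypothetical helper names, Python standard library only) shows one way to pull a page’s declared canonical out of raw HTML:

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Record the href of any <link rel="canonical"> tag encountered."""
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel", "").lower() == "canonical":
            self.canonical = attrs.get("href")

def declared_canonical(html: str):
    """Return the user-declared canonical URL, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical

page = '<head><link rel="canonical" href="https://example.com/page"></head>'
print(declared_canonical(page))  # https://example.com/page
```

If this returns `None` for a page with duplicates elsewhere on the site, Google is left to pick a canonical on its own.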
Noindex Tags
These are tags that signal to Google that a page shouldn’t be indexed, whether to limit webspam or duplicate content. They are usually added deliberately, but it’s still something to check for if Google isn’t indexing your website.
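A noindex directive typically lives in a `<meta name="robots">` tag. As a rough illustration (invented sample HTML, standard library only), you could scan a page for it like this:

```python
from html.parser import HTMLParser

class RobotsMetaChecker(HTMLParser):
    """Flag pages whose <meta name="robots"> directive contains 'noindex'."""
    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if (tag == "meta"
                and attrs.get("name", "").lower() == "robots"
                and "noindex" in attrs.get("content", "").lower()):
            self.noindex = True

def has_noindex(html: str) -> bool:
    checker = RobotsMetaChecker()
    checker.feed(html)
    return checker.noindex

blocked = '<meta name="robots" content="noindex, nofollow">'
allowed = '<meta name="robots" content="index, follow">'
print(has_noindex(blocked), has_noindex(allowed))  # True False
```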
Redirect Loops
A 301 redirect is used to fix broken links, but there may be instances where one page redirects to another, which then redirects back to the original URL. This creates a closed loop in which the pages redirect endlessly between one another.
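Detecting a loop like this is essentially cycle detection. The sketch below (using a made-up map of redirects rather than live HTTP requests) follows a chain of 301s and reports the cycle if one exists:

```python
def find_redirect_loop(redirects, start):
    """Follow a URL through a map of 301 redirects; return the loop if one exists."""
    visited = []
    url = start
    while url in redirects:
        if url in visited:
            return visited[visited.index(url):]  # the URLs that form the cycle
        visited.append(url)
        url = redirects[url]
    return None  # chain terminates at a real page

# A healthy chain and a loop, using hypothetical paths:
print(find_redirect_loop({"/old": "/new"}, "/old"))        # None
print(find_redirect_loop({"/a": "/b", "/b": "/a"}, "/a"))  # ['/a', '/b']
```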
Robots.txt Files
This file is added to a website to act as a point of reference for Google’s bots, ensuring that only user-facing pages on your website (and none of that back-end stuff) are crawled and indexed. So be sure to check that no important pages have inadvertently found their way into your robots.txt file.
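Python’s standard library can parse robots.txt rules, which makes it easy to spot-check whether a given URL is blocked. The example below uses an invented set of rules rather than a live file:

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt: back-end paths disallowed, everything else open.
rules = [
    "User-agent: *",
    "Disallow: /admin/",
    "Disallow: /cart/",
]

parser = RobotFileParser()
parser.parse(rules)

# An important public page should come back True; back-end paths False.
print(parser.can_fetch("Googlebot", "https://example.com/admin/login"))  # False
print(parser.can_fetch("Googlebot", "https://example.com/blog/post"))    # True
```

If a key landing page returns `False` here, you’ve likely found your indexing problem.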
Mobile Usability Issues
As you probably know, Google uses mobile-first indexing to address the drastic shift from desktop to mobile as the dominant way people use the internet. While this has been the case for some time, some websites are yet to catch up. If a website loads at a glacial pace or doesn’t have a responsive design for mobile users, Google isn’t exactly incentivised to index it or rank it well.
How Does Google’s URL Inspection Tool Work?
If any of the above issues are affecting your website, you should be able to identify these using Google’s URL Inspection Tool. Launched in 2018, this tool is a fantastic example of Google seeking to make its processes more transparent. In this section, we’ll be breaking down each of the features in the URL Inspection Tool and how they work together to ensure your pages are indexed properly.
URL Presence on Google
If your URL is marked as being ‘on Google’ then huzzah, you’re indexed! This is one of five indexability statuses offered by Google; the remaining four can be found below:
- URL is on Google, but has issues: While the URL has technically been indexed, there are problems with the enhancements that are considered best practice for search engine visibility. For example, the structured data (the markup that describes a page’s content to search engines) may be incorrect, or perhaps the page isn’t considered friendly to mobile users.
- URL is not on Google: Google has honoured a signal that the page shouldn’t be crawled, such as the URL being listed in the website’s robots.txt file. There may also be less direct signifiers that a page shouldn’t be indexed. For example, it may be an orphan page, meaning that no internal links on the website point to it.
- URL is not on Google (Indexing): Google has encountered one of the indexability issues mentioned earlier in this article, such as a broken link or a no-index tag.
- URL is an alternate version: Google flags that the URL you have entered is an alternate version of a canonical page, such as the AMP version, or the desktop version of a website that Google treats as mobile-first.
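Since the structured data mentioned above is a common culprit behind the ‘has issues’ status, it helps to know it is usually just a JSON-LD block embedded in the page. The snippet below (an invented example organisation) shows how Python’s `json` module can catch a syntax error before Google does:

```python
import json

# A minimal, hypothetical JSON-LD block of the kind Google reads for rich results.
snippet = """
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Co",
  "logo": "https://example.com/logo.png"
}
"""

data = json.loads(snippet)  # malformed JSON would raise json.JSONDecodeError here
print(data["@type"])  # Organization
```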
View Crawled Page
This part of Google Search Console allows you to understand the three main ways Google crawls your website and evaluates its overall quality and user experience. Firstly, the ‘HTML’ tab presents the rendered page code, which can allow you to diagnose issues such as incorrect canonical URLs or misplaced structured data. Secondly, a screenshot of the crawled webpage (as it would be seen on a smartphone) can help you visualise any potential problems with mobile usability. Finally, a handy ‘More Info’ section allows you to dive into the technical nitty-gritty of a webpage, such as the specific type of content that has been crawled (usually text and HTML) and HTTP status codes.
Request Indexing
Made changes to a URL that you’re eager for Google to crawl? If you’ve added some new content to a page, perhaps optimised for a high-priority keyword or fixed some pressing technical issues, the Request Indexing tool prompts Google to crawl and re-index your page. The act of submitting URLs for crawling is simple enough, but there are a few key points to consider:
- Before you get too trigger-happy, you should know that requesting indexing for the same page multiple times does not make Google re-index your URL any faster.
- You can submit a maximum of 10-12 URLs from the same website to be indexed in a day.
- Regardless of how many times a page is submitted, Google won’t turn a blind eye to issues such as noindex tags or misused canonical tags.
Coverage
If you’re interested, this section gives you more insight into how Google was able to crawl and index a webpage. Firstly, the Discovery tab indicates how the URL was first encountered by Google’s trusty bots, for instance, whether it was via a referral from another page or by crawling a sitemap. Secondly, the Crawl section offers details about the last time a page was successfully crawled, such as the date of the crawl and the ‘user-agent’ used (e.g. a smartphone or desktop). Finally, the Indexing section makes a distinction between the ‘user-declared canonical’, which is usually specified using a canonical tag, and the canonical URL that Google ultimately settled on.
Enhancements
This is where, as SEO experts, we get to validate the success of all of those little flourishes that help a website achieve peak crawlability and search engine visibility. For example, this part of the URL Inspection Tool pulls through any structured data you have added, highlights whether your website is mobile-friendly or not, and shows whether or not Google has crawled key parts of your website’s brand identity such as its logo or any user reviews.
Test Live URL
By prompting Google to retrieve the latest version of a URL, this tool allows you to get a real-time update on a URL’s indexability status. It is often used to validate technical fixes so users can see if their work has had a direct impact on whether or not a page is indexable.
How Long Does Google Indexing Take?
As I’m sure you can imagine, Google is often very busy! Therefore, it can take anywhere between a few days and a few weeks for a page to be indexed, and that’s if there are no issues affecting the website’s indexability. While we recommend checking periodically, it’s not worth your time submitting URLs to be recrawled unless you’ve fixed a problem with indexability.
Yellowball is a leading London SEO agency, with a team of technical specialists, content creators and account managers who are all dedicated to helping clients achieve long-term success in organic search.
Your website is the foundation of your online presence, so we begin all of our campaigns with a thorough technical review. This ensures that any technical or UX-related issues are swiftly rectified in adherence to Google’s quality standards.
Need help growing your traffic, rankings and conversions in organic search? Get in touch with our team of experts to find out more.