If you are an SEO you know of Screaming Frog, a tool used to analyze a website by crawling each page on the site the way a search spider would. Screaming Frog just released version 10, a major enhancement that may make it my new favorite toy.
Embracing the Concept of Indexable
One of my biggest beefs with Screaming Frog has been that it didn’t have the concept of indexability. Let me illustrate what I mean by this with an example.
When you crawl a website with Screaming Frog, it organizes the results into tabs that align with on-page elements important for SEO. The titles tab lists each URL and the title tag found on the page, which can then be filtered for potential SEO issues such as duplicate title tags.
It’s frustrating to see a list of pages with duplicate titles only to investigate and find out they have already been addressed with the canonical tag. If only Screaming Frog understood the concept of indexability and showed me only the duplicate titles that have not been addressed. Other crawlers such as Deep Crawl and OnCrawl do this, so why not Screaming Frog?
In version 10 Screaming Frog has addressed this shortcoming. In most of the tabs you’ll find two new fields, Indexability and Indexability Status. The first field is set by the crawl to either “Indexable” or “Non-Indexable”, and the second tells you why Screaming Frog considers the page Non-Indexable. For example, the status field could be set to “Canonicalised” (Screaming Frog is a British company, hence the “s” instead of “z”), indicating that the page is not indexable because it points to a “canonical” (or representative) page for a group of pages.
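To make the concept concrete, here is a minimal sketch of the kind of checks that feed a classification like this. It is not Screaming Frog’s actual logic, the URL is a placeholder, and it skips robots.txt, redirects and plenty of edge cases:

```python
# A minimal sketch of the indexability idea -- not Screaming Frog's actual logic.
import requests
from bs4 import BeautifulSoup

def classify(url):
    response = requests.get(url, timeout=10)
    if response.status_code != 200:
        return "Non-Indexable", f"HTTP {response.status_code}"

    soup = BeautifulSoup(response.text, "html.parser")

    # A meta robots noindex makes the page Non-Indexable.
    robots = soup.find("meta", attrs={"name": "robots"})
    if robots and "noindex" in robots.get("content", "").lower():
        return "Non-Indexable", "Noindex"

    # A canonical pointing somewhere else gets the "Canonicalised" status.
    canonical = soup.find("link", rel="canonical")
    if canonical and canonical.get("href", url).rstrip("/") != url.rstrip("/"):
        return "Non-Indexable", "Canonicalised"

    return "Indexable", ""

print(classify("https://example.com/some-page"))
```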

Then it is just a matter of sorting or filtering the Non-Indexables out and focusing on Indexables with duplicate titles.
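If you export the crawl, the same filtering can be done outside the tool. Here is a rough sketch with pandas; it assumes a CSV export with “Address”, “Title 1” and “Indexability” columns, and the file name is just a placeholder:

```python
# Sketch: keep only Indexable pages whose titles are duplicated.
import pandas as pd

crawl = pd.read_csv("page_titles_export.csv")

indexable = crawl[crawl["Indexability"] == "Indexable"]

# Flag every row whose title appears on more than one indexable URL.
duplicates = indexable[indexable.duplicated(subset="Title 1", keep=False)]

print(duplicates[["Address", "Title 1"]].sort_values("Title 1"))
```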
Now, keep in mind that just because a page is marked “Non-Indexable”, there is no guarantee that Google will not index it. Google has been known to ignore canonical tags and to index URLs blocked by robots.txt Disallow rules, but this feature goes a long way toward helping SEOs focus on the right issues.
Snazzy Visualizations
While I still think the Indexability fields are the best thing about Screaming Frog release 10, I must admit the new visualizations (should I spell that with an “s” since the tool does?) are not only snazzy looking but also genuinely useful.
There are several visualizations in three categories: Force-Directed, Tree Graphs and Word Clouds. You can generate the Force-Directed and Tree Graphs for the entire crawl, or select a URL, right-click, and get the Word Clouds as well.

The Force-Directed and Tree Graphs come in two flavors, Crawl and Directory. A Directory graph gives you a picture of the “levels” of your site: how it is laid out and where each page sits in the information architecture of your site.
This is what a Force-Directed Directory Graph looks like for a 30,000-page site I recently crawled:

Each page is represented by a node on this graph. You’ll note there are both red nodes and green nodes. The green nodes represent valid Indexable pages; the red nodes are Non-Indexable. See how there are quite a few red nodes in the first ring? These are all redirects, and the internal linking should be fixed to link directly to the target URLs.
Yellow nodes (not seen above) are collapsed nodes that can be opened (Screaming Frog says the graphs will show up to 10,000 URLs at a time, so with larger sites you will have some collapsed nodes). You can also drill down by shifting the focus with a right click.
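If you are curious what is going on under the hood, a force-directed layout is the same idea as the spring layout in networkx. Here is a rough sketch where the edges and indexability flags are made-up stand-ins for real crawl data:

```python
# Sketch: a force-directed crawl graph, with nodes colored by indexability.
import networkx as nx
import matplotlib.pyplot as plt

edges = [("/", "/blog/"), ("/", "/products/"), ("/blog/", "/blog/post-1/"),
         ("/blog/", "/old-post/"), ("/products/", "/products/widget/")]
indexable = {"/": True, "/blog/": True, "/products/": True,
             "/blog/post-1/": True, "/old-post/": False, "/products/widget/": True}

graph = nx.DiGraph(edges)
colors = ["green" if indexable[node] else "red" for node in graph.nodes]

# spring_layout is networkx's force-directed placement.
positions = nx.spring_layout(graph, seed=42)
nx.draw(graph, positions, node_color=colors, with_labels=True, font_size=8)
plt.show()
```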
And just because I could ….
You can also customize the colors of the graphs. Don’t like green and red? Change it! Here’s one of my creations using a Force-Directed crawl graph as the canvas.

More Emphasis on Canonical and Pagination Data
You used to be able to find Canonical and Pagination tag data under the Directives tab. The Directives tab is still there but is now focused on the meta robots tag. Canonical and Pagination now have their own tabs, and don’t overlook the Canonical inlinks bulk reports; I’m sure I will be using those.
Post Crawl Analysis and Link Score
After the crawl is done you can click a button to perform a post crawl analysis, which provides even more data on your crawl. You’ll find the button to the right of the “Start” and “Clear” buttons; below is the display you will see when the analysis is complete.
Here is the additional data surfaced by the post crawl analysis:
- Link Score: Assigns a link score to all internal URLs
- Pagination: Finds pagination loops and unlinked pagination URLs
- Hreflang: Finds unlinked hreflangs
- AMP: Finds missing AMP HTML tags
- Sitemaps: Finds a number of useful problems with Sitemaps such as Non-Indexable URLs in the Sitemap
- Orphan URLs: If you have connected Search Console and Analytics, it finds orphan URLs referenced in those tools that are not in the crawl (the idea is sketched below)
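That last check is conceptually just a set difference: the URLs your analytics tools know about minus the URLs the crawler actually reached. A rough sketch, with placeholder file names:

```python
# Sketch of the orphan-URL idea: known URLs that the crawl never reached.
def load_urls(path):
    with open(path) as handle:
        return {line.strip() for line in handle if line.strip()}

crawled = load_urls("crawled_urls.txt")
known = load_urls("analytics_urls.txt")

orphans = known - crawled
for url in sorted(orphans):
    print(url)
```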
In version 10 Screaming Frog has also expanded its functionality for assessing AMP, as well as the ability to include Sitemaps in the crawl (previously you could upload sitemaps, but they were crawled separately).
I was very interested in the new Link Score metric. It calculates the internal SEO authority of a page by measuring the value of the page based on the internal links pointing to it. This is a very useful metric for SEOs, as you can uncover important pages that are not getting enough internal link love.
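Link Score reads like a PageRank-style calculation. If you wanted a comparable do-it-yourself number (this is not Screaming Frog’s actual formula, and the edges below are made-up stand-ins for a real crawl), a networkx sketch gets at the same idea:

```python
# Sketch: a PageRank-style internal authority score, rescaled to 0-100 for readability.
import networkx as nx

internal_links = [("/", "/blog/"), ("/", "/products/"),
                  ("/blog/", "/blog/post-1/"), ("/products/", "/blog/post-1/"),
                  ("/blog/post-1/", "/")]

graph = nx.DiGraph(internal_links)
scores = nx.pagerank(graph)

# Rescale so the best-linked page scores 100.
top = max(scores.values())
for url, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{url}: {round(score / top * 100)}")
```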
I found the Link Score data in the Internal tab next to the Crawl Depth column. With just a quick glance we can see that the first-level pages have a high Link Score due to a lot of incoming internal links (which is typical), but what may warrant further investigation is whether pages at the next level are getting enough internal link love.
As we can see in the second graphic, once we get to the pages at the next levels of crawl depth, the Link Scores are much lower; this might be something to address.
There are many more new features in version 10 that I haven’t mentioned. For example, Screaming Frog now also allows you to schedule crawls and provides a headless browser. You can find a full writeup of the new version on the Screaming Frog blog.
Hi Kathy,
Thx for a super intro to this new Froggy thing… I compared a crawl of herbalcenter.dk and found some NON-INDEXABLE pages with status code 429 (Too Many Requests) in Froggy, then did the same with Link Assistant’s Website Auditor and everything is status 200??
When I compare my site to another WP site on a different host there is also a difference: my host uses the 429 for all webshops (Woo and Magento), and I do have a Woo shop.
But why does Froggy detect the 429 code and non-indexability on pages that are indexed correctly (I checked), when other tools (also the one with the visual tool :-)) don’t?
Should I care about 80 blog posts with a 429 code, or ask my host to configure his server in “normal” ways 😉
Best wishes
Peter
Yep, there’s a Cloudflare feature that detects crawls from non-Google IPs, especially if the user agent is set to Googlebot, and blocks them by returning 429s. You can configure Cloudflare to allow your IP address. This is a clever way to ensure that only crawls the website owner knows about are done. But note that Deepcrawl has a “random” crawl that usually works.
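If you want to confirm that is what is happening, a quick check is to request one of the affected URLs with and without a Googlebot user agent and compare the status codes. The URL below is a placeholder:

```python
# Quick check: does the same URL answer differently to a Googlebot user agent?
import requests

url = "https://example.com/some-blog-post/"
agents = {
    "default": "Mozilla/5.0",
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}

for name, agent in agents.items():
    response = requests.get(url, headers={"User-Agent": agent}, timeout=10)
    print(name, response.status_code)
```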