
13 Steps To Maximise Your Crawl Budget

29 November 2016 | 0 comments | Posted by Che Kohler in nichemarket Advice

How to maximise your crawl budget

Crawl budget is one of the most important SEO factors. A lot is said about it and many of you will have heard of it, but little focus and attention has been given to this vital practice. Making sure a crawler has a great experience and can easily find content and pages to index, without bumping its head too many times, can greatly increase your organic visibility.

Over the years I've worked on many sites that, to an extent, go out of their way to make it harder for crawlers to engage with the site and its content and, of course, to produce quality organic listings. While optimising websites' crawl budgets I've come across common mistakes, which I now refer to as the 13 crawl budget sins. In the interest of websites everywhere I've decided to list them all and take you through them, so let's begin.

1. Configure your robots file

Your robots file is generally the doormat to your site and is the first interaction crawlers will have with it. It should let spiders know where to find the most relevant content as fast as possible. Use your robots file to block anything that isn't important for crawlers, like your backend, and reference your XML sitemap for easy access.
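To make that concrete, here is a minimal sketch that checks a robots file with Python's standard `urllib.robotparser` before you deploy it. The `/admin/` and `/cart/` paths, the example.com domain and the sitemap URL are all assumptions for illustration, not recommendations for your site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site; paths and sitemap URL are assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /cart/

Sitemap: https://www.example.com/sitemap.xml
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content from a string, without a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# The backend should be blocked, the content should stay crawlable.
admin_allowed = rules.can_fetch("Googlebot", "https://www.example.com/admin/login")
blog_allowed = rules.can_fetch("Googlebot", "https://www.example.com/blog/crawl-budget")

# site_maps() needs Python 3.8+ and lists the Sitemap: lines found in the file.
sitemaps = rules.site_maps()

print(admin_allowed, blog_allowed, sitemaps)
```

Running a check like this against a list of your key URLs is a cheap way to catch a rule that accidentally blocks pages you want crawled.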

2. XML sitemaps

Your sitemap is one of the clearest messages to Googlebot and Bingbot about how to access your site. As the name suggests, it serves as a map of your site for the crawlers to follow. Not every site can be crawled easily. Complicating factors may, for lack of a better word, “confuse” Googlebot or get it sidetracked as it crawls your site. A sitemap gives bots a guide on where to prioritise their crawl and what you recommend to them as important.
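As a rough illustration, the sketch below builds a small XML sitemap with Python's standard library. The URLs and dates are made up, and `build_sitemap` is a hypothetical helper; most content management systems will generate this file for you.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as a string for an iterable of (url, lastmod) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod  # date in YYYY-MM-DD form
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages; list only the URLs you actually want crawled.
PAGES = [
    ("https://www.example.com/", "2016-11-29"),
    ("https://www.example.com/blog/crawl-budget", "2016-11-29"),
]

sitemap_xml = build_sitemap(PAGES)
print(sitemap_xml)
```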

3. HTML sitemap

Just like the XML sitemap, an HTML sitemap helps not only crawlers but also user experience. It helps both users and bots find important pages deeper within your site and gives your site a relevant structure. HTML sitemaps also provide a great place to improve your internal linking, especially when your top site navigation is limited to the most popular pages or constrained by mobile-first considerations.

4. Regular fresh content

Web crawlers are constantly crawling sites to find new and fresh content to add to their index and improve the search results for users. Freshness is an important factor when it comes to content, as search results need to be as up to date or definitive as possible to please users. Giving crawlers brand new pages and content to crawl will go a long way towards increasing your site's crawl budget.

Content that is crawled more frequently is likely to gain more traffic and be favoured by the bots. Although PageRank and other factors do play a role in crawl frequency, it's likely that PageRank becomes less important than freshness when comparing similarly ranked pages.

5. Optimise infinite scrolling pages

The use of infinite scrolling pages has been popularised by social media platforms like Facebook, Instagram and Pinterest. Infinite scrolling does not necessarily ruin your chances at Googlebot optimisation; however, you need to ensure that your infinite scrolling pages comply with the stipulations provided by Google.

6. Have an internal linking strategy

Internal linking is, in essence, a map for Googlebot to follow as it crawls your site. The more integrated and tight-knit your internal linking structure, the better Googlebot will crawl your site. Share the love and allow users and bots to navigate freely between pages on your site.
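One way to see how tight-knit your linking really is: measure how many clicks each page sits from the homepage. The sketch below runs a breadth-first search over a made-up link graph; `SITE_LINKS` is a hypothetical stand-in for data you would export from your own crawler.

```python
from collections import deque


def click_depths(links, start):
    """Breadth-first search: minimum number of clicks from start to each page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical internal link graph: page -> pages it links to.
SITE_LINKS = {
    "/": ["/blog/", "/products/"],
    "/blog/": ["/blog/crawl-budget", "/"],
    "/products/": ["/products/widget"],
    "/blog/crawl-budget": ["/products/widget"],
    "/old-landing-page": ["/"],
}

depths = click_depths(SITE_LINKS, "/")

# Pages that exist but cannot be reached by following links from the homepage.
orphans = sorted(set(SITE_LINKS) - set(depths))

print(depths)
print(orphans)
```

Pages that come back very deep, or as orphans, are the ones a crawler following links is least likely to find.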

7. Have a custom 404 page

Going without a custom 404 page is either negligent or arrogant, and having one can indirectly help improve your crawl rates. Find external links that can be salvaged and redirect traffic from bad links. For more on 404 pages check out the article here.

8. Fix broken links

This one is a no-brainer: why are you hoarding broken links? Time to let go and spring clean those links. Make a list of all your broken links, find the most relevant page to redirect each of them to and only use the homepage redirect as a last resort.
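Here is a rough sketch of that process. It assumes "most relevant" can be approximated by the live page sharing the most leading path segments with the broken URL, which is only a heuristic; the paths and the `redirect_target` helper are hypothetical, and you should review the result by hand.

```python
def segments(path):
    """Split a URL path into its non-empty segments."""
    return [s for s in path.split("/") if s]


def shared_prefix(a, b):
    """Count the leading path segments two URL paths have in common."""
    count = 0
    for x, y in zip(segments(a), segments(b)):
        if x != y:
            break
        count += 1
    return count


def redirect_target(broken, live_pages, homepage="/"):
    """Pick the live page closest to the broken URL; homepage is the last resort."""
    best = max(live_pages, key=lambda page: shared_prefix(broken, page), default=homepage)
    return best if shared_prefix(broken, best) > 0 else homepage


# Hypothetical data: live pages on the site and broken URLs from a crawl report.
LIVE_PAGES = ["/", "/blog/", "/blog/seo/", "/products/"]
BROKEN = ["/blog/seo/crawl-budget-2015", "/products/discontinued-widget", "/summer-sale"]

redirect_map = {url: redirect_target(url, LIVE_PAGES) for url in BROKEN}
print(redirect_map)
```

In this example only `/summer-sale` falls back to the homepage, because nothing else on the site resembles it.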

9. Fix crawl errors

Another no-brainer: if you look at your crawl reports and find tonnes of errors that are not 404 errors, maybe it's time you fixed those. Exhausting your crawl budget on errors is the biggest waste of a resource that could be put to better use to improve your site.
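To put a number on that waste, the sketch below summarises a crawl report by status class and works out what share of crawler requests ended in an error. The `CRAWL_LOG` entries are invented; in practice they would come from your server logs or crawl report export.

```python
from collections import Counter


def summarise_crawl(log):
    """Count crawler requests per status class ('2xx', '3xx', '4xx', '5xx')."""
    return Counter(f"{status // 100}xx" for _, status in log)


def wasted_share(log):
    """Fraction of crawler requests that ended in a 4xx or 5xx response."""
    if not log:
        return 0.0
    errors = sum(1 for _, status in log if status >= 400)
    return errors / len(log)


# Hypothetical crawl log: (URL path, HTTP status returned to the crawler).
CRAWL_LOG = [
    ("/", 200),
    ("/blog/", 200),
    ("/blog/crawl-budget", 200),
    ("/products/", 200),
    ("/old-page", 301),
    ("/missing", 404),
    ("/search?q=shoes", 500),
    ("/checkout", 503),
]

summary = summarise_crawl(CRAWL_LOG)
share = wasted_share(CRAWL_LOG)
print(summary, share)
```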

10. Set parameters for dynamic URLs

From custom-built sites to popular content management systems, from retail site faceting to booking engines, there are many reasons a site generates lots of dynamic URLs that in fact lead to one and the same page. By default, search engine bots will treat these URLs as separate pages; as a result, you may be both wasting your crawl budget and, potentially, breeding content duplication concerns. Setting up exclusions for dynamically generated URLs or, better yet, blocking the entire process so that no crawl budget is wasted on duplicate pages will always get a thumbs up from me and from crawlers.
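The sketch below shows how several dynamic URLs can collapse into a single page once the parameters that don't change the content are ignored. The `IGNORED_PARAMS` set is an assumption for illustration; which parameters are safe to ignore depends entirely on your own site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that do not change the page content on this example site.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "sort"}


def canonical_url(url):
    """Drop ignored query parameters and sort the rest so duplicates collapse."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query)
        if key not in IGNORED_PARAMS
    )
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


# Three dynamic URLs that all show the same product listing.
VARIANTS = [
    "https://www.example.com/shoes?sort=price&utm_source=newsletter&colour=red",
    "https://www.example.com/shoes?colour=red&sessionid=abc123",
    "https://www.example.com/shoes?colour=red",
]

unique_pages = {canonical_url(url) for url in VARIANTS}
print(unique_pages)
```

If a crawl export of thousands of URLs collapses to a few hundred this way, that gap is a rough measure of how much crawl budget duplicates are eating.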

11. Build external links

Building external links has been an important part of SEO for as long as I can remember. Having links from other sites pointing to your content, especially deep links, sends crawlers from those sites to yours, giving you more crawl range and budget, along with the usual link juice and authority passing. Link building, as difficult as it can be, is as rewarding as any SEO practice to date when done properly. For more on the topic check out the pain of link building here.

12. Avoid redirect chains

Each time you redirect URLs it wastes a little of your crawl budget. When your website has long redirect chains, such as a large number of 301 and 302 redirects in a row, spiders such as Googlebot may drop off before they reach your destination page. This means that the final page may not be indexed. Best practice with redirects is to have as few as possible on your website, and no more than two in a row.
Extra tip! Avoid redirect loops, where a page ends up redirecting back to itself; this is a notable waste of crawl budget.
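Chains and loops are easy to spot once your redirects are in one place. The sketch below walks a redirect map, reports the chain for each starting URL, flags loops and then flattens every chain so each old URL points straight at its final destination. The `REDIRECTS` map is hypothetical; yours would come from your server config or CMS.

```python
def trace_redirects(redirects, start):
    """Follow a redirect map from start. Returns (chain, is_loop)."""
    chain = [start]
    seen = {start}
    while chain[-1] in redirects:
        nxt = redirects[chain[-1]]
        if nxt in seen:
            # We have been here before, so this chain never ends.
            return chain + [nxt], True
        chain.append(nxt)
        seen.add(nxt)
    return chain, False


def flatten(redirects):
    """Point every source directly at its final destination, skipping loops."""
    flat = {}
    for source in redirects:
        chain, is_loop = trace_redirects(redirects, source)
        if not is_loop:
            flat[source] = chain[-1]
    return flat


# Hypothetical redirect map: source path -> target path.
REDIRECTS = {"/a": "/b", "/b": "/c", "/c": "/final", "/x": "/y", "/y": "/x"}

long_chain, long_is_loop = trace_redirects(REDIRECTS, "/a")
loop_chain, loop_is_loop = trace_redirects(REDIRECTS, "/x")
flat = flatten(REDIRECTS)

print(long_chain, len(long_chain) - 1, "hops")
print(loop_chain, "loop" if loop_is_loop else "ok")
print(flat)
```

Here `/a` takes three hops to reach `/final`, which is over the two-in-a-row guideline above, while the flattened map gets every source there in one.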

13. Use rich media files sparingly

If you're being too fancy it can really hurt your website's visibility. Sites using rich media such as Flash and Silverlight are extremely taxing and hard for crawlers to interpret, as the content sits behind a wall of code and execution touch points. This is too big an issue to go into in detail in this post, but you may also want to provide alternative text versions of pages that rely heavily on rich media files so crawlers understand the context of what you're trying to do.

Contact us

If you want to know more about optimising your crawl budget, don't be shy, we're happy to assist. Simply contact us here.

Tags: How to, SEO, Tools
