13 Steps To Maximise Your Crawl Budget
Crawl budget is one of the most important SEO factors. A lot is said about it and many of you will have heard of it, yet little focus and attention is given to this vital practice. Making sure a crawler has a great experience and can easily find content and pages to index, without bumping its head too many times, can greatly increase your organic visibility.
Over the years I've worked on many sites that, to an extent, go out of their way to make it harder for crawlers to engage with their site and their content and, of course, to produce quality organic listings. While optimising websites' crawl budgets I've come across common mistakes, which I now refer to as the 13 crawl budget sins. In the interest of websites everywhere I've decided to list them all and take you through them, so let's begin.
1. Configure Your Robots file
Your robots file is generally the doormat to your site and is the first interaction crawlers will have with it. Letting spiders know where to find the most relevant content as fast as possible speaks for itself. Use your robots file to block anything that isn't important for crawlers, like your backend, and reference your XML sitemap for easy access.
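As a rough sketch of what that can look like, the snippet below parses a small robots file with Python's standard `urllib.robotparser` and checks what a crawler would be allowed to fetch. The blocked paths, the domain and the sitemap URL are all made-up placeholders, so swap in whatever applies to your own site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the paths and the sitemap URL are placeholders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /cart/
Sitemap: https://www.example.com/sitemap.xml
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that is already in memory."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# Backend pages should be off limits, content pages should stay open.
admin_ok = rules.can_fetch("Googlebot", "https://www.example.com/wp-admin/settings")
blog_ok = rules.can_fetch("Googlebot", "https://www.example.com/blog/crawl-budget")

# site_maps() needs Python 3.8 or newer.
sitemaps = rules.site_maps()

print(admin_ok, blog_ok, sitemaps)
```

Running a check like this before you deploy a new robots file is a cheap way to confirm you haven't blocked pages you actually want crawled.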
2. XML sitemaps
Your sitemap is one of the clearest messages to Googlebot and Bingbot about how to access your site. As the name suggests, it serves as a map of your site for the crawlers to follow. Not every site can be crawled easily. Complicating factors may, for lack of a better word, “confuse” Googlebot or get it sidetracked as it crawls your site. A sitemap gives bots a guide on where to prioritise their crawling and which pages you recommend as important.
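If your CMS doesn't generate a sitemap for you, a minimal one is easy to build. The sketch below uses Python's built-in `xml.etree.ElementTree` and the standard sitemaps.org namespace. The page URLs and dates are placeholders, and `build_sitemap` is just an illustrative helper name.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, last-modified ISO date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Placeholder pages: list the URLs you actually want crawled.
PAGES = [
    ("https://www.example.com/", "2020-07-09"),
    ("https://www.example.com/blog/crawl-budget", "2020-07-06"),
]

sitemap_xml = build_sitemap(PAGES)
print(sitemap_xml)
```

Only list pages you want indexed. A sitemap padded with redirected, blocked or duplicate URLs sends crawlers exactly where you don't want them.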
3. HTML sitemap
Just like the XML sitemap, an HTML sitemap can help not only crawlers but also user experience. It helps both users and bots find important pages deeper within your site and provides a relevant structure. HTML sitemaps also provide a great place to improve your internal linking, especially when your top site navigation is limited to the most popular pages or constrained by mobile-first considerations.
4. Regular fresh content
Web crawlers are constantly crawling sites to find new and fresh content to add to their index and improve the search results for users. Freshness is an important factor when it comes to content, as search results need to be as up to date or definitive as possible to please users. Giving crawlers brand new pages and content to crawl will go a long way towards increasing your site's crawl budget.
Content that is crawled more frequently is likely to gain more traffic and be favoured by the bots. Although PageRank and other factors do play a role in crawl frequency, it's likely that PageRank becomes less important than freshness when bots choose between similarly ranked pages.
5. Optimise infinite scrolling pages
Infinite scrolling pages have been popularised by social media platforms like Facebook, Instagram and Pinterest. Infinite scrolling does not necessarily ruin your chance at Googlebot optimisation; however, you need to ensure that your infinite scrolling pages comply with the stipulations provided by Google.
6. Have an internal linking strategy
Internal linking is, in essence, a map for Googlebot to follow as it crawls your site. The more integrated and tight-knit your internal linking structure, the better Googlebot will crawl your site. Share the love and allow users and bots to navigate freely between pages on your site.
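One rough way to check how tight-knit your linking is: measure how many clicks each page sits from the homepage and look for pages nothing links to. The sketch below assumes you already have a map of which page links to which (from a crawl export, for example). The `SITE` data and the `click_depths` helper are invented for illustration.

```python
from collections import deque


def click_depths(links, start):
    """Breadth-first search: fewest clicks needed to reach each page from start."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical internal link map: page -> pages it links to.
SITE = {
    "/": ["/blog", "/products"],
    "/blog": ["/blog/crawl-budget"],
    "/products": ["/products/widget"],
    "/blog/crawl-budget": ["/products/widget"],
    "/old-landing-page": ["/"],
}

depths = click_depths(SITE, "/")

# Pages that exist but cannot be reached by following links from the homepage.
orphans = sorted(set(SITE) - set(depths))

print(depths)
print(orphans)
```

Pages that sit many clicks deep, or that turn up as orphans, are the first candidates for extra internal links.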
7. Have a custom 404 page
If you don't have one, you're either negligent or arrogant: a custom 404 page can indirectly help improve your crawl rates. Find external links that can be salvaged and redirect traffic from bad links. For more on 404 pages check out the article here.
8. Fix broken links
This one is a no-brainer. Why are you hoarding broken links? It's time to let go and spring clean them. Make a list of all your broken links, find the most relevant page to redirect each one to, and only use a redirect to the homepage as a last resort.
9. Fix crawl errors
Another no-brainer. If you have a look at your crawl reports and find tonnes of errors that are not 404 errors, maybe it's time you fixed those. Exhausting your crawl budget on errors is the biggest waste of a resource that could be put to better use improving your site.
10. Set parameters for dynamic URLs
From custom-built sites to popular content management systems, and from retail site faceting to booking engines, there are many things that generate lots of dynamic URLs that in fact lead to one and the same page. By default, search engine bots will treat these URLs as separate pages. As a result, you may be both wasting your crawl budget and, potentially, creating duplicate content concerns. Setting up exclusions for dynamically generated URLs or, better yet, blocking them entirely so that no crawl budget is wasted on duplicate pages will always get a thumbs up from me and from crawlers.
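To get a feel for how many of your URLs are really the same page, you can strip out the parameters that don't change the content and see what collapses together. This is a minimal sketch using Python's `urllib.parse`. The parameter names in `IGNORED_PARAMS` are assumptions, so replace them with the ones your own platform generates.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed examples of parameters that do not change the page content.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "sort"}


def canonical_url(url):
    """Drop ignored parameters and sort the rest so equivalent URLs match."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS
    ]
    kept.sort()
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def group_duplicates(urls):
    """Map each canonical URL to the crawled variants that collapse into it."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonical_url(url)].append(url)
    return dict(groups)


# Hypothetical URLs taken from a crawl report.
CRAWLED = [
    "https://www.example.com/shoes?sort=price&colour=red&utm_source=mail",
    "https://www.example.com/shoes?colour=red&sessionid=abc",
    "https://www.example.com/shoes?colour=red",
    "https://www.example.com/shoes?sort=price",
]

duplicates = group_duplicates(CRAWLED)
for canonical, variants in duplicates.items():
    print(canonical, len(variants))
```

Any canonical URL with more than one variant behind it is crawl budget being spent several times on the same content.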
11. Build external links
Building external links has been an important part of SEO for as long as I can remember. Having links from other sites pointing to your content, especially deep links, sends crawlers from those sites to yours, giving you more crawl range and budget along with the usual link juice and authority passing. Link building, as difficult as it can be, is as rewarding as any SEO practice to date when done properly. For more on the topic check out the pain of link building here.
12. Avoid Redirect chains
Each time you redirect a URL it wastes a little of your crawl budget. When your website has long redirect chains, such as a large number of 301 and 302 redirects in a row, spiders such as Googlebot may drop off before they reach the destination page. This means that the final page may not be indexed. Best practice with redirects is to have as few as possible on your website, and no more than two in a row.
Extra tip! Avoid redirect loops, where a page redirects back to itself. This is a crawl budget waste of note.
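If you can export your redirect rules as a simple "from, to" list, a few lines of Python will flag both long chains and loops. The sketch below works on an in-memory mapping rather than live HTTP requests. The `REDIRECTS` data and the `trace_redirects` helper are invented for illustration, and the two-hop limit simply mirrors the rule of thumb above.

```python
def trace_redirects(redirects, start, max_hops=10):
    """Follow redirects from start; return (chain, loop_found)."""
    chain = [start]
    seen = {start}
    while chain[-1] in redirects:
        target = redirects[chain[-1]]
        chain.append(target)
        if target in seen:
            return chain, True  # we came back to a URL already visited
        seen.add(target)
        if len(chain) - 1 >= max_hops:
            break  # give up on very long chains, much like a crawler would
    return chain, False


# Hypothetical redirect rules: source URL -> destination URL.
REDIRECTS = {
    "/a": "/b",
    "/b": "/c",
    "/c": "/d",
    "/x": "/y",
    "/y": "/x",
    "/self": "/self",
}

MAX_ALLOWED_HOPS = 2  # the rule of thumb from the section above

for source in sorted(REDIRECTS):
    chain, loop_found = trace_redirects(REDIRECTS, source)
    hops = len(chain) - 1
    if loop_found:
        print("LOOP ", " -> ".join(chain))
    elif hops > MAX_ALLOWED_HOPS:
        print("CHAIN", " -> ".join(chain))
```

The usual fix for a flagged chain is to point the first URL straight at the final destination, and for a loop to decide which URL should actually serve the page.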