We're all afraid of broken links on our websites, but how do you find and fix them fast? What kind of redirect should you use for an old HTTP version? How do you find all 4xx and 5xx pages? Let me answer all these questions.
The easiest place to start is a status code check. The next step is to learn the classes and the individual codes. There are 5 classes in total, and honestly, you only need to remember a handful of codes without Google's help.
1. Briefly on What an HTTP Status Code Is
Every time you click a link or enter a URL into the address bar, you send a request to a server. The server does some black-box magic to build a response, and the first part of that response is the HTTP status code.
The three digits and a short phrase tell a user (browser), crawler, or search robot how the server reacted to a request for the specified URL. For example, the 200 OK response code sends a clear message: 'Everything is OK. You're in the right place'.
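To make the "first part of the response" concrete, here is a minimal Python sketch that splits a status line into its pieces. The function name `parse_status_line` and the sample lines are assumptions made up for illustration, not part of any tool mentioned in this article.

```python
from typing import Tuple


def parse_status_line(line: str) -> Tuple[str, int, str]:
    """Split a status line such as 'HTTP/1.1 200 OK' into (version, code, phrase)."""
    parts = line.strip().split(" ", 2)
    if len(parts) < 2:
        raise ValueError(f"not a status line: {line!r}")
    version, code = parts[0], int(parts[1])
    # The short phrase is optional and may contain spaces ("Not Found").
    phrase = parts[2] if len(parts) == 3 else ""
    return version, code, phrase


print(parse_status_line("HTTP/1.1 200 OK"))
print(parse_status_line("HTTP/1.1 404 Not Found"))
```

The phrase is only a human-readable hint, so the sketch treats the number as the part that matters.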
All codes are divided into 5 classes which differ by the first digit:
- 1xx – the informational class, needed by a client (browser) while data is being transferred or processed. These are mostly service codes that we rarely see.
- 2xx – codes that report successful processing of a request.
- 3xx – a three at the beginning means a redirect from one address to another. By the way, SEO newbies are often confused about how to use redirects, so we will get back to this a little later.
- 4xx – status codes that report a client-side error. The reason is briefly described right after the 3 digits.
- 5xx – also error codes, but here the problem is on the server side. Reasons vary, but just like in the previous class, the reason is stated right after the number. For example, it can point to a high load on the server or other internal problems.
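Since the class depends only on the first digit, the mapping is easy to sketch in Python. The function name `status_class` and the class labels are illustrative assumptions.

```python
def status_class(code: int) -> str:
    """Return the class of an HTTP status code, based on its first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return classes[code // 100]


for code in (200, 301, 404, 503):
    print(code, status_class(code))
```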
2. How to Check the Status Code?
There are several ways to check the status code:
- Developer tools in your browser (F12 + 'Network' tab)
- Tons of extensions
- Online services
- Different SEO tools
But since I'm a Netpeak Software employee, I will show you how to do that using Netpeak Spider as your own HTTP status code checker and even more.
Now let's define your task:
- Is it checking status codes of all pages on your website?
- Or a bulk response code check of the list of URLs?
2.1. Checking Status Codes of All Pages on Your Website
In the main window of the program, enter your website homepage URL and click on the 'Start' button. As soon as crawling is complete, you will have all status codes in the corresponding table column.
Pages with 4xx and 5xx HTTP status codes will be gathered into special issue reports. When you click on any issue in the sidebar, the program builds a report only for pages with that error.
After that, you can see all pages that link to these 4xx and 5xx pages. To do so, open the context menu with a right click and choose 'Incoming links'. Then replace these links with working ones to get rid of the nightmare of broken links on your website. By the way, at the same time you can:
- Recrawl URLs
- Open URLs in external services (for example, Serpstat, Ahrefs, Google PageSpeed)
- Try other reports
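The idea behind an incoming-links report can be sketched in a few lines of Python. This is not how Netpeak Spider works internally. It is a simplified illustration that assumes you already have two hypothetical inputs: `statuses` (URL to status code) and `links` (page to the URLs it links to).

```python
from typing import Dict, Iterable, List


def broken_link_sources(statuses: Dict[str, int],
                        links: Dict[str, Iterable[str]]) -> Dict[str, List[str]]:
    """Map each 4xx/5xx URL to the sorted list of pages that link to it."""
    broken = {url for url, code in statuses.items() if 400 <= code <= 599}
    report: Dict[str, set] = {}
    for source, targets in links.items():
        for target in targets:
            if target in broken:
                report.setdefault(target, set()).add(source)
    return {url: sorted(sources) for url, sources in report.items()}


# Hypothetical crawl results.
statuses = {"/": 200, "/blog": 200, "/old-post": 404, "/api": 500}
links = {"/": ["/blog", "/old-post"], "/blog": ["/old-post", "/api", "/"]}
print(broken_link_sources(statuses, links))
```

Each key in the result is a broken page, and each value is the list of pages where the link has to be replaced.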
To get access to Netpeak Spider, you just need to sign up, download, and launch the program 😉
P.S. Right after signup, you'll also have the opportunity to try all paid functionality and then compare all our plans and pick the most suitable for you.
2.2. A Bulk Response Code Check of the List of URLs
If you need to run a bulk status code check of different websites, advertisements, or just some specific pages, you can upload them to the program in 3 ways:
- Copy from clipboard
- Upload from a document (.xlsx, .csv, .txt, .xml)
- Upload from a Sitemap
After that, click 'Start' and the tool will scan only this list of URLs.
By the way, here is one more feature! If you need only status codes and nothing more, untick all other parameters in the sidebar. But if you need a more comprehensive report, choose the necessary parameters and run the crawl.
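If you are curious what a bare-bones bulk check looks like outside of a crawler, here is a rough Python sketch using only the standard library. The names `fetch_status` and `bulk_check` are assumptions, and the sketch sends HEAD requests without following redirects, so a redirecting URL reports its own 3xx code. Some servers answer HEAD differently from GET, so treat the results as a first approximation.

```python
import http.client
from typing import Callable, Dict, Iterable, Optional
from urllib.parse import urlsplit


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Send one HEAD request and return the raw status code (no redirect following)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()


def bulk_check(urls: Iterable[str],
               fetch: Callable[[str], int] = fetch_status) -> Dict[str, Optional[int]]:
    """Check every URL once; None means the server gave no response at all."""
    results: Dict[str, Optional[int]] = {}
    for url in urls:
        url = url.strip()
        if not url or url in results:
            continue  # skip blank lines and duplicates
        try:
            results[url] = fetch(url)
        except (OSError, http.client.HTTPException):
            results[url] = None
    return results
```

A call such as `bulk_check(open("urls.txt"))` would check one URL per line, where `urls.txt` is a hypothetical file name. The `fetch` parameter exists so that the request logic can be swapped out, for example in tests.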
3. What Does xxx HTTP Response Code Mean?
Let's talk about the most common HTTP response codes to understand their purpose.
200 OK
This response tells you that the request completed successfully. It means that the page has been found and its content has been sent to the client.
301 Moved Permanently
The requested document has been permanently moved to another URL. This HTTP code is the most frequent topic of discussion among SEO beginners. But it's actually not that complex: if you want to permanently send visitors from one page to another, the old page should respond with this code. The old pages may be duplicates, mirrors, deleted pages that still get traffic, or other skeletons in your closet that you don't want to talk about.
After crawling such pages, search engines will eventually merge them with the redirect target URLs and pass on the page weight. And of course, keep an eye on the links on your website. If you crawl your site regularly and find links that respond with a 301 code, replace them with the redirect targets. That way you save search robots' time.
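Replacing links that answer with 301 comes down to finding where each redirect finally lands. A minimal Python sketch, assuming a hypothetical `redirects` mapping (old URL to redirect target) collected from a crawl:

```python
from typing import Dict


def final_target(url: str, redirects: Dict[str, str], max_hops: int = 10) -> str:
    """Follow a chain of redirects and return the URL a link should point to."""
    seen = set()
    while url in redirects:
        if url in seen or len(seen) >= max_hops:
            raise ValueError(f"redirect loop or chain too long at {url}")
        seen.add(url)
        url = redirects[url]
    return url


# Hypothetical redirect map: /old-page -> /interim -> /new-page
redirects = {"/old-page": "/interim", "/interim": "/new-page"}
print(final_target("/old-page", redirects))
```

Pointing internal links straight at the final URL also removes redirect chains, which cost robots an extra request per hop.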
302 Found
Tells a client that the page has been found but is temporarily located elsewhere. Search engines usually don't remove such pages from the index, as they may start working again. Previously, this redirect was used in cases like:
- Website development or redesign.
- Goods are out of stock but the page still gets traffic, so it's reasonable to redirect it to a catalog or similar pages.
But now we have the HTTP/1.1 protocol with the 303 See Other and 307 Temporary Redirect response codes as more precise replacements for 302 Found.
303 See Other
The best case for this code is leading a user to a slightly different page that may fulfill the search intent, though not completely. A typical example is sending a user to a result page after a form submission. The client follows this redirect with the GET method, whatever method the original request used. It means that the follow-up request can only retrieve data from the specified resource and can't update or create anything.
304 Not Modified
One of my favorite responses. At first sight, it may look like a redirect, but in reality it tells a client that the page hasn't changed since the last visit, so the saved copy is still good. For search robots it can pay off even more than 200 OK.
We all know about crawl budget. The 304 status code is one way to keep search robots from wasting time on pages you haven't changed since their last visit, so they can focus on crawling new pages. The server returns this code in response to a request that carries the If-Modified-Since HTTP header.
I want to underline that websites with fewer than 10 thousand pages hardly need this code. But when you work with big marketplaces, it's a must-have ;)
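On the server side, the decision between 200 and 304 is a date comparison. Here is a minimal Python sketch. The function name `conditional_status` is an assumption, and `last_modified` is expected to be a timezone-aware datetime that your application stores for the page.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def conditional_status(last_modified: datetime,
                       if_modified_since: Optional[str]) -> int:
    """Return 304 if the page is unchanged since the client's copy, else 200."""
    if not if_modified_since:
        return 200  # no conditional header, send the full page
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date, ignore the header
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds before comparing.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200


page_changed = datetime(2015, 10, 21, 7, 28, 0, tzinfo=timezone.utc)
print(conditional_status(page_changed, "Wed, 21 Oct 2015 07:28:00 GMT"))
print(conditional_status(page_changed, "Tue, 20 Oct 2015 07:28:00 GMT"))
```

In practice most web servers and frameworks handle this for static files already, so check your stack before writing it by hand.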
307 Temporary Redirect
I advise you guys to use this code when you temporarily need to lead a user to another page but still want to keep the POST method in the request. Unlike 302 and 303, the 307 code guarantees that the method and the request body stay unchanged, so the data is sent again to the new address.
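The difference between the redirect codes in how they treat the request method can be summed up in a short Python sketch. The function name `method_after_redirect` is an assumption, and the 301/302 branch reflects what browsers commonly do with POST rather than a strict requirement.

```python
def method_after_redirect(status: int, method: str) -> str:
    """Return the HTTP method a client typically uses for the redirected request."""
    method = method.upper()
    if status == 303:
        # 303 always turns the follow-up into a retrieval request.
        return "HEAD" if method == "HEAD" else "GET"
    if status == 307:
        # 307 forbids changing the method, so POST stays POST.
        return method
    if status in (301, 302):
        # Clients are allowed to rewrite POST to GET here, and most browsers do.
        return "GET" if method == "POST" else method
    raise ValueError(f"{status} is not one of the redirect codes covered here")


for status in (301, 302, 303, 307):
    print(status, method_after_redirect(status, "POST"))
```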
Let's move on to the codes that are responsible for client errors.
401 Unauthorized
This status code tells you that a user hasn't authenticated yet or that the credentials are wrong.
403 Forbidden
Have you ever seen the 'Access denied' message in movies when hackers try to break into a system? That's pretty much the same. The server received the request but refused to complete it because of access restrictions. For example, a user wants to get system files from the root folder, which only administrators can reach.
404 Not Found
The requested address hasn't been found. It's a must-have code on your website for non-existing pages. Without it, search engines can index a bunch of pages that don't really exist, and getting them out of the SERP becomes a long-standing headache. By the way, everybody likes creative 404s, so do not forget to add some cat memes there ;)
410 Gone
When somebody tries to reach an old deleted page, it's better to respond with the 410 code, provided you're sure that you will never have a similar page to redirect to. In this case, a search robot may mark the page as deleted and never come back to it. Eventually, it will be gone from the SERP.
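The choice between 200, 301, 410, and 404 for a requested path can be sketched as a simple lookup. Everything here is hypothetical: the function name `status_for` and the three collections (`live`, `moved`, `gone`) stand in for whatever your CMS or server config actually stores.

```python
from typing import Dict, Optional, Set, Tuple


def status_for(path: str, live: Set[str], moved: Dict[str, str],
               gone: Set[str]) -> Tuple[int, Optional[str]]:
    """Return (status code, redirect target or None) for a requested path."""
    if path in live:
        return 200, None
    if path in moved:
        return 301, moved[path]  # a similar page exists, so redirect to it
    if path in gone:
        return 410, None         # deleted on purpose, nothing to redirect to
    return 404, None             # never existed, or we simply don't know it


live = {"/", "/catalog"}
moved = {"/old-catalog": "/catalog"}
gone = {"/promo-2015"}
for path in ("/", "/old-catalog", "/promo-2015", "/typo"):
    print(path, status_for(path, live, moved, gone))
```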
429 Too Many Requests
We see this code in our crawler every day. When a server detects excessively high activity from one user over a period of time, it responds with the 429 code. If you want to continue crawling, try setting fewer simultaneous threads or a longer delay between your requests.
Respect your server, it's almost as busy as Google – everybody asks it about something.
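If you write your own checker, a "longer delay" can be computed rather than guessed. The Python sketch below is one possible approach and not a description of how any particular crawler behaves: it honors the standard Retry-After header when the server sends it as a number of seconds, and otherwise doubles the pause after each failed attempt. The name `retry_delay` and the default values are assumptions.

```python
from typing import Optional


def retry_delay(attempt: int, retry_after: Optional[str] = None,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retrying after a 429 response.

    attempt is 0 for the first retry, 1 for the second, and so on.
    """
    if retry_after is not None and retry_after.strip().isdigit():
        # The server told us how long to wait, so take its word for it.
        return float(retry_after.strip())
    # Otherwise back off exponentially, but never wait longer than `cap`.
    return min(base * 2 ** attempt, cap)


for attempt in range(4):
    print(attempt, retry_delay(attempt))
```

Retry-After can also carry an HTTP date instead of seconds. This sketch ignores that form and falls back to the exponential delay.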
In the end, let's talk about the most common server-side error codes.
500 Internal Server Error
This code may stand for a lot of different problems on your server. It means that something unexpected happened and no more specific code describes it, so the reason can't be easily detected.
503 Service Unavailable
It's like a vacation for your server. Usually, it's caused by temporary maintenance or a high load.
To Make a Long Story Short
To sum up, let's briefly run through the things I've told you today:
- The three digits and a short phrase give a user (browser), crawler, or search robot an understanding of the server's reaction.
- All codes are divided into 5 classes which differ by the first digit:
- 1xx – informational class
- 2xx – successful responses
- 3xx – redirections
- 4xx – client error codes
- 5xx – server-side issues
- You can check status codes in many ways, but the best one is Netpeak Spider. Don't forget to sign up to get a free 7-day trial =)
Wish you fewer errors and more good clients, guys ;)