How To Configure 301 Redirects From Old URLs To New Ones When Upgrading Or Migrating a Website
A website is a dynamic system that’s constantly changing and improving as it adapts to new conditions. Sometimes, when updating an Internet resource (say, if you’re moving to another CMS), the URLs of all of the site’s landing pages change. This leads to a problem: all the old URLs return a 404 server response (page not found). Example:
- old URL: https://website.com/category/subcategory/page-1.html
- new URL: https://website.com/page-635.html
Instead of the requested page, visitors see a standard apology page from the website administration, similar to this one:
As a result, the resource quickly loses its positions in search results, and along with them, its traffic, which can be catastrophic. To avoid this, you must immediately redirect users from the old pages to the new versions.
If your website is small (several dozen pages), you can compare old and new pages manually, but this won’t be feasible for larger resources with hundreds or thousands (or even tens of thousands) of pages.
In this article, we’ll analyze how you can use Netpeak Software to automatically compare pages before and after updating a website in order to set up page-by-page 301 redirects. This method is great for very large websites with hundreds of thousands or even millions of pages.
Step 1: Collect the Pages Of the Old And New Sites
First, you must collect all the important pages of the old and new sites. There are two ways to do this:
Method 1: Get the list of URLs from XML sitemap files
This method is suitable if the old or new website has an XML sitemap listing the desired pages. To get this list, go to ‘List of URLs’ → select ‘Download from Sitemap’ → enter the URL of the XML sitemap → click ‘Start’ → click ‘Transfer’:
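For readers who want to see what reading a sitemap involves, here is a minimal Python sketch using only the standard library. It is not how Netpeak Spider works internally; the function name `urls_from_sitemap` and the sample data (including the second URL) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol for <urlset>, <url> and <loc>.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text):
    """Return every <loc> value found in a sitemap document, in order."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sitemap content; a real one would be downloaded from the site.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://website.com/page-635.html</loc></url>
  <url><loc>https://website.com/page-636.html</loc></url>
</urlset>"""

for url in urls_from_sitemap(SAMPLE_SITEMAP):
    print(url)
```

The same function also collects the `<loc>` entries of a sitemap index file, since those use the same namespace; in that case each returned URL points to another sitemap rather than to a page.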
Method 2: Collect the list of URLs using a website crawler
This method will take a little longer, since the program will need to scan the entire website, but it’s a good option if the old or new website doesn’t have an XML sitemap listing the pages you need. To use the web crawler, enter the URL of the website and click the ‘Start’ button:
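The core of any crawler is extracting the internal links from each page it visits. The sketch below shows only that single step with the Python standard library; it does not download anything, and the names `LinkCollector` and `internal_links` are hypothetical, not part of Netpeak Spider.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects absolute, same-host links from the <a href> tags of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links and drop the #fragment part.
        absolute = urljoin(self.base_url, href).split("#")[0]
        # Keep only links that stay on the same host as the page itself.
        if urlparse(absolute).netloc == urlparse(self.base_url).netloc:
            self.links.add(absolute)


def internal_links(html, base_url):
    """Return the sorted internal links found in one HTML document."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return sorted(collector.links)
```

A full crawler would repeat this for every newly discovered URL until no unvisited internal links remain, which is why this method takes longer than reading a sitemap.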
Step 2: Extract Information For Mapping
To match old and new URLs, you need to highlight the elements that will be the same for both versions of the pages. As a rule, these elements are:
- the page’s H1 heading
- unique product identifiers (SKU)
You can also match other blocks that are unique within the website, but the same for the pages being compared.
To add H1 headings for the list of pages, check the ‘H1 content’ option and click ‘Start’:
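As an illustration of what collecting the H1 means, the following Python sketch pulls the text of the first H1 out of a page’s HTML using the standard library. It is an assumption-laden example, not the tool’s implementation: `H1Extractor` and `first_h1` are hypothetical names, and whitespace is collapsed so that headings compare cleanly later.

```python
from html.parser import HTMLParser


class H1Extractor(HTMLParser):
    """Captures the text of the first <h1> element, including nested tags."""

    def __init__(self):
        super().__init__()
        self._in_h1 = False
        self._parts = []
        self.h1 = None

    def handle_starttag(self, tag, attrs):
        # Only start capturing if no H1 has been completed yet.
        if tag == "h1" and self.h1 is None:
            self._in_h1 = True

    def handle_data(self, data):
        if self._in_h1:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "h1" and self._in_h1:
            self._in_h1 = False
            # Collapse runs of whitespace into single spaces.
            self.h1 = " ".join("".join(self._parts).split())


def first_h1(html):
    """Return the text of the first H1 in the document, or None if absent."""
    parser = H1Extractor()
    parser.feed(html)
    return parser.h1
```

Matching on H1 text only works if the heading is unique per page and was carried over unchanged to the new site, which is why an SKU is usually the safer key when one exists.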
For SKU scraping or other unique page elements, you should use the Netpeak Spider parser: open ‘Settings’ → select ‘Scraping’ → enter the name of the scraping parameter in the ‘Name’ field → select the ‘XPath’ scraping type → fill in the field with the XPath value of the block where the site displays the SKU or other identifier → select ‘Inner text’ as the type of information → save the settings by clicking ‘OK’ → mark the scraping name in the right column with parameters → click ‘Start’:
To copy the XPath of the desired block, open the browser inspector panel (Ctrl + Shift + C) → select the block in the code → right-click → select the ‘Copy’ item → click ‘Copy XPath’:
To export scraped data, open the ‘Export’ item → select ‘XL (extra large) reports from database’ → click ‘Scraping data and all results in single file (XL)’:
Step 3: Map the URLs
To match the URLs, we need a Google Sheets spreadsheet with a list of URLs and SKUs (or some other shared parameter) and the VLOOKUP function. Insert the data in the table as follows:
- column A: unique identifiers (SKUs) or H1 titles of old website pages
- column B: unique identifiers (SKUs) or H1 titles of new website pages
- column C: new website URLs
- column D: old website URLs
Enter this formula in column E:
Interpreting the results:
- The URLs in column D are the pages from which you need to set up 301 redirects.
- The URLs in column E are the pages to which you need to configure 301 redirects.
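The lookup described above can also be sketched outside a spreadsheet. With the column layout given, the formula would plausibly look something like `=VLOOKUP(A2, B:C, 2, FALSE)`, but that exact form is an assumption, as are the function name `map_old_to_new` and the sample SKUs below. The Python version does the same job: for each old page it finds the new URL that shares the same identifier. Normalizing case and whitespace is an extra step added in this sketch, not something the article prescribes.

```python
def map_old_to_new(old_pages, new_pages):
    """Pair old URLs with new URLs that share the same key (SKU or H1).

    old_pages and new_pages are iterables of (key, url) tuples.
    Returns (redirects, unmatched): a list of (old_url, new_url) pairs and
    a list of old URLs for which no new page was found.
    """

    def normalize(key):
        # Assumption: ignore case and extra whitespace when comparing keys.
        return " ".join(key.split()).casefold()

    new_by_key = {}
    for key, url in new_pages:
        # Keep the first URL seen for a key, as an exact-match lookup would.
        new_by_key.setdefault(normalize(key), url)

    redirects, unmatched = [], []
    for key, old_url in old_pages:
        new_url = new_by_key.get(normalize(key))
        if new_url is None:
            unmatched.append(old_url)
        else:
            redirects.append((old_url, new_url))
    return redirects, unmatched


# Hypothetical data in the shape of columns A/D (old) and B/C (new).
old = [("SKU-1", "https://website.com/category/subcategory/page-1.html"),
       ("SKU-9", "https://website.com/category/page-9.html")]
new = [("sku-1", "https://website.com/page-635.html")]

redirects, unmatched = map_old_to_new(old, new)
print(redirects)
print(unmatched)
```

Old URLs that end up in `unmatched` correspond to lookup errors in the sheet: they have no counterpart on the new site and need a manual decision.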
Learn how to set up page-by-page 301 redirects in the article ‘Redirects: Which Redirect Code to Use and How to Set It Up’.
Important! All redirects from old URLs to new ones must be implemented with a 301 code.
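Once the old-to-new pairs exist, turning them into server rules is mechanical. The sketch below assumes an Apache server with mod_alias and emits one `Redirect 301` directive per pair; the server type is an assumption (the article does not name one), and `apache_redirect_lines` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def apache_redirect_lines(redirects):
    """Build Apache mod_alias directives from (old_url, new_url) pairs.

    The Redirect directive matches on the URL path only, so the scheme,
    host and query string of the old URL are dropped. Paths containing
    spaces would need quoting, which this sketch does not handle.
    """
    lines = []
    for old_url, new_url in redirects:
        old_path = urlsplit(old_url).path or "/"
        lines.append(f"Redirect 301 {old_path} {new_url}")
    return lines


pairs = [("https://website.com/category/subcategory/page-1.html",
          "https://website.com/page-635.html")]
print("\n".join(apache_redirect_lines(pairs)))
```

For other servers the directive syntax differs, but the input stays the same list of pairs produced by the mapping step.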
We don’t recommend changing the URLs of website pages unless absolutely necessary. However, if URLs have changed as a result of updating the resource, you must configure 301 redirects from the old URLs to the new ones. Do this at the same time as the new version is released or, if a delay is unavoidable, shortly afterwards. This is the only way to minimize the potential drop in positions and traffic. If you wait too long to set up the redirects, the resource’s audience may shrink substantially, and it will take a long time to win it back.