How to Scrape Prices from Online Stores with Netpeak Spider
Data scraping is a labor-intensive and time-consuming procedure, but you can’t do any extensive competitive analysis without it. You can simplify this process with automated data scraping in Netpeak Spider.
In this post, we will show you how to collect, filter, and export prices from competitors’ online stores.
However, use cases for the scraping feature are not limited to prices. You can scrape many other types of data, except data that is protected in the website code.
You can scrape websites in the free version of the Netpeak Spider crawler, which has no time limit and no cap on the number of analyzed URLs. Other basic features are also available in the Freemium version of the program.
To get access to free Netpeak Spider, you just need to sign up, download, and launch the program 😉
P.S. Right after signup, you’ll also have the opportunity to try all paid functionality, then compare all our plans and pick the one that suits you best.
1. How Scraping Works
Scraping is a Netpeak Spider feature that allows you to extract the data you need from web pages. There are four types of search:
- ‘Includes’ (‘only search’)
- ‘RegExp’ (‘search and extraction’)
- ‘CSS Selector’ (‘search and extraction’)
- ‘XPath’ (‘search and extraction’)
Which type of search to use depends on the website’s structure.
You can find detailed information about each type of search in the feature overview.
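To make the difference between the search types concrete, here is a minimal Python sketch of what each one does under the hood (the sample HTML, the `price` class name, and the price pattern are assumptions for illustration; Netpeak Spider performs these searches for you):

```python
import re
import xml.etree.ElementTree as ET

html = '<div><h1>Air Jordan 1</h1><span class="price">$170.00</span></div>'

# 'Includes' — only checks whether the fragment occurs on the page
includes_hit = '$170.00' in html

# 'RegExp' — searches and extracts by text pattern
regexp_hits = re.findall(r'\$\d+\.\d{2}', html)

# 'XPath' — searches and extracts by document structure
# (the CSS selector 'span.price' targets the same element)
root = ET.fromstring(html)
xpath_hits = [el.text for el in root.findall(".//span[@class='price']")]

print(includes_hit)   # True
print(regexp_hits)    # ['$170.00']
print(xpath_hits)     # ['$170.00']
```

A CSS selector and an XPath expression can target the same element; which is more convenient depends on the page markup.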
2. How to Find the Element to Scrape
Before scraping, you should define the data you need. In our case, it’s prices.
To get the source code of this element, you need to:
- Open a product page.
- Find the price and hover over it.
- Right-click it and choose ‘Inspect’ in the context menu.
- Inspect the price element in the opened window (the page will highlight the element your cursor is hovering over).
- Right-click it and select ‘Copy’ → ‘Copy XPath’.
Usually, it’s enough to copy the XPath from a single product page to scrape prices from the entire website or a category. Note that this works only if all product pages share the same template.
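The reason one copied XPath is enough can be sketched as follows: pages built from the same template share the same structure, so a single selector extracts the price from every product page (the URLs and markup below are invented examples):

```python
import xml.etree.ElementTree as ET

# two product pages built from the same template (invented examples)
pages = {
    'shop.example.com/product-a': '<div><span class="price">$170.00</span></div>',
    'shop.example.com/product-b': '<div><span class="price">$210.00</span></div>',
}

# one XPath, copied once from a single page, reused everywhere
PRICE_XPATH = ".//span[@class='price']"

prices = {url: ET.fromstring(html).findtext(PRICE_XPATH)
          for url, html in pages.items()}
print(prices)
```

If a category uses a different template, you would copy a second XPath for it and add it as another scraping parameter.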
3. Setting and Launching the Crawling
3.1. How to Set Up Scraping
You can perform the scraping in a few simple steps:
- Go to ‘Scraping’ in the ‘Settings’ tab.
- Tick the ‘Enable HTML scraping’ checkbox.
- Choose the ‘XPath’ type and paste the expression you’ve copied.
- For data extraction, select ‘Inner text’. If you need to scrape more data, you can add up to 100 simultaneous scraping parameters.
- Name each scraping parameter so you don’t get lost in the final results. Here we have only one parameter, responsible for price extraction.
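Conceptually, ‘Inner text’ extraction concatenates all text inside the matched element, and each named parameter is just a labeled selector. A small Python sketch of both ideas (the parameter names and markup are assumptions for illustration):

```python
import xml.etree.ElementTree as ET

html = ('<div><span class="price"><b>$170</b>.00</span>'
        '<span class="title">Air Jordan 1</span></div>')
root = ET.fromstring(html)

def inner_text(el):
    # concatenates all text nodes inside the element,
    # like the 'Inner text' extraction option
    return ''.join(el.itertext())

# two named scraping parameters, each a labeled selector
parameters = {
    'Price': ".//span[@class='price']",
    'Title': ".//span[@class='title']",
}
results = {name: [inner_text(el) for el in root.findall(xp)]
           for name, xp in parameters.items()}
print(results)
```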
3.2. Rules Setting and Scraping Launch
If you need information about all products on the website, use the scraping parameters from section 3.1 and launch a crawl of the entire website with default Netpeak Spider settings.
If the website’s URLs look like site.com/category/product or site.com/category-product, you can set custom crawling rules.
Let’s say we are interested in Air Jordan brand products, whose URLs begin with shop.bdgastore.com/collections/air-jordan/.
‘Rules’ allow you to see only specific pages in the crawling results: you will see only pages that match your custom rules (in our case, all URLs must begin with shop.bdgastore.com/collections/air-jordan/).
There are two ways to do it: include pages of a particular type, or exclude the categories you’re not interested in. Here’s how:
- Choose ‘Include’ or ‘Exclude’ option.
- Choose the matching type (in our case, it’s ‘Begins with’).
- Enter the common element of all URLs you are interested in (or have absolutely no interest in) into the field below.
- Apply settings and launch crawling.
As a result, Netpeak Spider will show you only the pages that meet your crawling rules.
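A ‘Begins with’ include rule is conceptually a simple prefix check on every crawled URL, as in this sketch (the URL list is invented for illustration):

```python
# invented crawl results
crawled = [
    'shop.bdgastore.com/collections/air-jordan/product-1',
    'shop.bdgastore.com/collections/nike/product-2',
    'shop.bdgastore.com/collections/air-jordan/product-3',
]
PREFIX = 'shop.bdgastore.com/collections/air-jordan/'

# 'Include' + 'Begins with': keep only URLs starting with the prefix
included = [url for url in crawled if url.startswith(PREFIX)]
print(included)
```

An ‘Exclude’ rule is the same check inverted: drop the URLs that match instead of keeping them.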
In the rightmost column of the main results table, you’ll find the scraping results: the number of matches of the expression on each URL (1 product = 1 match). Detailed scraping results are also available on the ‘Scraping’ tab in the right panel of the main window.
4. How to Export Scraping Results
To get a summary table that contains only URLs and price information:
- On the ‘Scraping’ tab in the sidebar, click ‘Show all results’.
- Filter the obtained data if necessary.
- Hit the ‘Export’ button in the upper left corner for further analysis. You can export the table either to your computer or to Google Sheets.
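The exported file is a plain table of URLs and extracted values. As a rough sketch, this is the kind of CSV you end up with (the rows are invented; Netpeak Spider produces the export for you):

```python
import csv
import io

# invented rows: one URL and one extracted price per product page
rows = [
    ('shop.bdgastore.com/collections/air-jordan/product-1', '$170.00'),
    ('shop.bdgastore.com/collections/air-jordan/product-3', '$210.00'),
]

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(['URL', 'Price'])  # header row
writer.writerows(rows)
print(buf.getvalue())
```

A table in this shape drops straight into a spreadsheet or a price-comparison script.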
Check out the plans, subscribe to the one that suits you best, and get inspiring insights!
To scrape data from competitors’ websites, you need to perform several consecutive steps:
- Determine the element to scrape (in our case, it’s the price).
- Copy XPath.
- Set scraping parameters.
- Set crawling rules.
- Launch crawling.
- Export obtained data.
Using this method, you can scrape not only prices but many other types of data, so the scraping feature can be useful for digital and content marketers, SEO specialists, webmasters, sales managers, and more.
Do you use this feature in your working process? If so, for what purpose?