Netpeak Spider

Your Personal SEO Crawler

Buy Now
save up to 40%
Get Started
14-day free trial
Netpeak Spider: Website Crawler

Platforms

Windows
Ready
Mac OS
Coming soon...
Linux
Coming soon...

Netpeak Spider is your personal SEO crawler that helps you perform a fast, comprehensive technical audit of an entire website. This tool allows you to:

Check 50+ key On-Page SEO parameters of crawled URLs
Spot 60+ website optimization issues
Analyze incoming and outgoing internal links
Find broken links and redirects
Avoid duplicate content: pages, titles, meta descriptions, H1 headers, etc.
Consider indexation instructions (robots.txt, Meta Robots, X-Robots-Tag, canonical)
Calculate internal PageRank to improve your website's linking structure (a simplified sketch follows this list)
Set custom rules to crawl either the entire website or a certain part of it
Save or export data to work with it whenever you want
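
To give a feel for the internal PageRank idea mentioned above, here is a minimal power-iteration sketch in Python. It is not Netpeak Spider's exact formula (the program additionally accounts for indexation instructions and link attributes); the link graph and the damping factor of 0.85 are illustrative assumptions.

# Minimal internal PageRank sketch (assumed damping factor 0.85, 20 iterations).
# The link graph below is hypothetical; Netpeak Spider builds it from crawled links.
links = {
    "/": ["/category/", "/about/"],
    "/category/": ["/", "/category/product/"],
    "/category/product/": ["/", "/category/"],
    "/about/": ["/"],
}

def internal_pagerank(links, d=0.85, iterations=20):
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - d) / n for page in pages}
        for page, outgoing in links.items():
            if not outgoing:  # dead end: spread its weight evenly across all pages
                for target in pages:
                    new_rank[target] += d * rank[page] / n
                continue
            share = d * rank[page] / len(outgoing)
            for target in outgoing:
                new_rank[target] += share
        rank = new_rank
    return rank

for page, weight in sorted(internal_pagerank(links).items(), key=lambda x: -x[1]):
    print(f"{page}: {weight:.3f}")

Pages with few incoming links end up with a low relative weight, which is the signal the 'Internal PageRank' reports are built on.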

Thousands of SEOs and webmasters all over the world use Netpeak Spider to perform everyday SEO tasks in the most efficient way. Try it free for 14 days!

Parameters (54)

General
# – URL sequence number in the results table.
URL – URL (Uniform Resource Locator) is the unified address of a document on the World Wide Web. In this column, the maximum severity of SEO issues found on a page is highlighted with the appropriate color. Note that you always see decoded URLs in the program interface.
Status Code – Part of the first line of the HTTP response headers: consists of the status code number and its description. If necessary, special status codes are added to this field after the '&' symbol, showing indexation instructions (disallowed, canonicalized, refresh redirected, and noindex / nofollow).
Content-Type – Taken from the 'Content-Type' field in the HTTP response headers or the <meta http-equiv="content-type" /> tag in the <head> section. Tells the user what the content type of the returned document actually is, for example text/html, image/png, or application/pdf.
Issues – Number of all issues (errors, warnings, and notices) found on the page.
Response Time – Time (in milliseconds) taken by the website server to respond to a user's or visitor's request. It is the same as Time To First Byte (TTFB).
Content Download Time – Time (in milliseconds) taken by the website server to return the HTML code of the page.
Depth – Number of clicks from the initial URL to the current one. '0' is the depth of the initial URL; '1' stands for pages reached by following links from the initial URL for the first time, and so on.
URL Depth – Number of segments in the inspected page's URL. Unlike the 'Depth' parameter, URL Depth is static and does not depend on the initial URL. For instance, the URL depth of the https://example.com/category/ page is 1, of the https://example.com/category/product/ page is 2, and so forth.
Last-Modified – Content of the 'Last-Modified' field in the HTTP response headers: indicates the date and time the file was last modified.

Indexation
Allowed in robots.txt – Whether the URL is allowed by the robots.txt file, if one exists. TRUE means the URL is allowed to be indexed; FALSE means it is disallowed in the robots.txt file.
Meta Robots – Content of the <meta name="robots" /> tag in the <head> section of the document.
Canonical URL – Content of the canonical directive in the HTTP response header or the <link rel="canonical" /> tag in the <head> section of the document.
Redirects – Number of redirects from the current URL: useful for determining redirect chains.
Target redirect URL – Target URL of a single redirect or a redirect chain, if one exists.
X-Robots-Tag – Content of the 'X-Robots-Tag' field in the HTTP response header. It contains indexation instructions and is equivalent to Meta Robots in the <head> section.
Refresh – Content of the refresh directive in the HTTP response header or the <meta http-equiv="refresh"> tag in the <head> section of the document.
Canonicals – Number of URLs in the canonical chain starting from the current page. The check is performed automatically when crawling is paused and after it has been successfully completed. The check can be canceled and then switched back on via the 'Analysis' menu → 'Checking canonical chains'.

Links
Internal PageRank – Relative weight of the page determined by the PageRank algorithm. It considers all main indexation instructions, link attributes, and link juice distribution. This parameter is calculated automatically when crawling is paused and after it has been successfully completed. The calculation can be canceled and then switched back on via the 'Analysis' menu → 'Calculate internal PageRank'. To see extended features and apply advanced settings for this parameter, go to 'Tools' → 'Internal PageRank calculation'.
Incoming Links – All links to the current page from the crawled URLs. The calculation is performed automatically when crawling is paused and after it has been successfully completed. The check can be canceled and then switched back on via the 'Analysis' menu → 'Count incoming links'.
Outgoing Links – All links from the current URL.
Internal Links – Links from the current URL to other URLs of the crawled website.
External Links – Links from the current URL to other websites.

Head Tags
Title – Content of the <title> tag in the <head> section of the document. It is the name of the webpage and one of the most important tags in SEO.
Title Length – Number of characters (including spaces) in the <title> tag on the target URL.
Description – Content of the <meta name="description" /> tag in the <head> section of the document. It is usually displayed in the SERP for a query relevant to the current page, thus affecting CTR.
Description Length – Number of characters (including spaces) in the <meta name="description" /> tag on the target URL.
Base Tag – Content of the <base> tag in the <head> section of the document.
Keywords – Content of the <meta name="keywords" /> tag in the <head> section of the document.
Keywords Length – Number of characters (including spaces) in the <meta name="keywords" /> tag on the target URL.
Rel Next URL – Content of the <link rel="next" /> tag in the <head> section of the document.
Rel Prev URL – Content of the <link rel="prev" /> tag in the <head> section of the document.
AMP HTML – Indicates whether the target document is an AMP HTML page, determined by the presence of the <html ⚡> or <html amp> tag in the document.
Link to AMP HTML – Content of the <link rel="amphtml" /> tag in the <head> section of the document.

Content
Images – Number of images found in <img> tags on the target page. Along with the number of images, the program gathers alt attributes and the initial URL source view.
Content-Length – Content of the 'Content-Length' field in the HTTP response headers. Indicates the size of the document in bytes.
Content-Encoding – Content of the 'Content-Encoding' field in the HTTP response headers. Indicates the encodings applied to the document.
H1 Content – Content of the first non-empty <h1> tag on the target URL.
H1 Length – Number of characters (including spaces) in the first non-empty <h1> tag on the target URL.
H1 Headers – Number of <h1> headers on the target URL.
H2 Headers – Number of <h2> headers on the target URL.
H3 Headers – Number of <h3> headers on the target URL.
H4 Headers – Number of <h4> headers on the target URL.
H5 Headers – Number of <h5> headers on the target URL.
H6 Headers – Number of <h6> headers on the target URL.
HTML Size – Number of characters in the <html> section of the target page, including HTML tags.
Content Size – Number of characters (including spaces) in the <body> section of the target page, excluding HTML tags. Simply put, the size of the text on the page including spaces.
Text/HTML Ratio – Percentage of plain text in the whole content size (the 'Content Size' parameter relative to 'HTML Size'). A simplified approximation of this and a few related parameters is sketched after this table.
Characters – Number of characters (excluding spaces) in the <body> section of the target page, excluding HTML tags. Simply put, the size of the text on the page excluding spaces.
Words – Number of words in the <body> section of the document.
Characters in <p> – Number of characters (excluding spaces) within <p></p> tags in the <body> section of the target page.
Words in <p> – Number of words within <p></p> tags in the <body> section of the target page.
Page Hash – Unique key for the content of the entire page: allows you to find duplicates by this parameter.
Text Hash – Unique key for the text content in the <body> section: allows you to find duplicates by this parameter.
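
As an illustration of how a few of the parameters above can be approximated, here is a small Python sketch that computes 'URL Depth' from path segments and rough analogues of 'HTML Size', 'Content Size', 'Text/HTML Ratio', and 'Page Hash' for a fetched page. It is an assumption-level approximation, not Netpeak Spider's own implementation: the hash function (SHA-256), the example URLs, and the use of the third-party requests and beautifulsoup4 packages are all choices made for this sketch.

# Rough approximations of the 'URL Depth', 'Text/HTML Ratio', and 'Page Hash' parameters.
# Not Netpeak Spider's exact logic; parsing and hashing details will differ.
import hashlib
from urllib.parse import urlparse

import requests                   # third-party: pip install requests
from bs4 import BeautifulSoup     # third-party: pip install beautifulsoup4

def url_depth(url):
    # Number of non-empty path segments, e.g. /category/product/ -> 2
    return len([segment for segment in urlparse(url).path.split("/") if segment])

def page_metrics(url):
    html = requests.get(url, timeout=30).text
    body = BeautifulSoup(html, "html.parser").body
    text = body.get_text() if body else ""
    return {
        "HTML Size": len(html),        # characters including tags
        "Content Size": len(text),     # text characters including spaces, tags excluded
        "Text/HTML Ratio": 100.0 * len(text) / len(html) if html else 0.0,
        "Page Hash": hashlib.sha256(html.encode("utf-8")).hexdigest(),  # duplicate-detection key
    }

print(url_depth("https://example.com/category/product/"))  # 2
print(page_metrics("https://example.com/"))
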
Issues (62)

Errors
Broken Pages – Indicates unavailable URLs (e.g. due to a connection failure, an exceeded response timeout, etc.) or ones returning 4xx and higher HTTP status codes. To view a special report precisely on broken links, press the 'Issue report' button above the main table.
4xx Error Pages: Client Error – Indicates URLs that return a 4xx HTTP status code.
5xx Error Pages: Server Error – Indicates URLs that return a 5xx HTTP status code.
Links with Bad URL Format – Indicates pages that contain internal links with a bad URL format. To view a special report on this issue, press the 'Issue report' button above the main table.
Duplicate Pages – Indicates compliant pages that are duplicated by the entire HTML code of the page. URLs in this report are grouped by the 'Page Hash' parameter.
Duplicate Text – Indicates all compliant pages that have the same text content in the <body> section. URLs in this report are grouped by the 'Text Hash' parameter.
Duplicate Titles – Indicates all compliant pages with duplicate <title> tag content. URLs in this report are grouped by the 'Title' parameter.
Duplicate Descriptions – Indicates all compliant pages with duplicate <meta name="description" /> tag content. URLs in this report are grouped by the 'Description' parameter.
Duplicate H1 – Indicates all compliant pages with duplicate <h1> heading tag content. URLs in this report are grouped by the 'H1 Content' parameter.
Missing or Empty Title – Indicates all compliant pages without the <title> tag or with an empty one.
Missing or Empty Description – Indicates all compliant pages without the <meta name="description" /> tag or with an empty one.
Broken Redirect – Indicates addresses of pages that redirect to unavailable URLs (e.g. due to a connection failure or timeout) or to URLs returning 4xx and higher HTTP status codes.
Redirects with Bad URL Format – Indicates addresses of pages that return a redirect with a bad URL format in the HTTP response headers.
Endless Redirect – Indicates addresses of pages that ultimately redirect to themselves, thereby creating an infinite redirect loop.
Max Redirections – Indicates addresses of pages that redirect more than 4 times (by default). Note that you can change the default value on the 'Restrictions' tab of crawling settings.
Redirect Blocked by Robots.txt – Indicates addresses of pages that return a redirect to a URL blocked by robots.txt. Note that the report will contain each URL from the redirect chain pointing to the blocked address. To view a special report on this issue, press the 'Issue report' button above the main table.
Canonical Blocked by Robots.txt – Indicates pages that contain the <link rel="canonical" /> tag or the 'Link: rel="canonical"' HTTP response header pointing to URLs blocked by robots.txt. Note that if the target URL starts a canonical chain leading to a blocked URL, the report will contain each URL from the canonical chain. To view a special report on this issue, press the 'Issue report' button above the main table.
Canonical Chain – Indicates pages starting a canonical chain (when the canonical URL points to a page that, in turn, links to another URL in <link rel="canonical" /> or the 'Link: rel="canonical"' HTTP response header) or taking part in one. To view detailed information, open the additional 'Canonicals' table in the 'Database' menu.
Broken Images – Indicates unavailable images (e.g. due to a connection failure or a timeout), as well as images returning 4xx and higher HTTP status codes. Note that image checking must be enabled on the 'General' tab of crawling settings to detect this issue. To view a special report on this issue, press the 'Issue report' button above the main table.
PageRank: Dead End – Indicates HTML pages marked by the internal PageRank algorithm as 'dead ends': pages that have incoming links but no outgoing links, or whose outgoing links are blocked by crawling instructions.
Missing Internal Links – Indicates HTML pages that have incoming links but do not contain internal outgoing links.
Bad AMP HTML Format – Indicates AMP HTML documents that do not meet the AMP Project documentation standards. Note that there are at least eight markup requirements for each AMP HTML page.

Warnings
Long Server Response Time – Indicates addresses of pages with TTFB (time to first byte) exceeding 500 ms (by default). Note that you can change the default value on the 'Restrictions' tab of crawling settings.
Missing or Empty H1 – Indicates compliant pages without the <h1> heading tag or with an empty one.
Min Content Size – Indicates compliant pages with fewer than 500 characters (by default) in the <body> section (excluding HTML tags). Note that you can change the default value on the 'Restrictions' tab of crawling settings.
Images Without Alt Attributes – Indicates compliant pages that contain images without an alt attribute or with an empty one. To view a special report on this issue, press the 'Issue report' button above the main table.
Max Image Size – Indicates addresses of images whose size exceeds 100 kB (determined by the 'Content-Length' HTTP response header). Note that the 'Check images' option must be enabled on the 'General' tab of crawling settings to detect this issue, and that you can change the default value on the 'Restrictions' tab.
3xx Redirected Pages – Indicates URLs that return a 3xx redirection status code.
Redirect Chain – Indicates URLs that redirect more than once. A rough check of this and the 'Blocked by Robots.txt' issue is sketched after this table.
Refresh Redirected – Indicates addresses of pages that redirect to another URL using the refresh directive in the HTTP response header or the <meta http-equiv="refresh"> tag in the <head> section of a document.
External Redirect – Indicates internal URLs that return a 3xx redirect to an external website that is not part of the analyzed one.
PageRank: Redirect – Indicates URLs marked by the internal PageRank algorithm as redirecting link weight. These can be page addresses returning a 3xx redirect or having canonical / refresh instructions that point to another URL.
Multiple Titles – Indicates compliant pages with more than one <title> tag in the <head> HTML section.
Multiple Descriptions – Indicates compliant pages with more than one <meta name="description" /> tag in the <head> HTML section.
Blocked by Robots.txt – Indicates URLs disallowed in the robots.txt file.
Blocked by Meta Robots – Indicates pages disallowed from indexing by the 'noindex' or 'none' directive in the <meta name="robots" /> or <meta name="[bot name]" /> tag in the <head> section, where [bot name] is the name of a certain search robot.
Blocked by X-Robots-Tag – Indicates URLs that contain the 'noindex' or 'none' directive in the X-Robots-Tag HTTP header.

Notices
Non-HTTPS Protocol – Indicates URLs without the secure HTTPS protocol.
Percent-Encoded URLs – Indicates pages that contain percent-encoded (non-ASCII) characters in the URL. For instance, the URL https://example.com/例 is encoded as https://example.com/%E4%BE%8B.
Same Title and H1 – Indicates all pages that have identical <title> and <h1> heading tag content.
Short Title – Indicates compliant pages that have fewer than 10 characters (by default) in the <title> tag. Note that you can change the default value on the 'Restrictions' tab of crawling settings.
Max Title Length – Indicates compliant pages with the <title> tag exceeding 70 characters (by default). Note that you can change the default value on the 'Restrictions' tab of crawling settings.
Short Description – Indicates compliant pages that have fewer than 50 characters (by default) in the <meta name="description" /> tag. Note that you can change the default value on the 'Restrictions' tab of crawling settings.
Max Description Length – Indicates compliant pages with the <meta name="description" /> tag exceeding 320 characters (by default). Note that you can change the default value on the 'Restrictions' tab of crawling settings.
Multiple H1 – Indicates compliant pages with more than one <h1> heading tag.
Max H1 Length – Indicates compliant pages with the <h1> heading tag exceeding 65 characters (by default). Note that you can change the default value on the 'Restrictions' tab of crawling settings.
Max HTML Size – Indicates compliant pages with more than 200K characters (by default) in the <html> section (including HTML tags). Note that you can change the default value on the 'Restrictions' tab of crawling settings.
Max Content Size – Indicates compliant pages with more than 50K characters (by default) in the <body> section (excluding HTML tags). Note that you can change the default value on the 'Restrictions' tab of crawling settings.
Min Text/HTML Ratio – Indicates compliant pages with less than a 10% ratio (by default) of text (the 'Content Size' parameter) to HTML (the 'HTML Size' parameter). Note that you can change the default value on the 'Restrictions' tab of crawling settings.
Canonicalized Pages – Indicates canonicalized pages where the URL in the <link rel="canonical" /> tag or the 'Link: rel="canonical"' HTTP response header differs from the page URL.
Identical Canonical URLs – Indicates all pages with identical canonical URLs in the <link rel="canonical" /> tag in the <head> section or the 'Link: rel="canonical"' HTTP response header. URLs in this report are grouped by the 'Canonical URL' parameter.
Missing or Empty Robots.txt File – Indicates compliant URLs that belong to a host with an empty or missing robots.txt file. Note that different hosts (subdomains or http/https protocols) may have different robots.txt files.
Nofollowed by Meta Robots – Indicates HTML pages that contain a 'nofollow' or 'none' directive in the <meta name="robots" /> or <meta name="[bot name]" /> tag in the <head> section, where [bot name] is the name of a certain search robot.
Nofollowed by X-Robots-Tag – Indicates HTML pages that contain a 'nofollow' or 'none' directive in the X-Robots-Tag field of the HTTP response header.
PageRank: Orphan – Indicates URLs marked by the internal PageRank algorithm as inaccessible, meaning the algorithm hasn't found any incoming links to these pages.
Such links may appear:
– during website crawling with crawling and indexing instructions disabled (robots.txt, canonical, refresh, X-Robots-Tag, Meta Robots, the rel="nofollow" link attribute): note that with these instructions disabled, Netpeak Spider does not crawl the site the same way search robots do, yet the PageRank algorithm always considers them, so some links found during crawling may be inaccessible to it;
– during crawling of a list of URLs, when the links are not connected to each other.
PageRank: Missing Outgoing Links – Indicates addresses of pages with no outgoing links found after calculating internal PageRank. This usually happens when the outgoing links on a page have not been crawled yet.
Max Internal Links – Indicates pages with more than 100 outgoing internal links (by default). Note that you can change the default value on the 'Restrictions' tab of crawling settings.
Max External Links – Indicates compliant pages containing more than 10 external outgoing links (by default). We have set 10 as the average number of external links on the majority of websites: approximately 5 links to social media and a couple of external links to other sites. Note that you can change the default value on the 'Restrictions' tab of crawling settings.
Internal Nofollow Links – Indicates compliant pages that contain outgoing internal links with the rel="nofollow" attribute. To view a special report on this issue, press the 'Issue report' button above the main table.
External Nofollow Links – Indicates compliant pages that contain outgoing external links with the rel="nofollow" attribute. To view a special report on this issue, press the 'Issue report' button above the main table.
Bad Base Tag Format – Indicates pages that contain a <base> tag in a bad format.
Max URL Length – Indicates pages with more than 2000 characters in the URL (by default). Note that you can change the default value on the 'Restrictions' tab of crawling settings.
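
Two of the checks above, 'Blocked by Robots.txt' and 'Redirect Chain', can be reproduced in rough form with the Python standard library and the third-party requests package. This is a hedged sketch of the general idea, not Netpeak Spider's own code; the user agent and URL below are placeholders.

# Rough re-creation of the 'Blocked by Robots.txt' and 'Redirect Chain' checks.
# Illustrative only; Netpeak Spider's own rules and reports are richer.
import requests                              # third-party: pip install requests
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser

USER_AGENT = "MyCrawler/1.0"                 # placeholder user agent
url = "https://example.com/some-page/"       # placeholder URL

# 'Blocked by Robots.txt': is this URL disallowed for our user agent?
robots = RobotFileParser()
robots.set_url(urljoin(url, "/robots.txt"))
robots.read()
print("Allowed in robots.txt:", robots.can_fetch(USER_AGENT, url))

# 'Redirect Chain': the URL redirects more than once.
response = requests.get(url, headers={"User-Agent": USER_AGENT},
                        allow_redirects=True, timeout=30)
redirects = len(response.history)            # each hop is recorded in .history
print("Redirects:", redirects)
print("Target redirect URL:", response.url)
if redirects > 1:
    print("Issue: Redirect Chain (more than 1 redirect)")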

Plans and Pricing

Netpeak Spider
Pick the number of licenses in the pricing calculator; volume and loyalty discounts are applied automatically.
  • best value
    20% off
    12 months
    Priced per month for each license and billed as one payment; long-term, volume, and loyalty discounts are deducted from the initial price.
    Buy Now
  • 1 month
    Priced per month for each license and billed as one payment; volume and loyalty discounts are deducted from the initial price.
    Buy Now
New to Netpeak Software?
Sign up and get a 14-day FREE trial of Netpeak Spider and Netpeak Checker. No credit card required.
Get Started
Have a question?
Drop us a line if you need more than 50 licenses or want to discuss a personal subscription plan.
Contact Us

Frequently Asked Questions

– Netpeak Spider is a desktop tool that crawls your website like a search engine robot and detects key SEO issues that influence the website's visibility in SERPs.
– The free trial grants you full access to all the features of Netpeak Spider for 14 days. Note that no credit card information is required.

– To start using Netpeak Spider, you need to:

  1. Create a Netpeak Software Account.
  2. Download Netpeak Launcher and install it.
  3. Log in to Netpeak Launcher and install Netpeak Spider.
– Yes, it is. After a quick registration of a Netpeak Software Account, you get easy access to all Netpeak Software products. This account is shared across Netpeak Launcher, the knowledge base, and the User Control Panel on the website.
– Netpeak Launcher is a desktop application that helps you manage all Netpeak Software products. You can download it from the User Control Panel or by following this link.

– You can use Netpeak Spider on several devices, as long as they are not running the program at the same time. If you wish to use the software on multiple devices simultaneously, you need to buy separate licenses.

To change the devices used for running Netpeak Spider, please visit the 'Device Management' section in the User Control Panel.