Netpeak Spider

Your personal SEO crawler

Buy Now – save up to 40%
Get Started – 14-day free trial
Netpeak Spider: Website Crawler

Platforms

Windows – Ready
Mac OS – Coming soon
Linux – Coming soon

Netpeak Spider is your personal SEO crawler that helps you run a fast, comprehensive technical audit of an entire website. This tool allows you to:

Check 50+ key on-page SEO parameters of crawled URLs
Spot 60+ website optimization issues
Analyze incoming and outgoing internal links
Find broken links and redirects
Avoid duplicate content: Pages, Titles, Meta Descriptions, H1 Headers, etc.
Consider indexation instructions (Robots.txt, Meta Robots, X-Robots-Tag, Canonical)
Calculate internal PageRank to improve website linking structure
Set custom rules to crawl either the entire website or a specific part of it
Save or export data to work with it whenever you want

Thousands of SEOs and webmasters all over the world use Netpeak Spider to perform everyday SEO tasks in the most efficient way. Try it free for 14 days!

Parameters (53) – Description
General
# – URL serial number in the results table.
☑ Required parameter
URL – URL (Uniform Resource Locator) is the global address of documents and other resources on the World Wide Web. Netpeak Spider highlights URLs in this column according to the severity of the SEO issues found on the crawled website. Note that you always see a user-friendly URL in this column.
☑ Required parameter
Status Code – Part of the HTTP response consisting of a numeric status code and its associated textual phrase.
☑ Required parameter
Content-Type – Content of the 'Content-Type' field in the HTTP response headers.
☑ Required parameter
Issues – Number of all issues (errors, warnings, and notices) found on the target URL.
☑ Required parameter
Level – Number of clicks from the initial crawled URL to the current one.
Response Time – Time (in milliseconds) taken for the website server to respond to a request. It is the same as Time To First Byte (TTFB).
Content Download Time – Time (in milliseconds) taken for the website server to return the HTML code of the page.
Last-Modified – Content of the 'Last-Modified' field in the HTTP response headers: indicates the date and time the file was last modified.
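
For illustration, here is a minimal sketch of how these general parameters could be collected for a single URL using Python's standard library. It is not Netpeak Spider's own code, and the URL is just an example:

```python
import time
import urllib.request

def fetch_general_parameters(url: str) -> dict:
    """Collect Status Code, Content-Type, Last-Modified, and timings for one URL."""
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=30) as response:
        # urlopen returns once the response headers arrive, so this
        # roughly approximates the Response Time (TTFB).
        ttfb_ms = (time.monotonic() - start) * 1000
        response.read()  # download the body to measure Content Download Time
        total_ms = (time.monotonic() - start) * 1000
        return {
            "URL": url,
            "Status Code": response.status,
            "Content-Type": response.headers.get("Content-Type"),
            "Last-Modified": response.headers.get("Last-Modified"),
            "Response Time (ms)": round(ttfb_ms),
            "Content Download Time (ms)": round(total_ms - ttfb_ms),
        }

print(fetch_general_parameters("https://example.com/"))
```
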
Indexation
Robots.txt – Accessibility of the URL according to the robots.txt file, if one exists.
Redirects – Number of redirects from the current URL: useful for determining chains of redirects.
• Reduces the crawling speed
Redirect Target URL – Target URL of a single redirect or a redirect chain, if one exists.
Canonical URL – Content of the Canonical directive in the HTTP response headers or of the <link rel="canonical" /> tag in the <head> section of the document.
Refresh – Content of the Refresh directive in the HTTP response headers or of the <meta http-equiv="refresh"> tag in the <head> section of the document.
X-Robots-Tag – Content of the 'X-Robots-Tag' field in the HTTP response headers.
Meta Robots – Content of the <meta name="robots" /> tag in the <head> section of the document.
Canonical Chains – Number of URLs in the canonical chain starting from the current page.
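
As a rough sketch of how these indexation signals can be read for one page (a simplified, regex-based approach, not the tool's actual implementation; the user agent name is invented):

```python
import re
import urllib.request
import urllib.robotparser

def indexation_signals(url: str, user_agent: str = "MySpider") -> dict:
    """Check robots.txt accessibility plus noindex/canonical signals for one URL."""
    origin = re.match(r"https?://[^/]+", url).group(0)
    robots = urllib.robotparser.RobotFileParser(origin + "/robots.txt")
    robots.read()

    with urllib.request.urlopen(url, timeout=30) as response:
        x_robots = response.headers.get("X-Robots-Tag", "")
        html = response.read().decode("utf-8", errors="replace")

    # Naive extraction; assumes the name/rel attribute precedes content/href.
    meta = re.search(r'<meta[^>]+name=["\']robots["\'][^>]*content=["\']([^"\']*)',
                     html, re.I)
    canonical = re.search(r'<link[^>]+rel=["\']canonical["\'][^>]*href=["\']([^"\']*)',
                          html, re.I)

    return {
        "Allowed by robots.txt": robots.can_fetch(user_agent, url),
        "X-Robots-Tag": x_robots,
        "Meta Robots": meta.group(1) if meta else "",
        "Canonical URL": canonical.group(1) if canonical else "",
    }
```
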
Head Tags
Title – Content of the <title> tag in the <head> section of the document.
Title Length – Number of characters (including spaces) in the <title> tag on the target URL.
Description – Content of the <meta name="description" /> tag in the <head> section of the document.
Description Length – Number of characters (including spaces) in the <meta name="description" /> tag on the target URL.
Keywords – Content of the <meta name="keywords" /> tag in the <head> section of the document.
Keywords Length – Number of characters (including spaces) in the <meta name="keywords" /> tag on the target URL.
URL Base Tag – Content of the <base> tag in the <head> section of the document.
Rel Next URL – Content of the <link rel="next" /> tag in the <head> section of the document.
Rel Prev URL – Content of the <link rel="prev" /> tag in the <head> section of the document.
AMP HTML – Indicates whether the target document is an AMP HTML page, marked by the ⚡ or amp attribute on the root element (<html ⚡> or <html amp>).
Link to AMP HTML – Content of the <link rel="amphtml" /> tag in the <head> section of the document.
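
To make the head-tag parameters concrete, here is a minimal sketch (using only Python's built-in HTML parser) that extracts a page's title and meta description and computes their lengths, in the spirit of the Title Length and Description Length parameters above:

```python
from html.parser import HTMLParser

class HeadTagParser(HTMLParser):
    """Collect <title> text and <meta name="description"> content from a page."""
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.description = ""

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and attrs.get("name", "").lower() == "description":
            self.description = attrs.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data

parser = HeadTagParser()
parser.feed("<head><title>Demo</title>"
            '<meta name="description" content="A demo page."></head>')
print(len(parser.title), len(parser.description))  # Title Length, Description Length
```
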
Content
h1 Value – Content of the first non-empty <h1> tag on the target URL.
h1 Length – Number of characters in the first non-empty <h1> tag on the target URL.
h1 Headers – Number of <h1> headers on the target URL.
• Reduces the crawling speed
h2 Headers – Number of <h2> headers on the target URL.
• Reduces the crawling speed
h3 Headers – Number of <h3> headers on the target URL.
• Reduces the crawling speed
h4 Headers – Number of <h4> headers on the target URL.
• Reduces the crawling speed
h5 Headers – Number of <h5> headers on the target URL.
• Reduces the crawling speed
h6 Headers – Number of <h6> headers on the target URL.
• Reduces the crawling speed
HTML Size – Number of characters in the <html> section of the target page, including HTML tags.
Content Size – Number of characters (including spaces) in the <body> section of the target page, excluding HTML tags.
Text/HTML Ratio – Percentage of text content on the target page, rounded to the nearest integer.
Characters – Number of characters (excluding spaces) in the <body> section of the target page, excluding HTML tags.
Words – Number of words in the <body> section of the target page.
Characters in <p> – Number of characters (excluding spaces) in <p></p> tags in the <body> section of the target page.
Words in <p> – Number of words in <p></p> tags in the <body> section of the target page.
Page Hash – Unique key of the entire page, calculated using the SHA-1 algorithm.
Page Body Hash – Unique key of the page's <body> section, calculated using the SHA-1 algorithm.
Images – Number of images found in <img> tags on the target page.
• Reduces the crawling speed
Content-Length – Content of the 'Content-Length' field in the HTTP response headers.
Content-Encoding – Content of the 'Content-Encoding' field in the HTTP response headers.
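
The hash and ratio parameters are straightforward to reproduce. A minimal sketch, assuming a crude regex-based text extraction (a real crawler would parse the DOM instead):

```python
import hashlib
import re

def content_metrics(html: str) -> dict:
    """Compute a SHA-1 page hash and a rough Text/HTML ratio for one page."""
    page_hash = hashlib.sha1(html.encode("utf-8")).hexdigest()
    # Strip scripts/styles and tags to approximate the visible text content.
    text = re.sub(r"<(script|style)[^>]*>.*?</\1>", "", html, flags=re.S | re.I)
    text = re.sub(r"<[^>]+>", "", text)
    ratio = round(100 * len(text.strip()) / max(len(html), 1))
    return {"Page Hash": page_hash, "Text/HTML Ratio (%)": ratio,
            "Words": len(text.split())}

print(content_metrics("<html><body><p>Hello crawler world</p></body></html>"))
```

Pages that share a Page Hash value are exactly what the 'Duplicate Pages' error below groups together.
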
Links
Incoming Links – All links TO the current URL from the crawled website.
• Significantly increases RAM usage
• Reduces the crawling speed
Outgoing Links – All links FROM the current URL.
• Reduces the crawling speed
Internal Links – Links from the current URL to other URLs of the crawled website.
• Reduces the crawling speed
External Links – Links from the current URL to other websites.
• Reduces the crawling speed
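
A short sketch of the internal/external distinction, assuming "internal" simply means "same host as the crawled site":

```python
from urllib.parse import urljoin, urlparse

def classify_links(page_url: str, hrefs: list[str]) -> dict:
    """Split a page's outgoing links into internal and external ones."""
    site_host = urlparse(page_url).netloc.lower()
    internal, external = [], []
    for href in hrefs:
        absolute = urljoin(page_url, href)       # resolve relative links
        host = urlparse(absolute).netloc.lower()
        (internal if host == site_host else external).append(absolute)
    return {"Internal Links": internal, "External Links": external}

print(classify_links("https://example.com/a",
                     ["/b", "https://example.com/c", "https://other.org/"]))
```
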
Special
Internal PageRank – Relative weight of the page, determined by the PageRank algorithm, which takes all main indexation instructions, link attributes, and link juice distribution into account. This parameter is calculated automatically when crawling is paused and after it has completed successfully, provided the number of URLs in the results table does not exceed 10,000. To see extended features and apply advanced settings for this parameter, go to 'Tools' → 'Internal PageRank Calculation'.
• 'Outgoing Links' is a required parameter for the internal PageRank calculation
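
For intuition, here is a simplified sketch of the iterative PageRank computation over an internal link graph. It ignores indexation instructions and nofollow attributes, which Netpeak Spider's calculation does take into account, and the graph is invented for the example:

```python
def internal_pagerank(links: dict[str, list[str]],
                      damping: float = 0.85,
                      iterations: int = 50) -> dict[str, float]:
    """Iterative PageRank over an internal link graph (simplified)."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1 - damping) / len(pages) for p in pages}
        for page, targets in links.items():
            if targets:  # distribute this page's rank across its outgoing links
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank

graph = {"/": ["/about", "/blog"], "/about": ["/"], "/blog": ["/", "/about"]}
print(internal_pagerank(graph))
```

In this simplified version, a page with no outgoing links simply loses its rank each iteration instead of passing it on, which illustrates why the 'PageRank: Dead End' error below treats such pages as disrupting link juice distribution.
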
Issues (64) – Description
Errors
PageRank: Dead End – Indicates URLs that were marked by the internal PageRank algorithm as dead ends: these pages contain incoming but no outgoing links, which creates an imbalance in the link juice distribution.
Duplicate Pages – Indicates all pages that have the same page hash value. URLs in this report are grouped by page hash.
Duplicate Body Content – Indicates all pages that have the same hash value of the <body> section. URLs in this report are grouped by page body hash.
Duplicate Titles – Indicates all pages with title tags that appear on more than one page of the crawled website. URLs in this report are grouped by title tag.
Missing or Empty Title – Indicates all pages without a title tag or with an empty one.
Duplicate Descriptions – Indicates all pages with meta description tags that appear on more than one page of the crawled website. URLs in this report are grouped by meta description tag.
Missing or Empty Description – Indicates all pages without a meta description tag or with an empty one.
4xx Error Pages: Client Error – Indicates all pages that return a 4xx HTTP status code. To view all broken links (pointing to 4xx error pages), filter the pages by this issue, click the 'Current Table Summary' button, and choose 'Incoming Links'.
Redirect to 4xx Error Page – Indicates all pages that redirect to 4xx error pages, such as 404 Not Found.
Endless Redirect – Indicates all pages that redirect to themselves and thereby generate an infinite redirect loop.
Max Redirections – Indicates all pages that redirect more than 4 times (by default). Note that you can change the maximum number of redirects in the 'Restriction' tab of the crawling settings.
Redirect Blocked by Robots.txt – Indicates pages that return a redirect to a URL blocked by robots.txt.
Bad URL Format Redirects – Indicates pages that return a redirect with a badly formatted URL in the HTTP response headers.
External 4xx-5xx Error Pages – Indicates all external URLs that return a 4xx-5xx status code. Note that crawling of external links must be enabled in the 'General' tab of the crawling settings for this issue to be detected.
Connection Error – Indicates all pages that failed to respond because of a connection error.
Bad URL Base Tag Format – Indicates pages that contain a <base> tag in an incorrect format. Note that relative links can't be used in this tag since they are not supported by search engine robots.
Max URL Length – Indicates all pages with more than 2,000 characters in the URL.
Missing Internal Links – Indicates all pages with no internal links. Note that such pages receive link juice but do not pass it on.
Links with Bad URL Format – Indicates pages that have internal links with a bad URL format. To view a complete report on such links, filter the pages with this issue, click the 'Current Table Summary' button, choose 'Outgoing Links', and set the appropriate filter (Include → URLs with issue → Bad URL Format).
Broken Images – Indicates images that return a 4xx-5xx status code. Note that the 'Images' content type must be checked in the 'General' tab of the crawling settings for this issue to be detected.
Canonical Chain Blocked by Robots.txt – Indicates pages that are part of canonical chains pointing to URLs blocked by robots.txt.
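
Several of these errors (Endless Redirect, Max Redirections, Redirect to 4xx Error Page) come from walking redirect chains hop by hop. Here is a minimal sketch of that walk with Python's standard library, offered as an illustration rather than the tool's actual logic:

```python
import urllib.error
import urllib.parse
import urllib.request

class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Surface 3xx responses as HTTPError instead of following them."""
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None

OPENER = urllib.request.build_opener(NoRedirect)

def redirect_chain(url: str, max_redirects: int = 4) -> list[str]:
    """Follow redirects one hop at a time; raise on loops and long chains."""
    chain = [url]
    for _ in range(max_redirects + 1):
        try:
            with OPENER.open(chain[-1], timeout=30):
                return chain  # 2xx reached: the chain ends here
        except urllib.error.HTTPError as err:
            if 300 <= err.code < 400 and "Location" in err.headers:
                next_url = urllib.parse.urljoin(chain[-1], err.headers["Location"])
                if next_url in chain:
                    raise RuntimeError(f"Endless Redirect: {chain + [next_url]}")
                chain.append(next_url)
            else:
                return chain  # e.g. a Redirect to 4xx Error Page
    raise RuntimeError(f"Max Redirections exceeded: {chain}")
```
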
Warnings
Multiple Titles – Indicates all pages with more than one title tag.
Multiple Descriptions – Indicates all pages with more than one meta description tag.
Missing or Empty h1 – Indicates all pages without an h1 header tag or with an empty one.
Multiple h1 – Indicates all pages with more than one h1 header tag.
Duplicate h1 – Indicates all pages with h1 header tags that appear on more than one page of the crawled website. URLs in this report are grouped by h1 header tag value.
Min Content Size – Indicates all pages with fewer than 500 characters in the <body> section (excluding HTML tags).
PageRank: Redirect – Indicates URLs that were marked by the internal PageRank algorithm as redirecting link juice: these can be pages that return a 3xx redirect or have canonical / refresh tags pointing to another URL.
3xx Redirected Pages – Indicates all pages that return a 3xx redirection status code.
Non-301 Redirects – Indicates all pages that return a redirection status code other than 301 (permanent redirect).
Redirect Chain – Indicates all pages that redirect more than once.
Refresh Redirected – Indicates all pages with a redirect in the Refresh directive of the HTTP response headers or in the <meta http-equiv="refresh"> tag in the <head> section of the document.
Canonical Chain – Indicates pages that are part of canonical chains. To view this information, open the additional 'Canonical Chains' table.
External Redirect – Indicates all pages that return a 3xx redirect to an external website that is not part of the crawled one.
Blocked by Robots.txt – Indicates all pages that are disallowed in the robots.txt file.
Blocked by Meta Robots – Indicates all pages that contain the <meta name="robots" content="noindex"> directive in the <head> section.
Blocked by X-Robots-Tag – Indicates all pages that contain the 'noindex' directive in the X-Robots-Tag field of the HTTP response headers.
Missing Images ALT Attributes – Indicates all pages that contain images without the alt attribute. To view the report, click the 'Current Table Summary' button, choose 'Images', and set the appropriate filter (Include → URLs with issue → Missing Images ALT Attributes).
Max Image Size – Indicates images whose size exceeds 100 KB. Note that the 'Images' box must be checked in the 'General' tab of the crawling settings for this issue to be detected.
5xx Error Pages: Server Error – Indicates all pages that return a 5xx HTTP status code.
Long Server Response Time – Indicates all pages with a response time of more than 500 ms.
Wrong AMP HTML Format – Indicates AMP HTML documents that do not conform to the standards of the AMP Project documentation. Note that there are at least 8 markup requirements for each AMP HTML page.
Other Failed URLs – Indicates all pages that failed to respond as a result of other, unknown errors.
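
As an example of how a warning like 'Missing Images ALT Attributes' can be detected, here is a small sketch with Python's built-in HTML parser (illustrative only, not the tool's implementation):

```python
from html.parser import HTMLParser

class MissingAltChecker(HTMLParser):
    """Collect <img> tags that have no (or an empty) alt attribute."""
    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attrs = dict(attrs)
            if not attrs.get("alt"):
                self.missing_alt.append(attrs.get("src", "(no src)"))

checker = MissingAltChecker()
checker.feed('<body><img src="a.png" alt="Logo"><img src="b.png"></body>')
print(checker.missing_alt)  # ['b.png'] would trigger the warning above
```
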
Notices
Duplicate Canonical URLs – Indicates all pages with canonical URLs that appear on more than one page of the crawled website. URLs in this report are grouped by canonical URL.
PageRank: Orphan – Indicates URLs that were marked by the internal PageRank algorithm as orphans: these pages have no incoming links.
Same Title and h1 – Indicates all pages that have identical title and h1 header tags.
Max Title Length – Indicates all pages with a title tag of more than 70 characters.
Short Title – Indicates all pages with a title tag of fewer than 10 characters.
Max Description Length – Indicates all pages with a meta description tag of more than 160 characters.
Short Description – Indicates all pages with a meta description tag of fewer than 50 characters.
Max h1 Length – Indicates all pages with an h1 header tag of more than 65 characters.
Max HTML Size – Indicates all pages with more than 200,000 characters in the <html> section (including HTML tags).
Max Content Size – Indicates all pages with more than 50,000 characters in the <body> section (excluding HTML tags).
Min Text/HTML Ratio – Indicates all pages with a text to HTML ratio of less than 10 percent.
Nofollowed by Meta Robots – Indicates all pages that contain the <meta name="robots" content="nofollow"> directive in the <head> section.
Nofollowed by X-Robots-Tag – Indicates all pages that contain the 'nofollow' directive in the X-Robots-Tag field of the HTTP response headers.
Missing or Empty Canonical Tag – Indicates all pages without a canonical URL or with an empty one.
Different Page URL and Canonical URL – Indicates all pages where the canonical URL differs from the page URL.
Non-https Protocol – Indicates the list of URLs that do not use the https protocol.
Max Internal Links – Indicates all pages with more than 100 internal links.
Max External Links – Indicates all pages with more than 10 external links.
Internal Nofollowed Links – Indicates all pages that contain internal links with the rel="nofollow" attribute.
External Nofollowed Links – Indicates all pages that contain external links with the rel="nofollow" attribute.
Missing or Empty Robots.txt File – Indicates all URLs related to a missing or empty robots.txt file. Note that different subdomains and protocols (http / https) can have different robots.txt files. This issue may also occur when robots.txt redirects to another URL or returns a status code other than 200 OK.
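
Most notices are simple threshold checks over the parameters collected during crawling. A compact sketch, with thresholds copied from the list above and page field names invented for the example:

```python
# Hypothetical rules mirroring some of the notices above.
NOTICE_RULES = {
    "Max Title Length": lambda p: len(p["title"]) > 70,
    "Short Title": lambda p: 0 < len(p["title"]) < 10,
    "Max Description Length": lambda p: len(p["description"]) > 160,
    "Short Description": lambda p: 0 < len(p["description"]) < 50,
    "Max h1 Length": lambda p: len(p["h1"]) > 65,
    "Non-https Protocol": lambda p: not p["url"].startswith("https://"),
}

def notices_for(page: dict) -> list[str]:
    """Return the names of all notices triggered by one crawled page."""
    return [name for name, triggered in NOTICE_RULES.items() if triggered(page)]

page = {"url": "http://example.com/", "title": "Demo", "description": "", "h1": "Demo"}
print(notices_for(page))  # ['Short Title', 'Non-https Protocol']
```
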

Plans and Pricing

Netpeak Spider

Choose the number of licenses; a volume discount is applied automatically to multi-license orders. Four subscription periods are available, each billed as a single payment:

  • 12 months – 30% off (best value)
  • 6 months – 20% off
  • 3 months – 10% off
  • 1 month – no long-term discount

For each plan, checkout shows the equivalent monthly price per license, the initial price, the long-term, volume, and loyalty discounts, and the total amount saved. A loyalty discount, when available, is applied automatically.

Buy Now
New to Netpeak Software?
Sign up and get a 14-day FREE trial of Netpeak Spider and Netpeak Checker. No credit card required.
Get Started
Have a question?
Drop us a line if you need more than 50 licenses or want to discuss a personal subscription plan.
Contact Us

Frequently Asked Questions

– What is Netpeak Spider?
– Netpeak Spider is a desktop tool that crawls your website like a search engine robot and detects key SEO issues that affect the website's visibility in SERPs.

– What does the free trial include?
– The free trial grants you full access to all features of Netpeak Spider for 14 days. Note that no credit card information is required.

– How do I start using Netpeak Spider?
– To start using Netpeak Spider, you need to:

  1. Create a Netpeak Software Account.
  2. Download and install Netpeak Launcher.
  3. Log in to Netpeak Launcher and install Netpeak Spider.

– Is registration free?
– Yes, it is. After a quick registration of a Netpeak Software Account, you get easy access to all Netpeak Software products. The same account is used for Netpeak Launcher, the knowledge base, and the User Control Panel on the website.

– What is Netpeak Launcher?
– Netpeak Launcher is a desktop application that helps you manage all Netpeak Software products. You can download it from the User Control Panel or by following this link.

– Can I use Netpeak Spider on several devices?
– You can use Netpeak Spider on several devices, as long as they are not running the software at the same time. If you want to use it on multiple devices simultaneously, you need to buy separate licenses.

To change or manage the devices used with Netpeak Spider, visit the 'Device Management' section in the User Control Panel.