RANKNIBBLER

// FREE ON-PAGE SEO CHECKER

What Is Robots.txt?

Robots.txt is a text file at the root of your website (e.g. yoursite.com/robots.txt) that tells search engine crawlers which pages they are allowed or not allowed to access. It is part of the Robots Exclusion Protocol and is one of the first files a crawler checks when visiting your site.

How Robots.txt Works

When Googlebot arrives at your site, it first requests /robots.txt. If the file exists, the crawler follows the rules inside it. If it does not exist, the crawler assumes it can access everything.
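
This lookup can be sketched with Python's standard-library robots.txt parser. A real crawler would fetch the live file over HTTP; here the rules are parsed inline so the example is self-contained:

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
# A real crawler would fetch the live file instead:
#   rp.set_url("https://yoursite.com/robots.txt"); rp.read()
# read() treats a 404 ("no robots.txt") as "no rules: crawl anything".
rp.parse([])  # simulate a missing or empty robots.txt
print(rp.can_fetch("Googlebot", "/any/page"))  # True

# Now parse an actual rule set.
rp.parse(["User-agent: *", "Disallow: /admin/"])
print(rp.can_fetch("Googlebot", "/admin/settings"))  # False
```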

Important: robots.txt controls crawling, not indexing. A page blocked by robots.txt can still appear in search results if other pages link to it. To prevent indexing, use a noindex meta tag instead (you can verify this with the robots directives checker).
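
For reference, the noindex directive goes in the page's HTML head, and the page must remain crawlable for it to work:

```html
<!-- On the page you want kept out of search results. The page must NOT
     be blocked in robots.txt, or crawlers will never see this tag. -->
<meta name="robots" content="noindex">
```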

Common Robots.txt Directives

Directive     Example                                  Meaning
User-agent    User-agent: *                            Applies to all crawlers
Disallow      Disallow: /admin/                        Do not crawl this path
Allow         Allow: /admin/public/                    Override a disallow for this path
Sitemap       Sitemap: https://site.com/sitemap.xml    Location of the sitemap
Crawl-delay   Crawl-delay: 10                          Wait 10 seconds between requests (not used by Google)
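
The directives above can be exercised with the same stdlib parser. One implementation note: urllib.robotparser applies rules in file order (first match wins), so the more specific Allow line is placed before the Disallow it overrides; Google instead applies the longest matching rule regardless of order.

```python
from urllib.robotparser import RobotFileParser

# Rules built from the table above. Allow precedes Disallow because
# urllib.robotparser uses first-match ordering, unlike Google's
# longest-match precedence.
rules = """\
User-agent: *
Allow: /admin/public/
Disallow: /admin/
Sitemap: https://site.com/sitemap.xml
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("*", "/admin/secret"))       # False
print(rp.can_fetch("*", "/admin/public/page"))  # True
print(rp.site_maps())  # ['https://site.com/sitemap.xml'] (Python 3.8+)
```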

Example Robots.txt

User-agent: *
Allow: /
Disallow: /admin/
Disallow: /checkout/
Disallow: /search?

Sitemap: https://www.example.com/sitemap.xml
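
To see how Google-style precedence plays out on this file, here is a simplified sketch of the longest-match rule (the most specific matching path wins, and Allow wins ties). It ignores user-agent grouping and the * and $ wildcards, which this example does not need:

```python
EXAMPLE = """\
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /checkout/
Disallow: /search?
"""

def is_allowed(robots_txt: str, path: str) -> bool:
    """Longest matching rule wins; Allow wins ties; no match = allowed."""
    best = (-1, True)  # (rule length, allowed)
    for line in robots_txt.splitlines():
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field not in ("allow", "disallow") or not value:
            continue  # skip User-agent, Sitemap, comments, blanks
        if path.startswith(value):
            length, allow = len(value), field == "allow"
            if length > best[0] or (length == best[0] and allow):
                best = (length, allow)
    return best[1]

print(is_allowed(EXAMPLE, "/blog/post"))       # True  (Allow: / matches)
print(is_allowed(EXAMPLE, "/admin/users"))     # False (Disallow: /admin/ is longer)
print(is_allowed(EXAMPLE, "/search?q=shoes"))  # False
```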

Common Mistakes

Frequent robots.txt errors include:

- Using robots.txt to hide pages from search results (blocked URLs can still be indexed if other sites link to them; use noindex instead)
- Blocking CSS or JavaScript files that Google needs to render the page
- Forgetting that paths are case-sensitive (/Admin/ and /admin/ are different rules)
- Leaving a staging-era Disallow: / in place after launch, blocking the entire site
- Placing the file anywhere other than the site root

The RankNibbler site audit checks your robots.txt for sitemap references when discovering your pages.

Check your site now: Run a free audit on the RankNibbler homepage to see how your page scores across 30+ SEO checks.

Last updated: March 2026