Robots.txt Viewer — Check Any Website
Every website can include a robots.txt file that tells search engine crawlers like Googlebot which pages they are allowed to visit. By reading this file you can find out which sections of a site are intentionally kept away from crawlers. This is useful for SEO audits, competitive research, and checking your own site's crawler settings. Enter any domain name below and this tool will fetch and display its robots.txt file immediately.
Frequently Asked Questions
What is a robots.txt file?
A robots.txt file is placed in a website's root directory to tell search engine crawlers which pages or sections they should not visit. It is a voluntary standard and most major search engines like Google and Bing respect it.
What does Disallow mean in robots.txt?
Disallow tells a crawler not to visit a specific URL path or directory. For example, "Disallow: /admin/" asks crawlers to stay out of the admin section (it does not stop indexing by itself). A Disallow line with an empty path value means nothing is blocked for that user agent.
Does robots.txt prevent a page from appearing in Google?
Not always. Blocking a page in robots.txt prevents Google from crawling it, but if other sites link to it, Google may still show the URL in results without a description. To fully remove a page, use a noindex meta tag (and make sure the page is not blocked in robots.txt, so Google can see the tag) or the Removals tool in Google Search Console.
What happens if a site has no robots.txt?
If no robots.txt exists, search engine crawlers assume all public pages are allowed. The absence of a robots.txt is not a problem or an error. It simply means no crawling restrictions are in place.
What is the correct syntax for a robots.txt file?
A robots.txt file uses User-agent lines (which bot the rule applies to) followed by Disallow or Allow lines (which paths to block or permit). Use User-agent: * to apply to all bots. Disallow: / blocks the entire site. Disallow: /private/ blocks only that directory. Allow: /public/ can re-permit a path under a blocked parent. Comments start with #. Rule groups are conventionally separated by a blank line, though most modern parsers simply start a new group at each User-agent line.
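Putting these directives together, a minimal robots.txt file (with hypothetical paths, not taken from any real site) might look like this:

```
# Block all bots from the private area, but keep one public file reachable
User-agent: *
Allow: /private/index.html
Disallow: /admin/
Disallow: /private/

# Give a specific bot its own rules
User-agent: Bingbot
Crawl-delay: 5

Sitemap: https://example.com/sitemap.xml
```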
Does robots.txt guarantee a page will not be indexed?
No. Robots.txt is a request, not a technical block. Well-behaved crawlers like Googlebot respect it, but malicious bots and scrapers ignore it entirely. A disallowed page can still appear in search results if other sites link to it — Google may index the URL without crawling the content. To truly prevent indexing, use a noindex meta tag or HTTP header on the page itself.
Can robots.txt be used to hide sensitive content?
No. Robots.txt is publicly readable — listing a private directory in Disallow: /secret/ actually tells anyone who reads the file that the directory exists. For sensitive content, use proper authentication, server-side access control, or password protection. Never rely on robots.txt as a security measure for content you do not want the public to access.
How It Works
This tool fetches the robots.txt file from a domain using the URL you enter, then displays and parses the rules. The fetch goes through a server-side proxy to avoid CORS restrictions that would block a direct browser request. The parser identifies User-agent groups, Disallow rules, Allow overrides, Crawl-delay directives, and the Sitemap declaration.
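A server-side fetch of this kind can be sketched in a few lines of Python. The `robots_url` helper and its input normalization are assumptions for illustration, not this tool's actual implementation:

```python
import urllib.request

def robots_url(domain: str) -> str:
    """Normalize input like 'https://example.com/page' to a robots.txt URL."""
    domain = domain.strip().removeprefix("https://").removeprefix("http://")
    domain = domain.split("/")[0]  # drop any path the user pasted in
    return f"https://{domain}/robots.txt"

def fetch_robots(domain: str) -> str:
    """Fetch robots.txt server-side, where browser CORS rules do not apply."""
    with urllib.request.urlopen(robots_url(domain), timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Because the request runs on the server rather than in the browser, no CORS preflight is involved; the browser only talks to the proxy endpoint.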
Robots.txt Syntax Rules
User-agent: * applies rules to all crawlers. Disallow: / blocks everything. Disallow: /folder/ blocks a directory. Allow: /folder/file.html re-permits one file under a blocked folder. Crawl-delay: 10 asks bots to wait 10 seconds between requests (Googlebot ignores Crawl-delay, but some other crawlers honor it). Sitemap: https://example.com/sitemap.xml tells crawlers where the sitemap is. Bot groups are conventionally separated by a blank line, though a new User-agent line is what actually starts the next group.
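Python's standard library can evaluate these directives directly, which is one way to sanity-check a file before deploying it. One caveat: `urllib.robotparser` applies rules in file order (first match wins) rather than Google's longest-match semantics, so the Allow line below is listed before the Disallow it overrides:

```python
import urllib.robotparser

ROBOTS = """\
User-agent: *
Crawl-delay: 10
Allow: /private/index.html
Disallow: /private/

Sitemap: https://example.com/sitemap.xml
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(ROBOTS.splitlines())

print(rp.can_fetch("AnyBot", "https://example.com/private/data.html"))   # blocked
print(rp.can_fetch("AnyBot", "https://example.com/private/index.html"))  # allowed
print(rp.crawl_delay("AnyBot"))  # 10
print(rp.site_maps())            # sitemap URLs (Python 3.8+)
```

`can_fetch` takes the crawler's user-agent name and a full URL; `site_maps()` returns the declared sitemap URLs or None if there are none.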
robots.txt vs noindex
Robots.txt controls crawling (whether a bot visits the page). The noindex meta tag controls indexing (whether a page appears in search results). A page blocked in robots.txt can still be indexed if other sites link to it — Google indexes the URL without crawling the content. A page with noindex will be crawled but not included in search results. Use both if you want a page completely hidden from search.
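The noindex signal can be delivered either inside the page's HTML or, for non-HTML files such as PDFs, as an HTTP response header (the header normally has to be configured in your web server or framework):

```html
<!-- Option 1: meta tag in the page's <head> -->
<meta name="robots" content="noindex">

<!-- Option 2: HTTP response header instead of a meta tag -->
<!-- X-Robots-Tag: noindex -->
```

Remember that either form only works if the page is crawlable: a crawler blocked by robots.txt never sees the noindex signal.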
When to Use This
Use this tool to check a competitor's robots.txt and see which sections of their site they restrict from crawlers, to verify your own robots.txt rules are correctly formatted before deploying, to diagnose why Google Search Console reports crawling errors or blocked pages, or to find where a site declares its sitemap for manual submission.
More Free Tools
100 Emoji Generator
Generate a long string of any emoji with separators, prefix text, and a live UTF-8 byte count.
Screen Resolution Checker
See your screen resolution, viewport size, device pixel ratio, and display details.
Tracking Number Generator
Generate realistic fake tracking numbers for UPS, FedEx, USPS, DHL, and Amazon with check digit breakdown.
Text Diff Checker
Compare two blocks of text and see exactly which lines were added or removed.