Many AI crawlers are active on the web today. Paste your robots.txt to see which ones your site allows or blocks. The results table and FAQ below explain each one.
Copy the contents of your https://yourdomain.com/robots.txt file and paste it here. The audit updates live.
Open https://yourdomain.com/robots.txt in your browser, select all (Ctrl+A), copy (Ctrl+C), and paste below. Browser cross-origin restrictions (CORS) prevent this tool from fetching the file automatically.
Status of each known AI crawler against your robots.txt.
| Crawler | Status |
|---|---|
| Paste your robots.txt to see results. | |
Paste the contents of your live https://yourdomain.com/robots.txt here. The audit parses every User-agent block and its Disallow rules, then checks each known AI crawler against those rules. The results update live as you type; there is no submit button.
Fill the textarea with a pre-made example so you can see how each kind of configuration parses. "Blocks all AI" pre-loads a config that disallows the AI crawlers in our list. "Allows all" leaves them open.
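As a rough illustration of what the two presets might contain, the sketch below builds both configs. The crawler names are a small illustrative subset, not the tool's actual reference list, and the function names are hypothetical.

```python
# Illustrative subset of AI crawler user-agent tokens (assumption: the
# tool's real reference list is longer and may differ).
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "CCBot", "Google-Extended", "PerplexityBot"]


def blocks_all_ai(crawlers=AI_CRAWLERS) -> str:
    """Build a robots.txt with one 'Disallow: /' block per crawler."""
    blocks = [f"User-agent: {name}\nDisallow: /" for name in crawlers]
    return "\n\n".join(blocks) + "\n"


def allows_all() -> str:
    """Build a robots.txt whose empty Disallow leaves every bot allowed."""
    return "User-agent: *\nDisallow:\n"


if __name__ == "__main__":
    print(blocks_all_ai())
    print(allows_all())
```

Pasting either output into the textarea should give an all-blocked or all-allowed result for the crawlers named in it.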
Live counts of how many of the 18 tracked AI crawlers are allowed vs blocked based on your pasted robots.txt. "Total" is the size of the reference list.
One row per known AI crawler. Each row shows the bot name, its operator, what it does, and whether your current robots.txt allows or blocks it, with the reason (e.g., "Disallow: / under the matching User-agent" or "Matched by wildcard *").
"Copy report" copies a plain-text summary of the audit results to your clipboard. "Copy share link" creates a URL that encodes your robots.txt input. Anyone you send it to sees the same audit in their own browser; no server is involved, because the data lives in the URL hash.
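One way a share link of this kind could work is sketched below. It assumes URL-safe Base64 as the encoding and a hypothetical `robots=` key and base URL; the tool's actual encoding is not documented here. The URL hash (fragment) is never sent to the server in an HTTP request, which is why the data can stay client-side.

```python
import base64
from urllib.parse import urlsplit

BASE_URL = "https://example.com/ai-crawler-audit"  # hypothetical tool URL


def make_share_link(robots_txt: str) -> str:
    """Encode the robots.txt text into the URL fragment."""
    payload = base64.urlsafe_b64encode(robots_txt.encode("utf-8")).decode("ascii")
    return f"{BASE_URL}#robots={payload}"


def read_share_link(url: str) -> str:
    """Recover the robots.txt text from a link made by make_share_link."""
    fragment = urlsplit(url).fragment
    # Split on the first '=' only, so Base64 padding at the end is preserved.
    _, _, payload = fragment.partition("=")
    return base64.urlsafe_b64decode(payload).decode("utf-8")


if __name__ == "__main__":
    link = make_share_link("User-agent: GPTBot\nDisallow: /\n")
    print(link)
    print(read_share_link(link))
```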
Allowing a crawler lets it fetch pages from your site (subject to its own crawl behavior). Blocking it tells compliant crawlers not to fetch any URLs. Some AI crawlers fetch pages to train models. Others fetch pages on demand to answer user queries with citations.
The major operators publicly state that they follow robots.txt. Because robots.txt is a request rather than enforcement, server-side controls (firewall, .htaccess, nginx) are required if you need guaranteed blocking.
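For sites served by a Python application, a server-side control could look like the WSGI middleware sketched below. This is an application-level stand-in for the firewall, .htaccess, or nginx rules mentioned above, not a replacement for them. The token list and function name are illustrative assumptions, and because a User-Agent header can be spoofed, matching on it only stops crawlers that identify themselves honestly.

```python
# Illustrative tokens to refuse (assumption: choose your own list).
BLOCKED_TOKENS = ("GPTBot", "CCBot")


def block_ai_crawlers(app):
    """Wrap a WSGI app and answer 403 to requests from listed crawlers."""

    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token.lower() in user_agent for token in BLOCKED_TOKENS):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Not a listed crawler: pass the request through unchanged.
        return app(environ, start_response)

    return middleware
```

Unlike a robots.txt rule, this refuses the request itself rather than asking the crawler not to make it.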
Training crawlers fetch pages to build datasets used to train AI models. Search crawlers fetch pages on demand to answer user queries with citations, similar to a standard search engine crawler. The two types are usually independent user-agents, so a robots.txt rule for one does not affect the other.
The tool parses your robots.txt, finds each User-agent block, and checks the Disallow rules that apply to each known AI crawler. Disallow: / means the crawler is blocked. No rule, or an empty Disallow, means it is allowed. User-agent: * with Disallow: / blocks every bot that does not have its own User-agent block.
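The rules above can be sketched in a few lines of Python. This is a simplified illustration, not the tool's actual code: the function names are hypothetical, Allow lines and partial paths such as Disallow: /private are not evaluated, and only a full Disallow: / counts as blocked.

```python
def parse_groups(robots_txt: str) -> dict[str, list[str]]:
    """Map each lowercased User-agent token to its Disallow paths."""
    groups: dict[str, list[str]] = {}
    current: list[str] = []  # agents named by the group being read
    in_rules = False         # True once the current group has a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a rule line closed the previous group
                current, in_rules = [], False
            current.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field == "disallow":
            in_rules = True
            for agent in current:
                groups[agent].append(value)
        elif field == "allow":
            in_rules = True  # ends the header; Allow itself is ignored here
    return groups


def audit(robots_txt: str, crawler: str) -> tuple[str, str]:
    """Return (status, reason) for one crawler under the simplified rules."""
    groups = parse_groups(robots_txt)
    name = crawler.lower()
    if name in groups:
        rules, source = groups[name], f"User-agent: {crawler}"
    elif "*" in groups:
        rules, source = groups["*"], "wildcard *"
    else:
        return ("allowed", "no matching rule")
    if "/" in rules:
        return ("blocked", f"Disallow: / under {source}")
    return ("allowed", f"no Disallow: / under {source}")


if __name__ == "__main__":
    sample = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nDisallow:\n"
    for bot in ("GPTBot", "ClaudeBot"):
        print(bot, audit(sample, bot))
```

A crawler with its own User-agent block is judged by that block alone; the wildcard block is consulted only when no specific block exists.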
If you want to create or update your robots.txt, use the Robots.txt Generator. It has 28+ crawlers pre-loaded.
No. Parsing happens entirely in your browser. Your robots.txt content never leaves your device.