robots.txt is a file that crawlers read to know which parts of a site they may access. This tool lists 28+ known crawlers across AI training, AI search, traditional search, archives, and SEO categories. Toggle each one to allow or block, then generate a valid robots.txt.
28+ crawlers listed for reference. All allowed by default. Toggle any individual bot if you want a different rule.
Download or copy the generated file. We do not provide upload, hosting, or installation advice.
robots.txt is a standard file (introduced in 1994 and formalized as RFC 9309 in 2022) that tells web crawlers which parts of a site they may or may not access. Many current files also include rules aimed specifically at AI training and AI search crawlers.
Each row represents one known crawler. The Allow / Block toggle decides what rule the generated file will have for that bot. "Allow" produces no Disallow line (the bot can crawl everything). "Block" produces Disallow: / (the bot is told not to crawl anything).
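The mapping from toggles to rules can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code: the function name `build_robots_txt` is hypothetical, and writing an explicit `Allow: /` for allowed bots is an assumption made here so that no group is left empty (the text above only says that no Disallow line is produced).

```python
def build_robots_txt(rules: dict[str, bool]) -> str:
    """Build robots.txt text from a mapping of user-agent token -> allowed."""
    groups = []
    for agent, allowed in rules.items():
        lines = [f"User-agent: {agent}"]
        # Block -> "Disallow: /" (crawl nothing).
        # Allow -> no Disallow line; the explicit "Allow: /" is this
        # sketch's assumption so the group always contains one rule.
        lines.append("Allow: /" if allowed else "Disallow: /")
        groups.append("\n".join(lines))
    # Groups are separated by a blank line.
    return "\n\n".join(groups) + "\n"


print(build_robots_txt({"GPTBot": False, "Googlebot": True}))
```

Each bot gets its own group, so flipping one toggle changes exactly one rule line in the output.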
Sets every row to Allow with one click. Useful as a reset before picking specific bots to block.
Sets every row to Block with one click. Useful if you want to start from "block everything" and then allow a few specific crawlers back in.
Optional but recommended. Adds a Sitemap: directive to the generated file pointing search crawlers to your sitemap.xml. Most major crawlers read this and use it to discover your URLs faster.
Optional. Adds a Crawl-delay directive, in seconds. It asks compliant crawlers to wait that many seconds between requests. Support varies: Googlebot ignores this directive, while Bingbot honors it.
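As a rough sketch of how these two optional directives might look in a file, the snippet below builds them and reads them back with Python's standard `urllib.robotparser`. The helper name `extras_block`, the example URL, and the choice to put Crawl-delay in a `User-agent: *` group are assumptions for illustration; the tool's exact layout may differ.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser


def extras_block(sitemap_url: Optional[str] = None,
                 crawl_delay: Optional[int] = None) -> str:
    """Return the optional Crawl-delay and Sitemap lines as robots.txt text."""
    lines = []
    if crawl_delay is not None:
        # Crawl-delay is a group-level rule, so it needs a User-agent line.
        # Using the catch-all "*" group is an assumption of this sketch.
        lines += ["User-agent: *", f"Crawl-delay: {crawl_delay}", ""]
    if sitemap_url:
        # Sitemap stands outside any User-agent group.
        lines.append(f"Sitemap: {sitemap_url}")
    return "\n".join(lines).rstrip("\n") + "\n"


text = extras_block("https://example.com/sitemap.xml", crawl_delay=10)

# Read the values back the way a parser-based crawler might.
parser = RobotFileParser()
parser.parse(text.splitlines())
delay = parser.crawl_delay("*")
sitemaps = parser.site_maps()  # available in Python 3.8+
print(delay, sitemaps)
```

A crawler that matches a more specific group uses only that group, so a delay set under `*` would not apply to bots that have their own entry.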
"Copy" puts the generated text on your clipboard. "Download" saves it as a file named robots.txt.
Major crawlers publicly state that they follow robots.txt. However, the file is a request, not an enforcement mechanism: a crawler that chooses to ignore it can still fetch your pages. Guaranteed blocking requires server-side controls such as firewall rules or authentication.
No. AI models may already have data from earlier crawls or third-party sources. A new robots.txt rule only affects future fetches; it does not remove anything previously collected.
Training crawlers collect text used to train AI models. Search crawlers fetch pages on demand when a user asks the AI a question, often with citations back to the source. The two are independent; you can allow or block each on its own.
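To see that the two kinds of crawler can be controlled independently, here is a small check using Python's standard `urllib.robotparser`. `GPTBot` (training) and `OAI-SearchBot` (search) are used only as example user-agent tokens; confirm current token names in each vendor's documentation. The URL is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# Example file: block a training crawler, allow a search crawler.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/articles/1"  # placeholder URL
training_allowed = parser.can_fetch("GPTBot", url)
search_allowed = parser.can_fetch("OAI-SearchBot", url)
print(training_allowed, search_allowed)  # prints: False True
```

Because each token has its own group, changing the rule for one bot leaves the other untouched.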
If you are replacing an existing file or markup (robots.txt, sitemap.xml, .htaccess, <head> tags, etc.), keep a copy of the original somewhere safe first.
No. The tool runs entirely in your browser. Selections never leave your device. The file is generated client-side and offered as a download or copy-to-clipboard.