FAQ

What is robots.txt?

robots.txt is a plain-text file, placed at the root of a site, that tells web crawlers which parts of the site they may or may not access. The convention dates from 1994 and was published as RFC 9309 in 2022. Many sites now add rules aimed specifically at AI training crawlers and AI search crawlers.

What each field / control does

Bot rows (Allow / Block toggles)

Each row represents one known crawler. The Allow / Block toggle sets the rule the generated file contains for that bot. "Allow" produces no Disallow line, so the bot may crawl everything. "Block" produces Disallow: /, which tells the bot not to crawl anything.
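As a rough illustration of the toggle logic, here is a minimal Python sketch. It is not the tool's actual code: the function name build_bot_rules and the bot names are hypothetical, and it assumes that an allowed bot is simply left out of the file, since a crawler with no matching group is unrestricted.

```python
def build_bot_rules(choices: dict[str, bool]) -> str:
    """Turn {user_agent: allowed} toggles into robots.txt groups.

    Assumption: "Allow" emits nothing for that bot, "Block" emits a
    group containing "Disallow: /".
    """
    groups = []
    for user_agent, allowed in choices.items():
        if allowed:
            # "Allow": no Disallow line, so the bot may crawl everything.
            continue
        # "Block": ask the bot not to crawl anything.
        groups.append(f"User-agent: {user_agent}\nDisallow: /")
    # Separate groups with a blank line and end the file with a newline.
    return "\n\n".join(groups) + ("\n" if groups else "")


# Hypothetical bot names, used only for illustration.
print(build_bot_rules({"ExampleBotA": True, "ExampleBotB": False}))
```

With these inputs the output contains a single group for ExampleBotB and nothing for ExampleBotA.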

Allow all preset

Sets every row to Allow with one click. Useful as a reset before picking specific bots to block.

Block all preset

Sets every row to Block with one click. Useful if you want to start from "block everything" and then allow a few specific crawlers back in.

Sitemap URL

Optional but recommended. Adds a Sitemap: directive to the generated file that points crawlers to your sitemap.xml. Enter the full URL, including https://. Most major crawlers read this directive and use it to discover your URLs faster.
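The following Python sketch shows one way a Sitemap: line could be built and then read back with the standard library's urllib.robotparser. The helper name sitemap_line, the bot name, and the example.com URL are assumptions made for illustration. RobotFileParser.site_maps() requires Python 3.8 or later.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


def sitemap_line(url: str) -> str:
    """Return a Sitemap: directive, rejecting URLs that are not absolute."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"Sitemap URL must be absolute: {url!r}")
    return f"Sitemap: {url}"


# A small example file: one blocked (hypothetical) bot plus the sitemap line.
robots_txt = (
    "User-agent: ExampleBot\n"
    "Disallow: /\n"
    "\n"
    + sitemap_line("https://example.com/sitemap.xml")
    + "\n"
)

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
print(parser.site_maps())  # ['https://example.com/sitemap.xml']
```

The Sitemap: directive is independent of any User-agent group, so it applies regardless of which bots are blocked.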

Crawl-delay

Optional. Adds a Crawl-delay directive, which asks compliant crawlers to wait that many seconds between requests. Support varies: Googlebot ignores the directive, while Bingbot honors it.
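To show how a crawler that supports the directive might read it, here is a small Python sketch using the standard library's urllib.robotparser. The helper crawl_delay_group is hypothetical, and placing the directive under User-agent: * is an assumption; the generated file may attach it to a different group.

```python
from urllib.robotparser import RobotFileParser


def crawl_delay_group(seconds: int, user_agent: str = "*") -> str:
    """Return a group asking crawlers to wait `seconds` between requests."""
    if seconds <= 0:
        raise ValueError("Crawl-delay must be a positive number of seconds")
    return f"User-agent: {user_agent}\nCrawl-delay: {seconds}\n"


parser = RobotFileParser()
parser.parse(crawl_delay_group(10).splitlines())

# A crawler that supports the directive would sleep this long between fetches.
# "ExampleBot" is a made-up name; it falls back to the "*" group.
print(parser.crawl_delay("ExampleBot"))  # 10
```

A crawler that does not support Crawl-delay simply skips the line, so adding it is harmless even where it has no effect.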

Copy / Download robots.txt

"Copy" puts the generated text on your clipboard. "Download" saves it as a file named robots.txt.

General questions

Do crawlers follow what I put in this file?

Major crawlers publicly state that they follow robots.txt, but the file is a request, not an enforcement mechanism. Guaranteed blocking requires server-side controls such as firewall rules or authentication.
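The check is performed by the crawler itself, which is why compliance is voluntary. The Python sketch below shows what a well-behaved crawler might do before fetching, using the standard library's urllib.robotparser. The function name crawlable, the bot names, and the URLs are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /
"""


def crawlable(urls: list[str], user_agent: str, robots_txt: str) -> list[str]:
    """Return only the URLs that robots.txt permits for this user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


urls = ["https://example.com/", "https://example.com/blog/post"]

print(crawlable(urls, "ExampleBot", ROBOTS_TXT))  # []
print(crawlable(urls, "OtherBot", ROBOTS_TXT))    # both URLs
```

Nothing on the server runs this check. A crawler that skips it can still request every URL, which is why hard blocking needs server-side controls.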

If a rule prevents new AI crawls, does that erase what AI already learned?

No. AI models may already have data from earlier crawls or third-party sources. A new robots.txt rule only affects future fetches; it does not remove anything previously collected.

What is the difference between training crawlers and search crawlers?

Training crawlers collect text used to train AI models. Search crawlers fetch pages on demand when a user asks the AI a question, often with citations back to the source. The two are independent; you can allow or block each on its own.
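As an illustration of treating the two kinds of crawler separately, the Python sketch below blocks a training crawler while leaving a search crawler unrestricted, then checks the result with urllib.robotparser. ExampleTrainingBot and ExampleSearchBot are made-up names; real crawlers publish their own user-agent tokens.

```python
from urllib.robotparser import RobotFileParser

# Only the (hypothetical) training crawler gets a group; the search
# crawler has no matching group, so it stays unrestricted.
ROBOTS_TXT = """\
User-agent: ExampleTrainingBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

page = "https://example.com/article"
decisions = {
    bot: parser.can_fetch(bot, page)
    for bot in ("ExampleTrainingBot", "ExampleSearchBot")
}
print(decisions)  # {'ExampleTrainingBot': False, 'ExampleSearchBot': True}
```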

Rule of thumb: back up before replacing

If you are replacing an existing file (robots.txt, sitemap.xml, .htaccess, <head> tags, etc.), keep a copy of the original somewhere safe first.
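If you prefer to script the backup, a minimal Python sketch could look like the following. The function name back_up and the timestamped .bak naming scheme are assumptions, not something the tool does for you.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def back_up(path: str) -> Optional[Path]:
    """Copy `path` to a timestamped .bak file next to it.

    Returns the backup path, or None if there is no existing file to save.
    """
    source = Path(path)
    if not source.is_file():
        return None
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    backup = source.with_name(f"{source.name}.{stamp}.bak")
    # copy2 also preserves timestamps and permission bits where possible.
    shutil.copy2(source, backup)
    return backup
```

Calling back_up("robots.txt") before uploading the new file leaves the original next to it under a name such as robots.txt.20250101-120000.bak.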

Does this tool upload or send anything?

No. The tool runs entirely in your browser. Selections never leave your device. The file is generated client-side and offered as a download or copy-to-clipboard.
