robots.txt

By Paul Brock · Updated on 22-04-2026
TL;DR

A robots.txt file in a website's root tells crawlers which URLs they may or may not crawl.

robots.txt is the oldest crawler standard on the web (the Robots Exclusion Protocol, 1994) and always lives at the site root as /robots.txt. Disallow and Allow rules, grouped per User-agent, control which bots may crawl which paths. Important: robots.txt blocks crawling, not indexing. A blocked page can still end up in the index if external links point to it.

Example

User-agent: GPTBot
Disallow: /premium/
This blocks OpenAI's training crawler from the premium section while leaving all other bots unaffected.
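
Groups are matched per bot: under the Robots Exclusion Protocol a crawler obeys only the most specific User-agent group that matches it and ignores the rest. A slightly fuller sketch (the /admin/ path is a hypothetical example):

User-agent: *
Disallow: /admin/

User-agent: GPTBot
Disallow: /

Here every crawler except GPTBot avoids only /admin/; GPTBot matches its own group and is therefore blocked from the entire site.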

Frequently asked questions

Should I block AI crawlers?

That is a strategic choice. Blocking protects your content but removes it from AI answers. If you want GEO visibility, allow at minimum GPTBot, PerplexityBot, ClaudeBot and Google-Extended, as in the sketch below.
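
A minimal sketch of such an allow-list. An explicit Allow: / per agent ensures these bots are not caught by a broader Disallow elsewhere in the file (absent any matching rule, crawling is allowed by default anyway):

User-agent: GPTBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /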

Further reading

  • Our service: SEO

Need help with SEO or GEO?

We help Bitcoin, AI and fintech companies get found in Google and in AI search engines.

Book a call