robots.txt
A robots.txt file in a website's root tells crawlers which URLs they may or may not crawl.
robots.txt is the oldest crawler standard on the web (the Robots Exclusion Protocol, 1994) and always lives at /robots.txt. Per-User-agent rules such as Disallow and Allow control which bots may crawl which paths. Important: robots.txt blocks crawling, not indexing; pages can still end up in the index via external links.
Example

User-agent: GPTBot
Disallow: /premium/

This blocks OpenAI's training crawler from the /premium/ section but leaves all other bots alone.
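You can check how such rules are interpreted with Python's standard-library robots.txt parser. A minimal sketch, assuming the example rules above; the paths /premium/article and /blog/post are hypothetical:

```python
from urllib.robotparser import RobotFileParser

# The example robots.txt rules from above, as a string.
rules = """\
User-agent: GPTBot
Disallow: /premium/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# GPTBot is barred from /premium/ but may crawl other paths.
print(rp.can_fetch("GPTBot", "/premium/article"))     # False
print(rp.can_fetch("GPTBot", "/blog/post"))           # True
# Bots without a matching group fall back to "allow everything".
print(rp.can_fetch("Googlebot", "/premium/article"))  # True
```

Note that because there is no `User-agent: *` group, every crawler other than GPTBot is unrestricted.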
Frequently asked questions
Should I block AI crawlers?
This is a strategic choice: blocking protects your content, but it also excludes you from AI answers. If you want to pursue GEO opportunities, allow at minimum GPTBot, PerplexityBot, ClaudeBot, and Google-Extended.
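That allow-list could be sketched in robots.txt as follows (an illustrative fragment; the /premium/ path is the hypothetical protected section from the example above, and grouping several User-agent lines over one rule set is permitted by the standard):

```
# Named AI crawlers may crawl the whole site
User-agent: GPTBot
User-agent: PerplexityBot
User-agent: ClaudeBot
User-agent: Google-Extended
Allow: /

# All other bots stay out of the premium section
User-agent: *
Disallow: /premium/
```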