robots.txt
A robots.txt file in a website's root tells crawlers which URLs they may or may not crawl.
robots.txt is the oldest crawler standard on the web (the Robots Exclusion Protocol, 1994) and always lives at /robots.txt. Per-User-agent rules such as Disallow and Allow control which bots may crawl which paths. Important: robots.txt blocks crawling, not indexing; pages can still end up in the index via external links.
Example

User-agent: GPTBot
Disallow: /premium/

This blocks OpenAI's training crawler from the /premium/ section but leaves all other bots alone.
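You can check how such rules are interpreted with Python's standard-library robots.txt parser. A minimal sketch, assuming the example rules above; the paths /premium/article and /blog/post are hypothetical:

```python
from urllib.robotparser import RobotFileParser

# The example robots.txt rules from above, as a string.
rules = """\
User-agent: GPTBot
Disallow: /premium/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# GPTBot is barred from /premium/ but may crawl other paths.
print(rp.can_fetch("GPTBot", "/premium/article"))     # False
print(rp.can_fetch("GPTBot", "/blog/post"))           # True
# Bots without a matching group fall back to "allow everything".
print(rp.can_fetch("Googlebot", "/premium/article"))  # True
```

Note that because there is no `User-agent: *` group, every crawler other than GPTBot is unrestricted.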
Frequently asked questions
Should I block AI crawlers?
This is a strategic choice: blocking protects your content, but it also excludes you from AI answers. If you want to pursue GEO opportunities, allow at minimum GPTBot, PerplexityBot, ClaudeBot, and Google-Extended.
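That allow-list could be sketched in robots.txt as follows (an illustrative fragment; the /premium/ path is the hypothetical protected section from the example above, and grouping several User-agent lines over one rule set is permitted by the standard):

```
# Named AI crawlers may crawl the whole site
User-agent: GPTBot
User-agent: PerplexityBot
User-agent: ClaudeBot
User-agent: Google-Extended
Allow: /

# All other bots stay out of the premium section
User-agent: *
Disallow: /premium/
```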