
Technical SEO · Free · runs in your browser

AI bot robots.txt simulator

Crawlability Checker

Check whether AI crawlers can reach every URL on your site.

Paste robots.txt
Verify live with curl
# Check your live robots.txt
curl -A "GPTBot" https://yourdomain.com/robots.txt
curl -A "ClaudeBot" https://yourdomain.com/blog/how-ai-assistants-cite/
curl -A "PerplexityBot" https://yourdomain.com/sitemap.xml
AI bot verdicts

7 allowed · 1 partial · 2 blocked · 10 total


GPTBot · OpenAI (ChatGPT training)

Test path allowed, but this group has 2 disallow rules elsewhere.

Partial

OAI-SearchBot · OpenAI (ChatGPT search)

Fully allowed for this path.

Allow

ChatGPT-User · OpenAI (ChatGPT on-demand)

Fully allowed for this path.

Allow

ClaudeBot · Anthropic (Claude training)

Fully allowed for this path.

Allow

Claude-Web · Anthropic (Claude browsing)

Fully allowed for this path.

Allow

Google-Extended · Google (Gemini training)

Fully allowed for this path.

Allow

GoogleOther · Google (AI Overviews crawl)

Fully allowed for this path.

Allow

PerplexityBot · Perplexity (index + live)

Fully allowed for this path.

Allow

CCBot · Common Crawl (used as training base)

Disallow: / blocks the entire site for this user-agent.

Block

Bytespider · ByteDance (Doubao)

Disallow: / blocks the entire site for this user-agent.

Block

How it works

Browser CORS policy prevents us from fetching your live robots.txt directly. Instead, paste the contents of your robots.txt file (from https://yoursite.com/robots.txt) and set a test path. We simulate all 10 major AI crawlers against the longest-match-wins rules of the robots.txt spec (RFC 9309) and show a per-bot allow/block verdict. We also give you the exact curl commands to verify live behavior from your terminal.
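The longest-match-wins rule can be sketched in a few lines of Python. This is a simplified illustration, not the tool's actual implementation: it ignores the `*` and `$` wildcards that RFC 9309 also defines, and assumes you have already selected the rule group for one user-agent.

```python
def is_allowed(rules, url_path):
    """Evaluate one user-agent group's rules against a URL path.

    rules: list of ("allow" | "disallow", path_prefix) tuples.
    The longest matching prefix wins; on a tie, allow wins;
    if no rule matches, the path is allowed by default.
    """
    best_len = -1
    verdict = True  # no matching rule => allowed
    for directive, prefix in rules:
        if prefix and url_path.startswith(prefix):
            if len(prefix) > best_len or (
                len(prefix) == best_len and directive == "allow"
            ):
                best_len = len(prefix)
                verdict = directive == "allow"
    return verdict

rules = [("disallow", "/private/"), ("allow", "/private/press/")]
print(is_allowed(rules, "/private/press/launch"))  # True: longer allow rule wins
print(is_allowed(rules, "/private/notes"))         # False: only disallow matches
```

This is why a blanket `Disallow: /` blocks everything for that user-agent: no other rule can out-match it except a longer explicit `Allow`.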

Why crawlability is stage zero

  • No crawl = no citation. If GPTBot can't fetch a page, ChatGPT literally has zero chance of citing it.
  • Training vs live crawl. GPTBot gathers training data; OAI-SearchBot powers ChatGPT search. Blocking one when you only mean to block the other is almost always a mistake.
  • CCBot is the sleeper. Common Crawl is the corpus most foundation models train on. If you block CCBot, you lose training-data presence across the board.
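As a concrete illustration of the training-vs-live distinction, a site that wants to opt out of model training while staying reachable for ChatGPT search and browsing might use rules like these (a sketch only; adjust the user-agents and paths to your own policy):

```
# Opt out of training crawls, keep live search/browsing bots.
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /
```

Note that per the CCBot point above, opting out of training also means weighing whether to block Common Crawl, since that trade-off affects every model trained on its corpus.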

Pair with

Once the crawlers can reach the page, shape their access with the llms.txt generator, then cross-check policies with the robots vs llms.txt checker. Validate per-bot with the AI bot access checker. Background reading: do AI assistants follow links.

Want this done for you?

Ship the full GEO playbook in 14 days

Geolify GEO packages bundle every tool on this site into a 14-day done-for-you build: llms.txt, schema, entity strength, content overhaul, citations, and the measurement stack. From $499.

Explore More Packages

Combine services for maximum AI visibility.