How it works
Browser CORS policy prevents this page from fetching your live robots.txt directly. Instead, paste the contents of your robots.txt file (from https://yoursite.com/robots.txt) and set a test path. We evaluate 10 major AI crawlers against your rules using the longest-match-wins logic from the robots.txt spec (RFC 9309) and show an allow/block verdict for each bot. We also generate the exact curl commands to verify live behavior from your terminal.
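The longest-match-wins verdict can be sketched in a few lines. This is a simplified evaluator (an illustration, not the tool's implementation): it treats rules as plain path prefixes and ignores the spec's `*` and `$` wildcards.

```python
def is_allowed(rules, path):
    """Apply RFC 9309 longest-match: rules is a list of
    ("allow" | "disallow", path_prefix) for one user-agent group."""
    best_len, verdict = -1, True  # no matching rule means the path is allowed
    for kind, prefix in rules:
        if path.startswith(prefix):
            if len(prefix) > best_len:
                best_len, verdict = len(prefix), (kind == "allow")
            elif len(prefix) == best_len and kind == "allow":
                verdict = True  # ties go to the least restrictive rule (allow)
    return verdict

rules = [("disallow", "/private/"), ("allow", "/private/blog/")]
print(is_allowed(rules, "/private/blog/post"))  # longer allow rule wins -> True
print(is_allowed(rules, "/private/other"))      # only the disallow matches -> False
```

Because the longest matching prefix wins regardless of rule order, `Allow: /private/blog/` punches a hole through `Disallow: /private/`, which is exactly the behavior the simulator surfaces per bot.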
Why crawlability is stage zero
- No crawl = no citation. If GPTBot can't fetch a page, ChatGPT literally has zero chance of citing it.
- Training vs live crawl. GPTBot collects training data; OAI-SearchBot powers live browsing and search citations. Blocking one without the other is almost always a mistake.
- CCBot is the sleeper. Common Crawl is the corpus most foundation models train on. If you block CCBot, you lose training-data presence across the board.
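The three bots above map directly onto robots.txt user-agent groups. A minimal sketch of a policy that keeps all three crawlers in, assuming that is your goal (GPTBot, OAI-SearchBot, and CCBot are the crawlers' published User-agent tokens):

```
# Explicitly allow the training crawler, the live-search crawler,
# and Common Crawl's corpus crawler
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: CCBot
Allow: /
```

An explicit `Allow: /` group per bot also guards against a broad `User-agent: *` disallow elsewhere in the file silently blocking them, since a bot uses only the most specific group that names it.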
Pair with
Once the crawlers can reach the page, shape their access with the llms.txt generator, then cross-check policies with the robots vs llms.txt checker. Validate per-bot with the AI bot access checker. Background reading: do AI assistants follow links.