GPTBot
OpenAI's training data crawler for GPT models including ChatGPT and GPT-4.
What is GPTBot?
GPTBot is OpenAI's primary web crawler for collecting training data for GPT models, including ChatGPT, GPT-4, and future models. It crawls web pages at scale to build the datasets used during model pre-training and fine-tuning.
GPTBot is the most discussed AI training crawler, largely because of OpenAI's market prominence. It respects robots.txt, and OpenAI publishes its IP ranges at openai.com/gptbot.json. Its crawl rate is moderate (around 100 pages/hour on major sites) relative to OpenAI's real-time browsing agents.
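Because the IP ranges are published, a request that claims to be GPTBot can be checked against them. The following is a minimal sketch, not OpenAI's own tooling: the `prefixes` / `ipv4Prefix` / `ipv6Prefix` field names are an assumption about the file's layout, and the addresses are reserved documentation ranges used purely for illustration.

```python
import ipaddress

# Hypothetical sample shaped like a published IP-range file.
# Field names and addresses are assumptions for illustration only.
SAMPLE_RANGES = {
    "prefixes": [
        {"ipv4Prefix": "192.0.2.0/24"},
        {"ipv4Prefix": "198.51.100.0/28"},
    ]
}


def load_networks(ranges):
    """Turn each prefix entry into an ip_network object."""
    networks = []
    for entry in ranges.get("prefixes", []):
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr:
            networks.append(ipaddress.ip_network(cidr))
    return networks


def is_published_ip(ip, networks):
    """Return True if the address falls inside any of the given ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


networks = load_networks(SAMPLE_RANGES)
print(is_published_ip("192.0.2.44", networks))
print(is_published_ip("203.0.113.9", networks))
```

In practice the ranges would be loaded from the published file rather than hard-coded, and refreshed periodically since they can change.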
The decision to allow or block GPTBot is one of the most consequential AI policy decisions site owners face today. Allowing it means your content may influence GPT model behavior; blocking it keeps your content out of training but has no effect on ChatGPT's real-time browsing (that's ChatGPT-User) or search features (that's OAI-SearchBot).
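Since the three OpenAI agents are addressed separately, a site policy can be written per agent. As a rough sketch, the hypothetical helper `build_robots_txt` below renders one robots.txt group per agent from a simple allow/block mapping; the policy shown (block training, allow browsing and search) is only an example, not a recommendation.

```python
def build_robots_txt(policy):
    """Render one robots.txt group per agent.

    policy maps a user-agent token to True (allow everything)
    or False (disallow everything).
    """
    groups = []
    for agent, allowed in policy.items():
        rule = "Allow: /" if allowed else "Disallow: /"
        groups.append(f"User-agent: {agent}\n{rule}")
    # Groups are separated by a blank line.
    return "\n\n".join(groups) + "\n"


# Example policy: keep content out of training, leave browsing and search alone.
policy = {"GPTBot": False, "ChatGPT-User": True, "OAI-SearchBot": True}
robots_txt = build_robots_txt(policy)
print(robots_txt)
```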
User-Agent Strings
These are the known user-agent patterns used by GPTBot. Use them to identify this crawler in your server logs or configure robots.txt rules.
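For log analysis, matching on the `GPTBot` product token is usually enough. The sketch below assumes the token appears as `GPTBot` optionally followed by `/version`; the sample user-agent string is illustrative and should be checked against OpenAI's current documentation. Note that the user-agent header can be spoofed, so a match is a hint rather than proof.

```python
import re

# Match the product token "GPTBot" as a whole word, optionally followed by "/<version>".
GPTBOT_PATTERN = re.compile(r"\bGPTBot(?:/(?P<version>[\d.]+))?", re.IGNORECASE)


def match_gptbot(user_agent):
    """Return the version string for a GPTBot user agent.

    Returns "" when the token has no version, and None when the
    user agent does not contain the GPTBot token at all.
    """
    m = GPTBOT_PATTERN.search(user_agent)
    if m is None:
        return None
    return m.group("version") or ""


# Illustrative user-agent string (an assumption, not taken from this page).
sample = ("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); "
          "compatible; GPTBot/1.0; +https://openai.com/gptbot")
print(match_gptbot(sample))
print(match_gptbot("ChatGPT-User/1.0"))
```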
robots.txt example:
User-agent: GPTBot
Disallow: /private/
Allow: /
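To see how rules like these are evaluated, they can be fed to Python's standard `urllib.robotparser`. This is a sketch of one parser's behavior, with `example.com` and the paths as placeholder values. Python's parser applies the first matching rule in file order, whereas RFC 9309 specifies the most specific (longest) match; for this rule set both approaches give the same answers.

```python
from urllib.robotparser import RobotFileParser

# The same rule group as the robots.txt example, one directive per line.
RULES = """\
User-agent: GPTBot
Disallow: /private/
Allow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def gptbot_may_fetch(path):
    """Check whether the rules let GPTBot fetch a path on a placeholder host."""
    return parser.can_fetch("GPTBot", "https://example.com" + path)


print(gptbot_may_fetch("/private/report.html"))
print(gptbot_may_fetch("/blog/post"))
```

Agents that are not named in the file, and have no `User-agent: *` group to fall back on, are not restricted by these rules.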
How to Manage GPTBot
Block GPTBot in robots.txt if you don't want your content used for AI training.
This does NOT affect ChatGPT browsing or search; those are handled by separate agents (ChatGPT-User and OAI-SearchBot).
Use Switch journeys to serve modified content specifically to GPTBot.
Monitor GPTBot crawl patterns to understand which content interests OpenAI.
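Monitoring can start from ordinary access logs. The sketch below assumes log lines in the common "combined" format (request and user agent in double quotes) and counts GPTBot requests per path; the sample lines and the `gptbot_hits_by_path` helper are hypothetical.

```python
from collections import Counter

GPTBOT_UA = "Mozilla/5.0 AppleWebKit/537.36; compatible; GPTBot/1.0; +https://openai.com/gptbot"
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) Firefox/124.0"

# Hypothetical access-log lines in combined log format.
SAMPLE_LOG = [
    f'192.0.2.10 - - [01/Jan/2025:00:00:01 +0000] "GET /blog/a HTTP/1.1" 200 512 "-" "{GPTBOT_UA}"',
    f'192.0.2.10 - - [01/Jan/2025:00:00:09 +0000] "GET /pricing HTTP/1.1" 200 734 "-" "{GPTBOT_UA}"',
    f'203.0.113.7 - - [01/Jan/2025:00:00:12 +0000] "GET /blog/a HTTP/1.1" 200 512 "-" "{BROWSER_UA}"',
    f'192.0.2.11 - - [01/Jan/2025:00:01:30 +0000] "GET /blog/a HTTP/1.1" 200 512 "-" "{GPTBOT_UA}"',
]


def gptbot_hits_by_path(lines):
    """Count requests per path for lines whose user agent mentions GPTBot."""
    counts = Counter()
    for line in lines:
        # Splitting on quotes yields: prefix, request, status/size, referrer, gap, user agent, tail.
        parts = line.split('"')
        if len(parts) < 6:
            continue  # not in the expected format
        request, user_agent = parts[1], parts[5]
        if "gptbot" not in user_agent.lower():
            continue
        fields = request.split()
        if len(fields) >= 2:
            counts[fields[1]] += 1  # fields[1] is the request path
    return counts


hits = gptbot_hits_by_path(SAMPLE_LOG)
for path, count in hits.most_common():
    print(path, count)
```

Grouping by path is only one view; the same loop could bucket by hour or by status code to show crawl timing or errors.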
Related Agents
ChatGPT-User
AI Assistants | OpenAI
OpenAI's real-time browsing agent when ChatGPT users request live web content.
OAI-SearchBot
Search Engines | OpenAI
OpenAI's search indexing crawler for ChatGPT Search features.
OpenAI Operator
Browser Agents | OpenAI
OpenAI's browser agent that autonomously performs web tasks for users.
AI2Bot
Commercial Crawlers | Allen AI
Allen Institute for AI's research crawler for academic AI development.
Amazonbot
Commercial Crawlers | Amazon
Amazon's web crawler powering Alexa, Amazon search, and AI services.
Applebot-Extended
Commercial Crawlers | Apple
Apple's AI training token controlling how Applebot data is used for Apple Intelligence.