Commercial Crawlers

Active

GPTBot

Name: GPTBot
Author: OpenAI

OpenAI's training data crawler for GPT models including ChatGPT and GPT-4.

Operated by OpenAI

What is GPTBot?

GPTBot is OpenAI's primary web crawler for collecting training data for GPT models, including ChatGPT, GPT-4, and future models. It crawls web pages at scale to build the datasets used during model pre-training and fine-tuning.

GPTBot is one of the most discussed AI training crawlers, largely because of OpenAI's market prominence. It respects robots.txt, and OpenAI publishes its IP ranges at openai.com/gptbot.json. Its crawl rate is moderate (around 100 pages/hour on major sites) compared to OpenAI's real-time browsing agents.
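Because a user-agent header can be spoofed, the published IP ranges are useful for confirming that a request really comes from GPTBot. The following is a minimal sketch of such a check. It assumes the JSON document has a top-level "prefixes" list whose entries carry an "ipv4Prefix" or "ipv6Prefix" CIDR string; that layout is an assumption and should be checked against the live file. The function name and the sample data are hypothetical, and the sample uses documentation-only address ranges, not real OpenAI ranges.

```python
import ipaddress


def ip_in_published_ranges(ip: str, ranges_doc: dict) -> bool:
    """Return True if `ip` falls inside any CIDR prefix listed in `ranges_doc`."""
    addr = ipaddress.ip_address(ip)
    for entry in ranges_doc.get("prefixes", []):
        # Assumed keys: "ipv4Prefix" / "ipv6Prefix", each holding a CIDR string.
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        # Membership is simply False when the address and network versions differ.
        if cidr and addr in ipaddress.ip_network(cidr, strict=False):
            return True
    return False


# Made-up sample document (reserved documentation ranges, not real crawler IPs).
sample = {
    "prefixes": [
        {"ipv4Prefix": "192.0.2.0/24"},
        {"ipv6Prefix": "2001:db8::/32"},
    ]
}

print(ip_in_published_ranges("192.0.2.17", sample))
print(ip_in_published_ranges("198.51.100.1", sample))
```

In practice the parsed JSON fetched from openai.com/gptbot.json would be passed in place of `sample`, and refreshed periodically since published ranges can change.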

The decision to allow or block GPTBot is one of the most consequential AI policy decisions site owners face today. Allowing it means your content may influence GPT model behavior; blocking it keeps your content out of training but has no effect on ChatGPT's real-time browsing (that's ChatGPT-User) or search features (that's OAI-SearchBot).

User-Agent Strings

These are the known user-agent patterns used by GPTBot. Use them to identify this crawler in your server logs or configure robots.txt rules.

GPTBot

gptbot
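As a rough illustration of using these patterns on server logs, the sketch below matches the token case-insensitively and counts GPTBot requests per path. It assumes logs in the common "combined" access-log format; the regular expression, function names, and sample lines are hypothetical, and the full user-agent string in the sample is illustrative only.

```python
import re
from collections import Counter

# Case-insensitive match covers both "GPTBot" and "gptbot".
GPTBOT_PATTERN = re.compile(r"gptbot", re.IGNORECASE)

# Assumed combined log format:
# ip - - [time] "METHOD /path HTTP/x" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def is_gptbot(user_agent: str) -> bool:
    """Return True if the user-agent string contains the GPTBot token."""
    return bool(GPTBOT_PATTERN.search(user_agent))


def count_gptbot_hits(log_lines) -> Counter:
    """Count requests per path for lines whose user-agent matches GPTBot."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match and is_gptbot(match.group("ua")):
            hits[match.group("path")] += 1
    return hits


# Illustrative sample lines (addresses come from documentation ranges).
SAMPLE_UA = "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)"
sample_lines = [
    f'203.0.113.5 - - [01/Jan/2025:00:00:01 +0000] "GET /blog/post HTTP/1.1" 200 512 "-" "{SAMPLE_UA}"',
    f'203.0.113.5 - - [01/Jan/2025:00:00:09 +0000] "GET /blog/post HTTP/1.1" 200 512 "-" "{SAMPLE_UA}"',
    f'203.0.113.5 - - [01/Jan/2025:00:00:12 +0000] "GET /about HTTP/1.1" 200 128 "-" "{SAMPLE_UA}"',
    '198.51.100.7 - - [01/Jan/2025:00:00:15 +0000] "GET /about HTTP/1.1" 200 128 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

print(count_gptbot_hits(sample_lines))
```

A substring match only shows what the client claims to be, so it is best paired with an IP-range check when the result matters.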

robots.txt example:

User-agent: GPTBot
Disallow: /private/
Allow: /
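One way to sanity-check rules like these before deploying them is to run them through Python's standard-library robots.txt parser. This is a sketch only: the helper name and the example.com URLs are hypothetical, and Python's parser applies the first matching rule in file order, which may differ from how a given crawler resolves conflicts (for these rules, first-match and most-specific-match give the same answer).

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the example above.
RULES = """\
User-agent: GPTBot
Disallow: /private/
Allow: /
"""


def gptbot_allowed(url: str, rules: str = RULES) -> bool:
    """Return True if the given rules permit GPTBot to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch("GPTBot", url)


print(gptbot_allowed("https://example.com/private/report.html"))
print(gptbot_allowed("https://example.com/blog/"))
```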

How to Manage GPTBot

Block GPTBot in robots.txt if you don't want your content used for AI training.

This does NOT affect ChatGPT browsing or search; those are separate agents (ChatGPT-User and OAI-SearchBot).

Use Switch journeys to serve modified content specifically to GPTBot.

Monitor GPTBot crawl patterns to understand which content interests OpenAI.
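The first two points can be illustrated with a full block of GPTBot, checked with Python's standard-library robots.txt parser. This is a minimal sketch: the helper name and the example.com URL are hypothetical, and it only shows how the rules read on paper, not how any crawler actually behaves.

```python
from urllib.robotparser import RobotFileParser

# Rules that block GPTBot from the whole site and say nothing about other agents.
BLOCK_GPTBOT = """\
User-agent: GPTBot
Disallow: /
"""


def allowed_agents(url: str, agents, rules: str = BLOCK_GPTBOT) -> dict:
    """Map each agent name to whether the rules permit it to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


result = allowed_agents(
    "https://example.com/article",
    ["GPTBot", "ChatGPT-User", "OAI-SearchBot"],
)
print(result)
```

Under these rules only GPTBot is disallowed; ChatGPT-User and OAI-SearchBot remain permitted unless they get their own User-agent groups.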



Related Agents

ChatGPT-User

AI Assistants

OpenAI

OpenAI's real-time browsing agent when ChatGPT users request live web content.

OAI-SearchBot

Search Engines

OpenAI

OpenAI's search indexing crawler for ChatGPT Search features.

OpenAI Operator

Browser Agents

OpenAI

OpenAI's browser agent that autonomously performs web tasks for users.

AI2Bot

Commercial Crawlers

Allen AI

Allen Institute for AI's research crawler for academic AI development.

Amazonbot

Commercial Crawlers

Amazon

Amazon's web crawler powering Alexa, Amazon search, and AI services.

Applebot-Extended

Commercial Crawlers

Apple

Apple's AI training token controlling how Applebot data is used for Apple Intelligence.
