Question 1

What is the User-Agent for GPTBot?

Accepted Answer

GPTBot identifies itself with the user-agent token GPTBot, which appears inside its full User-Agent header. robots.txt matching is case-insensitive, so gptbot works as well. Use this token in robots.txt rules to control access.

Question 2

Can I stop GPTBot from using my content for AI training?

Accepted Answer

Yes. Add two lines to your robots.txt: "User-agent: GPTBot", followed on the next line by "Disallow: /". OpenAI commits to respecting robots.txt for training data.
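The rule above can be sanity-checked before you deploy it. This is a minimal sketch using Python's standard urllib.robotparser; the helper name is_allowed and the example.com URL are illustrative assumptions, and the parser only models how a compliant crawler would read the file, not what GPTBot itself does.

```python
from urllib.robotparser import RobotFileParser

# The block-all rule for GPTBot, as it would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/1"  # hypothetical URL
    for agent in ("GPTBot", "gptbot", "SomeOtherBot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, page))
```

Both spellings of the token are refused, while an agent with no matching group stays allowed, since the file contains no wildcard group.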

Question 3

Will blocking GPTBot affect my AI citations?

Accepted Answer

No. GPTBot is a training crawler, separate from the bots that fetch pages for real-time AI assistants. For example, blocking GPTBot does not stop ChatGPT-User, which fetches pages in response to user prompts, from retrieving and citing your content live.

Question 4

What's the difference between GPTBot and an AI assistant bot?

Accepted Answer

GPTBot crawls broadly to build training datasets: your content becomes part of the model's general knowledge, but without direct attribution or links. AI assistant bots (like ChatGPT-User and Claude-User) fetch specific pages in response to user prompts and cite those pages as sources. The two kinds of bot use separate User-Agents, so you can control each one independently in robots.txt.
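Independent control means one robots.txt group per user-agent token. The sketch below generates such a file from a small policy mapping and checks it with the standard-library parser; build_robots_txt, the policy dictionary, and the example.com URL are hypothetical names chosen for illustration.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(disallow_by_agent: dict[str, list[str]]) -> str:
    """Render one robots.txt group per user-agent token (hypothetical helper)."""
    groups = []
    for agent, paths in disallow_by_agent.items():
        lines = [f"User-agent: {agent}"]
        # An empty Disallow value means nothing is disallowed for this agent.
        lines += [f"Disallow: {path}" for path in paths] or ["Disallow:"]
        groups.append("\n".join(lines))
    # Groups are separated by a blank line.
    return "\n\n".join(groups) + "\n"


# Assumed policy: opt out of training, stay reachable for user-prompted fetches.
POLICY = {"GPTBot": ["/"], "ChatGPT-User": []}

if __name__ == "__main__":
    robots_txt = build_robots_txt(POLICY)
    print(robots_txt)

    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    page = "https://example.com/articles/1"  # hypothetical URL
    print("GPTBot allowed:", parser.can_fetch("GPTBot", page))
    print("ChatGPT-User allowed:", parser.can_fetch("ChatGPT-User", page))
```

Keeping the policy in one mapping makes it easy to add further tokens later without touching the existing groups.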

Question 5

How do I verify that a request is really from GPTBot?

Accepted Answer

User-Agent alone is not enough — anyone can claim to be GPTBot. OpenAI publishes IP ranges at openai.com/gptbot.json so you can verify the source IP. BotSights flags spoofed traffic automatically.
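A source-IP check of this kind can be sketched with Python's standard ipaddress module. The JSON shape used here (a "prefixes" list of objects with an "ipv4Prefix" key) is an assumption about the published file, so confirm it against the real document before relying on it. The sample prefixes are reserved documentation ranges, not real OpenAI addresses, and the function names are hypothetical.

```python
import ipaddress

# Hypothetical sample in the shape ASSUMED for the published range file.
# These are RFC 5737 documentation ranges, not real OpenAI addresses.
SAMPLE_RANGES = {
    "prefixes": [
        {"ipv4Prefix": "192.0.2.0/24"},
        {"ipv4Prefix": "198.51.100.0/28"},
    ]
}


def load_networks(ranges: dict) -> list:
    """Turn the parsed JSON document into a list of network objects."""
    return [
        ipaddress.ip_network(entry["ipv4Prefix"])
        for entry in ranges.get("prefixes", [])
        if "ipv4Prefix" in entry
    ]


def is_verified_gptbot(user_agent: str, remote_ip: str, networks: list) -> bool:
    """True only if the UA claims GPTBot AND the IP is inside a listed range."""
    if "gptbot" not in user_agent.lower():
        return False
    address = ipaddress.ip_address(remote_ip)
    return any(address in network for network in networks)


if __name__ == "__main__":
    networks = load_networks(SAMPLE_RANGES)
    claimed_ua = "Mozilla/5.0 (compatible; GPTBot/1.1)"  # illustrative header
    print(is_verified_gptbot(claimed_ua, "192.0.2.10", networks))
    print(is_verified_gptbot(claimed_ua, "203.0.113.5", networks))
```

In practice you would load the range file on a schedule and cache it, since checking every request against a live download would be slow and fragile.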

Question 6

Is my content being used without permission?

Accepted Answer

Training crawlers collect publicly accessible content. The legal landscape around this is evolving rapidly (lawsuits in the US, the EU AI Act, etc.). robots.txt remains the most practical opt-out mechanism today, alongside emerging standards like ai.txt.

Question 7

How often does GPTBot crawl?

Accepted Answer

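Training crawlers usually visit periodically, in weekly or monthly waves rather than daily. If you see sudden spikes, check your access logs to see whether the bot is honoring the Crawl-delay directive in your robots.txt. Crawl-delay is a non-standard extension, so not every crawler honors it.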
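One way to monitor this is to compare the gaps between a bot's requests with the configured delay. The sketch below reads the delay with urllib.robotparser and counts gaps that are too short; the 10-second value, the function names, and the sample timestamps are illustrative assumptions, and real timestamps would come from your own access logs.

```python
from datetime import datetime, timedelta
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking GPTBot to wait 10 seconds between requests.
ROBOTS_TXT = """\
User-agent: GPTBot
Crawl-delay: 10
"""


def crawl_delay_for(robots_txt: str, user_agent: str):
    """Return the Crawl-delay (in seconds) for user_agent, or None if unset."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


def delay_violations(timestamps: list, delay_seconds: int) -> int:
    """Count consecutive request pairs spaced closer than delay_seconds."""
    ordered = sorted(timestamps)
    gaps = (later - earlier for earlier, later in zip(ordered, ordered[1:]))
    return sum(1 for gap in gaps if gap < timedelta(seconds=delay_seconds))


if __name__ == "__main__":
    delay = crawl_delay_for(ROBOTS_TXT, "GPTBot")
    start = datetime(2024, 1, 1, 12, 0, 0)
    # Made-up request times: offsets of 0, 3, 15 and 30 seconds.
    hits = [start + timedelta(seconds=s) for s in (0, 3, 15, 30)]
    print("configured delay:", delay)
    print("violations:", delay_violations(hits, delay))
```

A non-zero violation count for verified traffic suggests the crawler is ignoring the directive, at which point rate limiting at the server is the more dependable control.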

GPTBot

GPTBot Traffic (Last 90 Days)

What is GPTBot?

What GPTBot means for your site

What should you do?

How to identify GPTBot

How to block GPTBot

Option 1: Block all access

Option 2: Block specific paths only

Option 3: Slow down with a crawl delay

Frequently Asked Questions
