Bot Database
1,631 known bots, AI crawlers, and web agents. See who visits your site and why.
Googlebot
Search Engine by Google
Googlebot is Google's primary web crawler that discovers and indexes web pages for Google Search. It is the most active crawler on the internet and drives organic search visibility.
AhrefsBot
SEO Tool by Ahrefs
AhrefsBot crawls the web to build Ahrefs' backlink index and SEO database. It is one of the most active crawlers on the internet after Googlebot.
Applebot
AI Search Crawler by Apple
Applebot is Apple's web crawler that indexes content to power search features across Apple's ecosystem including Spotlight, Siri, and Safari suggestions.
GPTBot
AI Data Scraper by OpenAI
GPTBot is OpenAI's web crawler that collects data from publicly accessible web pages to improve AI models like ChatGPT. Site owners can control access via robots.txt.
meta-externalagent
AI Data Scraper by Meta
meta-externalagent crawls web content for training AI models and improving Meta's products by indexing content directly across the internet.
meta-webindexer
AI Search Crawler by Meta
meta-webindexer browses the internet to improve search results for Meta AI users, analyzing online content to make Meta AI's responses more relevant with proper citations.
Amazonbot
AI Search Crawler by Amazon
AI Search Crawler bot
ChatGPT-User
AI Assistant by OpenAI
ChatGPT-User is dispatched by OpenAI when a ChatGPT user asks a question that requires fetching live web content. It retrieves the page so ChatGPT can include it as a cited source in its response.
Claude-SearchBot
AI Search Crawler by Anthropic
Claude-SearchBot indexes websites to create a search index that can be surfaced as results in Anthropic's Claude AI assistant search feature.
ClaudeBot
AI Data Scraper by Anthropic
ClaudeBot is Anthropic's web crawler that downloads training data for its large language models that power Claude. It respects robots.txt directives.
facebookexternalhit
Preview Bot by Meta
facebookexternalhit fetches link previews when someone shares a URL on Facebook, Messenger, or Instagram. It reads Open Graph meta tags to generate the preview card.
SemrushBot
SEO Tool by Semrush
SemrushBot crawls websites to collect data for Semrush's SEO analytics platform, including backlinks, keyword rankings, and competitive analysis.
Bytespider
AI Data Scraper by ByteDance
Bytespider is operated by ByteDance, the company behind TikTok. It downloads training data for ByteDance's large language models including those powering Doubao, their ChatGPT competitor.
CCBot
AI Data Scraper by Common Crawl
CCBot creates an open repository of web data used by researchers and AI companies worldwide. Its crawl data has been used to train many major language models including GPT and LLaMA.
Claude-User
AI Assistant by Anthropic
Claude-User is sent by Anthropic's Claude AI assistant when a user prompt requires fetching web content. It retrieves pages to include as cited sources in Claude's responses.
Gemini-Deep-Research
AI Assistant by Google
Gemini-Deep-Research is the agent responsible for collecting resources used in Google Gemini's Deep Research feature, which acts as a personal research assistant that browses the web on behalf of users.
Google-Extended
AI Data Scraper by Google
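Google-Extended is a robots.txt product token rather than a separate crawler: it has no user agent string of its own, and Google's existing crawlers fetch the pages. The token controls whether that content may be used for Google's AI products like Gemini and Vertex AI generative APIs. Disallowing it keeps your content out of AI training without affecting Google Search indexing.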
MistralAI-User
AI Assistant by Mistral AI
MistralAI-User is Mistral's AI assistant bot that performs web browsing tasks for users in Le Chat, retrieving web pages to answer user queries with cited sources.
OAI-SearchBot
AI Search Crawler by OpenAI
OAI-SearchBot is OpenAI's web crawler that indexes websites for SearchGPT, collecting web content to power AI-driven search results and real-time information retrieval.
Perplexity-User
AI Assistant by Perplexity AI
Perplexity-User fetches web pages when a Perplexity user asks a question. The retrieved content is used to generate an AI-powered answer with inline citations linking back to the source.
PerplexityBot
AI Search Crawler by Perplexity AI
PerplexityBot indexes web content to power Perplexity AI's search engine. Unlike Perplexity-User, this bot crawls proactively to build a search index rather than fetching on-demand for a specific user query.
ChatGPT Agent
AI Browser Agent by OpenAI
ChatGPT Agent is an autonomous AI agent that can use a web browser to navigate websites, interact with forms, and complete multi-step tasks on behalf of a ChatGPT user.
Google-Agent
AI Browser Agent by Google
Google-Agent is used by agents hosted on Google infrastructure to navigate the web and perform actions upon user request.
Manus-User
AI Browser Agent by Butterfly Effect
Manus-User is a browser-enabled AI agent that autonomously navigates websites, interprets content, and carries out multi-step tasks for users.
NovaAct
AI Browser Agent by Amazon
Nova Act is Amazon's AI agent that can use a web browser to navigate websites and complete multi-step tasks on behalf of a human user.
LinkedInBot
Preview Bot by LinkedIn
LinkedInBot fetches link previews when a URL is shared on LinkedIn. It reads Open Graph and meta tags to generate the post preview card visible to the poster's network.
Pinterestbot
Preview Bot
Fetcher bot
WhatsApp
Preview Bot by Meta
WhatsApp's preview bot fetches link metadata when someone shares a URL in a WhatsApp chat. It reads Open Graph tags to display a title, description, and thumbnail image.
Applebot-Extended
AI Data Scraper by Apple
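Applebot-Extended does not crawl pages itself. It is a robots.txt token that controls whether content already crawled by Applebot may be used to train Apple's foundation language models powering Apple Intelligence features across Apple products. Disallowing it prevents AI training without affecting Siri or Spotlight.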
Baiduspider
Search Engine by Baidu
Search Engine Crawler bot
DotBot
SEO Tool
SEO Crawler bot
DuckDuckBot
Search Engine
Search Engine Crawler bot
FacebookBot
AI Data Scraper by Meta
FacebookBot downloads web content to train Meta's AI speech recognition and language models. Separate from facebookexternalhit which handles link previews.
kagi-fetcher
AI Assistant by Kagi
kagi-fetcher fetches web content for Kagi AI's suite of tools including Assistant, Research, and other knowledge discovery features to answer user queries.
meta-externalfetcher
AI Assistant by Meta
meta-externalfetcher is used by Meta to perform user-initiated fetches of web pages from AI assistant product features like Meta AI.
MJ12bot
SEO Tool
SEO Crawler bot
PhindBot
AI Assistant by Phind
PhindBot fetches web content for Phind, an AI-powered answer engine designed for developers, so that it can provide technical answers and code examples with real-time citations.
YandexBot
Search Engine by Yandex
Search Engine Crawler bot
AI2Bot-DeepResearchEval
AI Assistant by Allen Institute for AI
AI2Bot-DeepResearchEval is operated by Ai2, a non-profit AI research institute. It collects resources used in deep research queries performed by open source AI models.
Amzn-User
AI Assistant by Amazon
Amzn-User is an AI assistant operated by Amazon that fetches web content to answer user queries through Alexa and other Amazon AI services.
bigsur.ai
AI Assistant
AI Assistant bot
DuckAssistBot
AI Assistant by DuckDuckGo
DuckAssistBot fetches web content for DuckDuckGo's AI-assisted answers feature, which generates brief responses to search queries using natural language technology.
Google-NotebookLM
AI Assistant by Google
Google-NotebookLM fetches source URLs when users add them to their notebooks in NotebookLM, Google's AI-powered research assistant, enabling the AI to analyze pages for context and insights.
KlaviyoAIBot
AI Assistant
AI Assistant bot
LinerBot
AI Assistant by Liner
LinerBot gathers information from academic sources and websites to provide answers with line-by-line source citations for research and scholarly work.
Poggio-Citations
AI Assistant
AI Assistant bot
QualifiedBot
AI Assistant
AI Assistant bot
TavilyBot
AI Assistant
AI Assistant bot
AmazonBuyForMe
AI Browser Agent by Amazon
AI Agent bot
GoogleAgent-Mariner
AI Browser Agent by Google
AI Agent bot
TwinAgent
AI Browser Agent
AI Agent bot
AddSearchBot
AI Search Crawler
AI Search Crawler bot
Amzn-SearchBot
AI Search Crawler by Amazon
Uncategorized bot
Anomura
AI Search Crawler
AI Search Crawler bot
atlassian-bot
AI Search Crawler
AI Search Crawler bot
AzureAI-SearchBot
AI Search Crawler
Uncategorized bot
Bravebot
AI Search Crawler
AI Search Crawler bot
Channel3Bot
AI Search Crawler
AI Search Crawler bot
Cloudflare-AutoRAG
AI Search Crawler
AI Search Crawler bot
ExaBot
AI Search Crawler
Uncategorized bot
Google-CloudVertexBot
AI Search Crawler by Google
AI Search Crawler bot
LinkupBot
AI Search Crawler
AI Search Crawler bot
PetalBot
AI Search Crawler
AI Search Crawler bot
YouBot
AI Search Crawler
AI Search Crawler bot
ZanistaBot
AI Search Crawler
AI Search Crawler bot
Ai2Bot-Dolma
AI Data Scraper
AI Data Scraper bot
ApifyWebsiteContentCrawler
AI Data Scraper
Uncategorized bot
ChatGLM-Spider
AI Data Scraper
AI Data Scraper bot
CloudVertexBot
AI Data Scraper
AI Data Scraper bot
cohere-training-data-crawler
AI Data Scraper
AI Data Scraper bot
Cotoyogi
AI Data Scraper
AI Data Scraper bot
Datenbank Crawler
AI Data Scraper
AI Data Scraper bot
Diffbot
AI Data Scraper
AI Data Scraper bot
FirecrawlAgent
AI Data Scraper
Uncategorized bot
GoogleOther
AI Data Scraper by Google
AI Data Scraper bot
ICC-Crawler
AI Data Scraper
AI Data Scraper bot
imageSpider
AI Data Scraper by ByteDance
AI Data Scraper bot
Kangaroo Bot
AI Data Scraper
AI Data Scraper bot
laion-huggingface-processor
AI Data Scraper
AI Data Scraper bot
LCC
AI Data Scraper
AI Data Scraper bot
netEstate Imprint Crawler
AI Data Scraper
AI Data Scraper bot
omgili
AI Data Scraper
AI Data Scraper bot
PanguBot
AI Data Scraper
AI Data Scraper bot
SBIntuitionsBot
AI Data Scraper
AI Data Scraper bot
Spider
AI Data Scraper
AI Data Scraper bot
Timpibot
AI Data Scraper
AI Data Scraper bot
VelenPublicWebCrawler
AI Data Scraper
AI Data Scraper bot
webzio-extended
AI Data Scraper
AI Data Scraper bot
Devin
AI Coding Agent by Cognition
Devin is a software engineering AI assistant by Cognition that can browse websites and perform web-based tasks, functioning as a collaborative AI teammate for engineering teams.
360Spider
Search Engine
Search Engine Crawler bot
Alexa Archive
Search Engine
Search Engine Crawler bot
alexa site audit
Search Engine
Search Engine Crawler bot
AlexandriaOrgBot
Search Engine
Search Engine Crawler bot
Algolia
Search Engine
Search Engine Crawler bot
Algolia Crawler
Search Engine
Search Engine Crawler bot
Atom Feed Robot
Search Engine
Search Engine Crawler bot
Baiduspider-render
Search Engine by Baidu
Search Engine Crawler bot
bingbot
Search Engine by Microsoft
Search Engine Crawler bot
cludo.com bot
Search Engine
Search Engine Crawler bot
Cốc Cốc
Search Engine
Search Engine Crawler bot
coccocbot
Search Engine
Search Engine Crawler bot
coccocbot-image
Search Engine
Search Engine Crawler bot
coccocbot-web
Search Engine
Search Engine Crawler bot
Coveo Bot
Search Engine
Search Engine Crawler bot
Coveobot
Search Engine
Search Engine Crawler bot
crawler.freespoke.com
Search Engine
Search Engine Crawler bot
Crawlson
Search Engine
Search Engine Crawler bot
Dataprovider
Search Engine
Search Engine Crawler bot
Daum
Search Engine
Search Engine Crawler bot
DuckDuckGo-Favicons-Bot
Search Engine
Search Engine Crawler bot
Feedfetcher-Google
Search Engine by Google
Search Engine Crawler bot
FindFiles.net
Search Engine
Uncategorized bot
FindITAnswersbot
Search Engine
Search Engine Crawler bot
Freespoke
Search Engine
Search Engine Crawler bot
FreespokeCrawler
Search Engine
Search Engine Crawler bot
Funnelback
Search Engine
Search Engine Crawler bot
FyndSearchEngine-Crawler
Search Engine
Search Engine Crawler bot
FyndSearchEngine-ReCrawler
Search Engine
Search Engine Crawler bot
GeedoProductSearch
Search Engine
Search Engine Crawler bot
Gigabot
Search Engine
Search Engine Crawler bot
Google Favicon
Search Engine by Google
Search Engine Crawler bot
Google Images
Search Engine by Google
Search Engine Crawler bot
Google Scholar
Search Engine by Google
Search Engine Crawler bot
Google Videos
Search Engine by Google
Search Engine Crawler bot
Googlebot-IA
Search Engine by Google
Search Engine Crawler bot
Googlebot-Image
Search Engine by Google
Search Engine Crawler bot
Googlebot-Mobile
Search Engine by Google
Search Engine Crawler bot
Googlebot-News
Search Engine by Google
Search Engine Crawler bot
Googlebot-Video
Search Engine by Google
Search Engine Crawler bot
Greppr Web Crawler
Search Engine
Search Engine Crawler bot
HaosouSpider
Search Engine
Search Engine Crawler bot
Hype Machine
Search Engine
Search Engine Crawler bot
IbouBot
Search Engine
Search Engine Crawler bot
intelx.io_bot
Search Engine
Search Engine Crawler bot
Jooblebot
Search Engine
Search Engine Crawler bot
Kagibot
Search Engine
Search Engine Crawler bot
Level9SearchBot
Search Engine
Search Engine Crawler bot
Linespider
Search Engine
Search Engine Crawler bot
lyonl
Search Engine
Uncategorized bot
lyonl-crawler
Search Engine
Uncategorized bot
MagiBot
Search Engine
Search Engine Crawler bot
Marginalia Search
Search Engine
Search Engine Crawler bot
Mars Finder
Search Engine
Search Engine Crawler bot
MojeekBot
Search Engine
Search Engine Crawler bot
MotoMinerBot
Search Engine
Search Engine Crawler bot
MRGbot
Search Engine
Search Engine Crawler bot
MSN
Search Engine by Microsoft
Search Engine Crawler bot
msnbot
Search Engine by Microsoft
Search Engine Crawler bot
msnbot-media
Search Engine by Microsoft
Search Engine Crawler bot
Neevabot
Search Engine
Search Engine Crawler bot
Showing 150 of 1,631 results.
The Complete List of Web Crawlers, AI Bots & User Agents
By some industry estimates, roughly half of all website traffic comes from bots, crawlers, and automated agents, not humans. This database catalogs 1,631 known bot user agents across 16 categories, from AI assistants like ChatGPT and Claude to search engine crawlers like Googlebot, SEO tools, social media preview fetchers, and AI training scrapers.
Each bot is identified by its user agent string, the value of the User-Agent header in the HTTP request, which declares which software is visiting your site. Understanding these user agents is the first step toward controlling how bots interact with your content.
How to Detect Bots on Your Website
Bot detection works by matching the user agent string in incoming HTTP requests against known patterns. When a bot like ChatGPT-User or Googlebot visits your site, its user agent identifies it. BotSights matches these patterns against a database of 1,631 known bots to classify each visit by category, operator, and intent.
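The matching step described above can be sketched in a few lines of Python. This is only an illustration, not BotSights' implementation: the `KNOWN_BOTS` table is a four-entry sample drawn from the listing on this page, and the names `BotInfo` and `classify` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class BotInfo(NamedTuple):
    name: str
    category: str
    operator: str


# Illustrative sample only; a real table would hold every known bot.
# Entries are checked in order, so more specific patterns should come first.
KNOWN_BOTS = [
    (re.compile(r"ChatGPT-User", re.I), BotInfo("ChatGPT-User", "AI Assistant", "OpenAI")),
    (re.compile(r"GPTBot", re.I), BotInfo("GPTBot", "AI Data Scraper", "OpenAI")),
    (re.compile(r"ClaudeBot", re.I), BotInfo("ClaudeBot", "AI Data Scraper", "Anthropic")),
    (re.compile(r"Googlebot", re.I), BotInfo("Googlebot", "Search Engine", "Google")),
]


def classify(user_agent: str) -> Optional[BotInfo]:
    """Return the first known bot whose pattern appears in the user agent."""
    for pattern, info in KNOWN_BOTS:
        if pattern.search(user_agent):
            return info
    return None  # no pattern matched: a human browser or an unknown bot
```

Calling `classify()` with the User-Agent header of each request yields the category and operator, or `None` when nothing matches. A substring pattern such as `Googlebot` also matches variants like Googlebot-Image, so a fuller table would list those variants ahead of the generic entry.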
Not all bots identify themselves honestly. Some send fake or generic user agents to avoid detection. Server-side detection analyzes each request before it reaches your frontend, so it also catches bots that never execute client-side JavaScript.
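A user agent can also be forged in the other direction, with an unknown client claiming to be a well-known crawler. One common server-side check, which Google documents for Googlebot, is a reverse DNS lookup on the requesting IP followed by a forward lookup to confirm it. The sketch below assumes that approach; the function name `verify_crawler_ip` and the injectable lookup parameters are hypothetical, added so the logic can be exercised without network access.

```python
import socket


def verify_crawler_ip(ip, allowed_suffixes=(".googlebot.com", ".google.com"),
                      reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS check for an IPv4 address."""
    # Default to real DNS lookups; callers may inject fakes for testing.
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        hostname = reverse_lookup(ip)
        # The hostname must belong to one of the operator's domains.
        if not hostname.endswith(allowed_suffixes):
            return False
        # The hostname must resolve back to the same address.
        return ip in forward_lookup(hostname)
    except OSError:
        # DNS failures (socket.herror, socket.gaierror) count as unverified.
        return False
```

The default forward lookup uses `socket.gethostbyname_ex`, which handles IPv4 only; an IPv6-aware version would need `socket.getaddrinfo` instead.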
AI Crawlers and Your robots.txt
AI companies like OpenAI, Anthropic, Google, and Meta use web crawlers to collect training data for their language models. Crawlers like GPTBot and ClaudeBot download your content to include in datasets used to train AI, and tokens like Google-Extended govern whether pages Google has already crawled may be used the same way. You can control this access through your robots.txt file.
However, there is an important distinction: blocking a training crawler (like GPTBot) does not prevent the AI assistant (ChatGPT-User) from citing your content. These are separate bots with different purposes. Blocking GPTBot prevents your content from being used for training, while ChatGPT-User fetches your page in real-time when a user asks a question.
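The distinction can be checked with Python's standard `urllib.robotparser` module. The robots.txt content and the `example.com` URL below are made up for illustration, and `is_allowed` is a hypothetical helper; the sketch only shows how a rule aimed at GPTBot leaves ChatGPT-User unaffected.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the training crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(bot_token: str, url: str = "https://example.com/article") -> bool:
    """Ask the parsed rules whether the given bot may fetch the URL."""
    return parser.can_fetch(bot_token, url)
```

With these rules, `is_allowed("GPTBot")` is false while `is_allowed("ChatGPT-User")` is true, because the GPTBot group does not apply to the ChatGPT-User token. Blocking the assistant as well would need its own `User-agent: ChatGPT-User` group.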
Bot Categories Explained
AI Assistants
User-facing bots like ChatGPT-User, Claude-User, and Perplexity-User. They fetch your page when a human asks an AI a question. A visit from these bots means your page was retrieved for an AI response and may be cited in it.
AI Search Crawlers
Bots like PerplexityBot and OAI-SearchBot that proactively index your site for AI-powered search engines. Unlike AI Assistants, these crawl independently of user queries, similar to how Googlebot works for traditional search.
AI Data Scrapers
Training crawlers like GPTBot, ClaudeBot, and CCBot that download your content to include in AI training datasets. These can be blocked via robots.txt without affecting your visibility in AI search results.
Search Engines
Traditional crawlers like Googlebot, bingbot, and YandexBot that index your pages for search results. These are essential for SEO visibility.
Preview Bots
Social media fetchers like WhatsApp, facebookexternalhit, and LinkedInBot that generate link preview cards when someone shares your URL. A visit from these bots signals a social share.
Frequently Asked Questions
What is a web crawler?
A web crawler (also called a bot or spider) is an automated program that visits websites by making HTTP requests, identifying itself through its user agent. Crawlers are used by search engines to index content, by AI companies to collect training data, and by tools to analyze websites.
What is a user agent string?
A user agent string is a text identifier sent with every HTTP request that tells a website what software is making the request. Bots use user agent strings like "Googlebot/2.1" or "ChatGPT-User" to identify themselves. This is how websites detect and classify bot traffic.
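To make this concrete, here is a minimal sketch that pulls the User-Agent header out of a raw HTTP/1.1 request using only the standard library. The request text, host, and path are invented for the example, and `user_agent_of` is a hypothetical helper; in practice a web framework or server exposes this header directly.

```python
from email import message_from_string

# A made-up request as it might arrive on the wire.
RAW_REQUEST = (
    "GET /pricing HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "User-Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)\r\n"
    "Accept: */*\r\n"
    "\r\n"
)


def user_agent_of(raw_request: str) -> str:
    """Return the User-Agent header value, or an empty string if absent."""
    # Drop the request line; what remains is a block of "Name: value" headers.
    _request_line, _, header_block = raw_request.partition("\r\n")
    headers = message_from_string(header_block)
    return headers.get("User-Agent", "")  # header lookup is case-insensitive
```

The returned string is what a detection system would compare against its list of known bot patterns.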
How many bots are in this database?
This database contains 1,631 known bots across 16 categories, including AI assistants, search engine crawlers, SEO tools, social media preview bots, AI training scrapers, and more. It is updated regularly as new bots are discovered.
What is the difference between GPTBot and ChatGPT-User?
GPTBot is OpenAI's training data crawler that downloads content to improve AI models. ChatGPT-User is the real-time assistant that fetches a page when a user asks ChatGPT a question. Blocking GPTBot in robots.txt prevents training, but does not stop ChatGPT-User from citing your content.
Can I block AI bots from crawling my website?
Yes. Most AI crawlers respect robots.txt directives. You can add a rule such as "User-agent: GPTBot" followed by "Disallow: /" to block a specific bot. However, compliance is voluntary, and some bots ignore robots.txt entirely. Server-side bot detection gives you more control.
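For bots that ignore robots.txt, blocking can happen on the server itself. The following is a rough sketch of that idea as WSGI middleware; the class name `BlockBots`, the choice of blocked tokens, and the demo `hello_app` are all assumptions for illustration, and it only stops bots that send an honest user agent.

```python
# Illustrative choice of tokens to refuse; adjust to your own policy.
BLOCKED_TOKENS = ("GPTBot", "CCBot")


class BlockBots:
    """WSGI middleware that answers 403 when the user agent contains a blocked token."""

    def __init__(self, app, blocked=BLOCKED_TOKENS):
        self.app = app
        self.blocked = tuple(token.lower() for token in blocked)

    def __call__(self, environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in user_agent for token in self.blocked):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Not blocked: hand the request to the wrapped application.
        return self.app(environ, start_response)


def hello_app(environ, start_response):
    """Stand-in application used only for this demonstration."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]


application = BlockBots(hello_app)
```

Unlike a robots.txt rule, this refuses the request outright rather than relying on the bot to comply.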
What does it mean when an AI assistant visits my site?
It means someone asked an AI (like ChatGPT, Claude, or Perplexity) a question, and the AI determined your page was relevant. Your content may be quoted, summarized, or linked in the AI's response. This is called an AI citation.
How can I track which bots visit my website?
Traditional analytics tools like Google Analytics only track JavaScript-enabled browsers and miss most bot traffic. Server-side tools like BotSights analyze raw HTTP requests to detect and categorize all bot visits, including AI crawlers, search engines, and social preview fetchers.
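As a rough illustration of working from raw requests, the sketch below tallies bot visits by category from web server access log lines. It assumes the common "combined" log format, in which the user agent is the last quoted field; the `CATEGORY_BY_TOKEN` table is a small sample from this page, and `tally` is a hypothetical name. It is not how BotSights works internally.

```python
import re
from collections import Counter

# Small sample mapping of user agent tokens to the categories used on this page.
CATEGORY_BY_TOKEN = {
    "ChatGPT-User": "AI Assistant",
    "GPTBot": "AI Data Scraper",
    "Googlebot": "Search Engine",
    "facebookexternalhit": "Preview Bot",
}

# In the combined log format the user agent is the final quoted field.
UA_FIELD = re.compile(r'"([^"]*)"\s*$')


def tally(log_lines):
    """Count requests per bot category; unmatched requests are counted separately."""
    counts = Counter()
    for line in log_lines:
        match = UA_FIELD.search(line)
        if not match:
            continue  # skip lines that do not end in a quoted field
        user_agent = match.group(1).lower()
        for token, category in CATEGORY_BY_TOKEN.items():
            if token.lower() in user_agent:
                counts[category] += 1
                break
        else:
            counts["Unmatched"] += 1
    return counts
```

Feeding it the lines of an access log gives a per-category count, with ordinary browsers and unknown bots grouped under "Unmatched".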
Are all bots bad?
No. Many bots are beneficial. Search engine crawlers help your pages appear in search results, AI assistants cite your content to their users, and social preview bots generate link cards when your URLs are shared. Understanding which bots visit your site helps you make informed decisions about access.