Bot Database

1,631 known bots, AI crawlers, and web agents. See who visits your site and why.

Googlebot

Search Engine

by Google

Googlebot is Google's primary web crawler that discovers and indexes web pages for Google Search. It is the most active crawler on the internet and drives organic search visibility.

AhrefsBot

SEO Tool

by Ahrefs

AhrefsBot crawls the web to build Ahrefs' backlink index and SEO database. It is one of the most active crawlers on the internet after Googlebot.

Applebot

AI Search Crawler

by Apple

Applebot is Apple's web crawler that indexes content to power search features across Apple's ecosystem including Spotlight, Siri, and Safari suggestions.

GPTBot

AI Data Scraper

by OpenAI

GPTBot is OpenAI's web crawler that collects data from publicly accessible web pages to improve AI models like ChatGPT. Site owners can control access via robots.txt.

meta-externalagent

AI Data Scraper

by Meta

meta-externalagent crawls the web to collect content for training AI models and for improving Meta's products by indexing content directly.

meta-webindexer

AI Search Crawler

by Meta

meta-webindexer browses the internet to improve search results for Meta AI users, analyzing online content to make Meta AI's responses more relevant with proper citations.

Amazonbot

AI Search Crawler

by Amazon

AI Search Crawler bot

ChatGPT-User

AI Assistant

by OpenAI

ChatGPT-User is dispatched by OpenAI when a ChatGPT user asks a question that requires fetching live web content. It retrieves the page so ChatGPT can include it as a cited source in its response.

Claude-SearchBot

AI Search Crawler

by Anthropic

Claude-SearchBot indexes websites to create a search index that can be surfaced as results in Anthropic's Claude AI assistant search feature.

ClaudeBot

AI Data Scraper

by Anthropic

ClaudeBot is Anthropic's web crawler that downloads training data for its large language models that power Claude. It respects robots.txt directives.

facebookexternalhit

Preview Bot

by Meta

facebookexternalhit fetches link previews when someone shares a URL on Facebook, Messenger, or Instagram. It reads Open Graph meta tags to generate the preview card.

SemrushBot

SEO Tool

by Semrush

SemrushBot crawls websites to collect data for Semrush's SEO analytics platform, including backlinks, keyword rankings, and competitive analysis.

Bytespider

AI Data Scraper

by ByteDance

Bytespider is operated by ByteDance, the company behind TikTok. It downloads training data for ByteDance's large language models, including those powering Doubao, its ChatGPT competitor.

CCBot

AI Data Scraper

by Common Crawl

CCBot creates an open repository of web data used by researchers and AI companies worldwide. Its crawl data has been used to train many major language models including GPT and LLaMA.

Claude-User

AI Assistant

by Anthropic

Claude-User is sent by Anthropic's Claude AI assistant when a user prompt requires fetching web content. It retrieves pages to include as cited sources in Claude's responses.

Gemini-Deep-Research

AI Assistant

by Google

Gemini-Deep-Research is the agent responsible for collecting resources used in Google Gemini's Deep Research feature, which acts as a personal research assistant that browses the web on behalf of users.

Google-Extended

AI Data Scraper

by Google

Google-Extended is a robots.txt control token rather than a separate crawler with its own user agent string. It governs whether content crawled by Google may be used for Google's AI products like Gemini and Vertex AI generative APIs. Disallowing it opts your content out of that use without affecting Google Search indexing.

MistralAI-User

AI Assistant

by Mistral AI

MistralAI-User is Mistral's AI assistant bot that performs web browsing tasks for users in Le Chat, retrieving web pages to answer user queries with cited sources.

OAI-SearchBot

AI Search Crawler

by OpenAI

OAI-SearchBot is OpenAI's web crawler that indexes websites for ChatGPT's search features (originally launched as SearchGPT), collecting web content to power AI-driven search results and real-time information retrieval.

Perplexity-User

AI Assistant

by Perplexity AI

Perplexity-User fetches web pages when a Perplexity user asks a question. The retrieved content is used to generate an AI-powered answer with inline citations linking back to the source.

PerplexityBot

AI Search Crawler

by Perplexity AI

PerplexityBot indexes web content to power Perplexity AI's search engine. Unlike Perplexity-User, this bot crawls proactively to build a search index rather than fetching on-demand for a specific user query.

ChatGPT Agent

AI Browser Agent

by OpenAI

ChatGPT Agent is an autonomous AI agent that can use a web browser to navigate websites, interact with forms, and complete multi-step tasks on behalf of a ChatGPT user.

Google-Agent

AI Browser Agent

by Google

Google-Agent is used by agents hosted on Google infrastructure to navigate the web and perform actions upon user request.

Manus-User

AI Browser Agent

by Butterfly Effect

Manus-User is a browser-enabled AI agent that autonomously navigates websites, interprets content, and carries out multi-step tasks for users.

NovaAct

AI Browser Agent

by Amazon

Nova Act is Amazon's AI agent that can use a web browser to navigate websites and complete multi-step tasks on behalf of a human user.

LinkedInBot

Preview Bot

by LinkedIn

LinkedInBot fetches link previews when a URL is shared on LinkedIn. It reads Open Graph and meta tags to generate the post preview card visible to the poster's network.

Pinterestbot

Preview Bot

Fetcher bot

WhatsApp

Preview Bot

by Meta

WhatsApp's preview bot fetches link metadata when someone shares a URL in a WhatsApp chat. It reads Open Graph tags to display a title, description, and thumbnail image.

Applebot-Extended

AI Data Scraper

by Apple

Applebot-Extended lets site owners control whether content crawled by Applebot may be used to train the foundation language models powering Apple Intelligence features across Apple products. Blocking it opts your content out of AI training without affecting Siri or Spotlight.

Baiduspider

Search Engine

by Baidu

Search Engine Crawler bot

DotBot

SEO Tool

SEO Crawler bot

DuckDuckBot

Search Engine

Search Engine Crawler bot

FacebookBot

AI Data Scraper

by Meta

FacebookBot downloads web content to train Meta's AI speech recognition and language models. It is separate from facebookexternalhit, which handles link previews.

kagi-fetcher

AI Assistant

by Kagi

kagi-fetcher fetches web content for Kagi AI's suite of tools including Assistant, Research, and other knowledge discovery features to answer user queries.

meta-externalfetcher

AI Assistant

by Meta

meta-externalfetcher is used by Meta to perform user-initiated fetches of web pages from AI assistant product features like Meta AI.

MJ12bot

SEO Tool

SEO Crawler bot

PhindBot

AI Assistant

by Phind

PhindBot fetches web content for Phind, an AI-powered answer engine designed for developers, so that it can provide technical answers and code examples with real-time citations.

YandexBot

Search Engine

by Yandex

Search Engine Crawler bot

AI2Bot-DeepResearchEval

AI Assistant

by Allen Institute for AI

AI2Bot-DeepResearchEval is operated by Ai2, a non-profit AI research institute. It collects resources used in deep research queries performed by open source AI models.

Amzn-User

AI Assistant

by Amazon

Amzn-User is an AI assistant operated by Amazon that fetches web content to answer user queries through Alexa and other Amazon AI services.

bigsur.ai

AI Assistant

AI Assistant bot

DuckAssistBot

AI Assistant

by DuckDuckGo

DuckAssistBot fetches web content for DuckDuckGo's AI-assisted answers feature, which generates brief responses to search queries using natural language technology.

Google-NotebookLM

AI Assistant

by Google

Google-NotebookLM fetches source URLs that users add to their notebooks in NotebookLM, Google's AI-powered research assistant, so the AI can analyze those pages for context and insights.

KlaviyoAIBot

AI Assistant

AI Assistant bot

LinerBot

AI Assistant

by Liner

LinerBot gathers information from academic sources and websites to provide answers with line-by-line source citations for research and scholarly work.

Poggio-Citations

AI Assistant

AI Assistant bot

QualifiedBot

AI Assistant

AI Assistant bot

TavilyBot

AI Assistant

AI Assistant bot

AmazonBuyForMe

AI Browser Agent

by Amazon

AI Agent bot

GoogleAgent-Mariner

AI Browser Agent

by Google

AI Agent bot

TwinAgent

AI Browser Agent

AI Agent bot

AddSearchBot

AI Search Crawler

AI Search Crawler bot

Amzn-SearchBot

AI Search Crawler

by Amazon

Uncategorized bot

Anomura

AI Search Crawler

AI Search Crawler bot

atlassian-bot

AI Search Crawler

AI Search Crawler bot

AzureAI-SearchBot

AI Search Crawler

Uncategorized bot

Bravebot

AI Search Crawler

AI Search Crawler bot

Channel3Bot

AI Search Crawler

AI Search Crawler bot

Cloudflare-AutoRAG

AI Search Crawler

AI Search Crawler bot

ExaBot

AI Search Crawler

Uncategorized bot

Google-CloudVertexBot

AI Search Crawler

by Google

AI Search Crawler bot

LinkupBot

AI Search Crawler

AI Search Crawler bot

PetalBot

AI Search Crawler

AI Search Crawler bot

YouBot

AI Search Crawler

AI Search Crawler bot

ZanistaBot

AI Search Crawler

AI Search Crawler bot

Ai2Bot-Dolma

AI Data Scraper

AI Data Scraper bot

ApifyWebsiteContentCrawler

AI Data Scraper

Uncategorized bot

ChatGLM-Spider

AI Data Scraper

AI Data Scraper bot

CloudVertexBot

AI Data Scraper

AI Data Scraper bot

cohere-training-data-crawler

AI Data Scraper

AI Data Scraper bot

Cotoyogi

AI Data Scraper

AI Data Scraper bot

Datenbank Crawler

AI Data Scraper

AI Data Scraper bot

Diffbot

AI Data Scraper

AI Data Scraper bot

FirecrawlAgent

AI Data Scraper

Uncategorized bot

GoogleOther

AI Data Scraper

by Google

AI Data Scraper bot

ICC-Crawler

AI Data Scraper

AI Data Scraper bot

imageSpider

AI Data Scraper

by ByteDance

AI Data Scraper bot

Kangaroo Bot

AI Data Scraper

AI Data Scraper bot

laion-huggingface-processor

AI Data Scraper

AI Data Scraper bot

LCC

AI Data Scraper

AI Data Scraper bot

netEstate Imprint Crawler

AI Data Scraper

AI Data Scraper bot

omgili

AI Data Scraper

AI Data Scraper bot

PanguBot

AI Data Scraper

AI Data Scraper bot

SBIntuitionsBot

AI Data Scraper

AI Data Scraper bot

Spider

AI Data Scraper

AI Data Scraper bot

Timpibot

AI Data Scraper

AI Data Scraper bot

VelenPublicWebCrawler

AI Data Scraper

AI Data Scraper bot

webzio-extended

AI Data Scraper

AI Data Scraper bot

Devin

AI Coding Agent

by Cognition

Devin is a software engineering AI assistant by Cognition that can browse websites and perform web-based tasks, functioning as a collaborative AI teammate for engineering teams.

360Spider

Search Engine

Search Engine Crawler bot

Alexa Archive

Search Engine

Search Engine Crawler bot

alexa site audit

Search Engine

Search Engine Crawler bot

AlexandriaOrgBot

Search Engine

Search Engine Crawler bot

Algolia

Search Engine

Search Engine Crawler bot

Algolia Crawler

Search Engine

Search Engine Crawler bot

Atom Feed Robot

Search Engine

Search Engine Crawler bot

Baiduspider-render

Search Engine

by Baidu

Search Engine Crawler bot

bingbot

Search Engine

by Microsoft

Search Engine Crawler bot

cludo.com bot

Search Engine

Search Engine Crawler bot

Cốc Cốc

Search Engine

Search Engine Crawler bot

coccocbot

Search Engine

Search Engine Crawler bot

coccocbot-image

Search Engine

Search Engine Crawler bot

coccocbot-web

Search Engine

Search Engine Crawler bot

Coveo Bot

Search Engine

Search Engine Crawler bot

Coveobot

Search Engine

Search Engine Crawler bot

crawler.freespoke.com

Search Engine

Search Engine Crawler bot

Crawlson

Search Engine

Search Engine Crawler bot

Dataprovider

Search Engine

Search Engine Crawler bot

Daum

Search Engine

Search Engine Crawler bot

DuckDuckGo-Favicons-Bot

Search Engine

Search Engine Crawler bot

Feedfetcher-Google

Search Engine

by Google

Search Engine Crawler bot

FindFiles.net

Search Engine

Uncategorized bot

FindITAnswersbot

Search Engine

Search Engine Crawler bot

Freespoke

Search Engine

Search Engine Crawler bot

FreespokeCrawler

Search Engine

Search Engine Crawler bot

Funnelback

Search Engine

Search Engine Crawler bot

FyndSearchEngine-Crawler

Search Engine

Search Engine Crawler bot

FyndSearchEngine-ReCrawler

Search Engine

Search Engine Crawler bot

GeedoProductSearch

Search Engine

Search Engine Crawler bot

Gigabot

Search Engine

Search Engine Crawler bot

Google Favicon

Search Engine

by Google

Search Engine Crawler bot

Google Images

Search Engine

by Google

Search Engine Crawler bot

Google Scholar

Search Engine

by Google

Search Engine Crawler bot

Google Videos

Search Engine

by Google

Search Engine Crawler bot

Googlebot-IA

Search Engine

by Google

Search Engine Crawler bot

Googlebot-Image

Search Engine

by Google

Search Engine Crawler bot

Googlebot-Mobile

Search Engine

by Google

Search Engine Crawler bot

Googlebot-News

Search Engine

by Google

Search Engine Crawler bot

Googlebot-Video

Search Engine

by Google

Search Engine Crawler bot

Greppr Web Crawler

Search Engine

Search Engine Crawler bot

HaosouSpider

Search Engine

Search Engine Crawler bot

Hype Machine

Search Engine

Search Engine Crawler bot

IbouBot

Search Engine

Search Engine Crawler bot

intelx.io_bot

Search Engine

Search Engine Crawler bot

Jooblebot

Search Engine

Search Engine Crawler bot

Kagibot

Search Engine

Search Engine Crawler bot

Level9SearchBot

Search Engine

Search Engine Crawler bot

Linespider

Search Engine

Search Engine Crawler bot

lyonl

Search Engine

Uncategorized bot

lyonl-crawler

Search Engine

Uncategorized bot

MagiBot

Search Engine

Search Engine Crawler bot

Marginalia Search

Search Engine

Search Engine Crawler bot

Mars Finder

Search Engine

Search Engine Crawler bot

MojeekBot

Search Engine

Search Engine Crawler bot

MotoMinerBot

Search Engine

Search Engine Crawler bot

MRGbot

Search Engine

Search Engine Crawler bot

MSN

Search Engine

by Microsoft

Search Engine Crawler bot

msnbot

Search Engine

by Microsoft

Search Engine Crawler bot

msnbot-media

Search Engine

by Microsoft

Search Engine Crawler bot

Neevabot

Search Engine

Search Engine Crawler bot

Showing 150 of 1,631 results.

The Complete List of Web Crawlers, AI Bots & User Agents

By some industry estimates, around half of all website traffic comes from bots, crawlers, and automated agents rather than humans. This database catalogs 1,631 known bot user agents across 16 categories, from AI assistants like ChatGPT and Claude to search engine crawlers like Googlebot, SEO tools, social media preview fetchers, and AI training scrapers.

Each bot is identified by its user agent string, a text pattern in the HTTP request header that reveals which bot is visiting your site. Understanding these user agents is the first step toward controlling how bots interact with your content.

How to Detect Bots on Your Website

Bot detection works by matching the user agent string in incoming HTTP requests against known patterns. When a bot like ChatGPT-User or Googlebot visits your site, its user agent identifies it. BotSights matches these patterns against a database of 1,631 known bots to classify each visit by category, operator, and intent.
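The matching step described above can be sketched in a few lines of Python. This is a minimal illustration, not BotSights' actual implementation: the `KNOWN_BOTS` table is a hand-picked subset of the entries listed on this page, and the names `BotInfo` and `classify_user_agent` are hypothetical.

```python
from typing import NamedTuple, Optional


class BotInfo(NamedTuple):
    name: str
    category: str
    operator: str


# Hypothetical lookup table: a small subset of the bots listed on this page.
# Each key is a token expected to appear somewhere in the User-Agent header.
# More specific tokens should come before more general ones.
KNOWN_BOTS = [
    ("ChatGPT-User", BotInfo("ChatGPT-User", "AI Assistant", "OpenAI")),
    ("GPTBot", BotInfo("GPTBot", "AI Data Scraper", "OpenAI")),
    ("ClaudeBot", BotInfo("ClaudeBot", "AI Data Scraper", "Anthropic")),
    ("PerplexityBot", BotInfo("PerplexityBot", "AI Search Crawler", "Perplexity AI")),
    ("Googlebot", BotInfo("Googlebot", "Search Engine", "Google")),
    ("facebookexternalhit", BotInfo("facebookexternalhit", "Preview Bot", "Meta")),
]


def classify_user_agent(user_agent: str) -> Optional[BotInfo]:
    """Return the first known bot whose token appears in the header, else None."""
    ua = user_agent.lower()
    for token, info in KNOWN_BOTS:
        # Case-insensitive substring match against the full header value.
        if token.lower() in ua:
            return info
    return None
```

Calling `classify_user_agent("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")` returns the Googlebot entry, while an ordinary browser user agent returns `None`. A production matcher would likely use a much larger table and stricter patterns, since plain substring matching can misfire on lookalike tokens.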

Not all bots identify themselves honestly. Some use fake or generic user agents to avoid detection. Server-side detection, which analyzes requests before they reach your frontend, catches bots that client-side JavaScript cannot see.
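One common server-side check for a spoofed user agent is a reverse DNS lookup followed by a forward lookup to confirm the result. The sketch below assumes the crawler's operator publishes the hostname suffixes its crawlers resolve to (Google documents this approach for Googlebot). The function name, the injectable resolver parameters, and the suffixes a caller passes in are illustrative assumptions, so check the operator's own documentation before relying on them.

```python
import socket
from typing import Callable, List, Sequence


def _reverse_lookup(ip: str) -> str:
    # PTR lookup: IP address -> primary hostname.
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(hostname: str) -> List[str]:
    # Forward lookup: hostname -> list of IPv4 addresses (IPv4 only).
    return socket.gethostbyname_ex(hostname)[2]


def is_verified_crawler(
    ip: str,
    allowed_suffixes: Sequence[str],
    reverse: Callable[[str], str] = _reverse_lookup,
    forward: Callable[[str], List[str]] = _forward_lookup,
) -> bool:
    """Check that `ip` reverse-resolves to an allowed domain and resolves back."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
    except OSError:
        return False  # No PTR record or the lookup failed.
    # Suffixes should start with a dot so "evilgooglebot.com" cannot match.
    if not any(hostname.endswith(suffix) for suffix in allowed_suffixes):
        return False
    try:
        # The forward lookup must include the original IP, or the PTR is forged.
        return ip in forward(hostname)
    except OSError:
        return False
```

For a request claiming to be Googlebot, a call might look like `is_verified_crawler(client_ip, (".googlebot.com", ".google.com"))`. The resolvers are parameters only so the logic can be exercised without network access; DNS lookups are slow, so results would normally be cached.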

AI Crawlers and Your robots.txt

AI companies like OpenAI, Anthropic, Google, and Meta use web crawlers to collect training data for their language models. Crawlers like GPTBot and ClaudeBot download your content to include in datasets used to train AI, and the Google-Extended token governs the same use for content that Google crawls. You can control this access through your robots.txt file.

However, there is an important distinction: blocking a training crawler (like GPTBot) does not prevent the AI assistant (ChatGPT-User) from citing your content. These are separate bots with different purposes. Blocking GPTBot prevents your content from being used for training, while ChatGPT-User fetches your page in real time when a user asks a question.
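To see this distinction concretely, the sketch below feeds a sample robots.txt to Python's standard `urllib.robotparser` and asks what each user agent would be allowed to fetch. The rules and the `example.com` URL are illustrative, and `is_allowed` is a hypothetical helper name. The parser only shows what a compliant bot would conclude; it does not enforce anything.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt: block the training crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant bot with this user agent may fetch the URL."""
    parser = RobotFileParser()
    # parse() takes the file contents as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# GPTBot matches its own group and is disallowed everywhere.
gptbot_ok = is_allowed(ROBOTS_TXT, "GPTBot", "https://example.com/article")
# ChatGPT-User has no group of its own, so the wildcard group applies.
assistant_ok = is_allowed(ROBOTS_TXT, "ChatGPT-User", "https://example.com/article")
```

Here `gptbot_ok` is `False` and `assistant_ok` is `True`. To block the assistant as well, it would need its own `User-agent: ChatGPT-User` group.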

Bot Categories Explained

AI Assistants

User-facing bots like ChatGPT-User, Claude-User, and Perplexity-User. They fetch your page when a human asks an AI a question. A visit from these bots means your content was retrieved as a source for an AI response and may have been cited in it.

AI Search Crawlers

Bots like PerplexityBot and OAI-SearchBot that proactively index your site for AI-powered search engines. Unlike AI Assistants, these crawl independently of user queries, similar to how Googlebot works for traditional search.

AI Data Scrapers

Training crawlers like GPTBot, ClaudeBot, and CCBot that download your content to include in AI training datasets. These can be blocked via robots.txt without affecting your visibility in AI search results.

Search Engines

Traditional crawlers like Googlebot, bingbot, and YandexBot that index your pages for search results. These are essential for SEO visibility.

Preview Bots

Social media fetchers like WhatsApp, facebookexternalhit, and LinkedInBot that generate link preview cards when someone shares your URL. A visit from these bots signals a social share.

Frequently Asked Questions

What is a web crawler?

A web crawler (also called a bot or spider) is an automated program that visits websites by making HTTP requests. Crawlers are used by search engines to index content, by AI companies to collect training data, and by tools to analyze websites. Most crawlers announce themselves through a user agent string.

What is a user agent string?

A user agent string is a text identifier, sent in the User-Agent header of an HTTP request, that tells a website what software is making the request. Bots use user agent strings containing tokens like "Googlebot/2.1" or "ChatGPT-User" to identify themselves. This is how websites detect and classify bot traffic.

How many bots are in this database?

This database contains 1,631 known bots across 16 categories, including AI assistants, search engine crawlers, SEO tools, social media preview bots, AI training scrapers, and more. It is updated regularly as new bots are discovered.

What is the difference between GPTBot and ChatGPT-User?

GPTBot is OpenAI's training data crawler that downloads content to improve AI models. ChatGPT-User is the real-time assistant that fetches a page when a user asks ChatGPT a question. Blocking GPTBot in robots.txt prevents training, but does not stop ChatGPT-User from citing your content.

Can I block AI bots from crawling my website?

Yes. Most AI crawlers respect robots.txt directives. To block a specific bot, add a "User-agent: GPTBot" line followed by a "Disallow: /" line. However, robots.txt is advisory, and some bots ignore it entirely. Server-side bot detection gives you more control.

What does it mean when an AI assistant visits my site?

It means someone asked an AI (like ChatGPT, Claude, or Perplexity) a question, and the AI determined your page was relevant. Your content may be quoted, summarized, or linked in the AI's response. This is called an AI citation.

How can I track which bots visit my website?

Traditional analytics tools like Google Analytics only track JavaScript-enabled browsers and miss most bot traffic. Server-side tools like BotSights analyze raw HTTP requests to detect and categorize all bot visits, including AI crawlers, search engines, and social preview fetchers.
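As a rough illustration of the server-side approach, the sketch below tallies bot visits from raw web server access logs. It assumes the common "combined" log format, where the user agent is the last double-quoted field on each line. The token list is a small hand-picked sample, and `count_bot_visits` is a hypothetical helper, not part of any tool named here.

```python
import re
from collections import Counter
from typing import Iterable

# In the combined log format, the User-Agent is the final quoted field.
UA_FIELD = re.compile(r'"([^"]*)"\s*$')

# Illustrative subset of bot tokens from the list on this page.
BOT_TOKENS = ("ChatGPT-User", "GPTBot", "ClaudeBot", "Googlebot", "bingbot")


def count_bot_visits(log_lines: Iterable[str]) -> Counter:
    """Count requests per known bot token across access log lines."""
    counts: Counter = Counter()
    for line in log_lines:
        match = UA_FIELD.search(line)
        if match is None:
            continue  # Line does not end with a quoted field; skip it.
        user_agent = match.group(1).lower()
        for token in BOT_TOKENS:
            if token.lower() in user_agent:
                counts[token] += 1
                break  # Count each request once.
    return counts
```

Passing an open log file, for example `count_bot_visits(open("access.log"))`, returns a `Counter` keyed by bot token. The file name is a placeholder, and real log formats vary by server configuration.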

Are all bots bad?

No. Many bots are beneficial. Search engine crawlers help your pages appear in search results, AI assistants cite your content to their users, and social preview bots generate link cards when your URLs are shared. Understanding which bots visit your site helps you make informed decisions about access.