Archive-It
Preserves web content for historical access.
Recommended action: Allow access unless you have legal or competitive reasons to restrict.
Category
Archiver
Primary use case
Web archiving
Trust level
Review recommended
Trust Levels
- Trusted
- Generally safe
- Review recommended
- Caution advised
Trust levels are an indication based on category, operator, and robots.txt compliance. Always review bot activity for your specific situation.
robots.txt
Unknown
Archive-It Traffic (Last 90 Days)
Not enough network data yet.
What is Archive-It?
Archiver bot
What Archive-It means for your site
Archive-It is an archiver with an undocumented operator. Review its activity on your site to determine whether it is beneficial, neutral, or unwanted. Robots.txt compliance is not confirmed for this bot.
What should you do?
- Review this bot's activity in BotSights
- Check which pages it visits most frequently
- Consider server-side blocking if access is unwanted
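As one possible shape for server-side blocking, here is a minimal sketch assuming a Python WSGI application. The names `block_archive_it` and `demo_app` are hypothetical, and the match is a plain case-insensitive substring check on the User-Agent header, so it stops only clients that announce themselves as Archive-It.

```python
from typing import Callable, Iterable

# Assumption: the only pattern to match is the user-agent listed on this page.
BLOCKED_UA_SUBSTRINGS = ("archive-it",)


def block_archive_it(app: Callable) -> Callable:
    """Wrap a WSGI app and answer 403 when the User-Agent matches a blocked pattern."""

    def middleware(environ: dict, start_response: Callable) -> Iterable[bytes]:
        # Lower-case the header so "archive-it" and "Archive-It" both match.
        user_agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(pattern in user_agent for pattern in BLOCKED_UA_SUBSTRINGS):
            body = b"Forbidden"
            start_response(
                "403 Forbidden",
                [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
            )
            return [body]
        return app(environ, start_response)

    return middleware


def demo_app(environ: dict, start_response: Callable) -> Iterable[bytes]:
    """Hypothetical stand-in for your real application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello"]


wrapped_app = block_archive_it(demo_app)
```

The same check could equally live in a web server or CDN rule; the middleware form is used here only because it is self-contained.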
See Archive-It on your own site
BotSights tracks every Archive-It visit in real time, including which pages it crawls, how often, and from where.
How to identify Archive-It
Archive-It uses the user-agent "archive-it". Its robots.txt compliance is unconfirmed.
archive-it
Archive-It
How to block Archive-It
Three robots.txt options below. Pick the one that matches your goal. Each snippet lists every known Archive-It user-agent pattern so the rules apply regardless of which one the bot announces. Compliance with robots.txt is unconfirmed for Archive-It, so verify with crawl logs after deploying.
Edit robots.txt with care
A single misplaced line can de-index your entire site. A common mistake is pasting User-agent: * followed by Disallow: /, which blocks every bot, including Googlebot, not just Archive-It. Always paste the snippet between existing rules (not over them), keep the User-agent lines scoped to Archive-It's patterns, and verify the file with a robots.txt testing tool before deploying. If you are not sure, ask a developer first.
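One way to sanity-check a draft before deploying is Python's standard-library `urllib.robotparser`. The sketch below assumes a hypothetical draft file and the placeholder URL `https://example.com/some-page`. It reflects how Python's parser reads the rules, which is not necessarily how Archive-It or any other crawler interprets them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft: an existing allow-everything group, followed by an
# Archive-It group that uses the user-agent patterns listed on this page.
DRAFT_ROBOTS_TXT = """\
User-agent: *
Disallow:

User-agent: archive-it
User-agent: Archive-It
Disallow: /
"""

# The common mistake described above: a catch-all group that blocks everyone.
MISTAKEN_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def can_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if this robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


PAGE = "https://example.com/some-page"  # placeholder URL
archive_it_allowed = can_fetch(DRAFT_ROBOTS_TXT, "archive-it", PAGE)
googlebot_allowed = can_fetch(DRAFT_ROBOTS_TXT, "Googlebot", PAGE)
googlebot_allowed_by_mistake = can_fetch(MISTAKEN_ROBOTS_TXT, "Googlebot", PAGE)
```

With the draft, `archive_it_allowed` is `False` and `googlebot_allowed` is `True`. With the mistaken file, Googlebot is blocked as well.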
Option 1: Block all access
Tells Archive-It not to crawl any URL on your site. Use this when you want the bot completely off your content.
User-agent: archive-it
User-agent: Archive-It
Disallow: /
Option 2: Block specific paths only
Keep public content crawlable but exclude sensitive or non-public sections. Add one Disallow: line per path. Replace the example paths with your own.
User-agent: archive-it
User-agent: Archive-It
Disallow: /admin/
Disallow: /private/
Disallow: /checkout/
Option 3: Slow down with a crawl delay
Crawl-delay is a voluntary directive that asks the bot to wait the given number of seconds between requests. It is useful when Archive-It is hammering your origin and slowing the site down for real visitors, but you do not want to block it outright. The value is in seconds, so 10 means at most one request every ten seconds. Not all bots honour this directive: Googlebot ignores it, Bingbot respects it, and support among other crawlers, including AI crawlers, varies.
User-agent: archive-it
User-agent: Archive-It
Crawl-delay: 10
Frequently Asked Questions
What is the User-Agent for Archive-It?
Archive-It identifies itself with the User-Agent string "archive-it" (alternate forms: Archive-It). Use this in robots.txt or server-side rules.
Is Archive-It safe to allow on my site?
The operator is not publicly documented for this bot. Web archivers preserve content for historical access (Wayback Machine, etc.) and are usually beneficial.
Should I block Archive-It?
Usually no — archivers like the Wayback Machine preserve your content for historical reference, which is generally beneficial.
How do I block Archive-It?
Try robots.txt first ("User-agent: archive-it / Disallow: /") and verify with crawl logs whether the bot stops appearing.
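To check crawl logs for whether the bot stops appearing, a per-day count of matching requests is usually enough. This is a minimal sketch that assumes your server writes the common "combined" access-log format. The function name and the sample lines are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable

# Assumption: "combined" log format, with the timestamp in [...] and the
# User-Agent as the last double-quoted field.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<request>[^"]*)" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def archive_it_hits_per_day(lines: Iterable[str]) -> Counter:
    """Count requests per day whose User-Agent contains 'archive-it' (any case)."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "archive-it" in match.group("ua").lower():
            hits[match.group("day")] += 1
    return hits


# Made-up sample lines using documentation-reserved IP addresses.
SAMPLE_LOG = [
    '203.0.113.7 - - [10/Mar/2025:13:55:36 +0000] "GET /about HTTP/1.1" 200 512 "-" "archive-it"',
    '203.0.113.7 - - [10/Mar/2025:13:55:46 +0000] "GET /blog HTTP/1.1" 200 2048 "-" "Archive-It"',
    '198.51.100.4 - - [11/Mar/2025:09:12:01 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 Firefox/124.0"',
]
hits = archive_it_hits_per_day(SAMPLE_LOG)
```

If the counts for the days after you deploy the rule drop to zero, the bot is probably honouring robots.txt. If they do not, server-side blocking is the next step.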
How can I verify a request is really Archive-It?
User-Agent strings can be spoofed by malicious crawlers. Without published verification details, the User-Agent alone is not trustworthy — monitor source IPs and behavior patterns. BotSights flags spoofed traffic when verification data is available.
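Monitoring source IPs can start with grouping requests that claim to be Archive-It by source network. The sketch below assumes you have already extracted (source IP, User-Agent) pairs from your logs. The function name is hypothetical, and the /24 and /48 prefix lengths are arbitrary choices for illustration, not published Archive-It ranges.

```python
import ipaddress
from collections import Counter
from typing import Iterable, Tuple


def networks_claiming_archive_it(requests: Iterable[Tuple[str, str]]) -> Counter:
    """Count requests per source network for user-agents containing 'archive-it'.

    IPv4 addresses are grouped by /24 and IPv6 addresses by /48.
    """
    counts = Counter()
    for source_ip, user_agent in requests:
        if "archive-it" not in user_agent.lower():
            continue
        address = ipaddress.ip_address(source_ip)
        prefix = 24 if address.version == 4 else 48
        # strict=False lets us pass a host address and get its enclosing network.
        network = ipaddress.ip_network(f"{source_ip}/{prefix}", strict=False)
        counts[str(network)] += 1
    return counts


# Made-up sample using documentation-reserved IP addresses.
SAMPLE_REQUESTS = [
    ("203.0.113.7", "archive-it"),
    ("203.0.113.9", "Archive-It"),
    ("198.51.100.4", "archive-it"),
    ("198.51.100.5", "Mozilla/5.0 Firefox/124.0"),
]
by_network = networks_claiming_archive_it(SAMPLE_REQUESTS)
```

A claimed Archive-It visit from a network that has never sent such traffic before, or one with unusual request patterns, is a candidate for closer review. It is not proof of spoofing.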