Good / Bad Bot Verification

Verifies legitimate search engine crawlers through DNS and IP range checks, and penalizes or bans unrecognized bots.

The good bots checker intercepts requests from any User-Agent whose browser type is crawler or fetcher. It then decides whether that crawler is a verified legitimate bot or an impostor. Verified crawlers receive GOOD_BOT_IDENTIFIED and bypass the rest of the pipeline. Unverified ones receive BAD_BOT_DETECTED and are banned immediately.

This checker protects against a common attack pattern where malicious bots spoof well-known crawler names (e.g. Googlebot) to avoid detection. It also prevents unknown or unlisted bots from crawling the server when banUnlistedBots is enabled.


How It Works

The checker first parses the User-Agent and checks whether the browser type resolves to crawler or fetcher. Requests from regular browser User-Agents pass through this checker without scoring.

For crawlers, the checker consults two lists:

  • Bots with a domain suffix (Googlebot, Bingbot, Yandex, and others in suffix.json): the checker performs a reverse DNS lookup on the client IP, then a forward DNS lookup on the returned hostname to verify the IP matches. This two-step DNS verification confirms the bot genuinely originates from the claimed organization.
  • Bots without a suffix (duckduckbot, gptbot, oai-searchbot, chatgpt-user): the checker validates the client IP against pre-compiled IP ranges in goodBots.mmdb.

If either verification passes, the checker returns GOOD_BOT_IDENTIFIED and the pipeline ends. If verification fails, the checker returns BAD_BOT_DETECTED and the visitor is banned immediately, regardless of score. Unknown bot User-Agents that appear in neither list are banned when banUnlistedBots is true.
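
The decision flow above can be sketched as a small pure function. This is an illustrative sketch, not the checker's actual source: the names `classify`, `CrawlerRequest`, and `Checks` are hypothetical, the bot lists are abbreviated to the names mentioned on this page, and the DNS and IP range results are supplied by the caller as plain booleans. It also assumes that an unlisted crawler which fails the IP range check (when banUnlistedBots is false) is treated like any other failed verification.

```typescript
type ReasonCode = 'GOOD_BOT_IDENTIFIED' | 'BAD_BOT_DETECTED';

// Hypothetical request shape: the fields a checker would need after parsing the User-Agent.
interface CrawlerRequest {
  browserType: string; // 'crawler', 'fetcher', or a regular browser type
  botName: string;     // lowercased crawler name taken from the User-Agent
  ip: string;
}

// Hypothetical verification hooks; real implementations would query DNS and goodBots.mmdb.
interface Checks {
  dnsVerified(botName: string, ip: string): boolean;
  ipInGoodRanges(ip: string): boolean;
}

// Abbreviated lists, limited to the names given in this page.
const SUFFIX_BOTS = new Set(['googlebot', 'bingbot', 'yandex']);
const RANGE_BOTS = new Set(['duckduckbot', 'gptbot', 'oai-searchbot', 'chatgpt-user']);

// Returns a reason code, or null when the checker takes no decision (regular browsers).
function classify(
  req: CrawlerRequest,
  checks: Checks,
  banUnlistedBots: boolean,
): ReasonCode | null {
  if (req.browserType !== 'crawler' && req.browserType !== 'fetcher') {
    return null; // regular browsers pass through without scoring
  }

  let verified: boolean;
  if (SUFFIX_BOTS.has(req.botName)) {
    verified = checks.dnsVerified(req.botName, req.ip);
  } else if (RANGE_BOTS.has(req.botName)) {
    verified = checks.ipInGoodRanges(req.ip);
  } else if (banUnlistedBots) {
    return 'BAD_BOT_DETECTED'; // unlisted crawler, banned outright
  } else {
    // Assumption: unlisted crawlers fall back to the IP range check.
    verified = checks.ipInGoodRanges(req.ip);
  }

  return verified ? 'GOOD_BOT_IDENTIFIED' : 'BAD_BOT_DETECTED';
}
```

Keeping the decision separate from the lookups makes each branch easy to exercise with stubbed results, without touching the network.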

The DNS verification result is cached to avoid repeated lookups for the same IP across requests.
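
As a rough illustration of per-IP caching, the sketch below keeps verification results in memory with an expiry time. Both the `VerificationCache` class and the time-to-live are assumptions made for the example; this page does not say where the real cache lives or how long entries are kept.

```typescript
// Hypothetical in-memory cache of DNS verification results, keyed by client IP.
class VerificationCache {
  private entries = new Map<string, { verified: boolean; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached result, or undefined when the IP is unknown or the entry has expired.
  get(ip: string): boolean | undefined {
    const entry = this.entries.get(ip);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(ip); // drop stale entries lazily
      return undefined;
    }
    return entry.verified;
  }

  // Stores both positive and negative results so impostors are not re-resolved on every request.
  set(ip: string, verified: boolean): void {
    this.entries.set(ip, { verified, expiresAt: this.now() + this.ttlMs });
  }
}
```

A caller would check `get(ip)` first and only run the reverse and forward lookups when it returns undefined, then record the outcome with `set(ip, result)`.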

Configuration

server.ts
await defineConfiguration({
  store: { main: { driver: 'sqlite', name: './bot-detector.db' } },
  checkers: {
    enableGoodBotsChecks: {
      enable: true,
      banUnlistedBots: true,
      penalties: 100,
    },
  },
})

  • enable (boolean): Enables or disables this checker. When disabled, all crawler User-Agents pass through without verification. Default: true.
  • banUnlistedBots (boolean): When true, any crawler User-Agent that does not appear in the known bot lists triggers BAD_BOT_DETECTED and an immediate ban. When false, unknown crawlers are passed to the IP range check instead. Default: true.
  • penalties (number): Score applied when a bot User-Agent fails DNS or IP range verification. Note: a failed verification also pushes BAD_BOT_DETECTED, which bans the visitor immediately regardless of this score. Default: 100.

Reason Codes

| Code | Trigger |
|------|---------|
| GOOD_BOT_IDENTIFIED | DNS or IP range verification passed. Pipeline stops and the request is allowed. |
| BAD_BOT_DETECTED | Crawler User-Agent failed verification, or is an unlisted bot when banUnlistedBots is true. Pipeline stops and the visitor is banned. |

Supported Crawlers

The following crawlers are recognized and verified:

| Verification method | Examples |
|---------------------|----------|
| DNS (reverse + forward lookup) | Googlebot, Bingbot, Yandexbot, Applebot, Meta crawler, Twitterbot |
| IP range check (goodBots.mmdb) | DuckDuckBot, GPTBot, OAI-SearchBot, ChatGPT-User |

Googlebot verification follows the official Google procedure: reverse DNS lookup on the IP must return a hostname in googlebot.com or google.com, and then forward DNS on that hostname must resolve back to the original IP.
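
The Googlebot procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the checker's implementation: `isVerifiedGooglebot` and the `Resolver` interface are hypothetical names, and the resolver is passed in so the logic can be run against stubbed DNS answers.

```typescript
// Hypothetical resolver shape. In Node, dns.promises.reverse and
// dns.promises.resolve4 (or resolve6) have matching signatures.
interface Resolver {
  reverse(ip: string): Promise<string[]>;
  forward(hostname: string): Promise<string[]>;
}

// Domain suffixes named in this page; the leading dot prevents
// "fakegooglebot.com" from matching.
const GOOGLE_SUFFIXES = ['.googlebot.com', '.google.com'];

async function isVerifiedGooglebot(ip: string, dns: Resolver): Promise<boolean> {
  let hostnames: string[];
  try {
    hostnames = await dns.reverse(ip); // step 1: reverse (PTR) lookup
  } catch {
    return false; // no PTR record, so the claim cannot be verified
  }

  for (const raw of hostnames) {
    const host = raw.toLowerCase().replace(/\.$/, ''); // normalize case and trailing dot
    if (!GOOGLE_SUFFIXES.some((suffix) => host.endsWith(suffix))) continue;

    try {
      // step 2: forward lookup must resolve back to the original IP.
      // Plain string comparison; IPv6 addresses may need normalization first.
      const addresses = await dns.forward(host);
      if (addresses.includes(ip)) return true;
    } catch {
      // forward lookup failed for this hostname; try the next one
    }
  }
  return false;
}
```

The forward step is what defeats spoofing: anyone who controls the reverse zone for their own IP can publish a PTR record ending in googlebot.com, but they cannot make Google's forward zone resolve that hostname to their address.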

Do not disable this checker in production. Spoofed Googlebot traffic is one of the most common scraping techniques, and a User-Agent string alone cannot prove authenticity; verifying the client IP through DNS is the reliable way to confirm it.