Good / Bad Bot Verification
The good bots checker intercepts requests from any User-Agent whose browser type is crawler or fetcher. It then decides whether that crawler is a verified legitimate bot or an impostor. Verified crawlers receive GOOD_BOT_IDENTIFIED and bypass the rest of the pipeline. Unverified ones receive BAD_BOT_DETECTED and are banned immediately.
This checker protects against a common attack pattern where malicious bots spoof well-known crawler names (e.g. Googlebot) to avoid detection. It also prevents unknown or unlisted bots from crawling the server when banUnlistedBots is enabled.
How It Works
The checker first parses the User-Agent and checks whether the browser type resolves to crawler or fetcher. Requests from regular browser User-Agents pass through this checker without scoring.
For crawlers, the checker consults two lists:
- Bots with a domain suffix (Googlebot, Bingbot, Yandex, and others in suffix.json): the checker performs a reverse DNS lookup on the client IP, then a forward DNS lookup on the returned hostname to verify the IP matches. This two-step DNS verification confirms the bot genuinely originates from the claimed organization.
- Bots without a suffix (duckduckbot, gptbot, oai-searchbot, chatgpt-user): the checker validates the client IP against pre-compiled IP ranges in goodBots.mmdb.
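
The two-step DNS check described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the checker's actual code: `Resolver`, `matchesSuffix`, and `verifyByDns` are hypothetical names, suffixes are assumed to be stored without a leading dot, and the resolver is injected so the logic can be exercised without network access.

```typescript
// Hypothetical resolver interface; a real one would wrap the platform's DNS client.
interface Resolver {
  reverse(ip: string): Promise<string[]>;       // PTR lookup: IP -> hostnames
  forward(hostname: string): Promise<string[]>; // A/AAAA lookup: hostname -> IPs
}

// True when the hostname equals a suffix or ends with "." + suffix.
// Matching on the dot boundary rejects look-alikes such as "evilgooglebot.com".
function matchesSuffix(hostname: string, suffixes: string[]): boolean {
  const host = hostname.toLowerCase().replace(/\.$/, ""); // drop trailing root dot
  return suffixes.some((s) => host === s || host.endsWith("." + s));
}

// Reverse lookup, suffix check, then forward lookup back to the same IP.
async function verifyByDns(
  ip: string,
  suffixes: string[],
  resolver: Resolver,
): Promise<boolean> {
  try {
    const hostnames = await resolver.reverse(ip);
    for (const hostname of hostnames) {
      if (!matchesSuffix(hostname, suffixes)) continue;
      const addresses = await resolver.forward(hostname);
      // Plain string comparison; IPv6 addresses would need normalizing first.
      if (addresses.includes(ip)) return true;
    }
  } catch {
    // Assumption: a DNS error counts as a failed verification.
  }
  return false;
}
```

In Node.js, `reverse` could be backed by `dns.promises.reverse` and `forward` by `dns.promises.resolve4` (plus `resolve6` for IPv6 clients). The forward step matters because whoever controls an IP's reverse zone can set its PTR record to any hostname, including one under a crawler's domain; only the forward lookup proves that the domain owner maps that hostname to the same IP.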
If either verification passes, the checker returns GOOD_BOT_IDENTIFIED and the pipeline ends. If verification fails, the checker returns BAD_BOT_DETECTED and the visitor is banned immediately, regardless of score. Unknown bot User-Agents that appear in neither list are banned when banUnlistedBots is true.
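
The outcomes described above can be summarized as a small decision function. This is a sketch only: `CrawlerRequest`, `BotList`, and `decide` are hypothetical names, and the treatment of an unlisted crawler that fails the IP range check while banUnlistedBots is false is an assumption, since the text does not spell that case out.

```typescript
type ReasonCode = "GOOD_BOT_IDENTIFIED" | "BAD_BOT_DETECTED";
type BotList = "dns" | "ipRange" | "unlisted";

// Hypothetical summary of what the checker knows about a request.
interface CrawlerRequest {
  isCrawler: boolean;       // browser type resolved to crawler or fetcher
  list: BotList;            // which known-bot list the User-Agent matched, if any
  dnsVerified: boolean;     // reverse + forward DNS check succeeded
  ipRangeVerified: boolean; // client IP found in the pre-compiled ranges
}

// Returns a reason code, or null when the checker does not apply.
function decide(req: CrawlerRequest, banUnlistedBots: boolean): ReasonCode | null {
  // Regular browser User-Agents pass through without scoring.
  if (!req.isCrawler) return null;

  if (req.list === "unlisted") {
    if (banUnlistedBots) return "BAD_BOT_DETECTED";
    // Assumption: with banUnlistedBots disabled, the IP range check decides,
    // and a failed check is treated like any other failed verification.
    return req.ipRangeVerified ? "GOOD_BOT_IDENTIFIED" : "BAD_BOT_DETECTED";
  }

  const verified = req.list === "dns" ? req.dnsVerified : req.ipRangeVerified;
  return verified ? "GOOD_BOT_IDENTIFIED" : "BAD_BOT_DETECTED";
}
```

Either non-null result ends the pipeline: GOOD_BOT_IDENTIFIED allows the request and BAD_BOT_DETECTED bans the visitor.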
Configuration
    await defineConfiguration({
      store: { main: { driver: 'sqlite', name: './bot-detector.db' } },
      checkers: {
        enableGoodBotsChecks: {
          enable: true,
          banUnlistedBots: true,
          penalties: 100,
        },
      },
    })
- enable: turns the checker on or off. Default: true.
- banUnlistedBots: when true, any crawler User-Agent that does not appear in the known bot lists triggers BAD_BOT_DETECTED and an immediate ban. When false, unknown crawlers are passed to the IP range check instead. Default: true.
- penalties: the score associated with BAD_BOT_DETECTED, which bans the visitor immediately regardless of this score. Default: 100.

Reason Codes
| Code | Trigger |
|---|---|
| GOOD_BOT_IDENTIFIED | DNS or IP range verification passed. Pipeline stops and the request is allowed. |
| BAD_BOT_DETECTED | Crawler User-Agent failed verification, or is an unlisted bot when banUnlistedBots is true. Pipeline stops and the visitor is banned. |
Supported Crawlers
The following crawlers are recognized and verified:
| Verification method | Examples |
|---|---|
| DNS (reverse + forward lookup) | Googlebot, Bingbot, Yandexbot, Applebot, Meta crawler, Twitterbot |
| IP range check (goodBots.mmdb) | DuckDuckBot, GPTBot, OAI-SearchBot, ChatGPT-User |

For example, a request claiming to be Googlebot must reverse-resolve to a hostname ending in googlebot.com or google.com, and then forward DNS on that hostname must resolve back to the original IP.
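
For the IP range row of the table above, the idea can be illustrated with plain IPv4 CIDR matching. This is a stand-in for illustration only: the checker reads its ranges from goodBots.mmdb, whose lookup API is not shown here. `ipv4ToInt`, `inCidr`, and `inAnyRange` are hypothetical helpers, and the ranges in the usage note are reserved documentation blocks, not real crawler ranges.

```typescript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit number, or null if malformed.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet; // arithmetic instead of bit shifts avoids signed overflow
  }
  return value;
}

// True when the IPv4 address falls inside the CIDR block, e.g. "192.0.2.0/24".
function inCidr(ip: string, cidr: string): boolean {
  const [base, bitsText] = cidr.split("/");
  if (!/^\d{1,2}$/.test(bitsText ?? "")) return false;
  const bits = Number(bitsText);
  const ipInt = ipv4ToInt(ip);
  const baseInt = ipv4ToInt(base ?? "");
  if (ipInt === null || baseInt === null || bits > 32) return false;
  const blockSize = 2 ** (32 - bits); // number of addresses in the block
  return Math.floor(ipInt / blockSize) === Math.floor(baseInt / blockSize);
}

// True when the address is inside at least one of the listed ranges.
function inAnyRange(ip: string, ranges: string[]): boolean {
  return ranges.some((range) => inCidr(ip, range));
}
```

With placeholder ranges, `inAnyRange("192.0.2.17", ["192.0.2.0/24", "198.51.100.0/24"])` is true, while an address outside both blocks, such as `203.0.113.9`, is rejected. A compiled database like an .mmdb file typically answers the same question with a prefix-tree lookup rather than a linear scan.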