Bot Detection

Configuration

Full reference for every option accepted by defineConfiguration.

defineConfiguration accepts a single configuration object. Every field except store.main has a default value, so store.main is the only field you must supply. The configuration is validated against a Zod schema at startup, and invalid values cause the server to fail immediately with a clear error message.

server.ts

await defineConfiguration({
  store: {
    main: { driver: 'sqlite', name: './bot-detector.db' },
  },
  // All other options are optional and have defaults
})

`store`

The store object configures the persistent database where visitor records and ban history are stored. Only store.main is required.

main

DbConfig required

The database driver configuration. Pass the driver name and any connection options it requires. DbConfig is a discriminated union keyed on driver.

Driver	`driver` value	Required peer dep
SQLite	`sqlite`	`better-sqlite3`
MySQL connection pool	`mysql-pool`	`mysql2 >= 3`
PostgreSQL	`postgresql`	`pg`
Cloudflare D1	`cloudflare-d1`	Worker environment binding
PlanetScale	`planetscale`	Serverless driver

server.ts

// SQLite: good default for single-server deployments
store: { main: { driver: 'sqlite', name: './bot-detector.db' } }

// MySQL pool
store: { main: { driver: 'mysql-pool', host: 'localhost', user: 'root', password: 'secret', database: 'mydb' } }

// PostgreSQL
store: { main: { driver: 'postgresql', connectionString: 'postgres://user:pass@localhost/mydb' } }

`storage`

The storage field configures the cache layer where visitor state, behavioral rate counters, and session records are stored between requests. When omitted, everything is stored in process memory, which works for single-process deployments but is lost on restart and not shared across multiple instances.

storage

CacheConfig

Optional cache driver configuration. CacheConfig is a discriminated union keyed on driver. Omit this field to use in-process memory.

Driver	`driver` value	Notes
Memory (default)	(omit `storage`)	Single-process only, lost on restart
LRU cache	`lru`	In-process LRU; set `max` (item limit) and `ttl` (ms)
Redis	`redis`	Recommended for multi-instance deployments
Upstash Redis	`upstash`	Serverless Redis via HTTP
Filesystem	`fs`	Persistent local storage, good for development
Cloudflare KV (binding)	`cloudflare-kv-binding`	Pass `binding` from the Worker environment
Cloudflare KV (HTTP)	`cloudflare-kv-http`	Pass `accountId`, `namespaceId`, `apiToken`
Cloudflare R2	`cloudflare-r2-binding`	Pass `binding`
Vercel	`vercel`	Vercel Runtime Cache

server.ts

// Redis: shared state across multiple app instances
storage: { driver: 'redis', host: 'localhost', port: 6379 }

// LRU: bounded in-process cache
storage: { driver: 'lru', max: 10000, ttl: 1000 * 60 * 10 }

When running multiple Node.js processes behind a load balancer, omitting storage or using an in-process driver (lru, fs) means each process maintains its own independent behavioral state. The rate tracking, velocity fingerprint, and session coherence checkers all key their cache entries by canary cookie. If a load balancer routes consecutive requests from the same visitor to different processes, those checkers will see incomplete histories and produce weaker or inconsistent signals. Configure a shared external driver such as redis or upstash so all instances read and write the same state.

`banScore`

banScore

number

Default: 100. The cumulative score threshold at which a visitor is banned. When a request's accumulated penalty points reach or exceed this value at any point in the pipeline, the middleware immediately responds with 403 and records the ban. Lower values ban visitors more aggressively. A value of 30 would ban a visitor that triggers just two or three moderate checks, while the default of 100 requires several checks to fail before a ban is issued.

banScore: 75 // ban after accumulating 75 penalty points

`maxScore`

maxScore

number

Default: 100. The ceiling on the total score that any single request can accumulate. Penalty points beyond this value are ignored. In most configurations this matches banScore, but you can set it lower to cap extreme outliers from inflating scores beyond what is meaningful.
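
One way to picture how banScore and maxScore interact is as a capped running total. The sketch below is illustrative only and is not the library's internal code: accumulateScore and its result shape are hypothetical names, and it assumes that penalties are applied in pipeline order and that the cap is applied before the ban comparison.

```typescript
interface ScoreLimits {
  banScore: number; // threshold at which a visitor is banned
  maxScore: number; // ceiling on the accumulated score
}

interface ScoreResult {
  score: number;     // final (capped) score
  banned: boolean;   // true once the score reached banScore
  checksRun: number; // how many penalties were processed before stopping
}

// Hypothetical helper: adds penalties one by one, caps the running
// total at maxScore, and stops as soon as banScore is reached.
function accumulateScore(penalties: number[], limits: ScoreLimits): ScoreResult {
  let score = 0;
  let checksRun = 0;
  for (const penalty of penalties) {
    checksRun += 1;
    score = Math.min(score + penalty, limits.maxScore);
    if (score >= limits.banScore) {
      return { score, banned: true, checksRun };
    }
  }
  return { score, banned: false, checksRun };
}

// Two moderate penalties ban at banScore 30 but not at the default of 100.
const strictOutcome = accumulateScore([20, 10], { banScore: 30, maxScore: 100 });
const defaultOutcome = accumulateScore([20, 10], { banScore: 100, maxScore: 100 });
```

Under these assumptions, a maxScore below banScore would prevent any ban from being reached, which is worth keeping in mind when lowering it.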

`restoredReputationPoints`

restoredReputationPoints

number

Default: 10. The number of points the reputation healer subtracts from a visitor's stored score after each clean request. A clean request is one that does not result in a ban. This gives legitimate visitors a path to recover from an initial high score caused by unusual network conditions. For example, with restoredReputationPoints: 10 and an initial score of 40, a visitor needs four consecutive clean requests to reach a score of 0.
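
The recovery arithmetic can be written down as a small sketch. Both function names are hypothetical, and the code assumes the stored score never drops below 0, which the reference does not state explicitly.

```typescript
// Hypothetical helper: one healing step after a clean request.
// Assumes the stored score is floored at 0.
function healOnce(storedScore: number, restoredReputationPoints: number): number {
  return Math.max(0, storedScore - restoredReputationPoints);
}

// Hypothetical helper: how many consecutive clean requests are needed
// for a visitor to get from initialScore back to 0.
function cleanRequestsToRecover(initialScore: number, restoredReputationPoints: number): number {
  if (restoredReputationPoints <= 0) {
    throw new RangeError('restoredReputationPoints must be positive');
  }
  return Math.ceil(Math.max(0, initialScore) / restoredReputationPoints);
}

// 40 points healed 10 at a time takes 4 clean requests.
const requestsNeeded = cleanRequestsToRecover(40, 10);
```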

`setNewComputedScore`

setNewComputedScore

boolean

Default: false. Controls how the computed bot score is written to the database on each request.

`false`: Snapshot then heal (default). The detector writes the computed score once on the visitor's first request (or after the cache expires). On every subsequent request, the reputation healer decrements the stored score by `restoredReputationPoints`. This mode is efficient because the expensive score computation only runs on cache misses.

`true`: Live snapshot. The detector overwrites the stored score on every single request. The reputation healer then immediately decrements it. The database always reflects the freshest computed risk for every visitor, at the cost of one extra database write per request. Choose true when you need your dashboard or reporting tools to show the current risk score after every page view.
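
The difference between the two modes can be sketched as a single state transition. This is an interpretation of the description above rather than the library's actual code: nextStoredScore is a hypothetical name, a stored value of null stands for "no cached record yet", and the score is assumed to be floored at 0.

```typescript
// Hypothetical sketch of what ends up stored after one request.
function nextStoredScore(
  stored: number | null,        // null: first request or expired cache (assumption)
  computed: number,             // score the pipeline would compute for this request
  setNewComputedScore: boolean,
  restoredReputationPoints: number,
): number {
  if (setNewComputedScore) {
    // Live snapshot: overwrite with the fresh score, then heal immediately.
    return Math.max(0, computed - restoredReputationPoints);
  }
  if (stored === null) {
    // Snapshot then heal: the computed score is written once.
    return computed;
  }
  // Subsequent requests only decrement the stored snapshot.
  return Math.max(0, stored - restoredReputationPoints);
}
```

With restoredReputationPoints: 10, a visitor stored at 40 whose fresh score would be 70 ends up at 30 in the default mode and at 60 in live-snapshot mode.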

`whiteList`

whiteList

(IPv4 | IPv6 | string)[]

Default: []. A list of IP addresses or CIDR strings that bypass the entire detection pipeline. Requests from these addresses skip all checkers and pass directly to the next handler. This is useful for internal monitoring tools, health check probes, or trusted partner IPs.

whiteList: ['127.0.0.1', '::1', '10.0.0.0/8']
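
To make the matching rules concrete, here is a minimal sketch of how an allow-list containing plain addresses and IPv4 CIDR ranges could be evaluated. It is not the library's matcher: isWhitelisted and its helpers are hypothetical names, IPv6 entries are only compared as exact strings, and IPv6 CIDR ranges are deliberately left out.

```typescript
// Converts dotted-quad IPv4 text to an unsigned 32-bit number, or null if invalid.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split('.');
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// True when ip falls inside an IPv4 entry such as '10.0.0.0/8' or '127.0.0.1'.
function matchesIpv4Entry(ip: string, entry: string): boolean {
  const slash = entry.indexOf('/');
  const base = slash === -1 ? entry : entry.slice(0, slash);
  const prefixText = slash === -1 ? '32' : entry.slice(slash + 1);
  if (!/^\d{1,2}$/.test(prefixText)) return false;
  const prefix = Number(prefixText);
  if (prefix > 32) return false;
  const ipInt = ipv4ToInt(ip);
  const baseInt = ipv4ToInt(base);
  if (ipInt === null || baseInt === null) return false;
  // Build the network mask; >>> 0 keeps the values unsigned.
  const mask = prefix === 0 ? 0 : (0xffffffff << (32 - prefix)) >>> 0;
  return ((ipInt & mask) >>> 0) === ((baseInt & mask) >>> 0);
}

// Hypothetical helper: exact string match first, then IPv4 CIDR match.
function isWhitelisted(ip: string, whiteList: string[]): boolean {
  return whiteList.some((entry) => entry === ip || matchesIpv4Entry(ip, entry));
}
```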

`checksTimeRateControl`

Controls how often the full detection pipeline runs for returning visitors who are already cached.

checkEveryRequest

boolean

Default: true. When true, runs the full pipeline on every request regardless of cache.

checkEvery

number

Default: 300000. When checkEveryRequest is false, only re-runs the pipeline after this many milliseconds (5 minutes by default). A visitor whose result is cached passes through immediately until this interval elapses.

checksTimeRateControl: {
  checkEveryRequest: false,
  checkEvery: 1000 * 60 * 2, // re-check every 2 minutes
}
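
The decision these two fields describe can be sketched as a small predicate. The function name and the lastCheckedAt parameter are hypothetical; the sketch assumes the cache records when the pipeline last ran for a visitor.

```typescript
interface ChecksTimeRateControl {
  checkEveryRequest: boolean;
  checkEvery: number; // milliseconds
}

// Hypothetical helper: should the full pipeline run for this request?
function shouldRunPipeline(
  lastCheckedAt: number | undefined, // undefined: visitor not cached (assumption)
  now: number,
  control: ChecksTimeRateControl,
): boolean {
  if (control.checkEveryRequest) return true;  // always run
  if (lastCheckedAt === undefined) return true; // nothing cached yet
  return now - lastCheckedAt >= control.checkEvery;
}

const twoMinuteControl: ChecksTimeRateControl = {
  checkEveryRequest: false,
  checkEvery: 1000 * 60 * 2,
};
```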

`batchQueue`

The batch queue collects visitor writes and flushes them to the database asynchronously. This decouples visitor persistence from the request path, so database latency never affects response time.

flushIntervalMs

number

Default: 5000. How often the queue flushes pending writes, in milliseconds. Increase this to reduce database load on busy servers.

maxBufferSize

number

Default: 100. Triggers an immediate flush when this many jobs are queued. Increase this if your batch sizes consistently hit the limit before the interval fires.

maxRetries

number

Default: 3. Retry attempts on a failed flush before the batch is discarded.
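
The interplay of maxBufferSize and maxRetries can be illustrated with a simplified queue. This is a sketch, not the library's implementation: SketchBatchQueue is a hypothetical class, the write callback is synchronous to keep the example short, the flushIntervalMs timer is omitted, and maxRetries is assumed to mean retries in addition to the first attempt.

```typescript
interface BatchQueueOptions {
  maxBufferSize: number;
  maxRetries: number;
}

class SketchBatchQueue<T> {
  private buffer: T[] = [];
  private readonly write: (batch: T[]) => void;
  private readonly options: BatchQueueOptions;

  constructor(write: (batch: T[]) => void, options: BatchQueueOptions) {
    this.write = write;
    this.options = options;
  }

  // Number of jobs waiting for the next flush.
  get pending(): number {
    return this.buffer.length;
  }

  // Queues a job and flushes immediately once the buffer is full.
  enqueue(job: T): void {
    this.buffer.push(job);
    if (this.buffer.length >= this.options.maxBufferSize) {
      this.flush();
    }
  }

  // Writes the current batch; returns false if it was discarded after
  // the first attempt plus maxRetries retries all failed.
  flush(): boolean {
    if (this.buffer.length === 0) return true;
    const batch = this.buffer;
    this.buffer = [];
    for (let attempt = 0; attempt <= this.options.maxRetries; attempt += 1) {
      try {
        this.write(batch);
        return true;
      } catch {
        // swallow the error and try again
      }
    }
    return false;
  }
}
```

A production queue would additionally call flush() from a timer every flushIntervalMs and await an asynchronous database write.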

`punishmentType`

enableFireWallBan

boolean

Default: false. When true, issues a ufw OS-level firewall rule in addition to the 403 response. Banned IPs are blocked at the network layer via sudo ufw insert 1 deny from <ip>, preventing traffic from reaching the Node.js process on subsequent connections.

enableFireWallBan requires a Linux environment with ufw installed and passwordless sudo access for the Node.js process. It has no effect and should not be enabled on macOS or Windows.

`logLevel`

logLevel

'debug' | 'info' | 'warn' | 'error' | 'fatal'

Default: 'info'. Sets the Pino log level for the middleware. Use 'debug' during development to see per-request checker decisions. Set 'warn' or 'error' in production to reduce log volume.

`checkers`

Every checker is enabled by default. To disable a checker entirely, pass { enable: false }. To adjust its penalty values, pass { enable: true, penalties: { ... } } with the fields you want to override. All unspecified fields keep their defaults.

checkers: {
  enableTorAnalysis: { enable: false }, // disable completely
  enableBehaviorRateCheck: {
    enable: true,
    behavioral_threshold: 20, // stricter rate limit
    penalties: 80,            // heavier penalty
  },
}
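
The "unspecified fields keep their defaults" rule amounts to a shallow merge of the penalties object. The sketch below shows that idea for the enableTorAnalysis checker, using the default weights listed later on this page. The type names and resolveTorConfig are hypothetical, and the real merge logic may differ.

```typescript
interface TorPenalties {
  runningNode: number;
  exitNode: number;
  webExitCapable: number;
  guardNode: number;
  badExit: number;
  obsoleteVersion: number;
}

interface TorCheckerConfig {
  enable: boolean;
  penalties: TorPenalties;
}

type TorCheckerOverride = { enable: boolean; penalties?: Partial<TorPenalties> };

// Defaults as documented for enableTorAnalysis.
const torDefaults: TorCheckerConfig = {
  enable: true,
  penalties: {
    runningNode: 15,
    exitNode: 20,
    webExitCapable: 15,
    guardNode: 10,
    badExit: 40,
    obsoleteVersion: 10,
  },
};

// Hypothetical helper: overlays user overrides on top of the defaults.
function resolveTorConfig(override?: TorCheckerOverride): TorCheckerConfig {
  if (override === undefined) return torDefaults;
  return {
    enable: override.enable,
    penalties: { ...torDefaults.penalties, ...override.penalties },
  };
}

// Only exitNode changes; badExit and the rest keep their defaults.
const resolvedTor = resolveTorConfig({ enable: true, penalties: { exitNode: 50 } });
```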

`enableIpChecks`

Phase: cheap

Validates that the client IP is a properly formatted IPv4 or IPv6 address. Requests with a malformed or missing IP receive a penalty. Automated scripts that manipulate the X-Forwarded-For header to produce an invalid IP value are caught here.

penalties

number

Default: 10. Applied when the IP is invalid or cannot be parsed.

`enableGoodBotsChecks`

Phase: cheap

Identifies legitimate crawlers such as Googlebot, Bingbot, DuckDuckBot, Apple, and Meta by matching the client IP against the compiled goodBots.mmdb. When a match is found, the request is immediately exempted from scoring.

banUnlistedBots controls what happens when a request presents a bot-like User-Agent that is not in the verified crawler list. When true (default), unlisted bots receive the full penalties score.

banUnlistedBots

boolean

Default: true. When true, bots not present in the verified crawler database receive the full penalty score.

penalties

number

Default: 100. Score applied to unlisted bots when banUnlistedBots is true.

`enableBrowserAndDeviceChecks`

Phase: cheap

Inspects the parsed User-Agent for browsers and device combinations that are impossible or highly suspicious in a real-user context. Each condition carries its own penalty weight.

All weights below live inside the penalties: {} sub-object.

cliOrLibrary

number

Default: 100. User-Agent belongs to a CLI tool or HTTP library (curl, python-requests, etc.).

internetExplorer

number

Default: 100. User-Agent identifies as Internet Explorer.

linuxOs

number

Default: 10. Desktop visit from Linux (elevated risk signal, not a ban signal alone).

impossibleBrowserCombinations

number

Default: 30. Browser and OS combination that cannot exist in practice.

browserTypeUnknown

number

Default: 10. Browser type field could not be determined.

browserNameUnknown

number

Default: 10. Browser name field could not be determined.

desktopWithoutOS

number

Default: 10. Desktop device type with no operating system in the UA.

deviceVendorUnknown

number

Default: 10. Device vendor could not be determined.

browserVersionUnknown

number

Default: 10. Browser version could not be determined.

deviceModelUnknown

number

Default: 5. Device model could not be determined.

`localeMapsCheck`

Phase: cheap

Compares the Accept-Language header with the geolocation of the client IP. A legitimate browser sends a language that matches the country the IP is registered in. Automated tools often send a hardcoded or missing Accept-Language.

All weights below live inside the penalties: {} sub-object.

ipAndHeaderMismatch

number

Default: 20. Accept-Language locale does not match the IP's geo locale.

missingHeader

number

Default: 20. Accept-Language header is absent.

missingGeoData

number

Default: 20. Geo data is unavailable for the IP.

malformedHeader

number

Default: 30. Accept-Language header is present but cannot be parsed.

`enableKnownThreatsDetections`

Phase: cheap

Checks the client IP against the FireHOL threat intelligence feeds. FireHOL maintains four ranked lists of known malicious IPs, plus an anonymizer feed for VPNs and anonymizing proxies.

All weights below live inside the penalties: {} sub-object. The four threatLevels fields are nested one level deeper inside penalties.threatLevels: {}.

anonymiseNetwork

number

Default: 20. IP is in the FireHOL anonymous (VPN/anonymizer) feed.

threatLevels.criticalLevel1

number

Default: 40. IP is in FireHOL level 1 (active attack sources).

threatLevels.currentAttacksLevel2

number

Default: 30. IP is in FireHOL level 2 (current attack participants).

threatLevels.threatLevel3

number

Default: 20. IP is in FireHOL level 3 (broader threat list).

threatLevels.threatLevel4

number

Default: 10. IP is in FireHOL level 4 (extended watch list).

`enableAsnClassification`

Phase: cheap

Checks the Autonomous System Number associated with the client IP. Most bot traffic originates from hosting providers, cloud platforms, and data centers, so IPs from AS networks classified as hosting or content delivery receive a penalty. Networks with very few visible routes (low visibility) are also penalized, as they are characteristic of residential proxy services.

All weights below live inside the penalties: {} sub-object.

contentClassification

number

Default: 20. ASN is classified as a hosting or CDN network.

unknownClassification

number

Default: 10. ASN classification cannot be determined.

lowVisibilityPenalty

number

Default: 10. ASN has fewer routes visible than lowVisibilityThreshold.

lowVisibilityThreshold

number

Default: 15. Route-count cutoff. lowVisibilityPenalty applies when the ASN has fewer visible routes than this value.

comboHostingLowVisibility

number

Default: 20. ASN is both hosting-classified and low-visibility.

`enableTorAnalysis`

Phase: cheap

Checks the client IP against the compiled Tor relay database. Tor exit nodes are the most likely to be used for malicious automation, while guard nodes and running relays carry lower risk. Obsolete Tor versions suggest a misconfigured or malicious relay.

All weights below live inside the penalties: {} sub-object.

runningNode

number

Default: 15. IP belongs to an active Tor relay.

exitNode

number

Default: 20. IP is a Tor exit node.

webExitCapable

number

Default: 15. IP is capable of exiting to web ports.

guardNode

number

Default: 10. IP is a Tor guard (entry) node.

badExit

number

Default: 40. IP is flagged as a bad Tor exit.

obsoleteVersion

number

Default: 10. IP is running an outdated Tor version.

`enableTimezoneConsistency`

Phase: cheap

Compares the timezone sent in request headers against the timezone expected for the client's geo IP. A mismatch suggests the visitor is using a VPN or proxy that routes through a different region than their actual location.

penalties

number

Default: 20. Applied when the declared timezone does not match the geo timezone.

`honeypot`

Phase: cheap

Monitors requests to a configurable list of trap URLs. These paths serve no legitimate purpose and should never be visited by a real user. Automated scanners and vulnerability probes routinely request paths like /.env, /wp-login.php, and /admin. Any request to a listed path immediately sets the score to banScore, triggering a ban.

paths

string[]

Default: []. List of URL paths to treat as honeypots. Any request matching one of these paths is banned immediately.

honeypot: {
  enable: true,
  paths: ['/.env', '/wp-login.php', '/admin', '/phpMyAdmin'],
}
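
As a rough illustration of the matching step, the sketch below compares the request path against the configured list. isHoneypotHit is a hypothetical name, and the sketch assumes an exact, case-sensitive comparison with the query string ignored. The reference does not specify the library's actual matching rules.

```typescript
// Hypothetical helper: true when the request targets a configured trap path.
function isHoneypotHit(requestUrl: string, paths: string[]): boolean {
  const queryStart = requestUrl.indexOf('?');
  const pathOnly = queryStart === -1 ? requestUrl : requestUrl.slice(0, queryStart);
  return paths.includes(pathOnly);
}

const trapPaths = ['/.env', '/wp-login.php', '/admin', '/phpMyAdmin'];
```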

`enableKnownBadIpsCheck`

Phase: cheap

Checks the client IP against your custom highRisk.mmdb, which is generated by bot-detector generate from your own visitor history. IPs that have previously accumulated a high suspicion score in your database are caught here on all future requests without re-running the full pipeline.

highRiskPenalty

number

Default: 30. Score applied when the IP is found in highRisk.mmdb.

Run bot-detector generate periodically to keep this database current with your latest visitor data.

`enableBehaviorRateCheck`

Phase: heavy

Tracks request frequency per canary cookie. When a visitor sends more requests than behavioral_threshold within the behavioral_window time period, the excess is penalized. This catches fast-scanning bots and automated scripts that make hundreds of requests per minute.

behavioral_window

number

Default: 60000. Sliding time window in milliseconds (1 minute by default).

behavioral_threshold

number

Default: 30. Maximum requests allowed within the window before the penalty applies.

penalties

number

Default: 60. Score applied when the threshold is exceeded.

enableBehaviorRateCheck: {
  enable: true,
  behavioral_window: 60_000,   // 1-minute window
  behavioral_threshold: 15,    // stricter: max 15 req/min
  penalties: 80,
}
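
A sliding-window counter is one common way to implement this kind of check. The sketch below keeps timestamps in memory per key and is only an illustration: SlidingWindowCounter is a hypothetical class, the real checker stores its counters in the configured cache driver, and "exceeded" is assumed to mean strictly more requests than the threshold.

```typescript
class SlidingWindowCounter {
  private readonly hits = new Map<string, number[]>();
  private readonly windowMs: number;
  private readonly threshold: number;

  constructor(windowMs: number, threshold: number) {
    this.windowMs = windowMs;
    this.threshold = threshold;
  }

  // Records a request for the key (for example the canary cookie value)
  // and returns true when the window now holds more than `threshold` requests.
  record(key: string, now: number): boolean {
    const cutoff = now - this.windowMs;
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    recent.push(now);
    this.hits.set(key, recent);
    return recent.length > this.threshold;
  }
}

// Mirrors behavioral_window: 60_000 and behavioral_threshold: 15.
const rateCounter = new SlidingWindowCounter(60_000, 15);
```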

`enableProxyIspCookiesChecks`

Phase: heavy

Performs three related checks: proxy detection against the compiled proxy.mmdb, ISP classification for unknown or suspicious providers, and canary cookie presence. A missing canary cookie on a non-first request strongly suggests a bot that discards cookies between requests.

All weights below live inside the penalties: {} sub-object.

cookieMissing

number

Default: 80. canary_id cookie is absent on a non-first request.

proxyDetected

number

Default: 40. IP is in the proxy database.

hostingDetected

number

Default: 50. IP belongs to a known hosting or data-center provider.

ispUnknown

number

Default: 10. ISP cannot be determined from the IP.

orgUnknown

number

Default: 10. Organisation cannot be determined from the IP.

multiSourceBonus2to3

number

Default: 10. IP flagged by 2 to 3 proxy sources (cumulative risk bonus).

multiSourceBonus4plus

number

Default: 20. IP flagged by 4 or more proxy sources.

`enableUaAndHeaderChecks`

Phase: heavy

Runs a multi-factor inspection of the User-Agent and HTTP headers. It detects headless browsers by looking for tell-tale header patterns, penalizes suspiciously short User-Agents, and checks basic TLS cipher, protocol version, and HTTP version consistency.

All weights below live inside the penalties: {} sub-object.

headlessBrowser

number

Default: 100. Headers match known headless browser patterns (Puppeteer, Playwright, etc.).

shortUserAgent

number

Default: 80. User-Agent string is shorter than expected for a real browser.

tlsCheckFailed

number

Default: 60. The TLS cipher suite, TLS version, or HTTP protocol version forwarded by the proxy does not match what a real browser would use. Requires the proxy to set x-client-cipher and x-client-tls-version headers.

badUaChecker

boolean

Default: true. When true, the LMDB User-Agent pattern library is consulted and penalties from knownBadUserAgents apply.

`enableGeoChecks`

Phase: heavy

Checks for missing or incomplete geolocation data and enforces country-level bans. Legitimate residential IPs consistently resolve to a full set of geo fields. Requests where city, region, country, timezone, or coordinates are unknown may originate from misconfigured VPNs, private IP ranges leaked through proxies, or IP addresses not present in the geo database.

bannedCountries accepts a list of ISO 3166-1 alpha-2 country codes. Any request from a banned country receives the full banScore, triggering an immediate ban.

bannedCountries is a top-level field on the checker. All geo unknown penalty weights live inside the penalties: {} sub-object.

bannedCountries

string[]

Default: []. ISO 3166-1 alpha-2 country codes to ban outright. Example: ['KP', 'IR'].

countryUnknown

number

Default: 10. Country cannot be determined.

regionUnknown

number

Default: 10. Region cannot be determined.

latLonUnknown

number

Default: 10. Coordinates cannot be determined.

districtUnknown

number

Default: 10. District cannot be determined.

cityUnknown

number

Default: 10. City cannot be determined.

timezoneUnknown

number

Default: 10. Timezone cannot be determined.

subregionUnknown

number

Default: 10. Sub-region cannot be determined.

phoneUnknown

number

Default: 10. Phone prefix cannot be determined.

continentUnknown

number

Default: 10. Continent cannot be determined.

`enableSessionCoherence`

Phase: heavy

Inspects the Referer header for consistency with the request path and domain. Legitimate browsers include a Referer header on navigation requests and it consistently matches the current site's domain. Bots that crawl by constructing URLs directly often produce missing, mismatched, or cross-domain referers.

All weights below live inside the penalties: {} sub-object.

pathMismatch

number

Default: 10. Referer path does not match the expected navigation flow.

missingReferer

number

Default: 20. Referer is absent on a request that should have one.

domainMismatch

number

Default: 30. Referer domain does not match the current site domain.

`enableVelocityFingerprint`

Phase: heavy

Measures the statistical regularity of a visitor's inter-request timing using the coefficient of variation (CV). Human users have naturally irregular timing between requests. Automated scripts often produce highly regular intervals. When the CV of a visitor's request timing falls below cvThreshold, the request is penalized.

cvThreshold

number

Default: 0.1. Coefficient of variation below which the penalty applies. Lower values require more regularity to trigger.

penalties

number

Default: 40. Score applied when timing is unnaturally regular.
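
The coefficient of variation is the standard deviation of the inter-request intervals divided by their mean. The following sketch computes it from a list of request timestamps. The function names are hypothetical, and two details are assumptions: the population standard deviation is used, and at least three timestamps are required before a verdict is given.

```typescript
// Returns the CV of the gaps between consecutive timestamps (ms),
// or null when there is not enough data to judge.
function coefficientOfVariation(timestamps: number[]): number | null {
  if (timestamps.length < 3) return null;
  const intervals: number[] = [];
  let previous: number | undefined;
  for (const t of timestamps) {
    if (previous !== undefined) intervals.push(t - previous);
    previous = t;
  }
  const mean = intervals.reduce((sum, v) => sum + v, 0) / intervals.length;
  if (mean <= 0) return null;
  const variance =
    intervals.reduce((sum, v) => sum + (v - mean) ** 2, 0) / intervals.length;
  return Math.sqrt(variance) / mean;
}

// Hypothetical helper: true when timing is more regular than cvThreshold allows.
function isTimingTooRegular(timestamps: number[], cvThreshold = 0.1): boolean {
  const cv = coefficientOfVariation(timestamps);
  return cv !== null && cv < cvThreshold;
}
```

A script firing exactly once per second yields a CV of 0 and is flagged, while irregular gaps such as 400 ms, 2100 ms, 200 ms, and 6300 ms produce a CV well above 0.1.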

`knownBadUserAgents`

Phase: heavy

Cross-references the User-Agent string against an LMDB database of known malicious, scraper, and vulnerability-scanner patterns. Each pattern carries a severity level that maps to a separate penalty. This checker is the most comprehensive User-Agent check and is backed by a continuously updated pattern library.

All weights below live inside the penalties: {} sub-object.

criticalSeverity

number

Default: 100. Pattern matches a known critical-severity bad User-Agent.

highSeverity

number

Default: 80. Pattern matches a high-severity bad User-Agent.

mediumSeverity

number

Default: 30. Pattern matches a medium-severity bad User-Agent.

lowSeverity

number

Default: 10. Pattern matches a low-severity bad User-Agent.

This checker only runs when enableUaAndHeaderChecks.penalties.badUaChecker is true (the default). Disabling that option bypasses this checker regardless of its own enable setting.

`headerOptions`

Fine-grained penalty weights for the HTTP header fingerprint analysis performed by enableUaAndHeaderChecks. Each weight corresponds to a specific header anomaly. The total accumulated weight across all detected anomalies is added to the request score.

weightPerMustHeader

number

Default: 20. Each mandatory header that is absent for the declared browser type.

missingBrowserEngine

number

Default: 30. No browser engine header present.

postManOrInsomiaHeaders

number

Default: 50. Headers characteristic of Postman or Insomnia REST clients.

AJAXHeaderExists

number

Default: 30. X-Requested-With: XMLHttpRequest present in a non-AJAX context.

connectionHeaderIsClose

number

Default: 20. Connection: close header, which browsers do not send by default.

originHeaderIsNULL

number

Default: 10. Origin header is present but set to null.

originHeaderMismatch

number

Default: 30. Origin header domain does not match the request host.

omittedAcceptHeader

number

Default: 30. Accept header is completely absent.

clientHintsMissingForBlink

number

Default: 30. Client hint headers are absent for a Chromium (Blink) User-Agent.

teHeaderUnexpectedForBlink

number

Default: 10. TE header present in a Chromium request (Chromium does not send it).

clientHintsUnexpectedForGecko

number

Default: 30. Client hint headers are present for a Firefox (Gecko) User-Agent.

teHeaderMissingForGecko

number

Default: 20. TE header absent for a Firefox User-Agent (Firefox sends it).

aggressiveCacheControlOnGet

number

Default: 15. Cache-Control: no-cache or no-store on a GET request.

crossSiteRequestMissingReferer

number

Default: 10. Cross-site navigation without a Referer header.

inconsistentSecFetchMode

number

Default: 20. Sec-Fetch-Mode value inconsistent with the request type.

hostMismatchWeight

number

Default: 40. Host header does not match the configured server host.

`pathTraveler`

Configuration for the path traversal detection logic, which catches requests attempting to access files outside the web root using encoded ../ sequences.

maxIterations

number

Default: 3. Maximum decoding passes to attempt when looking for traversal sequences.

maxPathLength

number

Default: 1500. Maximum raw path length in characters before a penalty applies.

pathLengthToLong

number

Default: 100. Penalty applied when the path exceeds maxPathLength.

longDecoding

number

Default: 100. Penalty applied when the path requires more than maxIterations decoding passes.

traversalDetected

number

Default: 60. Penalty applied when a ../ traversal sequence is found after decoding.
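
How these five options could fit together is sketched below. This is an interpretation, not the library's code: scorePath and safeDecode are hypothetical names, decoding uses the standard decodeURIComponent, and the sketch returns the first applicable penalty rather than summing them.

```typescript
interface PathTravelerOptions {
  maxIterations: number;
  maxPathLength: number;
  pathLengthToLong: number;
  longDecoding: number;
  traversalDetected: number;
}

// Defaults as documented above.
const pathTravelerDefaults: PathTravelerOptions = {
  maxIterations: 3,
  maxPathLength: 1500,
  pathLengthToLong: 100,
  longDecoding: 100,
  traversalDetected: 60,
};

// Decodes one layer of percent-encoding; malformed input is returned unchanged.
function safeDecode(value: string): string {
  try {
    return decodeURIComponent(value);
  } catch {
    return value;
  }
}

// Hypothetical helper: returns the penalty for a raw request path (0 if clean).
function scorePath(rawPath: string, options: PathTravelerOptions = pathTravelerDefaults): number {
  if (rawPath.length > options.maxPathLength) return options.pathLengthToLong;
  let current = rawPath;
  let passes = 0;
  for (;;) {
    const decoded = safeDecode(current);
    if (decoded === current) break; // fully decoded
    if (passes === options.maxIterations) return options.longDecoding; // needs yet another pass
    current = decoded;
    passes += 1;
  }
  // Look for a '..' path segment delimited by slashes or backslashes.
  return /(^|[\\\/])\.\.([\\\/]|$)/.test(current) ? options.traversalDetected : 0;
}
```

For example, a double-encoded `%252e%252e%252f` sequence decodes to `../` in two passes and receives traversalDetected, while a path that still changes after three passes receives longDecoding.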

`generator`

Controls how bot-detector generate (and the programmatic runGeneration()) compiles your visitor history into custom MMDB databases.

scoreThreshold

number

Default: 70. Minimum suspicious_activity_score a visitor row must have to be included in highRisk.mmdb. Lowering it includes more visitors in the fast-rejection list; raising it makes the list more conservative.

generateTypes

boolean

Default: false. When true, generates TypeScript type definitions alongside the MMDB files.

deleteAfterBuild

boolean

Default: false. When true, deletes the source database rows after a successful compile. Useful for keeping your database lean. Banned rows and high-risk visitor records compiled into MMDB are no longer needed in SQL form.

mmdbctlPath

string

Default: 'mmdbctl'. Path to the mmdbctl binary. Override this when mmdbctl is not on the system PATH.
