UA & Header Analysis
The UA and header analysis checker is one of the most comprehensive checks in the pipeline. It evaluates multiple orthogonal signals about the HTTP request itself: the User-Agent string for headless browser keywords, the request TLS fingerprint for browser identity mismatch, the full set of HTTP headers for bot-characteristic patterns, and the request path for traversal attempts.
This checker runs in the heavy phase.
How It Works
Headless Browser Detection
The checker scans the User-Agent string for keywords associated with headless browser automation: headless, puppeteer, selenium, playwright, and phantomjs. Any match fires HEADLESS_BROWSER_DETECTED and applies headlessBrowser.
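The keyword scan described above can be sketched as follows. This is a minimal illustration, not the checker's actual source: the helper names, the result shape, and the case-insensitive matching are assumptions; only the keyword list, the reason code, and the headlessBrowser penalty come from this page.

```javascript
// Keywords taken from the paragraph above.
const HEADLESS_KEYWORDS = ['headless', 'puppeteer', 'selenium', 'playwright', 'phantomjs'];

// Hypothetical helper: returns the first matching keyword, or null.
// Lowercasing before matching is an assumption about the real checker.
function findHeadlessKeyword(userAgent) {
  const ua = String(userAgent ?? '').toLowerCase();
  return HEADLESS_KEYWORDS.find((kw) => ua.includes(kw)) ?? null;
}

// Hypothetical result shape; the library's real return type is not shown here.
function checkHeadless(userAgent, penalties = { headlessBrowser: 100 }) {
  const keyword = findHeadlessKeyword(userAgent);
  return keyword
    ? { score: penalties.headlessBrowser, reason: 'HEADLESS_BROWSER_DETECTED', keyword }
    : { score: 0, reason: null, keyword: null };
}
```

With the default penalty, a User-Agent containing HeadlessChrome would score 100 in this sketch, while an ordinary Firefox User-Agent would score 0.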
Short User-Agent
A User-Agent shorter than 10 characters is a strong signal of a script or misconfigured tool. Real browser User-Agents are always significantly longer. This check applies shortUserAgent when the condition is met.
TLS & Protocol Checks
The checker reads the x-client-cipher, x-client-tls-version, and HTTP version from the request and validates them against a whitelist of cipher suites and protocol versions that real browsers use. When the cipher or protocol does not match (for example, when a tool uses TLS 1.0 or a non-browser cipher), tlsCheckFailed fires. These headers must be forwarded by an upstream proxy such as Caddy or Nginx; they are not set by Node.js itself.
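As a rough sketch of the allow-list comparison, assuming the proxy forwards values in the format of Nginx's $ssl_cipher and $ssl_protocol variables. The allow-lists below are illustrative only (the checker's real lists are not given on this page), the function name is hypothetical, and treating missing headers as a failure is an assumption.

```javascript
// Illustrative allow-lists; NOT the checker's actual lists.
const ALLOWED_TLS_VERSIONS = new Set(['TLSv1.2', 'TLSv1.3']);
const ALLOWED_CIPHERS = new Set([
  'TLS_AES_128_GCM_SHA256',
  'TLS_AES_256_GCM_SHA384',
  'TLS_CHACHA20_POLY1305_SHA256',
  'ECDHE-RSA-AES128-GCM-SHA256',
  'ECDHE-ECDSA-AES128-GCM-SHA256',
]);

// Hypothetical helper. `headers` is expected in Node.js form (lowercase keys).
function tlsLooksLikeBrowser(headers) {
  const cipher = headers['x-client-cipher'];
  const version = headers['x-client-tls-version'];
  // Assumption: if the proxy did not forward the values, treat the check as failed.
  if (!cipher || !version) return false;
  return ALLOWED_CIPHERS.has(cipher) && ALLOWED_TLS_VERSIONS.has(version);
}
```

In this sketch a request forwarded with TLSv1 would fail the check, which corresponds to the case where tlsCheckFailed fires.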
Header Analysis
The checker runs a full header scoring pass that evaluates three groups of signals. All weights for these groups come from the top-level headerOptions configuration object.
Must-have headers: HTTP/1.0 requests score an immediate 40 points. Real browsers always send User-Agent, Accept, Accept-Encoding, Accept-Language, Host, Upgrade-Insecure-Requests, and Sec-Fetch-* headers. Missing any of these applies weightPerMustHeader per missing header.
Engine-specific headers: The checker resolves the browser engine from the User-Agent. Blink-based browsers (Chrome, Edge, Opera) send sec-ch-ua-* client hints and never send a TE header. Gecko browsers (Firefox) send a TE header and never send client hints. WebKit browsers follow a similar pattern. Violations of these engine-specific expectations apply the corresponding headerOptions penalty.
Weird headers: Several header patterns are characteristic of automation tools or misconfigured clients: an Accept: */* wildcard (omitting the structured accept list real browsers send), an X-Requested-With header on a GET request (an AJAX marker that real browsers do not send on normal navigation), Postman-Token or Insomnia headers, a mismatched X-Forwarded-Host, aggressive cache-control on GET requests, a missing referer on cross-site requests, a null or mismatched Origin, and an unexpected sec-fetch-mode.
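To make the scoring concrete, here is a simplified sketch of the must-have and engine-specific passes. It is not the library's implementation: the function names and the crude engine detection are assumptions, only a subset of the must-have headers is checked, and the weird-header group is omitted. The weights are the defaults listed in the Configuration section.

```javascript
// Default weights copied from headerOptions on this page.
const weights = {
  weightPerMustHeader: 20,
  clientHintsMissingForBlink: 30,
  teHeaderUnexpectedForBlink: 10,
  clientHintsUnexpectedForGecko: 30,
  teHeaderMissingForGecko: 20,
};

// Subset of the must-have headers named above (Node.js lowercases header keys).
const MUST_HAVE = ['user-agent', 'accept', 'accept-encoding', 'accept-language', 'host'];

// Very rough engine guess, for illustration only.
function guessEngine(ua) {
  if (/Firefox\//.test(ua)) return 'gecko';
  if (/Chrome\//.test(ua)) return 'blink';
  return 'unknown';
}

// Hypothetical scorer: sums penalties for missing and engine-inconsistent headers.
function scoreHeaders(headers) {
  let score = 0;
  for (const name of MUST_HAVE) {
    if (!(name in headers)) score += weights.weightPerMustHeader;
  }
  const hasClientHints = Object.keys(headers).some((h) => h.startsWith('sec-ch-ua'));
  const hasTe = 'te' in headers;
  const engine = guessEngine(headers['user-agent'] ?? '');
  if (engine === 'blink') {
    if (!hasClientHints) score += weights.clientHintsMissingForBlink;
    if (hasTe) score += weights.teHeaderUnexpectedForBlink;
  } else if (engine === 'gecko') {
    if (hasClientHints) score += weights.clientHintsUnexpectedForGecko;
    if (!hasTe) score += weights.teHeaderMissingForGecko;
  }
  return score;
}
```

A script that claims to be Chrome but sends only Host, User-Agent, and Accept would accumulate two missing-header penalties plus the missing client hints penalty in this sketch.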
Path Traversal Detection
The checker inspects req.path for directory traversal sequences. It decodes percent-encoded variants (%2F, %2E%2E) iteratively, up to pathTraveler.maxIterations times, and then checks the decoded path for ../ patterns. Detected traversal sequences apply traversalDetected. Paths longer than pathTraveler.maxPathLength characters apply the pathLengthToLong penalty, and paths that still need decoding after maxIterations passes apply longDecoding.
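The iterative decoding can be sketched like this. The function names and return shape are hypothetical, and using decodeURIComponent as the decoder is an assumption; the option names and defaults are the pathTraveler values from the Configuration section.

```javascript
// Defaults copied from the pathTraveler block on this page.
const pathTravelerDefaults = {
  maxIterations: 3,
  maxPathLength: 1500,
  pathLengthToLong: 100,
  longDecoding: 100,
  traversalDetected: 60,
};

// Decode once; on a malformed escape sequence, return the input unchanged.
function safeDecode(s) {
  try {
    return decodeURIComponent(s);
  } catch {
    return s;
  }
}

// Hypothetical scorer for a request path.
function scorePath(path, opts = pathTravelerDefaults) {
  let score = 0;
  if (path.length > opts.maxPathLength) score += opts.pathLengthToLong;

  // Decode repeatedly until the path stops changing or the pass limit is hit.
  let decoded = path;
  for (let pass = 0; pass < opts.maxIterations; pass++) {
    const next = safeDecode(decoded);
    if (next === decoded) break;
    decoded = next;
  }

  // Still decodable after the allowed passes: treat as heavily encoded.
  if (safeDecode(decoded) !== decoded) score += opts.longDecoding;
  if (decoded.includes('../')) score += opts.traversalDetected;
  return { score, decoded };
}
```

Bounding the number of passes keeps the work per request predictable; a path that is still encoded after the limit is penalized instead of being decoded further.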
Configuration
await defineConfiguration({
  store: { main: { driver: 'sqlite', name: './bot-detector.db' } },
  checkers: {
    enableUaAndHeaderChecks: {
      enable: true,
      penalties: {
        headlessBrowser: 100,
        shortUserAgent: 80,
        tlsCheckFailed: 60,
        badUaChecker: true, // enables the knownBadUserAgents sub-checker
      },
    },
  },
  // Header scoring weights (all optional, defaults shown)
  headerOptions: {
    weightPerMustHeader: 20,
    missingBrowserEngine: 30,
    postManOrInsomiaHeaders: 50,
    AJAXHeaderExists: 30,
    connectionHeaderIsClose: 20,
    originHeaderIsNULL: 10,
    originHeaderMismatch: 30,
    omittedAcceptHeader: 30,
    clientHintsMissingForBlink: 30,
    teHeaderUnexpectedForBlink: 10,
    clientHintsUnexpectedForGecko: 30,
    teHeaderMissingForGecko: 20,
    aggressiveCacheControlOnGet: 15,
    crossSiteRequestMissingReferer: 10,
    inconsistentSecFetchMode: 20,
    hostMismatchWeight: 40,
  },
  // Path traversal weights (all optional, defaults shown)
  pathTraveler: {
    maxIterations: 3,
    maxPathLength: 1500,
    pathLengthToLong: 100,
    longDecoding: 100,
    traversalDetected: 60,
  },
})
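checkers.enableUaAndHeaderChecks.penalties
headlessBrowser: Penalty applied when the User-Agent contains a headless automation keyword. Default: 100.
shortUserAgent: Penalty applied when the User-Agent is shorter than 10 characters. Default: 80.
tlsCheckFailed: Penalty applied when the cipher or protocol check fails. Default: 60.
badUaChecker: When true, enables the Known Bad User-Agents sub-checker, which runs the LMDB pattern database against the User-Agent. Default: true.
headerOptions
These weights are configured at the top level of defineConfiguration, not inside checkers.
weightPerMustHeader: Penalty applied per missing must-have header. Default: 20.
missingBrowserEngine: Default: 30.
postManOrInsomiaHeaders: Penalty applied when Postman-Token or Insomnia headers are present. Default: 50.
AJAXHeaderExists: Penalty applied when a GET request carries X-Requested-With (AJAX marker). Default: 30.
connectionHeaderIsClose: Penalty applied when the request sends Connection: close instead of keep-alive. Default: 20.
originHeaderIsNULL: Penalty applied when the Origin header is null on a non-navigational same-origin request. Default: 10.
originHeaderMismatch: Penalty applied when Origin does not match the server's protocol and hostname. Default: 30.
omittedAcceptHeader: Penalty applied when Accept: */* is sent instead of a proper browser accept list. Default: 30.
clientHintsMissingForBlink: Penalty applied when a Blink browser omits the sec-ch-ua-* headers. Default: 30.
teHeaderUnexpectedForBlink: Penalty applied when a Blink browser sends a TE header (a Gecko-only header). Default: 10.
clientHintsUnexpectedForGecko: Penalty applied when a Gecko browser sends sec-ch-ua-* client hints. Default: 30.
teHeaderMissingForGecko: Penalty applied when a Gecko browser omits the TE header. Default: 20.
aggressiveCacheControlOnGet: Penalty applied when a GET request sends both Cache-Control: no-cache and Pragma: no-cache. Default: 15.
crossSiteRequestMissingReferer: Penalty applied when a cross-site request (Sec-Fetch-Site: cross-site) arrives without a Referer header. Default: 10.
inconsistentSecFetchMode: Penalty applied when Sec-Fetch-Mode is not same-origin or navigate. Default: 20.
hostMismatchWeight: Penalty applied when X-Forwarded-Host does not match req.hostname. Default: 40.
pathTraveler
maxIterations: Maximum number of decode passes applied to the path. Default: 3.
maxPathLength: Maximum path length, in characters, before applying pathLengthToLong. Default: 1500.
pathLengthToLong: Penalty applied when the path exceeds maxPathLength. Default: 100.
longDecoding: Penalty applied when the path requires more than maxIterations decode passes to resolve (heavily encoded traversal attempts). Default: 100.
traversalDetected: Penalty applied when ../ sequences are found in the decoded path. Default: 60.
Reason Codes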
| Code | Trigger |
|---|---|
| HEADLESS_BROWSER_DETECTED | User-Agent contains headless automation keywords. |
| SHORT_USER_AGENT | User-Agent is fewer than 10 characters. |
| TLS_CHECK_FAILED | TLS fingerprint does not match the declared browser identity. |
| HEADER_SCORE_TOO_HIGH | Header analysis accumulated a non-zero score from must-have, engine-specific, or weird header checks. |
| PATH_TRAVELER_FOUND | Path traversal sequences, excessive length, or over-encoded paths detected. |