XSS Protection
The IAM service provides a layered XSS defense pipeline that runs on every user-supplied string. The pipeline consists of three components that work together: sanitizeInput performs deep multi-pass HTML stripping and entity encoding, makeSanitizedZodString integrates that sanitizer into Zod schemas so validation and sanitization happen in a single step, and validateZodSchema orchestrates the whole flow while automatically banning the client's IP when an XSS payload is detected.
All three utilities are exported from @riavzon/auth for use in your own route handlers. The built-in authentication controllers (signup, login, MFA verification, email update, password reset, OAuth) already use the full pipeline on every field that accepts user input.
Sanitization pipeline
sanitizeInput (exported as the default from htmlSanitizer) is the core sanitization function. It accepts a raw string and returns a cleaned string along with a detection report indicating whether any HTML was found during processing.
import sanitizeInput from '@riavzon/auth'
const { vall, results } = sanitizeInput(userInput)
if (results.htmlFound) {
  // HTML or script injection was detected and stripped
  console.log('Detected tags:', results.tags)
}
// vall is the fully sanitized string, safe for storage or rendering
The function performs 8 sequential stages. Each stage builds on the previous one, and an attacker must bypass all of them for a payload to survive.
Length guard
Before any processing, the sanitizer rejects input longer than htmlSanitizer.maxAllowedInputLength (default 50000). Oversized input throws rather than entering the loop. This is the hard cap on CPU cost for a single call.
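A minimal sketch of such a guard follows. The constant name, the helper name, and the choice of RangeError are assumptions made for illustration; the text above only states that oversized input throws.

```typescript
// Hypothetical stand-in for htmlSanitizer.maxAllowedInputLength (default 50000 per the text above).
const MAX_ALLOWED_INPUT_LENGTH = 50000

// Hypothetical helper: returns the input unchanged, or throws before any decoding work starts.
function assertWithinLengthLimit(input: string, limit: number = MAX_ALLOWED_INPUT_LENGTH): string {
  if (input.length > limit) {
    // The error type is an assumption; the library may throw something else.
    throw new RangeError(`Input length ${input.length} exceeds limit ${limit}`)
  }
  return input
}
```

Because the check runs first, the cost of every later stage is bounded by the configured limit.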
Unicode normalization
The input is normalized to NFKC, which collapses visually similar characters to their canonical form. Zero-width characters, soft hyphens, byte-order marks, and bidirectional override characters are stripped in the same pass. Halfwidth and fullwidth ASCII characters (U+FF01 through U+FF5E) are transliterated back to standard ASCII. This defeats payloads that hide tags inside fullwidth substitutions such as \uFF1C for <.
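The normalization stage can be approximated with built-in string methods. This is a sketch, not the library's code: the function name and the exact set of invisible characters are assumptions.

```typescript
// Soft hyphen, zero-width characters, directional marks, bidirectional
// overrides/isolates, word joiner, and byte-order mark.
// The exact character set the library strips is an assumption.
const INVISIBLE_CHARS = /[\u00AD\u200B-\u200F\u202A-\u202E\u2060\u2066-\u2069\uFEFF]/g

// Hypothetical helper mirroring the normalization stage described above.
function normalizeForScanning(input: string): string {
  // NFKC folds compatibility characters, including fullwidth forms, to their canonical form.
  const normalized = input.normalize('NFKC')
  const visibleOnly = normalized.replace(INVISIBLE_CHARS, '')
  // Defensive fold of any remaining fullwidth ASCII (U+FF01..U+FF5E) to U+0021..U+007E.
  return visibleOnly.replace(/[\uFF01-\uFF5E]/g, (ch) =>
    String.fromCharCode(ch.charCodeAt(0) - 0xfee0),
  )
}
```

With this sketch, a payload written as \uFF1Cscr\u200Bipt\uFF1E comes out as the plain string <script>, which the later stages can detect.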
Strict URI decode
decodeURIComponent is called once inside a try/catch. If the call throws (malformed percent-encoding like %ZZ), the input is rejected immediately and returned as an empty string with htmlFound: true. Legitimate input does not contain malformed URI sequences, so rejecting early keeps broken data out of the loop.
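A sketch of the strict decode step is shown here. The result shape and function name are hypothetical; only the try/catch around decodeURIComponent and the rejection behavior come from the description above.

```typescript
// Hypothetical result shape for illustration.
interface StrictDecodeResult {
  value: string
  rejected: boolean
}

function strictUriDecode(input: string): StrictDecodeResult {
  try {
    return { value: decodeURIComponent(input), rejected: false }
  } catch {
    // decodeURIComponent throws URIError on malformed sequences such as %ZZ.
    // The caller would report htmlFound: true for a rejected input.
    return { value: '', rejected: true }
  }
}
```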
Iterative URI and entity decoding
The function enters a decode loop that alternates between decodeURIComponent and he.decode (he is an HTML entity decoder). Each iteration decodes one layer of encoding. The loop continues until the output stabilizes (no change between iterations) or until the IrritationCount limit is reached.
This catches payloads that rely on nested encoding: %253Cscript%253E decodes to %3Cscript%3E on the first pass, then to <script> on the second. Without the loop, a single-pass decoder would leave the inner encoding intact.
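The loop can be sketched without external dependencies. In this illustration, decodeEntitiesOnce is a deliberately tiny stand-in for he.decode that handles only a few entities, and decodeUntilStable with its maxIterations parameter is a hypothetical name for the loop and its iteration cap.

```typescript
const NAMED_ENTITIES: Record<string, string> = { lt: '<', gt: '>', amp: '&', quot: '"', apos: "'" }

// Minimal stand-in for he.decode (assumption): decodes exactly one layer of a few entities.
function decodeEntitiesOnce(input: string): string {
  return input.replace(/&(#x[0-9a-f]+|#\d+|lt|gt|amp|quot|apos);/gi, (match: string, body: string) => {
    const lower = body.toLowerCase()
    if (lower.startsWith('#')) {
      const code = lower.startsWith('#x') ? parseInt(lower.slice(2), 16) : parseInt(lower.slice(1), 10)
      // Leave out-of-range code points untouched instead of throwing.
      return code <= 0x10ffff ? String.fromCodePoint(code) : match
    }
    return NAMED_ENTITIES[lower] ?? match
  })
}

// Decode one URI layer and one entity layer per iteration until nothing changes.
function decodeUntilStable(input: string, maxIterations: number): { value: string; rejected: boolean } {
  let current = input
  for (let i = 0; i < maxIterations; i++) {
    let next: string
    try {
      next = decodeEntitiesOnce(decodeURIComponent(current))
    } catch {
      return { value: '', rejected: true } // malformed percent-encoding
    }
    if (next === current) return { value: current, rejected: false } // output stabilized
    current = next
  }
  // The cap was reached before the output stabilized.
  return { value: '', rejected: true }
}
```

Note that the loop needs one extra iteration to confirm stability, so a doubly encoded payload takes three iterations to be accepted as decoded.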
If the loop runs for IrritationCount iterations without stabilizing, the input is rejected entirely and returned as an empty string with htmlFound: true. This protects against intentionally crafted inputs designed to consume CPU through deep encoding chains.
Residual cleanup
After the loop, zero-width characters are stripped again (the decoders may have reintroduced them) and any whitespace inside the bodies of surviving tag-like substrings is removed so that <scr\tipt> cannot slip past the tag regex.
Pattern detection
After decoding, the function tests the cleaned string against three patterns:
| Pattern | Catches |
|---|---|
| /<\s*\/?\s*[A-Za-z][A-Za-z0-9-]*(?:\s+[^>]*?)?\s*>/i | Any HTML tag |
| /on\w+\s*=/i | Inline event handlers (onclick=, onerror=) |
| /javascript\s*:/i | JavaScript protocol URIs |
If any pattern matches, htmlFound is set to true in the results. This flag is used downstream by validateZodSchema to trigger IP banning.
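The detection step amounts to testing the decoded string against each expression. The sketch below uses the three patterns from the table; the constant and function names are hypothetical.

```typescript
// The three patterns from the table above.
const TAG_PATTERN = /<\s*\/?\s*[A-Za-z][A-Za-z0-9-]*(?:\s+[^>]*?)?\s*>/i
const EVENT_HANDLER_PATTERN = /on\w+\s*=/i
const JS_PROTOCOL_PATTERN = /javascript\s*:/i

// Hypothetical helper: true when any pattern matches the decoded input.
function looksLikeHtml(decoded: string): boolean {
  return [TAG_PATTERN, EVENT_HANDLER_PATTERN, JS_PROTOCOL_PATTERN].some((pattern) => pattern.test(decoded))
}
```

None of the patterns uses the g flag, so test() carries no state between calls and the helper is safe to reuse.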
sanitize-html pass
The string is passed through sanitize-html with a strict configuration:
allowedTags: [],
allowedAttributes: {},
allowedIframeHostnames: [],
allowedSchemes: [],
allowProtocolRelative: false,
nestingLimit: 10,
nonTextTags: ['script', 'style', 'noscript', 'iframe', 'svg'],
The textFilter re-runs the tag regex on text nodes, and onOpenTag records any tag name and attribute set that the sanitizer had to strip. If the string shrinks during this pass, htmlFound is set to true even if pattern detection did not trigger.
Entity encoding
The final output is entity-encoded: &, <, >, ", ', backtick, and ${ are replaced with their entity or escaped equivalents. The backtick and template-literal escapes prevent injection into JavaScript template strings. The result is trimmed.
| Character | Replacement |
|---|---|
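| & | &amp; |
| < | &lt; |
| > | &gt; |
| " | &quot; |
| ' | HTML entity for the apostrophe (for example &#39;) |
| ` | HTML entity for the backtick (for example &#96;) |
| ${ | \${ |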
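A single-pass encoder along these lines is sketched below. The function name is hypothetical, and the numeric entities chosen for the apostrophe and backtick are assumptions; the library may emit different but equivalent entities.

```typescript
const OUTPUT_REPLACEMENTS: Record<string, string> = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;', // assumed numeric form
  '`': '&#96;', // assumed numeric form
  '${': '\\${', // escapes template-literal interpolation
}

// Hypothetical helper mirroring the final encoding stage.
function encodeForOutput(input: string): string {
  // One pass over the input, so the & inside an emitted entity is never re-encoded.
  return input.replace(/\$\{|[&<>"'`]/g, (match) => OUTPUT_REPLACEMENTS[match] ?? match).trim()
}
```

Using a replacer function rather than a replacement string also avoids the special meaning of $ in String.prototype.replace patterns.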
Zod integration
makeSanitizedZodString creates a Zod string schema that validates length and optional regex constraints, then runs the full sanitization pipeline as a Zod transform. The returned value is always the sanitized output, and any HTML detection is reported as a Zod issue with an 'HTML found' message prefix.
import { makeSanitizedZodString } from '@riavzon/auth'
import { z } from 'zod'
const commentSchema = z.object({
  text: makeSanitizedZodString({ min: 1, max: 1000 }),
  name: makeSanitizedZodString({
    min: 2,
    max: 50,
    pattern: /^[A-Za-z\s]+$/,
    patternMsg: 'Name must contain only letters and spaces',
  }),
})
Parameters
How it works
The schema chains three operations:
- Length and pattern validation using the standard Zod .min(), .max(), and .regex() validators
- HTML detection check via .check() that calls sanitizeInput and pushes a custom Zod issue if htmlFound is true
- Sanitization transform via .transform() that calls sanitizeInput again and returns only the cleaned vall string
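The order of the three operations can be illustrated without Zod. The following dependency-free sketch is not the library's implementation: parseSanitizedString, the result shape, the default messages, and stubSanitize (a crude stand-in for sanitizeInput) are all hypothetical.

```typescript
interface Issue { message: string }
type ParseResult = { success: true; data: string } | { success: false; issues: Issue[] }
interface SanitizedStringOptions { min: number; max: number; pattern?: RegExp; patternMsg?: string }

// Stub sanitizer (assumption): the real sanitizeInput runs the full multi-stage pipeline.
function stubSanitize(input: string): { vall: string; results: { htmlFound: boolean } } {
  const htmlFound = /<\s*\/?\s*[A-Za-z][^>]*>/.test(input)
  return { vall: input.replace(/<\s*\/?\s*[A-Za-z][^>]*>/g, '').trim(), results: { htmlFound } }
}

function parseSanitizedString(value: string, opts: SanitizedStringOptions): ParseResult {
  const issues: Issue[] = []
  // 1. Length and pattern validation.
  if (value.length < opts.min) issues.push({ message: `Must be at least ${opts.min} characters` })
  if (value.length > opts.max) issues.push({ message: `Must be at most ${opts.max} characters` })
  if (opts.pattern && !opts.pattern.test(value)) issues.push({ message: opts.patternMsg ?? 'Invalid format' })
  // 2. HTML detection check: report an issue with the 'HTML found' prefix.
  if (stubSanitize(value).results.htmlFound) issues.push({ message: 'HTML found in input' })
  if (issues.length > 0) return { success: false, issues }
  // 3. Sanitization transform: return only the cleaned string.
  return { success: true, data: stubSanitize(value).vall }
}
```

The sanitizer runs twice, once to detect and once to transform, which matches the chain described above at the cost of duplicate work per field.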
The built-in controllers use makeSanitizedZodString for all user-supplied string fields. If you add custom routes, use it for consistency.
The built-in schemas that use makeSanitizedZodString include:
| Schema | Fields |
|---|---|
| Signup | name, email, password |
| Login | email, password |
| Email update | email, newEmail, password |
| MFA code | code |
| Password reset | random, reason |
| Custom MFA | random, reason |
Validation with XSS enforcement
validateZodSchema ties the full pipeline together. It parses input against a Zod schema, and when any Zod issue starts with 'HTML found' (produced by makeSanitizedZodString), it calls handleXSS to ban the client immediately.
import { validateZodSchema } from '@riavzon/auth'
const result = await validateZodSchema(commentSchema, req.body, req, log)
if ('valid' in result && result.valid === false) {
  // Validation failed (could be XSS ban or normal validation error)
  res.status(result.errors === 'XSS attempt' ? 403 : 400).json({ errors: result.errors })
  return
}
if (!result.success) {
  // Standard Zod validation error
  res.status(422).json(result.error.format())
  return
}
// result.data is fully sanitized and validated
const { text, name } = result.data
Validation flow
Zod parsing
The schema is parsed with safeParse(). If parsing succeeds, the validated and transformed data is returned immediately.
HTML issue scan
If parsing fails, the function scans the Zod error issues array for any issue whose message starts with 'HTML found'. This marker is set by makeSanitizedZodString when the sanitizer detects HTML content.
XSS punishment
When an HTML issue is found, handleXSS is called with the Express request object. The function bans the client's IP using the Bot Detector service with the configured banScore (defaults to 100) and the reason 'XSS SCRIPTING ATTEMPT'. It also marks the visitor's canary_id as a bot and updates the banned IP record.
The function returns { valid: false, errors: 'XSS attempt' } and the controller responds with HTTP 403.
Normal validation errors
If no HTML issues are found, the function collects all Zod issues into a key-value map (field name to error message) and returns { valid: false, errors: { ... } }.
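The branching between the XSS path and the normal error path can be sketched as a pure function over the issue list. The SchemaIssue type, the classifyIssues name, and the choice to keep the first message per field are assumptions; the real function also calls handleXSS before returning on the XSS path.

```typescript
// Simplified issue shape (assumption): a field path and a message.
interface SchemaIssue {
  path: Array<string | number>
  message: string
}

interface ValidationFailure {
  valid: false
  errors: string | Record<string, string>
}

function classifyIssues(issues: SchemaIssue[]): ValidationFailure {
  // Any issue carrying the 'HTML found' prefix turns the whole failure into an XSS attempt.
  if (issues.some((issue) => issue.message.startsWith('HTML found'))) {
    return { valid: false, errors: 'XSS attempt' }
  }
  // Otherwise collapse the issues into a field-to-message map.
  const errors: Record<string, string> = {}
  for (const issue of issues) {
    const field = issue.path.join('.') || '(root)'
    // Keeping the first message per field is an assumption.
    if (!(field in errors)) errors[field] = issue.message
  }
  return { valid: false, errors }
}
```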
handleXSS
handleXSS is the enforcement function called when an XSS attempt is confirmed. It reads the banScore from botDetector.settings in the configuration (defaulting to 100 if not set) and executes three actions:
import { handleXSS } from '@riavzon/auth'
// Called automatically by validateZodSchema — you rarely need to call this directly
await handleXSS(req, '<script>alert(1)</script>', log)
IP ban
banIp is called first with the client IP and the configured banScore. This adds the IP to the Bot Detector ban list with the reason 'XSS SCRIPTING ATTEMPT'. This step runs before the remaining actions to ensure the IP is blocked as quickly as possible.
Visitor record and bot flag
Two actions run concurrently via Promise.all:
| Action | Function | Effect |
|---|---|---|
| Visitor record | updateBannedIP(canary_id, ip, ua, { score: 10, reasons: [...] }) | Updates the banned IP record with the visitor's cookie, user agent, and ban reason |
| Bot flag | updateIsBot(true, canary_id) | Marks the visitor's canary_id as a confirmed bot in the visitors table |
The function logs a warning before and after the ban.
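The ordering described above, ban first and then two concurrent updates, can be sketched with injected dependencies. Everything here is illustrative: the XssDeps and XssContext interfaces, the punishXss name, and the banIp parameter list are assumptions, and the score passed to updateBannedIP simply mirrors the value shown in the table.

```typescript
// Hypothetical dependency shapes; the real Bot Detector signatures may differ.
interface XssDeps {
  banIp(ip: string, score: number, reason: string): Promise<void>
  updateBannedIP(canaryId: string, ip: string, ua: string, info: { score: number; reasons: string[] }): Promise<void>
  updateIsBot(isBot: boolean, canaryId: string): Promise<void>
}

interface XssContext {
  ip: string
  userAgent: string
  canaryId: string
  banScore?: number
}

async function punishXss(deps: XssDeps, ctx: XssContext): Promise<void> {
  const banScore = ctx.banScore ?? 100 // default from the text above
  const reason = 'XSS SCRIPTING ATTEMPT'
  // Step 1: block the IP before doing anything else.
  await deps.banIp(ctx.ip, banScore, reason)
  // Step 2: the two record updates are independent, so they run concurrently.
  await Promise.all([
    deps.updateBannedIP(ctx.canaryId, ctx.ip, ctx.userAgent, { score: 10, reasons: [reason] }),
    deps.updateIsBot(true, ctx.canaryId),
  ])
}
```

Awaiting the ban before starting the other writes means a failure in the bookkeeping steps cannot delay or prevent the block itself.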
Timing attack prevention
timeEnumeration (named waitSomeTime internally) adds a fixed delay to responses where timing differences could leak information. Authentication endpoints use it to ensure that responses for valid and invalid inputs take the same amount of time.
import { timeEnumeration } from '@riavzon/auth'
const start = Date.now()
// ... process the request (may return early if user not found)
const elapsed = Date.now() - start
const minimumResponseTime = 3000 // 3 seconds
if (elapsed < minimumResponseTime) {
  await timeEnumeration(minimumResponseTime - elapsed, log)
}
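The same padding logic can be wrapped so that early returns and thrown errors are padded too. This is a sketch under stated assumptions: remainingDelay and withMinimumDuration are hypothetical helpers, and a plain setTimeout stands in for timeEnumeration.

```typescript
// Pure helper: how much longer to wait so the total duration reaches minimumMs.
function remainingDelay(startedAt: number, now: number, minimumMs: number): number {
  return Math.max(0, minimumMs - (now - startedAt))
}

// Hypothetical wrapper: runs work() and pads the total duration to at least minimumMs.
async function withMinimumDuration<T>(minimumMs: number, work: () => Promise<T>): Promise<T> {
  const startedAt = Date.now()
  try {
    return await work()
  } finally {
    // Runs on success and on failure, so both paths take the same minimum time.
    const delay = remainingDelay(startedAt, Date.now(), minimumMs)
    if (delay > 0) {
      // Stand-in for timeEnumeration(delay, log).
      await new Promise<void>((resolve) => setTimeout(resolve, delay))
    }
  }
}
```

A handler would then call withMinimumDuration(3000, () => processRequest(req)), where processRequest is whatever the endpoint already does.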
The following controllers enforce a minimum 3-second response time:
| Controller | Route | Why |
|---|---|---|
| initPasswordReset | POST /auth/forgot-password | Prevents enumerating which email addresses have accounts |
| initCustomMfaFlow | POST /custom/mfa/:reason | Prevents timing analysis of custom MFA initiation |
Use timeEnumeration on any endpoint that reveals the presence or absence of a user account through its response time. A password reset endpoint that returns instantly for unknown emails and slowly for known ones leaks account existence.
Applying XSS protection in custom handlers
The full pipeline works in four steps: define a schema with makeSanitizedZodString, validate with validateZodSchema, check the result, and use the sanitized data.
import { validateZodSchema, makeSanitizedZodString } from '@riavzon/auth'
import { z } from 'zod'
const profileSchema = z.object({
displayName: makeSanitizedZodString({ min: 1, max: 100 }),
bio: makeSanitizedZodString({ min: 0, max: 500 }),
})
// router, getLogger, db, and userId are provided by your application
router.post('/profile', async (req, res) => {
  const log = getLogger().child({ route: '/profile' })
  const result = await validateZodSchema(profileSchema, req.body, req, log)
  if ('valid' in result && !result.valid) {
    if (result.errors === 'XSS attempt') {
      return res.status(403).json({ banned: true })
    }
    return res.status(400).json({ errors: result.errors })
  }
  if (!result.success) {
    return res.status(422).json(result.error.format())
  }
  // result.data.displayName and result.data.bio are fully sanitized
  await db.updateProfile(userId, result.data)
  res.json({ ok: true })
})
Configuration reference
Setting IrritationCount to a very high value on a public endpoint creates a CPU exhaustion vector. Pair it with maxAllowedInputLength and a request body size limit (e.g. express.json({ limit: '2kb' })) to bound worst-case processing time.
Summary
The XSS protection pipeline integrates with several other IAM subsystems:
| System | Integration point |
|---|---|
| Anomaly Detection | Banned IPs from XSS attempts raise the suspicious_activity_score, which triggers MFA challenges or hard blocks when the score exceeds 25% of the ban threshold |
| Bot Detector | handleXSS calls banIp, updateBannedIP, and updateIsBot to record the threat in the bot detection database |
| Fingerprinting | The canary_id cookie ties the XSS ban to a specific device, so the ban persists even if the client's IP changes |
| Signup | New account creation validates name, email, and password through makeSanitizedZodString |
| MFA | OTP code submission validates the code field through the same pipeline |