Getting Started
@riavzon/bot-detector works as an Express middleware. Getting started requires installing the package, downloading the threat intelligence data sources, configuring the module, and creating the database schema.
Requirements
Before you begin, ensure your environment meets the following requirements:
- Node.js 18 or later
- Express 5
- A supported database: SQLite, MySQL, PostgreSQL, Cloudflare D1, or PlanetScale for visitor persistence
- mmdbctl: The CLI downloads and compiles data sources into MMDB format. The init command detects and installs it automatically if it is not found.
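The Node.js requirement above can be verified at startup rather than discovered through a runtime failure. The following is a minimal sketch: `meetsNodeRequirement` is a hypothetical helper, not part of @riavzon/bot-detector, and it only compares the major version.

```typescript
// Hypothetical helper (not part of @riavzon/bot-detector):
// returns true when a Node.js version string meets the required major version.
function meetsNodeRequirement(version: string, minMajor: number = 18): boolean {
  // Accept both "v20.11.0" and "20.11.0"; parseInt stops at the first dot.
  const major = Number.parseInt(version.replace(/^v/, ''), 10)
  return Number.isInteger(major) && major >= minMajor
}
```

In a startup script you could pass `process.versions.node` and exit early when the check returns false.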
The punishmentType.enableFireWallBan option requires a Linux environment with ufw available and passwordless sudo access for the Node.js process. All other features run on any platform that supports Node.js 18+.
Quick Setup
The fastest way to get started is with the create package. Run this command in the root of your Express project:
npx @riavzon/bot-detector-create
pnpm dlx @riavzon/bot-detector-create
yarn dlx @riavzon/bot-detector-create
bunx @riavzon/bot-detector-create
This command installs all required dependencies, downloads and compiles every threat intelligence feed, generates a fully annotated botDetectorConfig.ts with all 17 checkers at their defaults, creates a ready-to-run Express entry point, and initializes the database tables. It defaults to better-sqlite3 as the database driver.
The create package is the recommended path. Skip to Next Steps after running it.
Manual Setup
If you prefer to wire things up yourself, follow the steps below.
Install the package
Install @riavzon/bot-detector along with Express, cookie-parser, and your chosen database driver.
npm install @riavzon/bot-detector express cookie-parser better-sqlite3
pnpm add @riavzon/bot-detector express cookie-parser better-sqlite3
yarn add @riavzon/bot-detector express cookie-parser better-sqlite3
bun add @riavzon/bot-detector express cookie-parser better-sqlite3
Replace better-sqlite3 with your preferred driver. See Database Drivers for the full list and their peer dependencies.
Initialize data sources
Run the init command to download and compile the threat intelligence databases. This step is required before the middleware can run.
npx @riavzon/bot-detector init
pnpm dlx @riavzon/bot-detector init
yarn dlx @riavzon/bot-detector init
bunx @riavzon/bot-detector init
The command verifies that mmdbctl is installed (and installs it if not found), prompts for a BGP.tools contact string required to download BGP data, then compiles all data sources in parallel. The compiled databases are written to _data-sources/ inside the package directory.
To skip the interactive prompt in CI or automated environments, pass your contact string as a flag:
npx @riavzon/bot-detector init --contact="MyApp - <your-email>"
pnpm dlx @riavzon/bot-detector init --contact="MyApp - <your-email>"
yarn dlx @riavzon/bot-detector init --contact="MyApp - <your-email>"
bunx @riavzon/bot-detector init --contact="MyApp - <your-email>"
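In a CI pipeline, the non-interactive invocation above can be assembled from a secret or environment variable. This is a sketch under assumptions: `buildInitArgs` and the `BGP_CONTACT` variable name are hypothetical, and only the `init` command and the `--contact` flag come from this guide.

```typescript
// Hypothetical helper: builds the argument list for `npx` so the init step
// runs without the interactive prompt. BGP_CONTACT is an assumed variable name.
function buildInitArgs(env: Record<string, string | undefined>): string[] {
  const contact = env['BGP_CONTACT']?.trim()
  if (!contact) {
    // Fail fast instead of letting the CLI block on a prompt in CI.
    throw new Error('Set BGP_CONTACT before running init in CI')
  }
  return ['@riavzon/bot-detector', 'init', `--contact=${contact}`]
}
```

Passing the result as an argument array, for example `spawn('npx', buildInitArgs(process.env), { stdio: 'inherit' })` from `node:child_process`, avoids shell quoting problems with the spaces in the contact string.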
Configure the middleware
Create a startup file and call defineConfiguration before attaching any middleware to your app. The only required field is store.main, which sets the database driver for visitor persistence.
import express from 'express'
import cookieParser from 'cookie-parser'
import { defineConfiguration, detectBots } from '@riavzon/bot-detector'
const app = express()
app.use(cookieParser())
app.use(express.json())
await defineConfiguration({
  store: {
    main: { driver: 'sqlite', name: './bot-detector.db' },
  },
})
app.use(detectBots())
app.get('/', (req, res) => {
  res.json({ ok: true, banned: req.botDetection?.banned })
})
app.listen(3000)
defineConfiguration is async and must resolve before you call detectBots() or any getter function. Call it exactly once during startup.
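One way to enforce the call-once rule is to memoize the startup promise so that every caller shares a single run. The sketch below is generic: `once` is a hypothetical helper, and `fakeConfigure` stands in for a call to defineConfiguration so the example runs without the package installed.

```typescript
// Hypothetical helper: wraps an async initializer so it runs at most once.
// Every caller receives the same promise, so late callers still wait for completion.
function once<T>(init: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined
  return () => {
    if (pending === undefined) pending = init()
    return pending
  }
}

// Stand-in for `defineConfiguration({...})`; counts how often it actually runs.
let configureCalls = 0
async function fakeConfigure(): Promise<void> {
  configureCalls += 1
}

const ensureConfigured = once(fakeConfigure)
```

In an application you would wrap `() => defineConfiguration({ ... })` instead and `await ensureConfigured()` before `app.use(detectBots())`. A failed first attempt stays cached, which suits a fail-fast startup.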
Create the database tables
Run load-schema once after configuration to create the required tables in your database:
npx @riavzon/bot-detector load-schema --db sqlite --db-name=./bot-detector.db
pnpm dlx @riavzon/bot-detector load-schema --db=sqlite --db-name=./bot-detector.db
yarn dlx @riavzon/bot-detector load-schema --db=sqlite --db-name=./bot-detector.db
bunx @riavzon/bot-detector load-schema --db=sqlite --db-name=./bot-detector.db
npx @riavzon/bot-detector load-schema --db mysql-pool --db-host=localhost --db-user=root --db-password=secret --db-name=botdb
pnpm dlx @riavzon/bot-detector load-schema --db=mysql-pool --db-host=localhost --db-user=root --db-password=secret --db-name=botdb
yarn dlx @riavzon/bot-detector load-schema --db=mysql-pool --db-host=localhost --db-user=root --db-password=secret --db-name=botdb
bunx @riavzon/bot-detector load-schema --db=mysql-pool --db-host=localhost --db-user=root --db-password=secret --db-name=botdb
npx @riavzon/bot-detector load-schema --db postgresql --db-host=localhost --db-user=postgres --db-password=secret --db-name=botdb
pnpm dlx @riavzon/bot-detector load-schema --db=postgresql --db-host=localhost --db-user=postgres --db-password=secret --db-name=botdb
yarn dlx @riavzon/bot-detector load-schema --db=postgresql --db-host=localhost --db-user=postgres --db-password=secret --db-name=botdb
bunx @riavzon/bot-detector load-schema --db=postgresql --db-host=localhost --db-user=postgres --db-password=secret --db-name=botdb
npx @riavzon/bot-detector load-schema --db cloudflare-d1 --db-name=my-d1-binding
pnpm dlx @riavzon/bot-detector load-schema --db=cloudflare-d1 --db-name=my-d1-binding
yarn dlx @riavzon/bot-detector load-schema --db=cloudflare-d1 --db-name=my-d1-binding
bunx @riavzon/bot-detector load-schema --db=cloudflare-d1 --db-name=my-d1-binding
npx @riavzon/bot-detector load-schema --db planetscale --db-url="mysql://<user>:<pass>@<host>/<db>"
pnpm dlx @riavzon/bot-detector load-schema --db=planetscale --db-url="mysql://<user>:<pass>@<host>/<db>"
yarn dlx @riavzon/bot-detector load-schema --db=planetscale --db-url="mysql://<user>:<pass>@<host>/<db>"
bunx @riavzon/bot-detector load-schema --db=planetscale --db-url="mysql://<user>:<pass>@<host>/<db>"
You can also create the tables programmatically with createTables(db). Pass getDb() to supply the initialized database instance, and call createTables only after defineConfiguration resolves:
import { defineConfiguration, createTables, getDb } from '@riavzon/bot-detector';
await defineConfiguration({
  store: {
    main: { driver: 'sqlite', name: './bot-detector.db' },
  },
});
// createTables expects a Database instance, pass getDb()
await createTables(getDb());
Run load-schema before starting the server for the first time. It is safe to skip on subsequent restarts.
Schedule data source refreshes
Threat intelligence data degrades over time. Run refresh at least once every 24 hours to redownload and recompile the latest feeds:
npx @riavzon/bot-detector refresh
pnpm dlx @riavzon/bot-detector refresh
yarn dlx @riavzon/bot-detector refresh
bunx @riavzon/bot-detector refresh
Add this command to a cron job or a CI/CD pipeline to keep the data current.
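Where cron is not available, a long-running worker can decide for itself when a refresh is due. A minimal sketch: `isRefreshDue` is a hypothetical helper, and the 24-hour default mirrors the guidance above. How the time of the last refresh is recorded is left to the application.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000

// Hypothetical helper: true when the last refresh is unknown or older than maxAgeMs.
function isRefreshDue(
  lastRefreshMs: number | undefined,
  nowMs: number,
  maxAgeMs: number = DAY_MS,
): boolean {
  if (lastRefreshMs === undefined) return true // never refreshed
  return nowMs - lastRefreshMs >= maxAgeMs
}
```

When it returns true, run the refresh command shown above and store `Date.now()` as the new timestamp.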
Database Drivers
The store.main field accepts the following drivers. Each requires the corresponding peer dependency to be installed separately.
| Driver | Value | Peer Dependency |
|---|---|---|
| SQLite | sqlite | better-sqlite3 |
| MySQL pool | mysql-pool | mysql2 >= 3 |
| PostgreSQL | postgresql | pg |
| Cloudflare D1 | cloudflare-d1 | Worker environment binding |
| PlanetScale | planetscale | Serverless driver |
// SQLite
{ driver: 'sqlite', name: './bot-detector.db' }
// MySQL pool
{ driver: 'mysql-pool', host: 'localhost', user: 'root', password: 'secret', database: 'mydb' }
// PostgreSQL
{ driver: 'postgresql', connectionString: 'postgres://user:pass@localhost/mydb' }
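The three shapes shown above can be selected at startup from environment-style input. This is a sketch, not the package's API: the `StoreMain` type is a local illustration, `storeFromEnv` and `required` are hypothetical helpers, and every variable name (`DB_DRIVER`, `DB_FILE`, `DATABASE_URL`, and so on) is an assumption. Cloudflare D1 and PlanetScale are left out because their option shapes are not shown here.

```typescript
// Local illustration of the three documented shapes; not the package's exported types.
type StoreMain =
  | { driver: 'sqlite'; name: string }
  | { driver: 'mysql-pool'; host: string; user: string; password: string; database: string }
  | { driver: 'postgresql'; connectionString: string }

type Env = Record<string, string | undefined>

// Fails fast when a needed variable is missing or empty.
function required(env: Env, key: string): string {
  const value = env[key]
  if (!value) throw new Error(`Missing environment variable ${key}`)
  return value
}

// Hypothetical helper: maps assumed variable names to a store.main object.
function storeFromEnv(env: Env): StoreMain {
  const driver = env['DB_DRIVER'] ?? 'sqlite'
  switch (driver) {
    case 'sqlite':
      return { driver: 'sqlite', name: env['DB_FILE'] ?? './bot-detector.db' }
    case 'mysql-pool':
      return {
        driver: 'mysql-pool',
        host: env['DB_HOST'] ?? 'localhost',
        user: required(env, 'DB_USER'),
        password: required(env, 'DB_PASSWORD'),
        database: required(env, 'DB_NAME'),
      }
    case 'postgresql':
      return { driver: 'postgresql', connectionString: required(env, 'DATABASE_URL') }
    default:
      throw new Error(`Unsupported DB_DRIVER: ${driver}`)
  }
}
```

The returned object would be passed as `store.main` to defineConfiguration, with `process.env` as the input.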
Cache Drivers
By default, visitor state and behavioral data are stored in memory. This works for single-process deployments. For multi-process or distributed deployments, configure a shared storage driver so all instances share the same visitor cache.
await defineConfiguration({
  store: { main: { driver: 'sqlite', name: './bot-detector.db' } },
  storage: { driver: 'redis', host: 'localhost', port: 6379 },
})
| Driver | Value | Notes |
|---|---|---|
| Memory (default) | (omit storage) | Single-process only |
| LRU cache | lru | In-process LRU; set max and ttl |
| Redis | redis | Shared across instances |
| Upstash Redis | upstash | Serverless Redis via HTTP |
| Filesystem | fs | Local persistent storage for development |
| Cloudflare KV (binding) | cloudflare-kv-binding | Pass binding |
| Cloudflare KV (HTTP) | cloudflare-kv-http | Pass accountId, namespaceId, apiToken |
| Cloudflare R2 | cloudflare-r2-binding | Pass binding |
| Vercel | vercel | Vercel Runtime Cache |
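The choice between the in-memory default and a shared driver can be made explicit in code. The sketch below covers only the two cases described above, a single process and several instances sharing Redis. `pickStorage` and the `StorageConfig` type are hypothetical and local to this example; the Redis fields follow the configuration shown earlier in this section.

```typescript
// Local illustration of the Redis shape used in this guide; not the package's types.
type StorageConfig = { driver: 'redis'; host: string; port: number }

// Hypothetical helper: returns undefined for a single process (keeps the in-memory
// default) and a Redis config when several instances must share visitor state.
function pickStorage(
  instanceCount: number,
  redisHost: string = 'localhost',
  redisPort: number = 6379,
): StorageConfig | undefined {
  if (!Number.isInteger(instanceCount) || instanceCount < 1) {
    throw new RangeError('instanceCount must be a positive integer')
  }
  return instanceCount === 1
    ? undefined
    : { driver: 'redis', host: redisHost, port: redisPort }
}
```

When the result is undefined, omit the `storage` key from the object passed to defineConfiguration. Otherwise pass it as `storage`.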