Custom Data Sources

Compile your own JSON data into MMDB or LMDB databases with automatic TypeScript type generation, using both the CLI and the programmatic API.

Shield Base can compile any JSON data you provide into a fully typed MMDB or LMDB database. The compile subcommand and the compiler function handle both formats. The only requirements are that MMDB records contain a range field, and LMDB records contain a key or id field.


Choosing a Format

Format   Use when                                          Required field
------   -----------------------------------------------   ---------------------------------
MMDB     Your data is keyed by IP address or CIDR range    range (IPv4/IPv6 address or CIDR)
LMDB     Your data is keyed by any string identifier       key or id
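Because a missing required field is the most likely cause of a failed build, it can help to check your records before compiling. The following is a minimal sketch of such a pre-flight check. `findInvalidRecords` is a hypothetical helper and not part of @riavzon/shield-base; it only tests that the field is present, not that a range is a valid address or CIDR.

```typescript
// Hypothetical pre-flight check, not part of @riavzon/shield-base.
type Format = 'mmdb' | 'lmdb';

// Returns the indexes of records that lack the field required by the format.
function findInvalidRecords(format: Format, records: unknown[]): number[] {
  const invalid: number[] = [];
  records.forEach((record, index) => {
    const isObject =
      typeof record === 'object' && record !== null && !Array.isArray(record);
    const fields = isObject ? (record as Record<string, unknown>) : {};
    const ok =
      format === 'mmdb'
        ? typeof fields['range'] === 'string' && fields['range'].length > 0
        : fields['key'] !== undefined || fields['id'] !== undefined;
    if (!ok) invalid.push(index);
  });
  return invalid;
}

const ranges = [
  { range: '1.1.1.0/24', name: 'Cloudflare' },
  { name: 'missing range' },
];
const keyed = [{ key: 'a', value: 1 }, { id: 'b', value: 2 }, { value: 3 }];

console.log(findInvalidRecords('mmdb', ranges)); // index 1 lacks `range`
console.log(findInvalidRecords('lmdb', keyed)); // index 2 lacks `key` and `id`
```

Running a check like this first gives you the index of each offending record before anything is compiled.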

CLI: Compile a Single File

# Compile an MMDB database from a JSON file
pnpm dlx @riavzon/shield-base compile --type mmdb --name myRanges --outputDir ./out example.json

# Compile an LMDB database from a JSON file
pnpm dlx @riavzon/shield-base compile --type lmdb --name myKeys --outputDir ./out example.json

This produces:

  • ./out/myRanges.mmdb (or myKeys.mdb + myKeys.mdb-lock for LMDB)
  • ./out/myRangesTypes.ts (TypeScript types auto-generated from the data schema)

Pass --no-types to skip type generation:

pnpm dlx @riavzon/shield-base compile --type mmdb --name myRanges --no-types example.json

CLI: Batch Processing

When you provide multiple input files, the first output uses your --name and subsequent files are indexed:

pnpm dlx @riavzon/shield-base compile --type mmdb --name myRanges --outputDir ./out file1.json file2.json file3.json

Produces: myRanges.mmdb, myRanges-1.mmdb, myRanges-2.mmdb and a matching set of type files.
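If a build script needs to know the database file names in advance, the naming rule above can be reproduced in a few lines. This is a sketch of the rule as described here, not code taken from the tool; `batchOutputNames` is a hypothetical helper, and it covers only the database files, not the type files.

```typescript
// Hypothetical helper mirroring the documented naming rule:
// the first output uses the name as-is, later outputs get a -1, -2, ... suffix.
function batchOutputNames(
  name: string,
  fileCount: number,
  extension: string,
): string[] {
  return Array.from({ length: fileCount }, (_, i) =>
    i === 0 ? `${name}.${extension}` : `${name}-${i}.${extension}`,
  );
}

const names = batchOutputNames('myRanges', 3, 'mmdb');
console.log(names.join(', ')); // myRanges.mmdb, myRanges-1.mmdb, myRanges-2.mmdb
```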


Programmatic: Input Formats

The data field in the compiler options accepts three forms:

build.ts
import { compiler } from '@riavzon/shield-base';

// 1. File path
await compiler({ type: 'mmdb', input: { data: './example.json', dataBaseName: 'db', mmdbPath: 'mmdbctl', outputPath: './', generateTypes: true } });

// 2. Raw JSON string
await compiler({ type: 'mmdb', input: { data: '[{"range":"1.1.1.0/24"}]', dataBaseName: 'db', mmdbPath: 'mmdbctl', outputPath: './', generateTypes: true } });

// 3. JavaScript array
const data = [{ range: '1.1.1.0/24', name: 'Cloudflare' }];
await compiler({ type: 'mmdb', input: { data, dataBaseName: 'db', mmdbPath: 'mmdbctl', outputPath: './', generateTypes: true } });

Programmatic: Batch Processing

Provide a StringOfSources[] array to compile multiple JSON files into separate databases in one call:

build.ts
import { compiler } from '@riavzon/shield-base';
import type { StringOfSources } from '@riavzon/shield-base';

const sources: StringOfSources[] = [
  { pathToJson: 'ranges1.json', dataBaseName: 'rangesA', outputPath: './out' },
  { pathToJson: 'ranges2.json', dataBaseName: 'rangesB', outputPath: './out' },
];

// MMDB batch
await compiler({
  type: 'mmdb',
  input: {
    data: sources,
    dataBaseName: 'rangesA',
    mmdbPath: 'mmdbctl',
    outputPath: './out',
    generateTypes: true,
  },
});

// LMDB batch
await compiler({
  type: 'lmdb',
  input: {
    data: sources,
    dataBaseName: 'keysA',
    outputPath: './out',
    generateTypes: true,
  },
});
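When the list of input files is long, writing each StringOfSources entry by hand gets tedious. A minimal sketch of generating the array from file paths follows. `toSources` is a hypothetical helper, and the interface is copied locally from the type reference so the snippet stands alone; in a real project you would import the type from the package. Deriving dataBaseName from the file name is an assumption made for illustration.

```typescript
// Local copy of the StringOfSources shape so this sketch is self-contained.
interface StringOfSources {
  pathToJson: string;
  dataBaseName: string;
  outputPath: string;
}

// Hypothetical helper: one source per file, named after the file without ".json".
function toSources(files: string[], outputPath: string): StringOfSources[] {
  return files.map((pathToJson) => {
    const fileName = pathToJson.split(/[\\/]/).pop() ?? pathToJson;
    const dataBaseName = fileName.replace(/\.json$/i, '');
    return { pathToJson, dataBaseName, outputPath };
  });
}

const generatedSources = toSources(
  ['data/ranges1.json', 'data/ranges2.json'],
  './out',
);
console.log(generatedSources.map((source) => source.dataBaseName)); // ranges1, ranges2
```

The resulting array can then be passed as the data field in either batch call.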

Full Example: Nested MMDB Data

Shield Base handles deeply nested JSON structures. Given this input:

example.json
[
  {
    "range": "1.1.1.0/24",
    "metadata": {
      "version": "1.0.0",
      "author": "Person",
      "tags": ["dns", "secure", "fast"],
      "sub_data": {
        "level_1": { 
          "level_2": {
            "level_3": {
              "level_4": {
                "deep_value": "Success",
                "array_of_objects": [
                  { "index": 0, "active": true },
                  { "index": 1, "active": false }
                ],
                "mixed_types": [1, "two", { "three": 3 }]
              }
            }
          }
        }
      }
    },
    "organization": {
      "name": "Cloudflare, Inc.",
      "details": {
        "headquarters": "San Francisco",
        "employees": 3000,
        "is_public": true
      }
    }
  }
]

Compile and query it:

pnpm dlx @riavzon/shield-base compile --type mmdb --name myDb --outputDir ./out example.json
Terminal
mmdbctl read -f json-pretty 1.1.1.10 ./out/myDb.mmdb
{
  "ip": "1.1.1.10",
  "metadata": {
    "author": "Person",
    "sub_data": {
      "level_1": {
        "level_2": {
          "level_3": {
            "level_4": {
              "array_of_objects": [
                {
                  "active": true,
                  "index": 0
                },
                {
                  "active": false,
                  "index": 1
                }
              ],
              "deep_value": "Success",
              "mixed_types": [
                1,
                "two",
                {
                  "three": 3
                }
              ]
            }
          }
        }
      }
    },
    "tags": [
      "dns",
      "secure",
      "fast"
    ],
    "version": "1.0.0"
  },
  "network": "1.1.1.0/24",
  "organization": {
    "details": {
      "employees": 3000,
      "headquarters": "San Francisco",
      "is_public": true
    },
    "name": "Cloudflare, Inc."
  }
}

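The generated type file reflects the full structure. The listing below comes from a build named mymmdbdb, hence the file name mymmdbdbTypes.ts; following the <name>Types.ts pattern shown earlier, a build with --name myDb would produce myDbTypes.ts instead: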

mymmdbdbTypes.ts
interface MyMmdbDb {
  range: string;
  metadata: Metadata;
  organization: Organization;
}

interface Organization {
  name: string;
  details: Details;
}

interface Details {
  headquarters: string;
  employees: number;
  is_public: boolean;
}

interface Metadata {
  version: string;
  author: string;
  tags: string[];
  sub_data: Subdata;
}

interface Subdata {
  level_1: Level1;
}

interface Level1 {
  level_2: Level2;
}

interface Level2 {
  level_3: Level3;
}

interface Level3 {
  level_4: Level4;
}

interface Level4 {
  deep_value: string;
  array_of_objects: Arrayofobject[];
  mixed_types: (Mixedtype | number | string)[];
}

interface Mixedtype {
  three: number;
}

interface Arrayofobject {
  index: number;
  active: boolean;
}
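The mixed_types field is typed as a union, so consuming code has to narrow each entry before using it. The sketch below shows one way to do that with typeof checks. The Mixedtype interface is copied from the generated output above so the snippet stands alone; `MixedEntry` and `describeEntry` are hypothetical names introduced for illustration.

```typescript
// Copied from the generated types above.
interface Mixedtype {
  three: number;
}

// Hypothetical alias for the element type of `mixed_types`.
type MixedEntry = Mixedtype | number | string;

// Narrow the union with typeof; whatever remains must be the object form.
function describeEntry(entry: MixedEntry): string {
  if (typeof entry === 'number') return `number ${entry}`;
  if (typeof entry === 'string') return `string ${entry}`;
  return `object with three=${entry.three}`;
}

const mixed: MixedEntry[] = [1, 'two', { three: 3 }];
const described = mixed.map(describeEntry);
console.log(described.join('; ')); // number 1; string two; object with three=3
```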

Custom Crawler Providers

getCrawlersIps accepts a ProvidersLists[] array to merge your own IP range sources with the built-in crawler datasets:

build.ts
import { getCrawlersIps } from '@riavzon/shield-base';
import type { ProvidersLists } from '@riavzon/shield-base';

const customProviders: ProvidersLists[] = [
  {
    name: 'cloudflare',   // Stored as the `provider` field in the database
    type: 'JSON',         // 'JSON' | 'CSV' | 'HTML'
    urls: [
      'https://www.cloudflare.com/ips-v4',
      'https://www.cloudflare.com/ips-v6',
    ],
  },
];

await getCrawlersIps('./out', 'mmdbctl', customProviders);

This compiles the built-in provider datasets and your custom sources into a single goodBots.mmdb database.


Input Type Reference

type CompilerOptions<T> = 
    | { type: 'lmdb'; input: LmdbInput<T> }
    | { type: 'mmdb'; input: Input<T> };


type LmdbInput<T> = Omit<Input<T>, 'mmdbPath'> & {
    data: LmdbSources<T>;   
};

interface Input<T> {
    outputPath: string;
    dataBaseName: string;
    data: T[] | StringOfSources[] | string;
    mmdbPath: string;
    generateTypes: boolean;
}

interface DatabaseRecord<T> {
    key: string;
    data: T;
}

interface StringOfSources {
    pathToJson: string;
    dataBaseName: string;
    outputPath: string;
}

type LmdbSources<T> = DatabaseRecord<T>[] | StringOfSources[] | string | (T & {key: string})[];
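LmdbSources<T> accepts two array shapes: records with the key inline, and DatabaseRecord<T> entries that keep the key separate from the payload. If your data arrives in the first shape and you prefer the second, a conversion could look like the sketch below. `toDatabaseRecords` is a hypothetical helper, the interface is copied locally from the reference above, and dropping `key` from the payload is an illustrative choice rather than documented library behavior.

```typescript
// Local copy of DatabaseRecord so this sketch is self-contained.
interface DatabaseRecord<T> {
  key: string;
  data: T;
}

// Hypothetical helper: split each row into its key and the remaining fields.
function toDatabaseRecords<T extends { key: string }>(
  rows: T[],
): DatabaseRecord<Omit<T, 'key'>>[] {
  return rows.map(({ key, ...data }) => ({ key, data }));
}

const rows = [
  { key: 'user:1', plan: 'pro' },
  { key: 'user:2', plan: 'free' },
];
const records = toDatabaseRecords(rows);
console.log(records[0].key, records[0].data.plan); // user:1 pro
```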

Use the types subcommand to generate TypeScript types from a JSON file without compiling a database. This is useful for previewing the type output before committing to a schema.