A monitor-first Cloudflare Worker for bot policy enforcement, suspicious-country throttling, cookie sanitization, stateful asset rate limiting, and cleaner blocked-request UX.
- Monitor mode is now the safe default.
- Policy evaluation is modular and testable.
- Hybrid rule lanes allow known-bad traffic to stay enforced while newer rules stay monitor-only.
- Suspicious-country throttling is supported as a softer alternative to hard geo-blocking.
- Cookie sanitization can strip selected cookies before origin fetch.
- Standalone workers.dev deployments now return an operational response unless an upstream origin is configured.
- Rate limiting supports a Durable Object backend instead of relying only on process memory.
- Browser, API, and plain-text clients get different blocked responses.
- CI and a local test suite now validate the decision engine before merges.
Runtime entrypoint:
- worker.js wires config, policy evaluation, cookie sanitization, logging, and responses.
Core modules:
- src/presets.js defines conservative, balanced, and aggressive presets.
- src/config.js parses environment overrides.
- src/policy.js builds request context and evaluates policy decisions.
- src/rate-limiter.js uses a Durable Object when RATE_LIMITER is bound and falls back to in-memory limiting for local/dev use.
- src/responses.js builds browser, API, and plain-text deny responses.
- src/logging.js emits structured logs for monitor and enforced outcomes.
Default preset:
balanced
Default mode:
monitor
That means the worker will log requests that would be blocked, but it will still forward them to origin until you explicitly switch to enforce mode.
Balanced preset defaults:
- Block country: CN
- Block ASNs: 13220, 132203
- Block known AI/training scrapers, including GPTBot, ClaudeBot, anthropic-ai, ChatGPT-User, and similar crawlers
- Leave monitor-only country, ASN, and scraper lanes empty until you configure them
- Rate limit .js asset requests at 100 requests per minute per IP
- Bypass OPTIONS automatically to avoid breaking preflight traffic
- Leave suspicious-country throttling and cookie stripping disabled until you configure them
npm install
The repo ships with the Durable Object binding and migration already declared in wrangler.toml.
Add environment variables under [vars]:
[vars]
BOT_BLOCKER_MODE = "monitor"
BOT_BLOCKER_PRESET = "balanced"
BOT_BLOCKER_SUPPORT_URL = "https://yourdomain.com/support"
BOT_BLOCKER_UPSTREAM_ORIGIN = "https://example.com"
BOT_BLOCKER_THROTTLED_COUNTRIES = "VN,SG"
BOT_BLOCKER_THROTTLE_LIMIT = "15"
BOT_BLOCKER_STRIPPED_COOKIES = "session_token,tracking_id"
npm test
npm run deploy
When the monitor-only decisions look correct, flip:
BOT_BLOCKER_MODE = "enforce"
Conservative preset:
Best when:
- You have global customers.
- You want low false-positive risk.
- You want to start with a smaller ASN and scraper policy.
Defaults:
- No country block
- Tencent-focused ASN block
- Moderate .js rate limit at 120/min
Balanced preset:
Best when:
- You want a practical default with monitor-first rollout.
- You want AI scraper blocking without broad regional allowlisting.
Defaults:
- CN blocked
- Tencent ASN block
- AI/training scraper blocklist
- .js rate limit at 100/min
Aggressive preset:
Best when:
- You mostly serve a narrow regional market.
- You are willing to trade accessibility for stronger suppression.
Defaults:
- Country allowlist for a defined set of regions
- Expanded ASN blocklist
- Expanded scraper blocklist
- Asset rate limiting on .js, .css, .json, and .map
- 30/min asset rate limit
Behavior:
- BOT_BLOCKER_MODE: monitor or enforce
- BOT_BLOCKER_PRESET: conservative, balanced, or aggressive
- BOT_BLOCKER_SUPPORT_URL: optional support link shown on browser deny pages
- BOT_BLOCKER_UPSTREAM_ORIGIN: optional origin to proxy allowed traffic to when the worker runs standalone
Lists:
- BOT_BLOCKER_BLOCKED_COUNTRIES
- BOT_BLOCKER_MONITORED_COUNTRIES
- BOT_BLOCKER_ALLOWED_COUNTRIES
- BOT_BLOCKER_THROTTLED_COUNTRIES
- BOT_BLOCKER_BLOCKED_ASNS
- BOT_BLOCKER_MONITORED_ASNS
- BOT_BLOCKER_BLOCKED_SCRAPERS
- BOT_BLOCKER_MONITORED_SCRAPERS
- BOT_BLOCKER_ALLOWED_IPS
- BOT_BLOCKER_STRIPPED_COOKIES
- BOT_BLOCKER_DELETE_STRIPPED_COOKIES
- BOT_BLOCKER_COOKIE_DELETE_DOMAIN
- BOT_BLOCKER_PROTECTED_PATH_PREFIXES
- BOT_BLOCKER_BYPASS_METHODS
- BOT_BLOCKER_RATE_LIMIT_PATH_SUFFIXES
- BOT_BLOCKER_RATE_LIMIT_PATH_PREFIXES
- BOT_BLOCKER_RATE_LIMIT_BYPASS_SAME_ORIGIN_ASSETS
- BOT_BLOCKER_STRICT_RATE_LIMIT_ENABLED
- BOT_BLOCKER_STRICT_RATE_LIMIT
- BOT_BLOCKER_STRICT_RATE_WINDOW_MS
- BOT_BLOCKER_STRICT_RATE_LIMIT_PATH_PREFIXES
- BOT_BLOCKER_STRICT_RATE_LIMIT_PATH_SUFFIXES
- BOT_BLOCKER_STRICT_RATE_LIMIT_MARKERS
- BOT_BLOCKER_HEALTH_PATH
Rate limiting:
- BOT_BLOCKER_RATE_LIMIT_ENABLED
- BOT_BLOCKER_RATE_LIMIT
- BOT_BLOCKER_RATE_WINDOW_MS
Suspicious-country throttling:
- BOT_BLOCKER_THROTTLE_LIMIT
List values are comma-separated. An explicitly empty value clears the preset default for that field.
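The distinction between an unset variable and an explicitly empty one can be sketched as follows. This is a minimal illustration only: resolveList is a hypothetical helper name, and the actual parsing in src/config.js may differ.

```javascript
// Hypothetical helper, not the actual src/config.js implementation.
//   undefined  -> variable not set, keep the preset default
//   ""         -> explicitly empty, clear the preset default
//   "VN, SG"   -> override with the parsed, trimmed entries
function resolveList(rawValue, presetDefault) {
  if (rawValue === undefined || rawValue === null) {
    return [...presetDefault];
  }
  return String(rawValue)
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
}

const presetCountries = ["CN"];
console.log(resolveList(undefined, presetCountries)); // keeps the preset list
console.log(resolveList("", presetCountries));        // clears the list
console.log(resolveList("VN, SG", presetCountries));  // replaces the list
```

Copying the preset array on the unset path keeps later mutations of the resolved config from leaking back into the preset definition.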
Use throttling when you have some legitimate traffic from a region but still need to suppress heavy abuse.
- The key is country + IP
- The default window is 60 seconds
- Requests over the configured limit return 429
429 - In monitor mode, the worker logs the event but still forwards the request
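The behavior above can be approximated with a fixed-window counter. The sketch below is in-memory and purely illustrative: createThrottle, the returned field names, and the fixed-window strategy are assumptions, not the worker's actual limiter.

```javascript
// Illustrative fixed-window counter keyed by country + IP.
// All names here are assumptions for the sake of the example.
const DEFAULT_WINDOW_MS = 60_000; // 60-second default window

function createThrottle(limit, windowMs = DEFAULT_WINDOW_MS) {
  const counters = new Map(); // key -> { windowStart, count }

  return function check(country, ip, mode, now = Date.now()) {
    const key = `${country}:${ip}`;
    let entry = counters.get(key);
    // Start a fresh window when none exists or the current one has expired.
    if (!entry || now - entry.windowStart >= windowMs) {
      entry = { windowStart: now, count: 0 };
      counters.set(key, entry);
    }
    entry.count += 1;

    if (entry.count <= limit) {
      return { action: "forward", logged: false };
    }
    // Over the limit: monitor mode logs and forwards, enforce mode returns 429.
    if (mode === "monitor") {
      return { action: "forward", logged: true };
    }
    const retryAfter = Math.ceil((entry.windowStart + windowMs - now) / 1000);
    return { action: "throttle", status: 429, logged: true, retryAfter };
  };
}

const check = createThrottle(15);
console.log(check("VN", "203.0.113.7", "enforce"));
```

A per-isolate Map like this is only suitable for local experiments; shared state across isolates needs something like the Durable Object backend.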
You can keep proven bad traffic blocked while only monitoring newer rules:
- BOT_BLOCKER_BLOCKED_COUNTRIES, BOT_BLOCKER_BLOCKED_ASNS, and BOT_BLOCKER_BLOCKED_SCRAPERS stay enforceable in enforce mode
- BOT_BLOCKER_MONITORED_COUNTRIES, BOT_BLOCKER_MONITORED_ASNS, and BOT_BLOCKER_MONITORED_SCRAPERS always log and allow
- In global monitor mode, even blocked rules log instead of enforcing
- BOT_BLOCKER_PROTECTED_PATH_PREFIXES does not narrow those global blocklists; use the explicit asset path settings when you want path-scoped rate limits
That gives you a safe rollout pattern for production:
- hard block known bad traffic
- monitor uncertain traffic
- promote monitor-only rules to enforced rules only after log review
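The interaction between the global mode and the two lanes can be summarized in a small decision function. This is a sketch under assumptions: resolveLaneAction and the shape of the match object are hypothetical, and src/policy.js may structure its decisions differently.

```javascript
// Hypothetical lane resolution; function and field names are illustrative.
// match.lane is "blocked", "monitored", or absent when no rule matched.
function resolveLaneAction(mode, match) {
  if (!match || !match.lane) {
    return { action: "allow", log: false };
  }
  // Monitor-only lanes always log and allow, regardless of the global mode.
  if (match.lane === "monitored") {
    return { action: "allow", log: true, reason: match.reason };
  }
  // Blocked lanes are enforced only when the global mode is "enforce".
  if (mode === "enforce") {
    return { action: "block", log: true, reason: match.reason };
  }
  return { action: "allow", log: true, reason: match.reason };
}

console.log(resolveLaneAction("enforce", { lane: "blocked", reason: "blocked_asn" }));
console.log(resolveLaneAction("enforce", { lane: "monitored", reason: "blocked_country" }));
console.log(resolveLaneAction("monitor", { lane: "blocked", reason: "blocked_scraper" }));
```

Promoting a rule then amounts to moving its entry from a MONITORED list to the corresponding BLOCKED list once the logs look right.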
Use cookie stripping to prevent selected cookies from reaching origin for untrusted requests.
- Configure cookie names with BOT_BLOCKER_STRIPPED_COOKIES
- Matching is case-insensitive
- Requests are only rewritten when one of those cookie names is present
- A SANITIZED log event is emitted with the stripped cookie names
- If BOT_BLOCKER_DELETE_STRIPPED_COOKIES is enabled, matching cookies are also expired in the browser response
- BOT_BLOCKER_COOKIE_DELETE_DOMAIN lets you scope those deletions to a specific domain
You can run stricter asset controls without hard-coding tenant-specific app logic:
- BOT_BLOCKER_RATE_LIMIT_PATH_PREFIXES scopes the standard asset limiter
- BOT_BLOCKER_RATE_LIMIT_BYPASS_SAME_ORIGIN_ASSETS avoids penalizing normal browser script loads
- BOT_BLOCKER_STRICT_RATE_LIMIT_* variables define a stricter limiter for high-risk asset paths
- BOT_BLOCKER_STRICT_RATE_LIMIT_MARKERS can target specific module names inside bundled asset paths
If the worker is running on workers.dev with no upstream origin configured:
- /_bot-blocker/health returns a 200 operational response
- Browser requests get an informational page instead of a failed passthrough
- API-style requests get JSON showing whether an upstream origin is configured
If you want allowed traffic proxied onward from workers.dev, set BOT_BLOCKER_UPSTREAM_ORIGIN.
Keep this repository generic.
- Store route bindings, custom domains, upstream origins, cookie names, path heuristics, and rollout decisions in a private deployment repository or secret-managed infrastructure config.
- Do not commit tenant-specific staging hosts, production hosts, or cutover playbooks into the public worker source tree.
- Treat every deployment as data layered onto the worker, not logic merged into it.
- Start with BOT_BLOCKER_MODE = "monitor".
- Watch logs for blocked_country, blocked_asn, blocked_scraper, country_throttle_exceeded, and rate_limit_exceeded.
- Add monitor-only countries, ASNs, and scrapers first for any uncertain traffic.
- Add allowlisted IPs, protected path prefixes, throttled countries, stripped cookies, or preset overrides where needed.
- Switch to enforce only after the monitor output matches your expectations.
Blocked requests now return:
- HTML for browser traffic with a support link and request ID
- JSON for API-style traffic with code, message, and requestId
- Plain text for everything else
Every enforced response includes:
- X-Bot-Blocker-Reason
- X-Request-Id
- Retry-After for throttled or rate-limited responses
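One way to picture the response contract is a function that picks a format from the Accept header and attaches the headers listed above. This is a sketch under stated assumptions: buildDenyResponse is a hypothetical name, client detection via Accept is a guess at how src/responses.js classifies traffic, and the 403 default status and body wording are illustrative.

```javascript
// Hypothetical sketch; returns a plain descriptor rather than a Response.
function buildDenyResponse({ accept = "", reason, requestId, status = 403, retryAfter, supportUrl }) {
  const headers = {
    "X-Bot-Blocker-Reason": reason,
    "X-Request-Id": requestId,
  };
  // Retry-After only applies to throttled or rate-limited responses.
  if (status === 429 && retryAfter !== undefined) {
    headers["Retry-After"] = String(retryAfter);
  }

  if (accept.includes("text/html")) {
    headers["Content-Type"] = "text/html; charset=utf-8";
    const support = supportUrl ? `<p><a href="${supportUrl}">Contact support</a></p>` : "";
    const body = `<h1>Request blocked</h1><p>Request ID: ${requestId}</p>${support}`;
    return { status, headers, body };
  }
  if (accept.includes("application/json")) {
    headers["Content-Type"] = "application/json";
    const body = JSON.stringify({ code: reason, message: "Request blocked", requestId });
    return { status, headers, body };
  }
  headers["Content-Type"] = "text/plain; charset=utf-8";
  return { status, headers, body: `Request blocked (${requestId})` };
}

const denied = buildDenyResponse({
  accept: "application/json",
  reason: "rate_limit_exceeded",
  requestId: "req-123",
  status: 429,
  retryAfter: 30,
});
console.log(denied.status, denied.headers, denied.body);
```

Inside a Worker, the descriptor would typically be passed to new Response(body, { status, headers }).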
Local test command:
npm test
Coverage includes:
- Preset/config parsing
- Monitor vs enforce behavior
- Allowlist and path-scope behavior
- Suspicious-country throttling
- Cookie sanitization
- Durable Object rate limiting
- HTML and JSON response contracts
- Worker-level integration
GitHub Actions runs the same test suite on pushes and pull requests.
Config examples live in:
- The in-memory rate limiter is only a fallback for local/dev or unbound environments.
- Production rate limiting should use the bundled Durable Object binding.
- OPTIONS is bypassed by default to reduce accidental API/CORS regressions.
- Search-engine bots are not blocked by the balanced preset. If you want that behavior, use the aggressive preset or override the scraper list directly.
- Cookie sanitization only affects requests that are allowed to continue to origin.
MIT. See LICENSE.