feat: add value sanitization for UTM parameters #10

Merged
jackmisner merged 2 commits into main from feature/value-sanitization on Feb 13, 2026

@jackmisner jackmisner commented Feb 13, 2026

Summary

🤖 Generated with Nori

  • Add a sanitization module (src/core/sanitizer.ts) that strips HTML tags, control characters, and custom regex patterns from UTM parameter values at capture time to prevent XSS
  • New SanitizeConfig type integrated into the config system with DEFAULT_SANITIZE_CONFIG (disabled by default, safe defaults when enabled)
  • Wired into captureUtmParameters() pipeline and React useUtmTracking hook — accepts Partial<SanitizeConfig> and merges with defaults
  • Includes validation for all sanitize config fields (customPattern must be RegExp, maxLength must be positive finite number)
  • Updated README with sanitization documentation and examples
  • Updated noridocs across all affected modules
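
The config handling described in the bullets above might look roughly like the sketch below. The field names and the names `DEFAULT_SANITIZE_CONFIG` and `mergeSanitizeConfig` appear in this PR; the default values, the function signatures, and the `validateSanitizeConfig` helper are assumptions made for illustration and are not taken from the actual source.

```typescript
// Field names come from the PR description; everything else is a sketch.
interface SanitizeConfig {
  enabled: boolean;
  stripHtml: boolean;
  stripControlChars: boolean;
  maxLength: number;
  customPattern?: RegExp;
}

// Assumed defaults: disabled, with conservative settings once enabled.
const DEFAULT_SANITIZE_CONFIG: SanitizeConfig = {
  enabled: false,
  stripHtml: true,
  stripControlChars: true,
  maxLength: 200, // hypothetical value; the PR does not state the real default
};

// Hypothetical helper: collect validation errors for a partial sanitize config.
function validateSanitizeConfig(partial: Partial<SanitizeConfig>): string[] {
  const errors: string[] = [];
  const { maxLength, customPattern } = partial;
  // Rejects NaN, Infinity, zero, and negative numbers.
  if (maxLength !== undefined && !(Number.isFinite(maxLength) && maxLength > 0)) {
    errors.push("sanitize.maxLength must be a positive finite number");
  }
  if (customPattern !== undefined && !(customPattern instanceof RegExp)) {
    errors.push("sanitize.customPattern must be a RegExp");
  }
  return errors;
}

// Caller-supplied fields override the defaults; omitted fields keep them.
function mergeSanitizeConfig(partial: Partial<SanitizeConfig> = {}): SanitizeConfig {
  return { ...DEFAULT_SANITIZE_CONFIG, ...partial };
}
```

With this shape, a caller who passes only `{ enabled: true }` gets the remaining fields from the defaults, which is one way to read "safe defaults when enabled".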

Test Plan

  • 24 unit tests for sanitizeValue and sanitizeParams (HTML stripping, control chars, custom patterns, truncation, idempotency, edge cases)
  • 15 config tests for sanitize merge semantics, validation (including NaN, Infinity, non-RegExp customPattern)
  • 5 capture integration tests (enabled, disabled, not provided, camelCase, allowedParameters)
  • 1 React hook integration test (sanitize config forwarded and applied)
  • All 266 tests passing
  • Format, lint, and type checks clean

Share Nori with your team: https://www.npmjs.com/package/nori-ai

Summary by CodeRabbit

  • New Features

    • Added configurable parameter value sanitization (strip HTML, remove control characters, custom patterns, max length); disabled by default
    • Exposed sanitization utilities and config type in the public API
  • Documentation

    • Added sanitization examples and configuration guidance
  • Tests

    • Added comprehensive tests covering sanitization config, value/parameter sanitization and integration in capture flows
  • Chores

    • Updated ignore rules to include a temporary directory

Add a sanitization module that strips dangerous characters (HTML tags,
control chars, custom patterns) from UTM parameter values at capture
time to prevent XSS when values are rendered in HTML or used in URLs.

- New SanitizeConfig type with enabled, stripHtml, stripControlChars,
  maxLength, and optional customPattern fields
- sanitizeValue() and sanitizeParams() in src/core/sanitizer.ts
- Integration into captureUtmParameters() pipeline (extract → filter →
  sanitize → convert key format)
- Config system support: DEFAULT_SANITIZE_CONFIG, merge, validation
- React hook forwards sanitize config to capture
- Disabled by default with safe defaults when enabled
- 266 tests passing (45 new)
🤖 Generated with [Nori](https://nori.ai)

Co-Authored-By: Nori <contact@tilework.tech>
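
The pipeline order named in the commit message (extract → filter → sanitize → convert key format) can be sketched as below. Every function here is a hypothetical stand-in written for illustration; none of the names or signatures are the library's real `captureUtmParameters()` API, and the sanitize step is reduced to a single rule.

```typescript
type Params = Record<string, string>;

// Step 1 (stand-in): pull utm_* keys out of a query string.
function extractUtm(query: string): Params {
  const out: Params = {};
  new URLSearchParams(query).forEach((value, key) => {
    if (key.startsWith("utm_")) out[key] = value;
  });
  return out;
}

// Step 2 (stand-in): keep only allowlisted keys, if an allowlist is given.
function filterAllowed(params: Params, allowed?: string[]): Params {
  if (!allowed) return params;
  const out: Params = {};
  for (const [key, value] of Object.entries(params)) {
    if (allowed.includes(key)) out[key] = value;
  }
  return out;
}

// Step 3 (stand-in): strip HTML-significant characters when enabled.
function sanitizeAll(params: Params, enabled: boolean): Params {
  if (!enabled) return params;
  const out: Params = {};
  for (const [key, value] of Object.entries(params)) {
    out[key] = value.replace(/[<>"'`]/g, "").trim();
  }
  return out;
}

// Step 4 (stand-in): convert snake_case keys to camelCase.
function toCamelCaseKeys(params: Params): Params {
  const out: Params = {};
  for (const [key, value] of Object.entries(params)) {
    out[key.replace(/_([a-z])/g, (_m, c: string) => c.toUpperCase())] = value;
  }
  return out;
}

// Sanitization runs on values before keys are renamed.
function captureSketch(
  query: string,
  options: { allowed?: string[]; sanitize?: boolean } = {},
): Params {
  const extracted = extractUtm(query);
  const filtered = filterAllowed(extracted, options.allowed);
  const sanitized = sanitizeAll(filtered, options.sanitize ?? false);
  return toCamelCaseKeys(sanitized);
}
```

Because sanitization only touches values, placing it before key conversion means the result is the same whichever key format is requested.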

coderabbitai bot commented Feb 13, 2026

Walkthrough

Adds configurable value sanitisation for UTM parameters (SanitizeConfig), implements sanitizer utilities, applies sanitisation during captureUtmParameters when enabled, exposes related types/exports, updates config merging/validation/defaults, and adds comprehensive tests and docs. Also adds temp/* to .gitignore.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Type Definitions: src/types/index.ts, src/types/docs.md | Added SanitizeConfig type; extended UtmConfig (optional sanitize) and ResolvedUtmConfig (required sanitize). |
| Configuration & Defaults: src/config/defaults.ts, src/config/loader.ts, src/config/index.ts, src/config/docs.md | Added DEFAULT_SANITIZE_CONFIG; added mergeSanitizeConfig(); updated createConfig()/mergeConfig() to merge sanitize settings; extended validateConfig() to validate sanitize fields; re-exported default. |
| Core Sanitiser: src/core/sanitizer.ts, src/core/capture.ts, src/core/index.ts, src/core/docs.md | New sanitizeValue() and sanitizeParams(); capture pipeline now optionally sanitises values (pre-key-format) when enabled; capture option accepts sanitize partial config; catch now logs warning. |
| Public API & Exports: src/index.ts, src/core/index.ts, src/config/index.ts, README.md | Exported sanitizeValue, sanitizeParams, DEFAULT_SANITIZE_CONFIG, and SanitizeConfig type; README updated with sanitisation docs and examples. |
| React Integration: src/react/useUtmTracking.ts | Hook now forwards sanitize option into captureUtmParameters() so sanitisation can be enabled from the hook. |
| Tests & Docs: __tests__/core/sanitizer.test.ts, __tests__/core/capture.test.ts, __tests__/config/loader.test.ts, __tests__/react/useUtmTracking.test.tsx, __tests__/docs.md | Added extensive unit and integration tests covering sanitiser behaviour, config merging/validation, and capture integration; test docs updated. |
| Misc / Build: vitest.config.ts, .gitignore | Minor lint/format cleanup in vitest config; added temp/* to .gitignore. |

Sequence Diagram(s)

sequenceDiagram
    actor Client
    participant Hook as useUtmTracking
    participant Capture as captureUtmParameters
    participant Sanitiser as sanitizeParams

    Client->>Hook: request capture with sanitize config
    Hook->>Capture: call with URL params + sanitize option
    Capture->>Capture: merge sanitize config with defaults
    Capture->>Capture: filter by allowlist (if provided)
    alt sanitize.enabled == true
        Capture->>Sanitiser: sanitizeParams(params, config)
        Sanitiser-->>Capture: return sanitised params
    end
    Capture->>Capture: convert keys to target keyFormat (e.g., camelCase)
    Capture-->>Hook: return formatted params
    Hook-->>Client: provide UTM parameters
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding value sanitization functionality for UTM parameters, which aligns with the comprehensive sanitization feature implementation across the codebase. |
| Merge Conflict Detection | ✅ Passed | ✅ No merge conflicts detected when merging into main |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


No actionable comments were generated in the recent review. 🎉


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/core/sanitizer.ts`:
- Around line 19-66: The functions ignore config.enabled; update sanitizeValue
and sanitizeParams to no-op when sanitization is disabled: in
sanitizeValue(value, config) return the original value immediately if
config.enabled === false, and in sanitizeParams(params, config) return params
unchanged (or a shallow copy of params) if config.enabled === false to avoid
mutating values when sanitisation is disabled; keep existing logic for the
enabled case and preserve references to sanitizeValue and sanitizeParams when
updating.
🧹 Nitpick comments (1)
src/core/docs.md (1)

43-45: Consider adjusting code span formatting to resolve linter warning.

The static analysis tool flags spaces inside the code span on line 45. You could list each character in separate code spans for stricter compliance, though the current format is readable.

🔧 Optional fix for markdownlint warning
-- **sanitizer.ts**: `sanitizeValue()` strips dangerous characters from a single string value. Rules apply in order: HTML-significant characters (`< > " ' \``) --> control characters (\x00-\x1F except tab/newline/CR) --> optional custom regex pattern --> trim --> truncate to `maxLength`. `sanitizeParams()` applies `sanitizeValue()` to every non-undefined value in a `UtmParameters` object, returning a new object with keys preserved unchanged. Both functions are pure and stateless; all behavior is driven by the `SanitizeConfig` argument.
+- **sanitizer.ts**: `sanitizeValue()` strips dangerous characters from a single string value. Rules apply in order: HTML-significant characters (`<`, `>`, `"`, `'`, `` ` ``) --> control characters (\x00-\x1F except tab/newline/CR) --> optional custom regex pattern --> trim --> truncate to `maxLength`. `sanitizeParams()` applies `sanitizeValue()` to every non-undefined value in a `UtmParameters` object, returning a new object with keys preserved unchanged. Both functions are pure and stateless; all behavior is driven by the `SanitizeConfig` argument.
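
The rule order in the docs.md entry above (HTML-significant characters, then control characters, then the optional custom pattern, then trim, then truncate) could be implemented along these lines. This is a sketch written from that description, not the contents of src/core/sanitizer.ts; in particular, the exact regular expressions and the choice to replace matches with an empty string are assumptions.

```typescript
interface SanitizeConfig {
  enabled: boolean;
  stripHtml: boolean;
  stripControlChars: boolean;
  maxLength: number;
  customPattern?: RegExp;
}

function sanitizeValue(value: string, config: SanitizeConfig): string {
  let result = value;
  if (config.stripHtml) {
    // Remove the HTML-significant characters < > " ' and backtick.
    result = result.replace(/[<>"'`]/g, "");
  }
  if (config.stripControlChars) {
    // \x00-\x1F minus tab (\x09), newline (\x0A), and carriage return (\x0D).
    result = result.replace(/[\x00-\x08\x0B\x0C\x0E-\x1F]/g, "");
  }
  if (config.customPattern) {
    // A pattern without the `g` flag removes only its first match.
    result = result.replace(config.customPattern, "");
  }
  // Trim before truncating so leading whitespace does not count toward maxLength.
  result = result.trim();
  return result.slice(0, config.maxLength);
}
```

In this sketch a second pass over an already sanitized value changes nothing when no custom pattern is set, in line with the idempotency case listed in the test plan.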

Both functions are public API exports and should no-op when
config.enabled is false, rather than relying on callers to check.
🤖 Generated with [Nori](https://nori.ai)

Co-Authored-By: Nori <contact@tilework.tech>
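
The guard requested in the review, and described in the commit message above, might look like the following sketch. The `UtmParameters` type is a simplified stand-in, the config is trimmed to two fields, and the sanitizing rule is reduced to one line; only the early returns on `config.enabled` are the point here. Returning a shallow copy from `sanitizeParams` when disabled is one of the two options the review suggested, chosen here as an assumption.

```typescript
// Simplified stand-ins for the library's types (assumptions for this sketch).
type UtmParameters = Record<string, string | undefined>;
interface SanitizeConfig {
  enabled: boolean;
  maxLength: number;
}

function sanitizeValue(value: string, config: SanitizeConfig): string {
  // No-op when sanitization is disabled, so direct callers need no extra check.
  if (!config.enabled) return value;
  return value.replace(/[<>"'`]/g, "").trim().slice(0, config.maxLength);
}

function sanitizeParams(params: UtmParameters, config: SanitizeConfig): UtmParameters {
  // Disabled: return a shallow copy with every value untouched.
  if (!config.enabled) return { ...params };
  const result: UtmParameters = {};
  for (const [key, value] of Object.entries(params)) {
    // Keys are preserved; undefined values pass through unchanged.
    result[key] = value === undefined ? undefined : sanitizeValue(value, config);
  }
  return result;
}
```

Returning a copy in both branches keeps the return contract uniform: callers always receive a new object and can mutate it without affecting the input.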
@jackmisner jackmisner merged commit 5a1de45 into main Feb 13, 2026
5 checks passed