Skip to content

[RUM-13793] 🐛 Skip potential sanitize updates on unaltered fields#4298

Draft
bdibon wants to merge 10 commits intomainfrom
boris.dibon/fix_rum-13793
Draft

[RUM-13793] 🐛 Skip potential sanitize updates on unaltered fields#4298
bdibon wants to merge 10 commits intomainfrom
boris.dibon/fix_rum-13793

Conversation

@bdibon
Copy link
Contributor

@bdibon bdibon commented Mar 9, 2026

Motivation

When a RumResourceEvent has a resource.url field that is too long one of two things can happen:

  1. beforeSend is used: the sanitize() function will return undefined and the limitModification function will set the resource.url field to undefined, effectively deleting it. This caused problem on web-ui (now fixed).

  2. beforeSend is not used: the createBatch() function will discard the message that contains the RumResourceEvent, we're loosing data.


Reflection behind the changes

What do we even fix?

Sending invalid data

The problem that surfaced to our users is now fixed, web-ui doesn't break when a RumResourceEvent has no resource.url.

The SDK shouldn't send invalid data to intake, this can be fixed in several ways:

  • For the specific resource.url field we can simply truncate it instead of setting it to undefined, then "how much do we truncate?", at lest enough for the message to intake to not be discarded
  • Have a generic fix that replaces every field sanitized to undefined by either "" or {}

Inconsistent behavior

Also note the SDK is behaving in an inconsistent manner:

  • if you don't pass a beforeSend() in the RumInitConfiguration, events with unsafe fields (large URL for example) will be discarded by the transport layer. The event is discarded
  • if you do pass a beforeSend() that doesn't alter the event, unsafe fields will be sanitized, which mean they will be either set to undefined (string too long), or truncated (the object is too big); The event or context is updated while the user never intended to do it.

Summary

Then we have two problems to solve:

  1. invalid data sent to intake
  2. inconsistent behavior with beforeSend(),

We solve 1. by not erasing required fields.

We solve 2. by not applying the sanitize process to fields the user hasn't modified, this requires us to compare the original value object[field] with the new value in setNestedValue(). This is easy for string values, but the implementation might be more involved for objects and potentially slow.

For this PR, I've have decided to fix 1 completely and 2 partially.

For string values, we don't go through the sanitize process if they didn't change.

For object values, we still go through it, at the risk of tempering the value if it's a large object (serialization wise), meaning the resulting object might be truncated. I think this is an acceptable outcome given the low probability of occurence (objects whose serialized version length is superior to 225 280 characters).

Changes

After second feedback round:

  • Skip sanitize() calls for untouched string values
  • Add a related unit test
Changes history

After first feedback round:

  • When a field is erased by the sanitize function, just replace it with "" or {} depending on the fieldType
  • Add a related unit test

Before feedback:

  • Save resource.url before beforeSend runs, so the original value can be restored if the user callback corrupts it
  • Truncate resource.url to 32 KiB if it exceeds that limit (or was set to undefined by beforeSend), with a display.warn message
  • Add two unit tests in assembly.spec.ts covering truncation both with and without a beforeSend callback.

Test instructions

  • Manual test (use the sandbox):
<script>
  fetch('/' + 'a'.repeat(300_000))
</script>
  • Unit tests:
yarn test:unit --spec 'packages/rum-core/src/domain/limitModification.spec.ts'

Checklist

  • Tested locally
  • Tested on staging
  • Added unit tests for this change.
  • Added e2e/integration tests for this change.
  • Updated documentation and/or relevant AGENTS.md file

@bdibon bdibon requested a review from a team as a code owner March 9, 2026 13:51
@bdibon bdibon marked this pull request as draft March 9, 2026 13:52
@bdibon bdibon force-pushed the boris.dibon/fix_rum-13793 branch from 0a59001 to c9c4d23 Compare March 9, 2026 13:54
@cit-pr-commenter-54b7da
Copy link

cit-pr-commenter-54b7da bot commented Mar 9, 2026

Bundles Sizes Evolution

📦 Bundle Name Base Size Local Size 𝚫 𝚫% Status
Rum 174.53 KiB 174.54 KiB +19 B +0.01%
Rum Profiler 6.16 KiB 6.16 KiB 0 B 0.00%
Rum Recorder 27.46 KiB 27.46 KiB 0 B 0.00%
Logs 56.84 KiB 56.84 KiB 0 B 0.00%
Flagging 944 B 944 B 0 B 0.00%
Rum Slim 130.21 KiB 130.23 KiB +19 B +0.01%
Worker 23.63 KiB 23.63 KiB 0 B 0.00%
🚀 CPU Performance
Action Name Base CPU Time (ms) Local CPU Time (ms) 𝚫%
addglobalcontext N/A 0.0038 N/A
addaction N/A 0.0123 N/A
adderror N/A 0.0118 N/A
addtiming N/A 0.0025 N/A
startview N/A 0.0116 N/A
startstopsessionreplayrecording N/A 0.0006 N/A
logmessage N/A 0.0133 N/A
🧠 Memory Performance
Action Name Base Memory Consumption Local Memory Consumption 𝚫
addglobalcontext N/A 26.98 KiB N/A
addaction N/A 51.05 KiB N/A
addtiming N/A 26.72 KiB N/A
adderror N/A 55.32 KiB N/A
startstopsessionreplayrecording N/A 25.79 KiB N/A
startview N/A 452.28 KiB N/A
logmessage N/A 43.72 KiB N/A

🔗 RealWorld

@datadog-datadog-prod-us1
Copy link

datadog-datadog-prod-us1 bot commented Mar 9, 2026

✅ Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🎯 Code Coverage (details)
Patch Coverage: 50.00%
Overall Coverage: 77.21% (-0.01%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: f7ab50d | Docs | Datadog PR Page | Was this helpful? React with 👍/👎 or give us feedback!

@bdibon bdibon force-pushed the boris.dibon/fix_rum-13793 branch from c9c4d23 to 4874044 Compare March 10, 2026 12:21
@bdibon bdibon force-pushed the boris.dibon/fix_rum-13793 branch from 4874044 to 5f2356e Compare March 10, 2026 16:44
With the current limitModification implementation, if the modifier doesn't perform any update on serverRumEvent.resource.url and the url happens to be too long, then it will be reset to undefined.

This is an "undefined behavior" that creates a discrepancy when the user of the SDK doesn't provide a beforeSend in the init config.

This commit solves the discrepancy by not calling sanitize on fields whose value (is a string) and hasn't been altered by the modifier.
@bdibon bdibon force-pushed the boris.dibon/fix_rum-13793 branch from 5f2356e to 18017a5 Compare March 11, 2026 12:07
Comment on lines +58 to +60
// and when it provides one that doesn't modify the event.context.
// This fix only works for strings, for example when the event.context is untouched,
// it will still go through the sanitize process. It's acceptable given the unlikely eventuality.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❓ question: ‏I don't get what you have in mind around event.context. Could you clarify?

Copy link
Contributor Author

@bdibon bdibon Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hard part of this "skip sanitization unless the value has changed" is to compare values that are objects, here object[field] === value. For a RumEvent the only "object" field that is modifiable by the user is context, see

const USER_CUSTOMIZABLE_FIELD_PATHS: ModifiableFieldPaths = {
context: 'object',
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh ok, I think I get it now.
With the current implementation, If the field is an object, updating a value of this object won't trigger a sanitize but updating the object reference will 🤔
What about using deepEqual when the field is an object?
This way it is more "Sanitize if it has been modified".

Regardless of the approach, we should add tests on these behaviors as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the current implementation, If the field is an object, updating a value of this object won't trigger a sanitize but updating the object reference will 🤔

The object[field] === value will never be true here for fieldType = "object", because we end up comparing the original object field with the one from the deep clone.

What about using deepEqual when the field is an object?
This way it is more "Sanitize if it has been modified".

I am not sure it's worth to do the deepEqual check, let's review this together.

It's not free.

deepEqual can be an expensive operation:

  • at runtime, both in time and memory, depending on the shape of the object; plus we would apply it on every event
  • implementation wise: we introduce more complexity

I think the bundle size increase would be insignificant here.

What's the value?

If we don't do the deepEqual, the following awkward flow might happen:

beforeSend() in config > "cloneContext" is unsafe (larger than 220 Kib) > beforeSend made no update to cloneContext > Unexpected Modified RumEvent Context

This is a bad outcome because the context of the RumEvent has been modified implicitly.

I think it is mitigated:

  • how likely is it to happen?
  • the event will likely be discarded by the transport layer (as it would have been originally)

What do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The object[field] === value will never be true here for fieldType = "object", because we end up comparing the original object field with the one from the deep clone.

oh yeah, indeed.

deepEqual can be an expensive operation

agree on end user resource impact, implementation wise I may miss the complexity that you anticipate

the following awkward flow might happen

yeah, I think we can live without it.

OK for me with current implementation but:

  • let's make it clear the this function does not sanitize modifiable field of types object, may be worth moving all the aded comments to the function description
  • let's have a test on that behavior

@bdibon bdibon changed the title [RUM-13793] 🐛 Double check illegal payloads from resource events (URL too long) [RUM-13793] 🐛 Skip potential sanitize udates on unaltered fields Mar 11, 2026
@bdibon bdibon changed the title [RUM-13793] 🐛 Skip potential sanitize udates on unaltered fields [RUM-13793] 🐛 Skip potential sanitize updates on unaltered fields Mar 11, 2026
bdibon and others added 3 commits March 11, 2026 14:03
Co-authored-by: Bastien Caudan <1331991+bcaudan@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants