Skip to content

Remove bogus use of collation from cdjs #294

@hsivonen

Description

@hsivonen

The cdjs benchmark in JetStream is supposed to be a collision detection simulation. However, it ends up making String.prototype.localeCompare performance-sensitive for the benchmark in a way that's not a reasonable benchmark workload for locale-sensitive string sorting and is not actually testing collision detection math: It's comparing strings of the form "fooN" where N is an integer as text. Collation-wise this means skipping an identical prefix and then comparing ASCII digits in the non-numeric mode, which in the absence of this benchmark doesn't look like something that should deserve branches on the fastest path. The comparison seems to be within a red-black tree and not really about user-visible strings.

AFAICT, the benchmark isn't an extract of real-world silly code from around the Web but has been written as a benchmark to begin with.

I think we shouldn't have code this silly in a benchmark, when the silliness isn't documented to represent real silliness on the Web but has been written as benchmark code without a clear reason for the silliness that would be important engines to optimize for.

Let's remove the use of localeCompare here and use actual JavaScript numbers as red-black tree keys.

(It would be reasonable to benchmark collation, but if we do that, we should have realistic workloads of sorting user-visible strings. I have tried to come up with benchmarks for ICU4X and for Firefox PGO, and figuring out what workloads are representative is unusually hard for collation. A properly-optimized collator has an exceptionally large performance gap between the best and worse cases, so on one hand, it's very easy to come up with something that only exercises superficial best cases, and on the other hand, it's very easy to over-do attempts to exercise the more complex cases in a way that's no longer representative of real workloads.)

return this._value.localeCompare(other._value);

https://github.com/WebKit/JetStream/blob/main/cdjs/simulator.js#L29
https://github.com/WebKit/JetStream/blob/main/cdjs/benchmark.js#L81
https://github.com/WebKit/JetStream/blob/main/cdjs/red_black_tree.js#L31

Metadata

Metadata

Assignees

No one assigned

    Labels

    dubiousThis appears to encourage suspect optimizationsv4.0

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions