
Benchmark Runner QoL Improvements #3004

Merged
lerno merged 6 commits into c3lang:master from ManuLinares:benchmark-qol-clean
Mar 7, 2026

Conversation

@ManuLinares
Member

Fixed merge conflicts of #2672


From original PR by @NotsoanoNimus :

I work with compile-benchmark quite often - only fair I help with its
upkeep. Slapped this together in a couple hours, because I really need
more consistent results and a way to visualize them, and the benchmark
runner is a bit of a rat's nest.

Changes, in no particular order:

  • Add a MEDIAN metric to the results
    • The mean gives us throughput information, but it's too
      heavily skewed by a set's outliers.
    • The median gives us an idea of performance expectations with
      a 50% probability. Put another way, the tested function has performed at
      or better than the median in 50% of samples taken.
    • Caveat: a sorted set is required. For high-volume iterations,
      this can get expensive.
  • Improve the output units and refactor that into NanoDuration,
    so it can be used elsewhere as desired
  • Provide some pretty colors, oooooo
  • Add a CSV reporting option to get a resultant data set
    • Not the most useful thing, but makes it easy to plug benchmark
      outputs into other software. Could probably be improved upon. And
      yeah, I'm a boomer who likes CSV, sue me
  • Restructure the benchmark runtime to be more like the tester
    runtime
  • Rudimentary command-line options to adjust benchmark options
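The MEDIAN bullet above can be sketched as follows. This is an illustrative Python sketch, not the actual C3 implementation in the PR; it shows both the sorted-set requirement the caveat mentions and why the mean is skewed by outliers while the median is not.

```python
def median(samples):
    """Return the value that 50% of samples perform at or better than."""
    s = sorted(samples)               # sorting is the O(n log n) cost noted above
    n = len(s)
    mid = n // 2
    if n % 2:                         # odd count: the middle element
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2  # even count: mean of the two middle elements

# One outlier dominates the mean but barely moves the median:
timings_ns = [100, 102, 98, 101, 99, 5000]
print(sum(timings_ns) / len(timings_ns))  # mean ~916.7, skewed by the outlier
print(median(timings_ns))                 # median 100.5, robust to it
```

This is why the PR reports both: the mean still carries throughput information, while the median describes the typical sample.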

Fixed merge conflicts of c3lang#2672

The original PR description (quoted above) also listed one fix not repeated there:

  • Fixed a divide-by-zero crash when the benchmark iteration count
    was <100
@lerno
Collaborator

lerno commented Mar 6, 2026

I would drop the median in favour of standard deviation.

Differentiate measure units by color (us, ms, s).
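The unit differentiation suggested here could look like the sketch below. The unit thresholds and ANSI color choices are illustrative assumptions, not the runner's actual scheme: pick the largest unit that fits the magnitude, then wrap the formatted value in a color that is distinct per unit.

```python
# Hypothetical per-unit ANSI colors (illustrative only).
COLORS = {"ns": "\033[90m", "us": "\033[36m", "ms": "\033[33m", "s": "\033[32m"}
RESET = "\033[0m"

def format_duration(ns):
    """Scale a nanosecond count to the largest fitting unit and colorize it."""
    for unit, divisor in (("s", 1e9), ("ms", 1e6), ("us", 1e3)):
        if ns >= divisor:
            return f"{COLORS[unit]}{ns / divisor:.2f} {unit}{RESET}"
    return f"{COLORS['ns']}{ns} ns{RESET}"

print(format_duration(5_437_982))      # falls in the milliseconds range
print(format_duration(500))            # stays in nanoseconds
```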
@ManuLinares
Member Author

> I would drop the median in favour of standard deviation

I added standard deviation by default (no flag needed), plus a few tweaks here and there on colors and output.
The progress bar doesn't affect bench results.
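A hedged sketch of computing the standard deviation, again in Python for illustration rather than the PR's C3 code. Unlike the median, it needs no sorted copy of the samples; Welford's online algorithm (an assumption here, not necessarily what the runner uses) accumulates it in one pass.

```python
import math

def stddev(samples):
    """Sample standard deviation via Welford's one-pass online algorithm."""
    mean = 0.0
    m2 = 0.0                       # running sum of squared deviations
    for n, x in enumerate(samples, start=1):
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)   # note: uses the *updated* mean
    return math.sqrt(m2 / (len(samples) - 1))

timings_ns = [100, 102, 98, 101, 99]
print(stddev(timings_ns))          # sqrt(10 / 4) ~ 1.581
```

One pass and O(1) extra memory is why it can be on by default without the sorting cost the median caveat mentions.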

@ManuLinares
Member Author

I tweaked all stdlib benchmarks to target ~300ms execution time

[image: benchmark results after the stdlib tweaks]

@lerno
Collaborator

lerno commented Mar 7, 2026

I just fixed this. You should revert your changes.

@lerno
Collaborator

lerno commented Mar 7, 2026

(And merge with the latest)

@ManuLinares ManuLinares force-pushed the benchmark-qol-clean branch from 38e9c80 to 4f00ea5 Compare March 7, 2026 00:17
@lerno
Collaborator

lerno commented Mar 7, 2026

Out of scope for this one, but the benchmarking progress indicator is an old thing that works poorly, and I think it should be removed anyway:

```
Benchmarking crypto_hash_benchmarks::streebog_512_1mib .............. [####################] 980 / 1024 (96%)
Benchmarking crypto_hash_benchmarks::streebog_512_1mib .............. [####################] 990 / 1024 (97%)
Benchmarking crypto_hash_benchmarks::streebog_512_1mib .............. [####################] 1000 / 1024 (98%)
Benchmarking crypto_hash_benchmarks::streebog_512_1mib .............. [####################] 1010 / 1024 (99%)
Benchmarking crypto_hash_benchmarks::streebog_512_1mib .............. [####################] 1020 / 1024 (100%)
Benchmarking crypto_hash_benchmarks::streebog_512_1mib .............. [COMPLETE] 5.44 milliseconds, 5437982.00 CPU clocks, 1024 iterations (runtime 5.57 seconds)
```

@lerno
Collaborator

lerno commented Mar 7, 2026

Another point of improvement would be to add a symbol for standard deviation instead.

@lerno lerno merged commit 9e2fea9 into c3lang:master Mar 7, 2026
21 checks passed
@ManuLinares ManuLinares deleted the benchmark-qol-clean branch March 7, 2026 16:30