tee: increase buf size for large input#11441

Open
oech3 wants to merge 1 commit into uutils:main from oech3:tee-pv

Conversation

@oech3
Contributor

@oech3 oech3 commented Mar 21, 2026

taskset -c 1 yes|taskset -c 2 target/release/tee|taskset -c 3 pv>/dev/null

This PR: [3.83GiB/s]
gnu58d88d243/tee: [2.83GiB/s]
closes #11432

@xtqqczze
Contributor

The existing code is documented as using the same DEFAULT_BUF_SIZE as the Rust standard library for the stack-allocated array.

@xtqqczze
Contributor

GNU coreutils uses IO_BUFSIZE = 256 * 1024, though we have overhead from zero-initialization.
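
To make the buffer-size trade-off concrete, here is a minimal sketch of a read/write loop over a caller-supplied stack buffer. It is not the code in uutils `tee`; `copy_with_buf`, `STD_SIZED_BUF`, and `GNU_SIZED_BUF` are hypothetical names, and the 8 KiB figure assumes the value `DEFAULT_BUF_SIZE` has on most platforms.

```rust
use std::io::{self, Read, Write};

// Hypothetical constants for illustration only.
const STD_SIZED_BUF: usize = 8 * 1024; // assumed std DEFAULT_BUF_SIZE on most platforms
const GNU_SIZED_BUF: usize = 256 * 1024; // IO_BUFSIZE value quoted in this thread

/// Copy everything from `input` to `output` through `buf` and return the
/// number of bytes copied. `buf` is assumed to be non-empty.
fn copy_with_buf(
    input: &mut impl Read,
    output: &mut impl Write,
    buf: &mut [u8],
) -> io::Result<u64> {
    let mut total = 0u64;
    loop {
        let n = match input.read(buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            // Retry reads that were interrupted by a signal.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        output.write_all(&buf[..n])?;
        total += n as u64;
    }
    output.flush()?;
    Ok(total)
}
```

A caller would declare the buffer as, for example, `let mut buf = [0u8; GNU_SIZED_BUF];`. That expression zero-fills the whole array before the first read, which is the initialization cost mentioned above: it is paid once per run and grows with the buffer size, so it matters most when the input is short.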

@oech3
Contributor Author

oech3 commented Mar 21, 2026

The optimal value might differ from machine to machine. 64*1024 was best for me.

@xtqqczze
Contributor

taskset -c 1 yes|taskset -c 2 target/release/tee|taskset -c 3 pv>/dev/null

This measures throughput in the ideal scenario for a large buffer. We should ensure that performance does not regress in other cases where the output is small.

@oech3
Contributor Author

oech3 commented Mar 21, 2026

The time for small input is bounded, but I could use a smaller buffer for the first cycle only.
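
One way to read this suggestion is sketched below: the first read goes through a small buffer, and the large buffer is only set up if the input did not end there. This is an illustration under assumptions, not the code in this PR; `copy_two_phase`, `read_retry`, `SMALL_BUF`, and `LARGE_BUF` are hypothetical names, and the sizes are the 8 KiB and 64 KiB values mentioned in the thread.

```rust
use std::io::{self, Read, Write};

const SMALL_BUF: usize = 8 * 1024; // assumed size for the first cycle
const LARGE_BUF: usize = 64 * 1024; // assumed size for all later cycles

/// Read once into `buf`, retrying if the call is interrupted by a signal.
fn read_retry(input: &mut impl Read, buf: &mut [u8]) -> io::Result<usize> {
    loop {
        match input.read(buf) {
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            other => return other,
        }
    }
}

/// Copy `input` to `output`, using a small buffer for the first cycle and a
/// large buffer afterwards. Returns the number of bytes copied.
fn copy_two_phase(input: &mut impl Read, output: &mut impl Write) -> io::Result<u64> {
    // First cycle: only the small array is zero-initialized here.
    let mut small = [0u8; SMALL_BUF];
    let first = read_retry(input, &mut small)?;
    if first == 0 {
        // Empty input: the large buffer is never created.
        output.flush()?;
        return Ok(0);
    }
    output.write_all(&small[..first])?;
    let mut total = first as u64;

    // Later cycles: switch to the large buffer for throughput.
    let mut large = [0u8; LARGE_BUF];
    loop {
        let n = read_retry(input, &mut large)?;
        if n == 0 {
            break;
        }
        output.write_all(&large[..n])?;
        total += n as u64;
    }
    output.flush()?;
    Ok(total)
}
```

In this sketch a short, non-empty input still pays for the large buffer in order to observe end of input. Whether that cost is measurable is exactly what a small-input benchmark would have to show.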

@xtqqczze
Contributor

Please test something like: taskset -c 1 yes | head -c 8192 | taskset -c 2 target/release/tee|taskset -c 3 pv>/dev/null

@oech3 oech3 force-pushed the tee-pv branch 3 times, most recently from 6897cbc to 0a8e582 Compare March 21, 2026 14:30
@oech3
Contributor Author

oech3 commented Mar 21, 2026

I added a code path for smaller input. The results vary between runs of
taskset -c 1 hyperfine --runs 100 -N --warmup 10 -- "target/release/tee8 <in" "target/release/tee64 <in" "target/release/tee8and64 <in"

tee8and64 sometimes wins and sometimes loses against tee8, but tee64 never wins.

@oech3 oech3 force-pushed the tee-pv branch 2 times, most recently from 7c43c1f to dac8b97 Compare March 21, 2026 16:38
@github-actions

GNU testsuite comparison:

Skip an intermittent issue tests/date/resolution (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tail/symlink (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/tail/inotify-dir-recreate (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/rm/many-dir-entries-vs-OOM is now passing!

@github-actions

GNU testsuite comparison:

Skip an intermittent issue tests/date/date-locale-hour (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tail/symlink (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/tail/follow-name (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/tty/tty-eof (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/tail/pipe-f2 is no longer failing!
Note: The gnu test tests/cut/bounded-memory is now being skipped but was previously passing.
Note: The gnu test tests/expand/bounded-memory is now being skipped but was previously passing.

@sylvestre
Contributor

needs to be rebased

@oech3 oech3 changed the title from "tee: increase buf size for |pv>/dev/null" to "tee: increase buf size for large input" Mar 24, 2026
@github-actions

GNU testsuite comparison:

Skipping an intermittent issue tests/cut/bounded-memory (passes in this run but fails in the 'main' branch)
Note: The gnu test tests/tail/tail-n0f is now being skipped but was previously passing.

@github-actions

GNU testsuite comparison:

Skipping an intermittent issue tests/date/date-locale-hour (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/tty/tty-eof (passes in this run but fails in the 'main' branch)

@xtqqczze
Contributor

I added a code path for smaller input.

@oech3 Could you quantify the performance improvement from the additional code path to justify the added complexity?

@oech3
Contributor Author

oech3 commented Mar 24, 2026

This PR was opened for larger files. For small files there is mostly no performance change.

@sylvestre
Contributor

Please add a benchmark in a different PR.
Once it has landed, we will retrigger this one to evaluate the performance impact.
Thanks.

@github-actions

GNU testsuite comparison:

Skipping an intermittent issue tests/cut/bounded-memory (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/date/date-locale-hour (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/pr/bounded-memory (passes in this run but fails in the 'main' branch)

@oech3
Contributor Author

oech3 commented Mar 24, 2026

Is it equivalent to

truncate -s SIZE b
time tee < b >/dev/null

?
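
For reference, the two shell steps above can be approximated from Rust with only the standard library. This is a rough sketch, not the benchmark setup used by the project, which the thread does not describe. `make_sparse_input` and `time_with_stdin` are hypothetical helpers, and the program name passed in is the caller's choice.

```rust
use std::fs::File;
use std::io;
use std::path::Path;
use std::process::{Command, Stdio};
use std::time::{Duration, Instant};

/// Create (or truncate) a file and set its length to `size` bytes,
/// similar to `truncate -s SIZE path`.
fn make_sparse_input(path: &Path, size: u64) -> io::Result<()> {
    File::create(path)?.set_len(size)
}

/// Run `program` with `input` connected to its stdin and its stdout
/// discarded, similar to `program < input > /dev/null`, and return the
/// wall-clock time of the whole run (including process start-up).
fn time_with_stdin(program: &str, input: &Path) -> io::Result<Duration> {
    let stdin = File::open(input)?;
    let start = Instant::now();
    let status = Command::new(program)
        .stdin(Stdio::from(stdin)) // the child reads the file as its stdin
        .stdout(Stdio::null()) // equivalent of redirecting to /dev/null
        .status()?;
    let elapsed = start.elapsed();
    if !status.success() {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            format!("{program} exited with {status}"),
        ));
    }
    Ok(elapsed)
}
```

A single timing like this includes process start-up and is noisy, so a real comparison would repeat the run many times, as the hyperfine invocation earlier in the thread does.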

@oech3
Contributor Author

oech3 commented Mar 24, 2026

How do I use stdin in a benchmark?

Development

Successfully merging this pull request may close these issues.

tee: slower than GNU

3 participants