poll: Write back only revents to userspace, matching Linux behavior #12851

Open
tanyifeng wants to merge 1 commit into google:master from tanyifeng:poll-revents-writeback

Conversation

@tanyifeng
Contributor

@tanyifeng tanyifeng commented Apr 2, 2026

doPoll() was modifying pfd.Events (adding POLLHUP|POLLERR) and writing back the entire pollfd struct. Linux only writes back revents via unsafe_put_user() in do_sys_poll(). The polluted events field broke libevent's poll_del(), causing a busy-loop in tmux (CPU 96.6%).
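
To make the failure mode concrete, here is a minimal Go sketch of how a poll-based event loop might decide whether an fd can be dropped from its pollfd array. `delEvents` is a hypothetical, simplified model and not libevent's actual `poll_del()`; the constants use the Linux values for the poll bits. The assumption is that the loop clears only the bits it set itself and keeps the entry while any bit remains.

```go
package main

import "fmt"

// Poll event bits, using the values Linux defines.
const (
	pollIn  int16 = 0x0001
	pollErr int16 = 0x0008
	pollHup int16 = 0x0010
)

// pollFD mirrors the field layout of struct pollfd.
type pollFD struct {
	FD      int32
	Events  int16
	REvents int16
}

// delEvents is a hypothetical model of an event loop removing interest in
// an fd: it clears the bits the loop set itself and reports whether the
// entry is now empty and can be removed from the pollfd array.
func delEvents(pfd *pollFD, interest int16) bool {
	pfd.Events &^= interest
	return pfd.Events == 0
}

func main() {
	// Expected behavior: the kernel never touches Events.
	clean := pollFD{FD: 3, Events: pollIn}
	fmt.Println("clean entry removable:", delEvents(&clean, pollIn))

	// Behavior described in this PR: Events came back with extra bits
	// that the application never set and therefore never clears.
	polluted := pollFD{FD: 3, Events: pollIn | pollHup | pollErr}
	fmt.Println("polluted entry removable:", delEvents(&polluted, pollIn))
}
```

In this model the polluted entry stays in the array, so the next poll() call keeps reporting the fd, which is consistent with the busy-loop seen in tmux.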

Fix: use an internal mask for POLLHUP|POLLERR in initReadiness() and pollBlock(); write back only the 2-byte revents field per fd in doPoll().
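
As an illustration of the write-back part of the fix, the following self-contained Go sketch copies out only the 2-byte revents field per entry. It is not the gVisor code: `pollFD`, `writeBackREvents`, and the `mem` byte slice (standing in for the caller's memory) are names made up for this example, and a little-endian layout is assumed.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unsafe"
)

// pollFD mirrors the field layout of struct pollfd.
type pollFD struct {
	FD      int32
	Events  int16
	REvents int16
}

var (
	// sizeofPollFD is the size of one entry in the array.
	sizeofPollFD = int(unsafe.Sizeof(pollFD{}))
	// reventsOffset is the byte offset of REvents within an entry.
	reventsOffset = int(unsafe.Offsetof(pollFD{}.REvents))
)

// writeBackREvents copies only the REvents field of each entry into mem, a
// byte slice standing in for the caller's pollfd array (assumed to be
// little-endian). The FD and Events bytes are left untouched.
func writeBackREvents(mem []byte, pfds []pollFD) {
	for i, pfd := range pfds {
		off := i*sizeofPollFD + reventsOffset
		binary.LittleEndian.PutUint16(mem[off:off+2], uint16(pfd.REvents))
	}
}

func main() {
	// Caller memory holding one entry: fd=3, events=POLLIN, revents=0.
	mem := make([]byte, sizeofPollFD)
	binary.LittleEndian.PutUint32(mem[0:4], 3)
	binary.LittleEndian.PutUint16(mem[4:6], 0x0001)

	// Private copy after polling; its Events carries the extra
	// POLLERR|POLLHUP bits, and REvents reports POLLHUP.
	private := []pollFD{{FD: 3, Events: 0x0001 | 0x0008 | 0x0010, REvents: 0x0010}}
	writeBackREvents(mem, private)

	fmt.Printf("events=%#x revents=%#x\n",
		binary.LittleEndian.Uint16(mem[4:6]),
		binary.LittleEndian.Uint16(mem[6:8]))
}
```

Because only bytes 6 and 7 of each entry are written, whatever happens to Events on the private copy cannot reach the caller.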

@tanyifeng
Contributor Author

Reproduction

# In gVisor (runsc) — CPU spikes to ~95%
docker run -d --name tmux-test --runtime=runsc ubuntu:22.04 sleep 3600
docker exec tmux-test bash -c "apt-get update -qq && apt-get install -y -qq tmux procps >/dev/null 2>&1"
docker exec tmux-test bash -c "tmux new-session -d -s test && sleep 30 && ps aux | grep tmux"
# tmux: 94.7% CPU

# In Linux (runc) — CPU stays at 0%
docker run -d --name tmux-ctrl --runtime=runc ubuntu:22.04 sleep 3600
docker exec tmux-ctrl bash -c "apt-get update -qq && apt-get install -y -qq tmux procps >/dev/null 2>&1"
docker exec tmux-ctrl bash -c "tmux new-session -d -s test && sleep 30 && ps aux | grep tmux"
# tmux: 0.0% CPU

doPoll() was modifying pfd.Events (adding POLLHUP|POLLERR) and writing
back the entire pollfd struct. Linux only writes back revents via
unsafe_put_user() in do_sys_poll(). The polluted events field broke
libevent's poll_del(), causing a busy-loop in tmux (CPU 96.6%).

Fix: use an internal mask for POLLHUP|POLLERR in initReadiness() and
pollBlock(); write back only the 2-byte revents field per fd in doPoll().

Signed-off-by: Tan Yifeng <yiftan@tencent.com>
@tanyifeng tanyifeng force-pushed the poll-revents-writeback branch from 7cc0267 to 6529fb0 Compare April 7, 2026 06:41
// the caller's events mask (e.g. libevent's poll backend),
// causing busy-loops when event_del() fails to fully remove
// an fd from the pollfd array due to stale POLLHUP/POLLERR bits.
//
Collaborator

nit: empty comment line


// reventsOffsetInFD is the byte offset of the REvents field within
// linux.PollFD.
reventsOffsetInFD = int(unsafe.Offsetof(linux.PollFD{}.REvents))
Collaborator

nit: reventsOffsetInPollFD

Comment on lines +87 to +89
// POLLERR in addition to the caller-requested events. We add these
// to a local mask rather than modifying pfd.Events, because Linux
// never writes back the events field to userspace (only revents).
Collaborator

This reasoning sounds backwards. If Linux (and us now) only write back pfd.REvents, then why do we need to create a local mask rather than modifying pfd.Events? IMO we should revert this logic and similar changes on lines 170-172 and only rely on the change which writes back only REvents. IIUC, that is sufficient to fix the bug (and we avoid duplicating logic to add linux.POLLHUP | linux.POLLERR).

The following comment:

	// Compatibility warning: Linux adds POLLHUP and POLLERR just before
	// polling, in fs/select.c:do_pollfd(). Since pfd is copied out after
	// polling, changing event masks here is an application-visible difference.
	// (Linux also doesn't copy out event masks at all, only revents.)

can be simplified to:

	// Linux adds POLLHUP and POLLERR just before polling, in
	// fs/select.c:do_pollfd(). We can modify pfd[i].Events because
	// it is not copied out after polling (consistent with Linux).
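
A rough Go sketch of the approach suggested in this comment, for illustration only: `pollOnce` and its `ready` callback are hypothetical stand-ins for the real poll path, and the `user` slice stands in for the caller's array. Events is widened on the private copy, and only REvents is copied out.

```go
package main

import "fmt"

// Poll event bits, using the values Linux defines.
const (
	pollIn  int16 = 0x0001
	pollErr int16 = 0x0008
	pollHup int16 = 0x0010
)

// pollFD mirrors the field layout of struct pollfd.
type pollFD struct {
	FD      int32
	Events  int16
	REvents int16
}

// pollOnce is a hypothetical stand-in for the poll path. It copies the
// caller's entries in, widens Events on the private copy, computes REvents
// from the ready callback, and copies out REvents only.
func pollOnce(user []pollFD, ready func(fd int32) int16) {
	private := make([]pollFD, len(user))
	copy(private, user) // copy-in
	for i := range private {
		// Safe to modify: this Events value is never copied out.
		private[i].Events |= pollHup | pollErr
		private[i].REvents = ready(private[i].FD) & private[i].Events
	}
	for i := range private {
		user[i].REvents = private[i].REvents // copy-out: REvents only
	}
}

func main() {
	user := []pollFD{{FD: 3, Events: pollIn}}
	// Pretend the peer of fd 3 has hung up.
	pollOnce(user, func(int32) int16 { return pollHup })
	fmt.Printf("events=%#x revents=%#x\n", user[0].Events, user[0].REvents)
}
```

Under these assumptions the caller still sees POLLHUP in revents while its events field stays exactly as it was set, so no separate local mask is needed.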

3 participants