Epoll optimizations #104

Merged
sgerbino merged 5 commits into cppalliance:develop from sgerbino:pr/epoll-opt-1
Feb 4, 2026

Conversation

@sgerbino
Collaborator

@sgerbino sgerbino commented Feb 4, 2026

Summary by CodeRabbit

  • Performance Improvements
    • Faster local posting: per-thread private queues avoid locking when posting from the same thread.
    • Private queues are drained into the global queue in batches to reduce contention.
    • The scheduler no longer holds the mutex while processing local/private work; timers are processed only when the timerfd signals.
    • Threads are woken selectively, based on the actual outstanding work, to reduce unnecessary wakeups.

@coderabbitai

coderabbitai bot commented Feb 4, 2026

📝 Walkthrough

Adds per-thread private work queues and bookkeeping to the epoll scheduler. This enables a same-thread fast path for posting, drains private queues into the global completed-ops queue, wakes only the threads that are actually needed, and defers timerfd processing until a timer event occurs.

Changes

Scheduler Implementation (src/corosio/src/detail/epoll/scheduler.cpp)
Added scheduler_context fields private_queue and private_outstanding_work plus constructor; implemented find_context(); fast-path branches in post(capy::coro) and post(scheduler_op*) to enqueue to per-thread private queue without locking; added drain_thread_queue() and adjusted run_reactor, do_one, and wake logic to account for private queues and selective wakeups.

Scheduler Header (src/corosio/src/detail/epoll/scheduler.hpp)
Declared void drain_thread_queue(op_queue& queue, long count) const; in epoll_scheduler. Minor comment/doc adjustments.

Behavioral/Control Flow (implicit across files)
Shifted timer processing to occur only when timerfd signals; moved more work processing off the global mutex by handling private and local queues before splicing into completed_ops; generalized termination/wake semantics to include private-queue work.
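
A minimal sketch of the same-thread fast path described above. find_context(), private_queue, private_outstanding_work, completed_ops_, mutex_, and wake_one_thread_and_unlock() all appear in this PR; the exact member layout and op_queue API are assumptions for illustration.

void
epoll_scheduler::
post(scheduler_op* op)
{
    // Fast path: posting from a thread already running this scheduler.
    // find_context() consults a thread-local stack of active contexts.
    if (scheduler_context* ctx = find_context())
    {
        // No mutex, no wakeup: the op stays in the per-thread private
        // queue until a drain point splices it into completed_ops_.
        ctx->private_queue.push(op);
        ++ctx->private_outstanding_work;
        return;
    }

    // Slow path: a cross-thread post must lock and wake an idle thread.
    std::unique_lock lock(mutex_);
    completed_ops_.push(op);
    wake_one_thread_and_unlock(lock);
}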

Sequence Diagram

sequenceDiagram
    participant Thread as Thread (posting/run)
    participant Context as scheduler_context (private)
    participant Scheduler as epoll_scheduler (global)
    participant GlobalQ as completed_ops (global queue)
    participant Other as Other Thread

    Note over Thread,Context: Same-thread fast-path
    Thread->>Context: post(work)
    Context->>Context: enqueue -> private_queue
    Context->>Context: ++private_outstanding_work
    Note over Context: no mutex, no wake (fast)

    Note over Other,Scheduler: Cross-thread slow-path
    Other->>Scheduler: post(work)
    Scheduler->>Scheduler: lock mutex
    Scheduler->>GlobalQ: enqueue completed_ops
    Scheduler->>Scheduler: unlock mutex
    Scheduler->>Other: wake necessary idle threads

    Note over Thread,Scheduler: Drain during run_reactor
    Thread->>Context: drain_thread_queue()
    Context->>Scheduler: lock mutex
    Context->>GlobalQ: splice private_queue into completed_ops
    Scheduler->>Context: unlock mutex
    Thread->>Scheduler: process completed_ops and timers (timers only if timerfd fired)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • Epoll #101: Overlaps changes to epoll scheduler reactor loop, wake/do_one logic, and selective wakeup semantics.

Poem

🐰 I hopped a queue, tucked work inside,
Threads can post where they reside.
Private paths make locking light,
Timers knock when epoll's right,
Wake just few — we saved a flight! ✨

🚥 Pre-merge checks: ✅ 2 passed

Description check: ✅ Passed (check skipped; CodeRabbit’s high-level summary is enabled).
Title check: ✅ Passed. The title 'Epoll optimizations' accurately describes the main changes: performance improvements to the epoll scheduler through private queues and fast-path logic.

@codecov

codecov bot commented Feb 4, 2026

Codecov Report

❌ Patch coverage is 70.76923% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.42%. Comparing base (d3f13ac) to head (5bce71e).
⚠️ Report is 5 commits behind head on develop.

Files with missing lines:
src/corosio/src/detail/epoll/scheduler.cpp: 70.76% patch coverage, 19 lines missing ⚠️

@@             Coverage Diff             @@
##           develop     #104      +/-   ##
===========================================
- Coverage    80.48%   80.42%   -0.06%     
===========================================
  Files           64       64              
  Lines         5344     5401      +57     
===========================================
+ Hits          4301     4344      +43     
- Misses        1043     1057      +14     
Files with missing lines (coverage Δ):
src/corosio/src/detail/epoll/scheduler.hpp: 0.00% <ø> (ø)
src/corosio/src/detail/epoll/scheduler.cpp: 78.45% <70.76%> (-0.71%) ⬇️

... and 1 file with indirect coverage changes


Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@cppalliance-bot

cppalliance-bot commented Feb 4, 2026

An automated preview of the documentation is available at https://104.corosio.prtest3.cppalliance.org/index.html

If more commits are pushed to the pull request, the docs will rebuild at the same URL.

2026-02-04 16:38:41 UTC

@cppalliance-bot

cppalliance-bot commented Feb 4, 2026

GCOVR code coverage report https://104.corosio.prtest3.cppalliance.org/gcovr/index.html
LCOV code coverage report https://104.corosio.prtest3.cppalliance.org/genhtml/index.html
Coverage Diff Report https://104.corosio.prtest3.cppalliance.org/diff-report/index.html

Build time: 2026-02-04 16:40:10 UTC


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/corosio/src/detail/epoll/scheduler.cpp (1)

589-626: ⚠️ Potential issue | 🟡 Minor

Update cached expiry only after timerfd_settime succeeds.

If timerfd_settime throws, the cache is already updated, so later calls may early-return even though the fd wasn’t re-armed.

🔧 Suggested adjustment
-    last_timerfd_expiry_ = nearest;
-
     itimerspec ts{};
     int flags = 0;
@@
     if (::timerfd_settime(timer_fd_, flags, &ts, nullptr) < 0)
         detail::throw_system_error(make_err(errno), "timerfd_settime");
+
+    last_timerfd_expiry_ = nearest;
🤖 Fix all issues with AI agents
In `@src/corosio/src/detail/epoll/scheduler.cpp`:
- Around line 660-664: The epoll loop currently sets check_timers = true when
events[i].data.ptr == &timer_fd_ but does not consume the timerfd, which can
leave it readable and cause epoll to spin; modify the epoll handling in the
scheduler loop (where events[i].data.ptr is compared to &timer_fd_) to perform a
non-blocking read of timer_fd_ (read the uint64_t expirations) and ignore
EAGAIN/EWOULDBLOCK errors, logging or handling other errors as appropriate, so
the timerfd is cleared and epoll_wait won't repeatedly wake immediately.
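
A self-contained sketch of the drain this prompt asks for, assuming timer_fd_ was created with TFD_NONBLOCK; the helper name consume_timerfd is hypothetical.

#include <cerrno>
#include <cstdint>
#include <unistd.h>

// Hypothetical helper: a level-triggered timerfd stays readable until the
// 8-byte expiration count is read, so drain it before the next epoll_wait.
static bool consume_timerfd(int fd)
{
    std::uint64_t expirations = 0;
    ssize_t n = ::read(fd, &expirations, sizeof(expirations));
    if (n == static_cast<ssize_t>(sizeof(expirations)))
        return true; // the timer fired at least once
    // EAGAIN/EWOULDBLOCK: another thread already consumed the expiry.
    return false;
}

In the loop, the branch comparing events[i].data.ptr to &timer_fd_ would then read something like check_timers = consume_timerfd(timer_fd_); whether to still check timers on EAGAIN is a design choice, since the thread that consumed the expiry will do so.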

In `@src/corosio/src/detail/epoll/scheduler.hpp`:
- Around line 170-171: The mutable field last_timerfd_expiry_ is updated from
multiple contexts (timer_service callback and reactor) and is not thread-safe;
protect concurrent access to last_timerfd_expiry_ in update_timerfd() and any
readers by using the existing mutex_ (or a dedicated std::mutex) or switch to an
atomic time representation: acquire mutex_ before reading/writing
last_timerfd_expiry_ in update_timerfd() and where it’s read, or replace
last_timerfd_expiry_ with an atomic-backed value and update it via atomic
operations to eliminate the data race.
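
One shape the atomic alternative could take. Everything here (the struct, the nanosecond representation, the TFD_TIMER_ABSTIME flag) is assumed for illustration, and per the commit messages below the PR ultimately removed the cache instead.

#include <atomic>
#include <chrono>
#include <cstdint>
#include <sys/timerfd.h>

// Hypothetical atomic variant of the expiry cache. Storing the expiry as a
// raw nanosecond count lets the timer_service callback and the reactor
// thread read and write it without a data race.
struct timerfd_expiry_cache
{
    int timer_fd;
    std::atomic<std::int64_t> last_expiry_ns{-1};

    void update(std::chrono::steady_clock::time_point nearest)
    {
        std::int64_t const ns = std::chrono::duration_cast<
            std::chrono::nanoseconds>(nearest.time_since_epoch()).count();
        if (last_expiry_ns.load(std::memory_order_acquire) == ns)
            return; // fd already armed for this expiry
        itimerspec ts{};
        ts.it_value.tv_sec  = ns / 1'000'000'000;
        ts.it_value.tv_nsec = ns % 1'000'000'000;
        if (::timerfd_settime(timer_fd, TFD_TIMER_ABSTIME, &ts, nullptr) < 0)
            return; // leave the cache stale; the caller reports the error
        // Publish only after re-arming succeeds, per the ordering issue
        // flagged earlier in this review.
        last_expiry_ns.store(ns, std::memory_order_release);
    }
};

Two threads can still race past the load and both call timerfd_settime, but since both arm the same expiry the result is benign; the data race on the cached value itself is gone.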

@cppalliance-bot

GCOVR code coverage report https://104.corosio.prtest3.cppalliance.org/gcovr/index.html
LCOV code coverage report https://104.corosio.prtest3.cppalliance.org/genhtml/index.html
Coverage Diff Report https://104.corosio.prtest3.cppalliance.org/diff-report/index.html

Build time: 2026-02-04 16:10:25 UTC

Reduce mutex contention by processing events into a local queue
without holding the mutex. The mutex is only acquired briefly
when splicing completions into the completed_ops_ queue.

Changes:
- Process events into a local op_queue without holding the mutex
- Only acquire mutex for completed_ops_ splice operation
- Add check_timers flag to only process timers when timerfd fires
- Cache last timerfd expiry to skip redundant timerfd_settime calls

Only wake idle threads, and only as many as we have work available.
This prevents waking all threads when only a few completions arrive.

When posting work from within the scheduler's run loop, use a
thread-local queue instead of acquiring the global mutex. This
matches Asio's thread_info::private_op_queue optimization.

- Extend scheduler_context with private_queue and work counter
- Fast path in post() detects same-thread via context_stack
- Drain points: before blocking, after reactor splice, on exit
- Reduces futex calls from ~450K to 1 in multi-threaded benchmarks

- Consume timerfd on expiry to prevent epoll busy-spinning
  (level-triggered fd must be read to clear readable state)

- Remove last_timerfd_expiry_ caching optimization to match Asio
  (eliminates data race between timer callback and reactor thread)
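
A schematic of the drain points named in the commit above ("before blocking, after reactor splice, on exit"). The loop shape and the ctx variable are assumptions; drain_thread_queue() and the member names come from this PR.

// Before blocking (and again after splicing reactor completions, and on
// exit from the run loop), the thread must publish its private queue:
// work left there would be invisible to every other thread.
if (!ctx.private_queue.empty())
{
    long const n = ctx.private_outstanding_work;
    ctx.private_outstanding_work = 0;
    drain_thread_queue(ctx.private_queue, n); // splices into completed_ops_
}
// ... now safe to block in epoll_wait() or on the condition variable ...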

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/corosio/src/detail/epoll/scheduler.cpp`:
- Around line 533-542: The drain_thread_queue implementation uses
wakeup_event_.notify_all() which only wakes condvar waiters and can leave the
reactor unaware when no threads are idle; change drain_thread_queue (and its
call sites if needed) to reuse the scheduler's existing reactor wake/interrupt
path instead of a raw notify_all: after splicing into completed_ops_ (and
preserving outstanding_work_), invoke the scheduler's established wake/interrupt
helper (the same function used elsewhere to interrupt or wake the reactor) so
queued handlers are promptly observed by the reactor; reference
epoll_scheduler::drain_thread_queue, completed_ops_, outstanding_work_, and
wakeup_event_ when locating the change.

Comment on lines +533 to +542
void
epoll_scheduler::
drain_thread_queue(op_queue& queue, long count) const
{
    std::lock_guard lock(mutex_);
    // Note: outstanding_work_ was already incremented when posting
    completed_ops_.splice(queue);
    if (count > 0)
        wakeup_event_.notify_all();
}

⚠️ Potential issue | 🟠 Major

Ensure drained work wakes or interrupts the reactor when no idle threads.

notify_all() only wakes condvar waiters. If no threads are idle and the reactor is waiting, the newly queued handlers can sit until an unrelated wakeup. Consider reusing the existing wake/interrupt logic so queued work is promptly observed.

🛠️ Suggested fix
 void
 epoll_scheduler::
 drain_thread_queue(op_queue& queue, long count) const
 {
-    std::lock_guard lock(mutex_);
+    std::unique_lock lock(mutex_);
     // Note: outstanding_work_ was already incremented when posting
     completed_ops_.splice(queue);
     if (count > 0)
-        wakeup_event_.notify_all();
+        wake_one_thread_and_unlock(lock);
 }

@sgerbino sgerbino merged commit e56eb71 into cppalliance:develop Feb 4, 2026
29 of 31 checks passed
@sgerbino sgerbino deleted the pr/epoll-opt-1 branch February 4, 2026 17:05
@coderabbitai coderabbitai bot mentioned this pull request Feb 6, 2026