diff --git a/BACKLOG.md b/BACKLOG.md new file mode 100644 index 0000000..f5477b9 --- /dev/null +++ b/BACKLOG.md @@ -0,0 +1,94 @@ +# Backlog + +## 1. Kernel CPU accounting under-reports vs hardware PMU counters + +**Priority**: Low (understanding only — use `perf stat` as ground truth) +**Status**: Root cause narrowed, further experiments possible + +### Problem + +The Linux kernel's CPU time accounting (`/proc/pid/stat`, `schedstat`) consistently under-reports +CPU utilization compared to `perf stat` (hardware PMU counters) for the Netty custom scheduler workload. + +**Measured on NETTY_SCHEDULER with 4 carrier threads pinned to 4 physical cores (Ryzen 9 7950X):** + +| Source | CPUs utilized | Notes | +|--------|--------------|-------| +| `perf stat` (PMU task-clock) | **3.96** | Hardware counter ground truth | +| `/proc/pid/stat` (utime+stime) | **3.19** | Kernel accounting | +| `schedstat` (sum of all thread run_ns) | **3.19** | CFS-level accounting | +| `pidstat` process-level | **2.84** | Even lower (pidstat's own sampling) | +| `pidstat` per-thread carrier sum | **2.72** | 4 x ~68% | + +### Key findings + +- **pidstat is NOT lying** — it faithfully reports what the kernel provides +- `/proc/pid/stat` and `schedstat` agree perfectly (both at 3.19 CPUs) +- The kernel itself under-counts by ~0.77 CPUs (19%) vs PMU hardware counters +- This is NOT caused by `CONFIG_TICK_CPU_ACCOUNTING` — the kernel uses + `CONFIG_VIRT_CPU_ACCOUNTING_GEN=y` (full dyntick, precise at context switch boundaries) +- All 31 kernel-visible threads were accounted for — no "hidden threads" from + `-Djdk.trackAllThreads=false` (VTs are invisible to the kernel by design, their CPU time + is charged to carrier threads) + +### Hypotheses for the 0.77 CPU gap + +1. **Kernel scheduling overhead**: `__schedule()` / `finish_task_switch()` runs during + thread transitions. Some of this CPU time may be attributed to idle/swapper rather + than the thread being switched in/out. + +2. 
**Interrupt handling**: hardware interrupts (NIC, timer) steal cycles from the process. + `perf stat` counts all cycles on cores used by the process (task-clock includes time + when PMU is inherited by children or interrupted contexts), while `/proc/stat` only + counts time explicitly attributed to the thread. + +3. **`task-clock` semantics**: `perf stat`'s `task-clock` measures wall-clock time that + at least one thread of the process was running. With 4 threads on 4 cores, task-clock + closely approximates 4.0 * elapsed. This includes interrupt handling time on those cores + that `/proc/stat` charges elsewhere. + +4. **Carrier thread park/unpark transitions**: even with VIRT_CPU_ACCOUNTING_GEN, the + accounting happens at `schedule()` boundaries. CPU cycles consumed during the entry/exit + paths of `LockSupport.park()` (before the actual `schedule()` call and after the wakeup) + may be partially lost. + +### Further experiments (if desired) + +1. **Compare `perf stat -e task-clock` vs `perf stat -e cpu-clock`**: `task-clock` counts + per-thread time, `cpu-clock` counts wall time. If they differ, it reveals interrupt overhead. + +2. **Run with `nohz_full=4-7` (isolated CPUs)**: removes timer tick interrupts from server + cores. If the gap shrinks, interrupt overhead is the cause. + +3. **Spin-wait instead of park**: replace `LockSupport.park()` with `Thread.onSpinWait()` + in `FifoEventLoopScheduler`. If gap shrinks, park/unpark accounting is lossy. + +4. **Check `/proc/interrupts`** delta during benchmark: quantify how many interrupts hit + cores 4-7 and estimate their CPU cost. + +5. **`perf stat` per-thread (`-t TID`)** for each carrier: compare PMU task-clock per + carrier vs schedstat per carrier to see if the gap is evenly distributed. + +### Conclusion + +For benchmarking purposes, **always use `perf stat` as the ground truth** for CPU utilization. 
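For reference, the kernel-accounting figure in the table above (utime+stime from `/proc/pid/stat`) can be re-derived with a short shell sketch. Linux only; the PID and the 1-second window below are illustrative (the current shell) — point it at the server PID during a benchmark run:

```shell
#!/usr/bin/env bash
# Sketch: compute "CPUs utilized" from /proc/<pid>/stat deltas, the same
# kernel-accounting number compared against perf's task-clock above.
pid=$$                          # illustrative; use the server PID in practice
clk_tck=$(getconf CLK_TCK)      # kernel clock ticks per second, usually 100

cpu_ticks() {
  # utime (field 14) + stime (field 15), in ticks. Strip everything through
  # the ") " closing the comm field so spaces in comm can't shift fields.
  awk '{ sub(/.*\) /, ""); print $12 + $13 }' "/proc/$1/stat"
}

t0=$(cpu_ticks "$pid")
sleep 1                         # 1 s sampling window
t1=$(cpu_ticks "$pid")
cpus=$(awk -v d=$((t1 - t0)) -v hz="$clk_tck" 'BEGIN { printf "%.2f", d / hz }')
echo "CPUs utilized (kernel accounting): $cpus"
```

Running this alongside `perf stat -p <pid>` over the same window reproduces the gap directly.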
+pidstat is still useful for relative thread balance analysis and for monitoring non-server +components (mock server, load generator) where the gap is less significant. + +--- + +## 2. Add spin-wait phase before carrier thread parking + +**Priority**: Medium (performance optimization) +**Status**: Not started + +In `FifoEventLoopScheduler.virtualThreadSchedulerLoop()`, the carrier thread parks immediately +when the queue drains. Adding a brief spin-wait phase (e.g., 100-1000 iterations of +`Thread.onSpinWait()`) before calling `LockSupport.park()` could: + +- Reduce wake-up latency for incoming work (avoid kernel schedule/deschedule) +- Reduce context switch count (currently ~20/sec, could go to near-zero) +- Trade-off: slightly higher idle CPU consumption + +### Key file +- `core/src/main/java/io/netty/loom/FifoEventLoopScheduler.java` line ~199 diff --git a/benchmark-runner/README.md b/benchmark-runner/README.md index 99b619e..64b7442 100644 --- a/benchmark-runner/README.md +++ b/benchmark-runner/README.md @@ -49,46 +49,44 @@ The `run-benchmark.sh` script will automatically build the JAR if missing. ## Configuration -All configuration is via environment variables: +Configuration via CLI flags (preferred) or environment variables (fallback). CLI flags take precedence. + +Run `./run-benchmark.sh --help` for the full list. 
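The precedence rule follows from the standard bash pattern: the env var seeds the default via `${VAR:-default}`, then CLI parsing overwrites it. A minimal sketch, reduced to a single flag (names mirror the script, but this parser is a toy):

```shell
#!/usr/bin/env bash
# Demonstrates: flag > env var > built-in default.
unset SERVER_THREADS   # clean slate so the demo output is deterministic

resolve_threads() {
  local threads="${SERVER_THREADS:-2}"        # env var, else built-in default
  while [[ $# -gt 0 ]]; do
    case "$1" in
      --threads) threads="$2"; shift 2 ;;     # CLI flag overrides env
      *) shift ;;
    esac
  done
  echo "$threads"
}

echo "built-in default: $(resolve_threads)"                               # → 2
echo "env only:         $(SERVER_THREADS=8 resolve_threads)"              # → 8
echo "flag beats env:   $(SERVER_THREADS=8 resolve_threads --threads 4)"  # → 4
```

So `SERVER_THREADS=8 ./run-benchmark.sh --threads 4` runs with 4 threads.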
+ +### Server +| CLI flag | Env var | Default | Description | +|----------|---------|---------|-------------| +| `--mode` | `SERVER_MODE` | NON_VIRTUAL_NETTY | Mode: NON_VIRTUAL_NETTY, REACTIVE, VIRTUAL_NETTY, NETTY_SCHEDULER | +| `--threads` | `SERVER_THREADS` | 2 | Number of event loop threads | +| `--mockless` | `SERVER_MOCKLESS` | false | Skip mock server; do Jackson work inline | +| `--io` | `SERVER_IO` | epoll | I/O type: epoll, nio, io_uring | +| `--poller-mode` | `SERVER_POLLER_MODE` | | jdk.pollerMode: 1, 2, or 3 | +| `--fj-parallelism` | `SERVER_FJ_PARALLELISM` | | ForkJoinPool parallelism | +| `--server-cpuset` | `SERVER_CPUSET` | 2,3 | CPU pinning | +| `--jvm-args` | `SERVER_JVM_ARGS` | | Additional JVM arguments | ### Mock Server -| Variable | Default | Description | -|----------|---------|-------------| -| `MOCK_PORT` | 8080 | Mock server port | -| `MOCK_THINK_TIME_MS` | 1 | Simulated processing delay (ms) | -| `MOCK_THREADS` | auto | Number of Netty threads (empty = available processors) | -| `MOCK_TASKSET` | 4,5,6,7 | CPU affinity (e.g., "0-1") | - -### Handoff Server -| Variable | Default | Description | -|----------|---------|-------------| -| `SERVER_PORT` | 8081 | Server port | -| `SERVER_THREADS` | 2 | Number of event loop threads | -| `SERVER_REACTIVE` | false | Use reactive handler with Reactor | -| `SERVER_USE_CUSTOM_SCHEDULER` | false | Use custom Netty scheduler | -| `SERVER_IO` | epoll | I/O type: epoll, nio, or io_uring | -| `SERVER_NO_TIMEOUT` | false | Disable HTTP client timeout | -| `SERVER_TASKSET` | 2,3 | CPU affinity (e.g., "2-5") | -| `SERVER_JVM_ARGS` | | Additional JVM arguments | -| `SERVER_POLLER_MODE` | 3 | jdk.pollerMode value: 1, 2, or 3 | -| `SERVER_FJ_PARALLELISM` | | ForkJoinPool parallelism (empty = JVM default) | +| CLI flag | Env var | Default | Description | +|----------|---------|---------|-------------| +| `--mock-port` | `MOCK_PORT` | 8080 | Mock server port | +| `--mock-think-time` | `MOCK_THINK_TIME_MS` | 1 | 
Simulated processing delay (ms) | +| `--mock-threads` | `MOCK_THREADS` | 1 | Number of Netty threads | +| `--mock-cpuset` | `MOCK_CPUSET` | 4,5 | CPU pinning | ### Load Generator -| Variable | Default | Description | -|----------|---------|-------------| -| `LOAD_GEN_CONNECTIONS` | 100 | Number of connections | -| `LOAD_GEN_THREADS` | 2 | Number of threads | -| `LOAD_GEN_RATE` | | Target rate (empty = max throughput with wrk) | -| `LOAD_GEN_TASKSET` | 0,1 | CPU affinity (e.g., "6-7") | -| `LOAD_GEN_URL` | http://localhost:8081/fruits | Target URL | +| CLI flag | Env var | Default | Description | +|----------|---------|---------|-------------| +| `--connections` | `LOAD_GEN_CONNECTIONS` | 100 | Number of connections | +| `--load-threads` | `LOAD_GEN_THREADS` | 2 | Number of threads | +| `--duration` | `LOAD_GEN_DURATION` | 30s | Test duration | +| `--rate` | `LOAD_GEN_RATE` | | Target rate for wrk2 (omit for max throughput) | +| `--load-cpuset` | `LOAD_GEN_CPUSET` | 0,1 | CPU pinning | ### Timing -| Variable | Default | Description | -|----------|---------|-------------| -| `WARMUP_DURATION` | 10s | Warmup duration (no profiling) | -| `TOTAL_DURATION` | 30s | Total test duration (steady-state must be >= 20s) | -| `PROFILING_DELAY_SECONDS` | 5 | Delay before starting profiling/perf/JFR | -| `PROFILING_DURATION_SECONDS` | 10 | Profiling/perf/JFR duration in seconds | +| CLI flag | Env var | Default | Description | +|----------|---------|---------|-------------| +| `--warmup` | `WARMUP_DURATION` | 10s | Warmup duration | +| `--total-duration` | `TOTAL_DURATION` | 30s | Total test duration (steady-state >= 20s) | ### Profiling | Variable | Default | Description | @@ -153,73 +151,126 @@ perf stat uses `PROFILING_DELAY_SECONDS` and `PROFILING_DURATION_SECONDS`. ## Example Runs -### Basic comparison: custom vs default scheduler +### Choosing CPU pinning with `lscpu -e` + +Good benchmarking requires NUMA-aware CPU pinning. 
Start by inspecting your topology: ```bash -# With custom scheduler -JAVA_HOME=/path/to/jdk \ -SERVER_USE_CUSTOM_SCHEDULER=true \ -./run-benchmark.sh +$ lscpu -e +CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE + 0 0 0 0 0:0:0:0 yes # NUMA 0, physical core 0 + 1 0 0 1 1:1:1:0 yes # NUMA 0, physical core 1 + ... + 8 1 0 8 8:8:8:1 yes # NUMA 1, physical core 8 + ... + 16 0 0 0 0:0:0:0 yes # NUMA 0, SMT sibling of core 0 + 17 0 0 1 1:1:1:0 yes # NUMA 0, SMT sibling of core 1 +``` + +Key rules: +- **Keep all benchmark components on the same NUMA node** to avoid cross-node memory latency +- **Use physical cores only** (avoid SMT siblings) for more stable results +- **Isolate noisy processes** (IDEs, browsers) on the other NUMA node + +Example layout for a 16-core/2-NUMA system with 4 server threads: + +| Component | CPUs | Rationale | +|-----------|------|-----------| +| Load generator | 0-1 | 2 physical cores, enough to saturate | +| Mock server | 2-3 | 2 physical cores for backend simulation | +| Handoff server | 4-7 | 4 physical cores, one per event loop thread | +| Other processes | 8-15 | Isolated on NUMA node 1 | + +### NETTY_SCHEDULER with 4 threads + +```bash +./run-benchmark.sh --mode NETTY_SCHEDULER --threads 4 --io nio \ + --server-cpuset "4-7" --mock-cpuset "2-3" --load-cpuset "0-1" \ + --jvm-args "-Xms8g -Xmx8g" \ + --connections 10000 --load-threads 4 \ + --mock-think-time 30 --mock-threads 4 \ + --perf-stat +``` + +### Analyzing bottlenecks with perf stat + +Use `--perf-stat` to get reliable hardware-level metrics. The `perf-stat.txt` output is the +ground truth for CPU utilization — pidstat per-thread numbers can be misleading with virtual threads. 
-# With default scheduler -JAVA_HOME=/path/to/jdk \ -SERVER_USE_CUSTOM_SCHEDULER=false \ -./run-benchmark.sh ``` +Performance counter stats for process id '95868': + + 39,741,757,754 task-clock # 3.970 CPUs utilized + 806 context-switches # 20.281 /sec + 199,114,762,646 instructions # 1.17 insn per cycle + 1,338,722,757 branch-misses # 3.08% of all branches +``` + +Key metrics to watch: +- **CPUs utilized**: how many cores the server is actually using (3.97 of 4 = fully saturated) +- **Context switches/sec**: lower is better; custom scheduler typically achieves 20-80/sec +- **IPC (insn per cycle)**: higher is better; >1.0 is good, <0.5 suggests memory stalls +- **Branch misses**: >5% suggests unpredictable control flow -### With CPU pinning +If CPUs utilized equals your allocated core count, the server is CPU-bound — add more cores. +If context switches are high (>10K/sec), the scheduler or OS is thrashing. + +pidstat is still useful for spotting **mock server or load generator bottlenecks** — +check `pidstat-mock.log` and `pidstat-loadgen.log` to ensure they aren't saturated. 
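The rules of thumb above can be applied mechanically. A sketch that pulls the headline numbers out of a `perf-stat.txt`-style file — the sample is the run shown earlier, and the awk field positions assume perf's usual annotated column layout:

```shell
#!/usr/bin/env bash
# Extract "CPUs utilized" and context-switch rate from perf stat output.
perf_stat=$(cat <<'EOF'
 Performance counter stats for process id '95868':

     39,741,757,754      task-clock          #    3.970 CPUs utilized
                806      context-switches    #   20.281 /sec
    199,114,762,646      instructions        #    1.17  insn per cycle
      1,338,722,757      branch-misses       #    3.08% of all branches
EOF
)

cpus=$(awk '/CPUs utilized/     { print $(NF-2) }' <<<"$perf_stat")
csps=$(awk '/context-switches/  { print $(NF-1) }' <<<"$perf_stat")

echo "CPUs utilized: $cpus, context switches/sec: $csps"
# Apply the thrashing rule of thumb from the list above (>10K/sec is bad).
awk -v cs="$csps" 'BEGIN { if (cs > 10000) print "WARN: scheduler/OS thrashing" }'
```

In a real run, point the `cat` at `perf-stat.txt` in the output directory instead of the here-doc sample.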
+ +### NON_VIRTUAL_NETTY (default mode) ```bash -JAVA_HOME=/path/to/jdk \ -MOCK_TASKSET="0" \ -SERVER_TASKSET="1-4" \ -LOAD_GEN_TASKSET="5-7" \ -SERVER_THREADS=4 \ -SERVER_USE_CUSTOM_SCHEDULER=true \ -./run-benchmark.sh +./run-benchmark.sh --threads 4 \ + --server-cpuset "4-7" --mock-cpuset "2-3" --load-cpuset "0-1" \ + --connections 10000 --mock-think-time 30 ``` -### With profiling +### VIRTUAL_NETTY mode ```bash -JAVA_HOME=/path/to/jdk \ -ENABLE_PROFILER=true \ -ASYNC_PROFILER_PATH=/path/to/async-profiler \ -PROFILER_EVENT=cpu \ -SERVER_USE_CUSTOM_SCHEDULER=true \ -WARMUP_DURATION=15s \ -TOTAL_DURATION=45s \ -./run-benchmark.sh +./run-benchmark.sh --mode VIRTUAL_NETTY --threads 4 --io nio \ + --server-cpuset "4-7" --mock-cpuset "2-3" --load-cpuset "0-1" \ + --connections 10000 --mock-think-time 30 ``` -### With JFR events enabled (subset) +### Mockless mode (skip HTTP call to mock, inline Jackson work) ```bash -JAVA_HOME=/path/to/jdk \ -ENABLE_JFR=true \ -JFR_EVENTS=NettyRunIo,VirtualThreadTaskRuns \ -SERVER_USE_CUSTOM_SCHEDULER=true \ -./run-benchmark.sh +./run-benchmark.sh --mode NETTY_SCHEDULER --threads 4 --mockless \ + --server-cpuset "4-7" --load-cpuset "0-1" \ + --connections 10000 +``` + +### With async-profiler + +```bash +./run-benchmark.sh --mode NETTY_SCHEDULER --threads 4 \ + --server-cpuset "4-7" --mock-cpuset "2-3" --load-cpuset "0-1" \ + --profiler --profiler-path /path/to/async-profiler \ + --warmup 15s --total-duration 45s ``` ### Rate-limited test with wrk2 ```bash -JAVA_HOME=/path/to/jdk \ -LOAD_GEN_RATE=10000 \ -LOAD_GEN_CONNECTIONS=200 \ -TOTAL_DURATION=60s \ -WARMUP_DURATION=15s \ -./run-benchmark.sh +./run-benchmark.sh --mode NETTY_SCHEDULER --threads 4 \ + --server-cpuset "4-7" --mock-cpuset "2-3" --load-cpuset "0-1" \ + --rate 120000 --connections 10000 --total-duration 60s --warmup 15s ``` -### With pidstat monitoring +### With JFR events ```bash -JAVA_HOME=/path/to/jdk \ -ENABLE_PIDSTAT=true \ -PIDSTAT_INTERVAL=1 \ -./run-benchmark.sh 
+./run-benchmark.sh --mode NETTY_SCHEDULER --threads 4 \ + --server-cpuset "4-7" --mock-cpuset "2-3" --load-cpuset "0-1" \ + --jfr --jfr-events NettyRunIo,VirtualThreadTaskRuns +``` + +### Mixed: CLI flags + env vars + +```bash +SERVER_JVM_ARGS="-XX:+PrintGCDetails" ./run-benchmark.sh --mode VIRTUAL_NETTY --threads 2 ``` ## Output @@ -260,7 +311,7 @@ java -cp benchmark-runner/target/benchmark-runner.jar \ 8080 1 # port, thinkTimeMs (threads defaults to available processors) ``` -### Handoff Server (with custom scheduler) +### Handoff Server (custom scheduler mode) ```bash java \ @@ -275,11 +326,11 @@ java \ --port 8081 \ --mock-url http://localhost:8080/fruits \ --threads 2 \ - --use-custom-scheduler true \ + --mode netty_scheduler \ --io epoll ``` -### Handoff Server (with default scheduler) +### Handoff Server (default split topology) ```bash java \ @@ -292,6 +343,5 @@ java \ --port 8081 \ --mock-url http://localhost:8080/fruits \ --threads 2 \ - --use-custom-scheduler false \ --io epoll ``` diff --git a/benchmark-runner/scripts/run-benchmark.sh b/benchmark-runner/scripts/run-benchmark.sh index 40944f7..d437db5 100755 --- a/benchmark-runner/scripts/run-benchmark.sh +++ b/benchmark-runner/scripts/run-benchmark.sh @@ -25,22 +25,21 @@ JAVA_OPTS="${JAVA_OPTS:--Xms1g -Xmx1g}" MOCK_PORT="${MOCK_PORT:-8080}" MOCK_THINK_TIME_MS="${MOCK_THINK_TIME_MS:-1}" MOCK_THREADS="${MOCK_THREADS:-1}" -MOCK_TASKSET="${MOCK_TASKSET:-4,5}" # CPUs for mock server +MOCK_CPUSET="${MOCK_CPUSET:-4,5}" # CPUs for mock server # Handoff server configuration SERVER_PORT="${SERVER_PORT:-8081}" SERVER_THREADS="${SERVER_THREADS:-2}" -SERVER_USE_CUSTOM_SCHEDULER="${SERVER_USE_CUSTOM_SCHEDULER:-false}" SERVER_IO="${SERVER_IO:-epoll}" -SERVER_TASKSET="${SERVER_TASKSET:-2,3}" # CPUs for handoff server +SERVER_CPUSET="${SERVER_CPUSET:-2,3}" # CPUs for handoff server SERVER_JVM_ARGS="${SERVER_JVM_ARGS:-}" -SERVER_POLLER_MODE="${SERVER_POLLER_MODE:-3}" # jdk.pollerMode value (1, 2, or 3) 
+SERVER_POLLER_MODE="${SERVER_POLLER_MODE:-}" # jdk.pollerMode value (1, 2, or 3); empty = JVM default, NETTY_SCHEDULER defaults to 3
 SERVER_FJ_PARALLELISM="${SERVER_FJ_PARALLELISM:-}" # ForkJoinPool parallelism (empty = JVM default)
-SERVER_NO_TIMEOUT="${SERVER_NO_TIMEOUT:-false}" # Disable HTTP client timeout
-SERVER_REACTIVE="${SERVER_REACTIVE:-false}" # Use reactive handler with Project Reactor
+SERVER_MODE="${SERVER_MODE:-NON_VIRTUAL_NETTY}" # Server mode: NON_VIRTUAL_NETTY, REACTIVE, VIRTUAL_NETTY, NETTY_SCHEDULER
+SERVER_MOCKLESS="${SERVER_MOCKLESS:-false}" # Skip mock server; do Jackson work inline

 # Load generator configuration
-LOAD_GEN_TASKSET="${LOAD_GEN_TASKSET:-0,1}" # CPUs for load generator
+LOAD_GEN_CPUSET="${LOAD_GEN_CPUSET:-0,1}" # CPUs for load generator
 LOAD_GEN_CONNECTIONS="${LOAD_GEN_CONNECTIONS:-100}"
 LOAD_GEN_THREADS="${LOAD_GEN_THREADS:-2}"
 LOAD_GEN_DURATION="${LOAD_GEN_DURATION:-30s}"
@@ -57,6 +56,7 @@ PROFILING_DURATION_SECONDS="${PROFILING_DURATION_SECONDS:-10}"
 # Profiling configuration
 ENABLE_PROFILER="${ENABLE_PROFILER:-false}"
 PROFILER_EVENT="${PROFILER_EVENT:-cpu}"
+PROFILER_FORMAT="${PROFILER_FORMAT:-flamegraph}" # Output format: flamegraph, collapsed, jfr
 PROFILER_OUTPUT="${PROFILER_OUTPUT:-profile.html}"
 ASYNC_PROFILER_PATH="${ASYNC_PROFILER_PATH:-}" # Path to async-profiler
@@ -346,7 +346,7 @@ build_jars() {
 start_mock_server() {
 log "Starting mock HTTP server..."
- local taskset_cmd=$(build_taskset_cmd "$MOCK_TASKSET")
+ local taskset_cmd=$(build_taskset_cmd "$MOCK_CPUSET")
 local java_cmd="$JAVA_HOME/bin/java"
 local mock_threads_arg=""
@@ -373,7 +373,7 @@ start_mock_server() {
 start_handoff_server() {
 log "Starting handoff HTTP server..."
- local taskset_cmd=$(build_taskset_cmd "$SERVER_TASKSET") + local taskset_cmd=$(build_taskset_cmd "$SERVER_CPUSET") local java_cmd="$JAVA_HOME/bin/java" # Build JVM args @@ -382,9 +382,19 @@ start_handoff_server() { jvm_args="$jvm_args -XX:-DoJVMTIVirtualThreadTransitions" jvm_args="$jvm_args -Djdk.trackAllThreads=false" - if [[ "$SERVER_USE_CUSTOM_SCHEDULER" == "true" ]]; then - jvm_args="$jvm_args -Djdk.virtualThreadScheduler.implClass=io.netty.loom.NettyScheduler" - jvm_args="$jvm_args -Djdk.pollerMode=$SERVER_POLLER_MODE" + # Mode-specific JVM args + local poller_mode="$SERVER_POLLER_MODE" + case "$SERVER_MODE" in + NETTY_SCHEDULER) + jvm_args="$jvm_args -Djdk.virtualThreadScheduler.implClass=io.netty.loom.NettyScheduler" + # Default pollerMode to 3 for custom scheduler if not explicitly set + poller_mode="${poller_mode:-3}" + ;; + esac + + # Apply pollerMode if set (explicitly or via mode default) + if [[ -n "$poller_mode" ]]; then + jvm_args="$jvm_args -Djdk.pollerMode=$poller_mode" fi if [[ -n "$SERVER_FJ_PARALLELISM" ]]; then @@ -402,15 +412,19 @@ start_handoff_server() { jvm_args="$jvm_args $SERVER_JVM_ARGS" fi + local mockless_flag="" + if [[ "$SERVER_MOCKLESS" == "true" ]]; then + mockless_flag="--mockless" + fi + local cmd="$taskset_cmd $java_cmd $JAVA_OPTS $jvm_args -cp $RUNNER_JAR \ io.netty.loom.benchmark.runner.HandoffHttpServer \ --port $SERVER_PORT \ --mock-url http://localhost:$MOCK_PORT/fruits \ --threads $SERVER_THREADS \ - --use-custom-scheduler $SERVER_USE_CUSTOM_SCHEDULER \ --io $SERVER_IO \ - --no-timeout $SERVER_NO_TIMEOUT \ - --reactive $SERVER_REACTIVE \ + --mode $SERVER_MODE \ + $mockless_flag \ --silent" log "Handoff server command: $cmd" @@ -435,7 +449,7 @@ run_warmup() { log "Running warmup for $WARMUP_DURATION..." 
- local taskset_cmd=$(build_taskset_cmd "$LOAD_GEN_TASKSET") + local taskset_cmd=$(build_taskset_cmd "$LOAD_GEN_CPUSET") # Use wrk for warmup (no rate limiting) $taskset_cmd jbang wrk@hyperfoil \ @@ -467,7 +481,8 @@ start_profiler() { ( sleep "$PROFILING_DELAY_SECONDS" - "$asprof" --threads -e "$PROFILER_EVENT" -o flamegraph -d "$PROFILING_DURATION_SECONDS" -f "$output_file" "$SERVER_PID" + # --record-cpu + "$asprof" --threads -e "$PROFILER_EVENT" -o "$PROFILER_FORMAT" -d "$PROFILING_DURATION_SECONDS" -f "$output_file" "$SERVER_PID" ) & PROFILER_PID=$! @@ -566,14 +581,16 @@ start_pidstat() { log "pidstat running (PID: $PIDSTAT_PID)" - log "Starting pidstat for mock server (PID: $MOCK_PID)..." + if [[ -n "${MOCK_PID:-}" ]]; then + log "Starting pidstat for mock server (PID: $MOCK_PID)..." - local mock_output_file="$OUTPUT_DIR/$PIDSTAT_MOCK_OUTPUT" + local mock_output_file="$OUTPUT_DIR/$PIDSTAT_MOCK_OUTPUT" - pidstat -p "$MOCK_PID" "$PIDSTAT_INTERVAL" > "$mock_output_file" 2>&1 & - PIDSTAT_MOCK_PID=$! + pidstat -p "$MOCK_PID" "$PIDSTAT_INTERVAL" > "$mock_output_file" 2>&1 & + PIDSTAT_MOCK_PID=$! - log "pidstat running for mock server (PID: $PIDSTAT_MOCK_PID)" + log "pidstat running for mock server (PID: $PIDSTAT_MOCK_PID)" + fi } stop_pidstat() { @@ -637,7 +654,7 @@ run_load_test() { log "Running load test for ${test_secs}s..." 
- local taskset_cmd=$(build_taskset_cmd "$LOAD_GEN_TASKSET") + local taskset_cmd=$(build_taskset_cmd "$LOAD_GEN_CPUSET") local output_file="$OUTPUT_DIR/wrk-results.txt" if [[ -n "$LOAD_GEN_RATE" ]]; then @@ -695,25 +712,28 @@ print_config() { log " Port: $MOCK_PORT" log " Think Time: ${MOCK_THINK_TIME_MS}ms" log " Threads: ${MOCK_THREADS:-}" - log " CPU Affinity: ${MOCK_TASKSET:-}" + log " CPU Affinity: ${MOCK_CPUSET:-}" log "" log "Handoff Server:" log " Port: $SERVER_PORT" log " Threads: $SERVER_THREADS" - log " Reactive: $SERVER_REACTIVE" - log " Custom Sched: $SERVER_USE_CUSTOM_SCHEDULER" + log " Mode: $SERVER_MODE" + log " Mockless: $SERVER_MOCKLESS" log " I/O Type: $SERVER_IO" - log " No Timeout: $SERVER_NO_TIMEOUT" - log " Poller Mode: $SERVER_POLLER_MODE" + local effective_poller="$SERVER_POLLER_MODE" + if [[ -z "$effective_poller" && "$SERVER_MODE" == "NETTY_SCHEDULER" ]]; then + effective_poller="3 (default for NETTY_SCHEDULER)" + fi + log " Poller Mode: ${effective_poller:-}" log " FJ Parallelism: ${SERVER_FJ_PARALLELISM:-}" - log " CPU Affinity: ${SERVER_TASKSET:-}" + log " CPU Affinity: ${SERVER_CPUSET:-}" log " Extra JVM Args: ${SERVER_JVM_ARGS:-}" log "" log "Load Generator:" log " Connections: $LOAD_GEN_CONNECTIONS" log " Threads: $LOAD_GEN_THREADS" log " Rate: ${LOAD_GEN_RATE:-}" - log " CPU Affinity: ${LOAD_GEN_TASKSET:-}" + log " CPU Affinity: ${LOAD_GEN_CPUSET:-}" log "" log "Timing:" log " Warmup: $WARMUP_DURATION" @@ -766,120 +786,124 @@ print_config() { # ============================================================================ main() { - # Parse command line arguments + # Parse command line arguments (override env vars) while [[ $# -gt 0 ]]; do case "$1" in + # Server + --mode) SERVER_MODE="$2"; shift 2 ;; + --threads) SERVER_THREADS="$2"; shift 2 ;; + --mockless) SERVER_MOCKLESS=true; shift ;; + --io) SERVER_IO="$2"; shift 2 ;; + --poller-mode) SERVER_POLLER_MODE="$2"; shift 2 ;; + --fj-parallelism) SERVER_FJ_PARALLELISM="$2"; shift 2 ;; 
+ --server-cpuset) SERVER_CPUSET="$2"; shift 2 ;; + --jvm-args) SERVER_JVM_ARGS="$2"; shift 2 ;; + # Mock + --mock-port) MOCK_PORT="$2"; shift 2 ;; + --mock-think-time) MOCK_THINK_TIME_MS="$2"; shift 2 ;; + --mock-threads) MOCK_THREADS="$2"; shift 2 ;; + --mock-cpuset) MOCK_CPUSET="$2"; shift 2 ;; + # Load generator + --connections) LOAD_GEN_CONNECTIONS="$2"; shift 2 ;; + --load-threads) LOAD_GEN_THREADS="$2"; shift 2 ;; + --duration) LOAD_GEN_DURATION="$2"; shift 2 ;; + --rate) LOAD_GEN_RATE="$2"; shift 2 ;; + --load-cpuset) LOAD_GEN_CPUSET="$2"; shift 2 ;; + # Timing + --warmup) WARMUP_DURATION="$2"; shift 2 ;; + --total-duration) TOTAL_DURATION="$2"; shift 2 ;; + # Profiling + --profiler) ENABLE_PROFILER=true; shift ;; + --profiler-path) ASYNC_PROFILER_PATH="$2"; shift 2 ;; + --profiler-event) PROFILER_EVENT="$2"; shift 2 ;; + --jfr) ENABLE_JFR=true; shift ;; + --jfr-events) JFR_EVENTS="$2"; shift 2 ;; + --perf-stat) ENABLE_PERF_STAT=true; shift ;; + --perf-stat-args) PERF_STAT_ARGS="$2"; shift 2 ;; + --no-pidstat) ENABLE_PIDSTAT=false; shift ;; + # Output + --output-dir) OUTPUT_DIR="$2"; shift 2 ;; + # Help --help|-h) cat << 'EOF' Benchmark Runner Script Usage: ./run-benchmark.sh [OPTIONS] -Environment Variables (can also be set via command line options): +All options can also be set via environment variables (shown in parentheses). +CLI flags take precedence over environment variables. + +Server: + --mode Server mode (SERVER_MODE, default: NON_VIRTUAL_NETTY) + Modes: NON_VIRTUAL_NETTY, REACTIVE, VIRTUAL_NETTY, NETTY_SCHEDULER + --threads Event loop threads (SERVER_THREADS, default: 2) + --mockless Skip mock server, inline Jackson work (SERVER_MOCKLESS) + --io I/O type: epoll, nio, io_uring (SERVER_IO, default: epoll) + --poller-mode jdk.pollerMode: 1, 2, or 3 (SERVER_POLLER_MODE) + --fj-parallelism ForkJoinPool parallelism (SERVER_FJ_PARALLELISM) + --server-cpuset Server CPU pinning, e.g. 
"2,3" (SERVER_CPUSET, default: 2,3) + --jvm-args Additional JVM arguments (SERVER_JVM_ARGS) Mock Server: - MOCK_PORT Mock server port (default: 8080) - MOCK_THINK_TIME_MS Response delay in ms (default: 1) - MOCK_THREADS Number of threads (default: auto = available processors) - MOCK_TASKSET CPU affinity range (default: "4,5,6,7") - -Handoff Server: - SERVER_PORT Server port (default: 8081) - SERVER_THREADS Number of event loop threads (default: 2) - SERVER_REACTIVE Use reactive handler with Reactor (default: false) - SERVER_USE_CUSTOM_SCHEDULER Use custom Netty scheduler (default: false) - SERVER_IO I/O type: epoll, nio, or io_uring (default: epoll) - SERVER_NO_TIMEOUT Disable HTTP client timeout (default: false) - SERVER_TASKSET CPU affinity range (default: "2,3") - SERVER_JVM_ARGS Additional JVM arguments - SERVER_POLLER_MODE jdk.pollerMode value: 1, 2, or 3 (default: 3) - SERVER_FJ_PARALLELISM ForkJoinPool parallelism (empty = JVM default) + --mock-port Mock server port (MOCK_PORT, default: 8080) + --mock-think-time Response delay in ms (MOCK_THINK_TIME_MS, default: 1) + --mock-threads Number of threads (MOCK_THREADS, default: 1) + --mock-cpuset Mock server CPU pinning (MOCK_CPUSET, default: 4,5) Load Generator: - LOAD_GEN_CONNECTIONS Number of connections (default: 100) - LOAD_GEN_THREADS Number of threads (default: 2) - LOAD_GEN_DURATION Test duration (default: 30s) - LOAD_GEN_RATE Target rate for wrk2 (empty = use wrk) - LOAD_GEN_TASKSET CPU affinity range (default: "0,1") + --connections Number of connections (LOAD_GEN_CONNECTIONS, default: 100) + --load-threads Number of threads (LOAD_GEN_THREADS, default: 2) + --duration Test duration (LOAD_GEN_DURATION, default: 30s) + --rate Target rate for wrk2; omit for max throughput (LOAD_GEN_RATE) + --load-cpuset Load generator CPU pinning (LOAD_GEN_CPUSET, default: 0,1) Timing: - WARMUP_DURATION Warmup duration (default: 10s) - TOTAL_DURATION Total test duration (default: 30s, must keep steady state >= 20s) - 
PROFILING_DELAY_SECONDS Profiling start delay in seconds (default: 5) - PROFILING_DURATION_SECONDS Profiling duration in seconds (default: 10) + --warmup Warmup duration (WARMUP_DURATION, default: 10s) + --total-duration Total test duration (TOTAL_DURATION, default: 30s) Profiling: - ENABLE_PROFILER Enable async-profiler (default: false) - ASYNC_PROFILER_PATH Path to async-profiler installation - PROFILER_EVENT Profiler event type (default: cpu) - PROFILER_OUTPUT Profiler output file (default: profile.html) - Note: profiling uses PROFILING_DELAY_SECONDS and PROFILING_DURATION_SECONDS. - -JFR: - ENABLE_JFR Enable Netty Loom JFR events (default: false) - JFR_EVENTS Comma-separated event list or "all" (default: all) - Options: NettyRunIo, NettyRunTasks, - VirtualThreadTaskRuns, VirtualThreadTaskRun, - VirtualThreadTaskSubmit - JFR_SETTINGS_FILE Path to a JFR settings (.jfc) file (default: auto) - JFR_OUTPUT JFR output file (default: netty-loom.jfr) - JFR_RECORDING_NAME JFR recording name (default: netty-loom-benchmark) - JFR_TIMELINE_OUTPUT Timeline output file (default: netty-loom-timeline.jsonl, empty = skip export) - Note: JFR uses PROFILING_DELAY_SECONDS and PROFILING_DURATION_SECONDS. 
-
-pidstat:
- ENABLE_PIDSTAT Enable pidstat collection (default: true)
- PIDSTAT_INTERVAL Collection interval in seconds (default: 1)
- PIDSTAT_OUTPUT Output file (default: pidstat.log)
- PIDSTAT_MOCK_OUTPUT Mock server output file (default: pidstat-mock.log)
- PIDSTAT_LOAD_GEN_OUTPUT Load generator output file (default: pidstat-loadgen.log)
- PIDSTAT_HANDOFF_DETAILED Include per-thread detail for handoff server (default: true)
-
-perf stat:
- ENABLE_PERF_STAT Enable perf stat collection (default: false)
- PERF_STAT_OUTPUT Output file (default: perf-stat.txt)
- PERF_STAT_ARGS Extra perf stat arguments (default: empty)
-
-General:
+ --profiler Enable async-profiler (ENABLE_PROFILER)
+ --profiler-path Path to async-profiler (ASYNC_PROFILER_PATH)
+ --profiler-event Profiler event type (PROFILER_EVENT, default: cpu)
+ --jfr Enable JFR events (ENABLE_JFR)
+ --jfr-events Comma-separated JFR events or "all" (JFR_EVENTS, default: all)
+ --perf-stat Enable perf stat (ENABLE_PERF_STAT)
+ --perf-stat-args Extra perf stat arguments (PERF_STAT_ARGS)
+ --no-pidstat Disable pidstat collection (ENABLE_PIDSTAT)
+
+Output:
+ --output-dir Output directory (OUTPUT_DIR, default: ./benchmark-results)
+
+Environment-only settings:
 JAVA_HOME Path to Java installation (required)
- OUTPUT_DIR Output directory (default: ./benchmark-results)
- CONFIG_OUTPUT Configuration output filename (default: benchmark-config.txt)
+ JAVA_OPTS JVM options (default: -Xms1g -Xmx1g)
+ PROFILING_DELAY_SECONDS Profiling start delay (default: 5)
+ PROFILING_DURATION_SECONDS Profiling duration (default: 10)

 Examples:
- # Basic run with custom scheduler
- JAVA_HOME=/path/to/jdk SERVER_USE_CUSTOM_SCHEDULER=true ./run-benchmark.sh
-
- # Run with CPU pinning and profiling
- JAVA_HOME=/path/to/jdk \
- MOCK_TASKSET="0" \
- SERVER_TASKSET="1-2" \
- LOAD_GEN_TASKSET="3" \
- ENABLE_PROFILER=true \
- ASYNC_PROFILER_PATH=/path/to/async-profiler \
- ./run-benchmark.sh
-
- # Rate-limited test
-
JAVA_HOME=/path/to/jdk \ - LOAD_GEN_RATE=10000 \ - TOTAL_DURATION=60s \ - WARMUP_DURATION=15s \ - ./run-benchmark.sh - - # Reactive handler test - JAVA_HOME=/path/to/jdk \ - SERVER_REACTIVE=true \ - SERVER_THREADS=2 \ - ./run-benchmark.sh + # Virtual Netty mode, mockless + ./run-benchmark.sh --mode virtual_netty --threads 2 --mockless + + # With CPU pinning + ./run-benchmark.sh --mode netty_scheduler --threads 4 \ + --server-cpuset 2,3 --mock-cpuset 4,5 --load-cpuset 0,1 + + # With profiling + ./run-benchmark.sh --mode netty_scheduler --profiler --profiler-path /path/to/ap + # Rate-limited test + ./run-benchmark.sh --rate 10000 --total-duration 60s --warmup 15s + + # JVM args override + ./run-benchmark.sh --mode virtual_netty --jvm-args "-XX:+PrintGCDetails" EOF exit 0 ;; *) - error "Unknown option: $1" + error "Unknown option: $1. Use --help for usage." ;; esac - shift done # Validate configuration @@ -898,7 +922,11 @@ EOF build_jars # Start servers - start_mock_server + if [[ "$SERVER_MOCKLESS" != "true" ]]; then + start_mock_server + else + log "Mockless mode: skipping mock server" + fi start_handoff_server # Run warmup (no profiling/pidstat) diff --git a/benchmark-runner/src/main/java/io/netty/loom/benchmark/runner/HandoffHttpServer.java b/benchmark-runner/src/main/java/io/netty/loom/benchmark/runner/HandoffHttpServer.java index 8fdc450..cb231d0 100644 --- a/benchmark-runner/src/main/java/io/netty/loom/benchmark/runner/HandoffHttpServer.java +++ b/benchmark-runner/src/main/java/io/netty/loom/benchmark/runner/HandoffHttpServer.java @@ -37,8 +37,8 @@ import io.netty.handler.codec.http.HttpServerCodec; import io.netty.handler.codec.http.HttpUtil; import io.netty.handler.codec.http.HttpVersion; -import io.netty.loom.EventLoopSchedulerType; import io.netty.loom.VirtualMultithreadIoEventLoopGroup; +import io.netty.loom.VirtualMultithreadManualIoEventLoopGroup; import io.netty.util.AsciiString; import io.netty.util.CharsetUtil; import 
org.apache.hc.client5.http.ConnectionKeepAliveStrategy; @@ -72,7 +72,6 @@ *

 * Usage: java -cp benchmark-runner.jar
 * io.netty.loom.benchmark.runner.HandoffHttpServer \ --port 8081 \ --mock-url
 * http://localhost:8080/fruits \ --threads 2 \ --use-custom-scheduler true \
 * --io epoll
 */
public class HandoffHttpServer {
@@ -81,54 +80,53 @@ public enum IO {
         EPOLL, NIO, IO_URING
     }
 
+    public enum Mode {
+        NON_VIRTUAL_NETTY, REACTIVE, VIRTUAL_NETTY, NETTY_SCHEDULER
+    }
+
     private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();
     private static final ByteBuf HEALTH_RESPONSE = Unpooled
             .unreleasableBuffer(Unpooled.copiedBuffer("OK", CharsetUtil.UTF_8));
 
+    // Same JSON as MockHttpServer.CACHED_RESPONSE — used in mockless mode
+    private static final byte[] CACHED_JSON_BYTES = """
+            {
+              "fruits": [
+                {"name": "Apple", "color": "Red", "price": 1.20},
+                {"name": "Banana", "color": "Yellow", "price": 0.50},
+                {"name": "Orange", "color": "Orange", "price": 0.80},
+                {"name": "Grape", "color": "Purple", "price": 2.00},
+                {"name": "Mango", "color": "Yellow", "price": 1.50},
+                {"name": "Strawberry", "color": "Red", "price": 3.00},
+                {"name": "Blueberry", "color": "Blue", "price": 4.00},
+                {"name": "Pineapple", "color": "Yellow", "price": 2.50},
+                {"name": "Watermelon", "color": "Green", "price": 5.00},
+                {"name": "Kiwi", "color": "Brown", "price": 1.00}
+              ]
+            }
+            """.getBytes(java.nio.charset.StandardCharsets.UTF_8);
+
     private final int port;
     private final String mockUrl;
     private final int threads;
-    private final boolean useCustomScheduler;
     private final IO io;
     private final boolean silent;
-    private final boolean noTimeout;
-    private final boolean useReactive;
-    private final EventLoopSchedulerType schedulerType;
+    private final boolean mockless;
+    private final Mode mode;
 
     private MultiThreadIoEventLoopGroup workerGroup;
     private Channel serverChannel;
     private Supplier<ThreadFactory> threadFactorySupplier;
 
-    public HandoffHttpServer(int port, String mockUrl, int threads, boolean useCustomScheduler, IO io) {
-        this(port, mockUrl, threads, useCustomScheduler, io, false, false, false);
-    }
-
-    public HandoffHttpServer(int port, String mockUrl, int threads, boolean useCustomScheduler, IO io, boolean silent) {
-        this(port, mockUrl, threads, useCustomScheduler, io, silent, false, false);
-    }
-
-    public HandoffHttpServer(int port, String mockUrl, int threads, boolean useCustomScheduler, IO io, boolean silent,
-            boolean noTimeout) {
-        this(port, mockUrl, threads, useCustomScheduler, io, silent, noTimeout, false);
-    }
-
-    public HandoffHttpServer(int port, String mockUrl, int threads, boolean useCustomScheduler, IO io, boolean silent,
-            boolean noTimeout, boolean useReactive) {
-        this(port, mockUrl, threads, useCustomScheduler, io, silent, noTimeout, useReactive,
-                EventLoopSchedulerType.FIFO);
-    }
-
-    public HandoffHttpServer(int port, String mockUrl, int threads, boolean useCustomScheduler, IO io, boolean silent,
-            boolean noTimeout, boolean useReactive, EventLoopSchedulerType schedulerType) {
+    public HandoffHttpServer(int port, String mockUrl, int threads, IO io, boolean silent, boolean mockless,
+            Mode mode) {
         this.port = port;
         this.mockUrl = mockUrl;
         this.threads = threads;
-        this.useCustomScheduler = useCustomScheduler;
         this.io = io;
         this.silent = silent;
-        this.noTimeout = noTimeout;
-        this.useReactive = useReactive;
-        this.schedulerType = schedulerType == null ? EventLoopSchedulerType.FIFO : schedulerType;
+        this.mockless = mockless;
+        this.mode = mode;
     }
 
     public void start() throws InterruptedException {
@@ -138,26 +136,48 @@ public void start() throws InterruptedException {
             case IO_URING -> IoUringIoHandler.newFactory();
         };
 
-        Class<? extends ServerChannel> serverChannelClass = switch (io) {
-            case NIO -> NioServerSocketChannel.class;
-            case EPOLL -> EpollServerSocketChannel.class;
-            case IO_URING -> IoUringServerSocketChannel.class;
-        };
-
-        Class<? extends Channel> clientChannelClass = switch (io) {
-            case NIO -> io.netty.channel.socket.nio.NioSocketChannel.class;
-            case EPOLL -> io.netty.channel.epoll.EpollSocketChannel.class;
-            case IO_URING -> io.netty.channel.uring.IoUringSocketChannel.class;
-        };
-
-        if (useCustomScheduler) {
-            var group = new VirtualMultithreadIoEventLoopGroup(threads, ioHandlerFactory, schedulerType);
-            threadFactorySupplier = group::vThreadFactory;
-            workerGroup = group;
-        } else {
-            workerGroup = new MultiThreadIoEventLoopGroup(threads, ioHandlerFactory);
-            var defaultFactory = Thread.ofVirtual().factory();
-            threadFactorySupplier = () -> defaultFactory;
+        final Class<? extends ServerChannel> serverChannelClass;
+        final Class<? extends Channel> clientChannelClass;
+
+        switch (mode) {
+            case VIRTUAL_NETTY -> {
+                var group = new VirtualMultithreadManualIoEventLoopGroup(threads, NioIoHandler.newFactory());
+                workerGroup = group;
+                var defaultFactory = Thread.ofVirtual().factory();
+                threadFactorySupplier = () -> defaultFactory;
+                serverChannelClass = NioServerSocketChannel.class;
+                clientChannelClass = io.netty.channel.socket.nio.NioSocketChannel.class;
+            }
+            case NETTY_SCHEDULER -> {
+                serverChannelClass = switch (io) {
+                    case NIO -> NioServerSocketChannel.class;
+                    case EPOLL -> EpollServerSocketChannel.class;
+                    case IO_URING -> IoUringServerSocketChannel.class;
+                };
+                clientChannelClass = switch (io) {
+                    case NIO -> io.netty.channel.socket.nio.NioSocketChannel.class;
+                    case EPOLL -> io.netty.channel.epoll.EpollSocketChannel.class;
+                    case IO_URING -> io.netty.channel.uring.IoUringSocketChannel.class;
+                };
+                var group = new VirtualMultithreadIoEventLoopGroup(threads, ioHandlerFactory);
+                threadFactorySupplier = group::vThreadFactory;
+                workerGroup = group;
+            }
+            default -> {
+                serverChannelClass = switch (io) {
+                    case NIO -> NioServerSocketChannel.class;
+                    case EPOLL -> EpollServerSocketChannel.class;
+                    case IO_URING -> IoUringServerSocketChannel.class;
+                };
+                clientChannelClass = switch (io) {
+                    case NIO -> io.netty.channel.socket.nio.NioSocketChannel.class;
+                    case EPOLL -> io.netty.channel.epoll.EpollSocketChannel.class;
+                    case IO_URING -> io.netty.channel.uring.IoUringSocketChannel.class;
+                };
+                workerGroup = new MultiThreadIoEventLoopGroup(threads, ioHandlerFactory);
+                var defaultFactory = Thread.ofVirtual().factory();
+                threadFactorySupplier = () -> defaultFactory;
+            }
         }
 
         ServerBootstrap b = new ServerBootstrap();
         b.group(workerGroup).channel(serverChannelClass).childOption(ChannelOption.TCP_NODELAY, true)
@@ -167,8 +187,8 @@ protected void initChannel(SocketChannel ch) {
             ChannelPipeline p = ch.pipeline();
             p.addLast(new HttpServerCodec());
             p.addLast(new HttpObjectAggregator(65536));
-            if (useReactive) {
-                p.addLast(new ReactiveHandoffHandler(mockUrl, noTimeout, clientChannelClass));
+            if (mode == Mode.REACTIVE) {
+                p.addLast(new ReactiveHandoffHandler(mockUrl, clientChannelClass));
             } else {
                 p.addLast(new HandoffHandler());
             }
@@ -178,17 +198,18 @@ protected void initChannel(SocketChannel ch) {
         serverChannel = b.bind(port).sync().channel();
 
         if (!silent) {
             System.out.printf("Handoff HTTP Server started on port %d%n", port);
-            System.out.printf("  Mode: %s%n", useReactive ? "Reactive (Project Reactor)" : "Virtual Thread");
-            System.out.printf("  Mock URL: %s%n", mockUrl);
-            System.out.printf("  Threads: %d%n", threads);
-            if (!useReactive) {
-                System.out.printf("  Custom Scheduler: %s%n", useCustomScheduler);
-                if (useCustomScheduler) {
-                    System.out.printf("  Scheduler Type: %s%n", schedulerType);
-                }
+            System.out.printf("  Mode: %s%n", switch (mode) {
+                case NON_VIRTUAL_NETTY -> "Non-Virtual Netty (platform IO + VT blocking)";
+                case REACTIVE -> "Reactive (pure async, no VTs)";
+                case VIRTUAL_NETTY -> "Virtual Netty (IO loops as VTs on ForkJoinPool)";
+                case NETTY_SCHEDULER -> "Netty Scheduler (platform IO + Netty VT scheduler)";
+            });
+            System.out.printf("  Mockless: %s%n", mockless);
+            if (!mockless) {
+                System.out.printf("  Mock URL: %s%n", mockUrl);
             }
-            System.out.printf("  I/O: %s%n", io);
-            System.out.printf("  No Timeout: %s%n", noTimeout);
+            System.out.printf("  Threads: %d%n", threads);
+            System.out.printf("  I/O: %s%n", mode == Mode.VIRTUAL_NETTY ? "NIO (forced)" : io);
         }
     }
 
@@ -218,12 +239,16 @@ private class HandoffHandler extends SimpleChannelInboundHandler<FullHttpRequest> {
-            ConnectionKeepAliveStrategy keepAliveStrategy = (HttpResponse response,
-                    HttpContext context) -> TimeValue.NEG_ONE_MILLISECOND;
-            BasicHttpClientConnectionManager cm = new BasicHttpClientConnectionManager();
-            RequestConfig requestConfig = noTimeout ? NO_TIMEOUT_HTTP_REQUEST_CONFIG : RequestConfig.DEFAULT;
-            httpClient = HttpClientBuilder.create().setConnectionManager(cm).setDefaultRequestConfig(requestConfig)
-                    .setConnectionManagerShared(false).setKeepAliveStrategy(keepAliveStrategy).build();
+            if (mockless) {
+                httpClient = null;
+            } else {
+                ConnectionKeepAliveStrategy keepAliveStrategy = (HttpResponse response,
+                        HttpContext context) -> TimeValue.NEG_ONE_MILLISECOND;
+                BasicHttpClientConnectionManager cm = new BasicHttpClientConnectionManager();
+                httpClient = HttpClientBuilder.create().setConnectionManager(cm)
+                        .setDefaultRequestConfig(NO_TIMEOUT_HTTP_REQUEST_CONFIG).setConnectionManagerShared(false)
+                        .setKeepAliveStrategy(keepAliveStrategy).build();
+            }
         }
 
         @Override
@@ -244,10 +269,16 @@ protected void channelRead0(ChannelHandlerContext ctx, FullHttpRequest request)
             }
 
             if (uri.equals("/") || uri.startsWith("/fruits")) {
-                // Hand off to virtual thread for blocking processing
-                orderedExecutorService.execute(() -> {
-                    doBlockingProcessing(ctx, eventLoop, keepAlive);
-                });
+                // Hand off to virtual thread for processing
+                if (mockless) {
+                    orderedExecutorService.execute(() -> {
+                        doMocklessProcessing(ctx, eventLoop, keepAlive);
+                    });
+                } else {
+                    orderedExecutorService.execute(() -> {
+                        doBlockingProcessing(ctx, eventLoop, keepAlive);
+                    });
+                }
                 return;
             }
 
@@ -294,6 +325,28 @@ private void doBlockingProcessing(ChannelHandlerContext ctx, IoEventLoop eventLoop, boolean keepAlive) {
             }
         }
 
+        private void doMocklessProcessing(ChannelHandlerContext ctx, IoEventLoop eventLoop, boolean keepAlive) {
+            try {
+                // Same Jackson work as doBlockingProcessing, without the HTTP call
+                FruitsResponse fruitsResponse = OBJECT_MAPPER.readValue(CACHED_JSON_BYTES, FruitsResponse.class);
+                byte[] responseBytes = OBJECT_MAPPER.writeValueAsBytes(fruitsResponse);
+                eventLoop.execute(() -> {
+                    ByteBuf content = Unpooled.wrappedBuffer(responseBytes);
+                    sendResponse(ctx, content, HttpHeaderValues.APPLICATION_JSON, keepAlive);
+                });
+            } catch (Throwable e) {
+                eventLoop.execute(() -> {
+                    ByteBuf content = Unpooled.copiedBuffer("{\"error\":\"" + e.getMessage() + "\"}",
+                            CharsetUtil.UTF_8);
+                    FullHttpResponse response = new DefaultFullHttpResponse(HttpVersion.HTTP_1_1,
+                            HttpResponseStatus.INTERNAL_SERVER_ERROR, content);
+                    response.headers().set(HttpHeaderNames.CONTENT_TYPE, HttpHeaderValues.APPLICATION_JSON);
+                    response.headers().setInt(HttpHeaderNames.CONTENT_LENGTH, content.readableBytes());
+                    ctx.writeAndFlush(response).addListener(ChannelFutureListener.CLOSE);
+                });
+            }
+        }
+
         private void sendResponse(ChannelHandlerContext ctx, ByteBuf content, AsciiString contentType,
                 boolean keepAlive) {
             FullHttpResponse response = new DefaultFullHttpResponse(HttpVersion.HTTP_1_1, HttpResponseStatus.OK,
@@ -319,7 +372,9 @@ public void channelInactive(ChannelHandlerContext ctx) throws Exception {
             try {
                 orderedExecutorService.execute(() -> {
                     try {
-                        httpClient.close();
+                        if (httpClient != null) {
+                            httpClient.close();
+                        }
                     } catch (IOException e) {
                     } finally {
                         orderedExecutorService.shutdown();
@@ -332,28 +387,23 @@ public void channelInactive(ChannelHandlerContext ctx) throws Exception {
     }
 
     public static void main(String[] args) throws InterruptedException {
-        // Parse arguments
         int port = 8081;
         String mockUrl = "http://localhost:8080/fruits";
        int threads = 1;
-        boolean useCustomScheduler = false;
        IO io = IO.EPOLL;
        boolean silent = false;
-        boolean noTimeout = false;
-        boolean useReactive = false;
-        EventLoopSchedulerType schedulerType = EventLoopSchedulerType.FIFO;
+        boolean mockless = false;
+        Mode mode = Mode.NON_VIRTUAL_NETTY;
 
         for (int i = 0; i < args.length; i++) {
             switch (args[i]) {
                 case "--port" -> port = Integer.parseInt(args[++i]);
                 case "--mock-url" -> mockUrl = args[++i];
                 case "--threads" -> threads = Integer.parseInt(args[++i]);
-                case "--use-custom-scheduler" -> useCustomScheduler = Boolean.parseBoolean(args[++i]);
-                case "--scheduler-type" -> schedulerType = EventLoopSchedulerType.valueOf(args[++i].toUpperCase());
                 case "--io" -> io = IO.valueOf(args[++i].toUpperCase());
                 case "--silent" -> silent = true;
-                case "--no-timeout" -> noTimeout = Boolean.parseBoolean(args[++i]);
-                case "--reactive" -> useReactive = Boolean.parseBoolean(args[++i]);
+                case "--mockless" -> mockless = true;
+                case "--mode" -> mode = Mode.valueOf(args[++i].toUpperCase());
                 case "--help" -> {
                     printUsage();
                     return;
@@ -361,8 +411,7 @@ public static void main(String[] args) throws InterruptedException {
             }
         }
 
-        HandoffHttpServer server = new HandoffHttpServer(port, mockUrl, threads, useCustomScheduler, io, silent,
-                noTimeout, useReactive, schedulerType);
+        HandoffHttpServer server = new HandoffHttpServer(port, mockUrl, threads, io, silent, mockless, mode);
        server.start();
 
        // Shutdown hook
@@ -372,25 +421,24 @@ public static void main(String[] args) throws InterruptedException {
     }
 
     private static void printUsage() {
-        System.out.println(
-                """
-                Usage: java -cp benchmark-runner.jar io.netty.loom.benchmark.runner.HandoffHttpServer [options]
-
-                Options:
-                  --port                  HTTP port (default: 8081)
-                  --mock-url              Mock server URL (default: http://localhost:8080/fruits)
-                  --threads               Number of event loop threads (default: 1)
-                  --use-custom-scheduler  Use custom Netty scheduler (default: false, ignored if --reactive is true)
-                  --scheduler-type        Scheduler type for custom scheduler (default: fifo)
-                  --io                    I/O type (default: epoll)
-                  --no-timeout            Disable HTTP client timeout (default: false)
-                  --reactive              Use reactive handler with Reactor (default: false)
-                  --silent                Suppress output messages
-                  --help                  Show this help
-
-                Modes:
-                  Virtual Thread (default): Uses virtual threads with blocking Apache HttpClient
-                  Reactive (--reactive true): Uses Project Reactor with non-blocking Reactor Netty HTTP client
-                """);
+        System.out.println("""
+                Usage: java -cp benchmark-runner.jar io.netty.loom.benchmark.runner.HandoffHttpServer [options]
+
+                Options:
+                  --port      HTTP port (default: 8081)
+                  --mock-url  Mock server URL (default: http://localhost:8080/fruits)
+                  --threads   Number of event loop threads (default: 1)
+                  --io        I/O type (default: epoll)
+                  --mockless  Skip mock server; do Jackson work inline (default: off)
+                  --mode      Server mode (default: non_virtual_netty)
+                  --silent    Suppress output messages
+                  --help      Show this help
+
+                Modes:
+                  NON_VIRTUAL_NETTY (default): Platform thread IO + virtual thread blocking work
+                  REACTIVE: Pure Netty async, no virtual threads
+                  VIRTUAL_NETTY: Netty IO event loops as VTs on ForkJoinPool
+                  NETTY_SCHEDULER: Platform thread IO + Netty custom VT scheduler
+                """);
    }
}
diff --git a/benchmark-runner/src/main/java/io/netty/loom/benchmark/runner/ReactiveHandoffHandler.java b/benchmark-runner/src/main/java/io/netty/loom/benchmark/runner/ReactiveHandoffHandler.java
index 31eed11..0d96f18 100644
--- a/benchmark-runner/src/main/java/io/netty/loom/benchmark/runner/ReactiveHandoffHandler.java
+++ b/benchmark-runner/src/main/java/io/netty/loom/benchmark/runner/ReactiveHandoffHandler.java
@@ -40,13 +40,11 @@
 import io.netty.handler.codec.http.HttpResponseStatus;
 import io.netty.handler.codec.http.HttpUtil;
 import io.netty.handler.codec.http.HttpVersion;
-import io.netty.handler.timeout.ReadTimeoutHandler;
 import io.netty.util.AsciiString;
 import io.netty.util.CharsetUtil;
 
 import java.net.InetSocketAddress;
 import java.net.URI;
-import java.util.concurrent.TimeUnit;
 
 /**
  * Pure Netty non-blocking HTTP handler.
@@ -67,8 +65,6 @@ public class ReactiveHandoffHandler extends SimpleChannelInboundHandler<FullHttpRequest> {
 
     private final String mockUrl;
     private final Class<? extends Channel> channelClass;
-    private final boolean noTimeout;
-
     // This handler owns exactly ONE client channel
     private Channel clientChannel;
 
@@ -76,10 +72,9 @@ public class ReactiveHandoffHandler extends SimpleChannelInboundHandler<FullHttpRequest> {
 
-    public ReactiveHandoffHandler(String mockUrl, boolean noTimeout, Class<? extends Channel> channelClass) {
+    public ReactiveHandoffHandler(String mockUrl, Class<? extends Channel> channelClass) {
         this.mockUrl = mockUrl;
         this.channelClass = channelClass;
-        this.noTimeout = noTimeout;
         try {
             this.parsedUri = new URI(mockUrl);
         } catch (Exception e) {
@@ -104,15 +99,10 @@ public void channelActive(ChannelHandlerContext ctx) throws Exception {
         Bootstrap bootstrap = new Bootstrap().group(eventLoop) // Critical: use the SAME event loop
                 .channel(channelClass) // Use same channel type as server
                 .option(ChannelOption.SO_KEEPALIVE, true).option(ChannelOption.TCP_NODELAY, true)
-                .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, noTimeout ? 0 : 30000)
-                .handler(new ChannelInitializer<SocketChannel>() {
+                .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 0).handler(new ChannelInitializer<SocketChannel>() {
                    @Override
                    protected void initChannel(SocketChannel ch) {
                        ChannelPipeline p = ch.pipeline();
-                        // Add read timeout handler if timeout is enabled
-                        if (!noTimeout) {
-                            p.addLast(new ReadTimeoutHandler(30, TimeUnit.SECONDS));
-                        }
                        p.addLast(new HttpClientCodec());
                        p.addLast(new HttpObjectAggregator(65536));
                        // Add permanent response handler
diff --git a/benchmark-runner/src/test/java/io/netty/loom/benchmark/runner/BenchmarkIntegrationTest.java b/benchmark-runner/src/test/java/io/netty/loom/benchmark/runner/BenchmarkIntegrationTest.java
index 5e60abf..4dcad60 100644
--- a/benchmark-runner/src/test/java/io/netty/loom/benchmark/runner/BenchmarkIntegrationTest.java
+++ b/benchmark-runner/src/test/java/io/netty/loom/benchmark/runner/BenchmarkIntegrationTest.java
@@ -36,10 +36,7 @@
  *
  * Tests cover:
  * <ul>
- * <li>NIO I/O with default scheduler</li>
- * <li>NIO I/O with custom scheduler</li>
- * <li>EPOLL I/O with default scheduler</li>
- * <li>EPOLL I/O with custom scheduler</li>
+ * <li>NIO I/O with virtual Netty mode</li>
  * </ul>
 */
class BenchmarkIntegrationTest {
@@ -53,16 +50,11 @@ class BenchmarkIntegrationTest {
 
     static Stream<Arguments> serverConfigurations() {
         return Stream.of(
-                // IO type, use custom scheduler, use reactive, description
-                Arguments.of(HandoffHttpServer.IO.NIO, false, false, "NIO with default scheduler"),
-                Arguments.of(HandoffHttpServer.IO.NIO, true, false, "NIO with custom scheduler"),
-                Arguments.of(HandoffHttpServer.IO.NIO, false, true, "NIO with reactive handler"),
-                Arguments.of(HandoffHttpServer.IO.EPOLL, false, false, "EPOLL with default scheduler"),
-                Arguments.of(HandoffHttpServer.IO.EPOLL, true, false, "EPOLL with custom scheduler"),
-                Arguments.of(HandoffHttpServer.IO.EPOLL, false, true, "EPOLL with reactive handler"));
+                // IO type, mode, description
+                Arguments.of(HandoffHttpServer.IO.NIO, HandoffHttpServer.Mode.VIRTUAL_NETTY, "NIO with Netty on FJ"));
     }
 
-    void startServers(HandoffHttpServer.IO ioType, boolean useCustomScheduler, boolean useReactive) throws Exception {
+    void startServers(HandoffHttpServer.IO ioType, HandoffHttpServer.Mode mode) throws Exception {
         // Use unique ports for each test to avoid conflicts
         mockPort = PORT_COUNTER.getAndIncrement();
         handoffPort = PORT_COUNTER.getAndIncrement();
@@ -81,8 +73,8 @@ void startServers(HandoffHttpServer.IO ioType, boolean useCustomScheduler, boolean useReactive) throws Exception {
         });
 
         // Start handoff server with specified configuration
-        handoffServer = new HandoffHttpServer(handoffPort, "http://localhost:" + mockPort + "/fruits", 1,
-                useCustomScheduler, ioType, true, false, useReactive);
+        handoffServer = new HandoffHttpServer(handoffPort, "http://localhost:" + mockPort + "/fruits", 1, ioType, true,
+                false, mode);
         handoffServer.start();
 
         // Wait for handoff server to be ready
@@ -107,64 +99,64 @@ void stopServers() {
         }
     }
 
-    @ParameterizedTest(name = "{3}")
+    @ParameterizedTest(name = "{2}")
     @MethodSource("serverConfigurations")
-    void mockServerHealthEndpoint(HandoffHttpServer.IO ioType, boolean useCustomScheduler, boolean useReactive,
-            String description) throws Exception {
-        startServers(ioType, useCustomScheduler, useReactive);
+    void mockServerHealthEndpoint(HandoffHttpServer.IO ioType, HandoffHttpServer.Mode mode, String description)
+            throws Exception {
+        startServers(ioType, mode);
 
         given().port(mockPort).when().get("/health").then().statusCode(200).body(equalTo("OK"));
     }
 
-    @ParameterizedTest(name = "{3}")
+    @ParameterizedTest(name = "{2}")
     @MethodSource("serverConfigurations")
-    void mockServerFruitsEndpoint(HandoffHttpServer.IO ioType, boolean useCustomScheduler, boolean useReactive,
-            String description) throws Exception {
-        startServers(ioType, useCustomScheduler, useReactive);
+    void mockServerFruitsEndpoint(HandoffHttpServer.IO ioType, HandoffHttpServer.Mode mode, String description)
+            throws Exception {
+        startServers(ioType, mode);
 
         given().port(mockPort).when().get("/fruits").then().statusCode(200).contentType(ContentType.JSON)
                 .body("fruits", hasSize(10)).body("fruits[0].name", equalTo("Apple"))
                 .body("fruits[0].color", equalTo("Red")).body("fruits[0].price", equalTo(1.20f));
     }
 
-    @ParameterizedTest(name = "{3}")
+    @ParameterizedTest(name = "{2}")
     @MethodSource("serverConfigurations")
-    void handoffServerHealthEndpoint(HandoffHttpServer.IO ioType, boolean useCustomScheduler, boolean useReactive,
-            String description) throws Exception {
-        startServers(ioType, useCustomScheduler, useReactive);
+    void handoffServerHealthEndpoint(HandoffHttpServer.IO ioType, HandoffHttpServer.Mode mode, String description)
+            throws Exception {
+        startServers(ioType, mode);
 
         given().port(handoffPort).when().get("/health").then().statusCode(200).body(equalTo("OK"));
     }
 
-    @ParameterizedTest(name = "{3}")
+    @ParameterizedTest(name = "{2}")
     @MethodSource("serverConfigurations")
-    void handoffServerFruitsEndpoint(HandoffHttpServer.IO ioType, boolean useCustomScheduler, boolean useReactive,
-            String description) throws Exception {
-        startServers(ioType, useCustomScheduler, useReactive);
+    void handoffServerFruitsEndpoint(HandoffHttpServer.IO ioType, HandoffHttpServer.Mode mode, String description)
+            throws Exception {
+        startServers(ioType, mode);
 
         given().port(handoffPort).when().get("/fruits").then().statusCode(200).contentType(ContentType.JSON)
                 .body("fruits", hasSize(10)).body("fruits[0].name", equalTo("Apple"))
                .body("fruits[0].color", equalTo("Red"));
     }
 
-    @ParameterizedTest(name = "{3}")
+    @ParameterizedTest(name = "{2}")
     @MethodSource("serverConfigurations")
-    void handoffServerRootEndpoint(HandoffHttpServer.IO ioType, boolean useCustomScheduler, boolean useReactive,
-            String description) throws Exception {
-        startServers(ioType, useCustomScheduler, useReactive);
+    void handoffServerRootEndpoint(HandoffHttpServer.IO ioType, HandoffHttpServer.Mode mode, String description)
+            throws Exception {
+        startServers(ioType, mode);
 
         given().port(handoffPort).when().get("/").then().statusCode(200).contentType(ContentType.JSON).body("fruits",
                 hasSize(10));
     }
 
-    @ParameterizedTest(name = "{3}")
+    @ParameterizedTest(name = "{2}")
     @MethodSource("serverConfigurations")
-    void handoffServer404ForUnknownPath(HandoffHttpServer.IO ioType, boolean useCustomScheduler, boolean useReactive,
-            String description) throws Exception {
-        startServers(ioType, useCustomScheduler, useReactive);
+    void handoffServer404ForUnknownPath(HandoffHttpServer.IO ioType, HandoffHttpServer.Mode mode, String description)
+            throws Exception {
+        startServers(ioType, mode);
 
         given().port(handoffPort).when().get("/unknown").then().statusCode(404);
     }
 
-    @ParameterizedTest(name = "{3}")
+    @ParameterizedTest(name = "{2}")
     @MethodSource("serverConfigurations")
-    void handoffServerReturnsAllFruits(HandoffHttpServer.IO ioType, boolean useCustomScheduler, boolean useReactive,
-            String description) throws Exception {
-        startServers(ioType, useCustomScheduler, useReactive);
+    void handoffServerReturnsAllFruits(HandoffHttpServer.IO ioType, HandoffHttpServer.Mode mode, String description)
+            throws Exception {
+        startServers(ioType, mode);
 
         List<String> fruitNames = given().port(handoffPort).when().get("/fruits").then().statusCode(200).extract()
                .jsonPath().getList("fruits.name", String.class);
@@ -175,11 +167,11 @@ void handoffServerReturnsAllFruits(HandoffHttpServer.IO ioType, boolean useCustomScheduler, boolean useReactive,
         assertTrue(fruitNames.contains("Kiwi"));
     }
 
-    @ParameterizedTest(name = "{3}")
+    @ParameterizedTest(name = "{2}")
     @MethodSource("serverConfigurations")
-    void handoffServerHandlesMultipleRequests(HandoffHttpServer.IO ioType, boolean useCustomScheduler,
-            boolean useReactive, String description) throws Exception {
-        startServers(ioType, useCustomScheduler, useReactive);
+    void handoffServerHandlesMultipleRequests(HandoffHttpServer.IO ioType, HandoffHttpServer.Mode mode,
+            String description) throws Exception {
+        startServers(ioType, mode);
 
         // Send multiple requests to verify server handles concurrent load
         for (int i = 0; i < 10; i++) {
@@ -187,11 +179,11 @@ void handoffServerHandlesMultipleRequests(HandoffHttpServer.IO ioType, boolean useCustomScheduler,
         }
     }
 
-    @ParameterizedTest(name = "{3}")
+    @ParameterizedTest(name = "{2}")
     @MethodSource("serverConfigurations")
-    void verifyEndToEndJsonParsing(HandoffHttpServer.IO ioType, boolean useCustomScheduler, boolean useReactive,
-            String description) throws Exception {
-        startServers(ioType, useCustomScheduler, useReactive);
+    void verifyEndToEndJsonParsing(HandoffHttpServer.IO ioType, HandoffHttpServer.Mode mode, String description)
+            throws Exception {
+        startServers(ioType, mode);
 
         // This test verifies the complete flow:
         // 1. HandoffHttpServer receives request
diff --git a/core/src/main/java/io/netty/loom/ManualIoEventLoopTask.java b/core/src/main/java/io/netty/loom/ManualIoEventLoopTask.java
new file mode 100644
index 0000000..fc89084
--- /dev/null
+++ b/core/src/main/java/io/netty/loom/ManualIoEventLoopTask.java
@@ -0,0 +1,45 @@
+/*
+ * Copyright 2026 The Netty VirtualThread Scheduler Project
+ *
+ * The Netty VirtualThread Scheduler Project licenses this file to you under the Apache License,
+ * version 2.0 (the "License"); you may not use this file except in compliance with the
+ * License. You may obtain a copy of the License at:
+ *
+ *   https://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the
+ * License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND,
+ * either express or implied. See the License for the specific language governing permissions
+ * and limitations under the License.
+ */
+package io.netty.loom;
+
+import io.netty.channel.IoEventLoopGroup;
+import io.netty.channel.IoHandlerFactory;
+import io.netty.channel.ManualIoEventLoop;
+
+import java.util.concurrent.TimeUnit;
+
+public class ManualIoEventLoopTask extends ManualIoEventLoop implements Runnable {
+
+    private static final long RUNNING_YIELD_US = TimeUnit.MICROSECONDS
+            .toNanos(Integer.getInteger("io.netty.loom.running.yield.us", 1));
+
+    public ManualIoEventLoopTask(IoEventLoopGroup parent, Thread owningThread, IoHandlerFactory factory) {
+        super(parent, owningThread, factory);
+    }
+
+    @Override
+    public void run() {
+        while (!isShuttingDown()) {
+            run(0, RUNNING_YIELD_US);
+            Thread.yield();
+            runNonBlockingTasks(RUNNING_YIELD_US);
+            Thread.yield();
+        }
+        while (!isTerminated()) {
+            runNow();
+            Thread.yield();
+        }
+    }
+}
diff --git a/core/src/main/java/io/netty/loom/VirtualMultithreadManualIoEventLoopGroup.java b/core/src/main/java/io/netty/loom/VirtualMultithreadManualIoEventLoopGroup.java
new file mode 100644
index 0000000..f33e1c8
--- /dev/null
+++ b/core/src/main/java/io/netty/loom/VirtualMultithreadManualIoEventLoopGroup.java
@@ -0,0 +1,43 @@
+/*
+ * Copyright 2026 The Netty VirtualThread Scheduler Project
+ *
+ * The Netty VirtualThread Scheduler Project licenses this file to you under the Apache License,
+ * version 2.0 (the "License"); you may not use this file except in compliance with the
+ * License. You may obtain a copy of the License at:
+ *
+ *   https://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the
+ * License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND,
+ * either express or implied. See the License for the specific language governing permissions
+ * and limitations under the License.
+ */
+package io.netty.loom;
+
+import io.netty.channel.IoEventLoop;
+import io.netty.channel.IoHandlerFactory;
+import io.netty.channel.MultiThreadIoEventLoopGroup;
+
+import java.util.concurrent.Executor;
+import java.util.concurrent.ThreadFactory;
+
+public class VirtualMultithreadManualIoEventLoopGroup extends MultiThreadIoEventLoopGroup {
+
+    private ThreadFactory threadFactory;
+
+    public VirtualMultithreadManualIoEventLoopGroup(int nThreads, IoHandlerFactory factory) {
+        super(nThreads, (Executor) null, factory);
+    }
+
+    @Override
+    protected IoEventLoop newChild(Executor executor, IoHandlerFactory ioHandlerFactory, Object... args) {
+        if (threadFactory == null) {
+            threadFactory = Thread.ofVirtual().factory();
+        }
+        var manualTask = new ManualIoEventLoopTask(this, null, ioHandlerFactory);
+        var newThread = threadFactory.newThread(manualTask);
+        manualTask.setOwningThread(newThread);
+        newThread.start();
+        return manualTask;
+    }
+}
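
The new `ManualIoEventLoopTask` interleaves a bounded slice of event-loop work with `Thread.yield()`, so that when the loop runs as a virtual thread its carrier can schedule sibling virtual threads between iterations. A dependency-free sketch of that poll-then-yield shape (the `PollYieldLoop` class and its task queue are illustrative stand-ins, not Netty APIs):

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

public class PollYieldLoop implements Runnable {
    private final Queue<Runnable> tasks = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean shuttingDown = new AtomicBoolean();

    void execute(Runnable task) { tasks.add(task); }
    void shutdown() { shuttingDown.set(true); }

    @Override
    public void run() {
        // Mirrors the first loop in ManualIoEventLoopTask.run(): do a bounded
        // amount of work, then Thread.yield() so the carrier can run siblings.
        while (!shuttingDown.get()) {
            Runnable t = tasks.poll();
            if (t != null) {
                t.run();
            }
            Thread.yield();
        }
        // Mirrors the second loop: drain remaining work before terminating.
        for (Runnable t; (t = tasks.poll()) != null; ) {
            t.run();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        var loop = new PollYieldLoop();
        // Run the loop on a virtual thread, as the group's newChild does.
        Thread owner = Thread.ofVirtual().start(loop);
        var done = new CountDownLatch(1);
        loop.execute(done::countDown);
        done.await();
        loop.shutdown();
        owner.join();
    }
}
```

Note that `Thread.yield()` on a virtual thread is a scheduling point: it unmounts the virtual thread and lets the underlying scheduler pick another one, which is what keeps this busy-poll from starving the other IO loops sharing the pool.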