fix: separate infrastructure vs user code error handling#1339

Merged
TooTallNate merged 11 commits into main from fix/separate-infra-user-error-handling
Mar 13, 2026
Member

@TooTallNate TooTallNate commented Mar 12, 2026

Summary

Transient network errors (ECONNRESET, etc.) during infrastructure calls (event listing, event creation) were caught by a shared try/catch that also handles user code errors, incorrectly marking runs as run_failed or steps as step_failed instead of letting the queue redeliver the message for retry.

This PR structurally separates infrastructure calls from user code execution and removes redundant in-process retry wrappers, relying on undici's RetryAgent and queue-level redelivery for transient error recovery.

Changes

Structural error separation (runtime.ts, step-handler.ts)

  • Infrastructure calls (world.runs.get(), getAllWorkflowRunEvents(), world.events.create(), getEncryptionKeyForRun(), input hydration, result dehydration) are outside the user-code try/catch. Errors propagate to the queue handler for automatic retry.
  • Only runWorkflow() / stepFn.apply() (user code) is wrapped in the try/catch that produces run_failed / step_failed.
  • WorkflowRuntimeError in infrastructure setup (data integrity issues like missing startedAt) now produces run_failed / step_failed instead of retrying forever via the queue.
  • A safety net in the step handler re-throws WorkflowAPIError with a 5xx or 410 status from the user-code catch block, so infrastructure errors do not consume a step attempt.
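
The separation described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code from runtime.ts: `handleRunMessage`, the `Infra` interface, and the local `WorkflowRuntimeError` class are hypothetical stand-ins, and the real handler performs more infrastructure calls than shown here.

```typescript
// Hypothetical stand-in for the runtime error class named in the PR; only the
// name and the static is() guard matter for this sketch.
class WorkflowRuntimeError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "WorkflowRuntimeError";
  }
  static is(err: unknown): err is WorkflowRuntimeError {
    return err instanceof Error && err.name === "WorkflowRuntimeError";
  }
}

type Outcome =
  | { event: "run_completed"; result: unknown }
  | { event: "run_failed"; error: string };

// Assumed shape: loadRun stands in for world.runs.get(), recordEvent for
// world.events.create().
interface Infra {
  loadRun(): Promise<{ startedAt?: Date }>;
  recordEvent(outcome: Outcome): Promise<void>;
}

async function handleRunMessage(
  infra: Infra,
  userWorkflow: () => Promise<unknown>
): Promise<Outcome> {
  // Phase 1: infrastructure setup. Transient errors are re-thrown so the
  // queue can redeliver; only data-integrity errors become run_failed.
  try {
    const run = await infra.loadRun();
    if (!run.startedAt) {
      throw new WorkflowRuntimeError("run is missing startedAt");
    }
  } catch (err) {
    if (!WorkflowRuntimeError.is(err)) throw err;
    const failed: Outcome = { event: "run_failed", error: err.message };
    await infra.recordEvent(failed);
    return failed;
  }

  // Phase 2: user code. This is the only part wrapped in the try/catch that
  // turns an exception into run_failed.
  let outcome: Outcome;
  try {
    outcome = { event: "run_completed", result: await userWorkflow() };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    outcome = { event: "run_failed", error: message };
  }

  // Phase 3: infrastructure again, outside the user-code try/catch. A failure
  // here propagates to the queue handler instead of failing the run.
  await infra.recordEvent(outcome);
  return outcome;
}
```

With this shape, a connection reset thrown by `loadRun` or `recordEvent` rejects the handler's promise, which is what lets a queue treat the message as undelivered, while an exception thrown by `userWorkflow` is recorded as run_failed.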

Removed in-process retry wrappers (helpers.ts)

  • withServerErrorRetry removed — redundant with undici's RetryAgent which retries 5xx at the HTTP dispatcher level.
  • withThrottleRetry removed — redundant with undici's RetryAgent configured with retryAfter: true for 429 handling.
  • isTransientNetworkError removed — no longer needed without withServerErrorRetry.
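
To make the reasoning behind the removals concrete, here is a dependency-free model of the kind of decision a dispatcher-level retry layer makes. It is not undici's implementation or API: `RetryPolicy`, `nextRetryDelayMs`, the status list, and the backoff numbers are all assumptions chosen for illustration. Only the ideas of retrying 5xx responses and honoring Retry-After on 429 come from the description above.

```typescript
// Hypothetical policy shape; field names and values are illustrative only.
interface RetryPolicy {
  maxRetries: number;
  retryStatusCodes: number[];
  baseDelayMs: number;
  honorRetryAfter: boolean;
}

const policy: RetryPolicy = {
  maxRetries: 3,
  retryStatusCodes: [429, 500, 502, 503, 504],
  baseDelayMs: 500,
  honorRetryAfter: true,
};

// Returns the delay before the next attempt, or null when the response
// should be surfaced to the caller without another retry.
// `attempt` is the number of retries already made (0 for the first failure).
function nextRetryDelayMs(
  p: RetryPolicy,
  status: number,
  attempt: number,
  retryAfterSeconds?: number
): number | null {
  if (attempt >= p.maxRetries) return null;
  if (p.retryStatusCodes.indexOf(status) === -1) return null;
  if (p.honorRetryAfter && retryAfterSeconds !== undefined) {
    // The server told us how long to wait, so use that instead of backoff.
    return retryAfterSeconds * 1000;
  }
  // Exponential backoff: base, 2x base, 4x base, ...
  return p.baseDelayMs * Math.pow(2, attempt);
}

console.log(nextRetryDelayMs(policy, 503, 0)); // 500
console.log(nextRetryDelayMs(policy, 429, 1, 2)); // 2000
console.log(nextRetryDelayMs(policy, 404, 0)); // null
```

Because a layer like this already sits beneath every HTTP call, wrapping individual calls in a second in-process retry loop would retry the same failures twice, which is the redundancy the bullets above describe.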

Test updates

  • Removed serverError5xxRetryWorkflow e2e test and its fault injection helpers from workbench/example/workflows/99_e2e.ts — this test specifically validated withServerErrorRetry behavior.
  • Removed unit tests for withServerErrorRetry, isTransientNetworkError, and withThrottleRetry.

…nd step handler

Transient network errors (ECONNRESET, etc.) during infrastructure calls
(event listing, event creation) were caught by a shared try/catch that
also handles user code errors, incorrectly marking runs as run_failed
or steps as step_failed instead of letting the queue redeliver.

- runtime.ts: Move infrastructure calls outside the user-code try/catch
  so errors propagate to the queue handler for automatic retry
- step-handler.ts: Same structural separation — only stepFn.apply() is
  wrapped in the try/catch that produces step_failed/step_retrying
- helpers.ts: Add isTransientNetworkError() and update withServerErrorRetry
  to retry network errors in addition to 5xx responses
- helpers.test.ts: Add tests for network error detection and retry
@TooTallNate TooTallNate requested a review from a team as a code owner March 12, 2026 00:09
Copilot AI review requested due to automatic review settings March 12, 2026 00:09

changeset-bot bot commented Mar 12, 2026

🦋 Changeset detected

Latest commit: fa30093

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 16 packages
Name Type
@workflow/core Patch
@workflow/builders Patch
@workflow/cli Patch
@workflow/next Patch
@workflow/nitro Patch
@workflow/vitest Patch
@workflow/web-shared Patch
workflow Patch
@workflow/world-testing Patch
@workflow/astro Patch
@workflow/nest Patch
@workflow/rollup Patch
@workflow/sveltekit Patch
@workflow/vite Patch
@workflow/nuxt Patch
@workflow/ai Patch


Contributor

vercel bot commented Mar 12, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
example-nextjs-workflow-turbopack Ready Ready Preview, Comment Mar 13, 2026 0:13am
example-nextjs-workflow-webpack Ready Ready Preview, Comment Mar 13, 2026 0:13am
example-workflow Ready Ready Preview, Comment Mar 13, 2026 0:13am
workbench-astro-workflow Ready Ready Preview, Comment Mar 13, 2026 0:13am
workbench-express-workflow Ready Ready Preview, Comment Mar 13, 2026 0:13am
workbench-fastify-workflow Ready Ready Preview, Comment Mar 13, 2026 0:13am
workbench-hono-workflow Ready Ready Preview, Comment Mar 13, 2026 0:13am
workbench-nitro-workflow Building Building Preview, Comment Mar 13, 2026 0:13am
workbench-nuxt-workflow Ready Ready Preview, Comment Mar 13, 2026 0:13am
workbench-sveltekit-workflow Ready Ready Preview, Comment Mar 13, 2026 0:13am
workbench-vite-workflow Ready Ready Preview, Comment Mar 13, 2026 0:13am
workflow-nest Ready Ready Preview, Comment Mar 13, 2026 0:13am
workflow-swc-playground Ready Ready Preview, Comment Mar 13, 2026 0:13am
1 Skipped Deployment
Project Deployment Actions Updated (UTC)
workflow-docs Skipped Skipped Mar 13, 2026 0:13am

Contributor

github-actions bot commented Mar 12, 2026

📊 Benchmark Results

📈 Comparing against baseline from main branch. Green 🟢 = faster, Red 🔺 = slower.

workflow with no steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
💻 Local 🥇 Nitro 0.032s (~) 1.006s (~) 0.974s 10 1.00x
💻 Local Express 0.033s (+2.8%) 1.006s (~) 0.973s 10 1.02x
🐘 Postgres Express 0.052s (+4.8%) 1.011s (~) 0.959s 10 1.62x
🐘 Postgres Nitro 0.053s (+32.5% 🔺) 1.012s (~) 0.959s 10 1.64x
💻 Local Next.js (Turbopack) ⚠️ missing - - - -
🐘 Postgres Next.js (Turbopack) ⚠️ missing - - - -

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Next.js (Turbopack) 0.460s (-11.9% 🟢) 2.298s (-79.4% 🟢) 1.838s 10 1.00x
▲ Vercel Nitro 0.511s (-9.3% 🟢) 1.948s (-83.4% 🟢) 1.437s 10 1.11x
▲ Vercel Express 0.526s (+33.4% 🔺) 2.459s (-64.7% 🟢) 1.933s 10 1.14x

🔍 Observability: Next.js (Turbopack) | Nitro | Express

workflow with 1 step

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
💻 Local 🥇 Nitro 1.102s (~) 2.006s (~) 0.904s 10 1.00x
💻 Local Express 1.104s (~) 2.006s (~) 0.902s 10 1.00x
🐘 Postgres Express 1.120s (-1.1%) 2.011s (~) 0.891s 10 1.02x
🐘 Postgres Nitro 1.122s (+2.8%) 2.011s (~) 0.889s 10 1.02x
💻 Local Next.js (Turbopack) ⚠️ missing - - - -
🐘 Postgres Next.js (Turbopack) ⚠️ missing - - - -

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 1.943s (-6.5% 🟢) 3.073s (-61.2% 🟢) 1.129s 10 1.00x
▲ Vercel Next.js (Turbopack) 1.995s (-6.3% 🟢) 3.458s (-68.8% 🟢) 1.463s 10 1.03x
▲ Vercel Express 1.999s (-4.2%) 3.617s (-65.8% 🟢) 1.618s 10 1.03x

🔍 Observability: Nitro | Next.js (Turbopack) | Express

workflow with 10 sequential steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
💻 Local 🥇 Express 10.786s (~) 11.024s (~) 0.238s 3 1.00x
💻 Local Nitro 10.786s (~) 11.023s (~) 0.238s 3 1.00x
🐘 Postgres Express 10.808s (-0.9%) 11.042s (~) 0.234s 3 1.00x
🐘 Postgres Nitro 10.843s (+2.4%) 11.039s (~) 0.196s 3 1.01x
💻 Local Next.js (Turbopack) ⚠️ missing - - - -
🐘 Postgres Next.js (Turbopack) ⚠️ missing - - - -

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 16.329s (-5.1% 🟢) 17.395s (-42.8% 🟢) 1.066s 2 1.00x
▲ Vercel Next.js (Turbopack) 16.883s (-1.7%) 18.452s (-54.9% 🟢) 1.569s 2 1.03x
▲ Vercel Express 17.239s (+3.1%) 19.209s (-54.2% 🟢) 1.970s 2 1.06x

🔍 Observability: Nitro | Next.js (Turbopack) | Express

workflow with 25 sequential steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Express 26.910s (-0.6%) 27.056s (-2.4%) 0.146s 3 1.00x
🐘 Postgres Nitro 26.979s (+2.3%) 27.390s (+1.2%) 0.411s 3 1.00x
💻 Local Nitro 27.200s (~) 28.052s (~) 0.852s 3 1.01x
💻 Local Express 27.208s (~) 28.053s (~) 0.844s 3 1.01x
💻 Local Next.js (Turbopack) ⚠️ missing - - - -
🐘 Postgres Next.js (Turbopack) ⚠️ missing - - - -

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Express 43.069s (+0.5%) 44.580s (-26.7% 🟢) 1.511s 2 1.00x
▲ Vercel Next.js (Turbopack) 43.089s (-2.9%) 44.570s (-26.4% 🟢) 1.481s 2 1.00x
▲ Vercel Nitro 43.816s (-9.5% 🟢) 45.081s (-25.5% 🟢) 1.264s 2 1.02x

🔍 Observability: Express | Next.js (Turbopack) | Nitro

workflow with 50 sequential steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Express 53.803s (~) 54.091s (~) 0.288s 2 1.00x
🐘 Postgres Nitro 53.824s (+2.1%) 54.092s (+1.9%) 0.268s 2 1.00x
💻 Local Nitro 56.010s (~) 56.101s (-1.8%) 0.091s 2 1.04x
💻 Local Express 56.072s (~) 56.101s (-1.8%) 0.029s 2 1.04x
💻 Local Next.js (Turbopack) ⚠️ missing - - - -
🐘 Postgres Next.js (Turbopack) ⚠️ missing - - - -

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 94.768s (-0.7%) 95.718s (-20.7% 🟢) 0.950s 1 1.00x
▲ Vercel Next.js (Turbopack) 95.520s (-5.7% 🟢) 97.016s (-19.0% 🟢) 1.496s 1 1.01x
▲ Vercel Express 98.340s (+4.0%) 99.659s (-17.4% 🟢) 1.319s 1 1.04x

🔍 Observability: Nitro | Next.js (Turbopack) | Express

Promise.all with 10 concurrent steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Nitro 1.326s (+4.7%) 2.010s (~) 0.684s 15 1.00x
🐘 Postgres Express 1.348s (~) 2.010s (~) 0.662s 15 1.02x
💻 Local Nitro 1.410s (-1.0%) 2.005s (~) 0.594s 15 1.06x
💻 Local Express 1.412s (-1.1%) 2.006s (~) 0.594s 15 1.06x
💻 Local Next.js (Turbopack) ⚠️ missing - - - -
🐘 Postgres Next.js (Turbopack) ⚠️ missing - - - -

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Express 2.363s (+0.6%) 3.795s (-43.9% 🟢) 1.432s 8 1.00x
▲ Vercel Nitro 2.589s (+3.6%) 3.561s (-55.4% 🟢) 0.972s 9 1.10x
▲ Vercel Next.js (Turbopack) 2.914s (+22.4% 🔺) 4.133s (-63.9% 🟢) 1.219s 8 1.23x

🔍 Observability: Express | Nitro | Next.js (Turbopack)

Promise.all with 25 concurrent steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Express 2.439s (-1.0%) 3.012s (~) 0.574s 10 1.00x
🐘 Postgres Nitro 2.483s (+3.3%) 3.013s (~) 0.530s 10 1.02x
💻 Local Nitro 2.558s (-2.1%) 3.007s (~) 0.450s 10 1.05x
💻 Local Express 2.619s (-1.1%) 3.008s (~) 0.389s 10 1.07x
💻 Local Next.js (Turbopack) ⚠️ missing - - - -
🐘 Postgres Next.js (Turbopack) ⚠️ missing - - - -

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 2.449s (-20.9% 🟢) 3.624s (-72.4% 🟢) 1.175s 9 1.00x
▲ Vercel Express 2.946s (+1.6%) 4.317s (-60.7% 🟢) 1.371s 7 1.20x
▲ Vercel Next.js (Turbopack) 3.023s (+7.9% 🔺) 4.358s (-45.3% 🟢) 1.335s 7 1.23x

🔍 Observability: Nitro | Express | Next.js (Turbopack)

Promise.all with 50 concurrent steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Express 3.916s (-2.7%) 4.302s (-6.3% 🟢) 0.385s 7 1.00x
🐘 Postgres Nitro 3.927s (+9.3% 🔺) 4.590s (+14.3% 🔺) 0.663s 7 1.00x
💻 Local Nitro 7.248s (-6.5% 🟢) 8.019s (~) 0.770s 4 1.85x
💻 Local Express 7.567s (~) 8.020s (~) 0.453s 4 1.93x
💻 Local Next.js (Turbopack) ⚠️ missing - - - -
🐘 Postgres Next.js (Turbopack) ⚠️ missing - - - -

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Next.js (Turbopack) 3.139s (+9.4% 🔺) 4.624s (+13.6% 🔺) 1.485s 7 1.00x
▲ Vercel Nitro 3.178s (+11.1% 🔺) 4.356s (-60.2% 🟢) 1.178s 7 1.01x
▲ Vercel Express 3.336s (+18.3% 🔺) 4.892s (-56.9% 🟢) 1.556s 7 1.06x

🔍 Observability: Next.js (Turbopack) | Nitro | Express

Promise.race with 10 concurrent steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Nitro 1.339s (+6.7% 🔺) 2.010s (~) 0.671s 15 1.00x
🐘 Postgres Express 1.369s (+1.5%) 2.010s (~) 0.641s 15 1.02x
💻 Local Nitro 1.418s (-3.3%) 2.004s (~) 0.586s 15 1.06x
💻 Local Express 1.462s (+1.1%) 2.005s (~) 0.543s 15 1.09x
💻 Local Next.js (Turbopack) ⚠️ missing - - - -
🐘 Postgres Next.js (Turbopack) ⚠️ missing - - - -

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Next.js (Turbopack) 2.084s (-11.7% 🟢) 3.506s (+1.0%) 1.422s 9 1.00x
▲ Vercel Express 2.140s (-2.8%) 3.610s (-55.0% 🟢) 1.470s 9 1.03x
▲ Vercel Nitro 2.172s (-2.8%) 3.223s (-65.9% 🟢) 1.051s 10 1.04x

🔍 Observability: Next.js (Turbopack) | Express | Nitro

Promise.race with 25 concurrent steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Nitro 2.447s (+0.9%) 3.012s (~) 0.565s 10 1.00x
🐘 Postgres Express 2.465s (~) 3.011s (~) 0.546s 10 1.01x
💻 Local Nitro 2.669s (-3.0%) 3.007s (~) 0.338s 10 1.09x
💻 Local Express 2.710s (-2.2%) 3.008s (~) 0.298s 10 1.11x
💻 Local Next.js (Turbopack) ⚠️ missing - - - -
🐘 Postgres Next.js (Turbopack) ⚠️ missing - - - -

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 2.152s (-16.2% 🟢) 3.111s (-66.2% 🟢) 0.958s 10 1.00x
▲ Vercel Express 2.535s (+3.0%) 3.938s (-49.8% 🟢) 1.403s 9 1.18x
▲ Vercel Next.js (Turbopack) 2.706s (+10.0% 🔺) 3.851s (-48.8% 🟢) 1.145s 8 1.26x

🔍 Observability: Nitro | Express | Next.js (Turbopack)

Promise.race with 50 concurrent steps

💻 Local Development

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
🐘 Postgres 🥇 Nitro 3.938s (+6.2% 🔺) 4.303s (+4.0%) 0.365s 7 1.00x
🐘 Postgres Express 3.939s (-2.3%) 4.445s (-6.1% 🟢) 0.506s 7 1.00x
💻 Local Nitro 7.687s (-7.9% 🟢) 8.019s (-11.2% 🟢) 0.332s 4 1.95x
💻 Local Express 8.146s (-0.9%) 8.772s (-2.8%) 0.626s 4 2.07x
💻 Local Next.js (Turbopack) ⚠️ missing - - - -
🐘 Postgres Next.js (Turbopack) ⚠️ missing - - - -

▲ Production (Vercel)

World Framework Workflow Time Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Nitro 2.747s (-14.6% 🟢) 3.803s (-59.2% 🟢) 1.055s 8 1.00x
▲ Vercel Express 2.924s (+4.2%) 4.162s (-47.6% 🟢) 1.238s 8 1.06x
▲ Vercel Next.js (Turbopack) 3.057s (+12.4% 🔺) 4.340s (-30.9% 🟢) 1.282s 7 1.11x

🔍 Observability: Nitro | Express | Next.js (Turbopack)

Stream Benchmarks (includes TTFB metrics)
workflow with stream

💻 Local Development

World Framework Workflow Time TTFB Slurp Wall Time Overhead Samples vs Fastest
💻 Local 🥇 Nitro 0.168s (-1.6%) 1.002s (~) 0.012s (+5.5% 🔺) 1.017s (~) 0.849s 10 1.00x
💻 Local Express 0.170s (~) 1.003s (~) 0.012s (+4.4%) 1.017s (~) 0.847s 10 1.02x
🐘 Postgres Nitro 0.200s (+51.7% 🔺) 0.991s (-0.9%) 0.001s (-7.7% 🟢) 1.012s (~) 0.812s 10 1.19x
🐘 Postgres Express 0.201s (+0.9%) 0.992s (-0.5%) 0.002s (+23.1% 🔺) 1.012s (~) 0.811s 10 1.20x
💻 Local Next.js (Turbopack) ⚠️ missing - - - - -
🐘 Postgres Next.js (Turbopack) ⚠️ missing - - - - -

▲ Production (Vercel)

World Framework Workflow Time TTFB Slurp Wall Time Overhead Samples vs Fastest
▲ Vercel 🥇 Next.js (Turbopack) 1.527s (-6.5% 🟢) 2.511s (-59.8% 🟢) 0.006s (+7.4% 🔺) 3.024s (-54.9% 🟢) 1.497s 10 1.00x
▲ Vercel Nitro 1.528s (-15.1% 🟢) 2.215s (-66.0% 🟢) 0.005s (+10.9% 🔺) 2.665s (-63.2% 🟢) 1.137s 10 1.00x
▲ Vercel Express 1.599s (-7.0% 🟢) 2.623s (-60.3% 🟢) 0.006s (+7.4% 🔺) 3.214s (-55.8% 🟢) 1.615s 10 1.05x

🔍 Observability: Next.js (Turbopack) | Nitro | Express

Summary

Fastest Framework by World

Winner determined by most benchmark wins

World 🥇 Fastest Framework Wins
💻 Local Nitro 11/12
🐘 Postgres Express 7/12
▲ Vercel Nitro 6/12
Fastest World by Framework

Winner determined by most benchmark wins

Framework 🥇 Fastest World Wins
Express 🐘 Postgres 6/12
Next.js (Turbopack) ▲ Vercel 12/12
Nitro 💻 Local 4/12
Column Definitions
  • Workflow Time: Runtime reported by workflow (completedAt - createdAt) - primary metric
  • TTFB: Time to First Byte - time from workflow start until first stream byte received (stream benchmarks only)
  • Slurp: Time from first byte to complete stream consumption (stream benchmarks only)
  • Wall Time: Total testbench time (trigger workflow + poll for result)
  • Overhead: Testbench overhead (Wall Time - Workflow Time)
  • Samples: Number of benchmark iterations run
  • vs Fastest: How much slower compared to the fastest configuration for this benchmark
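
As a worked example of the two derived columns, the sketch below recomputes Overhead and vs Fastest for the three Vercel rows of the "workflow with no steps" benchmark, using the rounded values shown in the table. The `Row` type and `derive` function are hypothetical; the benchmark tool itself presumably works from unrounded samples, so other rows may differ in the last digit.

```typescript
interface Row {
  framework: string;
  workflowTimeS: number; // "Workflow Time" column, in seconds
  wallTimeS: number; // "Wall Time" column, in seconds
}

// Values copied from the Production (Vercel) table for "workflow with no steps".
const rows: Row[] = [
  { framework: "Next.js (Turbopack)", workflowTimeS: 0.46, wallTimeS: 2.298 },
  { framework: "Nitro", workflowTimeS: 0.511, wallTimeS: 1.948 },
  { framework: "Express", workflowTimeS: 0.526, wallTimeS: 2.459 },
];

function derive(input: Row[]) {
  // The fastest configuration is the one with the lowest Workflow Time.
  const fastest = input.reduce(
    (min, r) => Math.min(min, r.workflowTimeS),
    Infinity
  );
  return input.map((r) => ({
    framework: r.framework,
    // Overhead = Wall Time - Workflow Time
    overhead: (r.wallTimeS - r.workflowTimeS).toFixed(3) + "s",
    // vs Fastest = Workflow Time / fastest Workflow Time
    vsFastest: (r.workflowTimeS / fastest).toFixed(2) + "x",
  }));
}

for (const d of derive(rows)) {
  console.log(d.framework, d.overhead, d.vsFastest);
}
// Next.js (Turbopack) 1.838s 1.00x
// Nitro 1.437s 1.11x
// Express 1.933s 1.14x
```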

Worlds:

  • 💻 Local: In-memory filesystem world (local development)
  • 🐘 Postgres: PostgreSQL database world (local development)
  • ▲ Vercel: Vercel production/preview deployment
  • 🌐 Turso: Community world (local development)
  • 🌐 MongoDB: Community world (local development)
  • 🌐 Redis: Community world (local development)
  • 🌐 Jazz: Community world (local development)

📋 View full workflow run

Contributor

github-actions bot commented Mar 12, 2026

🧪 E2E Test Results

Some tests failed

Summary

Passed Failed Skipped Total
✅ ▲ Vercel Production 560 0 67 627
✅ 💻 Local Development 600 0 84 684
✅ 📦 Local Production 600 0 84 684
✅ 🐘 Local Postgres 600 0 84 684
✅ 🪟 Windows 54 0 3 57
❌ 🌍 Community Worlds 116 55 15 186
✅ 📋 Other 144 0 27 171
Total 2674 55 364 3093

❌ Failed Tests

🌍 Community Worlds (55 failed)

mongodb (3 failed):

  • hookWorkflow is not resumable via public webhook endpoint
  • webhookWorkflow
  • concurrent hook token conflict - two workflows cannot use the same hook token simultaneously

redis (2 failed):

  • hookWorkflow is not resumable via public webhook endpoint
  • concurrent hook token conflict - two workflows cannot use the same hook token simultaneously

turso (50 failed):

  • addTenWorkflow
  • addTenWorkflow
  • wellKnownAgentWorkflow (.well-known/agent)
  • should work with react rendering in step
  • promiseAllWorkflow
  • promiseRaceWorkflow
  • promiseAnyWorkflow
  • importedStepOnlyWorkflow
  • hookWorkflow
  • hookWorkflow is not resumable via public webhook endpoint
  • webhookWorkflow
  • sleepingWorkflow
  • parallelSleepWorkflow
  • nullByteWorkflow
  • workflowAndStepMetadataWorkflow
  • fetchWorkflow
  • promiseRaceStressTestWorkflow
  • error handling error propagation workflow errors nested function calls preserve message and stack trace
  • error handling error propagation workflow errors cross-file imports preserve message and stack trace
  • error handling error propagation step errors basic step error preserves message and stack trace
  • error handling error propagation step errors cross-file step error preserves message and function names in stack
  • error handling retry behavior regular Error retries until success
  • error handling retry behavior FatalError fails immediately without retries
  • error handling retry behavior RetryableError respects custom retryAfter delay
  • error handling retry behavior maxRetries=0 disables retries
  • error handling catchability FatalError can be caught and detected with FatalError.is()
  • hookCleanupTestWorkflow - hook token reuse after workflow completion
  • concurrent hook token conflict - two workflows cannot use the same hook token simultaneously
  • hookDisposeTestWorkflow - hook token reuse after explicit disposal while workflow still running
  • stepFunctionPassingWorkflow - step function references can be passed as arguments (without closure vars)
  • stepFunctionWithClosureWorkflow - step function with closure variables passed as argument
  • closureVariableWorkflow - nested step functions with closure variables
  • spawnWorkflowFromStepWorkflow - spawning a child workflow using start() inside a step
  • health check (queue-based) - workflow and step endpoints respond to health check messages
  • pathsAliasWorkflow - TypeScript path aliases resolve correctly
  • Calculator.calculate - static workflow method using static step methods from another class
  • AllInOneService.processNumber - static workflow method using sibling static step methods
  • ChainableService.processWithThis - static step methods using this to reference the class
  • thisSerializationWorkflow - step function invoked with .call() and .apply()
  • customSerializationWorkflow - custom class serialization with WORKFLOW_SERIALIZE/WORKFLOW_DESERIALIZE
  • instanceMethodStepWorkflow - instance methods with "use step" directive
  • crossContextSerdeWorkflow - classes defined in step code are deserializable in workflow context
  • stepFunctionAsStartArgWorkflow - step function reference passed as start() argument
  • cancelRun - cancelling a running workflow
  • cancelRun via CLI - cancelling a running workflow
  • pages router addTenWorkflow via pages router
  • pages router promiseAllWorkflow via pages router
  • pages router sleepingWorkflow via pages router
  • hookWithSleepWorkflow - hook payloads delivered correctly with concurrent sleep
  • sleepWithSequentialStepsWorkflow - sequential steps work with concurrent sleep (control)

Details by Category

✅ ▲ Vercel Production
App Passed Failed Skipped
✅ astro 50 0 7
✅ example 50 0 7
✅ express 50 0 7
✅ fastify 50 0 7
✅ hono 50 0 7
✅ nextjs-turbopack 55 0 2
✅ nextjs-webpack 55 0 2
✅ nitro 50 0 7
✅ nuxt 50 0 7
✅ sveltekit 50 0 7
✅ vite 50 0 7
✅ 💻 Local Development
App Passed Failed Skipped
✅ astro-stable 48 0 9
✅ express-stable 48 0 9
✅ fastify-stable 48 0 9
✅ hono-stable 48 0 9
✅ nextjs-turbopack-canary 54 0 3
✅ nextjs-turbopack-stable 54 0 3
✅ nextjs-webpack-canary 54 0 3
✅ nextjs-webpack-stable 54 0 3
✅ nitro-stable 48 0 9
✅ nuxt-stable 48 0 9
✅ sveltekit-stable 48 0 9
✅ vite-stable 48 0 9
✅ 📦 Local Production
App Passed Failed Skipped
✅ astro-stable 48 0 9
✅ express-stable 48 0 9
✅ fastify-stable 48 0 9
✅ hono-stable 48 0 9
✅ nextjs-turbopack-canary 54 0 3
✅ nextjs-turbopack-stable 54 0 3
✅ nextjs-webpack-canary 54 0 3
✅ nextjs-webpack-stable 54 0 3
✅ nitro-stable 48 0 9
✅ nuxt-stable 48 0 9
✅ sveltekit-stable 48 0 9
✅ vite-stable 48 0 9
✅ 🐘 Local Postgres
App Passed Failed Skipped
✅ astro-stable 48 0 9
✅ express-stable 48 0 9
✅ fastify-stable 48 0 9
✅ hono-stable 48 0 9
✅ nextjs-turbopack-canary 54 0 3
✅ nextjs-turbopack-stable 54 0 3
✅ nextjs-webpack-canary 54 0 3
✅ nextjs-webpack-stable 54 0 3
✅ nitro-stable 48 0 9
✅ nuxt-stable 48 0 9
✅ sveltekit-stable 48 0 9
✅ vite-stable 48 0 9
✅ 🪟 Windows
App Passed Failed Skipped
✅ nextjs-turbopack 54 0 3
❌ 🌍 Community Worlds
App Passed Failed Skipped
✅ mongodb-dev 3 0 2
❌ mongodb 51 3 3
✅ redis-dev 3 0 2
❌ redis 52 2 3
✅ turso-dev 3 0 2
❌ turso 4 50 3
✅ 📋 Other
App Passed Failed Skipped
✅ e2e-local-dev-nest-stable 48 0 9
✅ e2e-local-postgres-nest-stable 48 0 9
✅ e2e-local-prod-nest-stable 48 0 9

📋 View full workflow run

Contributor

Copilot AI left a comment

Pull request overview

This PR refactors the workflow and step runtimes to more explicitly separate “infrastructure” operations (hydration/serialization and event writes) from user code execution, and extends server-retry logic to include transient network failures so backend/network flakiness can be retried instead of incorrectly failing runs/steps.

Changes:

  • Refactors step execution flow to split input hydration, user step execution, and step completion into distinct phases.
  • Refactors workflow entrypoint flow to split run preparation, workflow replay execution, and run completion into distinct phases.
  • Adds transient network error detection and retries to withServerErrorRetry, plus corresponding unit tests.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
packages/core/src/runtime/step-handler.ts Restructures step execution phases and error handling boundaries.
packages/core/src/runtime/helpers.ts Adds transient network error detection + expands withServerErrorRetry retry conditions.
packages/core/src/runtime/helpers.test.ts Adds tests covering network error retry behavior and isTransientNetworkError.
packages/core/src/runtime.ts Restructures workflow execution phases and moves run completion outside user-code try/catch.
.changeset/fix-infra-error-handling.md Records patch release note for infra vs user error handling changes.

Contributor

@vercel vercel bot left a comment

Additional Suggestion:

Infrastructure call getEncryptionKeyForRun is inside the user-code try/catch block, causing transient network errors to be misclassified as user code failures (run_failed).

Redundant with undici RetryAgent which already handles 5xx retries
and network error retries at the HTTP dispatcher level.
Member

@VaguelySerious VaguelySerious left a comment

LGTM overall, though there are a few cases I think we should handle differently (if what Claude says applies)

… as fatal, deduplicate check

- Remove withThrottleRetry (undici RetryAgent handles 429s)
- WorkflowRuntimeError in infrastructure setup now produces run_failed/step_failed
  instead of retrying forever via queue
- Deduplicate redundant WorkflowAPIError.is check in step-handler.ts
Member

@VaguelySerious VaguelySerious left a comment

LGTM assuming e2e tests pass

5 participants