Skip to content

Intercept sigaction to prevent libraries from overwriting signal handlers#420

Merged
jbachorik merged 2 commits intomainfrom
jb/wasmtime_workaround
Mar 18, 2026
Merged

Intercept sigaction to prevent libraries from overwriting signal handlers#420
jbachorik merged 2 commits intomainfrom
jb/wasmtime_workaround

Conversation

@jbachorik
Copy link
Collaborator

@jbachorik jbachorik commented Mar 17, 2026

Summary

  • Intercept sigaction() calls via GOT patching to prevent other libraries from overwriting our SIGSEGV/SIGBUS handlers
  • Store intercepted handlers as chain targets and call them from our handlers
  • Protects against libraries like wasmtime that install broken signal handlers calling malloc() (not async-signal-safe)

Problem

Wasmtime's SIGSEGV handler calls __tls_get_addr which can call malloc(), violating async-signal-safety. When the profiler uses safefetch, this causes deadlocks:

  1. Application code holds malloc's lock
  2. Profiler's signal handler runs and calls safefetch
  3. safefetch triggers SIGSEGV
  4. Wasmtime's handler runs and calls malloc() → deadlock

(see bytecodealliance/wasmtime#12787 wasmtime issue)

Solution

  • Patch sigaction GOT entry in all loaded libraries
  • Intercept SIGSEGV/SIGBUS handler installations (SA_SIGINFO only)
  • Keep our handler on top, store their handler as chain target
  • Call chain target from our handler after our logic completes

Test plan

  • Verify profiler works with wasmtime-based applications without deadlocking
  • Verify signal handler chaining works correctly
  • Run existing test suite

🤖 Generated with Claude Code

@jbachorik jbachorik added the AI label Mar 17, 2026
@jbachorik jbachorik force-pushed the jb/wasmtime_workaround branch from 15846f8 to 379b35e Compare March 17, 2026 13:47
Wasmtime's SIGSEGV handler calls malloc() via __tls_get_addr, which is
not async-signal-safe and causes deadlocks when profiler uses safefetch.

Patch wasmtime's sigaction GOT entry to intercept handler installations.
Their handlers are stored as chain targets and called from our handlers.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@jbachorik jbachorik force-pushed the jb/wasmtime_workaround branch from 379b35e to 2fcf604 Compare March 17, 2026 13:56
@jbachorik jbachorik marked this pull request as ready for review March 17, 2026 13:58
@jbachorik jbachorik requested a review from a team as a code owner March 17, 2026 13:58
@pr-commenter
Copy link

pr-commenter bot commented Mar 17, 2026

Integration Tests

All 40 integration tests passed

📊 Dashboard · 👷 Pipeline · 📦 e35fff97

@dd-octo-sts
Copy link

dd-octo-sts bot commented Mar 17, 2026

CI Test Results

Run: #23245408117 | Commit: ecadf06 | Duration: 10m 21s (longest job)

All 32 test jobs passed

Status Overview

JDK glibc-aarch64/debug glibc-amd64/debug musl-aarch64/debug musl-amd64/debug
8 - - -
8-ibm - - -
8-j9 - -
8-librca - -
8-orcl - - -
11 - - -
11-j9 - -
11-librca - -
17 - -
17-graal - -
17-j9 - -
17-librca - -
21 - -
21-graal - -
21-librca - -
25 - -
25-graal - -
25-librca - -

Legend: ✅ passed | ❌ failed | ⚪ skipped | 🚫 cancelled

Summary: Total: 32 | Passed: 32 | Failed: 0


Updated: 2026-03-18 13:02:32 UTC

@jbachorik jbachorik requested review from rkennke and zhengyu123 March 18, 2026 07:21
Copy link
Contributor

@rkennke rkennke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall (well as good as patching-out call addresses can look, I suppose), I just have two questions.

}

// Check if already patched or array is full
if (_sigaction_size >= MAX_NATIVE_LIBS) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what happens if the array is full? We silently ignore it and don't patch? Isn't this asking for weird failures that would be hard to notice?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we would risk those libraries to be able to mess up our safefetch processing. The truth is, vast majority of the libraries are ok.
Also, the whole codecache infra will work only with at most MAX_NATIVE_LIBS libraries, so we will work in a weird mode anyway.
We can add a warning or a counter to indicate the overflow, though.

return __atomic_load_n(&_bus_chain_target, __ATOMIC_ACQUIRE);
}

// sigaction hook - called via GOT patching to intercept sigaction calls
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does GOT stand for?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@rkennke rkennke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks for the explanations. Feel free to add a warning, if you like.

- Add NATIVE_LIBS_DROPPED counter and LOG_WARN macro
- CodeCacheArray::add() returns bool, logs once on overflow
- Fix memory leak: delete CodeCache on failed add
- Add missing macOS stubs for sigaction protection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jbachorik
Copy link
Collaborator Author

Yes, added warning and also a counter that would tell us how many libraries we missed due to overflow.

@jbachorik jbachorik merged commit 0ec65ff into main Mar 18, 2026
83 checks passed
@jbachorik jbachorik deleted the jb/wasmtime_workaround branch March 18, 2026 18:35
@github-actions github-actions bot added this to the 1.40.0 milestone Mar 18, 2026
@jbachorik
Copy link
Collaborator Author

Backported to release/1.39._ via #422

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants