[CoreCLR][Signal] Bump shutdown notif and crashdump before prev handler #123735

mdh1418 · 2026-01-28T23:12:22Z

The previously registered signal action/handler aren't guaranteed to return, so we lose out on notifying shutdown and creating a dump in those cases. Specifically, PROCCreateCrashDumpIfEnabled would be the last chance to provide the managed context for the thread that crashed.

e.g. On Android CoreCLR, it seems that, by default, signal handlers are already registered by Android's runtime (/apex/com.android.runtime/bin/linker64 + /system/lib64/libandroid_runtime.so). Whenever an unhandled synchronous fault occurs, the previously registered handler will not return back to invoke_previous_action and aborts the thread itself, so PROCCreateCrashDumpIfEnabled will not be hit.

Sigsegv behavior Android CoreCLR vs other platforms

Android CoreCLR

When intentionally writing to NULL (sigsegv) on Android CoreCLR, the previously registered signal handler goes down this path

runtime/src/coreclr/pal/src/exception/signal.cpp

Line 454 in 40e8c73

action->sa_sigaction(code, siginfo, context);

, and the thread aborts before hitting PROCNotifyProcessShutdown and PROCCreateCrashDumpIfEnabled.

MacOS/Linux/NativeAOT(linux)

On MacOS, Linux, NativeAOT (Only checked linux at time of writing), the same intentional SIGSEGV will hit

runtime/src/coreclr/pal/src/exception/signal.cpp

Lines 431 to 448 in 40e8c73

    
           if (signalRestarts) 
        
           { 
        
               // Shutdown and create the core dump before we restore the signal to the default handler. 
        
               PROCNotifyProcessShutdown(IsRunningOnAlternateStack(context)); 
        
               PROCCreateCrashDumpIfEnabled(code, siginfo, context, true); 
        
               // Restore the original and restart h/w exception. 
        
               restore_signal(code, action); 
        
               return; 
        
           } 
        
           else 
        
           { 
        
               // We can't invoke the original handler because returning from the 
        
               // handler doesn't restart the exception. 
        
               PROCAbort(code, siginfo, context); 
        
           }

instead because there is no previously registered signal handler. In those cases, PROCCreateCrashDumpIfEnabled is hit and managed callstacks are captured in the dump.

History investigation

From a github history dive, I didn't spot anything in particular requiring the previous signal handler to be invoked before PROCNotifyProcessShutdown + PROCCreateCrashDumpIfEnabled.

PROCNotifyProcessShutdown was first introduced in 1433c3f. It doesn't seem to state a particular reason for invoking it after the previous signal handler.

PROCCreateCrashDumpIfEnabled was added to signal.cpp in 7f9bd2c because the PROCNotifyProcessShutdown didn't create a crash dump. It doesn't state any particular reason for being invoked after the previously registered signal handler, and was probably just placed next to PROCNotifyProcessShutdown.

invoke_previous_action was introduced in a740f65 and was refactoring while maintaining the order.

Android CoreCLR behavior after swapping order

Locally, I have POC changes to emit managed callstacks in Android's PROCCreateCrashDumpIfEnabled.

01-28 17:26:40.951  2416  2440 F DOTNET  : Native crash detected; attempting managed stack trace.
01-28 17:26:40.951  2416  2440 F DOTNET  : {"stack":[
01-28 17:26:40.951  2416  2440 F DOTNET  : {"ip":"0x0","module":"0x0","offset":"0x0","name":"Program.MemSet(Void*, Int32, UIntPtr)"},
01-28 17:26:40.951  2416  2440 F DOTNET  : {"ip":"0x78d981145973","module":"0x0","offset":"0x0","name":"Program.MemSet(Void*, Int32, UIntPtr)"},
01-28 17:26:40.951  2416  2440 F DOTNET  : {"ip":"0x78d981145973","module":"0x0","offset":"0x73","name":"Program.ForceNativeSegv()"},
01-28 17:26:40.951  2416  2440 F DOTNET  : {"ip":"0x78d981141b60","module":"0x0","offset":"0x70","name":"Program.Main(System.String[])"}
01-28 17:26:40.951  2416  2440 F DOTNET  : ]}
01-28 17:26:40.952  2416  2440 F DOTNET  : Crash dump hook completed.
--------- beginning of crash
01-28 17:26:40.952  2416  2440 F libc    : Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0 in tid 2440 (.dot.MonoRunner), pid 2416 (ulator.JIT.Test)
.....
01-28 17:26:46.882  2921  2921 F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
01-28 17:26:46.882  2921  2921 F DEBUG   : Build fingerprint: 'google/sdk_gphone64_x86_64/emu64xa:16/BE2A.250530.026.D1/13818094:user/release-keys'
01-28 17:26:46.882  2921  2921 F DEBUG   : Revision: '0'
01-28 17:26:46.882  2921  2921 F DEBUG   : ABI: 'x86_64'
01-28 17:26:46.882  2921  2921 F DEBUG   : Timestamp: 2026-01-28 17:26:41.492831700-0500
01-28 17:26:46.882  2921  2921 F DEBUG   : Process uptime: 20s
01-28 17:26:46.883  2921  2921 F DEBUG   : Cmdline: net.dot.Android.Device_Emulator.JIT.Test
01-28 17:26:46.883  2921  2921 F DEBUG   : pid: 2416, tid: 2440, name: .dot.MonoRunner  >>> net.dot.Android.Device_Emulator.JIT.Test <<<
01-28 17:26:46.883  2921  2921 F DEBUG   : uid: 10219
01-28 17:26:46.883  2921  2921 F DEBUG   : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0000000000000000
01-28 17:26:46.883  2921  2921 F DEBUG   : Cause: null pointer dereference
01-28 17:26:46.883  2921  2921 F DEBUG   : Abort message: 'CoreCLR: previous handler for '
01-28 17:26:46.883  2921  2921 F DEBUG   :     rax 0000000000000000  rbx 000078da87ffade0  rcx 0000000000000000  rdx 0000000000000001
01-28 17:26:46.884  1237  1297 I s.nexuslauncher: AssetManager2(0x78dd08cd9178) locale list changing from [] to [en-US]
01-28 17:26:46.903  2447  2594 I BugleNotifications: Creating notification input ids [CONTEXT im_entry_input="" im_notification_input="" im_settings_store_input="" im_final_input="" ]
01-28 17:26:46.905  2921  2921 F DEBUG   :     r8  00007ffcde5a8080  r9  34d9bb0e67871eb0  r10 000078ddb4111870  r11 0000000000000293
01-28 17:26:46.906  2921  2921 F DEBUG   :     r12 0000000000000001  r13 000078da87ffafa0  r14 0000000000000000  r15 000078da87ffaf18
01-28 17:26:46.906  2921  2921 F DEBUG   :     rdi 0000000000000000  rsi 0000000000000000
01-28 17:26:46.906  2921  2921 F DEBUG   :     rbp 000078da87ffac40  rsp 000078da87ffabc8  rip 000078ddb41118a2
01-28 17:26:46.906  2921  2921 F DEBUG   : 2 total frames
01-28 17:26:46.906  2921  2921 F DEBUG   : backtrace:
01-28 17:26:46.906  2921  2921 F DEBUG   :       #00 pc 000000000008f8a2  /apex/com.android.runtime/lib64/bionic/libc.so (memset_avx2+50) (BuildId: fcb82240218d1473de1e3d2137c0be35)
01-28 17:26:46.906  2921  2921 F DEBUG   :       #01 pc 0000000000049972  /memfd:doublemapper (deleted) (offset 0x111000)

Now theres a window to log managed callstacks before the original signal handler aborts and triggers a tombstone.

Android Mono behavior

Mono provides two embeddings APIs to configure signal and crash chaining

runtime/src/mono/mono/mini/driver.c

Lines 2864 to 2894 in 61d3943

    
           /** 
        
            * mono_set_signal_chaining: 
        
            * 
        
            * Enable/disable signal chaining. This should be called before \c mono_jit_init. 
        
            * If signal chaining is enabled, the runtime saves the original signal handlers before 
        
            * installing its own handlers, and calls the original ones in the following cases: 
        
            * - a \c SIGSEGV / \c SIGABRT signal received while executing native (i.e. not JITted) code. 
        
            * - \c SIGPROF 
        
            * - \c SIGFPE 
        
            * - \c SIGQUIT 
        
            * - \c SIGUSR2 
        
            * Signal chaining only works on POSIX platforms. 
        
            */ 
        
           void 
        
           mono_set_signal_chaining (gboolean chain_signals) 
        
           { 
        
           	mono_do_signal_chaining = chain_signals; 
        
           } 
        
           /** 
        
            * mono_set_crash_chaining: 
        
            * 
        
            * Enable/disable crash chaining due to signals. When a fatal signal is delivered and 
        
            * Mono doesn't know how to handle it, it will invoke the crash handler. If chrash chaining 
        
            * is enabled, it will first print its crash information and then try to chain with the native handler. 
        
            */ 
        
           void 
        
           mono_set_crash_chaining (gboolean chain_crashes) 
        
           { 
        
           	mono_do_crash_chaining = chain_crashes; 
        
           }

that determine whether synchronous faults would chain

runtime/src/mono/mono/mini/mini-runtime.c

Lines 3892 to 3903 in 61d3943

    
           if (!ji) { 
        
           	if (!mono_do_crash_chaining && mono_chain_signal (MONO_SIG_HANDLER_PARAMS)) 
        
           		return; 
        
           	mono_handle_native_crash (mono_get_signame (SIGSEGV), &mctx, (MONO_SIG_HANDLER_INFO_TYPE*)info); 
        
           	if (mono_do_crash_chaining) { 
        
           		if (!mono_chain_signal (MONO_SIG_HANDLER_PARAMS)) 
        
           			mono_chain_signal_to_default_sigsegv_handler (); 
        
           		return; 
        
           	} 
        
           }

They would only chain to the previous signal handler

runtime/src/mono/mono/mini/mini-posix.c

Lines 193 to 210 in 61d3943

    
           gboolean 
        
           MONO_SIG_HANDLER_SIGNATURE (mono_chain_signal) 
        
           { 
        
           	int signal = MONO_SIG_HANDLER_GET_SIGNO (); 
        
           	struct sigaction *saved_handler = (struct sigaction *)get_saved_signal_handler (signal); 
        
           	// Ignores chaining to default signal handlers i.e. when saved_handler->sa_handler == SIG_DFL 
        
           	if (saved_handler && saved_handler->sa_handler) { 
        
           		if (!(saved_handler->sa_flags & SA_SIGINFO)) { 
        
           			saved_handler->sa_handler (signal); 
        
           		} else { 
        
           #ifdef MONO_ARCH_USE_SIGACTION 
        
           			saved_handler->sa_sigaction (MONO_SIG_HANDLER_PARAMS); 
        
           #endif /* MONO_ARCH_USE_SIGACTION */ 
        
           		} 
        
           		return TRUE; 
        
           	} 
        
           	return FALSE;

only after attempting to walk native and managed stacks

runtime/src/mono/mono/mini/mini-exceptions.c

Lines 2992 to 3012 in 61d3943

    
           g_async_safe_printf("\n=================================================================\n"); 
        
           g_async_safe_printf("\tNative Crash Reporting\n"); 
        
           g_async_safe_printf("=================================================================\n"); 
        
           g_async_safe_printf("Got a %s while executing native code. This usually indicates\n", signal); 
        
           g_async_safe_printf("a fatal error in the mono runtime or one of the native libraries \n"); 
        
           g_async_safe_printf("used by your application.\n"); 
        
           g_async_safe_printf("=================================================================\n"); 
        
           mono_dump_native_crash_info (signal, mctx, info); 
        
           /* !jit_tls means the thread was not registered with the runtime */ 
        
           // This must be below the native crash dump, because we can't safely 
        
           // do runtime state probing after we have walked the managed stack here. 
        
           if (jit_tls && mono_thread_internal_current () && mctx) { 
        
           	g_async_safe_printf ("\n=================================================================\n"); 
        
           	g_async_safe_printf ("\tManaged Stacktrace:\n"); 
        
           	g_async_safe_printf ("=================================================================\n"); 
        
           	mono_walk_stack_full (print_stack_frame_signal_safe, mctx, jit_tls, mono_get_lmf (), MONO_UNWIND_LOOKUP_IL_OFFSET | MONO_UNWIND_SIGNAL_SAFE, NULL); 
        
           	g_async_safe_printf ("=================================================================\n"); 
        
           }

Alternatives

If there is any particular reason to preserve the order of sa_sigaction/sa_handler with respect to PROCNotifyProcessShutdown and PROCCreateCrashDumpIfEnabled for CoreCLR, a config knob can be added to allow Android CoreCLR to opt into the swapped ordering behavior. This may be in the form of config property key/values

runtime/src/coreclr/dlls/mscoree/exports.cpp

Lines 237 to 238 in 54ca569

    
           const char** propertyKeys, 
        
           const char** propertyValues,

or clrconfigvalues. That way AndroidSDK/AndroidAppBuilder may opt-in at build-time.

Given that the history of the ordering didn't reveal any problems with swapping the order, we can fallback to this behavior if the order swap causes problems down the line.

The other way around is more restrictive. Should we first introduce all the overhead to enable an opt-in/opt-out config knob, and later discover that no platforms need to invoke their previous handlers before PROCNotifyProcessShutdown/PROCCreateCrashDumpIfEnabled, it seems harder to justify removing the knob.

Copilot

Pull request overview

Adjusts CoreCLR’s signal chaining so shutdown notification and crash dump creation happen before invoking a previously-registered signal handler, ensuring these steps still run when the prior handler doesn’t return (notably on Android).

Changes:

Reorders PROCNotifyProcessShutdown and PROCCreateCrashDumpIfEnabled to run before chaining to the prior sigaction handler.
Adds an assertion to document/enforce that the “custom handler” path isn’t reached for SIG_DFL/SIG_IGN.

jkotas · 2026-01-28T23:42:53Z

src/coreclr/pal/src/exception/signal.cpp

+
+    _ASSERTE(!IsSigDfl(action) && !IsSigIgn(action));
+
+    PROCNotifyProcessShutdown(IsRunningOnAlternateStack(context));


I do not think this works well when there are multiple runtimes loaded in the process that all want to handle segfaults. It can be multiple .NET runtimes (e.g. CoreCLR + NAOT, or multiple NAOT), or it can be .NET and some other runtime (e.g. Java runtime).

The expected behavior in these situations is that the given runtime will check whether the signal happened in the code that it cares about. If yes, it will handle the signal. If no, it will forward the signal to the next runtime, and so on.

With this change, I think we will shutdown our runtime instance and generate crashdump if there is segfault gracefully handled by some other runtime.

Gracefully handled by another runtime, as in sa_sigaction/sa_handler? Wouldn't those still hit PROCNotifyProcessShutdown/PROCCreateCrashDumpIfEnabled in the original implementation? Or do they somehow return from invoke_previous_action

If the signal is handled in some other runtime, the handler registered by that runtime would not return.

I see, so given that we cannot tell how the other runtime will handle the signal, would it make sense then to pivot to an opt-in config switch that allows triggering a shutdown/create dump, even if the other runtime handled it gracefully?

I am not sure yet if there is a way for Android CoreCLR to not have a previously registered signal handler, so in those cases we wouldn't ever create crash dumps for signals.

Do we know what the handlers that are registered on Android before us do?

We should be writing the unhandled exception message and stacktrace to the logcat already (even before this change). Is that correct?

Mostly, with the exception of a native crash. If that happens, then we have very little to go off of unless a customer can provide a repro or answer a bunch of sluething questions. This is what @mdh1418 is looking to improve.

We should be writing the unhandled exception message and stacktrace to the logcat already (even before this change). Is that correct?

Yes, for unhandled managed exceptions, we have information in logcat already (function name not accurate)

01-31 14:45:41.891 1346 1372 E DOTNET : Unhandled exception. System.Exception: MIHW 01-31 14:45:41.891 1346 1372 E DOTNET : at Program.ForceNativeSegv() 01-31 14:45:41.891 1346 1372 E DOTNET : at Program.Main(String[] args)

Just on other native crashes like FailFast or synchronous faults we don't see anything in logcat besides Android's native crash report provided by tombstone, which isn't very useful given their crash reporter doesn't know about runtime's symbols. e.g.

--------- beginning of crash 01-31 14:45:41.903 1346 1372 F libc : Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 1372 (.dot.MonoRunner), pid 1346 (ulator.JIT.Test) 01-31 14:45:41.979 1383 1383 I crash_dump64: obtaining output fd from tombstoned, type: kDebuggerdTombstoneProto 01-31 14:45:41.985 297 297 I tombstoned: received crash request for pid 1372 01-31 14:45:41.989 1383 1383 I crash_dump64: performing dump of process 1346 (target tid = 1372) 01-31 14:45:42.649 1383 1383 F DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** 01-31 14:45:42.649 1383 1383 F DEBUG : Build fingerprint: 'google/sdk_gphone64_x86_64/emu64xa:16/BE2A.250530.026.D1/13818094:user/release-keys' 01-31 14:45:42.649 1383 1383 F DEBUG : Revision: '0' 01-31 14:45:42.649 1383 1383 F DEBUG : ABI: 'x86_64' 01-31 14:45:42.649 1383 1383 F DEBUG : Timestamp: 2026-01-31 14:45:41.994681400-0500 01-31 14:45:42.649 1383 1383 F DEBUG : Process uptime: 3s 01-31 14:45:42.649 1383 1383 F DEBUG : Cmdline: net.dot.Android.Device_Emulator.JIT.Test 01-31 14:45:42.649 1383 1383 F DEBUG : pid: 1346, tid: 1372, name: .dot.MonoRunner >>> net.dot.Android.Device_Emulator.JIT.Test <<< 01-31 14:45:42.649 1383 1383 F DEBUG : uid: 10236 01-31 14:45:42.649 1383 1383 F DEBUG : signal 6 (SIGABRT), code -1 (SI_QUEUE), fault addr -------- 01-31 14:45:42.650 1383 1383 F DEBUG : Abort message: ' 01-31 14:45:42.650 1383 1383 F DEBUG : =================================================================' 01-31 14:45:42.650 1383 1383 F DEBUG : rax 0000000000000000 rbx 0000000000000542 rcx 000078ddb40ec424 rdx 0000000000000006 01-31 14:45:42.650 1383 1383 F DEBUG : r8 000078ddda802743 r9 000078ddda802743 r10 000078da89f5bec0 r11 0000000000000203 01-31 14:45:42.650 1383 1383 F DEBUG : r12 000078da89f5c270 r13 000078da89f5fc00 r14 000000000000055c r15 000078da89f5c030 01-31 14:45:42.650 1383 1383 F DEBUG : rdi 0000000000000542 rsi 000000000000055c 01-31 14:45:42.650 1383 1383 F DEBUG : rbp 000078da89f5bf60 rsp 000078da89f5bec0 rip 000078ddb40ec424 01-31 14:45:42.650 1383 1383 F DEBUG : 9 total frames 01-31 14:45:42.650 1383 1383 F DEBUG : backtrace: 01-31 14:45:42.650 1383 1383 F DEBUG : #00 pc 000000000006a424 /apex/com.android.runtime/lib64/bionic/libc.so (abort+196) (BuildId: fcb82240218d1473de1e3d2137c0be35) 01-31 14:45:42.650 1383 1383 F DEBUG : #01 pc 0000000000532990 /data/app/~~8YH3Qdi2lWh5fJeOXBDnXw==/net.dot.Android.Device_Emulator.JIT.Test-kACIH0Xal8xfCbo6BL-LTA==/lib/x86_64/libcoreclr.so (BuildId: 421516eafc8ed405a19fe9a807f625c8816c9dda) 01-31 14:45:42.650 1383 1383 F DEBUG : #02 pc 00000000005328a3 /data/app/~~8YH3Qdi2lWh5fJeOXBDnXw==/net.dot.Android.Device_Emulator.JIT.Test-kACIH0Xal8xfCbo6BL-LTA==/lib/x86_64/libcoreclr.so (BuildId: 421516eafc8ed405a19fe9a807f625c8816c9dda) 01-31 14:45:42.650 1383 1383 F DEBUG : #03 pc 00000000004a09aa /data/app/~~8YH3Qdi2lWh5fJeOXBDnXw==/net.dot.Android.Device_Emulator.JIT.Test-kACIH0Xal8xfCbo6BL-LTA==/lib/x86_64/libcoreclr.so (BuildId: 421516eafc8ed405a19fe9a807f625c8816c9dda) 01-31 14:45:42.650 1383 1383 F DEBUG : #04 pc 000000000049d795 /data/app/~~8YH3Qdi2lWh5fJeOXBDnXw==/net.dot.Android.Device_Emulator.JIT.Test-kACIH0Xal8xfCbo6BL-LTA==/lib/x86_64/libcoreclr.so (BuildId: 421516eafc8ed405a19fe9a807f625c8816c9dda) 01-31 14:45:42.650 1383 1383 F DEBUG : #05 pc 000000000049d334 /data/app/~~8YH3Qdi2lWh5fJeOXBDnXw==/net.dot.Android.Device_Emulator.JIT.Test-kACIH0Xal8xfCbo6BL-LTA==/lib/x86_64/libcoreclr.so (BuildId: 421516eafc8ed405a19fe9a807f625c8816c9dda) 01-31 14:45:42.650 1383 1383 F DEBUG : #06 pc 000000000041e09e /data/app/~~8YH3Qdi2lWh5fJeOXBDnXw==/net.dot.Android.Device_Emulator.JIT.Test-kACIH0Xal8xfCbo6BL-LTA==/lib/x86_64/libcoreclr.so (BuildId: 421516eafc8ed405a19fe9a807f625c8816c9dda) 01-31 14:45:42.650 1383 1383 F DEBUG : #07 pc 00000000004f8bc8 /data/app/~~8YH3Qdi2lWh5fJeOXBDnXw==/net.dot.Android.Device_Emulator.JIT.Test-kACIH0Xal8xfCbo6BL-LTA==/lib/x86_64/libcoreclr.so (BuildId: 421516eafc8ed405a19fe9a807f625c8816c9dda) 01-31 14:45:42.650 1383 1383 F DEBUG : #08 pc 0000000000047205 /memfd:doublemapper (deleted) (offset 0x111000)

The managed crash report that lateralusX was mentioning earlier is #123824.

Lack of symbol information is a secondary problem. When a crash is reported, we can get them by asking the app developer to give us copy of e.g. libcoreclr.so from that app package and we can use dotnet-symbol to fetch the debug info, then resolve the symbols.

The more (most) important problem is what you see on frame #08 in the above trace - the Android stack walker cannot figure out where to go next from that frame, so it won't show the trace all the way down to the bottom of the stack. What we miss there is not only the managed frames, but also the Java ones (which would be resolved to symbols by the Android unwinder) - all of that together is priceless information when dealing with remote crashes.

The above stopped on our first managed frame and the memfd:doublemapper is the name of the memory region holding managed code (maybe we should have a more clear name for that region since it ends up in stacktraces) and since there is no unwind info for managed frames, native Android unwinder can't move passed them (even if they do have frame pointers). If there where Java frames before #08 I would expect them to be included in the report. I don't believe a native crash report on Android running Mono to give any information after frame #08 since Android unwinder will have the same issue with Mono managed frames. Here is an example of a Mono stacktrace on Android:

#00 pc 0x0000000000078dec /apex/com.android.runtime/lib64/bionic/libc.so (syscall+28) #01 pc 0x000000000007d7e0 /apex/com.android.runtime/lib64/bionic/libc.so (__futex_wait_ex(void volatile*, bool, int, bool, timespec const*)+144) #02 pc 0x000000000008ba8c /apex/com.android.runtime/lib64/bionic/libc.so (sem_wait+108) #03 pc 0x00000000001eeb20 /data/app/~~BDdfU_FYGQL1DW1U7YrqhQ==/<app.bundle.id>-RkmEjhS8iZlVjNzV7bp1xA==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (mono_os_sem_timedwait+204) #04 pc 0x00000000001ee844 /data/app/~~BDdfU_FYGQL1DW1U7YrqhQ==/<app.bundle.id>-RkmEjhS8iZlVjNzV7bp1xA==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (mono_threads_wait_pending_operations+323) #05 pc 0x00000000002a50f4 /data/app/~~BDdfU_FYGQL1DW1U7YrqhQ==/<app.bundle.id>-RkmEjhS8iZlVjNzV7bp1xA==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (unified_suspend_stop_world+349) #06 pc 0x00000000002a4e84 /data/app/~~BDdfU_FYGQL1DW1U7YrqhQ==/<app.bundle.id>-RkmEjhS8iZlVjNzV7bp1xA==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (sgen_client_stop_world+159) #07 pc 0x00000000002c5350 /data/app/~~BDdfU_FYGQL1DW1U7YrqhQ==/<app.bundle.id>-RkmEjhS8iZlVjNzV7bp1xA==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (sgen_stop_world+3974) #08 pc 0x00000000002c21bc /data/app/~~BDdfU_FYGQL1DW1U7YrqhQ==/<app.bundle.id>-RkmEjhS8iZlVjNzV7bp1xA==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (sgen_perform_collection+2637) #09 pc 0x00000000002c20ec /data/app/~~BDdfU_FYGQL1DW1U7YrqhQ==/<app.bundle.id>-RkmEjhS8iZlVjNzV7bp1xA==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (sgen_ensure_free_space+2616) #10 pc 0x00000000002be8b8 /data/app/~~BDdfU_FYGQL1DW1U7YrqhQ==/<app.bundle.id>-RkmEjhS8iZlVjNzV7bp1xA==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (sgen_alloc_obj_nolock+279) #11 pc 0x00000000002bed9c /data/app/~~BDdfU_FYGQL1DW1U7YrqhQ==/<app.bundle.id>-RkmEjhS8iZlVjNzV7bp1xA==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (sgen_alloc_obj+454) #12 pc 0x00000000002a686c /data/app/~~BDdfU_FYGQL1DW1U7YrqhQ==/<app.bundle.id>-RkmEjhS8iZlVjNzV7bp1xA==/split_config.arm64_v8a.apk!libmonosgen-2.0.so (mono_gc_alloc_obj+690) #13 pc 0x0000000000004438

So it have the similar characteristics, it can stackwalk system library + Mono runtime native frames up until first managed frame where it stops. This example has been symbolicated offline, so the reports originally only contained address + build id for each native Mono runtime frame.

It is hard to improve this in the native crash reports generated by Android crash reporter since there are no API's to inject managed unwind info in a structured and controlled way. What we could improve is the stacktraces we produce while the runtime still has control, like we have in this case, then we could extend our runtime stack walker to handle both managed and native frames with the possibility to even use Android libunwind on API levels where we could leverage it, then we could probably produce a full stacktrace including native + dotnet managed frames + java frames.

@lateralusX I tried to repro this with Mono and failed, but I know for sure we've had traces from the Android unwinder that somehow skipped over Mono's managed frames and included traces all the way down to Zygote (the process launcher for Android apps) - perhaps, at least at some point, Mono managed frames did include enough info for Android unwinder to be able to follow the chain all the way to the bottom?

grendello · 2026-01-29T13:19:07Z

@mdh1418 it doesn't really matter what handlers Android installs, as long as you chain up to any you've captured.

[CoreCLR][Signal] Bump shutdown notif and crashdump before prev handler

5be9bbf

mdh1418 requested review from Copilot, janvorli, jkotas, lateralusX, noahfalk and tmds January 28, 2026 23:12

mdh1418 added the area-ExceptionHandling-coreclr label Jan 28, 2026

dotnet-policy-service bot assigned mdh1418 Jan 28, 2026

Copilot started reviewing on behalf of mdh1418 January 28, 2026 23:13 View session

Copilot AI reviewed Jan 28, 2026

View reviewed changes

jkotas reviewed Jan 28, 2026

View reviewed changes

mdh1418 mentioned this pull request Jan 30, 2026

[Android CoreCLR] Log managed callstacks on native crash #123824

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[CoreCLR][Signal] Bump shutdown notif and crashdump before prev handler #123735

[CoreCLR][Signal] Bump shutdown notif and crashdump before prev handler #123735

mdh1418 commented Jan 28, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

jkotas Jan 28, 2026 •

edited

Loading

Uh oh!

mdh1418 Jan 29, 2026

Uh oh!

janvorli Jan 29, 2026

Uh oh!

mdh1418 Jan 29, 2026

Uh oh!

janvorli Jan 29, 2026

Uh oh!

steveisok Jan 31, 2026

Uh oh!

mdh1418 Jan 31, 2026

Uh oh!

grendello Feb 2, 2026

Uh oh!

lateralusX Feb 2, 2026 •

edited

Loading

Uh oh!

grendello Feb 2, 2026

Uh oh!

grendello commented Jan 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

	if (signalRestarts)
	{
	// Shutdown and create the core dump before we restore the signal to the default handler.
	PROCNotifyProcessShutdown(IsRunningOnAlternateStack(context));

	PROCCreateCrashDumpIfEnabled(code, siginfo, context, true);

	// Restore the original and restart h/w exception.
	restore_signal(code, action);

	return;
	}
	else
	{
	// We can't invoke the original handler because returning from the
	// handler doesn't restart the exception.
	PROCAbort(code, siginfo, context);
	}

	/**
	* mono_set_signal_chaining:
	*
	* Enable/disable signal chaining. This should be called before \c mono_jit_init.
	* If signal chaining is enabled, the runtime saves the original signal handlers before
	* installing its own handlers, and calls the original ones in the following cases:
	* - a \c SIGSEGV / \c SIGABRT signal received while executing native (i.e. not JITted) code.
	* - \c SIGPROF
	* - \c SIGFPE
	* - \c SIGQUIT
	* - \c SIGUSR2
	* Signal chaining only works on POSIX platforms.
	*/
	void
	mono_set_signal_chaining (gboolean chain_signals)
	{
	mono_do_signal_chaining = chain_signals;
	}

	/**
	* mono_set_crash_chaining:
	*
	* Enable/disable crash chaining due to signals. When a fatal signal is delivered and
	* Mono doesn't know how to handle it, it will invoke the crash handler. If chrash chaining
	* is enabled, it will first print its crash information and then try to chain with the native handler.
	*/
	void
	mono_set_crash_chaining (gboolean chain_crashes)
	{
	mono_do_crash_chaining = chain_crashes;
	}

	if (!ji) {
	if (!mono_do_crash_chaining && mono_chain_signal (MONO_SIG_HANDLER_PARAMS))
	return;

	mono_handle_native_crash (mono_get_signame (SIGSEGV), &mctx, (MONO_SIG_HANDLER_INFO_TYPE*)info);

	if (mono_do_crash_chaining) {
	if (!mono_chain_signal (MONO_SIG_HANDLER_PARAMS))
	mono_chain_signal_to_default_sigsegv_handler ();
	return;
	}
	}

	gboolean
	MONO_SIG_HANDLER_SIGNATURE (mono_chain_signal)
	{
	int signal = MONO_SIG_HANDLER_GET_SIGNO ();
	struct sigaction saved_handler = (struct sigaction )get_saved_signal_handler (signal);

	// Ignores chaining to default signal handlers i.e. when saved_handler->sa_handler == SIG_DFL
	if (saved_handler && saved_handler->sa_handler) {
	if (!(saved_handler->sa_flags & SA_SIGINFO)) {
	saved_handler->sa_handler (signal);
	} else {
	#ifdef MONO_ARCH_USE_SIGACTION
	saved_handler->sa_sigaction (MONO_SIG_HANDLER_PARAMS);
	#endif /* MONO_ARCH_USE_SIGACTION */
	}
	return TRUE;
	}
	return FALSE;

	g_async_safe_printf("\n=================================================================\n");
	g_async_safe_printf("\tNative Crash Reporting\n");
	g_async_safe_printf("=================================================================\n");
	g_async_safe_printf("Got a %s while executing native code. This usually indicates\n", signal);
	g_async_safe_printf("a fatal error in the mono runtime or one of the native libraries \n");
	g_async_safe_printf("used by your application.\n");
	g_async_safe_printf("=================================================================\n");
	mono_dump_native_crash_info (signal, mctx, info);

	/* !jit_tls means the thread was not registered with the runtime */
	// This must be below the native crash dump, because we can't safely
	// do runtime state probing after we have walked the managed stack here.
	if (jit_tls && mono_thread_internal_current () && mctx) {
	g_async_safe_printf ("\n=================================================================\n");
	g_async_safe_printf ("\tManaged Stacktrace:\n");
	g_async_safe_printf ("=================================================================\n");

	mono_walk_stack_full (print_stack_frame_signal_safe, mctx, jit_tls, mono_get_lmf (), MONO_UNWIND_LOOKUP_IL_OFFSET \| MONO_UNWIND_SIGNAL_SAFE, NULL);

	g_async_safe_printf ("=================================================================\n");
	}


		_ASSERTE(!IsSigDfl(action) && !IsSigIgn(action));

		PROCNotifyProcessShutdown(IsRunningOnAlternateStack(context));

[CoreCLR][Signal] Bump shutdown notif and crashdump before prev handler #123735

Are you sure you want to change the base?

[CoreCLR][Signal] Bump shutdown notif and crashdump before prev handler #123735

Conversation

mdh1418 commented Jan 28, 2026

Sigsegv behavior Android CoreCLR vs other platforms

Android CoreCLR

MacOS/Linux/NativeAOT(linux)

History investigation

Android CoreCLR behavior after swapping order

Android Mono behavior

Alternatives

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

jkotas Jan 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lateralusX Feb 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

grendello commented Jan 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

jkotas Jan 28, 2026 •

edited

Loading

lateralusX Feb 2, 2026 •

edited

Loading