Skip to content

Major reliability improvement for logging stack traces#287

Open
borrrden wants to merge 14 commits intomasterfrom
crash_handling
Open

Major reliability improvement for logging stack traces#287
borrrden wants to merge 14 commits intomasterfrom
crash_handling

Conversation

@borrrden
Copy link
Copy Markdown
Member

@borrrden borrrden commented Mar 20, 2026

This took many days of effort and so it's involved a lot with various stack unwinding and address retrieval mechanisms so I'll try to provide a high level summary here. Firstly to avoid a giant spaghetti of ifdef I broke out the implementation across several files that only get compiled on certain platforms:

  • Backtrace+capture-darwin.cc: Apple specific logic for getting a symbol from a frame in a stack trace. This uses standard backtrace_symbols and dladdr since the implementation on Apple is much more enhanced than the glibc linux counterpart (Most likely the compiler is more friendly about it when it emits machine code).
  • Backtrace+capture-linux.cc: Linux specific logic for getting a symbol from a frame in a stack trace. This makes use of the GCC provided libbacktrace.a (it's present even when using clang because by default clang still uses GCC for its runtime stuff in c++). This library is very attuned to specifically getting frames on Linux even when frame pointers are omitted and handles all the gnarly unwinding stuff safely (I found when trying to implement register based stack crawling that I often got it wrong and caused an SIGSEGV while trying to do it)
  • Backtrace+capture-posix.cc: Other stuff common to all POSIX environments like unmangling and actual printing. In this we need to be very careful since my plan is to handle signals here as well. We can only use async signal safe functions. In particular this means no streams, no printf, and no unmangling either.
  • Backtrace+capture-win32.cc: Windows specific logic for getting a symbol from a frame in a stack trace. Printing is not restricted since the way that Windows faults already moves to an entirely separate context in which it is safe to do just about anything. This uses the StackWalk64 API.
  • Backtrace+signals-posix.cc: POSIX signal handlers! RAII style class that will register signal handlers and write out a stack trace, and then pass on control to the next handler.
  • Backtrace+signals-win32.cc: Win32 exception handlers! Windows doesn't really have the equivalent of POSIX signals. It has a few basic ones but that's at. Most faults result in one of the various types of exception handlers (this is an overloaded term, since Windows was using the term "exception" way before it had any meaning for c++. These "exceptions" can happen even in C. They are basically signals by another name except the way the handlers are invoked is more safe).
  • Backtrace.cc: This now encapsulates any logic that operates one level above all of this logically. It uses the primitives defines in the files above.

Note that since we cannot use ostreams in signal handlers, I created a new crash log in the logs directory that a crash will be written to upon a signal. The existing logs are all buried inside of ofstream objects.

@borrrden borrrden requested review from jianminzhao and snej March 20, 2026 23:10
Comment thread Fleece/Support/Backtrace+signals-posix.cc Outdated
Comment thread Fleece/Support/Backtrace+signals-posix.cc Outdated
Comment thread Fleece/Support/Backtrace+capture-darwin.cc Outdated
Comment on lines +27 to +28
"_C_A_T_C_H____T_E_S_T_",
"Catch::TestInvokerAsFunction::invoke() const",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you checked that these names are still valid after we upgraded Catch2?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not. Is there a method you used to determine them in the first place?

}
}

char* unmangle(const char* function) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, could this be a private static member of Backtrace?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was originally not even before my PR. Making it a private static member requires bringing the other Unmangle methods along with it. Is that the outcome you want?

Comment thread Fleece/Support/Backtrace.hh Outdated
[[nodiscard]] static std::shared_ptr<Backtrace> capture(void* from, unsigned maxFrames = 50,
void* context = nullptr);

static int raw_capture(void** buffer, int max, void* context = nullptr);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, is there a need to make this public?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It needs to be callable from the signal handler, as well as from Backtrace itself.

Comment thread Fleece/Support/Backtrace.hh
@borrrden borrrden requested a review from snej March 28, 2026 00:07
/// @param maxFrames Maximum number of frames to capture
explicit Backtrace(unsigned skipFrames =0, unsigned maxFrames =50);
/// @param context If non-null, a platform-specific context to capture from
/// @param from If non-null, the address to start from instead of the current frame
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is "from" in the signature?
As it is, it can be linked to Core without changes in Core, right?
How platform passes the context to Backtrace?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "platform" doesn't pass anything, it's all handled via the various Backtrace capture classes I've written. I think "from" here is a dangling comment from something I removed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants