When running an x86_64 magic-trace binary under binfmt_misc/qemu-user on an aarch64 host, uname -m returns x86_64 but the host's cs_etm device is available in sysfs. Switch all detection (OCaml and C) to probe for device existence at runtime instead of checking architecture. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
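A minimal sketch of the runtime probe described in this commit, not the code from the PR. It assumes the perf PMUs appear under `/sys/bus/event_source/devices/` as `cs_etm` and `intel_pt`; the names `tracer`, `pmu_dir`, and `detect_tracer` are hypothetical, and the `sysfs_root` parameter exists only so the check can be pointed at a fake tree in tests.

```ocaml
(* Hypothetical sketch: choose the tracing backend by probing sysfs for the
   PMU device rather than trusting [uname -m], which reports the emulated
   architecture under binfmt_misc/qemu-user. *)
type tracer =
  | Intel_pt
  | Coresight_etm

(* Path of a perf PMU directory. [sysfs_root] defaults to "/sys". *)
let pmu_dir ?(sysfs_root = "/sys") name =
  Filename.concat sysfs_root (Filename.concat "bus/event_source/devices" name)

(* Assumption: CoreSight is checked first, so an emulated x86_64 binary on an
   aarch64 host still finds the host's cs_etm device. *)
let detect_tracer ?sysfs_root () =
  if Sys.file_exists (pmu_dir ?sysfs_root "cs_etm") then Some Coresight_etm
  else if Sys.file_exists (pmu_dir ?sysfs_root "intel_pt") then Some Intel_pt
  else None
```

Calling `detect_tracer ()` on a machine with neither device returns `None`, which a caller could turn into a "no supported tracing hardware" error.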
CoreSight ETM produces "tr end jcc" and "tr end jmp" branch types that Intel PT doesn't. Add these to the branches regex. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CoreSight ETM can end a trace at any branch type (e.g. "tr end jcc", "tr end jmp"). These were previously considered impossible events. Treat them like other trace ends by transitioning to untraced. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CoreSight ETM produces "tr end jcc" / "tr end jmp" for brief decoder sync gaps that resume immediately with "tr strt". Pushing an [untraced] call frame for these caused a staircase effect since the subsequent "tr strt" never pops that frame. Instead, treat these as no-ops. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
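The handling described in the last few commits might be sketched as a small classifier over the branch-type string. This is an illustration only: the `action` type and `classify` function are hypothetical, and treating every other string that starts with `tr end` as a real trace end is an assumption, not something the commits state.

```ocaml
(* Hypothetical sketch of how a decoder could react to perf branch types. *)
type action =
  | Ignore          (* brief decoder sync gap: leave the callstack alone *)
  | Start_tracing
  | Go_untraced     (* real trace end: transition to untraced *)
  | Ordinary_branch

let has_prefix ~prefix s =
  String.length s >= String.length prefix
  && String.sub s 0 (String.length prefix) = prefix

let classify kind =
  match kind with
  (* CoreSight ETM emits these for gaps that resume at once with "tr strt";
     pushing an [untraced] frame for them caused the staircase effect. *)
  | "tr end jcc" | "tr end jmp" -> Ignore
  | "tr strt" -> Start_tracing
  (* Assumption: any other "tr end ..." variant is a genuine trace end. *)
  | k when has_prefix ~prefix:"tr end" k -> Go_untraced
  | _ -> Ordinary_branch
```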
On aarch64, CoreSight ETM classifies PLT stub indirect branches (br xN) as returns. A "return" from foo@plt to foo is really a jump into the resolved function. Misclassifying it as a return causes the trace writer to pop a frame and push a duplicate, creating a staircase effect. Detect this case by checking if the src symbol ends with @plt and the dst symbol matches after stripping that suffix. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
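The suffix check in this commit could look roughly like the following. The helper names `strip_plt` and `is_plt_jump` are hypothetical, and symbols are assumed to be plain strings with no offset attached.

```ocaml
(* Hypothetical sketch: recognise a "return" from foo@plt to foo, which is
   really a jump into the resolved function. *)
let plt_suffix = "@plt"

(* [strip_plt "foo@plt"] is [Some "foo"]; symbols without the suffix give
   [None]. *)
let strip_plt sym =
  let n = String.length sym and k = String.length plt_suffix in
  if n > k && String.sub sym (n - k) k = plt_suffix
  then Some (String.sub sym 0 (n - k))
  else None

(* True when [src] is a PLT stub and [dst] is the function it resolves to. *)
let is_plt_jump ~src ~dst =
  match strip_plt src with
  | Some base -> String.equal base dst
  | None -> false
```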
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CoreSight ETM classifies indirect branches (br xN) as returns, but many are actually tail calls or vtable dispatches (e.g. __overflow tail-calling _IO_file_overflow). In check_current_symbol, after popping the misclassified frame, re-check whether the new stack top already matches the destination before pushing. This prevents duplicate frames that caused staircase artifacts in traces. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
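A simplified model of the re-check, assuming the callstack is a list of symbol names with the top at the head. The real `check_current_symbol` operates on the trace writer's own state; only the pop, re-check, push order is illustrated here.

```ocaml
(* Hypothetical sketch: reconcile the callstack with the symbol we are now
   executing in. The stack is a list with the top frame first. *)
let check_current_symbol stack ~dst =
  match stack with
  (* Already in the right frame: nothing to do. *)
  | top :: _ when String.equal top dst -> stack
  (* Popping the misclassified frame exposes [dst]: stop there instead of
     pushing a duplicate frame. *)
  | _ :: (next :: _ as rest) when String.equal next dst -> rest
  (* Otherwise treat it as a tail call: replace the top frame with [dst]. *)
  | _ :: rest -> dst :: rest
  | [] -> [ dst ]
```

With the stack `["wrapper"; "impl"; "main"]` and destination `"impl"`, the sketch returns `["impl"; "main"]` rather than `["impl"; "impl"; "main"]`.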
The more general fix in check_current_symbol (re-checking the stack top after popping) already handles the PLT case, so the @plt-specific hack in perf_decode.ml is unnecessary. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
After a `ret` has already popped the returning function, if the new stack top doesn't match the return destination, the old code would pop the caller (treating it as a tail call) and then push the destination. This incorrectly destroyed the caller's stack frame on every PLT/vtable trampoline return (e.g. `call @plt; return @plt -> real_func`). Add `check_current_symbol_after_ret` which only pushes on mismatch (never pops), since after a return the destination is a callee resolved through a trampoline, not a tail-call replacement. The existing `check_current_symbol` (pop-then-push) remains for Jump events where tail-call semantics are correct. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The pure push-only approach was too aggressive: it accumulated orphaned frames when a return skipped over intermediate "ghost" frames (e.g. same-function different-label like __entry_text_start vs entry_SYSCALL_64_after_hwframe).

New approach: after ret pops one frame, search the callstack for the destination symbol. If found, pop down to it (normal return skipping ghost frames). If not found, push (trampoline target like PLT/vtable).

This correctly handles all three cases:
- PLT/vtable trampolines: destination not on stack → push as callee
- Ghost frame returns: destination on stack → pop to it
- Normal returns: destination at top → no-op

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
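The search-then-pop-or-push rule can be sketched over a toy callstack (a list of symbol names, top first). This is a model of the logic only: the name `check_current_symbol_after_ret` comes from the earlier commit, while `drop_until` and the list representation are assumptions made for illustration.

```ocaml
(* Hypothetical sketch. [stack] is the callstack after the [ret] has already
   popped the returning function; the top frame is the list head. *)

(* Pop frames until [dst] is on top; [None] if [dst] is not on the stack. *)
let rec drop_until dst = function
  | [] -> None
  | top :: _ as stack when String.equal top dst -> Some stack
  | _ :: rest -> drop_until dst rest

let check_current_symbol_after_ret stack ~dst =
  match drop_until dst stack with
  (* Found: a normal return (no-op if already on top) or a return that
     skipped ghost frames, which are popped. *)
  | Some unwound -> unwound
  (* Not found: the destination is a trampoline target, pushed as a callee. *)
  | None -> dst :: stack
```

For example, with `["main"]` on the stack, a return into `real_func` yields `["real_func"; "main"]`, so the caller's frame stays open across the PLT call.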
…terns

Tests based on real CoreSight ETM perf traces that exercise:
- PLT trampoline: call @plt, return @plt → real_func, return to caller
- Vtable dispatch: call wrapper, return wrapper → impl
- Tail call via jump: _setjmp → __sigsetjmp
- Repeated PLT calls: verifies no staircase (caller stays open)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Just a draft for now, to figure out what needs to be done.
Sample usage in its current state: