ARM coresight support by cgaebel · Pull Request #354 · janestreet/magic-trace

cgaebel · 2026-03-16T20:04:58Z

Just a draft for now, to figure out what needs to be done.

Sample usage in its current state:

cd /tmp
mkdir -p trace-dir
perf record -e cs_etm//u -o ./trace-dir/perf.data -- ./a.out
echo "()" > ./trace-dir/hits.sexp
MAGIC_TRACE_NO_DLFILTER=1 ./magic-trace decode -working-directory ./trace-dir -executable ./a.out

When running an x86_64 magic-trace binary under binfmt_misc/qemu-user on an aarch64 host, uname -m returns x86_64 but the host's cs_etm device is available in sysfs. Switch all detection (OCaml and C) to probe for device existence at runtime instead of checking architecture.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
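To make the runtime probe concrete, here is a minimal sketch of the idea, not the code in this PR. It assumes the PMUs show up under the usual perf sysfs directory (`/sys/bus/event_source/devices`); the type and function names (`backend`, `detect_backend`) are hypothetical.

```ocaml
(* Hypothetical sketch: choose a tracing backend by probing sysfs instead of
   trusting the architecture reported by uname. Names are illustrative only. *)
type backend =
  | Intel_pt
  | Coresight_etm
  | No_hardware_tracing

(* Assumption: perf PMU devices are registered in this directory. *)
let pmu_dir = "/sys/bus/event_source/devices"

(* [exists] is a parameter so the decision logic can be exercised without
   real hardware; it defaults to a real filesystem check. *)
let detect_backend ?(exists = Sys.file_exists) () =
  if exists (Filename.concat pmu_dir "cs_etm")
  then Coresight_etm
  else if exists (Filename.concat pmu_dir "intel_pt")
  then Intel_pt
  else No_hardware_tracing
```

Because the check looks at the host's sysfs, an x86_64 binary running under qemu-user on an aarch64 host would still select the CoreSight path in this sketch.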

CoreSight ETM produces "tr end jcc" and "tr end jmp" branch types that Intel PT doesn't. Add these to the branches regex.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
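As an illustration of what accepting these extra branch types means, the sketch below parses a branch-type string with plain string functions rather than the regex the parser actually uses. The `kind` and `branch` types and the `parse_branch` function are hypothetical and only cover the handful of branch types named in this PR.

```ocaml
(* Hypothetical sketch: split a perf branch-type string into an optional
   "tr end" marker and the underlying branch kind. *)
type kind =
  | Call
  | Return
  | Jump
  | Cond_jump
  | Other of string

type branch =
  { kind : kind
  ; trace_end : bool
  }

(* Returns the remainder of [s] after [prefix], or None if [s] does not
   start with [prefix]. *)
let strip_prefix ~prefix s =
  let pl = String.length prefix in
  if String.length s >= pl && String.equal (String.sub s 0 pl) prefix
  then Some (String.sub s pl (String.length s - pl))
  else None

let kind_of_string = function
  | "call" -> Call
  | "return" -> Return
  | "jmp" -> Jump
  | "jcc" -> Cond_jump
  | other -> Other other

let parse_branch s =
  let s = String.trim s in
  match strip_prefix ~prefix:"tr end" s with
  | Some rest -> { kind = kind_of_string (String.trim rest); trace_end = true }
  | None -> { kind = kind_of_string s; trace_end = false }
```

With this shape, "tr end jcc" and "tr end jmp" parse to the same kinds as "jcc" and "jmp", with `trace_end` set.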

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

CoreSight ETM can end a trace at any branch type (e.g. "tr end jcc", "tr end jmp"). These were previously considered impossible events. Treat them like other trace ends by transitioning to untraced.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

CoreSight ETM produces "tr end jcc" / "tr end jmp" for brief decoder sync gaps that resume immediately with "tr strt". Pushing an [untraced] call frame for these caused a staircase effect since the subsequent "tr strt" never pops that frame. Instead, treat these as no-ops.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

On aarch64, CoreSight ETM classifies PLT stub indirect branches (br xN) as returns. A "return" from foo@plt to foo is really a jump into the resolved function. Misclassifying it as a return causes the trace writer to pop a frame and push a duplicate, creating a staircase effect. Detect this case by checking if the src symbol ends with @plt and the dst symbol matches after stripping that suffix.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
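The suffix check described here could look roughly like the following. This is a sketch under the assumption that symbols are available as plain strings; `is_plt_jump` is a hypothetical name, not a function from perf_decode.ml.

```ocaml
let plt_suffix = "@plt"

(* Hypothetical helper: true when [src] is "<name>@plt" and [dst] is exactly
   "<name>". The [n > 0] guard runs first, so String.sub never sees a
   negative offset and a bare "@plt" is rejected. *)
let is_plt_jump ~src ~dst =
  let n = String.length src - String.length plt_suffix in
  n > 0
  && String.equal (String.sub src n (String.length plt_suffix)) plt_suffix
  && String.equal (String.sub src 0 n) dst
```

A "return" for which this predicate holds would then be reclassified as a jump before it reaches the trace writer.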

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

CoreSight ETM classifies indirect branches (br xN) as returns, but many are actually tail calls or vtable dispatches (e.g. __overflow tail-calling _IO_file_overflow). In check_current_symbol, after popping the misclassified frame, re-check whether the new stack top already matches the destination before pushing. This prevents duplicate frames that caused staircase artifacts in traces.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
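A simplified model of the re-check, assuming the callstack is a list of symbol names with the top frame at the head. The real check_current_symbol works on the trace writer's own state; this standalone version only mirrors the pop, re-check, push order described above.

```ocaml
(* Simplified model, not the trace writer's real code: the stack is a list
   of symbol names, top frame first. *)
let check_current_symbol stack ~dst =
  match stack with
  (* Top already matches the destination: nothing to fix. *)
  | top :: _ when String.equal top dst -> stack
  | [] -> [ dst ]
  | _ :: rest ->
    (* Pop the mismatched frame, then look at the new top before pushing. *)
    (match rest with
     | top :: _ when String.equal top dst -> rest
     | _ -> dst :: rest)
```

In this model, a stack of `["wrapper"; "impl"; "main"]` with destination `impl` becomes `["impl"; "main"]` rather than `["impl"; "impl"; "main"]`, which is the duplicate frame the commit is avoiding.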

The more general fix in check_current_symbol (re-checking the stack top after popping) already handles the PLT case, so the @plt-specific hack in perf_decode.ml is unnecessary.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

After a `ret` has already popped the returning function, if the new stack top doesn't match the return destination, the old code would pop the caller (treating it as a tail call) and then push the destination. This incorrectly destroyed the caller's stack frame on every PLT/vtable trampoline return (e.g. `call @plt; return @plt -> real_func`).

Add `check_current_symbol_after_ret` which only pushes on mismatch (never pops), since after a return the destination is a callee resolved through a trampoline, not a tail-call replacement. The existing `check_current_symbol` (pop-then-push) remains for Jump events where tail-call semantics are correct.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The pure push-only approach was too aggressive: it accumulated orphaned frames when a return skipped over intermediate "ghost" frames (e.g. same-function different-label like __entry_text_start vs entry_SYSCALL_64_after_hwframe).

New approach: after ret pops one frame, search the callstack for the destination symbol. If found, pop down to it (normal return skipping ghost frames). If not found, push (trampoline target like PLT/vtable).

This correctly handles both cases:

- PLT/vtable trampolines: destination not on stack → push as callee
- Ghost frame returns: destination on stack → pop to it
- Normal returns: destination at top → no-op

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
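The search-then-pop-or-push rule can be sketched as below, again modelling the callstack as a list of symbol names with the top frame first. `pop_to` is a hypothetical helper, and the input stack is assumed to be the state after `ret` has already popped the returning frame.

```ocaml
(* Hypothetical helper: drop frames until [dst] is on top. Returns None if
   [dst] does not appear anywhere on the stack. *)
let rec pop_to dst = function
  | [] -> None
  | top :: _ as stack when String.equal top dst -> Some stack
  | _ :: rest -> pop_to dst rest

(* Simplified model of the post-return check. [stack] is the callstack after
   the returning frame has been popped. *)
let check_current_symbol_after_ret stack ~dst =
  match pop_to dst stack with
  (* Destination found: unwind to it. If it was already on top this is a
     no-op; otherwise the skipped frames were ghost frames. *)
  | Some unwound -> unwound
  (* Destination not on the stack: treat it as a trampoline target and push
     it as a callee, leaving the caller's frame open. *)
  | None -> dst :: stack
```

For example, after `main` calls `foo@plt` and the stub "returns" into `foo`, the stack following the pop is `["main"]`; `foo` is not on it, so the result is `["foo"; "main"]`. When `foo` later returns to `main`, the destination is already on top and nothing changes.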

…terns

Tests based on real CoreSight ETM perf traces that exercise:

- PLT trampoline: call @plt, return @plt → real_func, return to caller
- Vtable dispatch: call wrapper, return wrapper → impl
- Tail call via jump: _setjmp → __sigsetjmp
- Repeated PLT calls: verifies no staircase (caller stays open)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

cgaebel marked this pull request as draft March 16, 2026 20:05

cgaebel marked this pull request as ready for review March 16, 2026 20:05

cgaebel marked this pull request as draft March 16, 2026 20:05

first draft from claude

af6aa65

cgaebel force-pushed the coresight-aarch64-support branch from b0a327d to af6aa65 Compare March 17, 2026 13:55

cgaebel and others added 12 commits March 17, 2026 11:24

Support CoreSight ETM branch types in perf output parser

522e0c9

CoreSight ETM produces "tr end jcc" and "tr end jmp" branch types that Intel PT doesn't. Add these to the branches regex.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Fix ocamlformat formatting in perf_decode.ml

8f2ff80

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Fix ocamlformat: collapse short expect block to one line

1c6c6c9

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Revert PLT-specific @plt return reclassification

4be5639

The more general fix in check_current_symbol (re-checking the stack top after popping) already handles the PLT case, so the @plt-specific hack in perf_decode.ml is unnecessary.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

cgaebel wants to merge 13 commits into janestreet:master from cgaebel:coresight-aarch64-support
