miri: add guest-debugging, including frame accesses.#12575
miri: add guest-debugging, including frame accesses.#12575cfallin merged 5 commits intobytecodealliance:mainfrom
Conversation
f0d35ce to
c43a590
Compare
The fix to `vm_store_context` provenance in `record_unwind`/`unwind` is a little weird to me. I was seeing mut access to the `vm_store_context` via `store.vm_store_context_mut()` in `record_unwind` (before this diff) then access via the previously saved raw pointer in the `CallThreadState`, which was registered as invalid and I believe is indeed invalid. This was only manifesting when setting `Config::guest_debug`, even without the frame-handle accesses added here. I didn't dig into the exact diff in codegen or runtime behavior that caused this but in any case, accessing `vm_store_context` via these two different paths (with one mut) appears to be unsound in any case. The fix here is to set the unwind state via the raw pointer in `CallThreadState` since that's the only path that the subsequent `unwind` has access to. Unrelated but useful: `ci/miri-provenance.test.sh` now accepts `MIRI_RUST_VERSION=+nightly` or whatnot, which is nice for running locally (I keep `stable` as my default toolchain).
c43a590 to
4fb36d0
Compare
alexcrichton
left a comment
There was a problem hiding this comment.
Thanks! Do you think it'd be worth running through anything else in the debug API as well? Or does getting a module more-or-less exercise everything?
|
Getting the current module exercises almost everything (frame-walking, getting metadata, reading raw vmctx and converting back to proper borrow of instance, etc) but I added a few more reads for good measure. Thanks! |
|
@alexcrichton pinging for re-review here -- another miri test failure (not part of the Pulley provenance test) has pushed me back toward another approach that involves re-deriving the raw |
6cb69d7 to
d27711a
Compare
…re-deriving the raw pointer in `CallThreadState`. prtest:full
d27711a to
ad9103f
Compare
|
(There's still a failure on s390x only that I suspect is an endianness issue when running Pulley + guest-debug instrumentation -- will take a look) |
alexcrichton
left a comment
There was a problem hiding this comment.
Looks reasonable to me yeah 👍
|
A little more detail: Pulley on s390x with guest-debug enabled (which hasn't been tested before -- only native s390x with guest-debug) is generating a big-endian vector store in the CLIF to compile to Pulley, which Cranelift's Pulley backend doesn't support. That's a nontrivial bit of implementation (lane swapping, etc). I'll instead refactor the instrumentation to work as e.g. object fields do, always little endian, in a separate PR. |
…reload vectors in guest-debugging slot and ABI clobbers.
When running Pulley on an s390x (or other big-endian) host, and enabling
guest-debugging instrumentationa very strange confluence of events
occurs:
- Pulley uses "native endian" of the host by default for loads and
stores.
- Patchable calls to debug hooks use the `preserve_all` ABI, which
spills all registers in the trampoline adapter (callee in this ABI),
including vector registers.
- Saving vector-typed locals/operand stack values to the debugger state
slot also uses vector stores.
- All of these stores were thus big-endian on big-endian hosts.
- Pulley's bytecode only supports little-endian vector loads/stores.
We were thus hitting an assert in Pulley codegen (the Cranelift
backend) when encountering a `VStore` VCode instruction with a
big-endian mode.
This PR makes two changes that avoid this issue:
- The ABI code for Pulley is careful to specify little-endian mode
explicitly for any vector load/store.
- The debug instrumentation code is refactored to use little-endian
explicitly for vector types *only*.
- (Why not for all types? Because we GC-root GC ref values, and
these need to be provided to the collector as mutable storage
cells, so need to be in native endianness.)
Test will come as part of bytecodealliance#12575 incorporating a
Pulley-with-guest-debugging test and running on s390x amongst our
platforms.
…reload vectors in guest-debugging slot and ABI clobbers.
When running Pulley on an s390x (or other big-endian) host, and enabling
guest-debugging instrumentation, a very strange confluence of events
occurs:
- Pulley uses "native endian" of the host by default for loads and
stores.
- Patchable calls to debug hooks use the `preserve_all` ABI, which
spills all registers in the trampoline adapter (callee in this ABI),
including vector registers.
- Saving vector-typed locals/operand stack values to the debugger state
slot also uses vector stores.
- All of these stores were thus big-endian on big-endian hosts.
- Pulley's bytecode only supports little-endian vector loads/stores.
We were thus hitting an assert in Pulley codegen (the Cranelift
backend) when encountering a `VStore` VCode instruction with a
big-endian mode.
This PR makes two changes that avoid this issue:
- The ABI code for Pulley is careful to specify little-endian mode
explicitly for any vector load/store.
- The debug instrumentation code is refactored to use little-endian
explicitly for vector types *only*.
- (Why not for all types? Because we GC-root GC ref values, and
these need to be provided to the collector as mutable storage
cells, so need to be in native endianness.)
Test will come as part of bytecodealliance#12575 incorporating a
Pulley-with-guest-debugging test and running on s390x amongst our
platforms.
prtest:full
|
s390x fix in #12585; will |
…reload vectors in guest-debugging slot and ABI clobbers. (#12585) * Cranelift/Wasmtime/Pulley/Debugging: use little-endian mode to spill/reload vectors in guest-debugging slot and ABI clobbers. When running Pulley on an s390x (or other big-endian) host, and enabling guest-debugging instrumentation, a very strange confluence of events occurs: - Pulley uses "native endian" of the host by default for loads and stores. - Patchable calls to debug hooks use the `preserve_all` ABI, which spills all registers in the trampoline adapter (callee in this ABI), including vector registers. - Saving vector-typed locals/operand stack values to the debugger state slot also uses vector stores. - All of these stores were thus big-endian on big-endian hosts. - Pulley's bytecode only supports little-endian vector loads/stores. We were thus hitting an assert in Pulley codegen (the Cranelift backend) when encountering a `VStore` VCode instruction with a big-endian mode. This PR makes two changes that avoid this issue: - The ABI code for Pulley is careful to specify little-endian mode explicitly for any vector load/store. - The debug instrumentation code is refactored to use little-endian explicitly for vector types *only*. - (Why not for all types? Because we GC-root GC ref values, and these need to be provided to the collector as mutable storage cells, so need to be in native endianness.) Test will come as part of #12575 incorporating a Pulley-with-guest-debugging test and running on s390x amongst our platforms. prtest:full * Review feedback. * Re-bless Cranelift tests (explicit `little` flags on `preserve_all` tests).
The fix to
vm_store_contextprovenance inrecord_unwind/unwindis a little weird to me. I was seeing mut access to thevm_store_contextviastore.vm_store_context_mut()inrecord_unwind(before this diff) then access via the previously saved raw pointer in theCallThreadState, which was registered as invalid and I believe is indeed invalid. This was only manifesting when settingConfig::guest_debug, even without the frame-handle accesses added here. I didn't dig into the exact diff in codegen or runtime behavior that caused this but in any case, accessingvm_store_contextvia these two different paths (with one mut) appears to be unsound in any case. The fix here is to set the unwind state via the raw pointer inCallThreadStatesince that's the only path that the subsequentunwindhas access to.