Skip to content

hipFile: Ensure HIP Runtime is initialized before calling into hipAmdFileRead/hipAmdFileWrite#223

Open
kurtmcmillan wants to merge 4 commits intodevelopfrom
kumcmill/rocm-20222
Open

hipFile: Ensure HIP Runtime is initialized before calling into hipAmdFileRead/hipAmdFileWrite#223
kurtmcmillan wants to merge 4 commits intodevelopfrom
kumcmill/rocm-20222

Conversation

@kurtmcmillan
Copy link
Collaborator

@kurtmcmillan kurtmcmillan commented Mar 17, 2026

Motivation

Avoid causing a SEGFAULT in the HIP Runtime.

AIHIPFILE-157

Technical Details

A storage partner recently hit a SEGFAULT when testing hipFile with vllm. vllm performs IO within a thread and the first call into the HIP Runtime is hipAmdFileRead/hipAmdFileWrite through the fastpath backend. Since these HIP Runtime APIs are private, it wasn't anticipated that they would need to perform any initialization. Library initialization checks are being added to hipAmdFileRead/hipAmdFileWrite (ROCm/rocm-systems#4068) This change is a temporary while there is the possibility that hipFile will run with an unpatched HIP Runtime.

hipFileRead & hipFileWrite use private/hidden HIP Runtime APIs which,
initially, did not ensure that the HIP Runtime was initialized which resulted
in a SEGFAULT.

This change ensures that the HIP Runtime is initialized when
hipFileRead/hipFileWrite are the first HIP API calls of a thread.
@kurtmcmillan kurtmcmillan marked this pull request as ready for review March 17, 2026 20:49
Copilot AI review requested due to automatic review settings March 17, 2026 20:49
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds an explicit HIP runtime initialization step to the Fastpath backend before calling the private hipAmdFileRead/hipAmdFileWrite APIs, to avoid a HIP runtime segfault when those are the first HIP calls on a newly created thread.

Changes:

  • Add a Hip::hipInit() wrapper method and a corresponding MHip mock method.
  • Add a per-thread initialization guard in Fastpath::io() that calls hipInit() once per thread.
  • Update Fastpath unit tests to expect an initialization call.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/amd_detail/backend/fastpath.cpp Adds per-thread HIP runtime init guard in Fastpath::io()
src/amd_detail/hip.h Extends Hip interface with hipInit()
src/amd_detail/hip.cpp Implements Hip::hipInit() via ::hipInit(0)
test/amd_detail/mhip.h Adds hipInit() mock to MHip
test/amd_detail/fastpath.cpp Updates Fastpath IO tests to expect hipInit()
CHANGELOG.md Documents the Fastpath initialization safeguard
Comments suppressed due to low confidence (1)

test/amd_detail/fastpath.cpp:346

  • The new EXPECT_CALL(mhip, hipInit); expectation is set in every expect_io() call, but the production code uses a thread_local guard and will only call hipInit() once per thread for the whole test process. Since gtest typically runs these parametrized tests sequentially on the same thread, after the first Fastpath().io(...) call hipInit() will no longer be invoked and later tests will fail due to an unsatisfied expectation. Adjust the tests to allow hipInit() to be called 0 or 1 times (or restructure to run the IO on a fresh thread / reset the per-thread init state in a test-only way).
struct FastpathIoParam : public FastpathTestBase, public TestWithParam<IoType> {

    StrictMock<MHip> mhip;

    void expect_io()
    {
        EXPECT_CALL(mhip, hipInit);
        EXPECT_CALL(*mbuffer, getBuffer).WillOnce(Return(DEFAULT_BUFFER_ADDR));
        EXPECT_CALL(*mbuffer, getLength).WillOnce(Return(DEFAULT_BUFFER_LENGTH));
        EXPECT_CALL(*mfile, getUnbufferedFd).WillOnce(Return(DEFAULT_UNBUFFERED_FD));
    }

    void expect_io(optional<int> fd, void *bufptr, size_t buflen)
    {
        EXPECT_CALL(mhip, hipInit);
        EXPECT_CALL(*mbuffer, getBuffer).WillOnce(Return(bufptr));
        EXPECT_CALL(*mbuffer, getLength).WillOnce(Return(buflen));
        EXPECT_CALL(*mfile, getUnbufferedFd).WillOnce(Return(fd));
    }

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants