Add architecture documentation for hipFile#226

Open
glimchb wants to merge 1 commit into ROCm:develop from glimchb:architecture

glimchb commented Mar 18, 2026

Document the full system architecture including:

  • Public API organization (Core, Error, Driver, RDMA, File Handle, Batch, Async, Configuration groups)
  • AMD backend internals: Backend scoring system, Fastpath (P2PDMA) and Fallback (bounce buffer) paths, kernel call chain through HIP → ROCR → HSA → Thunk → KFD
  • NVIDIA backend pass-through to cuFile
  • State management: DriverState singleton, FileMap, BufferMap, StreamMap, BatchContextMap with reference counting
  • Build system overview and dual-platform support
  • Current limitations and unimplemented API surface
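
To make the backend scoring idea concrete for reviewers, here is a minimal sketch of how a score-based choice between a Fastpath and a Fallback backend could look. It is an illustration under stated assumptions, not the code in this PR: the `IoRequest` fields, the 4 KiB alignment rule, the direct-I/O requirement, the score values, and every name below are hypothetical.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical request descriptor; the fields are illustrative only.
struct IoRequest {
    bool file_opened_direct;    // assumed: file was opened for direct I/O
    bool buffer_is_gpu_memory;  // assumed: destination is device memory
    std::size_t size;           // transfer size in bytes
    std::size_t file_offset;    // offset into the file in bytes
};

// Each backend reports how well it can serve a request.
// A negative score means "cannot handle this request".
struct Backend {
    virtual ~Backend() = default;
    virtual std::string name() const = 0;
    virtual int score(const IoRequest& req) const = 0;
};

// Stand-in for the P2PDMA path: only eligible for aligned, direct I/O
// into GPU memory (the eligibility rule here is an assumption).
struct Fastpath : Backend {
    std::string name() const override { return "fastpath"; }
    int score(const IoRequest& req) const override {
        constexpr std::size_t kAlign = 4096;  // assumed alignment requirement
        const bool aligned =
            req.size % kAlign == 0 && req.file_offset % kAlign == 0;
        const bool eligible =
            req.file_opened_direct && req.buffer_is_gpu_memory && aligned;
        return eligible ? 100 : -1;
    }
};

// Stand-in for the bounce-buffer path: accepts everything, at a low score.
struct Fallback : Backend {
    std::string name() const override { return "fallback"; }
    int score(const IoRequest&) const override { return 1; }
};

// Return the backend with the highest non-negative score, or nullptr.
const Backend* select_backend(
    const std::vector<std::unique_ptr<Backend>>& backends,
    const IoRequest& req) {
    const Backend* best = nullptr;
    int best_score = -1;
    for (const auto& backend : backends) {
        const int s = backend->score(req);
        if (s > best_score) {
            best_score = s;
            best = backend.get();
        }
    }
    return best;
}

// Convenience wrapper: build the two example backends and pick one.
std::string pick_backend_name(const IoRequest& req) {
    std::vector<std::unique_ptr<Backend>> backends;
    backends.push_back(std::make_unique<Fastpath>());
    backends.push_back(std::make_unique<Fallback>());
    const Backend* chosen = select_backend(backends, req);
    return chosen ? chosen->name() : "none";
}
```

In this sketch an aligned direct-I/O request into GPU memory selects the fastpath, while any request that fails one of the assumed eligibility checks is served by the fallback. Keeping the eligibility logic inside each backend's `score` means the selection loop does not need to change when a backend is added.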

Motivation

Technical Details

Test Plan

Test Result

Submission Checklist

Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com>