@Andy-Jost Andy-Jost commented Feb 3, 2026

Summary

Converts _program.py to Cython (_program.pyx) for improved code organization and future performance optimization.

  • Converts Program to cdef class with .pxd declarations
  • Reorganizes file structure and factors long methods into cdef inline helpers
  • Adds new public API: Program.driver_can_load_nvrtc_ptx_output()

Changes

  • _program.py → _program.pyx with cdef class Program
  • New _program.pxd with typed attribute declarations
  • Module docstring, __all__, type alias section added
  • Docstrings updated to use :class: refs and public paths
  • Type annotations added to public methods
  • _nvvm_exception_manager inlined (single use)
  • Union import removed in favor of | syntax
  • Tests updated to use new public API

Test plan

  • All existing test_program.py tests pass
  • All existing test_module.py tests pass
  • pre-commit passes

Closes #1082

- Rename _program.py to _program.pyx
- Convert Program to cdef class with _program.pxd declarations
- Extract _MembersNeededForFinalize to module-level _ProgramMNFF
  (nested classes not allowed in cdef class)
- Add __repr__ method to Program
- Keep ProgramOptions as @dataclass (unchanged)
- Keep weakref.finalize pattern for handle cleanup
- Move _translate_program_options to Program_translate_options (cdef)
- Move _can_load_generated_ptx to Program_can_load_generated_ptx (cdef)
- Remove unused TYPE_CHECKING import block
- Follow _memory/_buffer.pyx helper function patterns
- Reorganize file structure per developer guide (principal class first)
- Add module docstring, __all__, type alias section
- Factor long methods into cdef inline helpers
- Add proper exception specs to cdef functions
- Fix docstrings (use :class: refs, public paths)
- Add type annotations to public methods
- Inline _nvvm_exception_manager (single use)
- Remove Union import, use | syntax
- Add public Program.driver_can_load_nvrtc_ptx_output() API
- Update tests to use new public API

Closes NVIDIA#1082
@Andy-Jost Andy-Jost added this to the cuda.core beta 12 milestone Feb 3, 2026
@Andy-Jost Andy-Jost added the P0 (High priority - Must do!) and cuda.core (Everything related to the cuda.core module) labels Feb 3, 2026
@Andy-Jost Andy-Jost self-assigned this Feb 3, 2026
copy-pr-bot bot commented Feb 3, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@Andy-Jost
Contributor Author

/ok to test 85dbbb5

github-actions bot commented Feb 3, 2026

@leofang leofang self-requested a review February 3, 2026 06:02
  mod_obj = ObjectCode.from_ptx(ptx, symbol_mapping=sym_map)
  assert mod.code == ptx
- if not Program._can_load_generated_ptx():
+ if not Program.driver_can_load_nvrtc_ptx_output():
Contributor Author

This is worthy of discussion. I changed this private API into a public one after refactoring because it seems somewhat useful to an end user. An alternative would be to continue using the private API if the value here is too low.
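
To make the trade-off concrete, here is a minimal sketch of what promoting the private check to a public static method could look like. It is not the cuda.core implementation: `ProgramSketch`, the explicit version arguments, and the assumption that the check boils down to comparing the NVRTC version against the driver version are all hypothetical, chosen so the example runs without a GPU.

```python
def _can_load_generated_ptx(nvrtc_version, driver_version):
    """Hypothetical private helper.

    Assumes (not confirmed by this PR) that the driver can load PTX
    produced by NVRTC when the NVRTC (major, minor) version is not
    newer than the driver's (major, minor) version.
    """
    return tuple(nvrtc_version) <= tuple(driver_version)


class ProgramSketch:
    """Stand-in for Program; illustrative only."""

    @staticmethod
    def driver_can_load_nvrtc_ptx_output(nvrtc_version, driver_version):
        # Public wrapper: a stable name for end users, delegating to the
        # private helper so the internals stay free to change.
        # The real method takes no arguments; the versions are passed
        # explicitly here only to keep the sketch self-contained.
        return _can_load_generated_ptx(nvrtc_version, driver_version)
```

A test could then branch on the public method, as the updated test above does, instead of reaching into a private name.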

Comment on lines +425 to +442
class _ProgramMNFF:
    """Members needed to finalize (release) program handles."""

    __slots__ = "handle", "backend"

    def __init__(self, program_obj, handle, backend):
        self.handle = handle
        self.backend = backend
        # Run close() automatically when program_obj is garbage collected.
        weakref.finalize(program_obj, self.close)

    def close(self):
        # Idempotent: the handle is destroyed at most once.
        if self.handle is not None:
            if self.backend == "NVRTC":
                handle_return(nvrtc.nvrtcDestroyProgram(self.handle))
            elif self.backend == "NVVM":
                nvvm = _get_nvvm_module()
                nvvm.destroy_program(self.handle)
            self.handle = None
Contributor Author

I am keeping MNFF and the weakref.finalize cleanup for this PR. In a subsequent PR, I plan to extend the resource handles to nvrtc and nvvm and add nogil blocks.
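
For readers unfamiliar with the pattern being kept, the following is a minimal, GPU-free sketch of the weakref.finalize approach. The names `Owner`, `_Members`, `fake_destroy`, and `released` are hypothetical stand-ins for `Program`, `_ProgramMNFF`, and the NVRTC/NVVM destroy calls; only `weakref.finalize` itself is the real standard-library API.

```python
import weakref

released = []  # records every handle passed to the stand-in destroy call


def fake_destroy(handle):
    """Hypothetical replacement for nvrtcDestroyProgram / destroy_program."""
    released.append(handle)


class _Members:
    """Holds the state needed for cleanup, separate from the owner."""

    __slots__ = ("handle",)

    def __init__(self, owner, handle):
        self.handle = handle
        # The finalizer references this helper, not the owner, so it does
        # not keep the owner alive.
        weakref.finalize(owner, self.close)

    def close(self):
        # Safe to call repeatedly: the handle is released only once.
        if self.handle is not None:
            fake_destroy(self.handle)
            self.handle = None


class Owner:
    """Stand-in for Program: owns a handle through the helper object."""

    def __init__(self, handle):
        self._mnff = _Members(self, handle)

    def close(self):
        # Explicit release; the finalizer later becomes a no-op.
        self._mnff.close()
```

The helper object exists because the finalizer callback must not hold a strong reference to the owner; otherwise the owner would never be collected and the handle would never be released.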

@Andy-Jost Andy-Jost requested review from cpcloud and rwgk February 3, 2026 15:27
Labels

cuda.core (Everything related to the cuda.core module), P0 (High priority - Must do!)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cythonize _program.py

1 participant