
feat: add CUDA/gsplat environment check script #11

Open

cicorias wants to merge 1 commit into Azure-Samples:main from cicorias:simple-gsplat-train

Conversation

@cicorias
Member

Add scripts/gsplat_check — a lightweight Python tool (managed by uv) that verifies whether the current device can run the gsplat 3DGS training backend.

Checks performed:

  • CUDA GPU detection via nvidia-smi + PyTorch tensor smoke-test
  • gsplat library import and rasterization kernel validation (8 Gaussians)
  • External tool availability (nvidia-smi, python3, ffmpeg, colmap)

Reports a structured pass/fail verdict similar to the Rust preflight binary.

Usage: cd scripts/gsplat_check && uv run main.py

Also adds a reference to the new tool in the root README documentation section.
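The structured pass/fail verdict described above can be sketched roughly as follows. This is an illustrative outline only, not the actual `main.py`: the check names and `run_checks` helper are hypothetical stand-ins for the real CUDA, gsplat, and tool probes.

```python
import sys


def run_checks() -> int:
    """Run named environment checks and return a process exit code.

    Illustrative sketch: the real script probes CUDA, gsplat, and
    external tools; here each check is a (name, callable) pair that
    returns True on success.
    """
    checks = [
        ("python interpreter found", lambda: sys.executable is not None),
        ("platform detected", lambda: bool(sys.platform)),
    ]
    failures = []
    for name, probe in checks:
        ok = probe()
        print(f"  {'✓' if ok else '✗'} {name}")
        if not ok:
            failures.append(name)
    if failures:
        print("  ENVIRONMENT CHECK FAILED")
        for f in failures:
            print(f"   • {f}")
        return 1
    print("  ENVIRONMENT CHECK PASSED")
    return 0
```

The return value would feed `sys.exit(...)` so callers (and CI) can rely on the "exits 0 on success, 1 on failure" contract.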


Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 20, 2026 15:03
Contributor

Copilot AI left a comment


Pull request overview

Adds a new lightweight Python (uv-managed) preflight script to validate CUDA + gsplat functionality on a machine, plus docs linking to the tool.

Changes:

  • Add scripts/gsplat_check with a main.py verifier and uv pyproject.toml
  • Add tool-specific README with usage/examples
  • Link the new environment check tool from the root README.md

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| scripts/gsplat_check/pyproject.toml | Defines the uv-managed Python project and dependencies for the checker script |
| scripts/gsplat_check/main.py | Implements CUDA detection, the PyTorch smoke test, the gsplat rasterization smoke test, and tool probing |
| scripts/gsplat_check/README.md | Documents what is checked, prerequisites, and example outputs |
| scripts/gsplat_check/.python-version | Pins the local Python version for the tool directory |
| README.md | Adds a link to the new gsplat environment check tool |

Comment on lines +237 to +257
```python
    _heading("External Tools")
    for cmd, args in [
        ("nvidia-smi", ["--version"]),
        ("python3", ["--version"]),
        ("ffmpeg", ["-version"]),
        ("colmap", ["--version"]),
    ]:
        ver = _cmd_version(cmd, args)
        if ver:
            _ok(cmd, ver)
        else:
            _fail(cmd, "not found")

    # ── verdict ───────────────────────────────────────────────────────
    _heading("Environment Verdict")
    if failures:
        print()
        print("  ❌ ENVIRONMENT CHECK FAILED")
        for f in failures:
            print(f"   • {f}")
        return 1
```

Copilot AI Mar 20, 2026


The “External Tools” checks currently never affect failures, so the final exit code can be 0 even when ffmpeg/colmap/nvidia-smi are missing. If tool availability is intended to be part of the pass/fail verdict (per the PR description and the README’s “Exits 0 on success, 1 on failure”), append a failure reason when a required tool is not found; alternatively, explicitly label these checks as informational-only in output/docs.
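One way to act on this review comment is sketched below. It is a hypothetical patch, not the PR's code: it uses `shutil.which` (a PATH-presence check) instead of the script's own `_cmd_version` helper, and the `check_tools` name and required/optional split are illustrative.

```python
import shutil


def check_tools(required, optional=(), failures=None):
    """Probe external tools, recording missing *required* tools as failures.

    Hypothetical sketch: shutil.which only tests PATH presence, whereas
    the real script also captures `--version` output. Missing optional
    tools are reported as informational only, matching the reviewer's
    alternative suggestion.
    """
    failures = failures if failures is not None else []
    for cmd in required:
        if shutil.which(cmd) is None:
            failures.append(f"required tool not found: {cmd}")
    for cmd in optional:
        if shutil.which(cmd) is None:
            print(f"  (info) optional tool not found: {cmd}")
    return failures
```

Appending to the shared `failures` list is what makes the missing tool reach the final verdict and flip the exit code to 1.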


```python
    print(f"  {'Platform':<16}: {gpu['platform']}")
    print(f"  {'Device':<16}: {gpu['device'] or 'n/a'}")
    print(f"  {'VRAM':<16}: {gpu['vram_gb'] or 'n/a'} GB")
```

Copilot AI Mar 20, 2026


This prints n/a GB when VRAM isn’t detected, which is a bit misleading. Consider emitting either n/a without the unit, or formatting conditionally (only append GB when vram_gb is present) for clearer output.

Suggested change:

```diff
-    print(f"  {'VRAM':<16}: {gpu['vram_gb'] or 'n/a'} GB")
+    vram_detail = f"{gpu['vram_gb']} GB" if gpu["vram_gb"] else "n/a"
+    print(f"  {'VRAM':<16}: {vram_detail}")
```
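The same conditional-unit idea can be factored into a tiny helper. This is a hypothetical refactor (the name `fmt_with_unit` is not part of the PR), shown only to make the suggested behavior concrete:

```python
def fmt_with_unit(value, unit):
    """Format a value with its unit, or 'n/a' when the value is missing.

    Treats None/0/empty as missing, mirroring the `or 'n/a'` pattern
    used in the original print statements.
    """
    return f"{value} {unit}" if value else "n/a"
```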

```python
    try:
        import torch  # noqa: E402
        from gsplat import rasterization  # noqa: E402
```

Copilot AI Mar 20, 2026


check_gsplat() always attempts CUDA tensor allocation and rasterization even when the earlier CUDA/PyTorch check has already determined the GPU is unusable. This can lead to redundant/less-actionable kernel error output. Consider passing the CUDA usability result into check_gsplat() (or early-returning in main() before kernel probing) so the gsplat section can clearly report “skipped (CUDA unavailable)” instead of a generic exception string.

Suggested change:

```diff
+        # If CUDA is not available/usable, skip kernel probing to avoid
+        # redundant low-level CUDA errors and report a clear status instead.
+        if not torch.cuda.is_available():
+            info["kernel_error"] = (
+                "skipped: CUDA unavailable or unusable "
+                "(torch.cuda.is_available() is False)"
+            )
+            return info
```
