🚀 DPython – Distributed Multi-GPU Training Launcher (LAN)

Note: You need to change some paths in cluster.json and dpython.py before running.

DPython is a custom Python-based launcher that enables automatic single-GPU or multi-machine distributed training over a local network (LAN) using Hugging Face Accelerate.

It is built for real-world setups, including low-VRAM GPUs (safe even on a GTX 1650), and removes the complexity of manual distributed configuration.


✨ Key Features

  • ✅ Auto-detects local GPU
  • 🌐 Checks remote GPU availability via SSH
  • 🔁 Automatically falls back to single-GPU
  • ⚡ Launches true multi-machine distributed training
  • 🧠 No changes required inside your training logic
  • 🖥️ Windows-friendly (batch file included)
  • 🔥 Powered by Hugging Face Accelerate

📁 Project Structure

distributed_env/
│
├── dpython.py      # Distributed training launcher
├── cluster.json    # Cluster configuration
├── run.bat         # Windows one-click launcher
│
├── train.py        # Example Accelerate training script
└── README.md


⚙️ Requirements

Local & Remote Machines

  • Windows or Linux
  • NVIDIA GPU
  • NVIDIA Driver + CUDA
  • Python 3.9+
  • SSH enabled (passwordless SSH recommended)
  • Same project path on both machines

Python Packages

pip install torch accelerate
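
Before configuring the cluster, it is worth confirming that PyTorch can actually see the GPU on each machine. A quick check, assuming the packages above installed correctly:

# Run this on both the master and the worker machine.
import torch

if torch.cuda.is_available():
    print("CUDA OK:", torch.cuda.get_device_name(0))
else:
    print("No CUDA device visible - check the NVIDIA driver / CUDA install")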

🔧 Configuration

1️⃣ Cluster Configuration (cluster.json)

{
  "master_ip": "192.168.1.10",
  "worker_ip": "192.168.1.11",
  "port": 29500,
  "ssh_user": "vikas"
}

Notes:

  • master_ip → Machine where you run the command
  • worker_ip → Remote GPU machine
  • port → Any free port (default 29500)
  • SSH must work without prompts
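
For reference, the remote availability check described above comes down to running nvidia-smi over SSH and seeing whether it answers. A minimal sketch of that idea (the function name check_remote_gpu and the exact command are illustrative, not the actual dpython.py code):

# Illustrative sketch only: how a remote GPU check over SSH can work.
import json
import subprocess

def check_remote_gpu(cfg_path="cluster.json"):
    """Return True if the worker answers over SSH and reports an NVIDIA GPU."""
    with open(cfg_path) as f:
        cfg = json.load(f)
    target = f"{cfg['ssh_user']}@{cfg['worker_ip']}"
    try:
        result = subprocess.run(
            ["ssh", "-o", "BatchMode=yes", target,
             "nvidia-smi --query-gpu=name --format=csv,noheader"],
            capture_output=True, text=True, timeout=10,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0 and bool(result.stdout.strip())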

▶️ Usage

🖱️ One-Click (Windows)

dpython.bat train.py

⌨️ To use DPython from anywhere on your PC, add the main folder to your PATH environment variable, then run:

dpython train.py

📦 With Arguments

python dpython.py train.py --epochs 10 --batch_size 16
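
Under the hood, a launcher like this effectively translates cluster.json into an accelerate launch command per machine. A hedged sketch of the master-side launch (values such as --num_processes 2 assume one GPU per machine; the real dpython.py may build this differently):

# Illustrative only: how cluster.json fields map onto accelerate launch flags
# for the master machine (machine_rank 0); the worker would use machine_rank 1.
import json
import subprocess
import sys

cfg = json.load(open("cluster.json"))
cmd = [
    "accelerate", "launch",
    "--multi_gpu",
    "--num_machines", "2",
    "--num_processes", "2",              # assumption: one GPU per machine
    "--machine_rank", "0",
    "--main_process_ip", cfg["master_ip"],
    "--main_process_port", str(cfg["port"]),
    sys.argv[1],                         # e.g. train.py
    *sys.argv[2:],                       # extra args such as --epochs 10
]
subprocess.run(cmd)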

🧪 Example Training Script (train.py)

DPython works with any script using Hugging Face Accelerate.

Minimum required setup:

from accelerate import Accelerator

accelerator = Accelerator()
device = accelerator.device

DPython automatically manages:

  • Process ranks
  • Device placement
  • Multi-GPU synchronization
  • Distributed launch across machines
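
For context, a slightly fuller (but still minimal) Accelerate script could look like the sketch below; the linear model, random dataset, and optimizer are placeholders for illustration, not files from this repository:

# Placeholder example - the model, data, and optimizer here are illustrative only.
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
dataset = TensorDataset(torch.randn(256, 10), torch.randn(256, 1))
loader = DataLoader(dataset, batch_size=16)

# Accelerate handles device placement and distributed wrapping here.
model, optimizer, loader = accelerator.prepare(model, optimizer, loader)

for epoch in range(3):
    for x, y in loader:
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        accelerator.backward(loss)       # replaces loss.backward()
        optimizer.step()
    accelerator.print(f"epoch {epoch} finished")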

🖥️ Runtime Behavior

✅ When Remote GPU is Available

================ GPU STATUS ================
LOCAL GPU: NVIDIA GTX 1650
REMOTE GPU: NVIDIA RTX 3050

[DPYTHON] Launching remote worker...
[DPYTHON] Launching local master...
Distributed training initialized

⚠️ When Remote GPU is NOT Available

REMOTE GPU NOT AVAILABLE
Falling back to LOCAL GPU ONLY
[DPYTHON] Running single-GPU training

🧠 Why DPython?

Problem → Solution

  • Low VRAM GPUs → Multi-machine training
  • Manual Accelerate setup → Fully automated
  • Idle remote GPU → Auto utilization
  • Complex configs → Simple JSON
  • Research-grade setups → Production-ready script

🚧 Known Limitations

  • Python environment must match on all machines
  • Dataset paths must exist on both machines
  • LAN latency affects scaling efficiency

📜 License

MIT License. Free to use, modify, and distribute.


👤 Author

V R Vikash
AI & Distributed Systems Developer
Built for real-world, low-VRAM GPU environments


⭐ Support

If this project helped you:

  • ⭐ Star the repository
  • 🍴 Fork and improve
  • 🐛 Open issues or suggestions

Happy Distributed Training 🚀
