A high-performance Reinforcement Learning framework built from the ground up in Rust.
Features • Architecture • Quick Start • Roadmap • Contributing
The Reinforcement Learning ecosystem is dominated by Python frameworks (Stable-Baselines3, RLlib, CleanRL), which are powerful but carry inherent limitations:
| Pain Point | Python Frameworks | RustForge RL |
|---|---|---|
| Runtime Speed | GIL bottleneck, interpreter overhead | Zero-cost abstractions, native speed |
| Memory Safety | Runtime errors, memory leaks | Compile-time guarantees via ownership |
| Concurrency | Fragile multiprocessing | Fearless concurrency with Send/Sync |
| Deployment | Heavy runtimes, dependency hell | Single static binary, no runtime |
| Reproducibility | Floating-point non-determinism | Deterministic seeding at every layer |
RustForge RL aims to be the first comprehensive, production-grade RL framework in Rust — not just a toy implementation, but a framework you can use to train real agents and deploy them anywhere.
A PyTorch-style tensor library built on top of `ndarray`:

- Creation: `from_vec`, `zeros`, `ones`, `eye`, `arange`, `linspace`, `scalar`, `full`
- Shape Transforms: `reshape`, `flatten`, `transpose`, `permute`, `unsqueeze`, `squeeze`
- Arithmetic: Overloaded `+`, `-`, `*`, `/` with full broadcasting support
- Matrix Math: `matmul` supporting dot products, matrix-vector, and batch matrix multiplication
- Reductions: `sum`, `mean`, `max`, `argmax`, `var`, `std_dev` (with axis + keepdim support)
- Activations: `relu`, `sigmoid`, `tanh`, `softmax`, `log_softmax` (numerically stable)
- Math Ops: `exp`, `log`, `pow`, `sqrt`, `abs`, `clamp`, `neg`, `reciprocal`
- Concatenation: `cat` and `stack` with arbitrary axis support
- Random Init: Uniform, Normal, Xavier/Glorot, Kaiming/He initialization strategies
- Display: PyTorch-style pretty printing with automatic truncation for large tensors
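The broadcasting rules above follow NumPy/PyTorch semantics: shapes are aligned from the trailing dimension, and each dimension pair must be equal or contain a 1. As a standalone illustration (not the actual `rustforge-tensor` code), the shape-resolution logic can be sketched as:

```rust
/// Compute the broadcast shape of two shapes, NumPy/PyTorch-style:
/// align trailing dimensions; each pair must match or contain a 1.
fn broadcast_shape(a: &[usize], b: &[usize]) -> Option<Vec<usize>> {
    let n = a.len().max(b.len());
    let mut out = Vec::with_capacity(n);
    for i in 0..n {
        // Missing leading dimensions are treated as 1.
        let da = if i < n - a.len() { 1 } else { a[i - (n - a.len())] };
        let db = if i < n - b.len() { 1 } else { b[i - (n - b.len())] };
        match (da, db) {
            (x, y) if x == y => out.push(x),
            (1, y) => out.push(y),
            (x, 1) => out.push(x),
            _ => return None, // incompatible shapes
        }
    }
    Some(out)
}

fn main() {
    assert_eq!(broadcast_shape(&[3, 1], &[1, 4]), Some(vec![3, 4]));
    assert_eq!(broadcast_shape(&[32, 64], &[64]), Some(vec![32, 64]));
    assert_eq!(broadcast_shape(&[2, 3], &[4]), None);
    println!("broadcasting rules check out");
}
```

Returning `Option` here mirrors how a tensor library can reject incompatible shapes at the API boundary instead of panicking mid-operation.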
- `Variable` wrapper with gradient tracking (`Rc<RefCell<...>>`)
- Dynamic computational graph construction via the `GradFn` trait
- Backward pass via topological sort + chain rule
- 17 gradient mappings for operations and math functions
- Numerical gradient checking (finite difference method)
- Optimizers: SGD (w/ momentum), Adam (bias-corrected)
- Layers: `Linear`, `Conv2d`, `BatchNorm`, `LayerNorm`; `Sequential` container; `Module` trait
- Loss functions: MSE, CrossEntropy, Huber
- Model serialization and checkpointing
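The numerical gradient checking mentioned above validates analytic gradients with the central-difference formula, (f(x+h) − f(x−h)) / 2h. A minimal standalone sketch of the idea (function names are illustrative, not the autograd crate's API):

```rust
/// Central-difference numerical gradient: (f(x+h) - f(x-h)) / (2h).
/// The error is O(h^2), much better than the one-sided difference's O(h).
fn numerical_grad(f: impl Fn(f64) -> f64, x: f64, h: f64) -> f64 {
    (f(x + h) - f(x - h)) / (2.0 * h)
}

fn main() {
    // f(x) = x^2 has the analytic gradient 2x.
    let f = |x: f64| x * x;
    let x = 3.0;
    let analytic = 2.0 * x;
    let numeric = numerical_grad(f, x, 1e-5);
    // The two should agree to within the finite-difference error.
    assert!((analytic - numeric).abs() < 1e-6);
    println!("analytic = {analytic}, numeric = {numeric}");
}
```

Running this check against every backward implementation is a cheap way to catch sign errors and missing chain-rule terms.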
- Value-Based: DQN, Double DQN, Dueling DQN, Prioritized Experience Replay
- Policy Gradient: REINFORCE, A2C, PPO (clip & penalty variants)
- Off-Policy: SAC, TD3, DDPG
- Environment Interface: Gymnasium-compatible trait for custom environments
- Replay Buffers: Uniform, Prioritized (SumTree), HER
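The SumTree backing the prioritized replay buffer stores priorities in the leaves of a binary tree and partial sums in the internal nodes, so sampling proportionally to priority is O(log n). A minimal illustrative sketch (not the planned rustforge-rl implementation):

```rust
/// Minimal sum-tree for prioritized experience replay.
/// Leaves hold per-transition priorities; internal nodes hold subtree sums.
struct SumTree {
    tree: Vec<f64>, // 2*capacity - 1 nodes; leaves occupy the tail
    capacity: usize,
}

impl SumTree {
    fn new(capacity: usize) -> Self {
        Self { tree: vec![0.0; 2 * capacity - 1], capacity }
    }

    /// Sum of all priorities (the root).
    fn total(&self) -> f64 {
        self.tree[0]
    }

    /// Set the priority of leaf `idx` and propagate the change upward.
    fn update(&mut self, idx: usize, priority: f64) {
        let mut i = idx + self.capacity - 1;
        let delta = priority - self.tree[i];
        self.tree[i] = priority;
        while i > 0 {
            i = (i - 1) / 2;
            self.tree[i] += delta;
        }
    }

    /// Find the leaf whose cumulative-priority interval contains `value`,
    /// where `value` is drawn uniformly from [0, total()).
    fn sample(&self, mut value: f64) -> usize {
        let mut i = 0;
        while 2 * i + 1 < self.tree.len() {
            let left = 2 * i + 1;
            if value <= self.tree[left] {
                i = left;
            } else {
                value -= self.tree[left];
                i = left + 1;
            }
        }
        i - (self.capacity - 1)
    }
}

fn main() {
    let mut tree = SumTree::new(4);
    for (i, p) in [1.0, 2.0, 3.0, 4.0].iter().enumerate() {
        tree.update(i, *p);
    }
    assert_eq!(tree.total(), 10.0);
    // value 0.5 falls in leaf 0's [0, 1) interval; 9.5 in leaf 3's [6, 10).
    assert_eq!(tree.sample(0.5), 0);
    assert_eq!(tree.sample(9.5), 3);
}
```

The flat `Vec` layout (children of node i at 2i+1 and 2i+2) keeps the structure cache-friendly and avoids pointer chasing.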
- PyO3-powered Python API for seamless integration
- NumPy array interop (zero-copy where possible)
- Drop-in replacement for select PyTorch/SB3 workflows
- Real-time web-based monitoring via Axum + WebSocket
- Live reward curves, loss plots, episode statistics
- Hyperparameter tracking and experiment comparison
RustForge RL is organized as a Cargo workspace with strict, one-directional dependencies:
```
rustforge-rl/
├── Cargo.toml                 # Workspace root
├── crates/
│   ├── rustforge-tensor/      # 🧮 Tensor computation engine
│   │   └── src/
│   │       ├── lib.rs         # Crate entry point & re-exports
│   │       ├── tensor.rs      # Core Tensor struct + operations
│   │       ├── ops.rs         # Operator overloading (+, -, *, /, matmul)
│   │       ├── shape.rs       # Broadcasting rules & shape utilities
│   │       ├── random.rs      # Random initialization strategies
│   │       ├── display.rs     # Pretty-print formatting
│   │       └── error.rs       # Type-safe error definitions
│   │
│   ├── rustforge-autograd/    # 🔄 Automatic differentiation
│   ├── rustforge-nn/          # 🧠 Neural network layers
│   └── rustforge-rl/          # 🎮 RL algorithms
│
├── examples/                  # Runnable examples (coming soon)
└── benches/                   # Performance benchmarks (coming soon)
```
Dependency graph:
```
tensor ← autograd ← nn ← rl
                         ↓
           dashboard, python bindings
```
Each layer only depends on the layer below it, ensuring clean separation of concerns and independent testability.
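In Cargo terms, that one-directional layering is enforced by each crate's dependency list. A hypothetical sketch of the workspace manifests (crate names per the tree above; paths and layout illustrative):

```toml
# Workspace root Cargo.toml (illustrative)
[workspace]
members = [
    "crates/rustforge-tensor",
    "crates/rustforge-autograd",
    "crates/rustforge-nn",
    "crates/rustforge-rl",
]

# crates/rustforge-nn/Cargo.toml — depends only on the layers below it
[dependencies]
rustforge-tensor = { path = "../rustforge-tensor" }
rustforge-autograd = { path = "../rustforge-autograd" }
```

Because `cargo` rejects dependency cycles, an accidental upward dependency (e.g. tensor depending on nn) fails at build time rather than slipping in silently.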
- Rust 1.75+ (2021 edition)
- A C compiler for `ndarray`'s BLAS backend (MSVC on Windows, GCC/Clang on Linux/macOS)
```bash
# Clone the repository
git clone https://github.com/YOUR_USERNAME/rustforge-rl.git
cd rustforge-rl

# Build the entire workspace
cargo build

# Run the tensor crate's tests (51 unit tests + 9 doc tests)
cargo test -p rustforge-tensor

# Run with optimizations for benchmarking
cargo build --release
```

```rust
use rustforge_tensor::Tensor;

fn main() {
    // Create tensors
    let weights = Tensor::xavier_uniform(&[128, 64], Some(42));
    let input = Tensor::rand_normal(&[32, 64], 0.0, 1.0, None);
    let bias = Tensor::zeros(&[128]);

    // Forward pass: output = input @ weights^T + bias
    let output = input.matmul(&weights.t()) + bias;

    // Activation
    let activated = output.relu();

    // Softmax for probability distribution
    let probs = activated.softmax(1).unwrap();

    println!("Output shape: {:?}", probs.shape());
    println!("Probabilities:\n{}", probs);
}
```

```rust
use rustforge_tensor::Tensor;

// Broadcasting: [3, 1] + [1, 4] → [3, 4]
let a = Tensor::from_vec(vec![1.0, 2.0, 3.0], &[3, 1]);
let b = Tensor::from_vec(vec![10.0, 20.0, 30.0, 40.0], &[1, 4]);
let c = &a + &b; // shape: [3, 4]

// Reductions
let data = Tensor::rand_uniform(&[100, 50], 0.0, 1.0, Some(0));
println!("Mean: {:.4}", data.mean().item());
println!("Std: {:.4}", data.std_dev().item());

// Matrix multiplication: scaled dot-product attention scores
let q = Tensor::randn(&[8, 64], Some(1)); // queries
let k = Tensor::randn(&[8, 64], Some(2)); // keys
let attention = q.matmul(&k.t()); // [8, 8] attention scores
let weights = (&attention / (64.0_f32).sqrt()).softmax(1).unwrap(); // scale by sqrt(d_k) = 8
```

| Phase | Milestone | Status |
|---|---|---|
| Phase 1 | Tensor Engine | ✅ Complete (51 tests passing) |
| Phase 1 | Autograd Engine | ✅ Complete (49 tests passing) |
| Phase 2 | Neural Network Modules | ✅ Complete (37 tests passing) |
| Phase 2 | Optimizers (SGD, Adam) | ✅ Complete |
| Phase 3 | DQN + CartPole | 📋 Planned |
| Phase 3 | PPO + Continuous Control | 📋 Planned |
| Phase 4 | SAC, TD3, DDPG | 📋 Planned |
| Phase 4 | Python Bindings (PyO3) | 📋 Planned |
| Phase 5 | Training Dashboard | 📋 Planned |
| Phase 5 | GPU Support (wgpu) | 📋 Planned |
Every operation is backed by comprehensive unit tests with numerical precision checks. We use approx for floating-point comparisons and deterministic seeding for reproducibility.
If you've used PyTorch, you'll feel right at home. Method names, broadcasting rules, and tensor semantics are intentionally aligned with PyTorch conventions.
Rust's ownership system lets us provide a safe, high-level API without runtime overhead. No garbage collector, no reference counting at the tensor layer — just stack-allocated wrappers around contiguous memory.
Each crate is independently usable. Need just tensors? Use rustforge-tensor. Want autograd without RL? Use rustforge-autograd. The workspace structure enforces clean boundaries.
RustForge RL is built on `ndarray`, which can leverage BLAS for matrix operations. Preliminary benchmarks on common operations:
| Operation | Shape | RustForge | Notes |
|---|---|---|---|
| MatMul | [512, 512] × [512, 512] | ~2ms | With OpenBLAS |
| Softmax | [1024, 1024] | ~1ms | Numerically stable |
| Xavier Init | [1024, 1024] | ~3ms | ChaCha20 RNG |
| Broadcasting Add | [1000, 1] + [1, 1000] | ~0.5ms | Native ndarray |
Note: Benchmarks are from development builds. Release builds (`--release`) are typically 10-30× faster.
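The "numerically stable" softmax in the table refers to the standard max-subtraction trick: shifting inputs by their maximum before exponentiating so `exp` never overflows. A standalone sketch of the technique (not the `rustforge-tensor` source):

```rust
/// Numerically stable softmax over a slice: subtract the max before
/// exponentiating so exp() never overflows for large inputs.
/// The shift cancels out, so the result is mathematically unchanged.
fn softmax(xs: &[f32]) -> Vec<f32> {
    let max = xs.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = xs.iter().map(|x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

fn main() {
    // Without the max-subtraction trick, exp(1000.0) overflows to infinity.
    let probs = softmax(&[1000.0, 1000.0]);
    assert!((probs[0] - 0.5).abs() < 1e-6);

    // The output always sums to 1.
    let sum: f32 = softmax(&[1.0, 2.0, 3.0]).iter().sum();
    assert!((sum - 1.0).abs() < 1e-6);
    println!("{probs:?}");
}
```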
We welcome contributions of all kinds! RustForge RL is in its early stages, making it an excellent time to get involved.
- 🐛 Bug Reports: Found an issue? Open a GitHub Issue with reproduction steps
- 📖 Documentation: Improve doc comments, add examples, write tutorials
- 🧪 Tests: Add edge cases, property-based tests, or integration tests
- 🚀 Features: Pick an item from the roadmap and submit a PR
- 💡 Ideas: Suggest new RL algorithms, optimizations, or API improvements
```bash
# Fork and clone
git clone https://github.com/YOUR_USERNAME/rustforge-rl.git
cd rustforge-rl

# Create a feature branch
git checkout -b feat/your-feature

# Make changes and run tests
cargo test --workspace
cargo clippy --workspace

# Submit a PR!
```

- Run `cargo fmt` before committing
- Run `cargo clippy` and address all warnings
- Add doc comments with `///` for all public items
- Include unit tests for new functionality
- Use Conventional Commits for commit messages
| Component | Technology |
|---|---|
| Language | Rust 2021 Edition |
| Tensor Backend | ndarray 0.16 |
| Random Number Generation | rand 0.8 + ChaCha20 |
| Serialization | serde + bincode |
| Logging | tracing |
| Testing | Built-in + approx |
| Future: Python Bindings | PyO3 |
| Future: Web Dashboard | Axum + WebSocket |
| Future: GPU | wgpu |
This project draws inspiration from and builds upon ideas in:
- PyTorch — API design and tensor semantics
- tch-rs — Rust bindings for libtorch
- candle — Minimalist ML framework in Rust
- burn — Deep learning framework in Rust
- Stable-Baselines3 — RL algorithm implementations
- CleanRL — Single-file RL implementations
- Mnih et al., "Playing Atari with Deep Reinforcement Learning" (DQN, 2013)
- Schulman et al., "Proximal Policy Optimization Algorithms" (PPO, 2017)
- Haarnoja et al., "Soft Actor-Critic" (SAC, 2018)
- Glorot & Bengio, "Understanding the difficulty of training deep feedforward neural networks" (Xavier Init, 2010)
- He et al., "Delving Deep into Rectifiers" (Kaiming Init, 2015)
This project is dual-licensed under:

- MIT License
- Apache License 2.0

You may choose either license. See LICENSE-MIT and LICENSE-APACHE for details.
Built with 🦀 Rust and ❤️ passion for RL
RustForge RL — Forging intelligent agents, one tensor at a time.