Watch the video:
A real-time AI video filter that uses phase correlation and fractal coherence analysis to create temporally stable visual transformations.
Most AI video filters suffer from severe hallucinations because they treat each frame independently. This project solves that by treating the video stream as a complex signal with measurable structural coherence (fractal dimension) and using phase locking to maintain temporal stability.
Uses OpenCV's phaseCorrelate() to calculate sub-pixel motion between frames via FFT. The previous AI-generated frame is physically warped to match your current movement before the next diffusion step, creating perfect temporal alignment.
Why this matters: Traditional optical flow is computationally expensive and unreliable. Phase correlation gives us the exact Fourier-domain shift in milliseconds, allowing real-time lock.
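A minimal sketch of this step, assuming grayscale float32 inputs for `phaseCorrelate` (the function and variable names below are illustrative, not the project's exact API):

```python
import cv2
import numpy as np

def phase_lock(prev_ai_frame, prev_gray, curr_gray):
    """Warp the previous AI frame by the FFT-measured shift between frames."""
    # phaseCorrelate needs single-channel float32/float64 inputs
    (dx, dy), _response = cv2.phaseCorrelate(
        np.float32(prev_gray), np.float32(curr_gray)
    )
    # Depending on the desired warp direction, the sign of (dx, dy) may need flipping
    h, w = prev_ai_frame.shape[:2]
    M = np.float32([[1, 0, dx], [0, 1, dy]])
    return cv2.warpAffine(prev_ai_frame, M, (w, h))
```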
Measures the information viscosity of the input signal by computing fractal dimension (β) across blur scales:
β = texture_complexity(blur_level_1) - texture_complexity(blur_level_4)
- High viscosity (low motion): System trusts its internal model → strong feedback → crystallization
- Low viscosity (high motion): System trusts sensory input → weak feedback → adapts to change
Why this matters: By measuring how structure degrades under blur, we detect scene stability without expensive optical flow. Only regenerate AI frames when coherence changes significantly.
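A hedged sketch of this measurement, assuming `texture_complexity` can be approximated by the variance of a Laplacian response after Gaussian blur (the project's exact estimator may differ):

```python
import cv2

def texture_complexity(gray, blur_sigma):
    """Texture energy remaining after Gaussian blur (variance of Laplacian)."""
    blurred = cv2.GaussianBlur(gray, (0, 0), blur_sigma)
    return cv2.Laplacian(blurred, cv2.CV_64F).var()

def fractal_coherence(gray):
    # beta gradient: how much structure survives as blur deepens
    return texture_complexity(gray, 1.0) - texture_complexity(gray, 4.0)
```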
Feedback loops naturally average and blur. To fight this, an unsharp mask is applied inside the feedback path:
output = (1 + sharpness) × locked_frame - sharpness × gaussian_blur(locked_frame)
Since phase lock keeps edges aligned across frames, the sharpener repeatedly reinforces structural lines, evolving the image toward a "line art attractor state."
Why this matters: Without this, the system collapses to gray mush within seconds. With it, fine details crystallize and stabilize.
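A small sketch of the in-loop sharpener, directly following the formula above (the parameter names and blur sigma are illustrative assumptions):

```python
import cv2

def crystallize(locked_frame, sharpness=1.0, sigma=3.0):
    """Unsharp mask inside the feedback path: reinforces phase-aligned edges."""
    blurred = cv2.GaussianBlur(locked_frame, (0, 0), sigma)
    # output = (1 + sharpness) * locked_frame - sharpness * blur(locked_frame)
    return cv2.addWeighted(locked_frame, 1.0 + sharpness, blurred, -float(sharpness), 0)
```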
| Traditional AI Filters | Phase-Locked Flow |
|---|---|
| Each frame independent | Frames phase-locked |
| Constant flickering | Temporally coherent |
| No scene understanding | Fractal coherence gating |
| Blur accumulation | Active sharpening in loop |
| Full-frame processing | Selective masking (face-only) |
- Python 3.10 or 3.11 (recommended)
- NVIDIA GPU with 8GB+ VRAM
- CUDA 11.8 or higher
```bash
# Clone the repository
git clone https://github.com/anttiluode/stableaiflow.git
cd stableaiflow

# Install PyTorch with CUDA (critical for performance)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118

# Install remaining dependencies
pip install -r requirements.txt
```

requirements.txt:

```text
opencv-python>=4.8.0
numpy>=1.24.0
torch>=2.0.0
diffusers>=0.21.0
transformers>=4.30.0
accelerate>=0.20.0
Pillow>=10.0.0
```

Run the app:

```bash
python StableAIflow.py
```

On first run, the model (~7GB) will download automatically. Subsequent runs start instantly.
Dream Strength (0.1 - 1.0)
- Low (0.3-0.5): Subtle artistic enhancement
- Medium (0.5-0.7): Strong stylization
- High (0.7-1.0): Complete transformation
Phase Lock Gravity (0.0 - 0.99)
- Controls feedback loop strength (temporal memory)
- 0.0: No memory, instant adaptation (flickery)
- 0.7: Balanced (recommended)
- 0.9+: Strong memory, crystallized structures
Loop Sharpness (Crystallizer) (0.0 - 2.0)
- 0.0: Soft, blurry (entropy wins)
- 1.0: Sharp, etched (structure wins); recommended
- 2.0: Hyper-sharp (can introduce artifacts)
Mask Scale (0.5 - 2.0)
- Adjusts face detection region
- 1.0: Default face size
- 1.5: Expands to include hair/shoulders
Enable Face Mask (checkbox)
- On: AI effect only on face, real background
- Off: Full-frame AI transformation
The measurement of β (fractal dimension) across blur scales acts as a zero-shot scene change detector:

```python
def should_regenerate(image, last_gradient, threshold):
    # beta at a shallow and a deep blur scale (blur radius in pixels)
    beta_shallow = measure_texture(image, blur=2)
    beta_deep = measure_texture(image, blur=8)
    gradient = beta_deep - beta_shallow

    # Stable scene: gradient stays consistent
    # Scene change: gradient shifts
    return abs(gradient - last_gradient) > threshold
```

This is not just blur detection; it is measuring information viscosity. Objects with true multi-scale structure (faces, buildings) resist blur, while noise collapses immediately. This lets us distinguish between:
- Camera shake (high viscosity change → don't regenerate)
- Actual motion (controlled viscosity change → regenerate)
- Stillness (no change → reuse last frame)
By warping the previous frame to match current motion, we provide the diffusion model with a temporally consistent starting point. This is equivalent to giving the model a "prediction" of what the current frame should look like, dramatically reducing the search space and preventing flicker.
The sharpening loop creates a natural attractor toward high-frequency structural patterns (edges, lines). Over time, the system converges on the simplest stable representation: clean line art. This is why "sketch" and "etching" styles emerge naturally without explicit prompting.
Tested on RTX 4060 (8GB):
- 15-20 FPS at 512×512 (SDXL Turbo, 2-4 inference steps)
- ~400ms per generation
- 70-90% frame reuse (coherence gating efficiency)
Memory usage:
- ~6GB VRAM (model + single frame buffer)
- ~2GB RAM (video capture + GUI)
Black screen on startup:
- Check camera index (try 0, 1, or 2)
- Ensure webcam not in use by another app
Model won't load:
- Check internet connection (first run)
- Verify CUDA installation: `python -c "import torch; print(torch.cuda.is_available())"`
- Try fallback: change the model to `"stabilityai/stable-diffusion-2-base"`
Low FPS / Stuttering:
- Reduce Dream Strength (fewer inference steps)
- Lower Phase Lock Gravity (less computation)
- Close other GPU-intensive apps
Face mask not working:
- Ensure good lighting
- Face camera directly
- Increase Mask Scale if face is large
- Check that Haar cascades are installed (part of opencv-python)
Traditional optical flow (Lucas-Kanade, Farneback) is:
- Computationally expensive (~50-100ms)
- Sensitive to lighting changes
- Requires iterative refinement
Phase correlation via FFT:
- Fixed per-frame cost: a single O(n log n) FFT, independent of motion magnitude
- Sub-pixel accuracy
- Robust to brightness/contrast changes
- Perfect for real-time applications
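As an illustrative comparison (not taken from the project), the two approaches can be timed on a synthetic frame pair:

```python
import time
import cv2
import numpy as np

prev = np.random.randint(0, 255, (512, 512), dtype=np.uint8)
curr = np.roll(prev, 3, axis=1)  # synthetic 3-pixel horizontal shift

t0 = time.perf_counter()
cv2.calcOpticalFlowFarneback(prev, curr, None, 0.5, 3, 15, 3, 5, 1.2, 0)
t1 = time.perf_counter()
shift, _ = cv2.phaseCorrelate(np.float32(prev), np.float32(curr))
t2 = time.perf_counter()

print(f"Farneback:      {(t1 - t0) * 1000:.1f} ms")
print(f"phaseCorrelate: {(t2 - t1) * 1000:.1f} ms, shift ≈ {shift}")
```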
```python
next_frame = diffusion(
    input = blend(
        webcam,
        phase_lock(last_ai_frame, motion_vector),
        weight = viscosity * gravity
    ),
    strength = strength
)
```
This creates a predictive coding loop: the system maintains an internal model (last_ai_frame) and only updates when prediction error (viscosity change) exceeds threshold.
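Putting the pieces together, here is a hedged sketch of that loop. It reuses the `phase_lock`, `crystallize`, and `fractal_coherence` sketches above and abstracts the diffusion call behind a `dream()` callable; the gate value and blend weights are illustrative assumptions, not the project's exact code.

```python
import cv2

def phase_locked_loop(get_frame, dream, gravity=0.7, strength=0.6, gate=0.05):
    """Generator yielding phase-locked AI frames from a webcam source."""
    last_ai, last_gradient, prev_gray = None, None, None
    while True:
        frame = get_frame()                                # BGR webcam frame
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        gradient = fractal_coherence(gray)                 # beta gradient (sketch above)

        if last_ai is None:
            last_ai = dream(frame, strength)               # cold start: plain img2img
        else:
            locked = phase_lock(last_ai, prev_gray, gray)  # align memory with motion
            locked = crystallize(locked)                   # in-loop unsharp mask
            if abs(gradient - last_gradient) < gate:
                last_ai = locked                           # coherent scene: reuse frame
            else:
                weight = min(gravity, 0.99)                # project also scales this by viscosity
                blended = cv2.addWeighted(frame, 1.0 - weight, locked, weight, 0)
                last_ai = dream(blended, strength)         # prediction error: regenerate

        last_gradient, prev_gray = gradient, gray
        yield last_ai
```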
Traditional video models (AnimateDiff, Stable Video Diffusion) require:
- Massive training datasets
- Temporal attention layers (expensive)
- Multi-frame context windows
This approach achieves temporal coherence through:
- Zero training (pure physics)
- Single-frame context (cheap)
- Explicit temporal alignment (phase lock)
Trade-off: less "intelligent" transitions, but 10-100× faster and no training required.
This project synthesizes ideas from:
- Predictive Coding (Friston et al.) - Brain as Bayesian inference engine
- Communication Through Coherence (Fries 2005) - Phase synchronization
- Fractal Dimension Analysis (Mandelbrot, Buzsáki) - Multi-scale structure
- Mechanistic Interpretability (Anthropic 2024) - Understanding model internals
MIT License - Free to use, modify, and distribute.
- Stability AI for SDXL Turbo
- OpenCV team for phase correlation implementation
- The research discussions that led to fractal coherence analysis
- Community feedback and testing
Note: This is experimental research software. While it runs in real-time, it's not optimized for production use. Use at your own risk, and have fun exploring the phase-locked state space!

