by Scott Howard Swain
Automates “cleaning” of synchronized dual-video recordings (host + guest) while keeping host/guest perfectly aligned.
Features:
- Removes filler words: Detects configured filler words from `config.py` and mutes them during processing. Depending on the length of the silence (including surrounding silence), the muted area may be cut in the "Cuts pauses" step.
- Normalizes guest loudness to match the host (or to a standard LUFS target).
- Reduces volume spikes in the guest track above a configured threshold.
- Cuts pauses: Shortens long mutual-silence pauses to a configurable minimum duration by removing the excess.
- Maintains sync by applying the same keep/remove decisions to both videos.
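The pause-cutting and sync features above can be illustrated with a minimal sketch. It is not the project's actual code: `Segment`, `intersect`, and `keep_segments` are hypothetical names, and it assumes each track's silences arrive as sorted, non-overlapping time ranges in seconds. The idea is to find silence shared by both tracks, trim each long shared pause down to a minimum length, and produce one keep list that is applied to both videos.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start: float  # seconds
    end: float    # seconds


def intersect(a, b):
    """Return the ranges where two sorted, non-overlapping silence lists overlap."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i].start, b[j].start)
        end = min(a[i].end, b[j].end)
        if start < end:
            out.append(Segment(start, end))
        # Advance whichever range finishes first.
        if a[i].end < b[j].end:
            i += 1
        else:
            j += 1
    return out


def keep_segments(duration, mutual_silence, min_pause):
    """Shorten every mutual pause longer than min_pause down to min_pause.

    Returns the time ranges to keep. The same list is meant to be applied
    to both the host and the guest video so they stay aligned.
    """
    keep = []
    cursor = 0.0
    for pause in mutual_silence:
        if pause.end - pause.start > min_pause:
            # Keep the first min_pause seconds of the pause, drop the rest.
            keep.append(Segment(cursor, pause.start + min_pause))
            cursor = pause.end
    if cursor < duration:
        keep.append(Segment(cursor, duration))
    return keep


host_silence = [Segment(2.0, 6.0), Segment(10.0, 11.0)]
guest_silence = [Segment(3.0, 7.0), Segment(10.2, 10.6)]
mutual = intersect(host_silence, guest_silence)
keep = keep_segments(12.0, mutual, min_pause=1.0)
```

Because `keep` is computed once from the mutual silence and reused for both files, neither track can drift relative to the other.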
Notes:
- The processing pipeline is detector/processor-based so behavior can be extended without rewriting the full flow.
- Most behavior is controlled via `config.py`, including enabled processors and the filler words list.
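To make the detector/processor idea concrete, here is a rough sketch of how such a pipeline could be wired. Every name in it (`run_pipeline`, `detect_fillers`, `mute_fillers`, the `enabled` set) is hypothetical and chosen for illustration; the real project's interfaces may look quite different. Detectors only report events, and processors act on those events only when they are enabled.

```python
def run_pipeline(media, detectors, processors, enabled):
    """Run all detectors, then apply each enabled processor in order."""
    events = {name: detect(media) for name, detect in detectors.items()}
    state = dict(media)
    for name, process in processors:
        if name in enabled:  # assumption: config lists enabled processors by name
            state = process(state, events)
    return state


def detect_fillers(media):
    """Toy detector: return indexes of filler words in a transcript."""
    fillers = {"um", "uh"}  # stand-in for a configured filler words list
    return [i for i, word in enumerate(media["words"]) if word in fillers]


def mute_fillers(state, events):
    """Toy processor: replace detected filler words with a muted marker."""
    words = list(state["words"])
    for i in events["fillers"]:
        words[i] = "<muted>"
    return {**state, "words": words}


media = {"words": ["so", "um", "hello", "uh", "there"]}
result = run_pipeline(
    media,
    detectors={"fillers": detect_fillers},
    processors=[("mute_fillers", mute_fillers)],
    enabled={"mute_fillers"},
)
```

With this shape, adding a new behavior means adding one detector and/or one processor and enabling it, rather than editing the main flow.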
flowchart LR
A[Probe media] --> B[Detect events]
B --> C[Mute filler words]
C --> D[Cut pauses]
D --> E[Normalize guest loudness]
E --> F[Reduce guest volume spikes]
F --> G[Render host output]
F --> H[Render guest output]
G --> I[Align processed pair]
H --> I
Give credit to the creator. This application is a labor of love. It's free to use, fork, and change, with no licensing rules. Please show your love by starring the repo at: https://github.com/ScotterMonk/AV-Cleaner
- Python >= 3.13
- FFmpeg available on PATH
pip install -r requirements.txt
- Run GUI: `py app.py`
- Run CLI: `python main.py process --host path/to/host.mp4 --guest path/to/guest.mp4`
- Override normalization mode: `python main.py process --host ... --guest ... --norm-mode MATCH_HOST|STANDARD_LUFS`
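As an illustration of what the two normalization modes mean, the sketch below computes the gain the guest track would need under each mode. It is not the tool's implementation: `guest_gain_db` and `db_to_linear` are hypothetical helpers, the -16 LUFS default is an assumed example target rather than the project's setting, and it assumes a single static gain is applied.

```python
def guest_gain_db(mode, guest_lufs, host_lufs, standard_lufs=-16.0):
    """Return the gain in dB that moves the guest to the target loudness.

    standard_lufs=-16.0 is an assumed example value, not the tool's default.
    """
    if mode == "MATCH_HOST":
        target = host_lufs
    elif mode == "STANDARD_LUFS":
        target = standard_lufs
    else:
        raise ValueError(f"Unknown normalization mode: {mode}")
    return target - guest_lufs


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10 ** (gain_db / 20)


# Guest measured at -23 LUFS, host at -18 LUFS.
match_gain = guest_gain_db("MATCH_HOST", guest_lufs=-23.0, host_lufs=-18.0)
standard_gain = guest_gain_db("STANDARD_LUFS", guest_lufs=-23.0, host_lufs=-18.0)
```

In this example the guest would be raised by 5 dB to match the host, or by 7 dB to reach the assumed standard target.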
Edit config.py to change thresholds (spikes/silence), normalization behavior, rendering options, enabled processors, and the filler words to detect/mute.
- Keep non-secret app behavior in `config.py`.
- Keep credentials and other secret values in `.env` at the project root.
- The app loads `.env` automatically on startup for both `python app.py` and `python main.py process ...`.
- Read values in Python with `os.getenv("YOUR_SECRET_NAME")`.
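Since `os.getenv` returns `None` when a variable is not set, it can help to fail early with a clear message. The helper below is a small sketch using only the standard library; `require_secret` is a hypothetical name and is not part of the app.

```python
import os


def require_secret(name):
    """Return the value of an environment variable or raise if it is missing or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing required setting {name}; add it to .env")
    return value
```

Calling `require_secret("YOUR_SECRET_NAME")` at the point of use keeps a missing `.env` entry from surfacing later as a confusing `None` error.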
- The tool always renders a processed, synced host + guest pair to preserve alignment.
- Outputs are written as MP4 (even if inputs are AVI/MKV/etc.).
- Audio extraction can load the entire audio track into RAM, which matters for long videos. 16 GB of system RAM should be fine for two ~4 GB 4K input videos.
- Frame-accurate cutting depends on codec/encoding settings; changing codecs can reduce cut accuracy.
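For a rough sense of the RAM note above, decoded audio size can be estimated from duration alone, since uncompressed PCM size does not depend on the video file size. The numbers below are assumptions for illustration (48 kHz, stereo, 32-bit float samples); the tool's actual decode format may differ, and `pcm_bytes` is a hypothetical helper.

```python
def pcm_bytes(duration_s, sample_rate=48_000, channels=2, bytes_per_sample=4):
    """Estimate the size of uncompressed PCM audio held in memory."""
    return int(duration_s * sample_rate * channels * bytes_per_sample)


one_hour = pcm_bytes(3600)           # one track, 60 minutes
both_tracks = 2 * one_hour           # host + guest
one_hour_gib = one_hour / 1024 ** 3  # about 1.29 GiB under these assumptions
```

Under these assumptions, a one-hour host + guest pair needs roughly 2.6 GiB for raw audio, before any working copies made during processing.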
- Run tests:
pytest