Local & private
Whisper-class speech-to-text runs on your machine. Audio is never uploaded; the only network calls are when you ask CMVideo to download a video from a URL.
v0.4.0-alpha · open · local-first · live
Drag in any clip. Drop out a clean copy. CMVideo finds every swear and slur with on-device speech recognition, then silences them, beeps them, or replaces them with a Microsoft Sam-style TTS voice, whichever you pick.
100% local. No accounts, no telemetry, no clip uploads. Audio never leaves your machine.
Drop one or many files, or paste a YouTube / yt-dlp-supported URL. Pick the output format (MP4, MOV, MP3, WAV, OGG) and quality. Batches run unattended.
Replace flagged words with crisp silence, a classic 1 kHz censor beep, or a Microsoft Sam style TTS overlay that turns a 7-hour stream into something genuinely watchable.
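As a rough illustration of the silence and beep modes described above, the sketch below overwrites flagged time spans in a list of mono float samples. It is not CMVideo's actual code: the `censor` function, the 16 kHz sample rate, and the 0.3 beep amplitude are assumptions chosen for the example. Only the 1 kHz tone frequency comes from the text.

```python
import math

SAMPLE_RATE = 16_000  # assumed sample rate in Hz (hypothetical, not from CMVideo)


def censor(samples, spans, mode="beep", rate=SAMPLE_RATE, freq=1000.0, amp=0.3):
    """Return a copy of `samples` with each (start_s, end_s) span replaced.

    mode="silence" writes zeros; any other mode writes a sine tone at `freq` Hz.
    """
    out = list(samples)
    for start_s, end_s in spans:
        # Convert seconds to sample indices and clamp to the clip length.
        lo = max(0, int(start_s * rate))
        hi = min(len(out), int(end_s * rate))
        for i in range(lo, hi):
            if mode == "silence":
                out[i] = 0.0
            else:
                # Phase restarts at the beginning of every flagged span.
                out[i] = amp * math.sin(2 * math.pi * freq * (i - lo) / rate)
    return out


if __name__ == "__main__":
    clip = [0.5] * SAMPLE_RATE  # one second of placeholder audio
    cleaned = censor(clip, [(0.25, 0.5)], mode="beep")
    print(len(cleaned), cleaned[3999], cleaned[8000])
```

A real implementation would likely add short fades at the span edges to avoid clicks; that step is left out here to keep the sketch small.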
Source release. Requires Python 3.10+. The bundled install.sh / install.ps1 sets up ffmpeg and a private venv with the heavy ML deps on first launch, usually in 1-3 minutes.
.tar.gz · install with ./install.sh
.zip · right-click install.ps1 → Run with PowerShell
.tar.gz · works via the Linux instructions (untested)
Checksums and full release notes: github.com → releases → v0.4.0-alpha
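If you want to check a downloaded archive against a published checksum, something like the following could work. It assumes the release lists SHA-256 digests, which the page does not state, and the archive name in the usage line is a hypothetical placeholder.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare the file's digest with the published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Usage would look like `matches("cmvideo.tar.gz", "<digest from the release page>")`, returning `True` only when the download is byte-for-byte identical to the published file.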
MP4, MOV, MP3, WAV, or OGG files from disk, or any URL from YouTube and the 1,800+ other sites yt-dlp supports.
Swears, slurs, or both. Choose Silence, Beep, or Fun TTS. Optionally save a full transcript.
CMVideo transcribes, finds every match (including phonetic / leet-speak variants), and writes the cleaned file next to the original.
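The matching step above can be sketched roughly as follows, covering only the leet-speak part. This is a simplified illustration, not CMVideo's matcher: the substitution table, the `normalize` and `flag_words` helpers, and the word dictionaries with `text`, `start`, and `end` keys are all assumptions, and phonetic matching is not attempted. Mild placeholder words stand in for a real blocklist.

```python
import re

# Hypothetical leet-speak substitutions; a real table would likely be larger.
LEET = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a",
    "5": "s", "7": "t", "@": "a", "$": "s",
})


def normalize(token):
    """Lowercase, undo leet substitutions, and drop anything that is not a-z."""
    token = token.lower().translate(LEET)
    return re.sub(r"[^a-z]", "", token)


def flag_words(words, blocklist):
    """Return (start, end) spans for transcript words whose normalized text is blocked."""
    targets = {normalize(w) for w in blocklist}
    return [(w["start"], w["end"]) for w in words if normalize(w["text"]) in targets]


if __name__ == "__main__":
    transcript = [
        {"text": "hello", "start": 0.0, "end": 0.4},
        {"text": "d4rn!", "start": 1.0, "end": 1.4},
        {"text": "H3ck,", "start": 2.0, "end": 2.3},
    ]
    print(flag_words(transcript, ["darn", "heck"]))
```

The returned spans are the kind of input a silence or beep pass would consume. Exact matching on normalized tokens keeps false positives low, at the cost of missing spellings the table does not cover.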