Audio Analysis Tools
The voiceprint.py tool provides acoustic fingerprinting, voice identification, and quality checking for the audiobook pipeline.
Build the fingerprint database
python3 voiceprint.py --build-db
Scans all WAV files in the wall directories and writes voice-db.json, recording MFCC, fundamental frequency (f0), spectral centroid, RMS, and duration for each voice.
Identify a voice
python3 voiceprint.py identify <file.wav>
Compares a WAV file against the fingerprint database and returns the closest matching voice with a confidence score.
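The matching step can be pictured as a nearest-neighbour search over stored feature vectors. The sketch below is only an illustration and is not taken from voiceprint.py: the flat name-to-vector database layout, the `closest_voice` helper, and the use of cosine similarity as the confidence score are all assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a silent or empty fingerprint matches nothing
    return dot / (norm_a * norm_b)


def closest_voice(query, db):
    """Return (name, score) for the database entry most similar to query."""
    if not db:
        raise ValueError("fingerprint database is empty")
    scores = {name: cosine_similarity(query, vec) for name, vec in db.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical 3-dimensional fingerprints; real MFCC vectors are longer.
db = {
    "narrator": [1.0, 0.0, 0.0],
    "villain": [0.0, 1.0, 0.0],
}
name, score = closest_voice([0.9, 0.1, 0.0], db)
print(name, round(score, 3))
```

Whatever scoring the tool really uses, a low top score is worth treating as "no confident match" rather than as an identification.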
Compare two files
python3 voiceprint.py compare <a.wav> <b.wav>
Runs a pairwise comparison of two audio files and reports the MFCC distance, f0 delta, and an overall similarity score.
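As a rough picture of what such a comparison might compute, the sketch below takes two already-extracted fingerprints and derives the three reported values. The dict layout, the `compare_fingerprints` name, and especially the `1 / (1 + distance)` similarity formula are assumptions made for illustration; the tool's own scoring may differ.

```python
import math


def compare_fingerprints(fp_a, fp_b):
    """Compare two fingerprints given as dicts with 'mfcc' (list) and 'f0' (Hz)."""
    if len(fp_a["mfcc"]) != len(fp_b["mfcc"]):
        raise ValueError("MFCC vectors must have the same length")
    # Euclidean distance between the mean MFCC vectors.
    mfcc_distance = math.dist(fp_a["mfcc"], fp_b["mfcc"])
    # Absolute difference in fundamental frequency, in Hz.
    f0_delta = abs(fp_a["f0"] - fp_b["f0"])
    # Assumed scoring: maps distance into (0, 1]; 1.0 means identical MFCCs.
    similarity = 1.0 / (1.0 + mfcc_distance)
    return {
        "mfcc_distance": mfcc_distance,
        "f0_delta": f0_delta,
        "similarity": similarity,
    }


# Hypothetical fingerprints with 3 coefficients each.
a = {"mfcc": [1.0, 2.0, 3.0], "f0": 120.0}
b = {"mfcc": [1.0, 2.0, 7.0], "f0": 180.0}
result = compare_fingerprints(a, b)
print(result)
```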
Check a script
python3 voiceprint.py check <script.md>
Reads an episode script with speaker tags, checks each generated audio line against the expected voice, and flags mismatches.
Scan a file (voice trace)
python3 voiceprint.py scan <file.wav>
Chunks a longer audio file into segments and identifies the voice in each chunk, producing a timeline trace of who is speaking when.
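The chunk-and-trace idea can be sketched in two steps: split the file's duration into fixed-length spans, then merge neighbouring spans that received the same label. The 3-second chunk length, the helper names, and the per-chunk labels below are hypothetical; the per-chunk identification itself is left out.

```python
def chunk_spans(total_seconds, chunk_seconds=3.0):
    """Split [0, total_seconds) into consecutive (start, end) spans."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    spans = []
    start = 0.0
    while start < total_seconds:
        end = min(start + chunk_seconds, total_seconds)
        spans.append((start, end))
        start = end
    return spans


def merge_trace(labels, spans):
    """Collapse consecutive chunks with the same label into one timeline entry."""
    timeline = []
    for label, (start, end) in zip(labels, spans):
        if timeline and timeline[-1][0] == label:
            # Same speaker as the previous chunk: extend that entry's end time.
            timeline[-1] = (label, timeline[-1][1], end)
        else:
            timeline.append((label, start, end))
    return timeline


spans = chunk_spans(10.0, chunk_seconds=3.0)
# Hypothetical per-chunk identification results, one label per span.
labels = ["narrator", "narrator", "villain", "villain"]
timeline = merge_trace(labels, spans)
for label, start, end in timeline:
    print(f"{start:5.1f}-{end:5.1f}s  {label}")
```

Shorter chunks give a finer timeline but leave less audio per identification, so the chunk length is a trade-off between resolution and reliability.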
Adding a New Voice
How to add a new voice reference to the wall and wire it into the audiobook pipeline.
- Record or obtain a reference clip: 10–30 seconds of clean speech, mono, 16–48 kHz sample rate, WAV format.
- Place the file: cast voices go in ~/w9/wall/cast/, standalone voices in ~/w9/wall/. Use snake_case: character_name_register.wav.
- Rebuild the database: run python3 voiceprint.py --build-db to add it to the fingerprint database.
- Register in the pipeline: add an entry to VOICE_MAP in generate_episode.py.
- Set volume scale: if RMS differs from default (0.07–0.09), add to VOLUME_SCALE.
- Test: run a short sample before full episode generation.
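Before placing a clip on the wall, it can help to check it against the requirements listed above. The following is a minimal standard-library sketch, not part of voiceprint.py: `inspect_clip` and `clip_problems` are hypothetical helpers, the sketch assumes 16-bit PCM, and it assumes the 0.07–0.09 RMS range refers to samples normalised to [-1, 1]. The demo writes a synthetic tone so the code runs without a real recording.

```python
import array
import math
import os
import sys
import tempfile
import wave


def inspect_clip(path):
    """Return channels, sample rate, duration (s), and normalised RMS of a 16-bit PCM WAV."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        width = wav.getsampwidth()
        frames = wav.getnframes()
        raw = wav.readframes(frames)
    if width != 2:
        raise ValueError("this sketch only handles 16-bit PCM")
    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    if samples:
        rms = math.sqrt(sum(s * s for s in samples) / len(samples)) / 32768.0
    else:
        rms = 0.0
    return {"channels": channels, "rate": rate, "duration": frames / rate, "rms": rms}


def clip_problems(info):
    """List the ways a clip misses the reference-clip requirements."""
    problems = []
    if info["channels"] != 1:
        problems.append("clip is not mono")
    if not 16000 <= info["rate"] <= 48000:
        problems.append("sample rate outside 16-48 kHz")
    if not 10.0 <= info["duration"] <= 30.0:
        problems.append("duration outside 10-30 seconds")
    if not 0.07 <= info["rms"] <= 0.09:
        problems.append("RMS outside 0.07-0.09; may need a VOLUME_SCALE entry")
    return problems


# Demo on a synthetic 12 s, 16 kHz mono tone (stands in for a real recording).
rate, seconds, amplitude = 16000, 12, 0.113
tone = array.array("h", (
    int(amplitude * 32767 * math.sin(2 * math.pi * 220 * n / rate))
    for n in range(rate * seconds)
))
if sys.byteorder == "big":
    tone.byteswap()  # write little-endian samples regardless of host
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_voice_neutral.wav")
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(tone.tobytes())
    info = inspect_clip(path)
problems = clip_problems(info)
print(info, problems or "clip looks usable")
```

For a real clip, the same two calls would be pointed at the recorded file, for example `clip_problems(inspect_clip("character_name_register.wav"))`.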