The Wall of Voices — Shapes of Intelligence

Audio Analysis Tools

The voiceprint.py tool provides acoustic fingerprinting, voice identification, and quality checking for the audiobook pipeline.

python3 voiceprint.py --build-db

Scans all WAV files in the wall directories and writes voice-db.json, which stores the MFCCs (mel-frequency cepstral coefficients), f0 (fundamental frequency), spectral centroid, RMS level, and duration of each voice.
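As a rough illustration of the two simplest fields in the database, the sketch below computes RMS and duration for one clip using only the standard library. The function name rms_and_duration is hypothetical, and the sketch assumes 16-bit PCM input with RMS normalised to the 0..1 range; how voiceprint.py actually computes these values is not documented here.

```python
import array
import math
import sys
import wave


def rms_and_duration(path):
    """Return (rms, duration_seconds) for a 16-bit PCM WAV file.

    Assumption: RMS is normalised to 0..1 by dividing by 32768.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        n_frames = wav.getnframes()
        rate = wav.getframerate()
        raw = wav.readframes(n_frames)

    # WAV data is little-endian; swap bytes on big-endian machines.
    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()
    if not samples:
        return 0.0, 0.0

    mean_square = sum(s * s for s in samples) / len(samples)
    rms = math.sqrt(mean_square) / 32768.0
    duration = n_frames / rate
    return rms, duration
```

MFCC, f0, and spectral centroid need a signal-processing library and are left out of this sketch.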

python3 voiceprint.py identify <file.wav>

Compares a WAV file against the fingerprint database and returns the closest matching voice with a confidence score.
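One plausible way to do this kind of matching is a nearest-neighbour lookup over feature vectors. The sketch below assumes the database has already been reduced to a dict of name to numeric vector, and it maps distance to confidence with 1 / (1 + distance); both the data layout and the confidence formula are assumptions, not the tool's documented behaviour.

```python
import math


def identify(features, db):
    """Return (best_name, confidence) for the closest reference vector.

    features: list of floats for the unknown clip.
    db: dict mapping voice name -> reference feature list (hypothetical layout).
    """
    if not db:
        raise ValueError("fingerprint database is empty")
    distances = {name: math.dist(features, ref) for name, ref in db.items()}
    best = min(distances, key=distances.get)
    # Assumed mapping: distance 0 gives confidence 1.0, falling towards 0.
    confidence = 1.0 / (1.0 + distances[best])
    return best, confidence
```

In practice the features would need scaling so that no single dimension (for example f0 in Hz) dominates the distance.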

python3 voiceprint.py compare <a.wav> <b.wav>

Compares two audio files pairwise and reports the MFCC distance, the f0 delta, and an overall similarity score.
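The three reported numbers could be combined roughly as follows. This sketch takes precomputed mean MFCC vectors and f0 values as input; the function name, the scale constants, and the exponential similarity formula are all illustrative assumptions rather than what voiceprint.py does.

```python
import math


def compare(mfcc_a, f0_a, mfcc_b, f0_b, mfcc_scale=50.0, f0_scale=100.0):
    """Return MFCC distance, f0 delta, and a 0..1 similarity score.

    mfcc_scale and f0_scale are arbitrary placeholders that control how
    quickly similarity decays; they are not taken from the tool.
    """
    mfcc_distance = math.dist(mfcc_a, mfcc_b)
    f0_delta = abs(f0_a - f0_b)
    similarity = math.exp(-(mfcc_distance / mfcc_scale + f0_delta / f0_scale))
    return {
        "mfcc_distance": mfcc_distance,
        "f0_delta": f0_delta,
        "similarity": similarity,
    }
```

Identical inputs give a similarity of 1.0, and the score decreases as either the MFCC distance or the pitch difference grows.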

python3 voiceprint.py check <script.md>

Reads an episode script with speaker tags, checks each generated audio line against the expected voice, and flags mismatches.
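The script's speaker-tag syntax is not specified here, so the sketch below assumes a simple hypothetical form, an upper-case name followed by a colon (for example "NARRATOR: text"). It shows only the bookkeeping: extracting the expected speakers and listing the lines where the identified voice differs.

```python
import re

# Hypothetical tag format: "SPEAKER_NAME: spoken text"
SPEAKER_LINE = re.compile(r"^([A-Z][A-Z0-9_ ]*):\s*(.+)$")


def parse_script(text):
    """Return a list of (speaker, line_text) pairs from tagged lines."""
    tagged = []
    for raw in text.splitlines():
        match = SPEAKER_LINE.match(raw.strip())
        if match:
            tagged.append((match.group(1).strip(), match.group(2).strip()))
    return tagged


def find_mismatches(expected, identified):
    """Return (line_number, expected, identified) for each disagreement."""
    return [
        (index, want, got)
        for index, (want, got) in enumerate(zip(expected, identified), start=1)
        if want != got
    ]
```

Here identified would come from running voice identification on each generated audio line; that step is outside the sketch.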

python3 voiceprint.py scan <file.wav>

Chunks a longer audio file into segments and identifies the voice in each chunk, producing a timeline trace of who is speaking when.
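The chunk-and-trace idea can be sketched independently of the audio analysis. The helper names below are hypothetical, and the fixed, non-overlapping chunk length is an assumption; the sketch computes chunk boundaries and then merges consecutive chunks with the same label into timeline spans.

```python
def chunk_bounds(total_seconds, chunk_seconds):
    """Return (start, end) pairs covering the file in fixed-size chunks."""
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    bounds = []
    start = 0.0
    while start < total_seconds:
        end = min(start + chunk_seconds, total_seconds)
        bounds.append((start, end))
        start = end
    return bounds


def merge_timeline(bounds, labels):
    """Merge adjacent chunks with the same label into (start, end, label)."""
    timeline = []
    for (start, end), label in zip(bounds, labels):
        if timeline and timeline[-1][2] == label:
            prev_start = timeline[-1][0]
            timeline[-1] = (prev_start, end, label)
        else:
            timeline.append((start, end, label))
    return timeline
```

The labels list would hold one identified voice per chunk, in order.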

To add a new voice reference to the wall and wire it into the audiobook pipeline:

Record or obtain a reference clip: 10–30 seconds of clean speech, mono, 16–48 kHz sample rate, WAV format.
Place the file: cast voices go in ~/w9/wall/cast/, standalone voices in ~/w9/wall/. Name it in snake_case, for example character_name_register.wav.
Rebuild the database: run python3 voiceprint.py --build-db to add the clip to the fingerprint database.
Register in the pipeline: add an entry for the voice to VOICE_MAP in generate_episode.py.
Set the volume scale: if the clip's RMS falls outside the default range (0.07–0.09), add an entry for it to VOLUME_SCALE.
Test: generate a short sample before running full episode generation.
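The volume-scale step can be sketched as a small helper. The 0.07–0.09 range comes from the steps above; the function name, the choice to aim for the midpoint of that range, and the dict layouts shown in the comments are assumptions, since the actual structure of VOICE_MAP and VOLUME_SCALE in generate_episode.py is not shown here.

```python
DEFAULT_RMS_RANGE = (0.07, 0.09)  # default range quoted in the steps above


def suggest_volume_scale(measured_rms, rms_range=DEFAULT_RMS_RANGE):
    """Return a gain factor for a clip, or None if it is already in range.

    Assumption: the scale is a multiplier that brings the clip's RMS
    to the midpoint of the default range.
    """
    if measured_rms <= 0:
        raise ValueError("measured_rms must be positive")
    low, high = rms_range
    if low <= measured_rms <= high:
        return None
    target = (low + high) / 2
    return round(target / measured_rms, 2)


# Hypothetical entries; the real dict layouts may differ.
# VOICE_MAP["character_name"] = "~/w9/wall/cast/character_name_register.wav"
# VOLUME_SCALE["character_name"] = suggest_volume_scale(0.04)
```

A quiet clip with an RMS of 0.04 would get a suggested scale of 2.0 under these assumptions, while a clip at 0.08 would need no entry.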