Reference: build/experiments/013-voice-and-pauses/gemini.wav Total Duration: 69.89s Speech Duration: 57.22s Silence Duration: 12.67s Silence Ratio: 18.1% Pause Count: 32 Avg Pause: 0.396s Max Pause: 0.768s Mean F0: 168.5 Hz Median F0: 161.0 Hz StdDev F0: 43.4 Hz F0 Range: 64-493 Hz Voiced: 22.5% Test: build/experiments/013-voice-and-pauses/kokoro-final.wav Total Duration: 72.66s Speech Duration: 61.17s Silence Duration: 11.50s Silence Ratio: 15.8% Pause Count: 28 Avg Pause: 0.411s Max Pause: 0.875s Mean F0: 206.0 Hz Median F0: 196.1 Hz StdDev F0: 46.3 Hz F0 Range: 69-402 Hz Voiced: 40.1% --- Pause Comparison --- Metric Ref Test Diff ------ --- ---- ---- Total Duration (s) 69.891 72.664 +2.773 Speech Duration (s) 57.219 61.165 +3.946 Silence Duration (s) 12.672 11.499 -1.173 Silence Ratio 0.181 0.158 -0.023 Pause Count 32 28 -4 Avg Pause (s) 0.396 0.411 +0.015 Max Pause (s) 0.768 0.875 +0.107 --- Pause Duration Histogram --- Ref: <0.2s=4 0.2-0.5s=20 0.5-1.0s=8 >1.0s=0 Test: <0.2s=3 0.2-0.5s=18 0.5-1.0s=7 >1.0s=0 --- Pitch Comparison --- Metric Ref Test Diff ------ --- ---- ---- Mean F0 (Hz) 168.534 205.967 +37.433 Median F0 (Hz) 160.999 196.089 +35.091 StdDev F0 (Hz) 43.409 46.265 +2.856 Min F0 (Hz) 63.745 69.332 +5.587 Max F0 (Hz) 493.125 402.032 -91.093 Voiced (%) 22.505 40.058 +17.552 F0 Range (Hz) 429.380 332.701 -96.680 --- Transcription --- Ref: The word was bridge. Lena was typing a quarterly report, revenue projections for the neural interface division, the kind of document she could produce half a sleep, when her fingers stopped. She'd meant to write budget. The sentence on her screen read, "Q3 bridge allocations exceeded projections by 14%. She stared at it, deleted bridge, typed budget. Her fingers moved correctly, the letters appeared correctly, but for a fraction of a second, between the intention and the keystroke, something had shifted." She leaned back in her chair. The echo device hummed against the bone behind her left ear, a sound so constant she usually forgot it was there, the way you forget the sound of your own breathing. The device had been part of her daily life for three years. It recorded her cognitive patterns, indexed her memories for easy retrieval, and ran passive diagnostics on her mental state. 40 million people wore one. It was as unremarkable as a wristwatch. Except that lately, the hum had changed pitch. Not dramatically, she might have imagined it, but enough to make her aware of the device in a way she hadn't been since the first week she'd worn it. Test: The word was "bridge." Lena was typing a quarterly report revenue projections for the Neural Interface Division. The kind of document she could produce half a sleep. When her fingers stopped, she'd meant to write "budget." The sentence on her screen read, "A three-bridge allocations exceeded projections by 14 percent." She starred at it, deleted "bridge" typed "budget." Her fingers moved correctly. The letters appeared correctly, but for a fraction of a second, and the keystroke, something had shifted. She leaned back in her chair. The echo device hummed against the bone behind her left ear. A sound so constant she usually forgot it was there. The way you forget the sound of your own breathing. The device had been part of her daily life for three years. It recorded her cognitive patterns, indexed her memories for easy retrieval, and ran passive diagnostics on her mental state. 40 million people wore one. It was as unremarkable as a wristwatch. Except that lately, the hum had changed pitch. Not dramatically, she might have imagined it, but enough to make her aware of the device in a way she hadn't been since the first week she'd worn it. Ref spectrogram: build/experiments/013-voice-and-pauses/ref-spectrogram.png Test spectrogram: build/experiments/013-voice-and-pauses/test-spectrogram.png