I just released a new version of NV Speech Player. Not the speechPlayer in eSpeak, but the actual NV Speech Player.
This version fixes the issues with the latest eSpeak that ships with NVDA, and I also enabled more Klatt settings.
http://grossgang.com/tts/speechPlayer%2 … nvda-addon


What do all the settings in the voice dialog mean?


3 (edited by slender 2017-03-18 18:52:28)

Many of them are settings related to Klatt synthesis, the method NV Speech Player uses. Some control the cascade formants, others control pitch. There are too many to explain one by one, so it's best to play around with them and hear what you get.

And thus the beast grew powerful, and fire and thunder swept the land. But Mammon stirred in their hearts, and the beast Foundered, and its Corpse arose, and commanded "thou shalt not fly in my name." And the blazes shall freeze cold, and the souls of the followers of Mammon shall learn to tremble in the
face of ice as they did before the fire.
from The Book of Ice, 10:13



The settings are explained in frame.h if you download the source code. I have copied the part of the code that explains them below.

    // voicing and cascade
    speechPlayer_frameParam_t voicePitch; // fundamental frequency of voice (phonation) in Hz
    speechPlayer_frameParam_t vibratoPitchOffset; // pitch is offset up or down by a fraction of a semitone
    speechPlayer_frameParam_t vibratoSpeed; // speed of vibrato in Hz
    speechPlayer_frameParam_t voiceTurbulenceAmplitude; // amplitude of voice breathiness from 0 to 1
    speechPlayer_frameParam_t glottalOpenQuotient; // fraction between 0 and 1 of a voice cycle that the glottis is open (allows voice turbulence, alters f1...)
    speechPlayer_frameParam_t voiceAmplitude; // amplitude of voice (phonation) source between 0 and 1
    speechPlayer_frameParam_t aspirationAmplitude; // amplitude of aspiration (voiceless h, whisper) source between 0 and 1
    speechPlayer_frameParam_t dcf1, dcb1; // change in Hz to the frequency and bandwidth of cascade formant 1 during the part of the voice cycle when the glottis is open
    speechPlayer_frameParam_t cf1, cf2, cf3, cf4, cf5, cf6, cfN0, cfNP; // frequencies of standard cascade formants, nasal (anti) 0 and nasal pole in Hz
    speechPlayer_frameParam_t cb1, cb2, cb3, cb4, cb5, cb6, cbN0, cbNP; // bandwidths of standard cascade formants, nasal (anti) 0 and nasal pole in Hz
    speechPlayer_frameParam_t ca1, ca2, ca3, ca4, ca5, ca6, caN0, caNP; // amplitudes of standard cascade formants, nasal (anti) 0 and nasal pole
    // fricatives and parallel
    speechPlayer_frameParam_t fricationAmplitude; // amplitude of frication noise from 0 to 1
    speechPlayer_frameParam_t pf1, pf2, pf3, pf4, pf5, pf6; // parallel formant frequencies in Hz
    speechPlayer_frameParam_t pb1, pb2, pb3, pb4, pb5, pb6; // parallel formant bandwidths in Hz
    speechPlayer_frameParam_t pa1, pa2, pa3, pa4, pa5, pa6; // amplitude of parallel formants between 0 and 1
    speechPlayer_frameParam_t parallelBypass; // amount of signal which should bypass parallel resonators from 0 to 1
    speechPlayer_frameParam_t preFormantGain; // amplitude from 0 to 1 of all vocal tract sound (voicing, frication) before entering formant resonators. Useful for stopping/starting speech
    speechPlayer_frameParam_t outputGain; // amplitude from 0 to 1 of final output (master volume)
    speechPlayer_frameParam_t endVoicePitch; // pitch of voice at the end of the frame length
