26

It sounds like it was made via Festival...
In which case, technically, it could be made to sing, as Flinger seems to more or less be an expansion on Festival. (Assuming it still exists?)

Some of my games
Keep up to date by following @Jeqofire on twitter!
Ear Ninja?


27

Heh, it sounds just so awesome... Lol! I just like such old-sounding robotic synths... :D

Red fox! :D
To see a world in a grain of sand, and a heaven in a wild flower.
Hold infinity in the palm of your hand, and eternity in an hour.
William Blake - Auguries of Innocence, line 1 to 4


28

Hi all,

I have now updated the initial post with a new link to a sample that I uploaded to Dropbox. I'll be interested to hear your opinions on the differences.

Kind regards,

Philip Bennefall


29

Philip, way to go. It does sound quite a bit better than the last demonstration. As you say, it does require some more work, but I like it so far. :D

Sincerely,
Thomas Ward
USA Games Interactive
http://www.usagamesinteractive.com


30

Hi, nice try with the voice. However, I'm still waiting for the final result...


31

Wow. Nice improvements there. How do you make a voice like this better? I thought you recorded lots and lots of sentences, which you might have done, but what next? What can you do to improve the speech quality?

Best regards SLJ.
If you like the post, then please give it a thumbs up.
Feel free to contact me privately if you have something in mind. If you do so, then please send me a mail instead of using the private message on the forum, since I don't check those very often.
Happy gaming... :D


32 (edited by raygrote 2012-09-28 12:22:34)

Hi,
The second voice sounds less jumpy, but needs a pretty big intonation increase in my opinion. It sounds monotone to me. But the intelligibility has improved.

Make more of less, that way you won't make less of more!
If you like what you're reading, please give a thumbs-up.


33

Yes, the latest version does sound less jumpy, but I must agree with Ray that it has next to no inflection. I think the pitch should be raised slightly, not lowered, but that is just my personal preference.

I can't wait to hear the next update! I would also be interested to know how you go about improving the overall quality of the voice.

My opinions are my own. I try not to state them as facts and if I'm not sure about something, I do whatever research I can. I feel everyone should consider doing the same.


34

Hi Philip!
If I listen very closely, it sounds pretty much like your voice, I mean the second recording. I couldn't believe that it was your "voice" speaking. lol How did you make it? I may have missed something, but I heard something about "Festival"; is it the development kit for a SAPI voice?

Keep up the good work!!!

For more randomness, follow me on Audioboom at Sanostro!
Aut enim do tibi, ut des, aut do, ut facias, aut facio, ut des, aut facio, ut facias.


35

Since someone necroed this, I've got to ask.  Is this using Phonemes?  Diphones?  Words?
Also, are you blind, Philip?  I'm just curious--everything I found made making a voice like this into a very complex and visual process (open up your audio tools, select this view, and start clipping with the mouse on a sample-perfect basis...).
And for the record, at least the Espeak with NVDA is great.  I wouldn't go to anything else, now.  So, so fast, and yet clear.  Not natural, but clear all the same.
The one on Linux, especially with Orca, sucks: you can't change the prosody/inflection/whatever it's officially called.  One of these days, I'm going to track someone down who can help me reconfigure it all to actually work well and get Libsonic support without disabling audio permanently (oops, and yay for VM).

My Blog
Twitter: @camlorn38 (Try Chicken Nugget)


36

Hello Camlorn,

Using the Festival and Festvox tools, it doesn't have to be a visual process at all. Sure you have to edit some audio, but the editing is trivial and can be automated for the most part, which suits me as I am completely blind and don't want to mess with visual interfaces either. I really just wanted to see how far I could take the voice, and since I am not particularly happy with the end result I have shelved it for now. I'll take it up again once a new version of Festvox is released that can do better prosodic models, but it's hard to tell when that will be. The method I use is called Clustergen, and creates a statistical model of my voice based on phonemes and diphones when available. So the size and phonetic balance/coverage are vital when you construct your dataset.
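To illustrate the point about phonetic coverage, here is a minimal sketch of one way to pick recording prompts so that each new sentence adds as many unseen diphones as possible. This is not part of Festvox and not necessarily how the dataset in this thread was built; the function names, prompt IDs, and toy phoneme lists are all made up for the example, and a real setup would take its phoneme sequences from a proper lexicon.

```python
from typing import Dict, List, Set, Tuple

Diphone = Tuple[str, str]


def diphones(phonemes: List[str]) -> Set[Diphone]:
    """Return the set of adjacent phoneme pairs in one utterance."""
    return set(zip(phonemes, phonemes[1:]))


def select_prompts(candidates: Dict[str, List[str]], limit: int) -> List[str]:
    """Greedily pick up to `limit` prompts, each adding the most unseen diphones.

    `candidates` maps a (hypothetical) prompt ID to its phoneme sequence.
    Selection stops early once no remaining prompt adds anything new.
    """
    covered: Set[Diphone] = set()
    chosen: List[str] = []
    remaining = dict(candidates)
    while remaining and len(chosen) < limit:
        # Iterate in sorted order so ties are broken deterministically.
        best = max(
            sorted(remaining),
            key=lambda name: len(diphones(remaining[name]) - covered),
        )
        gain = diphones(remaining[best]) - covered
        if not gain:
            break
        covered |= gain
        chosen.append(best)
        del remaining[best]
    return chosen


# Toy data: invented prompt IDs and simplified phoneme strings.
PROMPTS = {
    "p1": ["h", "e", "l", "ou"],
    "p2": ["h", "e", "l", "p"],
    "p3": ["s", "t", "o", "p", "i", "t"],
}

if __name__ == "__main__":
    print(select_prompts(PROMPTS, 2))
```

The greedy choice is only a heuristic: it tends to give good coverage with few recordings, but it says nothing about balance, that is, how often each unit appears.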

As for ESpeak, it's not for me. I only use it when I have absolutely no other option or if I am feeling particularly masochistic. Grins.

Kind regards,

Philip Bennefall


37

Awww, sorry to hear you've shelved it, I really liked it. :)


38

Yeah, the second recording was quite well made. Of course it's not the best TTS ever, but it would work. :)


39

Well, I love such voices that aren't "the best" as you say. :) Yeah, I'm weird. Heh.


40

To be honest, I thought the first recording of them all was the best.  I also now dislike Eloquence and think that the Espeak with NVDA is the best synthesis ever, so...take it with a grain of salt.
I need to look into this again.  If it's using statistics, it could be possible to get the model to be more accurate by providing more data from different people, or at least to get it sounding more interesting, and I kinda wonder if you couldn't somehow duplicate Eloquence with it by using recordings.
How are you automating audio?  I'm not familiar with any audio analysis and modification scripts, save maybe Nyquist, but that's probably overkill.  The rest are all geared towards music, or so it seems.
If I can get or find a good microphone and a quiet place to record, this could be a lot of fun to play with.


41

Hi guys,

After a few years of silence, I picked up this thread again. There have been some improvements in the voice creation tools, and I regenerated the voice with them using the existing dataset. Here's a new recording for those who may be interested:

https://www.dropbox.com/s/yb3zx4dt3rmdde4/text.wav?dl=1

I plan to rebuild the voice yet again with the latest snapshot of the tools that came out just a few days ago, but that's a project for the weekend when I'm off work.

Kind regards,

Philip Bennefall


42

I personally would just wait to see if Lyrebird releases their AI, as in my opinion it sounds a bit better than whatever you're using right now. Still, I definitely see myself downloading it if you made it a SAPI voice or if you made it for NVDA. Plus, I don't think Lyrebird will be releasing their AI any time in the near future...

You can follow me on twitter @brogar2000, and my Skype is garrett.brown2014. If you want to follow me on any of these things, please tell me you're from the forum, or else I won't follow you back. Also, it depends on who you are.


43

@shotgunshell I definitely agree that Lyrebird sounds better. My goal is really just to experiment with the Festvox system to see what results I can achieve. I will only release it if I get a voice that I personally consider usable, in which case I could easily make it into a Sapi voice or a DLL for NVDA or whatever other format people might want. I'm doing this in my free time, of which I don't have a lot, so I have no idea when/if I'll have something usable. I'm just playing around and wanted to revive this topic to post the current output.

Kind regards,

Philip Bennefall


44

Would you be willing to make it an Android TTS voice one day? I'd be willing to pay for it!

I myself quite like it!


45

I'm kind of curious about this app myself. Does it have command-line parameters?


46

Your links don't work.

Bitcoin Address:
1MeNca7h6m8du4TV3psN4m4X666p6Y36u5m


47 (edited by defender 2017-11-15 02:17:14)

Sounds like a fuzzy bucket. :-)
Decent inflection though, and it actually does sound like you.
No noticeable artifacts in the portion I heard either, but that crazy smoother thing that makes it sound fuzzy probably hides them all anyway...
Maybe it's the way you wrote it, but it seems kind of droning. Not enough commas? The sentences don't have defined separators, and there's no real inflection change at the ends.

This... -- Is CNN'.
Well Ted, it sure looks like there's been uh, quite a bit of violence around here
"aaoh, that violence was terrible'!"
Yeah it was, pretty bad.


48

If it were as easy as making several concatenative files with different speech samples, and the whole interface was made to be accessible, people could definitely put in the effort to make their own. As for whether or not they want to sell it...

Ulysses
AKA Green_Gables_fan and HeavenlyHarmony
My new, self-hosted version of WordPress!


49

The main thing I'm wanting to fix is the inflection, and the endless sentences. I have solutions for both, as well as a few other tweaks I want to do. I'll post another sample when I have it.

The old links don't work, but that version sounded awful. For kicks, here they are for comparison.

The very first link, with a tiny dataset of just 500 recordings:
https://www.dropbox.com/s/upe4x3ckssv5m … 0.wav?dl=1

And the second link, with a few more recordings and a slightly different rendering method:
https://www.dropbox.com/s/0w31xv2h2utvr … l.wav?dl=1

Now that, that's fuzzy for sure. I die a little every time I go back and listen to these. They were made with very very old versions of the tools.

When it comes to platforms, as long as I'm happy with the voice itself, I'm game to try building it for whatever I can get my hands on. But again, I have not decided at all what I'm going to do with it.

Kind regards,

Philip Bennefall


50

Hey Philip,
Could you provide a tutorial on how to make a SAPI voice using Festvox?

story in a game is like story in a spam movie. it's expected to be there, but it's not that important.
-John Romero
http://youtu.be/zqJ3Lisp6Cg
