2017-04-12 14:02:28

OK guys, here is the link to my feedback asking Microsoft to fork or take over Espeak and make the best Eloquence-sounding SAPI5 Klatt voice, built into Windows 10 and also usable on Android devices with TalkBack. Please up-vote and comment on it so that in future we don't have to fork our money over to Nuance to buy Eloquence.


https://aka.ms/Jt6vzv

2017-04-12 14:18:32

I like the idea, but I sadly doubt this would work without a complete reworking of Espeak. The algorithm is very different, and custom variants can only go so far. What's more, I don't know how Microsoft will prioritize this. Eloquence is probably only known by a small subset of Windows users, considering it's popular primarily among the blind community, which is what Nuance takes advantage of to make shitloads of money off an abandoned synth *cough*. But then there's the original Espeak. What makes you think the current maintainers of Espeak would want to actually let their project go? I think forking is the best approach in this scenario, not taking over.

2017-04-12 15:59:21

I do not think this is a good idea. Microsoft already have their own tts to deal with and develop.

2017-04-12 23:31:45

Aaron nailed it. As much as accessibility is more of a priority for Microsoft than it's ever been, modernizing a practically 15+ year old synthesizer is far beneath their priorities. No offense. Plus, Espeak will never sound like Eloquence. Two incredibly different voice models. Formant, yes. Could the algorithm match Eloquence's? Possibly, though not exactly without a complete reworking of the speaking cadence of the engine. Same diction and sample sound? Absolutely not. Not worth it in my honest opinion; you'll never be getting a perfect Eloquence impression from that engine. Hate to burst your bubble, but this is true. It was true for Keynote, it was true for Dectalk, and it's true for Eloquence now. Don't get me wrong, these old synths are great, but you seem to have this pattern of severely overestimating the Espeak engine's capabilities. It's great, versatile, multilingual *even if some of the actual language modules are laughable* but it will never go beyond its actual voice model. Affordable Eloquence for Android, or at least ARM Linux distros, *may* be a possibility. Don't go spreading this around like the best news in the world, because this is only a possibility. After getting a near immediate refund on my Voxin purchase I asked the guys behind that if they'd heard that Eloquence runs on Android, and they said they'd have to look into that because the Viavoice binary, obsolete but affordable, was purchased in 2012 and, being obsolete, was never updated. The unfortunate thing now is that I believe the reason they chose the Viavoice binary was to stay affordable. A great move, as it's 5 bucks for the package, but that means that their only other option may be going through Nuance, in which case they'd have to charge more, something they would much rather not do as their aim is to stay as affordable as possible.

2017-04-13 02:06:08

Well, I sent Microsoft my feedback earlier today. I don't expect Espeak to sound exactly like Eloquence, just maybe a little like Eloquence, with the language pronunciations improved and the Klatt voice improved as much as they can. But anyway, my feedback has been sent, so we'll just see what happens, if anything happens at all.

2017-04-13 06:27:49

I wish you luck. With that objective cleared up now, perhaps Espeak could sound like Eloquence's algorithm with some serious reworking of the voice module. The unfortunate thing with variants is that they exist around a centered, unmodified voice module that has a set speaking cadence. Variants have a small subset of knobs and switches, so to speak, compared to the main module. Basically the main module would have to be reworked, and I wouldn't expect just a custom variant if I was you, probably a stand-alone full-on Espeak fork. Technically possible? I should think so. Done by Microsoft? Doubt it. But that's alright. It's OK, because anyone who knows a synth's controls well enough to change the voice module that much can go for it and create said fork.

2017-04-13 07:23:56

Unless I record my voice and tell them that I'm the first voice for the Panamanian Spanish language

73 Wj3u

2017-04-13 08:49:30

I agree with Josh and Jack, in the sense that they shouldn't make E-Speak sound like Eloquence, but they should definitely either get rid of the voices they have, because they don't sound too great, or make them sound better. On another note, if E-Speak is open source, doesn't that revoke their right to say that Microsoft or some other company can't make their own branch of it?

Discord: dangero#0750
Steam: dangero2000
TWITCH
YOUTUBE and YOUTUBE DISCORD SERVER

2017-04-13 22:54:16

Sorry Josh if this comes across as me attacking you, but seriously. You need to understand what you're asking for. I see you do this time and time again, first with Keynote for NVDA/Android, and you still pushed that, even after person after person told you why it couldn't be done or would be very difficult. If you really want ESpeak to sound better or like Eloquence, maybe you should consider learning how to code. And as for Eloquence? Well, get disassemblin', bud. Maybe when you figure out the amazing algorithms behind it, you could integrate them into ESpeak, and, hmm, maybe send your changes as a pull request? :D

Oh no! Somebody released the h key! Everybody run and hide!

2017-04-14 05:15:44 (edited by jack 2017-04-14 05:23:56)

John and Slender basically nailed it. Let's take this hypothetical situation where something possessed Microsoft to do this wonderful branch you speak of. In a situation where Microsoft will not hide their understandable but blatant unwillingness to work with anything open source, they have two choices. They can release their branch as closed source and proprietary, thereby violating the GPL. And don't think for a second that the outcry would be falling on deaf ears either. Microsoft has been a prime target for the guys over at the Free Software Foundation, and rightfully so, because Microsoft are very much defective by design, but I won't go on for hours about the reasons why. Their second option, which wouldn't put them at risk of tarnishing their reputation, would be to simply not want anything to do with anything open source. Only, sidenote, it doesn't seem like that's entirely true: it's said that their server is running on Apache, *ahem* if you wanna talk hypocrisy, that's it. They could've damn well run their server on, well duh, Windows Server? Granted there's Apache for Windows, but that too is open source, and if they were as anti-open source as they say they are they wouldn't be practically endorsing it by using Apache. But enough rambling about that, just thought I'd point that out. OK, back to the real situation. Even if Microsoft somehow wanted to work with open source software, making a synth is beneath them by a longshot. That's akin to asking Apple to make a braille display. It, just, makes, no, sense. Either way, what makes you think they'll carry through with it? You're old enough to have seen Microsoft through their early synth efforts. You've probably seen every cop-out, every letdown. They were going to make a pretty damn good singing synthesizer, which is rumored to have existed as a prototype in a beta of Windows Whistler, which is what Windows XP was *going* to be, only to abort it and trash it like it was nothing.
Much later, they disappointed us with Microsoft Anna, and also by removing all the good Microsoft voices from Vista onward. And now they're pulling some half-assed prerecorded wave samples combined with some speech units on us and then calling it a synth! I mean just look at David! Turn that key echo on and press some letters and numbers, and you can clearly tell that the letter and number names are not speech units. And while I still would've thought it was a cop-out if this wasn't part of the equation, what gets me and just makes me laugh is the half-assed attempt at trimming the samples, and that's a dead giveaway that they're not part of the speech units at all. Any true synth developer would dry heave the moment they so much as heard the prerecorded samples, but hearing some of the static that wasn't cut off the end of some of the samples is enough to wish the barf-bag was nearby if it wasn't already. lol Then there's the gear being used. I could clearly hear static. It's ironic that one of the biggest megamillion dollar fat corporations uses gear that is of the caliber of a lower end consumer-grade digital recorder or USB desktop mic. It's sad, and it honestly makes me wonder how much, or how little, money they allocate towards their apparent version of synth-creation. I'm sorry, but while they may be great with Xbox accessibility (and don't get me wrong, I really do commend them for that despite all this), I will not fall for any more of their synth-creating escapades unless they pull something of the caliber of Google Wavenet out of thin air. My point? Simple, and I'll make it as clear as possible. Synth-making, is not, Microsoft's department.

2017-04-14 16:40:31 (edited by musicalman 2017-04-14 16:43:12)

Yep, pretty much everything has been said that needs to be said. Microsoft has no interest in open source products or Android. Why you would think they would take such a project is beyond me.
I do have to respectfully disagree with Jack on how horrible their TTS is. While it's true that their singing synth was supposed to be really cool and was shelved before release, and while it's true that many including myself consider Anna a disgrace to synthesizers everywhere, and while letters and numbers sound like crap with David, I still hold the opinion that Microsoft's TTS is better now than it's ever been. The old XP SAPI voices were just plain annoying for me to listen to, especially the SAPI 5 versions. Anna was even worse. But David actually has some clarity. Sure, the voices are still not the best. Sure, the pauses are too long, the voices are still laggy, they speed up poorly, they may have pronunciation issues, they may do odd things at times... But you know what? Most people who will be using SAPI voices aren't going to care about those things as much as we do. They just want something they can understand, and the current voices are better than anything else Microsoft has released before. I think the voices from Win8 onwards are well-suited for lighter TTS needs. There are additional mobile voices on Win10 that, with a little registry tweaking, can, so far as I've heard, be unlocked. I haven't tried to use them myself, but I've heard them elsewhere and some of them are imho somewhat better than the default voices.
If the TTS and its platform were improved to address the needs of screen reader users, then I would consider using them full-time if I didn't have Eloquence. If that fails, Microsoft could just buy Vocalizer like everyone else seems to be doing. I just hope Vocalizer wouldn't get a laggy SAPI incarnation. One might say, "Oh it's Microsoft so they're gonna screw it up," but I'd like to be positive. Their TTS and accessibility have taken a step in the right direction, and maybe they will continue to do so. So they haven't caught up to Wavenet? So what? That's not even a working prototype yet so far as I know, and is only in the experimental stage. To be honest, with a little work I believe their TTS could be adequate even for our use. If any resources go to improving the TTS, I'd vote that they should go to building on what they have already done, rather than to a ridiculous proposition like forking ESpeak.

Make more of less, that way you won't make less of more!
If you like what you're reading, please give a thumbs-up.

2017-04-14 17:04:13

I don't think Microsoft will do it, but I also don't think it's ridiculous. E-Speak is a fast and reliable TTS engine, unlike the Microsoft voices, which lag a lot and don't stop speaking when they're supposed to on some occasions.


2017-04-14 17:08:53

To be fair Shotgun, I have had that with NVDA when my laptop is under high load like running a VM for instance.

Also, MS doesn't like open source software...yet...they go put a bash shell into Win10 Anniversary update. Kind of hypocritical if you ask me really...

Warning: Grumpy post above
Also on Linux natively

Jace's EA PGA Tour guide for blind golfers

2017-04-14 18:13:09

@DracoSelene89 That's precisely what I'm saying. Microsoft hates open source software yet they pretty much welcome it into Windows 10. How do you explain that? And Ray, OK, I'll agree with you on that. As much as post 11 was a slam on the prerecorded samples, I'll agree that the voices themselves definitely sound good for what they are. Responsive? No, unless you're using the mobile voices, which definitely do sound better and are also available directly through an NVDA addon that exposes NVDA to the mentioned exploit to get the voices. The latency and the prerecorded samples are pretty much the only problems I have with the voices. I just think prerecorded samples are not the way to go for synth creation, because they don't support pitch/speed change, or at least not as effectively as most synths. It's like a non-server-dependent version of Speech Hub, in the sense that the most used phrases are essentially loaded into memory. Now don't get me wrong, I like Speech Hub, but you'll notice in some instances, if you change the pitch and the speed of a voice after hitting some keys, next time you hit those keys you may be getting the samples you heard before. That's what I'm talking about when I say it's all preloaded into memory. Except with Windows 10, that's always how it is: they're just loaded into the framework, permanently. The actual synthesis units are more than satisfactory, but the samples, not so much. Then again, TTS is not Microsoft's department, and most people probably wouldn't care in the least. Which is why I do agree, Microsoft partnering with Nuance would be a good idea despite Nuance's track record of borderline unethical domination of the market. It would really give Microsoft a solid synthesizer to work with, allowing them to focus their efforts on accessibility without worrying about the speech implementation, so long as the implementation was just as solid. And no matter how much you try, just know that Espeak will never sound like Eloquence.
You may get the algorithm if you try hard enough, but keep in mind that most parametric synths use vocoders. Don't believe me? Turn some of the dials down to 0 on the NV Speech Player, and you'll notice you'll hear straight tones. That's exactly how it works. The phonemes are run through signal processing and matched to the vocoder's processing as well. Normal speech would be samples as the modulator with tones as the carrier. A whisper voice consists of the same phonemes as the modulator, and a white-noise waveform as the carrier. Eloquence's signal processing is different to that of other synths, so even if you managed to get the algorithm pitch perfect, you'd never be able to imitate the true sound of Eloquence due to the different signal processing. It's akin to trying to make Espeak sound like a Votrax.
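
To make the tone-versus-noise idea above a bit more concrete, here is a minimal Python sketch. It is not Eloquence's, Espeak's, or NV Speech Player's actual code. It swaps in the textbook Klatt-style two-pole resonator (a source-filter model) rather than a true vocoder, and the sample rate, pitch, and formant values are assumptions picked purely for illustration. The same formant filters are fed either a pulse train (a buzzy voiced sound) or white noise (a whisper).

```python
import math
import random

SAMPLE_RATE = 16000  # Hz; an assumed value for this sketch


def resonator(signal, freq, bandwidth, rate=SAMPLE_RATE):
    """Two-pole resonator: y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / rate
    c = -math.exp(-2.0 * math.pi * bandwidth * t)
    b = 2.0 * math.exp(-math.pi * bandwidth * t) * math.cos(2.0 * math.pi * freq * t)
    a = 1.0 - b - c  # normalises the gain at 0 Hz to 1
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def pulse_train(n, pitch, rate=SAMPLE_RATE):
    """Voiced source: one impulse per pitch period."""
    period = int(rate / pitch)
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


def white_noise(n, seed=0):
    """Whisper source: uniform white noise (seeded so runs repeat)."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def synthesize(source, formants):
    """Run the source through one resonator per (frequency, bandwidth) pair."""
    out = source
    for freq, bandwidth in formants:
        out = resonator(out, freq, bandwidth)
    return out


# Rough, illustrative formant values for an "ah"-like vowel (hypothetical).
AH_FORMANTS = [(700, 130), (1220, 70), (2600, 160)]

voiced = synthesize(pulse_train(1600, 120), AH_FORMANTS)   # 0.1 s, buzzy
whisper = synthesize(white_noise(1600), AH_FORMANTS)       # 0.1 s, breathy
```

Only the source changes between `voiced` and `whisper`; the formant list stays the same. In this simplified picture, the filter stage is the part that would differ from one engine to another.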

2017-04-14 18:24:44

So, wait. You're okay with a company that for years had borderline unethical and illegal (according to the DoJ) domination of the PC market (and again those whispers are starting with W10 and haven't gone away), partnering up with a firm that's dominant in its field?

Look, I'm not saying it's a bad idea, I'm simply stating competition is a good thing. Competition. Choice. Call it what you will, it's why I want a rival NFL game to Madden, a rival to 2K's NBA games every year, a rival to...you name it, competition breeds innovation and technology....and simply having one platform is a bad, bad idea (see Windows).

I'd much, much rather, to use Linux as an example, have the ability to choose each component of my OS, in a Windows type environment, from the ground up. If MS ever gets pushed to that point though (and I doubt it'll happen) it probably would be a very, very bad thing.


2017-04-15 03:44:16 (edited by jack 2017-04-15 03:44:31)

I'm not necessarily OK with that. What I was saying is that it would give Microsoft a solid TTS framework to work with, but it would lead nowhere fast if two dominant players in the field tried to strike a partnership. It would likely ultimately result in one of the companies, more likely Nuance, getting bought out. And Microsoft buying Nuance would be really, really bad, and that's putting it lightly. But at the same time, I don't think TTS creation is their calling, so what I was basically saying was they need help from a solidly run TTS outfit. A division comparable to Google's Deepmind, the people developing Google's Wavenet speech synthesis. A division within Microsoft free to experiment. Of course, I highly doubt MS would actually do this, because Google's been all about experimentation from the word go, unlike MS.

2017-04-15 03:52:25

I was about to suggest pairing up with Neospeech, but Wavenet would be good too.


2017-04-15 05:03:10

Problem with Wavenet is Google has access to ridiculous amounts of compute power, probably more than all of the world's supercomputers combined, allocated to Wavenet. Unless they manage to get those system resources down, it would take days to generate utterances. It wouldn't be a good idea to partner with some TTS that big.

2017-04-15 09:52:46

My big problem with MS partnering up with...well....anyone is locking down a system. I mean, their track record isn't exactly great, and with them pushing the Win 10 app store (which is awful regardless of whether you can see what it is or not), they are pushing towards a walled garden as far as their ecosystem goes. That does not fill me with confidence about them partnering up with any TTS tech, since they hate competition and want to, on Win 10 at least, force everyone to the app store by default.


2017-04-15 14:36:09 (edited by jack 2017-04-15 14:36:38)

And that is just one more reason I'm anti-Windows 10. Installation of sanitized apps through a closed-off app store. Goodbye DRM-free programs, then. Once more, people are gonna be completely unprepared if they hit malware. It's pretty much akin to the schools locking down the computers so tight that it's physically impossible to install malware, rather than teaching people safe browsing and letting them learn from mistakes. Google has the right idea with how they run their Play Store. They are a lot more liberal when it comes to exactly how the store is run, aren't as tight on devs as Apple and probably Microsoft are, and have always allowed installation from outside the store. Granted Microsoft allows that too, but for how long? How long until Windows devs are either forced to, or decide to, go the app store route, thereby locking other users out? Then again, I wouldn't put it past them, they're already ramming Windows 10 down our virtual throats if you will. How long until people realize, oh, maybe they want this OS to work for them, not for us? How long till people start switching to Linux or OS X in droves? In my opinion, they have no business making the operating system that runs a computer a service. It's not. It should only be a collection of low-level programs designed to launch and communicate with other programs. That's it. And if you want a mobile ecosystem, go mobile, but don't shove it on the desktop.

2017-04-15 15:26:27

Again, more reasons to be against MS partnering up with Nuance or any other TTS firm, given their anti-consumer and anti-choice history. How many people would be fine with....say...Espeak suddenly becoming closed source and Espeak coders either being hired up by MS or given cease-and-desist notices for infringing on MS's copyrights?

Also, given their history of not liking competition I would hazard a guess they'd do all they could to block NVDA, JAWS, etc (though JAWS is slowly going away due to drivers and such as is) and Supernova and so forth, due to them having their own, proprietary, synth tech.


2017-04-15 15:33:57

Exactly. Espeak should stay open source. The last thing we need is another synth becoming closed source and proprietary.

2017-04-15 22:13:09

All the bad stuff like only being allowed to install Windows Store apps, it just won't happen, and I'll tell you why. Because then, everyone would stop using Windows or they would hack it or something. Bottom line, Microsoft would lose tons of money, and they'd be forced to stop doing it or they'd lose their business. You know what they say, the customer is always right.
