Just to make it clear: both console and clipboard output are already implemented (they're trivial to program actually, each took me a few minutes at most). I'm just trying to make sure that I'm not providing something that turns out to be useless in the end (which would create false hopes).
Also, I managed to turn on the screen reader on Linux, and it decided to start reading everything except what I was pointing at, including an unfocused window at some point. *sigh* This is going to take a while.
Sebby wrote:We designed both the TTS processor and the engine output so that we could do portable TTS ( actually, now I think of it, it was pretty awesome, though I say so myself ). The idea is that each platform has some idiom, EG Windows had ag_say which used SAPI (you're right, if you can go direct, is better), OS X had the say command, and Linux had--what was it? oh yeah, we handled that directly and used serial to an Apollo TTS device.
Is ag_say something from AudioQuake? Because that's what I seem to have found. I guess that won't work for a default setup, but then again I don't know how common it is for a blind user who plays games to have that program installed (maybe it is, and I can just rely on it). Also, I couldn't find info on how to use it (e.g. is it "game | ag_say" or "ag_say game" or "ag_say text-to-say"?).
And yeah, Linux is a horrible mess. There isn't an equivalent to SAPI, but rather several different engines (at the very least two major ones, seemingly one for GNOME and one for KDE), and any of them could be installed on a given system. That's annoying, and I'm not sure if there's some de facto standard API to communicate with them. (Mind you, knowing the Unix philosophy, it's likely you can just pipe the output to the engines directly, although I'd still need to know their filenames.)
It seems that on Linux there's Festival, but I'm not sure how it works. I should check.
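For what it's worth, the "just pipe it" idea could look roughly like the sketch below. It assumes a POSIX system and that the user has configured some command that reads text from standard input (Festival's `festival --tts` does this, as far as I know). The function name `speak_via_pipe` is made up for illustration, and nothing here checks that the engine is actually installed.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* needed for popen/pclose */
#endif
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: pipes one line of text into the standard input
 * of an external command, e.g. "festival --tts" (assumed to be on PATH).
 * Returns 0 if the text was written and the command exited with
 * status 0, -1 otherwise. */
int speak_via_pipe(const char *command, const char *text)
{
    assert(command != NULL && text != NULL);

    FILE *engine = popen(command, "w");
    if (engine == NULL)
        return -1;

    int written = (fprintf(engine, "%s\n", text) >= 0);

    /* pclose flushes the pipe, waits for the command to exit and
     * returns its status, so this call blocks until the command ends */
    int status = pclose(engine);
    return (written && status == 0) ? 0 : -1;
}
```

Calling `speak_via_pipe("festival --tts", "Level 1")` would then hand the text to Festival. Since `pclose` blocks until the command exits, a real game would probably keep one pipe open for the whole session or do this from a separate thread, and would want to ignore SIGPIPE in case the engine dies.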
Sebby wrote:To encode priority, you have to contrive some scheme that the backend knows, EG you have a prefix before a message string result in a TTS reset prior to speaking the new string. Etc.
Oh, so engine specific.
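To make sure I understand the prefix idea, here is a minimal sketch of such a scheme. Everything in it is an assumption made for illustration: the `!` prefix, the function names, and the idea that the backend resets speech when it sees the flag. It is not how AudioQuake or any real backend encodes priority.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical convention: a leading '!' tells the backend to stop
 * whatever it is saying before it speaks the new message. */
#define URGENT_PREFIX '!'

/* Game side: writes the message into `out`, prepending the prefix
 * when `urgent` is nonzero. Returns what snprintf returns, i.e. the
 * length the full message has (or would have had if `out` is too small). */
int encode_message(char *out, size_t size, const char *text, int urgent)
{
    assert(out != NULL && text != NULL);
    return urgent ? snprintf(out, size, "%c%s", URGENT_PREFIX, text)
                  : snprintf(out, size, "%s", text);
}

/* Backend side: stores 1 in *urgent if the line carries the prefix
 * (0 otherwise) and returns a pointer to the text to be spoken. */
const char *decode_message(const char *line, int *urgent)
{
    assert(line != NULL && urgent != NULL);
    *urgent = (line[0] == URGENT_PREFIX);
    return *urgent ? line + 1 : line;
}
```

The obvious weakness is that an ordinary message starting with `!` would be misread as urgent, so a real scheme would need escaping or a prefix that can't show up in normal text.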
Sebby wrote:Now our game launcher is handling all the TTS on both OS X and Windows in Python using PyTTS, so there must be a market for a universal TTS API and hopefully there will be one in your language.
I'm using C (not C++) and the game is for Windows and Linux, so if there's something cross-platform that works on those, it'd be nice (oh, and there's the issue of the license being compatible with the GPL 3 as well). Tried doing a quick search but I didn't seem to be able to find anything useful (the only one I found was using an incompatible license).
Sebby wrote:The other measure you suggested, that of using the clipboard, is also known to work with a tool people have been using, though if you're going that far you might as well just build in SAPI support directly.
Clipboard support is so easy with SDL 2 that it's a no-brainer though, compared to implementing SAPI support (not to mention still having to figure out what to do with Linux, which seems to have two major engines at least).
camlorn wrote:You can easily call the screen reader APIs directly, and to be honest that's what I'd do. Clipboard requires extra setup, as does the console thing.
I assume you mean extra setup for the user? Because programming those two is actually much easier: console output is just a call to printf, and clipboard output is just a call to SDL_SetClipboardText (I'm using SDL 2). Programming the screen reader APIs directly is a much more involved effort (not to mention platform-specific, so I'd need multiple codebases), and I don't have any suitable TTS engine available right now.
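To show how little code is involved, here is a rough sketch of both outputs. The names `say_console` and `say_clipboard` and the `USE_SDL_CLIPBOARD` build flag are invented for this example. The SDL 2 call itself is `SDL_SetClipboardText`, which returns 0 on success and, as far as I know, needs the video subsystem to be initialized first.

```c
#include <assert.h>
#include <stdio.h>

#ifdef USE_SDL_CLIPBOARD   /* hypothetical build flag for this sketch */
#include "SDL.h"
#endif

/* Console output: one message per line, flushed right away so that a
 * program reading our output through a pipe gets it without delay.
 * Returns 0 on success, -1 on failure. */
int say_console(FILE *out, const char *text)
{
    assert(out != NULL && text != NULL);
    if (fprintf(out, "%s\n", text) < 0)
        return -1;
    return (fflush(out) == 0) ? 0 : -1;
}

#ifdef USE_SDL_CLIPBOARD
/* Clipboard output: replaces the clipboard contents with the message.
 * Returns 0 on success, -1 on failure. */
int say_clipboard(const char *text)
{
    assert(text != NULL);
    return (SDL_SetClipboardText(text) == 0) ? 0 : -1;
}
#endif
```

In the game this would be something like `say_console(stdout, "Level 1")`. The explicit flush matters because stdout is usually fully buffered when it is redirected to a pipe, so without it the reader could get the text much later.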
camlorn wrote:SAPI can be used and will work, but most of us find the SAPI voices to be sad and way, way, way too slow. The average screen reader user is going 3 times as fast as what Microsoft seems to think the maxes ought to be.
Actually, looking at the SAPI documentation, the impression I got was that the program sets a speed relative to the user's settings, which would solve that problem for every program. The problem is that, from what I've read, Microsoft's own engine doesn't provide any settings.
camlorn wrote:See http://hg.q-continuum.net/accessible_output2/ which provides Python source to talk to all of the ones on Windows, and I believe also Mac. Not sure if it handles Linux. It's fairly simple, though I can't rattle off the API function names and whatnot--I just use accessible_output2 when I need them. Porting it to your language of choice should be pretty simple, though.
Well, it doesn't seem that simple to me, but then again I don't have enough experience with SAPI (I just looked up some functions to get an idea of how it works), so that probably isn't helping. And that's assuming I ignore the rest of the APIs it supports. I'd need to see. (EDIT: meh, forget that, the SAPI 5 one seems easy actually, but I still need to learn how to use the API and what it wants)
Ideally I'd want to make a proper TTS engine that's more portable (ideally just rendering sound) and more suitable for games (since sounding natural is more important for games than for normal applications), but that will take a while, so that's why I want to have output to screen readers in the meantime. When I have that one ready I will definitely implement it in my game though (as well as making it available to everybody, of course).