2019-02-28 18:02:34 (edited by ianhamilton_ 2019-02-28 23:57:55)

We are obviously now starting to see AAA mainstream games implementing voiced interfaces, and that's something that's only going to increase over the coming years.

Even just within the couple of examples so far there are very different approaches to how it is implemented, so I thought it might be helpful to have a chat about what direction you would like developers to take with it.

Some examples -

1. Screenreader compatibility, e.g. Skullgirls

2. Using a platform-level text-to-speech API, e.g. Crackdown 3

3. Building a custom synthesised speech solution inside the game itself that works the same across all platforms (PC/Xbox/PlayStation/Switch etc), e.g. Division 2

4. Building a custom synthesised speech solution inside the game itself that also has in-game control over voice and speech speed, e.g. Eagle Island (a rough sketch of this kind of setup follows the list)

5. Recording on-brand human speech for all menu items, e.g. Freeq
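For anyone coming to this cold, here's a very rough sketch of what options 3 and 4 can look like in code. It's Python, with the pyttsx3 library standing in for whatever synthesis engine a game would actually bundle, and the class and function names are made up for illustration rather than taken from any of the games above:

```python
# A minimal sketch of approaches 3/4: speech generated inside the game itself,
# with player-adjustable rate and voice. pyttsx3 is used purely as a stand-in
# engine; a shipping game would more likely bundle its own synthesiser so that
# behaviour is identical on every platform.
import pyttsx3

class GameSpeech:
    def __init__(self, rate_wpm=200, voice_index=0):
        self.engine = pyttsx3.init()
        self.set_rate(rate_wpm)
        self.set_voice(voice_index)

    def set_rate(self, rate_wpm):
        # Wired up to a slider in the game's options menu (approach 4).
        self.engine.setProperty("rate", rate_wpm)

    def set_voice(self, voice_index):
        # Let the player pick from whatever voices the engine offers.
        voices = self.engine.getProperty("voices")
        if voices:
            self.engine.setProperty("voice", voices[voice_index % len(voices)].id)

    def speak(self, text, interrupt=True):
        # Interrupt the previous announcement so menu browsing stays responsive.
        # (A real game would speak asynchronously rather than block the frame.)
        if interrupt:
            self.engine.stop()
        self.engine.say(text)
        self.engine.runAndWait()

if __name__ == "__main__":
    speech = GameSpeech(rate_wpm=300)
    speech.speak("New game, button, 1 of 4")
```

The only real difference between option 3 and option 4 is that set_rate and set_voice are exposed to the player in the options menu rather than fixed by the developer.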

Ignoring all practicalities, what would your dream setup be? Which approach would you like developers to take, and why? Strictly from your own perspective, how it affects your experience of a game, not how it might affect developers.

Or would you even prefer more than one, for example one approach for menu navigation and another for in-game interfaces?

[EDIT] Also considering many developers are coming to this from scratch with no prior knowledge of TTS conventions, any tips on what kind of contextual information to communicate? Label/role/state/announcements etc?

2019-02-28 18:34:25

Personally, if you're going to have a multi-platform release, it might be best to have an in-game solution that allows for speed/voice controls.  This way the user has elements of customisation that are only really rivalled by screen readers at this point.

Of course, if we're talking exclusives, Microsoft's Speech Synthesis API (MSSA) is a great idea, as there's less work needed to get things up and running as far as I understand.  PS4 presents its own set of issues though, given there's no equivalent API yet, but hopefully Sony can rectify that as time goes on.  It would be great to see the next God Of War etc. have menu accessibility at least, if not accessible gameplay.

I do still think the first and most important barrier to platform accessibility is the lack of accessibility features in certain regions, or in some cases at all (see also: PS4 not having text to speech outside the US, or Switch not having an accessible interface of any kind at all).  Once those issues are resolved, I think it might be easier to justify purchasing said platforms, thereby increasing the market for accessible mainstream content.

As an aside, I feel it's important that developers understand just how much ease of use means when you're setting up a new game, even if that game doesn't have the communications aspects needed for CVAA compliance (i.e. a single player experience).  Such ease of use, in the case of Crackdown 3, allowed me to check which save slots are used, which agent I have selected, how the controls work etc., all without needing sighted assistance, which some gamers don't have access to in the first place.

I'm not sure why developers would want to go with a combination of methods, though I suppose for areas like character select in a fighting game, immersive voiceovers could work well, whereas all the other menus could be voiced by TTS.  That then raises the question of "why don't all the menus use the TTS functionality", as well as the issue of localisation etc. for voice acting in menus.  If, for instance, you have a Spanish speaker accessing your game who doesn't speak a word of English but still wants to play, that's a roadblock.

Using custom solutions as in The Division 2 is a good idea if the voice can be adjusted, but the main issue I found during my tests was that the voice, even at the maximum setting, wasn't high enough in volume to begin with.  It was a shame, but at least, as a silver lining of sorts, I could use the options without the need for assistance once I had it set up.


This is a really good question though, and I look forward to seeing what developers work with in the coming year and beyond, not just for menus but for gameplay as well.

Regards,
Sightless Kombat.
***If you wish to refer to me in @replies, use Sightless***

2019-02-28 18:59:02

@2 I assume for you then that there aren't any scenarios in which you would prefer recorded voiceover to synthesised speech?

2019-02-28 19:12:05

As long as it works, I don't care. Also, thinking further about certain non-menu elements of the interface, let's say we're playing a survival game (an interesting genre to me) and I need to know when my hunger, thirst and so forth are low. I wouldn't necessarily want that information relayed through speech, as it might be slow and loud, which might cause me to be bitten by an alligator, a wolf, or something else I didn't hear because the speech was rattling off at such a slow rate. Also, most of the time speech ducks the audio; I'm not sure why, we're quite capable of listening to sounds and speech at the same time XD. I can see making that optional, but not compulsory.

I know that stuff's in the future, but if we're talking interfaces, those are elements of an interface. Also, I've seen games where the character will wake up from an injury, the screen is tinged red at the edges and there's a sound that sort of pulses. Blind people are going to have no clue that this is indicating the character possibly has a headache, is hung over, or is otherwise temporarily impaired; in other words, the character isn't going to do things well, not fight well, etc. So perhaps that could be conveyed through speech.

In games similar to GTA V, which posts notifications at the top left, or sometimes above the GPS at the bottom left, those could be spoken using TTS as long as they finish before dialogue. I don't want to have dialogue cut off or ducked under, as I could theoretically review the notification later, since most of the time in GTA games they show up on the briefing tab of the pause menu.

Another element would be the compass of some games. Skyrim, for example, has a compass at the top centre with the direction, points of interest, and probably most importantly your quest marker, which points at your current objective. That marker could be sonified such that it emits a periodic ping, centred in the stereo field when you're walking towards the objective. It should probably also be set up so that you can trigger a ping manually, in case you don't want it constantly pinging. You could also pulse the rumble feature of the controller to indicate you're getting closer: the pulses would start out slow and get faster and faster. I find this would be better than the same effect in audio, since you might miss out on something if it's pinging rapidly.
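To make the rumble idea above concrete, here's a tiny sketch of how the pulse rate could be tied to distance from the objective. The set_rumble call and the distance thresholds are hypothetical stand-ins, not any real platform's haptics API:

```python
# Map distance to the quest objective onto a rumble pulse interval,
# so pulses get faster as the player closes in.
import time

MAX_DISTANCE = 200.0   # metres at which pulsing starts (assumed value)
SLOW_INTERVAL = 2.0    # seconds between pulses when far away
FAST_INTERVAL = 0.2    # seconds between pulses when almost there

def pulse_interval(distance):
    """Linearly interpolate the gap between pulses from the distance."""
    t = max(0.0, min(1.0, distance / MAX_DISTANCE))
    return FAST_INTERVAL + t * (SLOW_INTERVAL - FAST_INTERVAL)

def set_rumble(strength):
    # Placeholder for the platform's real haptics call.
    print(f"rumble {'on' if strength else 'off'}")

def guidance_pulse(distance, pulse_length=0.05):
    """Fire one short rumble pulse, then wait a distance-based interval."""
    set_rumble(1.0)
    time.sleep(pulse_length)
    set_rumble(0.0)
    time.sleep(pulse_interval(distance))

if __name__ == "__main__":
    # Simulate walking from 150 m down to 10 m from the objective.
    for distance in (150, 100, 50, 10):
        guidance_pulse(distance)
```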

Facts with Tom MacDonald, Adam Calhoun, and Dax
End racism
End division
Become united

2019-02-28 19:19:11

I voted for screen reader support, but if I could have voted for both screen reader and text-to-speech API support I would have; I don't really have a preference one way or the other, as long as the game talks to me.

The other solutions are OK as well, but more expensive to do - recorded voiceover in particular costs a lot to record, adds disk space and isn't flexible, for pronouncing things like nicknames for example. This gets even more expensive and less practical if it were to be done globally.

Having options in the game to adjust the voice settings is fine as well. From what I've heard in recordings, Eagle Island just uses the native Windows SAPI but gives you settings to adjust it.

<Insert passage from "The Book Of Chrome" here>

2019-02-28 19:41:50 (edited by ianhamilton_ 2019-02-28 19:45:59)

Here's an excerpt from a post about the development of Freeq. Even without player feedback like this, it's easy to see how developers would think that recorded voiceover in keeping with the feel of the game would be preferable to generic-sounding synthesised speech. I assume those of you who have posted so far wouldn't agree with the perspective of the Freeq players she is citing?

Excerpt begins -

"Our other major win, from our players' perspective, came from our inability to use VoiceOver.  You see, VoiceOver uses the Siri voice.  For everything.  So it doesn't matter what you're listening to, whether it's a travel app or a medieval fantasy story or the settings menu for the phone itself, it's all read to you by Siri.  And as you can imagine, that can get rather old, rather quickly.  But because we couldn't use VoiceOver, we had to record our own "Siri" voice.  Our lead designer just so happened to also be both a voice actor in the game and our primary sound engineer, so he locked himself in his office for a couple of afternoons and recorded every piece of voiceover we'd need to provide instructions and interface cues.  And since he was the one who had processed all of FREEQ's audio to begin with, adding filters and additional effects here and there to make it sound unique, it made sense to do the same for our replacement VoiceOver reader.

This was one of the most popular parts of the game, among our blind and low-vision players.  The fact that the game provided the functionality they needed, but did it in a way that didn't break their immersion, was something they really liked. Everything about the game existed "in-world", and we have heard again and again that they very much appreciated this extra little bit of polish."

(Full article at https://www.gamasutra.com/blogs/DianaHu … ayers.php)

2019-02-28 19:59:10 (edited by Chris 2019-02-28 20:00:15)

I agree with screen reader support. The problem is that the only platform this works well on is Windows. Does Microsoft provide a speech API that sends information to Narrator on the Xbox, or is it simply sent to the system API like SAPI 5?

Recorded speech is fine, but as was previously pointed out, it costs more, takes more disk space and reduces the amount of information that can potentially be conveyed. I'm not holding my breath on Sony or Nintendo. They're both Japanese and don't seem to give two shits about accessibility. If there were more playable games, I'd either get them on Windows or buy an Xbox.

Grab my Adventure at C: stages Right here.

2019-02-28 20:06:55

@7 it's a system API, as there aren't enough system resources available to run Narrator while a game is open; consoles don't really do multi-tasking! But who knows what the situation will be with the next gen of consoles, which are apparently being announced at E3. There's also the in-between approach: a game running its own in-game platform-agnostic speech synthesis, as in The Division 2.

2019-02-28 20:18:24

Also, considering many developers are coming to this from scratch with no prior knowledge of TTS conventions, any tips on what kind of contextual information to communicate? Label/role/state/announcements etc?

2019-02-28 21:37:29 (edited by Chris 2019-02-28 21:38:31)

Consoles can't multitask? That's interesting, I thought most of these had quite impressive hardware specs. Then again, most of that is probably used to drive graphically intensive games.

I'm not an expert on this, but my thought is to have menu items read aloud when you navigate with the controller. If a menu has multiple options or selections, speak what's currently selected. In games, voice the critical elements like health indicators, chat messages, etc. Most things the player must interact with should also be indicated using sound, for example by making movement sounds pan in the stereo field or by using binaural technologies.
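As a rough illustration of the label/role/state question from earlier in the thread, here's a minimal sketch of how an announcement for a focused menu item could be composed before being handed to whatever speech back end the game uses. All the names and fields here are invented for the example:

```python
# Compose a "label, role, state, position" announcement for the focused
# menu item. speak() just prints; in a real game it would call the TTS or
# screen reader layer instead.
from dataclasses import dataclass

@dataclass
class MenuItem:
    label: str         # e.g. "Music volume"
    role: str          # e.g. "slider", "button", "toggle"
    value: str = ""    # current state/selection, e.g. "80 percent"
    index: int = 0     # position within the menu
    count: int = 0     # total items in the menu

def announcement(item: MenuItem) -> str:
    parts = [item.label, item.role]
    if item.value:
        parts.append(item.value)
    if item.count:
        parts.append(f"{item.index} of {item.count}")
    return ", ".join(parts)

def speak(text: str) -> None:
    print(text)  # stand-in for the game's speech output

if __name__ == "__main__":
    speak(announcement(MenuItem("Music volume", "slider", "80 percent", 3, 6)))
    # -> "Music volume, slider, 80 percent, 3 of 6"
```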

I really don't know why this is so hard. Lots of games have great sounds, they're just not being used to effectively convey the same information a sighted gamer receives. I assume it's mostly a lack of understanding?

Grab my Adventure at C: stages Right here.

2019-02-28 22:27:14

@6 yes, I would disagree, but you have to appreciate the effort. Still, that would be like us trying to tell sighted people how to decorate their apartments, or to wear a certain colour shirt because it really makes their eyes stand out, etc. Those kinds of things are just not within our understanding, so we shouldn't do that. Sighted people assume that because they find the speech unpleasant, it's unpleasant for us, but it isn't. I sort of get the point about voiceover work being better because it doesn't break immersion, and I guess that's rather subjective, because I wouldn't really find that synthetic speech does that for me; it's just a way for me to obtain information someone else would just see immediately. But maybe they're on the other side of right in that regard, since I watch LPs on YouTube and people are always saying this and that and the other thing is immersion breaking. If the information was conveyed using pre-recorded voiceover, and the same voice was played by one of the characters in game, now that probably would be a bit weird for me.

Facts with Tom MacDonald, Adam Calhoun, and Dax
End racism
End division
Become united

2019-02-28 23:20:53

@10 I would say "can't really" rather than "can't" smile

But yeah, it's by design: despite the impressive specs, while a game is running the OS hands over nearly all of the system resources to the game.

2019-02-28 23:34:20 (edited by ianhamilton_ 2019-02-28 23:50:47)

I do find it interesting that everyone so far has prioritised ease of navigation rather than voice recordings that feel part of the game world.

Not because I don't get it, I totally understand the reasons for preferring synthesised.

But because of those FREEQ players who preferred the immersion of the recorded voiceovers, in that there hasn't been anyone commenting or voting so far who has shared their views.

2019-02-28 23:52:09 (edited by ross 2019-02-28 23:53:55)

I vote APIs, mainly because I feel they're easiest to use. I also feel that Microsoft should get credit where credit is due. Plus, do you think a developer is going to want to hire an entire voice actor? No. It's best to make it as simple as possible for them so it isn't intimidating.

2019-02-28 23:56:19

@14 but what would your dream scenario be, outside of any practicalities? Not from a developer's perspective, from your perspective as a gamer and how it affects your experience?

2019-03-01 00:01:28

For me, probably still the same. Microsoft has been innovating left and right, so I feel they should reap the benefits of it. As for what would be best for everyone, I feel that a custom speech solution within the game would be best; that way people wouldn't have to worry about which console to use and whatnot.

2019-03-01 00:06:13

@16 by custom speech you mean synthesised text-to-speech like in The Division 2, rather than recorded voice actors like in FREEQ?

2019-03-01 00:33:16

I've never heard of this FREEQ game, but maybe the voice over thing kind of fits, I'm not sure. I know it wouldn't really fit in the types of games I find interesting, at least not for me.

Facts with Tom MacDonald, Adam Calhoun, and Dax
End racism
End division
Become united

2019-03-01 02:38:18 (edited by ianhamilton_ 2019-03-01 02:39:05)

@18 it's an old game, it isn't up on the App Store any more unfortunately.

I guess another example would be how Killer Instinct speaks the name of the character you've just chosen in the announcer's voice:

https://youtu.be/WJCgB6ZGSk0

So imagine if that speech that currently occurs on selection were to happen when you highlight a character instead.

Would synthesised speech with control over speed (regardless of whether it's generated by the game, or via screen readers or a TTS API) be preferable to you over the kind of recorded speech in that Killer Instinct video?

Or if you do prefer the recorded announcer voice for that character select screen, are there any other areas of the game that you would prefer it for too?

2019-03-01 02:43:03

As it stands, I'd love a way for multiplatform games to have accessibility features no matter the platform. Depending on what works and what doesn't, a combination could be practical, perhaps a voice reading the menus but TTS reading the dynamic content, such as nicknames and so on. Although for a game like Magic: The Gathering, which does not use voice acting, everything should be read by TTS. You should be able to get info such as what a card does, and a way to view your and your opponent's critical information such as health. However, I don't exactly know how this could work with a controller if there's a game with a lot of stats to view, like Magic: The Gathering.
For a game like Rock Band, use TTS for the menus and for the end-of-song screens such as stats and scores.
If a fighting game has customizable features, like the upcoming MK11, then all that really needs to be done is to make sure the gear has a readable name with the TTS, but it'd be kind of cool if the announcer voice or some other voice could read the main menu, although at the same time, it depends. Having a TTS voice all the way through offers consistency.

2019-03-01 03:03:41

I would want text to speech. As long as I get info on what and where I am on the screen, that's all that matters. If it's a game like Fable where you have quest markers, have the controller vibrate as you get closer to the quest location. If you're at the guild and doing the archery stuff, then I guess, depending on how far away the targets are, have them play a beep or a tone when you're centered on them.

PSN ID: AvidLitRPGer
Twitter: https://twitter.com/AvidLitRPGer
Facebook: https://www.facebook.com/AvidLitRPGer
leave me a message saying how you found me.

2019-03-01 04:53:51

I'd much rather have the announcer read out character selection stuff like that, but I wouldn't say if it's not a thing in the game, make it one, if that makes any sense. Like, don't diminish the experience we already have: since the announcer isn't reading health or special meter or whatever, adding that I think would sort of not be cool. I'd like to see TTS in those situations, or some other option, maybe a pattern. Let's say I fill a special meter in a fighting game, you could let me know by three quick rumbles. On the Xbox, you could use the trigger rumble, though I've never experienced that since I don't own a console.

I'd say if there's a lot of audio information going on, try to come up with another way, like haptics. If that's not an option, like if haptics and rumble are already being used, then audio is fine, but keep it simple and unobtrusive.

Facts with Tom MacDonald, Adam Calhoun, and Dax
End racism
End division
Become united

2019-03-01 05:37:39 (edited by ianhamilton_ 2019-03-01 05:45:15)

@22

"but I wouldn't say if it's not a thing in the game, make it one, if that makes any sense"

Just to be sure, do you mean...

"If something is already voice recorded for sighted players, don't replace it with synthesised speech.

But if you're providing extra info specifically for blind players, don't use recorded voice for it, use synthesised speech.

Or if needed and appropriate for some things, use some other form of audio or haptic cue"

I'm interested to know more about "diminishing the experience". So in that example of a cue to let you know meters are charged, you would see adding recorded voice for that as diminishing the experience, but would not see text to speech as diminishing the experience. Why is that?

2019-03-01 07:27:55

Yes, precisely. And because the recorded voice is the recorded voice, whereas the synthesized voice can be sped up, I would need information as fast as it can be given, or as fast as I can take it, if that makes sense, so that I can make rapid decisions on how to proceed rather than waiting for normal voice-recorded speech. OK, I might be able to predict what they're saying before they finish, but then it's going to be louder over top of the game sounds, or they might duck the game sounds under the voices, which is really bad IMO.

I'm generally not a fan of ducking, though I see the use of it; for example, certain things can be so loud that you literally can't hear your speech, so in such an instance ducking is OK. That's why in NVDA on Windows there's a keyboard shortcut to turn it on or off; that way you can quickly duck things if needed, but then return to normal. The same with the ducking on my iPhone: I put it in the VoiceOver rotor for when I literally can't hear my speech, turn it on to resolve that issue, then turn it right back off again, because in general I can do sound and speech at the same time without issue.

I feel like some people leave the auto-ducking on permanently, which is cool, but I would think most gamers would actually not want sound ducked under speech. I may not have an Xbox, but I have heard recordings of gameplay done on one, and when people have Narrator on, notifications are spoken in game and the game sounds are ducked down to accommodate the speech. That is not necessary though, and could just screw your chance of survival or whatever you're doing. Something might happen in those few seconds that you didn't hear because the speech forced the sound to duck under it.

So I guess the TL;DR is: if the stuff is already in the game, use it; if not, use synthetic speech, as we can crank it up, and please give us the option to not duck sound under speech.
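As a rough sketch of that "make ducking optional" point, the mixing logic can be as simple as only pulling the game bus down when the player has ducking switched on. The class, bus and gain values here are invented for illustration:

```python
# Only duck the game audio bus under speech if the player has opted in.
class AudioMix:
    def __init__(self, ducking_enabled=False, duck_gain=0.4):
        self.ducking_enabled = ducking_enabled  # exposed as an options-menu toggle
        self.duck_gain = duck_gain              # how far to pull the game bus down
        self.game_bus_gain = 1.0

    def on_speech_start(self):
        if self.ducking_enabled:
            self.game_bus_gain = self.duck_gain
        # else: leave game audio untouched and just mix speech on top

    def on_speech_end(self):
        self.game_bus_gain = 1.0

if __name__ == "__main__":
    mix = AudioMix(ducking_enabled=False)
    mix.on_speech_start()
    print(mix.game_bus_gain)  # 1.0 - game audio stays at full volume
```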

Facts with Tom MacDonald, Adam Calhoun, and Dax
End racism
End division
Become united

2019-03-01 09:21:17

Hi.
I vote on "Internal synthesised speech with speed/voice options."
1. It will work on all consoles and computers as well.
2. The speed can be adjusted to what people prefer.
3. Voice can be changed which can be important for those who are having a hearing impairment.
With this feature, stats and other useful things could be read out loud, which it normally wouldn't by voice actors.
I wouldn't like voice actors to read out menu items for the following reason:
Sighted people reads text at different speed. Blind people do that as well, just by speech or Braille.

Best regards SLJ.
Feel free to contact me privately if you have something in mind. If you do so, then please send me a mail instead of using the private message on the forum, since I don't check those very often.
Facebook: https://facebook.com/sorenjensen1988
Twitter: https://twitter.com/soerenjensen