Honestly, I think a big reason voices in general don't react to exclamation marks is that manufacturers don't really know what to do with them. Eloquence and eSpeak give a pretty good subtle hint of an exclamation, but True Voice goes way over the top with it, and imho the old Microsoft SAPI 5 voices went a little weird with it too. I've always found the exclamation mark to be one of the most ambiguous symbols ever, since at the end of the day it can mean an extreme of anything: extreme sadness, happiness, shrieking in fear, or shouting in anger. I agree with the OP that Eloquence got it right; it sort of does something without being dramatic about it, while some of the synths above take a more interpretive approach. Maybe some companies just don't want to deal with the conundrum. Or perhaps they don't want to program exclamation inflection at all because they figure nobody will care. In any case, it's annoying, but I'm used to it to some extent.
Also, natural voices open up a new can of worms. The models used to concatenate the recorded segments are only really optimized for plain statements. Some natural voices try to do questions, but it often sounds fake to me. Sometimes the inflection goes up by a small amount, and other times it rises considerably more. And with some voices, particularly the older or less developed ones I messed with back in the days when I was truly obsessed with this stuff, the pitch only rises in certain cases, or doesn't rise at all. But even if the pitch does rise, it can be kinda hard to tell if the question is supposed to sound real or contrived, given how inconsistently it breaks the flow of the normal sentences the voice is speaking. But maybe that's just me.
I highly doubt natural voices could do exclamations naturally unless they either A, use a cheap trick like only picking from stressed/high-pitched phonemes in the recorded database, or B, slow the speech rate slightly near the exclamation mark (which creates artifacts in natural voices). I don't know if Mac Alex does anything with exclamation marks, but from what I've heard of him, he had some of those digital artifacts I have come to dislike in natural voices, so maybe that's why other companies don't want to over-process their recorded phrases. At the end of the day, I think it's technically too challenging, or just not worth it, for natural voice companies especially.
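For what it's worth, the subtle version of option B (a slight slowdown plus a small pitch lift) is easy to sketch from the text side. This is only a rough illustration, not how any of the synths mentioned actually work: it wraps sentences ending in "!" in an SSML prosody element. The function names and the percentages are made up for the example, the sentence splitter is naive, and whether a given voice honors the prosody attributes at all is up to the synth.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical, arbitrary values: a mild pitch lift and a slight slowdown.
EXCLAIM_PITCH = "+8%"
EXCLAIM_RATE = "95%"


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def to_ssml(text):
    """Wrap sentences ending in '!' in a subtle <prosody> element.

    The <speak> root is simplified here; a real SSML document would also
    carry version, namespace, and language attributes.
    """
    parts = []
    for sentence in split_sentences(text):
        body = escape(sentence)  # escape &, <, > so the markup stays valid
        if sentence.endswith("!"):
            body = (
                f'<prosody pitch="{EXCLAIM_PITCH}" rate="{EXCLAIM_RATE}">'
                f"{body}</prosody>"
            )
        parts.append(f"<s>{body}</s>")
    return "<speak>" + "".join(parts) + "</speak>"


if __name__ == "__main__":
    print(to_ssml("Hello there. Stop it!"))
```

The point of keeping the numbers small is the same one made about Eloquence: the voice does something at the exclamation mark without suddenly shouting.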
I'm kinda thinking that if these companies thought it was simple enough to introduce exclamation inflection into existing voices, they wouldn't be resorting to recording specific phrases like "Help!" or "Run!" or "Stop it!" That strategy bothers me, because it breaks the consistency of the voice when reading, and is about as jarring to me as True Voice's shouting. Imho if a voice sounds bored, it shouldn't suddenly become animated because an exclamation mark triggered a prerecorded phrase. But I think a lot of other people don't mind that as much as I do.
With AI and WaveNet, this could change. But sadly, it might be a while before it does, since interest in how TTS handles exclamation marks doesn't really seem to be going up.
Make more of less, that way you won't make less of more!