[macvoiceover] Re: iPhone voices

  • From: Bryan Smart <bryansmart@xxxxxxxxxxxxxx>
  • To: "macvoiceover@xxxxxxxxxxxxx" <macvoiceover@xxxxxxxxxxxxx>
  • Date: Wed, 23 Jun 2010 14:06:10 -0400

Samantha was one of the first generation of voices created for RealSpeak. Just 
as Microsoft Sam is the default synth for SAPI versions 5 and earlier, she is 
the default because she had the fewest pronunciation mistakes. For SAPI, for 
example, Mike and Mary both sounded more human-like, and less depressed, than 
Sam, but, due to limitations with their recorded audio, they mispronounced more 
words. RealSpeak Tom, the only other RealSpeak and Vocalizer US English voice, 
also has this problem on occasion.

Another reason for her continued use is that, by fortunate coincidence, she 
still sounds alright when extremely compressed. To make Vocalizer voices, the 
RealSpeak audio recordings for the voices are highly compressed. Compression 
produces audio artifacts. If you listen to Tom on the Stream, for example, he 
sounds horrible when compared to the RealSpeak version for the desktop. For 
tech reasons that would take a while to explain, higher pitched voices compress 
better. So, this has further pushed out Tom, and secured Samantha as the voice 
that you hear when Vocalizer is used in mobile applications. I think that this 
is why Apple doesn't bother to include Tom on iOS.

The horrible thing about Samantha is her intonation model. Intonation is the 
way the pitch of a person's speech rises and falls while speaking, in order to 
convey emotion and provide the listener with information regarding the approach 
of clause and sentence breaks. Samantha has little to no work in this area. 
That's why she sounds like, as I recently heard someone say, "an angry 
robot". The RealSpeak voices since Samantha have had a lot more work put into 
intonation. Even with Tom's pronunciation mistakes, his intonation is very 
smooth when compared to Samantha. For an extreme example, listen to Daniel. His 
pattern models an announcer for radio voiceovers.

Intonation is hard to get right, because the computer doesn't know anything 
about the meaning of what it is reading. TTS programmers must create a model 
that is general purpose, without sounding lifeless.

Nuance really needs to make another US English voice. As one of the first 
products of RealSpeak, Samantha has served well for a long time, but she's out 
of date. I don't know of any plans to do anything about that situation, though. 
Vocalizer is available for almost every embedded architecture, and supports 
almost all major languages. They don't have any real competition at the moment. Where 
they did have competition, they've purchased the companies that made it, and 
killed the products. For example, they also own both Eloquence and DECtalk. 
Eloquence is being discontinued for anyone except those that hold long-term 
licensing agreements (like Freedom Scientific and GW Micro), though updates to 
the synth won't continue, and even those people will have to move to something 
else in a few years (most likely RealSpeak). DECtalk is being similarly 
canceled, due, in part, to the source code being lost as it was repeatedly 
bought and sold in the early 2000s. Acapela is about their only real 
competition now. If you saw a list of all of the companies that Nuance has 
bought over the last 15 years, you wouldn't believe it. Acquiring companies is 
what they do. I worry that, some day soon, I'll see a press release saying that 
they've purchased the Acapela Group, and soon after, will discontinue their 
synth. At that point, the only TTS system left that supports a comprehensive 
set of multiple architectures and languages will be eSpeak. Yuck!

Bryan

-----Original Message-----
From: macvoiceover-bounce@xxxxxxxxxxxxx 
[mailto:macvoiceover-bounce@xxxxxxxxxxxxx] On Behalf Of Cheryl Homiak
Sent: Wednesday, June 23, 2010 1:12 PM
To: macvoiceover@xxxxxxxxxxxxx
Subject: [macvoiceover] iPhone voices

I'm wondering why our U.S. English voice has to sound so awful (imho) when some 
of the voices sound not bad at all. I think Australian sounds nice but maybe 
that's just because of the accent; maybe Australians think this voice is as bad 
as Samantha. Spanish has a male voice which I also think doesn't sound bad at 
all. I always thought we had to put up with Samantha due to the size of the 
program or resources used but now I'm wondering. Anyway, it's nice to have a 
choice. My apologies to those who like Samantha; I know voice preferences are a 
deeply personal matter but I strongly dislike her on the Stream and only find 
her slightly more tolerable on the iPhone and iPad. Again, it is great to 
easily be able to change it now.


-- 

Cheryl
"Let the words of my mouth,
and the meditation of my heart,
be acceptable in thy sight,
O LORD, my strength, and my redeemer."
(Psalm 19:14  Bible KJV)





>
> Click on the link below to go to our homepage.
> http://www.icanworkthisthing.com
>
> Manage your subscription by using the web interface on the link below.
> //www.freelists.org/list/macvoiceover
>
> Users can subscribe to this list by sending email to  
> macvoiceover-request@xxxxxxxxxxxxx
> with 'subscribe' in the Subject field OR by logging into the Web 
> interface at //www.freelists.org/list/macvoiceover
>
