RE: siRE: .Net 3.0 Speech functionality

  • From: "Ken Perry" <whistler@xxxxxxxxxxxxx>
  • To: <programmingblind@xxxxxxxxxxxxx>
  • Date: Sun, 6 Apr 2008 08:18:33 -0700


I just sent this to your personal mail, but I will paste it here so others
can see what I found.  It's ugly, but here it is.


 
 
Man, like everything else XML, they have made this a bitch to use.  Here is
the document as listed in that last email.  I have pasted the part about
pitch here, but what a f-ing mess.  Go to this page and read about prosody;
it's what you want.  Also, the emph tag has been changed to emphasis, if I am
spelling it right here.  Again, read this SSML 1.0 document; this reminds me
of using SOAP.  I mean, come on, if you're going to use SpeakSsml there should
be flags you can set so you don't have to do all this doc shit.
 
http://www.w3.org/TR/speech-synthesis/#S3.2.1
 
Pitch stuff:
 
Pitch contour
The pitch contour is defined as a set of white space-separated targets at
specified time positions in the speech output. The algorithm for
interpolating between the targets is processor-specific. In each pair of the
form (time position,target), the first value is a percentage of the period
of the contained text (a number followed by "%") and the second value is the
value of the pitch attribute (a number followed by "Hz", a relative change,
or a label value). Time position values outside 0% to 100% are ignored. If a
pitch value is not defined for 0% or 100% then the nearest pitch target is
copied. All relative values for the pitch are relative to the pitch value
just before the contained text.

<?xml version="1.0"?>
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
                   http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
         xml:lang="en-US">
  <prosody contour="(0%,+20Hz) (10%,+30%) (40%,+10Hz)">
    good morning
  </prosody>
</speak>
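
To make the endpoint rules in that paragraph concrete, here is a minimal C# sketch of how a processor might read a contour string.  It is only an illustration, not how SAPI or any real engine does it: the class and method names are made up, the regular expression assumes targets shaped like the ones in the spec example, and the interpolation between targets is left out because the spec says that part is processor-specific.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any speech API.
static class ContourSketch
{
    // Parses a contour such as "(0%,+20Hz) (10%,+30%)" into
    // (time position in percent, pitch target) pairs.
    public static List<KeyValuePair<double, string>> ParseContour(string contour)
    {
        var targets = new List<KeyValuePair<double, string>>();
        var pattern = new Regex(@"\(\s*([+-]?\d+(?:\.\d+)?)%\s*,\s*([^)\s]+)\s*\)");

        foreach (Match m in pattern.Matches(contour))
        {
            double percent = double.Parse(m.Groups[1].Value, CultureInfo.InvariantCulture);

            // Time positions outside 0% to 100% are ignored.
            if (percent < 0 || percent > 100) continue;

            targets.Add(new KeyValuePair<double, string>(percent, m.Groups[2].Value));
        }

        targets = targets.OrderBy(t => t.Key).ToList();
        if (targets.Count == 0) return targets;

        // If no target is defined for 0% or 100%, copy the nearest one.
        if (targets[0].Key > 0)
            targets.Insert(0, new KeyValuePair<double, string>(0, targets[0].Value));
        if (targets[targets.Count - 1].Key < 100)
            targets.Add(new KeyValuePair<double, string>(100, targets[targets.Count - 1].Value));

        return targets;
    }

    static void Main()
    {
        foreach (var t in ParseContour("(0%,+20Hz) (10%,+30%) (40%,+10Hz)"))
            Console.WriteLine(t.Key + "% -> " + t.Value);
    }
}
```

For the contour in the example above, the only target this adds is one at 100%, copied from the 40% target.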

 


 

-----Original Message-----
From: programmingblind-bounce@xxxxxxxxxxxxx
[mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Chris Hofstader
Sent: Sunday, April 06, 2008 7:03 AM
To: programmingblind@xxxxxxxxxxxxx
Subject: siRE: .Net 3.0 Speech functionality

Hi,

To those of you who may have been following the discussion between Ken and me
about changing pitch and using other SSML items in a C# program, I finally
(with massive amounts of help from KP) got pitch changes to work but only
via the COM interface to SAPI 5.3.  If anyone can find an easy way of doing
exactly this with a .Net assembly, I would appreciate the help just to make
the source code to the project a bit more consistent.

The following code (SSML thanks to KP) works:

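        // Needs a COM reference to the Microsoft Speech Object Library
        // and "using SpeechLib;" at the top of the file.
        static SpVoice synth = new SpVoice();


        static void TestSAPIWithCOM()
        {
            // These are SAPI's own XML tags (rate, pitch, emph), not SSML.
            // The literal is split with + so that mail line wrapping cannot
            // break the string; the rate tag is closed at the end.
            String ssmlInfo = "<rate absspeed=\"-5\"> "
                + "<pitch middle=\"-5\">lowest <emph>is</emph>  -10</pitch> "
                + "<pitch middle=\"5\">highest <emph>is</emph> 10</pitch> "
                + "</rate>";


            synth.Speak("before sending any XML", SpeechVoiceSpeakFlags.SVSFDefault);
            // SVSFIsXML tells SAPI to parse the string as markup.
            synth.Speak(ssmlInfo, SpeechVoiceSpeakFlags.SVSFIsXML);
            synth.Speak("After sending the XML...", SpeechVoiceSpeakFlags.SVSFDefault);

        }  // TestSAPIWithCOM()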
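
For comparison, a managed version of the same test might look roughly like the sketch below.  It uses SpeechSynthesizer.SpeakSsml from System.Speech.Synthesis with a complete SSML 1.0 document.  The class name and the BuildPitchSsml helper are invented for illustration, and whether a given voice actually honors the prosody pitch setting is an assumption that would need checking on the target machine.

```csharp
using System;
using System.Security;
using System.Speech.Synthesis; // add a reference to System.Speech.dll

// Hypothetical sketch, not taken from the project discussed in this thread.
static class ManagedPitchSketch
{
    // Wraps text in a complete SSML 1.0 document with one prosody pitch
    // setting.  pitch may be a label such as "low" or "high".
    public static string BuildPitchSsml(string text, string pitch)
    {
        return "<speak version=\"1.0\""
            + " xmlns=\"http://www.w3.org/2001/10/synthesis\""
            + " xml:lang=\"en-US\">"
            + "<prosody pitch=\"" + pitch + "\">"
            + SecurityElement.Escape(text) // escape &, <, > and quotes
            + "</prosody>"
            + "</speak>";
    }

    static void Main()
    {
        using (SpeechSynthesizer synth = new SpeechSynthesizer())
        {
            synth.SetOutputToDefaultAudioDevice();
            synth.Speak("before sending any SSML");
            synth.SpeakSsml(BuildPitchSsml("this should sound lower", "low"));
            synth.SpeakSsml(BuildPitchSsml("this should sound higher", "high"));
            synth.Speak("after sending the SSML");
        }
    }
}
```

Unlike the COM call there is no XML flag to set here; SpeakSsml always treats its argument as SSML, so the string has to be a well-formed document with the speak root element.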

-----Original Message-----
From: programmingblind-bounce@xxxxxxxxxxxxx
[mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Ken Perry
Sent: Friday, April 04, 2008 11:36 PM
To: programmingblind@xxxxxxxxxxxxx
Subject: RE: .Net 3.0 Speech functionality


I did some tests, and that page and the example I sent out were bang on.  In
old SAPI the rate, volume, and spell tags work fine, but the emph and pitch
tags don't seem to, so you will have to try it in 5.3.  I know I am doing it
right, though, so give that a try.  You probably don't have to set it into
XML flag mode because I am betting 5.3 does that by default.

Ken 

-----Original Message-----
From: programmingblind-bounce@xxxxxxxxxxxxx
[mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Chris Hofstader
Sent: Friday, April 04, 2008 10:49 AM
To: programmingblind@xxxxxxxxxxxxx
Subject: RE: .Net 3.0 Speech functionality

Thanks so much!!!  When I first posted the question yesterday, I made a bet
that you would be the one with the answer.  And I was right.  I owe you a
free lunch.

-----Original Message-----
From: programmingblind-bounce@xxxxxxxxxxxxx
[mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Ken Perry
Sent: Friday, April 04, 2008 1:44 PM
To: programmingblind@xxxxxxxxxxxxx
Subject: RE: .Net 3.0 Speech functionality



You don't send it as an argument; you set the XML flag and you send the pitch
change in the string of text.  For example, I don't have the exact syntax,
but it's something like speak ("<pitch value=5>My text",flags)

And you have to have the xml flag turned on.

Ken 

-----Original Message-----
From: programmingblind-bounce@xxxxxxxxxxxxx
[mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Chris Hofstader
Sent: Friday, April 04, 2008 10:01 AM
To: programmingblind@xxxxxxxxxxxxx
Subject: RE: .Net 3.0 Speech functionality

Unfortunately emphasis is separate from pitch and the Speak function that
takes the emphasis argument makes fairly subtle changes to the speech and
there is no Speak that accepts a pitch as an argument.

I'm playing around with the PromptBuilder.AppendSsmlMarkup method right now.
Pitch can be changed from within the prosody item but I can't seem to figure
out how to write the string passed to the AppendSsmlMarkup method without
causing the Speak function to crash.
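
One possible cause, offered only as a guess: as far as I recall from the documentation, AppendSsmlMarkup expects a fragment of SSML without the speak root element, because PromptBuilder supplies the root itself.  A minimal sketch under that assumption follows; the class name and the PitchFragment helper are made up for illustration.

```csharp
using System;
using System.Security;
using System.Speech.Synthesis; // add a reference to System.Speech.dll

// Hypothetical sketch; assumes AppendSsmlMarkup wants a fragment, not a document.
static class PromptBuilderPitchSketch
{
    // Returns a prosody element with no <speak> wrapper around it.
    public static string PitchFragment(string text, string pitch)
    {
        return "<prosody pitch=\"" + pitch + "\">"
            + SecurityElement.Escape(text) // escape &, <, > and quotes
            + "</prosody>";
    }

    static void Main()
    {
        PromptBuilder builder = new PromptBuilder();
        builder.AppendText("x ");
        builder.AppendSsmlMarkup(PitchFragment("squared", "high"));

        using (SpeechSynthesizer synth = new SpeechSynthesizer())
        {
            synth.SetOutputToDefaultAudioDevice();
            synth.Speak(builder);
        }
    }
}
```

If the fragment is not well-formed XML, for example an attribute value without quotes or an unescaped ampersand in the text, the call is likely to throw, which may be what the crash was.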

Loving life...
cdh 

-----Original Message-----
From: programmingblind-bounce@xxxxxxxxxxxxx
[mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Ken Perry
Sent: Friday, April 04, 2008 12:26 PM
To: programmingblind@xxxxxxxxxxxxx
Subject: RE: .Net 3.0 Speech functionality



Hmm, I am not sure prompts are what you want to use, though.  If your output
is dynamic, that is not really what prompts are all about.  I would think you
would just need the speak function with SSML to get all the different
emphasis, wouldn't you?

Ken 

-----Original Message-----
From: programmingblind-bounce@xxxxxxxxxxxxx
[mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Chris Hofstader
Sent: Friday, April 04, 2008 3:40 AM
To: programmingblind@xxxxxxxxxxxxx
Subject: RE: .Net 3.0 Speech functionality

Thanks Ken,

The little bit of documentation and the few articles I could find that
discuss PromptBuilder say that it makes an SSML object that, through the
Speak method, gets sent to the SAPI synth.  What I cannot find, though, is
any documentation on how to set up the parameters to pass to PromptBuilder
to make it speak what I want.

For instance, my math program can say, "x superscript squared" or it can
say, "x squared" with the squared in a higher pitch.  As you know, I'm a
nazi when it comes to decreasing syllables and using text augmentations
instead as it will increase productivity and PromptBuilder seems to be the
only .Net way to do this.

cdh

-----Original Message-----
From: programmingblind-bounce@xxxxxxxxxxxxx
[mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Ken Perry
Sent: Friday, April 04, 2008 2:00 AM
To: programmingblind@xxxxxxxxxxxxx
Subject: RE: .Net 3.0 Speech functionality



Well, first off, if they only wrapped the old API, which I am going to go find
out, pitch will only work if you use an XML tag to set it per item you speak;
same for emphasis.

As for speech prompts, you do realize that those are prerecorded prompts?
So, for example, you might record "This is the best software in the world".
Then you could play it as a speech prompt.

I will go see if I can dig up an example of how to do pitch with 3.0.  I am
thinking, though, it's going to be no better than 5.1 was.

Ken  

-----Original Message-----
From: programmingblind-bounce@xxxxxxxxxxxxx
[mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Chris Hofstader
Sent: Thursday, April 03, 2008 4:49 AM
To: programmingblind@xxxxxxxxxxxxx
Cc: 'Will Pearson'
Subject: .Net 3.0 Speech functionality

Hi,

[I'm typing this from memory so please excuse any misspellings of the .Net
terms as I don't have a file open from which I can copy and paste them.]

I've been pulling my hair out trying to find some answers to questions
regarding the System.Speech.Synthesis namespace in .Net 3.0 and higher.  I
am fairly certain that I have the latest SDK installed and I've just
switched over to VS 2008 which seems to be happier with .Net 3.x than was VS
2005.

The System.Speech.Synthesis namespace contains a class called
SpeechSynthesizer which has a method, Speak.  The Speak method is overloaded
to take any of three different types: string, Prompt or PromptBuilder.  The
string entry point is self explanatory, one passes in a string and Speak
causes SAPI to read the text aloud.  By using some of the other items in the
SpeechSynthesizer class, one can increase speech rate, volume and a few
other minor changes to the way the speech sounds when one calls Speak.

Most of the more interesting augmentations, however, seem to require using
the Prompt or PromptBuilder overloads for the Speak method.  Hence my
quandary: I know what I want to do but cannot find any documentation that
describes or even offers an example of how a Prompt is built.  Yes, I've
already searched Google and various open source sites to try to find an
answer, so I'm not relying entirely on MS documentation.

Searching the Microsoft docs, however, did bring me to
System.Speech.Synthesis.TtsEngine, which seems to have methods and member
variables for the more interesting spoken text augmentations like pitch,
contour, emphasis, etc.  The MS documentation, though, says that TtsEngine
items should not be called directly from an application (to make matters
worse, I couldn't find an explanation as to why or how these methods and
member variables can be accessed).

I will be deeply grateful to anyone who can help me figure this out either
explicitly or by sending me pointers to useful documentation, articles and
such.

Thanks,
Cdh

PS:  What do I do to bring the VS scripts over to 2008?  The seriously
broken (when JAWS is running) find feature of the help system is driving me
absolutely nuts.

__________
View the list's information and change your settings at
//www.freelists.org/list/programmingblind
