Hi Vic,

On Thu 05/07/12,10:24, Vic Beckley wrote:
> This does seem to be a good combination. On my test file, with eSpeak as the
> synth, the two characters in question are silent but the triple press does
> yield their Unicode value.

Great, it should work for you 100% of the time, as requested. :)

> With the SAPI 5 synth that I usually use, the
> character are spoken as "question" and, again, the triple press yields the
> Unicode value. I wonder why it is different with different synths?

Implementers usually only define names for characters that are
appropriate for the synth's supported languages. The remaining question
is then what to do with everything else. Some synths say nothing, others
may beep, or, as with your synth, say "question".

Another factor is whether the character is passed to the synth as a word
or as a character. For example, the letter m passed as a word probably
yields an "mmm" sound, while passed as a character the synth pronounces
it as "em" (e followed by m).

> This is in spite of the fact that they are not showing on the screen.

Yes, this points to suitable fonts not being installed or selected. I am
afraid I don't have much information on how to go about getting fonts
for Windows.

> Is there somewhere on the web that you can find a definitive description of
> what all characters represent.

There are a lot, but they are often not very accessible with screen
readers. Just now I managed to find:
http://www.utf8-chartable.de/unicode-utf8-table.pl?utf8=0x
which seems quite nice here with Orca.

> For example, the u+0099 symbol is defined in
> the cy-cy-g1.utb table as a specific dot pattern. Window-Eyes calls this
> character a trademark symbol. NVDA doesn't know what it is. On the reference
> I found on the web it is just called a control character.

Yes, I don't think it's a trademark symbol, just something lost in conversion.
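
To illustrate how a control character can end up being called a
trademark symbol, here is a rough Python 3 sketch. It assumes the table
author's code page was Windows-1252 (cp1252), which is only a guess and
is not confirmed anywhere in this thread. The helper names describe and
reinterpret_as_cp1252 are made up for this example. It also shows one
way to look up what a code point officially is without relying on a web
page.

```python
import unicodedata


def describe(code_point):
    """Return the code point, general category and Unicode name as text."""
    ch = chr(code_point)
    # Control characters have no formal name, so supply a placeholder.
    name = unicodedata.name(ch, "<no name>")
    return f"U+{code_point:04X} {unicodedata.category(ch)} {name}"


def reinterpret_as_cp1252(code_point):
    """Treat a code point below 0x100 as one byte and decode it as cp1252.

    ASSUMPTION: the original table was written in Windows-1252.
    A few bytes (for example 0x81) are undefined in cp1252 and will
    raise UnicodeDecodeError.
    """
    return ord(bytes([code_point]).decode("cp1252"))


if __name__ == "__main__":
    for suspect in (0x99, 0x80):
        fixed = reinterpret_as_cp1252(suspect)
        print(describe(suspect), "->", describe(fixed))
    # U+0099 Cc <no name> -> U+2122 So TRADE MARK SIGN
    # U+0080 Cc <no name> -> U+20AC Sc EURO SIGN
```

Under that assumption, byte 0x99 was meant as the trademark sign
(U+2122) and byte 0x80 as the euro sign (U+20AC), which would explain
why different tools disagree about U+0099 and U+0080.
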
These people seem to have identified it as a bug:
http://stackoverflow.com/questions/7341274/how-can-i-check-that-the-trademark-character-is-set-correctly-in-my-oracle-da
The best thing to do is to delete that symbol and replace it with \x2122
or the actual trademark symbol.

> The character
> u+0080 is similar.

If it also had a comment after it, you should be able to look up its
codepoint in unicodedefs.cti and correct it in the same way.

> The unicodedefs.cti table just has
> these characters as blank, probably because they are supposed to be control
> characters.
> Please explain this confusion?

This probably happened because the author of the table had one code
page, while the target language has a different one. Hopefully, now that
we are working in Unicode, it should be much faster and less error-prone
to correct these and many similar mistakes. Please do keep a record of
what looks suspicious so that we can fix them in one go for a given
table.

Thanks much,
Mesar

For a description of the software, to download it and links to project pages go to
http://www.abilitiessoft.com