John,

Every text editor that I use is fully capable of using UTF-8. Even Windows Notepad, the most backward one of all, can use UTF-8. And as Mesar points out, anybody writing in languages using extended character sets mostly uses UTF-8 nowadays anyhow. Windows of course uses UTF-16 as its default set, but it converts to/from UTF-8 okay, with screen readers being the main victim somehow.

I had understood that you would need to make big changes to liblouis to use UTF-8. If that is not the case, then I guess I am now wondering why not. Seems to me that we could continue to use the \x notation for backward compatibility. What is the downside?

John G

-----Original Message-----
From: liblouis-liblouisxml-bounce@xxxxxxxxxxxxx [mailto:liblouis-liblouisxml-bounce@xxxxxxxxxxxxx] On Behalf Of John J. Boyer
Sent: Tuesday, July 03, 2012 9:17 AM
To: liblouis-liblouisxml@xxxxxxxxxxxxx
Subject: [liblouis-liblouisxml] Re: [liblouis] r715 committed - the last batch of files converted to utf-8.

I feel that it is important that the tables should be human-readable and editable with simple text editors. This is not the case if either UTF-8 or Latin-1 is used. That is why I think we should use the \xhhhh notation for all characters above 127. I don't care that it doesn't look pretty. It makes things easier for people who have to maintain tables after the original author is finished with them.

Moreover, why should European languages be favored in terms of ease of entering characters? Non-European languages must use the \xhhhh notation anyway.

Finally, I don't think it is a good idea to suddenly change a way of writing tables that has been used from the beginning.

John

On Tue, Jul 03, 2012 at 04:21:01PM +0100, Mesar Hameed wrote:
> Hi John,
>
> On Tue 03/07/12,10:05, John J. Boyer wrote:
> > If UTF8 is allowed in the character argument of opcodes, no characters
> > above 127 can be used. This would invalidate any tables using Latin-1.
>
> Yes, they have already been converted to UTF-8, i.e. no byte uses values above 127 as per the Unicode standard.
> If they need to represent umlauts, Greek symbols etc., they are correctly encoded using UTF-8.
>
> > People would not be able to simply type letters on their keyboards.
>
> Sorry, I think you are mixing up computer representation with human-readable representation.
> The point of UTF-8 is:
> * for everyone to be able to use their keyboards, and directly see on the screen what characters they have typed.
> * that their text, independent of the location of the reader, is exactly what the author wrote.
>
> I can quite happily write Swedish text and send it to you, and you can receive it perfectly fine, as long as the file is UTF-8 encoded.
> If I send it as Latin-1, I am presuming that you will open it as Latin-1, and if you don't then you will not be able to read the content correctly.
>
> To reiterate: at this moment in time, no tables are using values above 127, but they are displaying correctly for everyone, because they use UTF-8.
> So we wish opcodes to accept UTF-8 operands.
> Of course, when it comes to sending for printing, these values are correctly mapped using the dis files to fall within the characters that
> the hardware accepts.
>
> Mesar
> For a description of the software, to download it and links to
> project pages go to http://www.abilitiessoft.com

-- 
John J. Boyer; President, Chief Software Developer
Abilitiessoft, Inc.
http://www.abilitiessoft.com
Madison, Wisconsin USA
Developing software for people with disabilities
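
For readers comparing the two notations discussed in this thread, here is a minimal Python sketch that converts between literal characters and the four-digit \xhhhh form. It is only an illustration and not part of liblouis: the function names `to_escaped` and `from_escaped` are hypothetical, and the sketch assumes that only characters up to U+FFFF need to be handled, since the thread mentions only four hex digits.

```python
import re

# Hypothetical helpers for illustration; these are not liblouis functions.

def to_escaped(text: str) -> str:
    """Rewrite every character above 127 as \\xhhhh (four hex digits)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp <= 127:
            out.append(ch)                # plain ASCII stays as typed
        elif cp <= 0xFFFF:
            out.append("\\x%04x" % cp)    # e.g. U+00E4 becomes \x00e4
        else:
            # Assumption: characters beyond four hex digits are out of scope.
            raise ValueError("U+%X does not fit in \\xhhhh" % cp)
    return "".join(out)


_ESCAPE = re.compile(r"\\x([0-9a-fA-F]{4})")


def from_escaped(text: str) -> str:
    """Turn each \\xhhhh escape back into the character it names."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


if __name__ == "__main__":
    word = "caf\u00e9"                        # "café" typed directly
    print(to_escaped(word))                   # caf\x00e9
    print(from_escaped("caf\\x00e9") == word) # True
    # The same letter stored in a UTF-8 file occupies two bytes:
    print("\u00e4".encode("utf-8"))           # b'\xc3\xa4'
```

Because the conversion round-trips, a table could in principle be kept in either form and converted for whoever needs to edit it, which is one way to read the backward-compatibility suggestion above.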