[liblouis-liblouisxml] Re: Semantic action files, which are for what format

  • From: Keith Creasy <kcreasy@xxxxxxx>
  • To: "'liblouis-liblouisxml@xxxxxxxxxxxxx'" <liblouis-liblouisxml@xxxxxxxxxxxxx>
  • Date: Wed, 30 Oct 2013 12:47:26 +0000

So, how do you get the Word XML document? Is it a "Save-as" choice? I'm just 
curious about this since I am really a pretty rudimentary Word user. I probably 
know more about the DOCX format than I do about actually using Word. :)




-----Original Message-----
From: liblouis-liblouisxml-bounce@xxxxxxxxxxxxx 
[mailto:liblouis-liblouisxml-bounce@xxxxxxxxxxxxx] On Behalf Of Michael Whapples
Sent: Wednesday, October 30, 2013 8:38 AM
To: liblouis-liblouisxml@xxxxxxxxxxxxx
Subject: [liblouis-liblouisxml] Re: Semantic action files, which are for what 
format

One obvious difference is that Word XML has the root element <wordDocument>, 
whereas docx has the root element <document>.
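
As a rough illustration of that difference, the sketch below guesses the format 
from the local name of the root element. It assumes Python's standard 
xml.etree.ElementTree module; the function names and the returned labels are 
hypothetical and are not part of liblouisutdml.

```python
import xml.etree.ElementTree as ET


def root_local_name(xml_text):
    """Return the root element's name with any namespace part removed."""
    root = ET.fromstring(xml_text)
    # ElementTree writes namespaced tags as "{namespace-uri}localname".
    return root.tag.rsplit("}", 1)[-1]


def guess_word_format(xml_text):
    """Guess which Word flavour an XML string is (hypothetical helper)."""
    name = root_local_name(xml_text)
    if name == "wordDocument":
        return "Word 2003 XML"
    if name == "document":
        return "docx document part"
    return "unknown"
```

A check like this only looks at the root element, so it says nothing about 
whether the rest of the file is valid for either format.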

I think the Word XML I am getting is Word 2003 XML, but that might be due to 
using Word 2003. The docx format was introduced in Office 2007 and is 
standardized as ISO/IEC 29500 and ECMA-376. Word 2003 XML is not such a standard.

The main reason I chose to go with Word XML is that it is a single XML file 
containing the style information and everything else, whereas if I were to 
extract word/document.xml from a docx archive I would not have the style and 
other parts of the document.
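
To illustrate the point about the archive, here is a minimal sketch that reads 
both the document part and the styles part from a docx container using 
Python's standard zipfile module. It assumes the usual part names 
word/document.xml and word/styles.xml; the helper name read_docx_parts is 
hypothetical.

```python
import zipfile


def read_docx_parts(docx_file, wanted=("word/document.xml", "word/styles.xml")):
    """Return {part name: bytes} for each wanted part found in the archive.

    docx_file may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx_file) as archive:
        present = set(archive.namelist())
        # Skip parts that are missing instead of raising KeyError.
        return {name: archive.read(name) for name in wanted if name in present}
```

Extracting only word/document.xml would correspond to asking for that one part 
and discarding the styles, which is the loss described above.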

Michael Whapples
On 30/10/2013 12:27, Keith Creasy wrote:
> So, what is the difference between DOCX and what you are calling the Word 
> XML format? I mean other than the fact that DOCX is a zip archive that 
> contains an XML document file.
>
> Having looked at them, none of the Word semantic action files is very good. 
> You'll get Braille, but not very good Braille. I hope we can actually 
> support .docx files soon. I'll probably get grief about wanting to call the 
> clean semantic action file the "docx.sem" file because someone probably uses 
> the old one.
>
>
> -----Original Message-----
> From: liblouis-liblouisxml-bounce@xxxxxxxxxxxxx 
> [mailto:liblouis-liblouisxml-bounce@xxxxxxxxxxxxx] On Behalf Of 
> Michael Whapples
> Sent: Wednesday, October 30, 2013 6:03 AM
> To: liblouis-liblouisxml@xxxxxxxxxxxxx
> Subject: [liblouis-liblouisxml] Re: Semantic action files, which are 
> for what format
>
> The document action is not the issue now; it solved the problem.
>
> My question really relates to what files are for what, or in some cases 
> whether a file is still relevant.
>
> OK, docx.sem is in the repository but not in the 2.5.0 tarball release.
>
> Examining docx.sem, I see it is not for Word XML but for the docx format. As I 
> said previously, Word XML and docx are different formats. It does appear that 
> wordDocument.sem is the file for Word XML, but there is also 
> w_wordDocument.sem which is similar, so I do not know which is the one I 
> should be using. Previously you said that wordDocument.sem and 
> w_wordDocument.sem are no longer used, but I cannot find any other file which 
> seems to deal with the Word XML format, so which should I be using for Word 
> XML?
>
> Michael Whapples
> On 30/10/2013 09:55, John J. Boyer wrote:
>> The docx.sem file should be in the liblouisutdml repository. It was 
>> pushed some time ago. As I have stated several times, the document 
>> semantic action should be on the element that contains the document 
>> and nothing else. It should not be on the <head> element or on the 
>> root element.
>>
>> John
>>
>> On Wed, Oct 30, 2013 at 06:07:59AM +0000, Michael Whapples wrote:
>>> Hello,
>>> I previously commented on problems with the wordDocument.sem 
>>> semantic action file placing brl nodes outside the <w:body> element.
>>>
>>> John stated that wordDocument.sem and w_wordDocument.sem are out of 
>>> date and that I should use docx.sem.
>>>
>>> First of all, docx.sem does not exist. I do find a doc.sem, but it 
>>> does not seem to relate to the Word XML format which one gets when 
>>> saving in Word as XML. NOTE: Word XML is not the docx format.
>>>
>>> The only semantic action files which look remotely like the Word XML 
>>> format are wordDocument.sem and w_wordDocument.sem. I am not really 
>>> sure what the difference between these files is meant to be.
>>>
>>> It might be worth removing files which are no longer used.
>>>
>>> Also it might be worth adding comments to the top of semantic action 
>>> files to state which format the file is used for, particularly as 
>>> sometimes the file name might not help.
>>>
>>> Michael Whapples
>>> For a description of the software, to download it and links to 
>>> project pages go to http://www.abilitiessoft.com

