[liblouis-liblouisxml] Re: UTDML and where brl nodes can appear

  • From: Michael Whapples <mwhapples@xxxxxxx>
  • To: liblouis-liblouisxml@xxxxxxxxxxxxx
  • Date: Fri, 18 Oct 2013 17:37:44 +0100

So, from what you are saying, are the files wordDocument.sem and w_wordDocument.sem no longer needed? If so, why have they not been deleted from the project?


I must stress, though, that this is not docx. It is Word XML, as produced by selecting the format Word XML 2003 in the save dialog of Word (a single XML file, not an archive containing multiple files). Will docx.sem work on this?

Will I need to create a config file to ensure docx.sem is used?

Michael Whapples
On 18/10/2013 17:25, John J. Boyer wrote:
Separate semantic-action files are not needed for documents that use
namespaces. Semantic-action files use only the local names. Given the
possibility of name conflicts, maybe this should be changed.

Both of the old MSWord semantic-action files are wrong. Look at docx.sem
in the lbu_files directory of liblouisutdml. You can correct or delete
the old files.

John

On Fri, Oct 18, 2013 at 04:37:35PM +0100, Michael Whapples wrote:
If that is so, then the Word XML semantic action files (I believe both
wordDocument.sem and w_wordDocument.sem) are wrong, as they assign the
document action to the wordDocument and w:wordDocument elements.

PS. Why do we need separate semantic action files for when a namespace
is used or not? Also relying on the XML to alias the namespace as w:
seems to be wrong as potentially one can alias it as they like: eg.
xmlns:w="..." or xmlns:word="...", both would be correct XML and mean
the same.
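
The point about prefixes can be checked with any namespace-aware parser. The following is a minimal sketch in Python, not liblouisutdml code; the two sample documents are made up for illustration, and the namespace URI is assumed to be the one used by Word 2003 XML. It shows that both aliases resolve to the same expanded element name, and that the local name is the same regardless of prefix.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for Word 2003 XML (WordprocessingML 2003).
WORDML_NS = "http://schemas.microsoft.com/office/word/2003/wordml"

# Two hypothetical documents that differ only in the prefix chosen for the namespace.
doc_w = f'<w:wordDocument xmlns:w="{WORDML_NS}"><w:body/></w:wordDocument>'
doc_word = f'<word:wordDocument xmlns:word="{WORDML_NS}"><word:body/></word:wordDocument>'

root_w = ET.fromstring(doc_w)
root_word = ET.fromstring(doc_word)


def local_name(tag):
    """Strip the '{namespace-uri}' part that ElementTree puts in front of a tag."""
    return tag.rsplit("}", 1)[-1]


# ElementTree stores tags as '{uri}local', so the prefix plays no part in the comparison.
same_expanded_name = root_w.tag == root_word.tag
print(same_expanded_name)
print(local_name(root_w.tag), local_name(root_word[0].tag))
```

Matching on the expanded name (URI plus local name) rather than on the literal prefix would make a single rule cover both documents.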

Michael Whapples
On 18/10/2013 02:16, John J. Boyer wrote:
In dtbook.sem and nimas.sem the action document is given to <book>
because it contains the actual document. <dtbook> contains the head
node as well. So for the xml part of docx <body> must be given the
action document. The root element <document> gets no action. That is the
way things are set up.

John

On Thu, Oct 17, 2013 at 09:09:52PM +0100, Michael Whapples wrote:
No, the document node should be w:wordDocument.

Looks like w:body is given the action no.

In Word XML w:body is like body in HTML, where the body of the document
will be found, so why should brl nodes appear outside that?

Also I notice two .sem files for Word documents. It looks like one is for
when there is an XML namespace in the Word XML and the other is for when
there is not. There seem to be differences between these (e.g.
w:wordDocument has action no, whereas wordDocument has action document).

Michael Whapples
On 17/10/2013 18:09, John J. Boyer wrote:
What semantic action is assigned to the w:body node? It should be
document. Look at the docx.sem file in lbu_files.

John

On Thu, Oct 17, 2013 at 05:46:50PM +0100, Michael Whapples wrote:
Where can the brl nodes appear in a UTDML document?

We have a Word XML file which has been marked with UTDML and the last
brl node for the last page number appears outside the document body
(w:body) element. This looks strange and means that one has to get the
parser to look outside the body of the document to get all Braille,
which is certainly sub-optimal as one will need to filter out much
unwanted stuff.

Is LibLouisUTDML working correctly in placing this brl node outside the
body element?
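
To make the symptom concrete, here is a minimal Python sketch that separates brl nodes found inside w:body from those found elsewhere in the document. It is an illustration only: the sample document is hypothetical, the brl elements are assumed to carry no namespace, and the namespace URI is assumed to be the Word 2003 XML one.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for Word 2003 XML (WordprocessingML 2003).
WORDML_NS = "http://schemas.microsoft.com/office/word/2003/wordml"

# Hypothetical UTDML-style output: one brl node inside w:body and a
# trailing one (e.g. a page number) placed after it.
SAMPLE = f"""<w:wordDocument xmlns:w="{WORDML_NS}">
  <w:body>
    <w:p><w:r><w:t>Hello</w:t></w:r><brl>,hello</brl></w:p>
  </w:body>
  <brl>#a</brl>
</w:wordDocument>"""


def split_brl_nodes(root):
    """Return (inside, outside): brl elements within w:body and all others."""
    body = root.find(f"{{{WORDML_NS}}}body")  # direct child of the root
    inside = list(body.iter("brl")) if body is not None else []
    inside_ids = {id(e) for e in inside}
    outside = [e for e in root.iter("brl") if id(e) not in inside_ids]
    return inside, outside


root = ET.fromstring(SAMPLE)
inside, outside = split_brl_nodes(root)
print(len(inside), "brl node(s) inside w:body;", len(outside), "outside")
```

A consumer that only walks w:body would miss the second node, which is the extra filtering work described above.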

Michael Whapples
For a description of the software, to download it and links to
project pages go to http://www.abilitiessoft.com
