atw: Re: XML software for Word-like formatting

  • From: James Hunt <jameshunt@xxxxxxxxxxxxx>
  • To: austechwriter@xxxxxxxxxxxxx
  • Date: Wed, 24 Jun 2009 23:02:01 +1000


On 22 Jun 2009, at 11:23 PM, davebgar wrote:

Hi,
I'm an editor looking for an XML publishing package that gives me
flexibility with formatting scientific/technical reports in a way similar
to MS Word - such as with table formatting, changing pagination for
sections, and incorporating graphics. I'm also considering remote editing,
where several authors can collaborate on a server-based document that can
highlight the edits made and who made them. Have read that WebDAV XML
software (on a WebDAV-enabled server) could achieve this. My only XML
experience so far has been with Arbortext EPIC running with Documentum,
and DocBook schema.

Any suggestions to help narrow down my research on software are most welcome.

Dave Gardiner


**************************************************
To view the austechwriter archives, go to www.freelists.org/archives/austechwriter

To unsubscribe, send a message to austechwriter-request@xxxxxxxxxxxxx with "unsubscribe" in the Subject field (without quotes).

To manage your subscription (e.g., set and unset DIGEST and VACATION modes) go to www.freelists.org/list/austechwriter

To contact the list administrator, send a message to austechwriter-admins@xxxxxxxxxxxxx
**************************************************

-------------------

This request covers a great deal of ground.

In the technical writing field as we know it, products like ArborText Epic are used in larger enterprises for book and report production. Authors work in text editors, and produce ASCII text files. Text may be tagged, usually by selecting opening and closing tags from various menus, in accordance with various rules, and the result - still an ASCII text file - is fed to a typesetter, which uses the TeX engine to produce the formatted version of the document, in PDF. The tags comprise a dialect of XML. The database program behind everything (often Documentum) can keep track of changes made, versions, and so on. The text file in which the author works is reconstructed from tagged elements and presented to the author by the database program, and this process is transparent. By issuing suitable instructions, a publisher could produce multiple editions of a book from the database of versions of tagged elements, by specifying selection rules for those tagged elements.
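As a rough illustration of what "selection rules for tagged elements" can mean, here is a minimal Python sketch. The `edition` attribute, its values, and the `select_edition` helper are all invented for this example; they are not taken from Epic, Documentum, or DocBook, and a real system would be far more elaborate.

```python
import xml.etree.ElementTree as ET

# Hypothetical source: the "edition" attribute and its values are invented
# for this sketch, not taken from Epic, Documentum, or DocBook.
SOURCE = """<report>
  <para>Shared introduction.</para>
  <para edition="print">See the fold-out chart at the back.</para>
  <para edition="web">Follow the link to the interactive chart.</para>
</report>"""


def select_edition(xml_text, edition):
    """Return the text of every para that is untagged or tagged for this edition."""
    root = ET.fromstring(xml_text)
    kept = []
    for para in root.iter("para"):
        tag = para.get("edition")  # None when the element applies to all editions
        if tag is None or tag == edition:
            kept.append(para.text)
    return kept


print(select_edition(SOURCE, "print"))
print(select_edition(SOURCE, "web"))
```

The same tagged source yields a different list of paragraphs for each edition, which is the general idea behind producing multiple editions from one set of tagged elements.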

ArborText Epic is ruinously expensive to buy, and hard to maintain in production. From a writer's point of view, though, it is very easy to use. If you can afford it, Epic+Documentum would meet your requirements. Otherwise, you will have to devise your own toolchain.

There is, AFAIK, no freeware equivalent of Epic+Documentum, but the general idea of feeding tagged ASCII text to a typesetting program has been around since the Stone Age of Computing, when the freeware TeX typesetting system was devised in 1978 or thereabouts. (TeX is used by ArborText Epic for its typesetting.)

In the book and journal publishing world, the tools used by writers and editors of scientific, technical, and medical (STM) works vary by subject area. In equation-rich fields such as mathematics, physics, and some branches of engineering, LaTeX (a macro package built on the aforementioned TeX) reigns supreme. Authors write text files and insert their own tags, and can typeset their own work. For detailed information, downloads, and free introductory texts, see http://www.tug.org.
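For anyone who has not seen one, a LaTeX source file looks roughly like the following. This is only a minimal sketch; the section title and equation are placeholders chosen for illustration.

```latex
\documentclass{article}
\begin{document}

\section{Results}

The measured energy agrees with
\begin{equation}
  E = mc^2 .
\end{equation}

\end{document}
```

The author types the tags (`\section`, `\begin{equation}`, and so on) by hand in a text editor, then runs the file through LaTeX to get the typeset pages.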

This is a world where WYSIWYG is practically unknown, and Microsoft Word is a toy for secretaries.

LaTeX is not an XML dialect. LaTeX and friends, such as its newer cousin ConTeXt, have features that make XML folk nervous: for example, the frequent termination of commands by blanks, and the fact that structural tags can double as processing commands.
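A small sketch of both points, using only standard LaTeX commands; the wording of the sample text is invented for illustration.

```latex
\documentclass{article}
\begin{document}
\LaTeX is popular.   % the blank ends the command name and is swallowed: "LaTeXis popular."
\LaTeX{} is popular. % empty braces end the command name, so the blank survives
\section{Scope}      % marks structure, and also triggers numbering and formatting
\end{document}
```

In XML, by contrast, every tag has explicit delimiters, and a tag such as a section title carries no formatting behaviour of its own; that is left to a separate stylesheet or typesetter.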

Major book and journal publishers use tag converters to produce XML-tagged versions of LaTeX-coded works, for archival purposes: Google "LaTeX to XML" for more information.

Representing mathematics is straightforward in LaTeX, but not in XML. There is an XML dialect called MathML, which was originally designed for displaying equations on Web pages. In comparison with LaTeX code, MathML is blindingly verbose, and few writers use MathML directly. LaTeX to MathML tag converters are available. Again, Google.
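To give a feel for the difference in verbosity, the following Python sketch holds one simple fraction in both notations and compares them. The MathML string is my own hand-written presentation markup for this fraction, not the output of any particular converter, and `element_count` is a helper invented for the example.

```python
import xml.etree.ElementTree as ET

# The fraction a/b as an author would type it in LaTeX.
LATEX = r"\frac{a}{b}"

# The same fraction in presentation MathML (hand-written for this sketch).
MATHML = (
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mfrac><mi>a</mi><mi>b</mi></mfrac>"
    "</math>"
)


def element_count(xml_text):
    """Count every element in the XML string, including the root."""
    return sum(1 for _ in ET.fromstring(xml_text).iter())


print(len(LATEX), "characters of LaTeX")
print(len(MATHML), "characters of MathML")
print(element_count(MATHML), "MathML elements")
```

Even for a single fraction the MathML runs to several times the length of the LaTeX, and the gap widens quickly for real equations.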

Many writers of Web pages with mathematics get around the verbosity problem by inserting little JPG images of equations that were produced by LaTeX utilities; see, for example:

http://www.cmmp.ucl.ac.uk/~ajf/course_notes/node42.html

On that page, the ALT data for the images comprises the LaTeX code used to generate the images, but there is no real sense in which the mathematics is part of the structure of the page.

A good general reference on LaTeX, XML, etc. is:

Goossens, M. and Rahtz, S.: "The LaTeX Web Companion: Integrating TeX, HTML, and XML" (Addison-Wesley, 1999). ISBN: 0 201 43311 7.

JH

