[austechwriter] I'm not sure how well a strict XML structure would work in the real world

Hello all,
This is my second attempt to post this, so apologies if you've seen it before.

But following on from last week's discussion re Word & Frame and 'font 
fondling' vs structured and all that, I (as unabashed XML proselytizer) feel I 
have to draw attention to a couple of facts.

First is this: YOU DO NOT NEED A DTD TO RUN XML
Sorry about the caps, but the belief that you do need one is such a widespread 
misconception. Certainly, if you want to validate the XML file then you must 
have a DTD (or other rules file). But XML has so much to offer apart from 
validation that it's a shame to see techwriters dismissing it on account of a 
complication that isn't relevant to 90% of the work that they do. Sure, if 
you're in Bill Hall's situation (Defence, stringent content control, high 
volumes) then you'll be buying into the full catastrophe. But others should be 
aware that XML has introduced (as a qualification to the stricter, more complex 
and more powerful SGML) what could be described as an 'entry-level grammar'. 
The so-called rules of WELL-FORMEDNESS prescribe a very simple structure (a 
single root element, every tag closed and properly nested, attribute values 
quoted) that a document must have in order to be processable by an XML-aware 
application.
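
To make the point concrete, here is a minimal sketch in Python (my choice of 
language; nothing in this thread depends on it) using the standard library's 
xml.etree.ElementTree parser. The sample document and its element names 
(procedure, title, step) are made up for illustration. No DTD is involved at 
any point: the parser only checks well-formedness.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: no DOCTYPE, no DTD, invented element names.
WELL_FORMED = """<procedure>
  <title>Changing the toner</title>
  <step>Open the front cover.</step>
  <step>Pull out the old cartridge.</step>
</procedure>"""

# The same content with the closing </title> tag removed.
NOT_WELL_FORMED = WELL_FORMED.replace("</title>", "")


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# A well-formed document can be processed straight away, without validation.
steps = [step.text for step in ET.fromstring(WELL_FORMED).findall("step")]

if __name__ == "__main__":
    print(is_well_formed(WELL_FORMED), is_well_formed(NOT_WELL_FORMED))
    print(steps)
```

The second document fails only because a tag is left unclosed, not because it 
breaks any rule about which elements are allowed where.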

This fact qualifies the following Mike Buckler statements.

Authoring XML/SGML is definitely slower. There are several reasons.

1) Using a DTD as complex as Docbook requires many hours of study.
True, but why a DTD? And even if you're using one, why DocBook? My point is 
that the questions should be asked - there are of course good reasons for both, 
in some circumstances.

2) The current crop of XML/SGML aware tools are not particularly user friendly.
This is off the present topic but I'd have to differ again: XML Spy is the 
goods IMHO; I can't comment on others 'cause I haven't seriously tried them (no 
need to :-)

3) If the DTD doesn't fit the type of document exactly, then you can
spend as much time revising the DTD and style rules as actually
writing the document.
Same answer as for 1 above. And Spy (and other tools) will reverse-engineer a 
DTD from an existing XML document.
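
As a rough illustration of what reverse-engineering a DTD involves, here is a 
deliberately naive Python sketch. It is not how XML Spy or any other tool does 
it: the function name infer_dtd and the sample document are hypothetical, 
attributes are ignored, and every content model comes out as a loose "any of 
these, in any order", which a real tool would tighten.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document with invented element names.
SAMPLE = (
    "<procedure><title>Changing the toner</title>"
    "<step>Open the <b>front</b> cover.</step>"
    "<step>Pull out the cartridge.</step></procedure>"
)


def infer_dtd(root: ET.Element) -> str:
    """Emit one loose <!ELEMENT ...> declaration per element name seen."""
    children = {}  # element name -> child element names, in first-seen order
    has_text = {}  # element name -> True if any instance contains text
    for el in root.iter():
        kids = children.setdefault(el.tag, [])
        for child in el:
            if child.tag not in kids:
                kids.append(child.tag)
        # Text directly inside el is el.text plus the tails of its children.
        text_here = bool((el.text or "").strip()) or any(
            (child.tail or "").strip() for child in el
        )
        has_text[el.tag] = has_text.get(el.tag, False) or text_here

    lines = []
    for name, kids in children.items():
        if kids and has_text[name]:
            model = "(#PCDATA | " + " | ".join(kids) + ")*"  # mixed content
        elif kids:
            model = "(" + " | ".join(kids) + ")*"  # element-only content
        elif has_text[name]:
            model = "(#PCDATA)"  # text only
        else:
            model = "EMPTY"
        lines.append(f"<!ELEMENT {name} {model}>")
    return "\n".join(lines)


dtd = infer_dtd(ET.fromstring(SAMPLE))

if __name__ == "__main__":
    print(dtd)
```

The output is a starting point to edit by hand, which is the spirit of the 
point above: write the document first and derive the rules afterwards.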

4) Things like tables do not fit into the classic XML/SGML model. This
is where presentation (column and row layout) comes into play and
messes up the whole point of using XML/SGML in the first place.
I'm at a loss to understand where this comment is coming from, but venture the 
following. I suspect that Mike has formed his opinion under the influence of 
DocBook. DocBook is a very rich DTD and will save lots of DTD developers lots 
of work over the long term. But its default table model is CALS, which (I 
believe) has long been used in the defence/aerospace SGML community. CALS 
is somewhat more complex than the much more widely understood HTML model, and 
translation between the two is less than straightforward (but no more than a 
few hours' work for a competent XSLT, Perl, whatever, scripter).

But for most techwriters most of the time this is all pretty esoteric. There is 
no 'classic model'. Describe your table as you would in HTML and worry about 
more complex models as and when the need arises.
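
For a feel of what the CALS-to-HTML translation mentioned above looks like in 
the simplest case, here is a sketch in Python rather than XSLT or Perl. It 
assumes a plain CALS table (table/tgroup/thead/tbody/row/entry) with text-only 
cells and no spans; colspec, namest/nameend and morerows, which are where the 
real effort goes, are not handled. The function name cals_to_html and the 
sample table are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical CALS table: one header row, two body rows, no spanning cells.
CALS = """<table>
  <tgroup cols="2">
    <thead>
      <row><entry>Key</entry><entry>Action</entry></row>
    </thead>
    <tbody>
      <row><entry>F1</entry><entry>Help</entry></row>
      <row><entry>F5</entry><entry>Refresh</entry></row>
    </tbody>
  </tgroup>
</table>"""


def cals_to_html(cals_table: ET.Element) -> ET.Element:
    """Map row/entry to tr/th (header) or tr/td (body); text-only cells."""
    html = ET.Element("table")
    tgroup = cals_table.find("tgroup")
    for section, cell_tag in (("thead", "th"), ("tbody", "td")):
        source = tgroup.find(section)
        if source is None:
            continue  # a CALS table need not have a header
        target = ET.SubElement(html, section)
        for row in source.findall("row"):
            tr = ET.SubElement(target, "tr")
            for entry in row.findall("entry"):
                cell = ET.SubElement(tr, cell_tag)
                cell.text = entry.text
    return html


html_text = ET.tostring(cals_to_html(ET.fromstring(CALS)), encoding="unicode")

if __name__ == "__main__":
    print(html_text)
```

If you author the table in the HTML model to begin with, none of this is 
needed, which is the argument for starting simple.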

The current hype around XML is mainly to do with data interchange
between applications that do data processing. For example an insurance
claim form where a set number of fields must be completed in order to
create a "valid document".

I don't know that the hype is quite current (I thought it peaked a couple of 
years back, but maybe that's just me), but the distinction between 
data-centric and document-centric XML is important. It's true that most of the 
XML development focus through the e-commerce period has been on data-centric 
forms such as the one Mike cites. But SGML was designed for narrative-style 
documents, and XML is only slightly less capable in that regard. For myself, 
I've spent a good deal of time figuring out the XSLT necessary to navigate 
narrative-style doc structures, wondering all the while what the much-vaunted 
implementation of XML in Word would mean. Well, now that it's here (in beta) I 
can say that the good news (and fact number two in this little spiel) is

WORD 2003'S XML FUNCTIONALITY IS NOT PROPRIETARY
How so? Because the standards bar is actually quite low. As I pointed out 
earlier, an XML doc need only conform to the well-formedness rules to 'be' XML. 
It's true that Word 2000's Save as Web Page conversion introduced some shonky 
syntax. But that's all gone in the 2003 Save as XML functionality. What you get 
is pretty much what you always got in RTF: a complete description of the doc 
with content and formatting closely juxtaposed, except that it's now expressed 
in XML. The 'close juxtaposition' makes it a bit laborious to step around all 
the formatting tags (when processing with XSLT), but it's quite doable.
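
Here is a small sketch of what 'stepping around the formatting tags' amounts 
to, written in Python rather than XSLT. The sample is a hand-written, heavily 
simplified fragment in the style of Word 2003's XML, not real Word output, and 
the namespace URI is the one I understand the released product to use; treat 
both as assumptions, particularly against a beta. Paragraphs are w:p, runs are 
w:r, run formatting sits in w:rPr and the text itself in w:t.

```python
import xml.etree.ElementTree as ET

# Assumed WordprocessingML 2003 namespace; check it against your own files.
W = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W}

# Hand-written, simplified sample (an assumption, not actual Word output).
SAMPLE = f"""<w:wordDocument xmlns:w="{W}">
  <w:body>
    <w:p>
      <w:r><w:rPr><w:b/></w:rPr><w:t>Warning:</w:t></w:r>
      <w:r><w:t> unplug the unit first.</w:t></w:r>
    </w:p>
    <w:p>
      <w:r><w:rPr><w:i/></w:rPr><w:t>Then</w:t></w:r>
      <w:r><w:t> remove the cover.</w:t></w:r>
    </w:p>
  </w:body>
</w:wordDocument>"""


def paragraphs(doc_text: str) -> list:
    """Return the plain text of each paragraph, ignoring run formatting."""
    root = ET.fromstring(doc_text)
    result = []
    for p in root.iterfind(".//w:p", NS):
        # Collect only the text elements; w:rPr formatting is skipped.
        texts = (t.text or "" for t in p.iterfind(".//w:t", NS))
        result.append("".join(texts))
    return result


plain = paragraphs(SAMPLE)

if __name__ == "__main__":
    for line in plain:
        print(line)
```

The equivalent XSLT is a couple of templates that match the text elements and 
ignore the rest; either way the formatting is noise you select around, not 
something you have to parse.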

Fact number three: INTERNET EXPLORER AND NOTEPAD ARE ALL YOU NEED TO DO XML
Well, if we're talking 'real world' then it's unlikely that you'd limit 
yourself to these. I guess my point is to urge writers such as Elizabeth, who 
are wondering what the go is with XML, to start experimenting. The outlay isn't 
financial; it's time spent learning the standards, namely XML, XSLT and 
CSS/HTML. XSLT is the most difficult of these but by no means the proverbial 
rocket science. I'd rate it at about 60% of the difficulty of VBA.

Is it worth it? There will always be a fair degree of subjectivity in that 
assessment, but I'm confident that the coming together of public standards and 
Microsoft's implementations in IE and Office means that these skills will 
increasingly become 'core' for techwriters.

rgds,
Tony Cusack,
www.textology.com.au
