[program-l] Re: Changing Document Layout in Word using VBA

  • From: Christopher Edwards <edwardsc2007@xxxxxxxxxxxxxx>
  • To: <program-l@xxxxxxxxxxxxx>
  • Date: Tue, 13 Oct 2009 14:10:42 +0100


Hi Ian,

Thank you for this information. I was sure I was beginning to use a sledgehammer to crack a walnut.

The original document is in MS Word format. You mentioned saving this as XML. Unfortunately I only have access to Word 2002 at the moment and XML is not an option in its Save As dialogue, although I understood that XML is the underlying format for Office documents. Maybe I could use a stand-alone program to do this conversion? I have used HTML and XHTML but have not done much with XML so far, although I am happy to learn. You mentioned applying a style sheet to the XML document. Is this similar to CSS?

One thing I did try was to save the Word document as HTML. This displayed correctly in IE and all the text was easily accessible with a screen reader. However, the screen reader read the right-hand column of boxes in reverse order, with one out of sequence, and read the heading line after the main boxes and before the footer information. As you know, the HTML generated by MS Word is pretty complex so I am not too sure about messing with that. I will get hold of HTML Tidy again and see if that will clean it up.

If I was able to convert the document to XML would the style sheet on its own allow me to control this reading order or would I have to parse the document and make changes to it as well?

Finally, what application would I use to read the XML document? It strikes me that reading it with a browser would be a good idea, if that is possible, so that people are not restricted to certain versions of MS Office.

As for getting the documents reformatted at source, my friend is going to have another go. Unfortunately, as he is a contract employee, there seem to be issues in the organisation about who should be contacting the people supplying the documents. I don't want to sound mysterious but I probably know more than I should, so will just say that there seems to be a lot of office politics involved.

I hope I have managed to be reasonably clear.

Chris

----- Original Message -----
From: "Ian Sharpe" <isforums@xxxxxxxx>
To: <program-l@xxxxxxxxxxxxx>
Sent: Friday, October 09, 2009 4:49 PM
Subject: [program-l] Re: Changing Document Layout in Word using VBA


Hi Chris

Apologies if I've missed anything but it's not clear to me what format the
original documents are in? Is it Word format or HTML?

I'd be tempted to convert the document into XML then apply a stylesheet to
it. If the source is a Word document you can save as XML so it's easy. You
then would just need to look at the structure of the XML file produced and
it should be relatively straightforward from there.

If the doc is in HTML format, you can run it through HTML TIDY, an open
source utility that will turn HTML documents into XHTML. Then you're back to
simply transforming it to whatever you want.

I know you said that you, or at least the tutor, are not able to influence or change the way the document is laid out, but it may be possible to actually get access to the source information, then generate a document in a suitable
layout from the beginning. Obviously not sure of the environment you're
working in but I find that if you approach those responsible for the
document in an open and constructive manner and explain the problem, you may
find they're happy to do what they can and probably weren't aware of the
issue in the first place. Failing that, throw the DDA at them!! (just
joking!!)

Cheers
ian

-----Original Message-----
From: program-l-bounce@xxxxxxxxxxxxx [mailto:program-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Christopher Edwards
Sent: 08 October 2009 14:45
To: program-l@xxxxxxxxxxxxx
Subject: [program-l] Changing Document Layout in Word using VBA

Hi,

As promised, here is my first question. I am sorry if the message seems
long.

A friend of mine has asked me if I can automate the conversion of a lot of
documents so that they are more easily read by his students using screen
readers. The source of the documents is outside his control and his efforts
to get the layout changed at source have been time consuming and
unsuccessful so far.

Here is a brief description of the situation.

Each document consists of one page.

The layout of the page has been achieved by putting all the text (and one
graphic) inside text boxes.

At the top of the page there is some heading information and there is also a
small amount of information at the bottom of the page.

The middle of the page consists of two columns of five text boxes reading
from top to bottom and then starting at the top of the second column. The
text in each box starts with a number from 1 to 10. These boxes are much
larger than the others.

This is what I have done so far.

I have written code which displays an Open File dialog, opens the selected
file and opens a new document to contain the reformatted text.

I then loop through the text boxes, extract the text, insert a blank line at
the end of the new document and then paste the text after that. Finally, I
save the new document.

This works fine but, although the text from each box is correctly formatted
and in the correct sequence the groups themselves are in an unpredictable
order.

I originally thought this would be all right as my friend is sighted and I
thought it would be quite easy for him to rearrange the text. This is the
case but new documents may come out during the year and the whole lot is
replaced each year. So, I want to try and reduce his workload if I can.

There is nothing unique about the Text Box names and many have the same
name. I am wondering if it would be possible to identify their position and change this to a sort key which I could write to a collection or array to be sorted into the correct order before pasting the text into the new document.
I have studied the Help system but have not found anything that seems to
indicate how I might do this so am at the limits of my knowledge.
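
A minimal sketch of the position-based sort key described above, written in Python only so that it can be run outside Word. The `Box` record, the `column_split` threshold and the sample coordinates are all hypothetical. In Word VBA the two numbers would presumably come from each shape's `Left` and `Top` properties. The sketch covers only the two columns of main boxes and assumes every box in the left column has a smaller `left` value than any box in the right column.

```python
from dataclasses import dataclass

@dataclass
class Box:
    left: float   # horizontal offset of the box (hypothetical sample data)
    top: float    # vertical offset of the box
    text: str     # text extracted from the box

def reading_order(boxes, column_split):
    """Sort boxes column by column, top to bottom within each column."""
    def key(box):
        # Column 0 is left of the split, column 1 is right of it.
        column = 0 if box.left < column_split else 1
        return (column, box.top)
    return sorted(boxes, key=key)

# Boxes arrive in an unpredictable order, as in the original problem.
sample = [
    Box(left=300, top=100, text="right column, upper box"),
    Box(left=20, top=250, text="left column, lower box"),
    Box(left=300, top=250, text="right column, lower box"),
    Box(left=20, top=100, text="left column, upper box"),
]

for box in reading_order(sample, column_split=200):
    print(box.text)
```

The key is a pair (column, top), so the whole left column sorts ahead of the right column and each column reads from top to bottom.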

I have also thought of another method which is rather a brute force
solution. I could loop through the boxes and use the coordinates of each box
to create a sort key which I could use to create an array I can sort. I
could use the dimensions of the boxes to identify the 10 big boxes and the
order they should be in. I could then loop through the sorted array and each
time loop through the text boxes to find the next box to have its contents
pasted into the new document. I could delete the text box once I have pasted
its contents so I don't have to loop through it next time. If I could
interrogate the contents of the text in the text boxes this would help. If I could identify the number at the beginning of each of the main boxes I could tell the position where it needed to be placed. However, although I can copy and paste the text I have found no way to examine it, although I am sure this
must be possible.
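
The idea of reading the number at the start of each main box can be sketched as follows, again in Python so it can be tried outside Word. The function names and sample data are hypothetical. In Word VBA the text of a box would presumably be read from the shape's `TextFrame.TextRange.Text` rather than copied and pasted. The sketch assumes heading and footer text never starts with a digit. If it can (a date, say), the box dimensions mentioned above would be needed to tell the main boxes apart.

```python
import re

def leading_number(text):
    """Return the integer at the start of text, or None if there is none."""
    match = re.match(r"\s*(\d+)", text)
    return int(match.group(1)) if match else None

def arrange(boxes):
    """boxes is a list of (top, text) pairs; return the texts in reading order."""
    numbered_tops = [top for top, text in boxes
                     if leading_number(text) is not None]
    first_main = min(numbered_tops, default=0)

    def key(box):
        top, text = box
        number = leading_number(text)
        if number is not None:
            return (1, number)   # main boxes: ordered by their own number
        if top < first_main:
            return (0, top)      # heading boxes sit above the main boxes
        return (2, top)          # footer boxes sit below them
    return [text for _, text in sorted(boxes, key=key)]

# Hypothetical page: one heading, three main boxes, one footer.
sample = [
    (500, "Footer"),
    (120, "2 second box"),
    (10, "Heading"),
    (100, "1 first box"),
    (100, "10 tenth box"),
]

for text in arrange(sample):
    print(text)
```

Sorting on the parsed integer rather than on the raw text keeps box 10 after box 2, which a plain string sort would not.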

This sounds extremely complex and I learned in my days of programming
mainframes many years ago that if things seemed to be getting unduly
complicated I was probably missing something obvious.

If anyone can give me any clues as to what I might be able to do I would be
very grateful. I am using Office XP.

Many thanks,

Chris Edwards






** To leave the list, click on the immediately-following link:-
** [mailto:program-l-request@xxxxxxxxxxxxx?subject=unsubscribe]
** If this link doesn't work then send a message to:
** program-l-request@xxxxxxxxxxxxx
** and in the Subject line type
** unsubscribe
** For other list commands such as vacation mode, click on the
** immediately-following link:-
** [mailto:program-l-request@xxxxxxxxxxxxx?subject=faq]
** or send a message, to
** program-l-request@xxxxxxxxxxxxx with the Subject:- faq

