[mso] Re: Exporting Messages

I've got a pretty good version at this point. There are a few stray
binary characters floating in and out of the results depending on how
Outlook puts out the MSG file. If I could get solid documentation on
this type of file (MSG COMMON DOCUMENT), I'd use Extended MAPI, from
which this format comes, to dump the text.

Unfortunately, that documentation is weak, so I'm left with string
comparisons and regular expressions to determine what's valid text
content and what isn't. One other thing I WON'T BE SUPPORTING in this
script: Extended ASCII characters (characters 128 through 255). Text
renderers like the FileSystemObject can't reliably tell a genuine
character 234 from the same byte showing up as part of a
Hex/Bin/Octal/Unicode expression. Everything from 0 through 127 is
standard ASCII (and, therefore, English only), which should cover most
of us...including the Xenophobes.<g> Finally, all header info is
preserved for those of you who want to actively pursue spammers and the
relays they use. Any RFC2822
compatible mail server (Gee, that almost rules out Notes and
Exchange!<g>) will have the headers in the MSG file and I've taken care
to NOT destroy that info because it's "Good Stuff, man!"
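
For anyone who wants to see the idea rather than take my word for it,
here is a rough sketch in Python (NOT the WSF script itself, and not how
the script is actually written). It drops every byte outside plain ASCII
and then removes anything that looks like an HTML tag. The function name
and the tag pattern are made up for illustration; the pattern refuses to
match anything containing "@" on the assumption that header values such
as <user@example.com> should survive.

```python
import re

# Heuristic (an assumption, not the real script's rule): a tag starts with a
# letter, optionally after "/", and contains no "@". That leaves header values
# such as <user@example.com> or Message-ID strings untouched.
TAG_RE = re.compile(r"</?[A-Za-z][^<>@]*>")


def strip_binary_and_html(raw: bytes) -> str:
    """Return the ASCII text of raw with binary bytes and HTML tags removed."""
    # Keep printable ASCII (32-126) plus tab, line feed and carriage return.
    # Bytes 128-255 (Extended ASCII) and control characters are dropped.
    kept = bytes(b for b in raw if 32 <= b <= 126 or b in (9, 10, 13))
    text = kept.decode("ascii")
    # Remove anything that matches the tag heuristic above.
    return TAG_RE.sub("", text)


if __name__ == "__main__":
    sample = b"Received: from <a@b.com>\r\n<p>Hi\xea there</p>\x00"
    print(strip_binary_and_html(sample))
```

A pattern this simple will miss HTML comments and tags that span odd
binary debris, which is the same kind of stray-character trouble
described above.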

Operation is simple, too. All you have to do with WinNT/2000/XP is drag
the folder containing the messages you want to parse and drop it on the
WSF (Windows Script File) that is the parsing script. It will hit each
file and create a new version, stripped of binary and HTML, with the
same file name plus "NoHTML.txt" in the directory. It will do the same
for all files in all subdirectories.
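
The folder walk described above can be sketched like this, again in
Python rather than WSF and only as an illustration. The names
process_folder and ascii_only are hypothetical, the cleaner is a bare
placeholder, and I'm assuming the suffix is tacked onto the full file
name (so "a.msg" becomes "a.msgNoHTML.txt") and that output from an
earlier run should be skipped.

```python
import os

SUFFIX = "NoHTML.txt"  # appended to the original file name


def ascii_only(raw: bytes) -> str:
    """Placeholder cleaner: keep printable ASCII plus tab, LF and CR."""
    kept = bytes(b for b in raw if 32 <= b <= 126 or b in (9, 10, 13))
    return kept.decode("ascii")


def process_folder(root: str, clean=ascii_only) -> list:
    """Write a cleaned copy beside every file under root; return the new paths."""
    written = []
    # os.walk visits root and every subdirectory below it.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(SUFFIX):
                continue  # assumption: skip output left by an earlier run
            source = os.path.join(dirpath, name)
            target = source + SUFFIX
            with open(source, "rb") as handle:
                text = clean(handle.read())
            # newline="" writes line endings exactly as they appear in text.
            with open(target, "w", encoding="ascii", newline="") as out:
                out.write(text)
            written.append(target)
    return written


if __name__ == "__main__":
    import sys
    for path in process_folder(sys.argv[1]):
        print(path)
```

Run it with the folder path as the only argument. The originals are
never modified; each cleaned copy lands next to its source file.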

If you're on a 9x system (including ME), however, the drag and drop
method won't work since there is no file association on those systems
for VBS, JS and WSF files. Instead, open a command prompt and type:

    Cscript "<drive>:\<path>\StripHTML.wsf" "<path of folder to process>"

...where <drive> is the drive on which the script resides and <path> is
the folder path to the script. <path of folder to process> is the drive
and path of the folder containing all the messages, and this path
should be surrounded in double quotes (").

If you're not yet familiar with Cscript and Wscript (the Windows Script
Hosts), I highly recommend you hoof it on over to
http://msdn.microsoft.com/scripting and start giving it a look see. The
path from VBA ops to using the Windows Script Host is nearly painless
and is, in a lot of ways, easier to learn. I'll also recommend that you
at least download the documentation (a pretty well written Help file) as
much of what's in there is 100% portable to VB/VBA and it's one helluva
lot easier to work with than the Office VBA Help. Can you believe it?

When you save the file to disk, be sure to save it as StripHTML.wsf.

I'll pass it to Dian so that she can put it wherever she's going to put
it.<g> I expect she'll let you all know when it's there and how to get
it.

Greg Chapman
http://www.mousetrax.com 
"Counting in binary is as easy as 01, 10, 11!
With thinking this clear, is coding really a good idea?"


> -----Original Message-----
> From: mso-bounce@xxxxxxxxxxxxx 
> [mailto:mso-bounce@xxxxxxxxxxxxx] On Behalf Of Dian Chapman
> Sent: Tuesday, July 30, 2002 10:23 PM
> To: mso@xxxxxxxxxxxxx
> Subject: [mso] Re: Exporting Messages
> 
> 
> 
> True...it's gonna be a messy txt file. Greg is actually 
> working on some code to try to clear the binary/html junk outta there.
> 
> If he gets a script to work, we'll post it on the MouseTrax 
> download page at www.mousetrax.com/downloads.html and let you 
> know here, too.
> 
> Dian Chapman
> Technical Consultant, Instructor,
> Microsoft MVP & TechTrax Editor
> 
> Tutorial web site: http://www.mousetrax.com/techpage.html
> TechTrax Ezine: http://www.mousetrax.com/techtrax/
> Instructor: http://www.eclecticacademy.com/faculty.htm
> Word MVP support site: http://www.mvps.org/word
> Microsoft support groups: http://support.microsoft.com

