Re: MSHTML DocType object? Can't get the string

Unfortunately I don't know anything about Perl, but I'm not sure if it would help in this case since I'm creating an ActiveX control to be initialized within a JAWS script.


I thought this would be simple. This is ridiculous.

There is a DocType property, which is supposed to be referenced from the DOM, as indicated at
http://msdn2.microsoft.com/en-us/library/ms531073.aspx

The DocType is part of the source code, so, logically, it should be accessible from the DOM as a string, right? Just like every other element is?

Apparently not, as indicated at
http://www.thescripts.com/forum/thread348964.html

That thread includes a solution along with the explanation. However, the solution appears to be written for VB.NET, and I have no idea how to convert it to VB6. When I try to do so directly, the code is full of build errors.

Does anyone know how to convert this to a VB6 format? Or know of a possible solution that may work in VB6 that I could try instead?

At this point, I'm willing to try anything... I would even do a rain dance by firelight with the fabled carrot of good fortune while making strange noises by moonlight... But I doubt it would work.

Best wishes,

Bryan

----- Original Message ----- From: "Octavian Rasnita" <orasnita@xxxxxxxxx>
To: <programmingblind@xxxxxxxxxxxxx>
Sent: Sunday, September 30, 2007 12:29 AM
Subject: Re: MSHTML DocType object? Can't get the string


Hi,

I don't know the general format of document type definitions well enough to give you a good regexp, but if all the definitions look like the following, then it wouldn't be too hard to create one:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

If you want to get the entire content of this definition (without the opening <, the closing >, and the !DOCTYPE keyword), you would need to use a regexp like:

/<!DOCTYPE\s+([^>]+)>/gs

In a Perl regexp, \s matches any whitespace character (space, tab or newline), so you could replace it with something stricter if you want (a single space, for example).

C# uses Perl-style regular expressions, but I don't know how it is in VB.
Anyway, it shouldn't be very complicated.

If I were doing it in Perl, and the content of the page were in the $content variable, I'd do it like this:

my ($doctype) = $content =~ /<!DOCTYPE\s+([^>]+)>/gs;

And $doctype should contain:

html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"

It would also be pretty simple to get the individual parts if you need them.
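
For anyone who wants to see the idea end to end, here is a minimal sketch in Python rather than Perl or VB. It assumes the page source is already in a string, and that the declaration has no internal subset (a doctype containing a '>' inside square brackets would defeat the [^>]+ pattern). The names extract_doctype, split_doctype and sample are made up for illustration and are not part of any library.

```python
import re

# Same pattern as the Perl regexp above; IGNORECASE also accepts "<!doctype".
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+([^>]+)>", re.IGNORECASE)
# Quoted strings inside the declaration (the public and system identifiers).
QUOTED_RE = re.compile(r'"([^"]*)"')


def extract_doctype(content):
    """Return the text between '<!DOCTYPE' and '>', or None if there is none."""
    match = DOCTYPE_RE.search(content)
    return match.group(1).strip() if match else None


def split_doctype(doctype):
    """Split a doctype body into (root element name, list of quoted identifiers)."""
    parts = doctype.split(None, 1)
    root = parts[0] if parts else ""
    return root, QUOTED_RE.findall(doctype)


# Hypothetical page source used only for this illustration.
sample = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n'
    "<html><head><title>t</title></head><body></body></html>"
)

doctype = extract_doctype(sample)
root, identifiers = split_doctype(doctype)
print(root)
print(identifiers)
```

The same two patterns should carry over to any regexp engine that supports Perl-style character classes, though the calls around them will differ.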

But anyway, for web page validation I think there are ready-made Perl modules that do a much better and more complete job.

Octavian

----- Original Message ----- From: "Bryan Garaventa" <bryan@xxxxxxxxxxxxxxxxxxx>
To: <programmingblind@xxxxxxxxxxxxx>
Sent: Sunday, September 30, 2007 9:46 AM
Subject: Re: MSHTML DocType object? Can't get the string


Which regular expressions would that be? This is part of the problem: I need to capture the entire source code of a page programmatically.

This is easy to do for the HTML tag, which would be simply

Dim Source as String = DocumentObject.documentElement.outerHTML

However, this doesn't include the DocType declaration if one is present on the page, which is fairly critical during HTML validation.

If there is a regular expression for this, that would be great. I don't know what it is though.


----- Original Message ----- From: "Octavian Rasnita" <orasnita@xxxxxxxxx>
To: <programmingblind@xxxxxxxxxxxxx>
Sent: Saturday, September 29, 2007 10:30 PM
Subject: Re: MSHTML DocType object? Can't get the string


Wouldn't it be simpler and more reliable to use regular expressions for this?

Octavian

----- Original Message ----- From: "Bryan Garaventa" <bryan@xxxxxxxxxxxxxxxxxxx>
To: <programmingblind@xxxxxxxxxxxxx>
Sent: Sunday, September 30, 2007 7:46 AM
Subject: MSHTML DocType object? Can't get the string


Does anyone know how to get the string value out of the Document.DocType object using VB?

I assumed that a string value would be returned when this was called, but no luck. For instance,

' Assume that DocumentObject is the current Document object for the given web page.

Dim DocType as String = DocumentObject.DocType

This looks a little different in VB6, but you get the idea. Well, it doesn't work. There is no string since an object of type DocumentType is being returned.

However, there doesn't appear to be any way of creating a variable of type DocumentType in VB, which is presumably being returned by DocumentObject.DocType.

I just need to get the actual string for the DocType tag itself from the document object using VB.

Any ideas how I might be able to do this?

Thanks a million,

Bryan Garaventa


__________
View the list's information and change your settings at http://www.freelists.org/list/programmingblind

