Re: MSHTML DocType object? Can't get the string
- From: "Octavian Rasnita" <orasnita@xxxxxxxxx>
- To: <programmingblind@xxxxxxxxxxxxx>
- Date: Sun, 30 Sep 2007 10:29:42 +0300
Hi,
I don't know the general format for the document type definitions in order
to be able to give you a good regexp, but if all the definitions look like
the following, then it wouldn't be too hard to create a regexp:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
If you want to get the entire content of this definition (without the
opening < and closing >, and without !DOCTYPE), you would need to use a
regexp like:
/<!DOCTYPE\s+([^>]+)>/gs
In Perl regexps, \s matches any whitespace character (space, tab or
newline), so you could replace it with something stricter if you want
(a single space, for example).
C# uses Perl-style regular expressions, but I don't know how it is in VB.
Anyway, it shouldn't be very complicated.
If I did it in Perl, and the content of that page were in the $content
variable, I'd do it like this:
my ($doctype) = $content =~ /<!DOCTYPE\s+([^>]+)>/gs;
And $doctype should contain:
html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
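For anyone not working in Perl, the same extraction can be sketched in
Python. This is only an illustration under assumptions: the name
extract_doctype and the sample page string are made up for the example,
and the IGNORECASE flag is an addition (HTML pages sometimes write the
keyword in lowercase).

```python
import re

# Same pattern as the Perl one above. IGNORECASE is an added assumption,
# so that "<!doctype ...>" is matched as well.
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+([^>]+)>", re.IGNORECASE)


def extract_doctype(content):
    """Return the text between '<!DOCTYPE ' and '>', or None if absent."""
    match = DOCTYPE_RE.search(content)
    return match.group(1) if match else None


# Made-up sample page, used only to exercise the function.
page = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"\n'
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n'
    "<html><head><title>t</title></head><body></body></html>"
)
print(extract_doctype(page))
```

Because [^>]+ also matches newlines, a declaration wrapped over two lines
is captured whole, line break included.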
It would also be pretty simple to get the individual parts if you need them.
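As a rough idea of what splitting the captured text into its parts could
look like, here is a small Python sketch. The helper name split_doctype
and the dictionary keys are hypothetical, and the sketch assumes the
identifiers are enclosed in double quotes, as in the example above.

```python
import re


def split_doctype(doctype):
    """Split captured doctype text into root name, keyword and quoted ids.

    Hypothetical helper; only double-quoted identifiers are recognised.
    """
    # First two whitespace-separated tokens: root element and keyword.
    tokens = doctype.split(None, 2)
    root = tokens[0] if tokens else None
    keyword = tokens[1].upper() if len(tokens) > 1 else None
    # Every double-quoted string, in order (public id, then system id).
    ids = re.findall(r'"([^"]*)"', doctype)
    return {"root": root, "keyword": keyword, "ids": ids}


# The doctype text from the example above, line break included.
sample = (
    'html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"\n'
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"'
)
parts = split_doctype(sample)
print(parts["root"], parts["keyword"])
for identifier in parts["ids"]:
    print(identifier)
```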
But anyway, for web page validation I think there are ready-made Perl
modules that do a much better and more complete job.
Octavian
----- Original Message -----
From: "Bryan Garaventa" <bryan@xxxxxxxxxxxxxxxxxxx>
To: <programmingblind@xxxxxxxxxxxxx>
Sent: Sunday, September 30, 2007 9:46 AM
Subject: Re: MSHTML DocType object? Can't get the string
Which regular expressions would that be? This is part of the problem: I
need to capture the entire source code of a page programmatically.
This is easy to do for the HTML tag, which would be simply
Dim Source as String = DocumentObject.documentElement.outerHTML
However, this doesn't include the DocType declaration if one is present on
the page, which is fairly critical during HTML validation.
If there is a regular expression for this, that would be great. I don't
know what it is though.
----- Original Message -----
From: "Octavian Rasnita" <orasnita@xxxxxxxxx>
To: <programmingblind@xxxxxxxxxxxxx>
Sent: Saturday, September 29, 2007 10:30 PM
Subject: Re: MSHTML DocType object? Can't get the string
Wouldn't it be simpler and more reliable to use regular expressions for this?
Octavian
----- Original Message -----
From: "Bryan Garaventa" <bryan@xxxxxxxxxxxxxxxxxxx>
To: <programmingblind@xxxxxxxxxxxxx>
Sent: Sunday, September 30, 2007 7:46 AM
Subject: MSHTML DocType object? Can't get the string
Does anyone know how to get the string value out of the Document.DocType
object using VB?
I assumed that a string value would be returned when this was called, but
no luck. For instance,
' Assume that DocumentObject is the current Document object for the given
' web page.
Dim DocType as String = DocumentObject.DocType
This looks a little different in VB6, but you get the idea. Well, it
doesn't work. No string comes back, because an object of type DocumentType
is returned instead.
However, there doesn't appear to be any way of declaring a variable of
type DocumentType in VB, which is presumably the type that
DocumentObject.DocType returns.
I just need to get the actual string for the DocType tag itself from the
document object using VB.
Any ideas how I might be able to do this?
Thanks a million,
Bryan Garaventa
__________
View the list's information and change your settings at
http://www.freelists.org/list/programmingblind