Darren DICK
darrend at nimble.com.au
Sat Mar 4 05:57:14 CST 2006
Hi Marty
Thanks for the response
Way over my head
I have forwarded it to my SQL gurus (Soon to be XML gurus I guess)
Many thanks
Darren
------------------------------
T: 0424 696 433
-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of MartyConnelly
Sent: Saturday, 4 March 2006 7:00 AM
To: Access Developers discussion and problem solving
Subject: Re: [AccessD] A2000: XML Q
Just some thoughts on this
The XML PI (processing instruction) statement
<?xml version="1.0" encoding="iso-8859-1"?>
is not required if the XML file is UTF-8 or UTF-16, i.e. the file starts with a
proper BOM, at least for MS XML parsers. I wish MS would spell out all of its
XML defaults; they assume developers know by osmosis.
Watch out for encoding statements: they assume you know where the file is
coming from. If the encoding is not specified, the parser assumes UTF-8 and/or
will do a BOM check. If you add an encoding as above, you are stating that the
file was originally created as ANSI "iso-8859-1" Western European.
Any characters outside this range will be invalid or may not be interpreted
correctly. It used to do funny things to Euro and UK pound symbols.
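To make the encoding point concrete, here is a small language-neutral sketch (in Python rather than the VBA used elsewhere in this thread; the sample string and the helper name fits_latin1 are made up for illustration). It shows that the same characters become different bytes under iso-8859-1 and UTF-8, and what happens when the declared encoding does not match the bytes actually in the file.

```python
# Illustration only: the sample text and helper name are hypothetical.
text = "£ and é"

# The same characters are stored as different bytes in each encoding.
latin1_bytes = text.encode("iso-8859-1")   # one byte per character
utf8_bytes = text.encode("utf-8")          # two bytes each for £ and é

# Reading UTF-8 bytes while claiming iso-8859-1 garbles the non-ASCII characters.
misread = utf8_bytes.decode("iso-8859-1")


def fits_latin1(s):
    """Return True if every character in s exists in iso-8859-1."""
    try:
        s.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


print(latin1_bytes, utf8_bytes, misread)
print(fits_latin1("£"), fits_latin1("€"))
```

The pound sign has a code point in iso-8859-1 but the Euro sign does not, which is one reason a file declared as iso-8859-1 can misbehave with it.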
You can, however, change the PI on the fly with statements like:
pi = xmlDoc.createProcessingInstruction("xml", "version=\"1.0\"");
xmlDoc.insertBefore(pi, xmlDoc.childNodes.item(0));
If you edit an XML file in Notepad, watch out whether you save as ANSI or
UTF-8 (Unicode); it may cause grief by changing the BOM. In most cases US users
will get away with the iso-8859-1 encoding, but if you start bringing in
files from international sites or Unix boxes this will give you problems.
See info on xml encodings.
http://www.topxml.com/code/default.asp?p=3&id=v20010810181946
There are quick and dirty ways to bulk change encodings via ADO Stream
charsets; I posted some code in the archives.
There is a difference between well-formed and valid XML. Well-formed is a
syntax check on the XML (i.e. matching tags). Valid also means that the XML
data entities and attributes comply with an XML schema or DTD.
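As a rough illustration of the well-formed half of that distinction, here is a language-neutral sketch in Python (not MSXML; the function name check_well_formed is made up for this example). It only checks syntax. Python's standard library parser is non-validating, so a schema or DTD check would need a validating parser such as the MSXML one discussed in this thread.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if xml_text is well-formed, else (line, column, message).

    This is a syntax check only; it does not validate against a schema or DTD.
    """
    try:
        ET.fromstring(xml_text)
        return None
    except ET.ParseError as exc:
        line, column = exc.position  # line is 1-based, column is 0-based
        return (line, column, str(exc))


print(check_well_formed("<a><b></b></a>"))  # matching tags
print(check_well_formed("<a><b></a>"))      # <b> is never closed
```

A document can pass this check and still be invalid, for example when an element is spelled correctly but is not allowed by the schema.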
Here is some validation code that might help you out. The special error code
displays the XML character in error; I hated trying to count the error line and
position number in a file to determine the character in error.
There is also a routine to check the file's BOM marker.
'ValidXMLCheck "C:\XML\Gil Encodings\encUTF8_noBOM.xml"
Sub ValidXMLCheck(strxmlfilepath As String)
    Dim xmlMessage As MSXML2.DOMDocument40
    Dim oXMLError As IXMLDOMParseError
    Dim lngErrCode As Long

    Set xmlMessage = New MSXML2.DOMDocument40
    xmlMessage.async = False
    xmlMessage.validateOnParse = True 'true by default
    xmlMessage.resolveExternals = False
    'Set xmlMessage.schemas = xmlSchema
    'After loading the XML document, call the Validate method of the
    'DOMDocument. If there is an error validating against the schema,
    'there will be a parse error:
    xmlMessage.Load (strxmlfilepath)
    lngErrCode = xmlMessage.validate()
    If xmlMessage.parseError.errorCode <> 0 Then
        Debug.Print " Reason: " & xmlMessage.parseError.reason
        Set oXMLError = xmlMessage.parseError
        reportParseError oXMLError
    Else
        Debug.Print strxmlfilepath & " file OK"
    End If
End Sub
Public Function reportParseError(err As IXMLDOMParseError)
    'this is not set up to count tabs used as whitespace
    Dim s As String
    Dim r As String
    Dim i As Long

    'build a run of spaces so the caret lines up under the bad character
    s = ""
    For i = 1 To err.linepos - 1
        s = s & " "
    Next
    r = "XML Error loading " & err.url & " * " & err.reason
    Debug.Print r
    'show character position of error; tired of counting on screen
    If (err.Line > 0) Then
        r = "at line " & err.Line & ", character " & err.linepos & vbCrLf & _
            err.srcText & vbCrLf & s & "^"
    End If
    Debug.Print r
    Debug.Print "url=" & err.url & vbCrLf
End Function
Sub CheckBOM(Optional strFileIn As Variant, Optional strIn As Variant)
    'checkbom "C:\XML\Gil Encodings\encUTF8_NoDecl.xml"
    On Error GoTo Err_handler
    Dim strInputData As String * 4
    Dim lpBuffer() As Byte
    Dim intFreeFile As Integer

    If Not IsMissing(strFileIn) Then
        intFreeFile = FreeFile
        Open strFileIn For Binary Access Read Lock Read As #intFreeFile Len = 4
        ReDim lpBuffer(0 To 3) 'only the first four bytes are examined
        Get #intFreeFile, , lpBuffer
        Close #intFreeFile
    ElseIf Not IsMissing(strIn) Then
        'Can't make this work since VBA is always converting the string to UTF-16LE
        lpBuffer = Left$(strIn, 4)
    Else
        MsgBox "Nothing To Do"
        Exit Sub
    End If

    If lpBuffer(0) = 255 And lpBuffer(1) = 254 Then
        Debug.Print "File is UTF-16 Little Endian"
    ElseIf lpBuffer(0) = 254 And lpBuffer(1) = 255 Then
        Debug.Print "File is UTF-16 Big Endian"
    ElseIf lpBuffer(0) = 239 And lpBuffer(1) = 187 And lpBuffer(2) = 191 Then
        Debug.Print "File is UTF-8"
    'Start trying to figure out by other means; this will only work on xml files
    'that start with "<?" (60 is "<", 63 is "?")
    ElseIf lpBuffer(0) = 60 And lpBuffer(1) = 0 And lpBuffer(2) = 63 And lpBuffer(3) = 0 Then
        Debug.Print "File is UTF-16 Little Endian"
    ElseIf lpBuffer(0) = 0 And lpBuffer(1) = 60 And lpBuffer(2) = 0 And lpBuffer(3) = 63 Then
        Debug.Print "File is UTF-16 Big Endian"
    ElseIf lpBuffer(0) = 60 And lpBuffer(1) = 63 Then
        Debug.Print "File can be in UTF-8, ASCII, ISO-8859-?, Shift-JIS, etc"
    Else
        Debug.Print "Can't seem to figure out the Character encoding"
    End If

Err_Exit:
    On Error Resume Next
    Close #intFreeFile
    Exit Sub
Err_handler:
    Select Case Err.Number
        Case Else
            MsgBox Err.Number & " - " & Err.Description
    End Select
    Resume Err_Exit
End Sub
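For anyone wanting to try the BOM check outside Access, the same byte tests can be sketched in a few lines of Python. This is an illustration only, not a port of the routine above: the function name sniff_encoding and the returned labels are made up, and like the VBA it ignores UTF-32 and only guesses when the data starts with "<?".

```python
import codecs


def sniff_encoding(first_bytes):
    """Guess an encoding family from the first few bytes of an XML file."""
    # Explicit byte order marks first.
    if first_bytes.startswith(codecs.BOM_UTF8):        # EF BB BF
        return "UTF-8 (BOM)"
    if first_bytes.startswith(codecs.BOM_UTF16_LE):    # FF FE
        return "UTF-16 Little Endian (BOM)"
    if first_bytes.startswith(codecs.BOM_UTF16_BE):    # FE FF
        return "UTF-16 Big Endian (BOM)"
    # No BOM: fall back to the byte pattern of a leading "<?".
    if first_bytes.startswith(b"<\x00?\x00"):
        return "UTF-16 Little Endian (no BOM)"
    if first_bytes.startswith(b"\x00<\x00?"):
        return "UTF-16 Big Endian (no BOM)"
    if first_bytes.startswith(b"<?"):
        return "ASCII-compatible (UTF-8, ISO-8859-?, Shift-JIS, etc)"
    return "unknown"


print(sniff_encoding('<?xml version="1.0"?>'.encode("utf-16-le")))
```

To check a real file, read its first four bytes in binary mode and pass them to the function.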
Jim DeMarco wrote:
>Darren,
>
>We've been using this:
>
><?xml version="1.0" encoding="iso-8859-1"?>
>
>But I'm pretty sure you can get by with just:
>
><?xml version="1.0"?>
>
>Once you navigate the file below this processing instruction you're on
>your own as far as defining elements etc.
>
>May I ask why the concern?
>
>HTH
>
>Jim
>
>-----Original Message-----
>From: accessd-bounces at databaseadvisors.com
>[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Darren DICK
>Sent: Wednesday, March 01, 2006 11:19 PM
>To: AccessD
>Subject: [AccessD] A2000: XML Q
>
>Hello all
>Cross Posted to dba_SQL List
>What is the minimum header information i need to include before 'my
>data' starts to get a 'Well formed' xml doc?
>Eg the stuff that looks like
><?xml version="1.0"?><Report xmlns="Invoice2"
>xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>
>
>
>etc etc
>
>
>
>many thanks
>
>DD
>
>--
>AccessD mailing list
>AccessD at databaseadvisors.com
>http://databaseadvisors.com/mailman/listinfo/accessd
>Website: http://www.databaseadvisors.com
>
>
--
Marty Connelly
Victoria, B.C.
Canada