Darren DICK
darrend at nimble.com.au
Sat Mar 4 05:57:14 CST 2006
Hi Marty

Thanks for the response.
Way over my head. I have forwarded it to my SQL gurus (soon to be XML gurus, I guess).

Many thanks

Darren
------------------------------
T: 0424 696 433

-----Original Message-----
From: accessd-bounces at databaseadvisors.com [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of MartyConnelly
Sent: Saturday, 4 March 2006 7:00 AM
To: Access Developers discussion and problem solving
Subject: Re: [AccessD] A2000: XML Q

Just some thoughts on this.

The XML PI (processing instruction) statement

<?xml version="1.0" encoding="iso-8859-1"?>

is not required if the XML file is UTF-8 or UTF-16, i.e. the file starts with a proper BOM, at least for MS XML parsers. I wish MS would spell out all of its XML defaults; they assume developers know by osmosis.

Watch out for encoding statements: they assume you know where the file is coming from. If no encoding is specified, the parser assumes UTF-8 and/or will do a BOM check. If you add an encoding as above, you are stating that the file was originally created as ANSI "iso-8859-1" Western European. Any characters outside this range will be invalid or may not be interpreted correctly. It used to do funny things to Euro and UK pound symbols.

You can, however, change the PI on the fly with statements like

pi = xmlDoc.createProcessingInstruction("xml", "version=\"1.0\"");
xmlDoc.insertBefore(pi, xmlDoc.childNodes.item(0));

If you edit an XML file in Notepad, watch out whether you save as ANSI or UTF-8 (Unicode). It may cause grief by changing the BOM.

In most cases US users will get away with this type of encoding (iso-8859-1), but if you start bringing in files from international sites or Unix boxes this will give you problems.

See info on XML encodings:
http://www.topxml.com/code/default.asp?p=3&id=v20010810181946

There are quick and dirty ways to bulk change encodings via ADO Stream charsets; I posted some code in the archives.

There is a difference between well-formed and valid XML.
Well-formed is a syntax check on XML (i.e. matching tags).
Valid also means that XML data entities and attributes comply with an XML schema or DTD.

Here is some validation code that might help you out. The special error code displays the XML character in error; I hated trying to count the error line and position number in a file to determine the character in error. There is also a routine to check the file's BOM marker.

'ValidXMLCheck "C:\XML\Gil Encodings\encUTF8_noBOM.xml"
Sub ValidXMLCheck(strxmlfilepath As String)
    Dim xmlMessage As MSXML2.DOMDocument40
    Dim oXMLError As IXMLDOMParseError
    Dim lngErrCode As Long

    Set xmlMessage = New MSXML2.DOMDocument40
    xmlMessage.async = False
    xmlMessage.validateOnParse = True 'true by default
    xmlMessage.resolveExternals = False
    'Set xmlMessage.schemas = xmlSchema
    'After loading the XML document, call the Validate method of the
    'DOMDocument. If there is an error validating against the schema,
    'there will be a parse error:
    xmlMessage.Load (strxmlfilepath)
    lngErrCode = xmlMessage.validate()
    If xmlMessage.parseError.errorCode <> 0 Then
        Debug.Print " Reason: " & xmlMessage.parseError.reason
        Set oXMLError = xmlMessage.parseError
        reportParseError oXMLError
    Else
        Debug.Print strxmlfilepath & " file OK"
    End If
End Sub

Public Function reportParseError(err As IXMLDOMParseError)
    'this is not set up to count tabs used as whitespace
    Dim s As String
    Dim r As String
    Dim i As Long

    'build the padding that puts the caret under the offending character
    s = ""
    For i = 1 To err.linepos - 1
        s = s & " "
    Next
    r = "XML Error loading " & err.url & " * " & err.reason
    Debug.Print r
    'show character position of error; tired of counting on screen
    If (err.Line > 0) Then
        r = "at line " & err.Line & ", character " & err.linepos & vbCrLf & _
            err.srcText & vbCrLf & s & "^"
    End If
    Debug.Print r
    Debug.Print "url=" & err.url & vbCrLf
End Function

Sub CheckBOM(Optional strFileIn As Variant, Optional strIn As Variant)
    'checkbom "C:\XML\Gil Encodings\encUTF8_NoDecl.xml"
    On Error GoTo Err_handler
    Dim strInputData As String * 4
    Dim lpBuffer() As Byte
    Dim intFreeFile As Integer

    If Not IsMissing(strFileIn) Then
        intFreeFile = FreeFile
        Open strFileIn For Binary Access Read Lock Read As #intFreeFile Len = 4
        ReDim lpBuffer(3) 'four bytes, indexes 0 to 3
        Get #intFreeFile, , lpBuffer
        Close #intFreeFile
    ElseIf Not IsMissing(strIn) Then
        'Can't make this work since VBA is always converting the string to UTF-16LE
        lpBuffer = Left$(strIn, 4)
    Else
        MsgBox "Nothing To Do"
        Exit Sub
    End If

    If lpBuffer(0) = 255 And lpBuffer(1) = 254 Then
        Debug.Print "File is UTF-16 Little Endian"
    ElseIf lpBuffer(0) = 254 And lpBuffer(1) = 255 Then
        Debug.Print "File is UTF-16 Big Endian"
    ElseIf lpBuffer(0) = 239 And lpBuffer(1) = 187 And lpBuffer(2) = 191 Then
        Debug.Print "File is UTF-8"
    'No BOM: start trying to figure it out by other means. This will only
    'work on XML files that start with "<?" (60 = "<", 63 = "?")
    ElseIf lpBuffer(0) = 60 And lpBuffer(1) = 0 And lpBuffer(2) = 63 And lpBuffer(3) = 0 Then
        Debug.Print "File is UTF-16 Little Endian"
    ElseIf lpBuffer(0) = 0 And lpBuffer(1) = 60 And lpBuffer(2) = 0 And lpBuffer(3) = 63 Then
        Debug.Print "File is UTF-16 Big Endian"
    ElseIf lpBuffer(0) = 60 And lpBuffer(1) = 63 Then
        Debug.Print "File can be in UTF-8, ASCII, ISO-8859-?, Shift-JIS, etc"
    Else
        Debug.Print "Can't seem to figure out the Character encoding"
    End If

Err_Exit:
    On Error Resume Next
    Close #intFreeFile
    Exit Sub

Err_handler:
    Select Case Err.Number
        Case Else
            MsgBox Err.Number & " - " & Err.Description
    End Select
    Resume Err_Exit
End Sub

Jim DeMarco wrote:

>Darren,
>
>We've been using this:
>
><?xml version="1.0" encoding="iso-8859-1"?>
>
>But I'm pretty sure you can get by with just:
>
><?xml version="1.0"?>
>
>Once you navigate the file below this processing instruction you're on
>your own as far as defining elements etc.
>
>May I ask why the concern?
>
>HTH
>
>Jim
>
>-----Original Message-----
>From: accessd-bounces at databaseadvisors.com
>[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Darren DICK
>Sent: Wednesday, March 01, 2006 11:19 PM
>To: AccessD
>Subject: [AccessD] A2000: XML Q
>
>Hello all
>Cross Posted to dba_SQL List
>What is the minimum header information i need to include before 'my
>data' starts to get a 'Well formed' xml doc?
>Eg the stuff that looks like
><?xml version="1.0"?><Report xmlns="Invoice2"
>xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" >
>
>etc etc
>
>many thanks
>
>DD
>
>--
>AccessD mailing list
>AccessD at databaseadvisors.com
>http://databaseadvisors.com/mailman/listinfo/accessd
>Website: http://www.databaseadvisors.com
>

--
Marty Connelly
Victoria, B.C.
Canada