MartyConnelly
martyconnelly at shaw.ca
Mon May 21 13:43:28 CDT 2007
You can convert text or XML files from encodings such as ISO-8859-1
(and so on through ISO-8859-11) to UTF-8 or UTF-16. You can also specify
Windows code pages such as Chinese Big-5; see Charset in the registry.
This can be done via ADO streams and Charsets.

Sub ReadAnsiSaveFileInUTF8()
    ' ReadToFile / SaveToFile snippet
    ' Used ADO 2.7
    Dim stm As ADODB.Stream
    Dim strPath As String
    Dim strData As String

    'strPath = GetPath(CurrentDb.Name) ' or whatever your path is
    strPath = "C:\Access files\ADO\"

    Set stm = New ADODB.Stream
    stm.Open

    'The character set names for the machine are in the registry.
    'For a list of the character set strings known by a system, see
    'the subkeys of HKEY_CLASSES_ROOT\MIME\Database\Charset
    'in the Windows Registry.
    'The name may be case specific; depending on the code page you
    'may have to use iso-8859-1 instead.
    stm.Charset = "ascii"
    stm.Position = 0
    stm.Type = adTypeText
    stm.LoadFromFile strPath & "Readme.txt"

    'If you just try to dump out the stream with Save, without
    'reading and writing, you get a double BOM and a clobbered file.
    stm.Position = 0
    strData = stm.ReadText()
    Debug.Print strData

    stm.Position = 0 'reset to beginning
    stm.Charset = "UTF-8"
    stm.WriteText strData
    stm.SaveToFile strPath & "ReadmeUTF8.txt", adSaveCreateOverWrite

    stm.Close
    Set stm = Nothing
End Sub

Arthur Fuller wrote:

>100% correct, Stuart. I recently published an article about exactly this at
>TechRepublic.com. It doesn't apply specifically to Access; it was written
>for the SQL Server crowd; but it may be convertible. No promises. I dealt
>solely with the SQL 2000/2005 cases. In theory, the logic ought to work, but
>I haven't tested it there.
>
>Visit www.techrepublic.com and search for stuff by me; it ought to be the
>first or second or third reference.
>
>A.
>
>On 5/20/07, Stuart McLachlan <stuart at lexacorp.com.pg> wrote:
>
>>On 20 May 2007 at 13:20, Jim Lawrence wrote:
>>
>>>Hi All:
>>>
>>>I have two questions. They are both related.
>>>A client has approached me with a particular project and I am
>>>wondering if anyone has experience with the following:
>>>
>>>1. Double-byte Character Sets; using them with Word documents and
>>>Access databases.
>>
>>A real PITA.
>>
>>From http://msdn2.microsoft.com/en-us/library/ms776454.aspx
>>
>>"Note: New Windows applications should use Unicode to avoid the
>>inconsistencies of varied code pages and for ease of localization.
>>However, some legacy protocols might require the use of DBCS code
>>pages. Each DBCS code page supports different characters, but no page
>>supports the full breadth of characters provided by Unicode. Each DBCS
>>code page supports a different subset, differently encoded. Data
>>converted from one DBCS code page to another is subject to corruption
>>because the same data value on different code pages can encode a
>>different character. Data converted from Unicode to DBCS is subject to
>>data loss, because a given code page might not be able to represent
>>every character used in that particular Unicode data."
>>
>>--
>>Stuart

--
Marty Connelly
Victoria, B.C.
Canada
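
The two failure modes in the MSDN note quoted above (data loss going from Unicode to a DBCS code page, and corruption when the same bytes are read under a different code page) are easy to see in a few lines. What follows is only a rough sketch in Python rather than the VBA/ADO used in this thread. The sample strings, the helper names to_code_page and round_trips, and the choice of the "big5" and "gb2312" codecs are all assumptions made for illustration, not anything taken from the original posts.

```python
# Illustrative sketch only: Python codecs stand in for Windows code pages.
# SAMPLE, to_code_page and round_trips are hypothetical names for this example.

SAMPLE = "中文 한글"  # Chinese text plus Korean Hangul (Hangul is not in Big5)


def to_code_page(text: str, codec: str) -> bytes:
    """Encode text, substituting '?' for characters the code page lacks."""
    return text.encode(codec, errors="replace")


def round_trips(text: str, codec: str) -> bool:
    """Return True if text survives encode -> decode through the codec."""
    return to_code_page(text, codec).decode(codec) == text


if __name__ == "__main__":
    # Unicode -> DBCS: the Hangul characters cannot be represented in Big5.
    print("big5 :", round_trips(SAMPLE, "big5"))
    # Unicode -> UTF-8: nothing is lost.
    print("utf-8:", round_trips(SAMPLE, "utf-8"))

    # Same byte values, different code pages, different characters.
    raw = "中".encode("big5")
    print("as big5  :", raw.decode("big5"))
    print("as gb2312:", raw.decode("gb2312"))
```

This fits the advice in the note: converting to UTF-8 (as the ADO stream snippet does) keeps everything, while going through a DBCS code page only round-trips text that the code page happens to cover.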