[AccessD] Use Regex - Create Camel Case

max.wanadoo at gmail.com max.wanadoo at gmail.com
Sun Sep 30 03:52:58 CDT 2007


Hi Shamil,
Clearly your compiled solution is by far the quickest.
I have tried all sorts of VBA solutions, including XOR, IMP, EQV and other
bitwise approaches, but their overheads were considerable.
The best I can come up with in VBA is below.
One million iterations on my Dell Inspiron comes in at 3 min 52 secs.

If John didn't want to Hump it, then RegExpr appears to be the answer within
pure VBA 

Max

Function dbc2()
  Const conGoodChars As String = "abcdefghijklmnopqrstuvwxyz" ' valid characters
  ' Characters to strip; a space is also in this string. "!" is included
  ' because varStr(2) below contains one.
  Const conBadChars As String = "£$%^&*()_-+@'#~?><|\, !"
  Const conLoops As Long = 1000000
  Dim tStartTime As Date, tEndTime As Date, tLapsedTime As Date
  Dim iLoop As Long, iVars As Integer, iVarLoop As Integer
  Dim iLen As Integer, strTemp As String, bFlipCase As Boolean
  Dim str2Parse As String, strResult As String, strBit As String
  Dim varStr(5)
  varStr(1) = "John colby  "
  varStr(2) = "%idiotic_Field*name&!@"
  varStr(3) = " # hey#hey#Hey,hello_world$%#"
  varStr(4) = "@#$this#is_a_test_of_the-emerGency-broadcast-system"
  varStr(5) = "thisisastringwithnobadchars"
  iVars = 5
  tStartTime = Now
  For iLoop = 1 To conLoops
    For iVarLoop = 1 To iVars
      strResult = ""
      str2Parse = LCase(varStr(iVarLoop))
      str2Parse = UCase(Left(str2Parse, 1)) & Mid(str2Parse, 2)
      For iLen = 1 To Len(str2Parse)
        strBit = Mid(str2Parse, iLen, 1)
        If InStr(conBadChars, strBit) = 0 Then
          If bFlipCase = True Then strBit = UCase(strBit): bFlipCase = False
          strResult = strResult & strBit
        Else
          bFlipCase = True
        End If
      Next iLen
      'Debug.Print strResult
    Next iVarLoop
  Next iLoop
  tEndTime = Now
  tLapsedTime = tEndTime - tStartTime
  MsgBox tLapsedTime: Debug.Print tLapsedTime
End Function
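
For anyone without Access to hand, the inner loop above can be sketched in Python. This is only a rough port, not Max's code: the names `hump` and `BAD_CHARS` are made up for illustration, and the strip list here is assumed to include "!" so that the second test string comes out clean.

```python
# Hypothetical port of the VBA loop above; names are illustrative only.
BAD_CHARS = "£$%^&*()_-+@'#~?><|\\, !"  # characters to strip (space and "!" included)


def hump(text: str) -> str:
    """Strip BAD_CHARS and upper-case the character that follows each one."""
    result = []
    flip_case = True  # the first kept character is always upper-cased
    for ch in text.lower():
        if ch in BAD_CHARS:
            flip_case = True  # skip it and remember to capitalise the next one
        elif flip_case:
            result.append(ch.upper())
            flip_case = False
        else:
            result.append(ch)
    return "".join(result)


if __name__ == "__main__":
    for s in ["John colby  ", "%idiotic_Field*name&!@"]:
        print(repr(s), "=>", hump(s))
```

Like the VBA version, it lower-cases the whole string first, so mixed-case input such as "emerGency" is flattened before the humps are added.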
 

-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Shamil
Salakhetdinov
Sent: Sunday, September 30, 2007 9:01 AM
To: 'Access Developers discussion and problem solving'
Subject: Re: [AccessD] Use Regex - Create Camel Case

<<<
However for more complicated
string operations like validating an email address, a regex would be very
suitable and doable in one line vs. many, many lines the other way.
>>>
Hi Mike,

That's clear, but John's task is to get the speediest solution.


--
Shamil
 
-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Michael Bahr
Sent: Sunday, September 30, 2007 6:42 AM
To: Access Developers discussion and problem solving
Subject: Re: [AccessD] Use Regex - Create Camel Case

Hi Shamil, yes regexes are slower in .Net due, I believe, to all the object
overhead.  For simple string operations regexes would probably not be
efficient BUT would be easier to write.  However for more complicated string
operations like validating an email address, a regex would be very suitable
and doable in one line vs. many, many lines the other way.

Mike...
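
To illustrate the "one line" point, here is a minimal Python sketch of a regex email check. The pattern is a deliberately simplified assumption (some text, an "@", a dotted domain), not a full address validator, and the names `EMAIL_RE` and `looks_like_email` are invented for this example.

```python
import re

# Simplified pattern (an assumption for illustration): non-space text, "@",
# then a domain containing at least one dot. Much looser than the real grammar.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def looks_like_email(text: str) -> bool:
    """Return True when the whole string matches the simplified pattern."""
    return EMAIL_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["john@example.com", "not an email", "a@b"]:
        print(candidate, "->", looks_like_email(candidate))
```

Doing the same check by hand would need explicit searches for the "@", the dot and any whitespace, which is the "many, many lines" alternative.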

> Hi All,
>
> I wanted to note: I have seen somewhere an article about RegEx being
> considerably slower than a mere string comparison etc. I cannot find
> this article now, can you?
>
> Here is a similar article on ColdFusion and Java (watch line wraps) -
>
>
> http://www.bennadel.com/blog/410-Regular-Expression-Finds-vs-String-Finds.htm
>
> The info above should be also true for C#/VB.NET (just remember there 
> are no miracles in this world)...
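
The regex-versus-plain-string question can be measured rather than argued. Below is a minimal Python timing sketch, assuming a strip list borrowed from this thread; the function names and the iteration count are arbitrary choices, and no particular outcome is claimed, since the numbers depend on the machine and the runtime.

```python
import re
import timeit

STRIP = " %*$@!#&^_-,.;:()"  # assumed strip list, borrowed from the thread
# A run of strip characters, optionally followed by the next kept character.
RUN_THEN_CHAR = re.compile("[" + re.escape(STRIP) + "]+(.?)", re.DOTALL)


def with_regex(text: str) -> str:
    """Regex variant: replace each run of strip chars with the upper-cased next char."""
    out = RUN_THEN_CHAR.sub(lambda m: m.group(1).upper(), text.lower())
    return out[:1].upper() + out[1:]


def with_loop(text: str) -> str:
    """Plain loop variant: one pass with an upper-case flag."""
    out, upper = [], True
    for ch in text.lower():
        if ch in STRIP:
            upper = True
        elif upper:
            out.append(ch.upper())
            upper = False
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    sample = "@#$this#is_a_test_of_the-emergency-broadcast-system"
    for fn in (with_regex, with_loop):
        seconds = timeit.timeit(lambda: fn(sample), number=20_000)
        print(f"{fn.__name__}: {seconds:.3f}s for 20,000 calls")
```

Both functions are meant to return the same string, so only the timings differ between them.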
>
> John, this could be critical information for you because your
> computers process zillions of gigabytes of data - if that slowness of
> RegEx vs. string comparison proves to be true, then mere char/string
> comparison and simple iteration over the source string's char array
> could be the most effective solution, which will save you hours and
> hours of computing time:
>
> - define a 256-byte table (I guess you use extended ASCII (256 chars
> max) only, John - right?) with the chars to be stripped out marked by 1;
> - define an upperCase flag;
> - allocate a destination string as long as the source one - use
> StringBuilder;
> - iterate over the source string and use the current char's ASCII code
> as an index into the table mentioned above:
>     a) if the table's cell has a value > 0, then the source char should
> be stripped out/skipped; set upperCase flag = true;
>     b) if the table's cell has zero value and upperCase flag = true,
> then uppercase the current source char and copy it to the destination
> StringBuilder; set upperCase flag = false;
>     c) if the table's cell has zero value and upperCase flag = false,
> then lowercase the current source char and copy it to the destination
> StringBuilder;
>
>
> Here is C# code:
>
>
> using System;
> using System.Text;
>
> static class CamelCaseDemo
> {
>     // First entry is a single space; the rest are the chars to strip.
>     private static readonly string[] delimiters =
>         " |%|*|$|@|!|#|&|^|_|-|,|.|;|:|(|)".Split('|');
>     // One cell per extended-ASCII code (0..255); 1 marks a char to strip.
>     private static readonly byte[] sieve = new byte[256];
>     private static bool initialized = false;
>
>     static void JamOutBadChars()
>     {
>         if (!initialized)
>         {
>             foreach (string delimiter in delimiters)
>             {
>                 sieve[(int)delimiter[0]] = 1;
>             }
>             initialized = true;
>         }
>         string[] test = {"John colby  ",
>                 "%idiotic_Field*name&!@",
>                 " # hey#hey#Hey,hello_world$%#",
>                 "@#$this#is_a_test_of_the-emergency-broadcast-system          "};
>
>         foreach (string source in test)
>         {
>             StringBuilder result = new StringBuilder(source.Length);
>             bool upperCase = true;
>             foreach (char c in source)
>             {
>                 // Chars above 255 fall outside the table and are kept.
>                 if (c < sieve.Length && sieve[c] > 0) upperCase = true;
>                 else if (upperCase)
>                 {
>                     result.Append(char.ToUpper(c));
>                     upperCase = false;
>                 }
>                 else result.Append(char.ToLower(c));
>             }
>             Console.WriteLine(source + " => {" + result + "}");
>         }
>     }
>
>     static void Main() { JamOutBadChars(); }
> }
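
The table-driven steps a) to c) can also be sketched in Python for comparison. This is a hypothetical rendering rather than a translation of Shamil's code: the names `SIEVE`, `DELIMITERS` and `jam_out_bad_chars` are illustrative, and characters with codes above 255 are assumed to be valid.

```python
# Hypothetical Python rendering of the table-driven steps; names are illustrative.
DELIMITERS = " %*$@!#&^_-,.;:()"

# 256-entry table: 1 marks a character code that should be stripped.
SIEVE = bytearray(256)
for d in DELIMITERS:
    SIEVE[ord(d)] = 1


def jam_out_bad_chars(source: str) -> str:
    out = []
    upper_case = True
    for c in source:
        code = ord(c)
        if code < 256 and SIEVE[code]:
            upper_case = True      # step a: skip the char, raise the flag
        elif upper_case:
            out.append(c.upper())  # step b: first kept char after a skip
            upper_case = False
        else:
            out.append(c.lower())  # step c: every other kept char
    return "".join(out)


if __name__ == "__main__":
    for s in ["John colby  ", "%idiotic_Field*name&!@"]:
        print(s, "=> {" + jam_out_bad_chars(s) + "}")
```

The lookup costs one array index per character, whatever the number of delimiters, which is the point of using a table instead of searching a delimiter string.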
>
> --
> Shamil
>
>
> -----Original Message-----
> From: accessd-bounces at databaseadvisors.com
> [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Michael 
> Bahr
> Sent: Friday, September 28, 2007 10:25 PM
> To: Access Developers discussion and problem solving
> Subject: Re: [AccessD] Use Regex - Create Camel Case
>
> Hi John, here is one way to do it (although there are many ways to get 
> the same end result).  Mind you this is air code but hopefully should 
> be enough to get you going.  You will need to create the main loop 
> within your code.
>
> Create a list of all delimiters that are used in your CSV files, such as
>
>   delimiters = '%|*|$|@|!|#|&|^|_|-|,|.|;|:| '
>
> then run through your CSV files line by line, splitting each line into
> an array
>
>   thisarray = Split(line, delimiters)
>
> then run through the array performing a Ucase on the first letter of
> each word
>
>   newline = ""
>   For item = 1 To ubound
>     newline = newline & whatEverToCapFirstChar(thisarray(item))
>   Next item
>
> where ubound is the array size
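
The split-then-capitalise idea in the air code above can be sketched as runnable Python. The delimiter list is copied from the message; the names `SPLITTER` and `pascal_case` are invented for this sketch, and the use of `re.split` on a character class is an assumption about how the multi-delimiter split would be done.

```python
import re

DELIMITERS = "%*$@!#&^_-,.;: "
# A run of one or more delimiter characters counts as a single separator.
SPLITTER = re.compile("[" + re.escape(DELIMITERS) + "]+")


def pascal_case(line: str) -> str:
    """Split on delimiters, then upper-case the first letter of each word."""
    words = SPLITTER.split(line)
    # Empty strings appear when the line starts or ends with a delimiter.
    return "".join(w[:1].upper() + w[1:] for w in words if w)


if __name__ == "__main__":
    for s in ["John colby", "%idiotic_Field*name"]:
        print(pascal_case(s))
```

Only the first letter of each word is changed, so any capitals already inside a word are left alone.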
>
>
> Now here are two scripts that do the same thing, one is Perl and the 
> other is TCL.  Both of these languages are open source and free and 
> can be gotten at http://www.activestate.com/Products/languages.plex
>
> Perl:
>
> use strict;
> use warnings;
>
> # Character class of delimiters; a run of them counts as one separator.
> my $delimiters = qr/[: %*\$\@!#&^_,.-]+/;
> my @test = ("John colby",
>             "%idiotic_Field*name",
>             "hey#hey#Hey,hello_world",
>             "this#is_a_test_of_the-emergency-broadcast-system");
>
> foreach my $item (@test) {
>    my $temp = "";
>    my @list = split ($delimiters, $item);
>    foreach my $thing (@list) {
>       $temp .= ucfirst($thing);
>    }
>    print "$temp\n";
> }
>
> Result
> d:\Perl>pascalcase.pl
> JohnColby
> IdioticFieldName
> HeyHeyHeyHelloWorld
> ThisIsATestOfTheEmergencyBroadcastSystem
>
> TCL:
>
> set delimiters {%*$@!#&^_-,.;: }
> set test [list {John colby} \
>     {%idiotic_Field*name} \
>     {hey#hey#Hey,hello_world} \
>     {this#is_a_test_of_the-emergency-broadcast-system}]
>
> foreach item $test {
>    set str ""
>    # split treats its second argument as a set of one-char separators
>    set mylist [split $item $delimiters]
>    foreach thing $mylist {
>       set s [string totitle $thing]
>       set str "$str$s"
>    }
>    puts $str
> }
>
> Results
> D:\VisualTcl\Projects>tclsh pascalcase.tcl
> JohnColby
> IdioticFieldName
> HeyHeyHeyHelloWorld
> ThisIsATestOfTheEmergencyBroadcastSystem
>
>
> hth, Mike...
>
>
>> Folks,
>>
>> I am looking for a regex expression (preferably with explanation) for 
>> taking an expression and creating a camel case (or PascalCase) 
>> expression.
>>
>> I get CSV files with headers in them.  All too often the eejits that 
>> created the databases they came from used embedded spaces or other 
>> special use characters (!@#$%^&* etc) in their field names.  I need 
>> to strip these special characters out completely.  I also need to 
>> upper case the valid alpha character that follows any of these 
>> special characters.
>>
>> John colby becomes JohnColby
>> %idiotic_Field*name becomes IdioticFieldName
>>
>> Etc.
>>
>> It appears that Regex is the key (I am doing this in VB.Net) but 
>> until today I have never really tried to use RegEx and it ain't 
>> pretty!
>>
>> Any help in this would be much appreciated.
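
One possible regex-only take on the request above, sketched in Python rather than VB.Net. The pattern and the names `SPECIALS_THEN_CHAR` and `to_pascal_case` are assumptions for illustration; in particular, digits are treated as valid characters here, which the original request does not specify.

```python
import re

# [^A-Za-z0-9]+   one or more characters that are not letters or digits
# ([A-Za-z0-9]?)  optionally capture the valid character that follows the run
SPECIALS_THEN_CHAR = re.compile(r"[^A-Za-z0-9]+([A-Za-z0-9]?)")


def to_pascal_case(field_name: str) -> str:
    # Replace each run of special characters with the upper-cased next char.
    collapsed = SPECIALS_THEN_CHAR.sub(lambda m: m.group(1).upper(), field_name)
    # The very first character has no preceding run, so capitalise it here.
    return collapsed[:1].upper() + collapsed[1:]


if __name__ == "__main__":
    for name in ["John colby", "%idiotic_Field*name"]:
        print(name, "becomes", to_pascal_case(name))
```

The capture group is optional so that a run of special characters at the very end of a name is simply dropped.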
>>
>> John W. Colby
>> Colby Consulting
>> www.ColbyConsulting.com
>>
>> --
>> AccessD mailing list
>> AccessD at databaseadvisors.com
>> http://databaseadvisors.com/mailman/listinfo/accessd
>> Website: http://www.databaseadvisors.com
>>