[AccessD] Use Regex - Create Camel Case

Shamil Salakhetdinov shamil at users.mns.ru
Fri Sep 28 16:49:40 CDT 2007


Hi All,

I wanted to note: I have seen an article somewhere saying that RegEx is
considerably slower than plain string comparison. I cannot find that
article now; can you?

Here is a similar article on ColdFusion and Java:

http://www.bennadel.com/blog/410-Regular-Expression-Finds-vs-String-Finds.htm

The info above should also be true for C#/VB.NET (just remember there are no
miracles in this world)...
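
For anyone who would rather measure than take an article's word for it, here
is a rough timing sketch. It is only an assumption about how one might compare
the two approaches: the class and method names (StripTiming, WithRegex,
WithLoop) are made up for illustration, the iteration count is arbitrary, and
the numbers will vary by machine and .NET version.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

static class StripTiming
{
    // Hypothetical helper: strips the unwanted chars with a compiled regex.
    static readonly Regex Strip =
        new Regex(@"[ %*$@!#&^_\-,.;:()]", RegexOptions.Compiled);

    public static string WithRegex(string s)
    {
        return Strip.Replace(s, "");
    }

    // Hypothetical helper: strips the same chars with a plain loop.
    public static string WithLoop(string s)
    {
        const string bad = " %*$@!#&^_-,.;:()";
        StringBuilder sb = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            if (bad.IndexOf(c) < 0) sb.Append(c);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        const string input = "@#$this#is_a_test_of_the-emergency-broadcast-system   ";
        const int iterations = 1000000; // arbitrary; adjust to taste

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) WithRegex(input);
        sw.Stop();
        Console.WriteLine("Regex: " + sw.ElapsedMilliseconds + " ms");

        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) WithLoop(input);
        sw.Stop();
        Console.WriteLine("Loop:  " + sw.ElapsedMilliseconds + " ms");
    }
}
```

Both helpers only strip characters (no case changes), so the comparison
isolates the matching cost and nothing else.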

John, this could be critical information for you because your computers
process a zillion gigabytes of data. If RegEx really does prove slower than
string comparison, then plain char/string comparison and a simple iteration
over the source string's char array could be the most efficient solution,
and could save you hours and hours of computing time:

- define a 256-byte table (I guess you use extended ASCII (256 chars
max) only, John - right?) with the chars to be stripped out marked by 1;
- define an upperCase flag, initially true;
- allocate a destination StringBuilder whose capacity equals the length of
the source string;
- iterate over the source string and use the current char's ASCII code as an
index into the table mentioned above:
    a) if the table cell has a value > 0, the source char is stripped
out/skipped; set upperCase flag = true;
    b) if the table cell is zero and upperCase flag = true, uppercase the
current source char and append it to the destination StringBuilder; set
upperCase flag = false;
    c) if the table cell is zero and upperCase flag = false, lowercase the
current source char and append it to the destination StringBuilder;


Here is C# code:


using System;
using System.Text;

class Program
{
    // Characters to strip; the first entry is a single space.
    private static readonly string[] delimiters =
        " |%|*|$|@|!|#|&|^|_|-|,|.|;|:|(|)".Split('|');
    // One cell per extended ASCII code (0..255); 1 marks a char to strip.
    private static readonly byte[] sieve = new byte[256];
    private static bool initialized = false;

    static void JamOutBadChars()
    {
        if (!initialized)
        {
            foreach (string delimiter in delimiters)
            {
                sieve[delimiter[0]] = 1;
            }
            initialized = true;
        }
        string[] test = {"John colby  ",
                "%idiotic_Field*name&!@",
                " # hey#hey#Hey,hello_world$%#",
                "@#$this#is_a_test_of_the-emergency-broadcast-system          "};

        foreach (string source in test)
        {
            StringBuilder result = new StringBuilder(source.Length);
            bool upperCase = true;
            foreach (char c in source)
            {
                // Chars above 255 fall outside the table and are never stripped.
                if (c < sieve.Length && sieve[c] > 0) upperCase = true;
                else if (upperCase)
                {
                    result.Append(char.ToUpper(c));
                    upperCase = false;
                }
                else result.Append(char.ToLower(c));
            }
            Console.WriteLine(source + " => {" + result + "}");
        }
    }

    static void Main()
    {
        JamOutBadChars();
    }
}

--
Shamil
 

-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Michael Bahr
Sent: Friday, September 28, 2007 10:25 PM
To: Access Developers discussion and problem solving
Subject: Re: [AccessD] Use Regex - Create Camel Case

Hi John, here is one way to do it (although there are many ways to get the
same end result).  Mind you this is air code but hopefully should be
enough to get you going.  You will need to create the main loop within
your code.

Create a list of all delimiters that are used in your CSV files, such as
delimiters = '%|*|$|@|!|#|&|^|_|-|,|.|;|:| '

then run through your CSV files line by line, splitting each line into an
array
thisarray = Split(line, delimiters)

then run through the array performing a Ucase on the first letter of each
word
newline = ""
For item = 0 To UBound(thisarray)
  newline = newline & whatEverToCapFirstChar(thisarray(item))
Next item

where UBound(thisarray) is the index of the last element of the array
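
The air code above could be sketched in C# roughly as follows. This is only
one possible reading of it: the class and method names (SplitPascalCase,
ToPascalCase) are invented for illustration, and the delimiter set is assumed
to be the one listed above.

```csharp
using System;
using System.Text;

static class SplitPascalCase
{
    // Assumed delimiter set, copied from the list above (including the space).
    static readonly char[] Delimiters = " %*$@!#&^_-,.;:".ToCharArray();

    // Splits on any delimiter, then upper-cases the first char of each word.
    public static string ToPascalCase(string line)
    {
        // RemoveEmptyEntries drops the empty strings produced by leading,
        // trailing, or consecutive delimiters.
        string[] words = line.Split(Delimiters, StringSplitOptions.RemoveEmptyEntries);
        StringBuilder newline = new StringBuilder(line.Length);
        foreach (string word in words)
        {
            newline.Append(char.ToUpper(word[0]));
            newline.Append(word.Substring(1)); // rest of the word is left as-is
        }
        return newline.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(ToPascalCase("John colby"));          // JohnColby
        Console.WriteLine(ToPascalCase("%idiotic_Field*name")); // IdioticFieldName
    }
}
```

Note that this version leaves the remaining letters of each word untouched, so
a header such as "FIELD_NAME" would come out as "FIELDNAME".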


Now here are two scripts that do the same thing, one in Perl and the other
in Tcl.  Both of these languages are open source and free and can be
downloaded from
http://www.activestate.com/Products/languages.plex

Perl:

use strict;
use warnings;

# Alternation of delimiters; regex metacharacters are escaped.
my $delimiters = qr/:| |;|%|\*|\$|\@|!|#|&|\^|_|-|,|\./;
my @test = ("John colby",
            "%idiotic_Field*name",
            "hey#hey#Hey,hello_world",
            "this#is_a_test_of_the-emergency-broadcast-system");

foreach my $item (@test) {
   my $temp = "";
   # Leading or repeated delimiters yield empty fields; ucfirst("") is "".
   my @list = split ($delimiters, $item);
   foreach my $thing (@list) {
      $temp .= ucfirst($thing);
   }
   print "$temp\n";
}

Result
d:\Perl>pascalcase.pl
JohnColby
IdioticFieldName
HeyHeyHeyHelloWorld
ThisIsATestOfTheEmergencyBroadcastSystem

TCL:

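# split treats every character of its second argument as a delimiter,
# so the characters are listed without separators.
set delimiters { %*$@!#&^_-,.;:}
set test [list {John colby} \
               {%idiotic_Field*name} \
               {hey#hey#Hey,hello_world} \
               {this#is_a_test_of_the-emergency-broadcast-system}]

foreach item $test {
   set str ""
   set mylist [split $item $delimiters]
   foreach thing $mylist {
      # string totitle upper-cases the first char and lower-cases the rest
      set s [string totitle $thing]
      set str "$str$s"
   }
   puts $str
}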

Results
D:\VisualTcl\Projects>tclsh pascalcase.tcl
JohnColby
IdioticFieldName
HeyHeyHeyHelloWorld
ThisIsATestOfTheEmergencyBroadcastSystem


hth, Mike...


> Folks,
>
> I am looking for a regex expression (preferably with explanation) for
> taking an expression and creating a camel case (or PascalCase) expression.
>
> I get CSV files with headers in them.  All too often the eejits that
> created the databases they came from used embedded spaces or other special
> use characters (!@#$%^&* etc) in their field names.  I need to strip these
> special characters out completely.  I also need to upper case the valid
> alpha character that follows any of these special characters.
>
> John colby becomes JohnColby
> %idiotic_Field*name becomes IdioticFieldName
>
> Etc.
>
> It appears that Regex is the key (I am doing this in VB.Net) but until
> today I have never really tried to use RegEx and it ain't pretty!
>
> Any help in this would be much appreciated.
>
> John W. Colby
> Colby Consulting
> www.ColbyConsulting.com
>
> --
> AccessD mailing list
> AccessD at databaseadvisors.com
> http://databaseadvisors.com/mailman/listinfo/accessd
> Website: http://www.databaseadvisors.com
>

