DWUTKA at marlow.com
Mon Oct 31 03:28:07 CST 2005
A few years ago, we had a thread about encryption. I had programmed an 'enigma wheel' encryption routine. There were pros and cons to it. The Enigma machine was developed in Germany after World War I and used heavily by the German military during World War II. It used mechanical wheels with 'mixed' lettering. It was a pretty neat machine.

Anyhow, that code was sitting and gathering dust until recently. I'm working on a pretty big project, and I was in need of some encryption capabilities. So I decided to blow the dust off of that Enigma Wheel code and make something workable out of it.

For starters, I wanted it as a class, which I tend to use more often now in development (one of the many things JC and I actually agree upon!). Initially, I used three classes, each class having a collection of the classes below it. Programmatically speaking, it was nice. Processing-wise, NO WAY. The first thing I need to do in the 'big' project I am working on is to 'decrypt' about 90k worth of data. In my tests of the initial 'three class' code, it was taking over a minute to encrypt or decrypt about 150k. WAY too long. So I dropped the end class; now I was running two classes, the main one holding a collection of the second one, and the second one had arrays instead of collections. That dropped my 'test time' (for the ~150k file) down to about 30 to 40 seconds. Still too slow. So then I converted the entire thing into one class, using nothing but arrays inside. That did the trick. It did the ~150k file in a flash. When encrypting or decrypting a file, the machine I am using (an 800 MHz PIII) takes about 2 to 3 seconds per megabyte. Not bad.

In the process of optimizing the code, I also had to overcome the obstacle of string 'recombination'. VB.Net has a class to do this (StringBuilder), but I am creating my big project in VB 6.0. Strings are easy to deal with, but if you are tearing them apart and rebuilding them, that can be time consuming.
Take for example this routine (sorry for the horrible naming convention....this is just an example):

    Private Sub Command3_Click()
        Dim strTemp As String
        Dim strTemp2 As String
        Dim i As Long
        Dim dtStart As Date

        ' Build a 100,000-character test string of all A's
        strTemp = String(100000, "A")
        strTemp2 = ""
        dtStart = Now
        ' Process one character at a time, appending to the result string
        For i = 1 To Len(strTemp)
            strTemp2 = strTemp2 & LCase(Mid(strTemp, i, 1))
        Next i
        MsgBox Format(Now - dtStart, "HH:NN:SS")
        MsgBox strTemp2
    End Sub

In the code above, on my 800 MHz machine, I get 34 seconds. Note that we don't start 'timing' until we get into the loop. The String function, which in this case is creating a string 100,000 characters long, of all A's, is negligible anyway. However, the For Next loop is where we are going to process the first string, and build the second string with the 'processed' characters. In this case, we are simply getting the lower case version of the character. (Yes, I know, we could just LCase the whole first string, but that isn't the point. What if we are swapping characters, such as in my encryption routine?)

Anyhow, the time we are taking isn't in the LCase and Mid statements, it's in the strTemp2 = strTemp2 & part. Every time through the loop, VB has to build a brand new string and copy everything that is already in strTemp2 into it, so each append costs more than the last. Initially, that process screams. Add a Debug.Print statement of 'i' and you'll see what I mean. It will rocket through the beginning, but as strTemp2 gets bigger and bigger, it will get slower and slower.

So how do we speed that up? Arrays...byte arrays, to be specific, with a simple API call, CopyMemory.
Take a look at the code below:

    Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
        (Destination As Any, Source As Any, ByVal Length As Long)

    Private Sub Command4_Click()
        Dim strTemp As String
        Dim strTemp2 As String
        Dim tmpArray() As Byte
        Dim i As Long
        Dim dtStart As Date

        strTemp = String(100000, "A")
        ReDim tmpArray(1 To Len(strTemp))
        strTemp2 = ""
        dtStart = Now
        ' Copy the string's characters into the byte array in one shot
        CopyMemory tmpArray(1), ByVal strTemp, Len(strTemp)
        ' Process each byte in place
        For i = 1 To Len(strTemp)
            tmpArray(i) = Asc(LCase(Chr(tmpArray(i))))
        Next i
        ' Convert the byte array back into a string
        strTemp2 = StrConv(tmpArray, vbUnicode)
        MsgBox Format(Now - dtStart, "HH:NN:SS")
        MsgBox strTemp2
    End Sub

Same process, with two steps added. Before we start 'timing', we do redimension the byte array, but I did include the CopyMemory statement within the 'timed' portion.

That's the first step we added: CopyMemory. Note that the first argument is what we are copying into (and we have to refer to the first item in the array, not just the array itself), the next argument is the string we want to copy into the array (using the ByVal keyword; some APIs are fun like that), and then we tell it how much of the string we want to copy. (We could copy just a part if we want....warning, copying more than is there will crash the VBE....)

The second step we added is the StrConv line, where we convert the byte array back into a string.

We also modified the 'processing' that is being done, because the byte array is going to represent the ASCII values of a string, not the text itself. But the same process is in place (we are converting the ASCII byte to a character, then setting it to lowercase, and then converting it back to ASCII).

With all the added steps, I wondered how fast it would be. Well, on the same machine where we got 34 seconds for the first routine, we get 0 seconds for this routine (that is, less than the one-second resolution of Now). Same processing results, and that 0 seconds pops up instantaneously. Quite a performance increase, I'd say.

Anyhow, back to my encryption routine.
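A side note on the character-swapping case mentioned earlier: the same byte-array approach could be sketched roughly as below. This is only an illustration and not the actual encryption code. SwapChars, bytTable, and Command5_Click are made-up names, and the 'shift by one' table just stands in for whatever substitution is really wanted. It assumes a single-byte ANSI code page, same as the routine above.

```vb
' Same declaration as in the routine above; drop this copy if the
' module already has it.
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
    (Destination As Any, Source As Any, ByVal Length As Long)

' Hypothetical helper (not from the WheelEncryption class): runs every
' character of strIn through a 256-entry substitution table.
Private Function SwapChars(ByVal strIn As String, bytTable() As Byte) As String
    Dim tmpArray() As Byte
    Dim i As Long

    If Len(strIn) = 0 Then Exit Function    ' nothing to copy

    ReDim tmpArray(1 To Len(strIn))
    CopyMemory tmpArray(1), ByVal strIn, Len(strIn)

    For i = 1 To UBound(tmpArray)
        tmpArray(i) = bytTable(tmpArray(i)) ' one table lookup per byte
    Next i

    SwapChars = StrConv(tmpArray, vbUnicode)
End Function

Private Sub Command5_Click()
    Dim bytTable(0 To 255) As Byte
    Dim i As Long

    ' Made-up table for the demo: shift every byte value up by one
    For i = 0 To 255
        bytTable(i) = (i + 1) Mod 256
    Next i

    MsgBox SwapChars("AAAAA", bytTable)
End Sub
```

With that demo table, the message box should show BBBBB. The point of the table is that the loop does one array lookup per character, with no string work at all until the single StrConv at the end.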
I went a bit further than just optimizing it. The encryption process has 3 elements: Password, SecondPassword, and a set of wheels. The first two are properties of the class. The second password is not necessary, but can be used. The set of wheels is hard coded into the class itself (and is a code equivalent of the Enigma machine). All 3 elements must be the same in order to decrypt something encrypted with them. So if you encrypt with a Password of 'Hey how's it going' and a blank second password, but you try to decrypt with the same passwords and a different 'set of wheels', the decrypt results will just be gobbledygook. As an explanation, the first password is used to set the wheels (which ones, how many, and which direction they rotate in), and the second password is used to set their position (if not used, all wheels start in their initial position).

Where I went further than optimizing the code is that I created an Add-in to add the 'WheelEncryption' class to my projects. The Add-in uses a database which stores 'wheel sets' (which it generates for you), so you can use different wheel sets for different projects, etc.

The class itself is very easy to use. It only has 5 properties to deal with:

  - EncryptionType: can be ASCII or Text. Text will encrypt any typable text (ASCII values 32 to 126, so carriage return and line feed wouldn't fit in that), and it will encrypt it into a string with characters in the same range, so AAAAA might become XcC[9. ASCII encrypts all 256 characters into the same range, so AAAAA might include nulls or other non-keyboard characters.
  - Password: any text and any length. The longer this is, the more wheels are used....though a huge amount of wheels will slow down the encrypt/decrypt process accordingly.
  - SecondPassword: any text will work for this too. It can be shorter or longer than the first password, though if it's longer, only the first password's length worth of it would matter on decryption....
  - EncryptedData and NormalData: setting one internally encrypts/decrypts the other.

If anyone is interested in the Add-in, let me know. Sorry for the lengthy post, but I just got finished with the Add-in, and I like writing mini novels to wind down after some serious coding! ;)

Drew