[dba-VB] c# lock()

jwcolby jwcolby at colbyconsulting.com
Sun Mar 27 22:01:44 CDT 2011


Shamil,

 > Yes, I see: you tend to build a multi-threaded application/utility system
 > with centralized control - and I tend to build an anarchical free-market
 > model based application/utility system :)

LOL, I guess.  To be quite honest I didn't understand exactly what you were saying.  Remember that 
my C# skills are still quite rudimentary.

Accuzip Processing system basically has developed over time.  I didn't sit down to develop an app 
from scratch to do this.  I started in late 2004 to just get one single list into SQL server and 
serve up counts to the client.  I did the whole thing manually, building queries to import the data 
and clean it up using only SQL Server Management Server.  When I started I had never touched SQL Server.

Then I was asked to research an application for doing CASS

http://en.wikipedia.org/wiki/Coding_Accuracy_Support_System

and NCOA

http://en.wikipedia.org/wiki/National_Change_Of_Address

on the addresses in the list.  There was only one list in the beginning which I originally labeled 
"the database from hell".

So I researched and discovered a vendor (Accuzip) that would allow us a fixed price license against 
which I could process as many records as I wanted.  All other vendors (that I found) charge on a 
sliding scale Cents / record kind of thing, and we had a LOT of records - 65 million in that first list.

So I started building a system of stored procedures which performed the export out to Accuzip and 
then import the Accuzipped records back in.  It took roughly 25 stored procedures to perform this 
process.  I had a paper list of instructions on how to run the stored procedures, parameters to feed 
in etc, plus manually copying the files out and back in.  I would manually type EXEC commands into a 
query window reading from my paper instructions.  It was *sloooow* and *error prone*.

Next came a stored procedure that ran the stored procedures.  Etc. Etc.  It just morphed over time. 
  Plus I got more lists.  The client was so impressed that I managed to do all this that I got 
actual orders (export the data and send to the clients) rather than just counts, and the business grew.

I started using Access ( the tool I knew ) to automate some of it but it was a very poor fit.  Since 
every operation ran in a single thread the entire application would lock up when SQL Server went out 
for 30 minutes to do some operation.  And of course my hardware was underpowered.

In the summer of 2009 I decided I really needed to automate this.  The business had grown and the 
client was suddenly 50% of my income and it was a huge struggle to process these lists manually, 
even using stored procedures of lists of stored procedures.  TSQL as a programming environment is 
crude to say the least.

So I went to the local community college in Sept. 2009 and took the first of two available C# 
courses.  By Christmas I was launching in to my automation attempt.  Understand my C# skills were 
rudimentary at best but the language was robust and I could see the power.  I really just started 
building a method of moving the paper list of stored procedures into C#, IOW C# would execute the 
stored procedures and monitor their completion.  I was not focused on a top down "build a system" 
but rather let's get my manual process automated ASAP.  I was overworked, and wasn't able to handle 
the load.

It turns out that C# is very capable in this regard and the program (such as it was) started working 
and working well.  At the end of the second semester (May 2010) I hired Paul, a young college 
student who was graduating from the community college to come to work for me part time.  I started 
teaching him what I know about general programming and together we worked on this program.  And it 
continued to morph.

We started using NLog for logging, Visual SVN for source control etc.  At some point I decided to 
just move the Stored Procedures out of SQL Server and into dynamic code generated directly in C#. 
Each step had to keep the program running because I actually used the program to fill orders and do 
counts.  And remember our C# skills are still rudimentary.

Then we "discovered" threads and soon thereafter events.  Last fall I decided to remodel the program 
into a manager / supervisor / worker model where the manager tells the supervisor to do something, 
the supervisor tells other workers to do something and basically just check results.  In this model, 
each level uses threads to scan it's tasks (watch flags) and keep its workers busy.

So the dbExport class has a thread which build a database, pulls records, builds chunks, exports to 
disk.  That is it's job.  When the file is on disk it is done and control for that chunk is passed 
to another class (the Virtual Machine).  The dbExport class uses flags to store each step of the 
process.

The Virtual Machine class picks up the files and pulls them over to the VMs directory, monitors that 
Accuzip processes them, tracks errors (if any) from Accuzip, and places the resulting files back on 
a staging directory (as I call it) for another class to take over.

The dbImport class is tasked with creating a temp database, pulling each chunk file back in from 
staging and getting the data back into the original list database.

So the manager class (a single instance) loads supervisor classes, one per list database that needs 
processing.  The Manager class also loads the VM class (only one of these at the moment but there 
could be more, with more licenses from Accuzip).

Each supervisor loads a dbExport and dbImport class to process the records in its list.

dbExport creates Process records and stores them in SQL Server and creates a process class for each 
process record.  Each process record represents a chunk of data from a specific list database.  the 
process class doesn't really do anything it is just a place for the flags to be loaded / 
manipulated, and it knows how to persist itself to SQL Server.

dbImport basically gets Process class instances which are finished with stage 2/3 (out of Accuzip) 
and starts importing them back in to SQL Server.

My concept was a factory kind of application.  There can be none or dozens of databases needing to 
be processed.  SQL Server and my disk system can only efficiently handle one at a time (at any 
stage) and Accuzip is the biggest bottleneck so the VM class sits in the middle of an export and an 
import class.  Every class has its own job, every class has at least one thread for its use.

dbExport has a thread to use in exporting data to a table and another thread to bcp the data out 
while the next chunk is created in the export database.

dbImport has a thread to BCP data back into chunk tables, and another thread to take the chunk 
tables and merge them back into live data.  The VM has a thread to monitor the input directory and 
another thread to monitor the output directory, moving files into and out of the VM.

Every step has to be logged back to SQL Server as it completes, but the clsSupervisor keeps the 
progress flags in memory as long as a specific list database is being processed, and the supervisor 
holds a list of process classes which hold all of the flags for the steps for a single chunk.

It is a busy system.

I did not originally use threads, and everything ran on the main program thread.  And I was back to 
my user interface locking up as long running processes happened.  Now with threading, each stage 
(export / vm / import) has its own list control which is used to display the status of the list 
database / chunks as they progress through the system.  Everything is logged to flags, but also to 
NLog tables.  Each class has threads to keep its own processes fed and running, and each class does 
its own piece of the whole to process a list table through Accuzip.

BTW, there is another entire C# program which processes orders for the client.  It is also threaded, 
and one of its steps is to create a supervisor record to tell the Accuzip Monitor to Accuzip the 
addresses being processed for the order.

Orders are the Accuzip processing which I mentioned have to cut in at the head of the line in 
Accuzip Monitoring.  As you have seen, I could be processing a 65 million record list database when 
an order comes in.  65 million records is going to take at least 50 hours to process, and the order 
has to go out ASAP so the Accuzip Monitor (the Manager class actually) has to watch for orders and 
interrupt the data supervisor to allow the Order supervisor to cut in.  Once Accuzip starts a 
process (chunk) I don't interrupt it, but when that process (chunk) finishes Accuzip, only Order 
chunks will process until the order is finished.

As I said, it is a busy system.

And remember that our C# skills are still rudimentary.  I started coding in C# in Sept 2009 and 
started migrating this process to c# about December 2009.  By May 2010 when Paul started work I was 
actually running an Accuzip process and doing orders using C#.  Paul is a youngster right out of school.

So much of the stuff you talk about goes right over my head Shamil.

John W. Colby
www.ColbyConsulting.com

On 3/27/2011 8:08 PM, Shamil Salakhetdinov wrote:
> Hi John --
>
> Yes, I see: you tend to build a multi-threaded application/utility system
> with centralized control - and I tend to build an anarchical free-market
> model based application/utility system :)
>
> Thank you.
>
> --
> Shamil



More information about the dba-VB mailing list