jwcolby
jwcolby at colbyconsulting.com
Mon Apr 28 11:55:08 CDT 2008
Thanks for the response Shamil, I know that you have lots of stuff to get
done, as do I. I am already back to the code I know in order to get this
thing processed; however, I will get your sample code running in the coming
days.

John W. Colby
www.ColbyConsulting.com

Shamil Salakhetdinov wrote:
> Hi John,
>
> Thank you for your proposal to run the code against your huge db...
>
> OK, let's try this remote/offline Q&A/code musing/consulting/education in
> several installments published here: I must say I have quite some work to do
> for customers and I can't spend a lot of time on postings here explaining
> what is done in the sample code and why - I will just try to give refs on
> "concepts" I use for you to read more in many open sources...
>
> I made a first adjustment to yesterday's code - the new code is here:
>
> http://smsconsulting.spb.ru/samples/Program.cs.txt
>
> http://smsconsulting.spb.ru/samples/Module.vb.txt
>
> http://smsconsulting.spb.ru/samples/Stats.txt
>
> To run it (let's use VB.NET):
>
> - in VS2005 create a VB.NET Console application, replace the code of the
> module which VS2005 creates by default (Module1.vb) with the code from
> http://smsconsulting.spb.ru/samples/Module.vb.txt and correct these code
> lines
>
> Public Const PcName As String = "LOCALHOST"
> Public Const DbName As String = "NamesParser"
> Public Const UserName As String = "TESTER"
> Public Const Password As String = "TEST.1959"
>
> to point to your db and its credentials, push CTRL+F5, and you should see
> the sample app running in a console window and producing something like
> this report:
>
> Strategy = 1, PageSize = 100, ThreadsQty = 10, NameParsingDuration = 1 ms
>
> Start time: 28.04.2008 20:18:13
> Count(*) = 1000
> Counter = 1000
> End time: 28.04.2008 20:18:19
> Elapsed time: 00:00:05.5156250
>
> Strategy = 2, PageSize = 100, ThreadsQty = 10, NameParsingDuration = 1 ms
>
> Start time: 28.04.2008 20:18:19
> Count(*) = 1000
> Counter = 1000
> End time: 28.04.2008 20:18:22
> Elapsed time: 00:00:03.5468750
>
> Strategy = 3, PageSize = 100, ThreadsQty = 10, NameParsingDuration = 1 ms
>
> Start time: 28.04.2008 20:18:22
> Count(*) = 1000
> Counter = 1009
> End time: 28.04.2008 20:18:26
> Elapsed time: 00:00:03.3906250
>
> Try.
>
> If you tell me the name of your database and the name of your table I will
> change them here; you can also create the username and password as above
> so that no changes are needed later for new versions of the code.
>
> N.B: Make sure you run the code against a sample, not the live db. The
> code assumes that the length of the target field FName is 50 chars - it
> currently uses just this one column to store the "parsed" value - just the
> first 50 chars of the OWNERNAME field.
>
> Please have a look through the code, and ask your questions in the order
> you want them to be answered...
>
> Everybody who wants to participate in this discussion - you're very
> welcome!
>
> On debugging multi-threaded apps: that's a real PITA - yes you can
> debug/trace with breakpoints etc. but you have to keep in mind that as many
> threads as you have will cause the debugger to stop on breakpoints etc. IOW
> better debug/trace single-threaded code and then run it in multi-threaded
> mode, and use some indirect artefacts/techniques (such as unit testing),
> which will show you that your app is running OK: as you can find in the new
> version of the code it can run both single and multi-threaded depending on
> the value of this constant:
>
> Public Const MULTI_THREADED_MODE As Boolean = True
>
> --
> Shamil
>
> -----Original Message-----
> From: dba-vb-bounces at databaseadvisors.com
> [mailto:dba-vb-bounces at databaseadvisors.com] On Behalf Of jwcolby
> Sent: Monday, April 28, 2008 7:02 AM
> To: Discussion concerning Visual Basic and related programming issues.
> Subject: Re: [dba-VB] ADO.Net
>
> Shamil,
>
> I would like to thank you for taking the time to do this. Now... I will
> thank you in advance for taking the time to explain it to me. As
> happened many years ago when you taught me WithEvents in Access, I don't
> know enough to understand what the heck you are doing.
>
> In the absence of any instructions on how to use this thing, I cut and
> pasted the entire (VB) shebang into a class in a Windows Form project.
> It compiled flawlessly and the blank form opened.
>
> Now what?
>
> I hope that you can appreciate my lack of knowledge here. I have built
> several projects, with (to my pitiful level of expertise) quite
> extensive stuff going on but... there are a lot of concepts in your code
> that I have not used. Where to start.
>
> I guess where to start is a good spot. What do I do to cause this to
> run? Coming from Access I have always started with a form, which I
> promptly place a button on so I can call some function which "starts"
> things. That form will have class variables if necessary to instantiate
> a supervisor class, the supervisor class will then load recordsets, load
> data into data classes as required etc. So what I do not understand is
> "what do I call to make this run"? I see the MainEntry at the bottom
> but do I just call MainEntry from a button? Does the code belong in a
> class or a plain module? I assume that if I put it in a class then I
> would have to dim a variable in my form.
>
> Second (I am totally excited about this whole thing!!!) what are the
> replaceable parameters?
>
> Dim connectionString As String = String.Format( _
>     "Data Source={0}\SQLEXPRESS;Initial Catalog={1};" & _
>     "User Id={2};Password={3}", _
>     pcName, dbName, userName, password)
>
> I assume here that pcName gets fed in to {0}? THAT is cool, and
> demonstrates where I start from in understanding your code.
>
> Third (also exciting!) I have never used threads yet so here we go. How
> do I debug that? Can I step through a thread in the same way I step
> through any other code? Set break points etc.
>
> And finally (for now) it just occurred to me to ask... can you help me
> (online or offline) to modify this example to actually use my own table
> that I need to parse the name for. I think I can figure out how to
> modify the stuff for the server, database, table etc. I have done that
> already in my own code for doing this. I will also worry about the
> parsing code itself, though just for a demo we can just hard code some
> values.
>
>
> My table Data:
>
> PKID (int32)
> OWNERNAME (to be parsed)
> FName (parsed)
> MName (parsed)
> LName (parsed)
> NamePrefix (parsed)
> NameSuffix (parsed)
> Gender (parsed)
>
> If I can modify the example to actually read / write to my table then I
> kill two birds with one stone, learn a bunch of cool new stuff including
> getting it all happening with threads, and get my real work done.
>
> For my real life purposes I am going to have to have a PKStart, PKEnd
> pair which tracks a block (remember I am doing 80 million rows
> eventually), and increment those start/end by the chunk size to repeat
> over and over (eventually). What I have found convenient in the past
> (and have code already written to do) is write status log files out to
> disk for each block processed. I write them as XML and include things
> like the start / end PKID, status memo fields, time start / stop and the
> like. As I said, I have already written that code (I'm not completely
> helpless, though it sometimes appears that way) and it is sitting out in
> a library ready to use for the logging. All I would need to do is build
> the specific data class to store the log data for this project. I will
> handle that.
>
> I will promise to absorb your example if you will promise to take the
> time to help me when I get stuck. Once I have done so I will then be
> able to apply it to my specific problem and give you back some real life
> timings from my largish database. I can vary the chunk size from 1k to
> any upward limit we find useful to test. My experience has been that
> anything above 10K can get problematic, though that was using the bulk
> import widget in past projects. I was getting timeouts if I went with
> too large a chunk size.
>
> VB.Net is only a small part of my work life, though I would like to make
> it the main part. I still spend most of my work life stuck in Access,
> maintaining applications I wrote (or inherited) long ago. While it pays
> the bills I would LOVE to get to the point where I could bang out code
> like this in the short time that it probably took you.
>
> Again thanks for your time and effort and patience.
>
> Shamil Salakhetdinov wrote:
>> Here are the sources I promised to publish in my previous post:
>>
>> http://smsconsulting.spb.ru/samples/Stats.txt
>>
>> http://smsconsulting.spb.ru/samples/Program.cs.txt
>>
>> http://smsconsulting.spb.ru/samples/Module.vb.txt
>>
>> Your turn, guys, to fix and improve them...
>>
>> --
>> Shamil
>>
>