Shamil Salakhetdinov
shamil at smsconsulting.spb.ru
Mon Apr 28 11:40:25 CDT 2008
Hi John,

Thank you for your proposal to run the code against your huge db... OK, let's try this remote/offline Q&A/code musing/consulting/education in several installments published here. I must say I have quite some work to do for customers, and I can't spend a lot of time on postings here explaining what is done in the sample code and why. I will try to just give references to the "concepts" I use, so that you can read more about them in the many open sources...

I made a first adjustment to yesterday's code. The new code is here:

http://smsconsulting.spb.ru/samples/Program.cs.txt
http://smsconsulting.spb.ru/samples/Module.vb.txt
http://smsconsulting.spb.ru/samples/Stats.txt

To run it (let's use VB.NET):

- in VS2005 create a VB.NET Console application;
- replace the code of the module that VS2005 creates by default (Module1.vb) with the code from http://smsconsulting.spb.ru/samples/Module.vb.txt ;
- correct these code lines to point to your db and its credentials:

    Public Const PcName As String = "LOCALHOST"
    Public Const DbName As String = "NamesParser"
    Public Const UserName As String = "TESTER"
    Public Const Password As String = "TEST.1959"

- push CTRL+F5.

You should see the sample app running in a console window and producing something like this report:

    Strategy = 1, PageSize = 100, ThreadsQty = 10, NameParsingDuration = 1 ms
    Start time: 28.04.2008 20:18:13
    Count(*) = 1000
    Counter = 1000
    End time: 28.04.2008 20:18:19
    Elapsed time: 00:00:05.5156250

    Strategy = 2, PageSize = 100, ThreadsQty = 10, NameParsingDuration = 1 ms
    Start time: 28.04.2008 20:18:19
    Count(*) = 1000
    Counter = 1000
    End time: 28.04.2008 20:18:22
    Elapsed time: 00:00:03.5468750

    Strategy = 3, PageSize = 100, ThreadsQty = 10, NameParsingDuration = 1 ms
    Start time: 28.04.2008 20:18:22
    Count(*) = 1000
    Counter = 1009
    End time: 28.04.2008 20:18:26
    Elapsed time: 00:00:03.3906250

Try.
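For orientation, here is a minimal sketch of how those four constants might end up in a connection string. The format string is the one John quotes from the sample code in his message; the module name and the BuildConnectionString helper are made up for illustration and are not taken from the published sample. Each {n} placeholder is replaced by the n-th argument that follows the format string, counting from zero.

```vbnet
Module ConnectionStringDemo
    ' Values copied from the constants quoted above; adjust them for your own server.
    Public Const PcName As String = "LOCALHOST"
    Public Const DbName As String = "NamesParser"
    Public Const UserName As String = "TESTER"
    Public Const Password As String = "TEST.1959"

    ' Hypothetical helper (not part of the published sample).
    Function BuildConnectionString() As String
        ' {0}..{3} are filled, in order, by PcName, DbName, UserName and Password.
        Return String.Format( _
            "Data Source={0}\SQLEXPRESS;Initial Catalog={1};User Id={2};Password={3}", _
            PcName, DbName, UserName, Password)
    End Function

    Sub Main()
        Console.WriteLine(BuildConnectionString())
    End Sub
End Module
```

With the values above this should print Data Source=LOCALHOST\SQLEXPRESS;Initial Catalog=NamesParser;User Id=TESTER;Password=TEST.1959, so only the four constants need to change when the db or its credentials change.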
If you tell me the name of your database and the name of your table, I will change them here. You could also create the username and password shown above, so that no changes are needed for later versions of the code.

N.B.: Make sure you run the code against a sample db, not the live one.

The code assumes that the target field FName is 50 chars long. Currently it uses only this one column to store the "parsed" value, which is just the first 50 chars of the OWNERNAME field.

Please have a look through the code, and ask your questions in the order you want them to be answered...

Everybody who wants to participate in this discussion: you're very welcome!

On debugging multi-threaded apps: that's a real PITA. Yes, you can debug/trace with breakpoints etc., but you have to keep in mind that every one of your threads will make the debugger stop at those breakpoints. IOW it is better to debug/trace single-threaded code and then run it in multi-threaded mode, using some indirect artefacts/techniques (such as unit testing) to show you that your app is running OK. As you can find in the new version of the code, it can run both single- and multi-threaded, depending on the value of this constant:

    Public Const MULTI_THREADED_MODE As Boolean = True

--
Shamil

-----Original Message-----
From: dba-vb-bounces at databaseadvisors.com [mailto:dba-vb-bounces at databaseadvisors.com] On Behalf Of jwcolby
Sent: Monday, April 28, 2008 7:02 AM
To: Discussion concerning Visual Basic and related programming issues.
Subject: Re: [dba-VB] ADO.Net

Shamil,

I would like to thank you for taking the time to do this. Now... I will thank you in advance for taking the time to explain it to me. As happened many years ago when you taught me WithEvents in Access, I don't know enough to understand what the heck you are doing.

In the absence of any instructions on how to use this thing, I cut and pasted the entire (VB) shebang into a class in a Windows Form project. It compiled flawlessly and the blank form opened. Now what?
I hope that you can appreciate my lack of knowledge here. I have built several projects, with (to my pitiful level of expertise) quite extensive stuff going on but... there are a lot of concepts in your code that I have not used.

Where to start. I guess where to start is a good spot. What do I do to cause this to run? Coming from Access I have always started with a form, which I promptly place a button on so I can call some function which "starts" things. That form will have class variables if necessary to instantiate a supervisor class, the supervisor class will then load recordsets, load data into data classes as required etc.

So what I do not understand is "what do I call to make this run"? I see the MainEntry at the bottom but do I just call MainEntry from a button? Does the code belong in a class or a plain module? I assume that if I put it in a class then I would have to dim a variable in my form.

Second (I am totally excited about this whole thing!!!) what are the replaceable parameters?

    Dim connectionString As String = String.Format("Data Source={0}\SQLEXPRESS;Initial Catalog={1};User Id={2};Password={3}", pcName, dbName, userName, password)

I assume here that pcName gets fed in to {0}? THAT is cool, and demonstrates where I start from in understanding your code.

Third (also exciting!) I have never used threads yet so here we go. How do I debug that? Can I step through a thread in the same way I step through any other code? Set break points etc.

And finally (for now) it just occurred to me to ask... can you help me (online or offline) to modify this example to actually use my own table that I need to parse the name for. I think I can figure out how to modify the stuff for the server, database, table etc. I have done that already in my own code for doing this. I will also worry about the parsing code itself, though just for a demo we can just hard code some values.
My table Data:

PKID (int32)
OWNERNAME (to be parsed)
FName (parsed)
MName (parsed)
LName (parsed)
NamePrefix (parsed)
NameSuffix (parsed)
Gender (parsed)

If I can modify the example to actually read / write to my table then I kill two birds with one stone: learn a bunch of cool new stuff, including getting it all happening with threads, and get my real work done.

For my real life purposes I am going to have to have a PKStart, PKEnd pair which tracks a block (remember I am doing 80 million rows eventually), and increment those start/end by the chunk size to repeat over and over (eventually). What I have found convenient in the past (and have code already written to do) is write status log files out to disk for each block processed. I write them as XML and include things like the start / end PKID, status memo fields, time start / stop and the like. As I said, I have already written that code (I'm not completely helpless, though it sometimes appears that way) and it is sitting out in a library ready to use for the logging. All I would need to do is build the specific data class to store the log data for this project. I will handle that.

I will promise to absorb your example if you will promise to take the time to help me when I get stuck. Once I have done so I will then be able to apply it to my specific problem and give you back some real life timings from my largish database. I can vary the chunk size from 1k to any upward limit we find useful to test. My experience has been that anything above 10K can get problematic, though that was using the bulk import widget in past projects. I was getting timeouts if I went with too large a chunk size.

VB.Net is only a small part of my work life, though I would like to make it the main part. I still spend most of my work life stuck in Access, maintaining applications I wrote (or inherited) long ago.
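The PKStart / PKEnd stepping described above could be sketched roughly like this. It is only an illustration under stated assumptions: the module name, the BlockEnd helper and the PKID bounds are made up, and the sketch only prints each block instead of touching a database or writing the XML log.

```vbnet
Module PkBlockDemo
    ' Hypothetical helper: inclusive end of the block that starts at pkStart,
    ' clipped so that the last block never runs past pkMax.
    Function BlockEnd(ByVal pkStart As Integer, ByVal chunkSize As Integer, _
                      ByVal pkMax As Integer) As Integer
        Return Math.Min(pkStart + chunkSize - 1, pkMax)
    End Function

    Sub Main()
        Dim pkMin As Integer = 1           ' assumed lowest PKID in the table
        Dim pkMax As Integer = 25000       ' assumed highest PKID in the table
        Dim chunkSize As Integer = 10000   ' chunk size to experiment with

        Dim pkStart As Integer = pkMin
        Do While pkStart <= pkMax
            Dim pkEnd As Integer = BlockEnd(pkStart, chunkSize, pkMax)
            ' Real code would process rows with PKID between pkStart and pkEnd here
            ' and then write the status log entry for this block.
            Console.WriteLine("Block {0}..{1}", pkStart, pkEnd)
            pkStart = pkEnd + 1
        Loop
    End Sub
End Module
```

With these made-up bounds the loop prints three blocks: 1..10000, 10001..20000 and 20001..25000. Gaps in the PKID sequence would simply make some blocks hold fewer rows than the chunk size.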
While it pays the bills I would LOVE to get to the point where I could bang out code like this in the short time that it probably took you.

Again thanks for your time and effort and patience.

Shamil Salakhetdinov wrote:
> Here are the sources I promised to publish in my previous post:
>
> http://smsconsulting.spb.ru/samples/Stats.txt
>
> http://smsconsulting.spb.ru/samples/Program.cs.txt
>
> http://smsconsulting.spb.ru/samples/Module.vb.txt
>
> Your turn, guys, to fix and improve them...
>
> --
> Shamil
>

--
John W. Colby
www.ColbyConsulting.com
_______________________________________________
dba-VB mailing list
dba-VB at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/dba-vb
http://www.databaseadvisors.com