Shamil Salakhetdinov
shamil at users.mns.ru
Sun Apr 13 05:08:53 CDT 2008
Hi Max,

Our postings crossed - I have just posted the RegEx test results...

N.B.: when testing any of these methods, both memory consumption and execution speed should be taken into consideration:

- high memory consumption that buys the fastest results can be ignored when solving small tasks within "tiny" applications/utilities, but
- high memory consumption is the thing that can "unexpectedly" produce nasty side effects in large application systems.

One example I ran into several months ago was "tricky" to observe, and it was not easy to find out why it could ever happen in .NET: I got a memory leak/excessive memory consumption (in .NET!) because of "lazy loading", which loaded quite a lot of data when it wasn't needed, in a loop processing millions of records. This is what 'Page File Usage History' in Task Manager looked like:

       /|   /|   /|
      / |  / |  / |
     /  | /  | /  |
    /   |/   |/   |/

And in the usual mode, without heavy application system workload, everything worked well with the same data...

Recap:

- The Split(...) approach could result in a side effect similar to the above for very large input strings:

    // 10 sec for 20,000,000 iterations
    string[] recordLine = s.Split('|');
    count = recordLine.Length - 1;

- Replace(...) could also result in the above side effect:

    // 10 sec for 20,000,000 iterations
    string temp = s.Replace("|", Microsoft.VisualBasic.Constants.vbNullString);
    count = s.Length - temp.Length;

- RegEx(...) seems to be the slowest - unsatisfactorily slow for large input strings/heavy system workload:

    // ? sec (unfinished) for 20,000,000 iterations
    // Note: '|' must be escaped; unescaped it is the regex alternation operator
    Regex rx = new Regex(@"\|");
    count = rx.Matches(s).Count;

- char array iteration using a char index and (XOR or char comparison) gives the fastest results and is 100% safe from a memory consumption point of view:

    // * using XOR:
    // ~3 sec for 20,000,000 iterations
    for (int index = 0; index < s.Length; index++)
        if ((s[index] ^ '|') == 0) count++;

    // * using char comparison:
    // ~3 sec for 20,000,000 iterations
    for (int index = 0; index < s.Length; index++)
        if (s[index] == '|') count++;

Please correct me if you find mistakes in the above results...

Any takers to find quicker code for .NET VB or C# or C++/CLI? - that would be a good weekend exercise in code optimization techniques...

Thank you.

--
Shamil

-----Original Message-----
From: dba-vb-bounces at databaseadvisors.com [mailto:dba-vb-bounces at databaseadvisors.com] On Behalf Of Max Wanadoo
Sent: Sunday, April 13, 2008 1:14 PM
To: 'Discussion concerning Visual Basic and related programming issues.'
Subject: Re: [dba-VB] Count a character in a string

John,

Further to my example RegExpr code that I just posted.

Would you have time to compare how long it takes and how long these two take (posted by others):

> One:
> Dim strRecordLine() as array
> strRecordline=split(yourstringhere,"|")
> NumberOfSeperatorCharacters=UBound(strRecordline)+1
>
> Two:
> Dim strTemp as String
> strTemp=Replace(yourstring,"|","")
> #ofCharacters=Len(yourstring)-len(strTemp)

It would be really nice to get a handle on which of these is the faster for a very large file.

Thanks
Max

_______________________________________________
dba-VB mailing list
dba-VB at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/dba-vb
http://www.databaseadvisors.com