[dba-VB] Merge rows

Shamil Salakhetdinov shamil at smsconsulting.spb.ru
Wed Sep 23 04:44:08 CDT 2009


Stuart,

I did mean that once loaded bit array will be kept "warm" in memory (John
has enough memory for that).

It's unclear what are the main type of queries of JC's customer (John can
you clarify this issue?) 

- if that queries are to select a few thousand individual records as you
suggest(?) and all the queries include at least one "Y/N" field AND-end with
the other part of WHERE filtering criteria then "brute force bits crunching"
will outperform MS SQL, which (MS SQL) will have to scan the whole 50
million records, and "brute force bits crunching" will return a few thousand
individual records IDs in ~3 seconds(?), and those IDs can then be used to
select the records from the "DB from Hell" and apply the rest of the WHERE
filtering criteria to select a subset;

- if that queries are to select a few millions records and calculate some
stats on them and all that queries include at least one "Y/N" field AND-end
with the other part of WHERE filtering criteria then "brute force bits
crunching" will again outperform MS SQL for bit operation, as for the other
part of WHERE clause - it could be possible to use LINQ for Objects here if
all the compacted records can be loaded in memory on huge JC's computer -
and if that will be possible then LINQ for Objects (of just custom
programming of selection criteria logic evaluation + counting stats) should
let to outperform MS SQL I expect...

Recap: it all depends on the type of the queries to run against "DB From
Hell" - if 80-90% of them fit first or the second type of queries mentioned
above then it does make sense for compacted records to use "brute force bits
crunching" approach as it might result in considerable shortening of the
time queries run.
If it doesn't make any sense there will a query run within a minute or
within one or a couple of hours then "brute force bits crunching" isn't
needed there...

Or do you have your queries already running over "DB from Hell" within a
minute/several minutes, John?

Thank you.

--
Shamil

-----Original Message-----
From: dba-vb-bounces at databaseadvisors.com
[mailto:dba-vb-bounces at databaseadvisors.com] On Behalf Of Stuart McLachlan
Sent: Wednesday, September 23, 2009 12:46 PM
To: Discussion concerning Visual Basic and related programming issues.
Subject: Re: [dba-VB] Merge rows

But do you think that the "brute force crunching" will gain more time than
is lost in requiring 
SQL Server to pulling all 50 million records from the database, rather than
letting SQL 
Server only pull the few thousand that  you are interested in in the first
place?

-- 
Stuart


On 23 Sep 2009 at 10:54, Shamil  Salakhetdinov wrote:

> Hi Stuart,
> 
> No problem. "BitArrays population magic" isn't a complicated task for John
> "DB from Hell".
> 
> BTW, I have just run test code for 50 millions 400 bits bit arrays for OR,
> AND and XOR bitwise operations - it takes less than 3 seconds, and for "DB
> from Hell" with 50 million records it will be 50 millions * 400 = 20
billion
> bits = ~2GB - all that can be loaded into memory for 64bits PCs, so, I do
> expect that even "brute force crunching" will produce a kind of
> instantaneous results for the part of query where char "Y/N" columns are
> currently used. And once loaded in memory the info can be kept/updated
there
> (and in DB)(and back-up in 2GB+ binary file) as it's not changing that
much
> IOW maybe there is no even a need to substitute in "DB from Hell" current
> char "Y/N" fields with Binary(...) or VarBinary(...) fields... etc... 
> 
> Please correct me if you find I'm wrong somewhere in my
> considerations/expectations/evaluations...
> 
> Thank you.
> 
> --
> Shamil
> 
> P.S.
> 
> static void test3()
> {
>     //const long CYCLES_QTY = 1000000000; // 1 billion
>     const long CYCLES_QTY = 50000000; // 50 millions
>     bool[] values1 = (bool[])Array.CreateInstance(typeof(bool), 400);
>     bool[] values2 = (bool[])Array.CreateInstance(typeof(bool), 400);
>     values1[0] = true;
>     values1[225] = true;
>     values1[399] = true;
> 
> 
>     System.Collections.BitArray bitTest1 =
>         new System.Collections.BitArray(values1);
>     System.Collections.BitArray bitTest2 =
>         new System.Collections.BitArray(values2);
> 
>     for (int i = 0; i < 400; i++) bitTest2[i] = true; 
> 
>     Console.WriteLine("{0:HH:mm:ss.fff}", DateTime.Now);
>     for (long l = 1; l < CYCLES_QTY; l++)
>         bitTest1.Xor(bitTest2);
>         //bitTest1.Or(bitTest2);
>         //bitTest1.And(bitTest2);
>     Console.WriteLine("{0:HH:mm:ss.fff}", DateTime.Now);
> 
>     foreach (bool bit in bitTest1)
>         Console.Write("{0}", bit ? "1" : "0");
> 
>     Console.WriteLine();
> 
>     // 50 millions - AND
>     //10:32:56.772
>     //10:32:59.548
>  
>
//10000000000000000000000000000000000000000000000000000000000000000000000000
> 000000
>  
>
//00000000000000000000000000000000000000000000000000000000000000000000000000
> 000000
>  
>
//00000000000000000000000000000000000000000000000000000000000000000100000000
> 000000
>  
>
//00000000000000000000000000000000000000000000000000000000000000000000000000
> 000000
>  
>
//00000000000000000000000000000000000000000000000000000000000000000000000000
> 000001
> 
>     // 50 millions - OR
>     //10:34:27.644
>     //10:34:30.451
>  
>
//11111111111111111111111111111111111111111111111111111111111111111111111111
> 111111
>  
>
//11111111111111111111111111111111111111111111111111111111111111111111111111
> 111111
>  
>
//11111111111111111111111111111111111111111111111111111111111111111111111111
> 111111
>  
>
//11111111111111111111111111111111111111111111111111111111111111111111111111
> 111111
>  
>
//11111111111111111111111111111111111111111111111111111111111111111111111111
> 111111
> 
>     // 50 millions - XOR
>     //10:35:27.414
>     //10:35:30.182
>  
>
//01111111111111111111111111111111111111111111111111111111111111111111111111
> 111111
>  
>
//11111111111111111111111111111111111111111111111111111111111111111111111111
> 111111
>  
>
//11111111111111111111111111111111111111111111111111111111111111111011111111
> 111111
>  
>
//11111111111111111111111111111111111111111111111111111111111111111111111111
> 111111
>  
>
//11111111111111111111111111111111111111111111111111111111111111111111111111
> 111110
>     
>     Console.WriteLine();
>     foreach (bool bit in bitTest2)
>         Console.Write("{0}", bit ? "1" : "0");
> 
>     Console.WriteLine();
> }
> 
> 
> 
<<< snip >>>




More information about the dba-VB mailing list