[dba-VB] Using the hash for real world problems

Shamil Salakhetdinov shamil at smsconsulting.spb.ru
Sat Mar 19 10:07:08 CDT 2011


Hi John --

I suppose you can reuse the following SQL expression for your case:

select 
  count(Occurrence) QtyTheSameOccurrences, 
  Occurrence from
(
select 
  Country,
  Count(Country) Occurrence
  from Customer
Group by Country
) o
group by o.Occurrence
order by o.Occurrence

Just replace Country name columns with hashAddress column and Country table
name with your table name ...

Thank you.

--
Shamil
 
-----Original Message-----
From: dba-vb-bounces at databaseadvisors.com
[mailto:dba-vb-bounces at databaseadvisors.com] On Behalf Of jwcolby
Sent: 19 ????? 2011 ?. 17:14
To: Discussion concerning Visual Basic and related programming issues.
Subject: Re: [dba-VB] Using the hash for real world problems

Ooops.  My examples were hosed.

Let's try again

Addresses	People
22,538,240	1
00,708,462	2
00,052,234	3

and so forth.

What the client really wants to know is

"how many people with emails are at a specific address"
"how many people with emails are in a specific family"

etc.

I tie my address table to my email table using an auto increment long
increment "I must be careful not to call this a surrogate" key and then
perform the counts using the hash fields in a Group By / count query.

John W. Colby
www.ColbyConsulting.com

On 3/19/2011 9:55 AM, jwcolby wrote:
> I got into the hashing business in order to create a surrogate key of
sorts, in fact a trio of them.
>
> I have an address hash, a family hash and a person hash. I compute 
> these three hashes for every table of persons. Now that I have these 
> surrogate keys I have "identical" values in a single field where the 
> input is identical in multiple fields. Thus I do not have to do
multi-field joins where I have a hash for all of those fields.
>
> I am happy to hear that I am secure in my surrogate hash algorithm choice.
>
> OK, so I have these hashes. Every record has these three hashes.
>
> Now some definitions:
>
> HashAddr: A unique address is addr/zip5/zip4 different
> HashFamily: A unique family is LName/Addr/Zip5/Zip4 different
> HahsPerson: A unique person is FName/LName/Addr/Zip5/Zip4 different.
>
> Using this information, I need to calculate the count of addresses 
> with a single person, with two persons etc for as many combinations as I
have.
>
> Addresses People
> 22,538,240 1
> 780,462,346 2
> 52,234 3 people
> etc
> etc till every quantity of people is found.
>
> What is the simplest way to accomplish this with SQL.
>
>
>
>
_______________________________________________
dba-VB mailing list
dba-VB at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/dba-vb
http://www.databaseadvisors.com




More information about the dba-VB mailing list