[dba-VB] Using the hash for real world problems

jwcolby jwcolby at colbyconsulting.com
Sat Mar 19 09:13:51 CDT 2011


Ooops.  My examples were hosed.

Let's try again

Addresses	People
22,538,240	1
00,708,462	2
00,052,234	3

and so forth.

What the client really wants to know is

"how many people with emails are at a specific address"
"how many people with emails are in a specific family"

etc.

I tie my address table to my email table using an auto increment long increment "I must be careful 
not to call this a surrogate" key and then perform the counts using the hash fields in a Group By / 
count query.

John W. Colby
www.ColbyConsulting.com

On 3/19/2011 9:55 AM, jwcolby wrote:
> I got into the hashing business in order to create a surrogate key of sorts, in fact a trio of them.
>
> I have an address hash, a family hash and a person hash. I compute these three hashes for every
> table of persons. Now that I have these surrogate keys I have "identical" values in a single field
> where the input is identical in multiple fields. Thus I do not have to do multi-field joins where I
> have a hash for all of those fields.
>
> I am happy to hear that I am secure in my surrogate hash algorithm choice.
>
> OK, so I have these hashes. Every record has these three hashes.
>
> Now some definitions:
>
> HashAddr: A unique address is addr/zip5/zip4 different
> HashFamily: A unique family is LName/Addr/Zip5/Zip4 different
> HahsPerson: A unique person is FName/LName/Addr/Zip5/Zip4 different.
>
> Using this information, I need to calculate the count of addresses with a single person, with two
> persons etc for as many combinations as I have.
>
> Addresses People
> 22,538,240 1
> 780,462,346 2
> 52,234 3 people
> etc
> etc till every quantity of people is found.
>
> What is the simplest way to accomplish this with SQL.
>
>
>
>



More information about the dba-VB mailing list