[dba-VB] SHA1 to compute a hash

Stuart McLachlan stuart at lexacorp.com.pg
Sat Mar 19 06:20:26 CDT 2011


Using your SHA1 function, what message digests do you get for the standard test cases:

1. abc
2. abcdbcdecdefdefgefghfghighijhijkijkljklmklmnlmnomnopnopq

These should return:

1. A9993E36 4706816A BA3E2571 7850C26C 9CD0D89D
2. 84983E44 1C3BD26E BAAE4AA1 F95129E5 E54670F1

See Examples 1 and 2 at
http://www.itl.nist.gov/fipspubs/fip180-1.htm
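
For comparison, here is a minimal sketch of the same check in Python, using the standard library's hashlib. The names TEST_VECTORS, sha1_words and check_vectors are made up for illustration, and the sketch assumes the messages are hashed as plain 8-bit ASCII bytes, which is what the FIPS examples use.

```python
import hashlib

# FIPS 180-1 test messages and the digests quoted above.
TEST_VECTORS = {
    "abc": "A9993E36 4706816A BA3E2571 7850C26C 9CD0D89D",
    "abcdbcdecdefdefgefghfghighijhijkijkljklmklmnlmnomnopnopq":
        "84983E44 1C3BD26E BAAE4AA1 F95129E5 E54670F1",
}


def sha1_words(message: str) -> str:
    """Hash the ASCII bytes of message and format the 160-bit digest
    as five uppercase 32-bit hex words, matching the FIPS layout."""
    digest = hashlib.sha1(message.encode("ascii")).hexdigest().upper()
    return " ".join(digest[i:i + 8] for i in range(0, 40, 8))


def check_vectors() -> bool:
    """Return True only if every test message produces its expected digest."""
    return all(sha1_words(msg) == expected
               for msg, expected in TEST_VECTORS.items())


if __name__ == "__main__":
    for msg, expected in TEST_VECTORS.items():
        actual = sha1_words(msg)
        status = "OK" if actual == expected else "MISMATCH"
        print(f"{status}  {actual}  {msg}")
```

A SHA-1 digest is always 160 bits (40 hex digits). If another implementation returns something shorter, or returns different values for these inputs, the usual things to check are the string encoding (hashing 16-bit characters instead of 8-bit bytes changes the result) and whether the value is being truncated before it is stored.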



-- 
Stuart


On 19 Mar 2011 at 20:20, Stuart McLachlan wrote:

> How are you creating your hash?
> 
> Can you post a few examples of different data strings and colliding
> SHA1 hashes?  I can probably make a lot of money out of them.
> AFAIK, no one other than you has found any.
> 
> -- 
> Stuart
> 
> 
> On 18 Mar 2011 at 22:08, jwcolby wrote:
> 
> > In my databases I create SHA1 hashes to enable joining between
> > tables and pull identical records (identical for the fields hashed).
> >  I create:
> > 
> > 1) A HashAddr of the zip5, zip4 and addr.  IOW I simply append the
> > three values and feed them into SHA1 and out pops a number which I
> > store in a field in my table.
> > 
> > 2) A HashFamily of the Zip5, Zip4, Addr and LName.
> > 
> > 3) A HashPerson of Zip5, Zip4, Addr, LName and FName.
> > 
> > I am getting known collisions between different addresses (I have
> > discovered and investigated collisions) in my HashAddr when I have
> > many millions of addresses.  I need to address this.
> > 
> > Back when I made my design decisions (2004) my hardware consisted of
> > single core processors, 4 gigs ram, Windows x32 etc.  Now I have 8
> > cores, 32 gigs Ram, Windows X64 etc.  IOW I was to a great extent
> > constrained by my hardware "back in the day" whereas I am much less
> > so now.
> > 
> > I am about to redesign my process.
> > 
> > I am considering simply appending in the city and state strings to
> > all of the inputs: Addr, City, St, Zip5, Zip4 as the address base
> > and then the same with LName and FName for the other two respective
> > hashes.
> > 
> > The objective is to minimize hash collisions, not prevent some
> > crypto attack.  I use these hash fields to join between multi
> > million record tables so if I need to discover info in TableA where
> > the HashAddr is the same as in TableB, I need the probability of a
> > collision between different addresses (family/Person) to be as close
> > to zero as I can get it.
> > 
> > My questions are:
> > 
> > 1) Is anyone out there using a hash in this manner?
> > 2) Has anyone seen a table of collision probability between messages
> > of a given (short) message length?  My message is 9 digits for the
> > zip5/4 and the address could be something as short as PO Box 1, or
> > Apt 1.  IOW the total message length of 14 is pretty common.  Adding
> > the state would give me minimum message lengths of only 16 and City
> > would only add a few more characters.
> > 3) Does anyone know if just adding the same data back in again would
> > decrease the collision probability?  IOW
> > Zip5,Zip4,Addr,City,St,Zip5,Zip4, etc.
> > 
> > Any experience out there?
> > 
> > 
> > -- 
> > John W. Colby
> > www.ColbyConsulting.com
> > _______________________________________________
> > dba-VB mailing list
> > dba-VB at databaseadvisors.com
> > http://databaseadvisors.com/mailman/listinfo/dba-vb
> > http://www.databaseadvisors.com
> > 
> > 
> 
> 
> 
> 
> 





