[dba-SQLServer] Find the second occurrence of a character in a string

Mark A Matte markamatte at hotmail.com
Fri Dec 22 12:55:38 CST 2006


John,

I've mentioned before that a friend of mine does the same thing with his 
software and then matches it against Equifax data.  Accuracy is everything in 
his business.  I talked with him about his approach...he said it boils down to 
defining your rules...and then a crap load of IF statements to find which 
rule or rules to apply.  He has been doing this (and refining the code) for 
more than 10 years.

His advice to someone building a similar tool to the one he developed would 
be:

"Start with as many messed up records as you can...and keep adding IF 
statements until you can run it once against the data and all records come 
out correct.  About 60 million should be a good start.  Then get another 60 
million messed up records...run it and see what you missed the first time.  
...Or hire me to clean your data.."

Just thought I'd share.  You seem to have quite the battle ahead of you.  
Best of luck...and Happy Holidays.

Mark A. Matte


>From: artful at rogers.com
>Reply-To: dba-sqlserver at databaseadvisors.com
>To: dba-sqlserver at databaseadvisors.com
>Subject: Re: [dba-SQLServer] Find the second occurrence of a character 
>in a string
>Date: Fri, 22 Dec 2006 09:15:05 -0800 (PST)
>
>Quite right. This is not a simple SQL statement.
>
>----- Original Message ----
>From: JWColby <jwcolby at colbyconsulting.com>
>To: dba-sqlserver at databaseadvisors.com
>Sent: Friday, December 22, 2006 9:32:17 AM
>Subject: Re: [dba-SQLServer] Find the second occurrence of a character in 
>a string
>
>I think the "best way" to handle this, if you are truly going to try to
>solve the problem, is to:
>
>Develop a list of those "prefixes" to last names - Van, La, De etc.
>Take the first word as the first name
>Get a count of remaining words.
>If count > 1 then
>     ProcessRest
>Else
>     Rest is last name
>Endif
>
>ProcessRest
>     Look up the second word in the prefix list.
>     If InList then
>         Treat everything left as the last name
>     else
>         Treat next word as middle name
>         remove middle name from string
>         Process rest as last name
>     endif
>End ProcessRest
>
>Let's just say this is not a simple SQL statement.
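
For illustration only, the steps above can be sketched roughly in Python. The prefix list here is a tiny hypothetical sample, and `split_name` and `SURNAME_PREFIXES` are names made up for this sketch. One assumption differs from a literal reading of the outline: when only one word remains after the first name, it is taken directly as the last name rather than being tested as a possible middle name.

```python
# Hypothetical sample of surname prefixes; a real list would be far longer.
SURNAME_PREFIXES = {"van", "von", "de", "la", "le", "da", "al", "ben"}


def split_name(full_name):
    """Split a full name into (first, middle, last) using the prefix rule."""
    words = full_name.split()
    if not words:
        return ("", "", "")

    # Take the first word as the first name.
    first, rest = words[0], words[1:]

    # Assumption: zero or one remaining word is simply the last name.
    if len(rest) <= 1:
        return (first, "", " ".join(rest))

    # If the next word is a known prefix, everything left is the last name.
    if rest[0].lower() in SURNAME_PREFIXES:
        return (first, "", " ".join(rest))

    # Otherwise treat the next word as the middle name and the remainder
    # as the last name.
    return (first, rest[0], " ".join(rest[1:]))
```

Even this small version shows why more rules pile up quickly: a name such as "John Ben Smith" comes back with "Ben Smith" as the last name, because "Ben" is in the sample prefix list.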
>
>John W. Colby
>Colby Consulting
>www.ColbyConsulting.com
>
>-----Original Message-----
>From: dba-sqlserver-bounces at databaseadvisors.com
>[mailto:dba-sqlserver-bounces at databaseadvisors.com] On Behalf Of
>artful at rogers.com
>Sent: Thursday, December 21, 2006 3:00 PM
>To: dba-sqlserver at databaseadvisors.com
>Subject: Re: [dba-SQLServer] Find the second occurrence of a character in
>a string
>
>I appreciate your point, but I'm still not certain of the best way to go
>with my question, which concerns the way to handle some unusual surnames.
>
>van den Berq
>la Flame
>de la Vega
>Ben Gurion
>
>and any number of names that begin with "al". Or "da" as in Leonardo. My
>very limited Italian suggests that Leonardo was born in a town called 
>Vinci.
>
>So how does one sort such a list? On the capitalized word? On the first
>letter of the two or three words considered the surname?
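
For illustration only, the two options in the question can be written as two sort keys in Python. The prefix list and the function names are hypothetical, made up for this sketch, and conventions differ by country, so neither ordering is offered as the correct one.

```python
# Hypothetical sample of surname prefixes used only for this sketch.
SURNAME_PREFIXES = {"van", "den", "de", "la", "da", "al", "ben"}


def key_full(surname):
    """Sort on the surname exactly as written, ignoring case."""
    return surname.casefold()


def key_skip_prefixes(surname):
    """Sort on the first word that is not a prefix, ignoring case."""
    words = surname.split()
    i = 0
    # Skip leading prefixes, but always keep at least one word.
    while i < len(words) - 1 and words[i].casefold() in SURNAME_PREFIXES:
        i += 1
    # The full surname is a tie-breaker for identical main words.
    return (" ".join(words[i:]).casefold(), surname.casefold())


surnames = ["van den Berq", "la Flame", "de la Vega", "Ben Gurion"]
by_full = sorted(surnames, key=key_full)
by_main = sorted(surnames, key=key_skip_prefixes)
```

Here `by_full` puts "Ben Gurion" first and "van den Berq" last, while `by_main` orders the list by Berq, Flame, Gurion, Vega.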
>
>Advice from Europeans, Asians, Africans, or even North Americans familiar
>with this problem, would be appreciated. I have no immediate problem that
>requires this solution. This is purely theoretical at the moment, but who
>knows, someday I may need the answer.
>
>TIA,
>Arthur
>
>----- Original Message ----
>From: Robert L. Stewart <rl_stewart at highstream.net>
>To: dba-sqlserver at databaseadvisors.com
>Sent: Thursday, December 21, 2006 1:41:04 PM
>Subject: Re: [dba-SQLServer] Find the second occurrence of a character in a
>string
>
>You put it in the right columns to begin with and don't try to parse it out
>of a single one.  :-)
>
>
>
>
>_______________________________________________
>dba-SQLServer mailing list
>dba-SQLServer at databaseadvisors.com
>http://databaseadvisors.com/mailman/listinfo/dba-sqlserver
>http://www.databaseadvisors.com
