Mark Breen
marklbreen at gmail.com
Thu Oct 22 09:29:54 CDT 2009
Hello John,
thanks for the credit, but the real credit goes to Nigel Rivett.
But thanks anyway
Have a nice weekend,
Mark
Mark Breen said
>> I got the code from this site, if you have a look, there is a little more
>> there that you may be able to make use of.
>>
>> Thanks to Nigel Rivett.
>>
>> http://www.nigelrivett.net/SQLTsql/RemoveNonNumericCharacters.html
>>
>> <snip>
2009/10/21 jwcolby <jwcolby at colbyconsulting.com>
> Just an FYI, I ran this against 1 million records, in a query that pulled
> the top(1000000) City
> fields (where this specific problem lies) and it took about 13 seconds.
> This is with an index in
> place that includes the city field. I then ran it against the entire table
> (62.8 million records)
> and it took 2 minutes 52 seconds.
>
> This will become a standard part of my program for cleaning up my data.
>
> Thanks Mark Breen!
>
> John W. Colby
> www.ColbyConsulting.com
>
>
> jwcolby wrote:
> > Mark,
> >
> > Below is the SP I created to perform this magic. I will be running this
> on 66 million records so we
> > shall see about performance.
> >
> > set ANSI_NULLS ON
> > set QUOTED_IDENTIFIER ON
> > go
> >
> > -- =============================================
> > -- Author: <Author,,Name>
> > -- Create date: <Create Date,,>
> > -- Description: <Description,,>
> > --
> > --WHILE @@rowcount > 0
> > -- UPDATE tablename
> > -- SET columnname = REPLACE(columnname,
> > -- SUBSTRING(columnname,
> > --
> PATINDEX('%[^a-zA-Z0-9]%',
> > -- columnname), 1),
> '')
> > -- WHERE PATINDEX('%[^a-zA-Z0-9]%', columnname) <> 0
> > --
> > -- =============================================
> > ALTER PROCEDURE [dbo].[sp_ReplaceCharacters]
> > -- Add the parameters for the stored procedure here
> > @DBName varchar(50), @TblName varchar(50), @FldName varchar(50)
> > AS
> > BEGIN
> > -- SET NOCOUNT ON added to prevent extra result sets from
> > -- interfering with SELECT statements.
> > SET NOCOUNT ON;
> > declare @SQL varchar(1000)
> > declare @PatIndex varchar(50)
> >
> > BEGIN TRY
> > set @PatIndex = '%[^a-zA-Z0-9 .,#]%'
> > SELECT @SQL = 'WHILE @@rowcount > 0
> > UPDATE ' + @DBName +
> '.dbo.' + @TblName + '
> > SET ' + @FldName + ' =
> REPLACE(' + @FldName + ',
> > SUBSTRING(' + @FldName + ',
> > PATINDEX(''' + @PatIndex +
> ''',
> > ' + @FldName +
> '), 1), '''')
> > WHERE PATINDEX(''' +
> @PatIndex + ''', ' + @FldName + ') <> 0'
> > exec (@SQL)
> > print 'success REPLACING CHARACTERS IN ' + @DBName + '.' +
> @TblName + '.' + @FldName
> > END TRY
> > BEGIN CATCH
> > print 'There was an error REPLACING CHARACTERS IN ' +
> @DBName + '.' + @TblName + '.' + @FldName
> > print @SQL
> > print ERROR_MESSAGE()
> > END CATCH
> > END
> >
> >
> > John W. Colby
> > www.ColbyConsulting.com
> >
> >
> > Mark Breen wrote:
> >> Hello John,
> >> I used the following SQL today, and it works a treat, in my case, I
> wanted
> >> to exclude all except alphas, I know that it is not C#, and I know you
> may
> >> prefer C# as practice, so if it is no useful, just bin it.
> >>
> >> I got the code from this site, if you have a look, there is a little
> more
> >> there that you may be able to make use of.
> >>
> >> Thanks to Nigel Rivett.
> >>
> >> http://www.nigelrivett.net/SQLTsql/RemoveNonNumericCharacters.html
> >>
> >> <snip>
> >>
> >> WHILE @@rowcount > 0
> >> UPDATE tablename
> >> SET columnname = REPLACE(columnname,
> >> SUBSTRING(columnname,
> >> PATINDEX('%[^a-zA-Z0-9]%',
> >> columnname), 1),
> '')
> >> WHERE PATINDEX('%[^a-zA-Z0-9]%', columnname) <> 0
> >>
> >> </snip>
> >>
> >>
> >> Mark
> >
> > _______________________________________________
> > dba-VB mailing list
> > dba-VB at databaseadvisors.com
> > http://databaseadvisors.com/mailman/listinfo/dba-vb
> > http://www.databaseadvisors.com
> >
> >
> _______________________________________________
> dba-VB mailing list
> dba-VB at databaseadvisors.com
> http://databaseadvisors.com/mailman/listinfo/dba-vb
> http://www.databaseadvisors.com
>
>