Mark Breen
marklbreen at gmail.com
Thu Oct 22 09:29:54 CDT 2009
Hello John, thanks for the credit, but the real credit goes to Nigel Rivett. But thanks anyway Have a nice weekend, Mark Mark Breen said >> I got the code from this site, if you have a look, there is a little more >> there that you may be able to make use of. >> >> Thanks to Nigel Rivett. >> >> http://www.nigelrivett.net/SQLTsql/RemoveNonNumericCharacters.html >> >> <snip> 2009/10/21 jwcolby <jwcolby at colbyconsulting.com> > Just an FYI, I ran this against 1 million records, in a query that pulled > the top(1000000) City > fields (where this specific problem lies) and it took about 13 seconds. > This is with an index in > place that includes the city field. I then ran it against the entire table > (62.8 million records) > and it took 2 minutes 52 seconds. > > This will become a standard part of my program for cleaning up my data. > > Thanks Mark Breen! > > John W. Colby > www.ColbyConsulting.com > > > jwcolby wrote: > > Mark, > > > > Below is the SP I created to perform this magic. I will be running this > on 66 million records so we > > shall see about performance. > > > > set ANSI_NULLS ON > > set QUOTED_IDENTIFIER ON > > go > > > > -- ============================================= > > -- Author: <Author,,Name> > > -- Create date: <Create Date,,> > > -- Description: <Description,,> > > -- > > --WHILE @@rowcount > 0 > > -- UPDATE tablename > > -- SET columnname = REPLACE(columnname, > > -- SUBSTRING(columnname, > > -- > PATINDEX('%[^a-zA-Z0-9]%', > > -- columnname), 1), > '') > > -- WHERE PATINDEX('%[^a-zA-Z0-9]%', columnname) <> 0 > > -- > > -- ============================================= > > ALTER PROCEDURE [dbo].[sp_ReplaceCharacters] > > -- Add the parameters for the stored procedure here > > @DBName varchar(50), @TblName varchar(50), @FldName varchar(50) > > AS > > BEGIN > > -- SET NOCOUNT ON added to prevent extra result sets from > > -- interfering with SELECT statements. > > SET NOCOUNT ON; > > declare @SQL varchar(1000) > > declare @PatIndex varchar(50) > > > > BEGIN TRY > > set @PatIndex = '%[^a-zA-Z0-9 .,#]%' > > SELECT @SQL = 'WHILE @@rowcount > 0 > > UPDATE ' + @DBName + > '.dbo.' + @TblName + ' > > SET ' + @FldName + ' = > REPLACE(' + @FldName + ', > > SUBSTRING(' + @FldName + ', > > PATINDEX(''' + @PatIndex + > ''', > > ' + @FldName + > '), 1), '''') > > WHERE PATINDEX(''' + > @PatIndex + ''', ' + @FldName + ') <> 0' > > exec (@SQL) > > print 'success REPLACING CHARACTERS IN ' + @DBName + '.' + > @TblName + '.' + @FldName > > END TRY > > BEGIN CATCH > > print 'There was an error REPLACING CHARACTERS IN ' + > @DBName + '.' + @TblName + '.' + @FldName > > print @SQL > > print ERROR_MESSAGE() > > END CATCH > > END > > > > > > John W. Colby > > www.ColbyConsulting.com > > > > > > Mark Breen wrote: > >> Hello John, > >> I used the following SQL today, and it works a treat, in my case, I > wanted > >> to exclude all except alphas, I know that it is not C#, and I know you > may > >> prefer C# as practice, so if it is no useful, just bin it. > >> > >> I got the code from this site, if you have a look, there is a little > more > >> there that you may be able to make use of. > >> > >> Thanks to Nigel Rivett. > >> > >> http://www.nigelrivett.net/SQLTsql/RemoveNonNumericCharacters.html > >> > >> <snip> > >> > >> WHILE @@rowcount > 0 > >> UPDATE tablename > >> SET columnname = REPLACE(columnname, > >> SUBSTRING(columnname, > >> PATINDEX('%[^a-zA-Z0-9]%', > >> columnname), 1), > '') > >> WHERE PATINDEX('%[^a-zA-Z0-9]%', columnname) <> 0 > >> > >> </snip> > >> > >> > >> Mark > > > > _______________________________________________ > > dba-VB mailing list > > dba-VB at databaseadvisors.com > > http://databaseadvisors.com/mailman/listinfo/dba-vb > > http://www.databaseadvisors.com > > > > > _______________________________________________ > dba-VB mailing list > dba-VB at databaseadvisors.com > http://databaseadvisors.com/mailman/listinfo/dba-vb > http://www.databaseadvisors.com > >