[dba-SQLServer] deleting an unknown number of duplicates

David Lewis David at sierranevada.com
Mon Jul 9 12:35:03 CDT 2007



This should get you up and running.  HTH  D. Lewis


/*

Here is an example of the first sql statement that runs to determine how many duplicates there are.  You will need to first build this statement to fit your data.  The table I am working with has a 'somewhat' unique brewnumber (they begin again at 0 each year):

SELECT COUNT(*) AS Amount,
BrewNumber,
StartTimeRest,
StartTimeWhirlEnd
FROM tblRptWhirlpoolAll
GROUP BY BrewNumber,StartTimeRest,Probrewid,STartTimeWhirlEnd
HAVING COUNT(*) > 1
Order by brewnumber, probrewid

Once I have the row count, I subtract 1 from that and delete all the rows.
I build a cursor to loop through because I run this after a batch import each week, and usually there are any number of brews with duplicate records.

*/
-- declare all variables

DECLARE @iErrorVar int,
@iCount int,
@vchBrewid varchar(10),
@vchmalttypeid varchar(2),
@vchsilo varchar(3),
@vchPounds varchar(6),
@chCount char(3),
@nvchCommand nvarchar(4000),
@intCounter int
-- set initial environment
SET @intCounter=0
SET ROWCOUNT 0
SET NOCOUNT ON

--Notice that in declaring the cursor, the data in the select query are all --cast as varchar's.  This is because I later concatenate all these variable --values into one sql string, and t-sql won't concatenate non-characters.

DECLARE DelDupe CURSOR FOR
SELECT COUNT(*) AS Amount,
CONVERT(varchar(10),Brewid),
CONVERT(varchar(2),malttypeid),
CONVERT(varchar(6),pounds),
CONVERT(varchar(3),silo)
FROM tblBrewMalt
GROUP BY Brewid,
malttypeid,pounds,silo
HAVING COUNT(*) > 1


OPEN DelDupe
FETCH NEXT FROM DelDupe INTO @iCount, at vchBrewid,
@vchmalttypeid,
@vchpounds, at vchsilo
WHILE (@@fetch_status = 0)
BEGIN

-- Calculate number of rows to delete for each grouping by subtracting
-- 1 from the total count for a given group.
-- Every iteration through the cursor will generate a delete statement.
-- Often, though, it is deleting '0' rows, because there are no duplicates.
-- It was easier to just have one bit of code than having it branch.
--  Note that the number of rows to delete is set by the "SET ROWCOUNT"


SELECT @iCount = @iCount - 1
SELECT @chCount = CONVERT(char(3), at iCount)
-- now build the rowcount and delete statements.
SELECT @nvchCommand = N'SET ROWCOUNT ' + @chCount +
'DELETE tblBrewMalt ' +
' WHERE Brewid = ' + '''' + @vchBrewid + '''' +
' AND Malttypeid = ' + '''' + @vchmalttypeid + '''' +
' AND pounds = ' + '''' + @vchPounds + '''' +
' AND silo = ' + '''' + @vchsilo + ''''

-- print the statement. For debugging.

PRINT Convert(varchar(5), at intCounter) + ' ' + @nvchCommand
-- execute the statement.
EXEC sp_executesql @nvchCommand
SELECT @intCounter=@intCounter+1
SELECT @iErrorVar = @@Error
IF @iErrorVar <> 0
BEGIN
RETURN
END

FETCH NEXT FROM DelDupe INTO @iCount, at vchBrewid,
@vchmalttypeid,
@vchpounds, at vchsilo


END

CLOSE DelDupe
DEALLOCATE DelDupe

RETURN







More information about the dba-SQLServer mailing list