[dba-SQLServer] Duplicate records

Paul Nielsen pauln at sqlserverbible.com
Wed Mar 19 09:59:18 CDT 2008


A cursor is like going to the bank and depositing 1 million dollars, one
dollar at a time, with a deposit ticket for each dollar. 

It's not only that cursors are paaaaaainfully slow, it's that thinking about
solving problems in the cursor mindset leads to sloppier code. DB
developers think in terms of sets. 

-Paul
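
For comparison, the whole cursor loop in Mark's code below collapses to a
single set-based statement. This is only a sketch, assuming SQL Server 2005
or later and the tblCaseNote_temp table and columns from Mark's code; since
the duplicate rows are identical, it doesn't matter which copy survives:

-- delete every copy but one per (case_id, note_seq_nbr), in one statement
WITH numbered AS (
    SELECT ROW_NUMBER() OVER (
               PARTITION BY case_id, note_seq_nbr
               ORDER BY (SELECT 0)   -- arbitrary order; the dups are identical
           ) AS rn
    FROM dbo.tblCaseNote_temp
)
DELETE FROM numbered
WHERE rn > 1;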



-----Original Message-----
From: dba-sqlserver-bounces at databaseadvisors.com
[mailto:dba-sqlserver-bounces at databaseadvisors.com] On Behalf Of Mark A
Matte
Sent: Wednesday, March 19, 2008 8:21 AM
To: Discussion concerning MS SQL Server
Subject: Re: [dba-SQLServer] Duplicate records


Thanks for the suggestions.  
 
I know cursors are slow...but I'm not going through every record...just the
dups.  Other than potential speed issues...does anyone see anything that
deserves caution here? I kinda like the 'find them...then delete' approach.
 
Thanks,
 
Mark A. Matte
 
 **********************BEGIN************************
--By Mark A. Matte 3/18/2008
-- Find each (case_id, note_seq_nbr) pair that appears more than once,
-- then delete all but one copy of each duplicated pair.
declare @CaseID decimal(15,0), @NoteNum int, @NoteCount int

declare MyCursor cursor for
    select case_id, note_seq_nbr, count(*) as CT
    from tblCaseNote_temp
    group by case_id, note_seq_nbr
    having count(*) > 1
    order by case_id, note_seq_nbr

open MyCursor
fetch next from MyCursor into @CaseID, @NoteNum, @NoteCount

while (@@fetch_status = 0)
begin
    -- the duplicates are identical, so deleting TOP (n-1) of them
    -- leaves exactly one copy behind
    delete top (@NoteCount - 1) from tblCaseNote_Temp
    where case_id = @CaseID and note_seq_nbr = @NoteNum

    fetch next from MyCursor into @CaseID, @NoteNum, @NoteCount
end

close MyCursor
deallocate MyCursor
***********************END*************************

> From: pauln at sqlserverbible.com
> To: dba-sqlserver at databaseadvisors.com
> Date: Tue, 18 Mar 2008 22:17:59 -0600
> Subject: Re: [dba-SQLServer] Duplicate records
>
> I'd do the BulkInsert into a staging table then eliminate the dups as
> part of the cleanup.
>
> Here are two popular ways to eliminate dups...
>
> Use Tempdb
>
> drop table dups
> go
> CREATE TABLE dbo.dups (
>   col1 INT,
>   col2 CHAR(5),
>   col3 CHAR(5)
>   )
> go
> INSERT dbo.dups (col1, col2, col3) -- 1
>   VALUES (1, 'abc', 'Paul')
> INSERT dbo.dups (col1, col2, col3) -- 2
>   VALUES (1, 'abc', 'Paul') -- dup
> INSERT dbo.dups (col1, col2, col3) -- 3
>   VALUES (2, 'abc', 'Paul')
> INSERT dbo.dups (col1, col2, col3) -- 4
>   VALUES (2, 'xyz', 'Sue')
> INSERT dbo.dups (col1, col2, col3) -- 5
>   VALUES (2, 'xyz', 'Sue') -- dup
> INSERT dbo.dups (col1, col2, col3) -- 6
>   VALUES (2, 'xyz', 'Sue') -- dup
> INSERT dbo.dups (col1, col2, col3) -- 7
>   VALUES (3, 'xyz', 'Sue')
> go
>
> -- eliminate dups - select distinct method
>
> SELECT DISTINCT *
> INTO nodups
> FROM dups
>
> SELECT * FROM nodups
>
> -- eliminate dups - group by method, adding a pk
>
> ALTER TABLE dbo.dups
>   ADD pk INT NOT NULL IDENTITY PRIMARY KEY
>
> -- identify dups
> SELECT *
> FROM dbo.dups
> WHERE pk NOT IN (
>   SELECT min(pk)
>   FROM dups
>   GROUP BY col1, col2, col3
>   )
>
> -- remove dups
> DELETE dbo.dups
> WHERE pk NOT IN (
>   SELECT min(pk)
>   FROM dups
>   GROUP BY col1, col2, col3
>   )
>
> SELECT * FROM dbo.dups
>
>
> -----Original Message-----
> From: dba-sqlserver-bounces at databaseadvisors.com
> [mailto:dba-sqlserver-bounces at databaseadvisors.com] On Behalf Of Mark A
> Matte
> Sent: Tuesday, March 18, 2008 4:33 PM
> To: Discussion concerning MS SQL Server
> Subject: Re: [dba-SQLServer] Duplicate records
>
> I get the data in a 2+ gig text file...how do I import without dups...I
> tried that first...and it just stopped the import?
>
> > Date: Tue, 18 Mar 2008 14:42:07 -0700
> > From: fhtapia at gmail.com
> > To: dba-sqlserver at databaseadvisors.com
> > Subject: Re: [dba-SQLServer] Duplicate records
> >
> > Cursors are slow, you could choose to place an index to prevent
> > duplicates so duplicate information just never makes it into the table?
> >
> > On Tue, Mar 18, 2008 at 1:46 PM, Mark A Matte wrote:
> >>
> >> Hello All,
> >>
> >> I have a large table that duplicates occasionally make it into. The 2
> >> fields that make the record unique are Case_ID and Note_Seq_Nbr. The
> >> duplicates are identical...so I have no primary key. I used the code
> >> below. It doesn't run very fast...but it does the job of removing just
> >> the extra records...
> >>
> >> Any thoughts, comments, or suggestions on making it faster/better...or
> >> a totally different approach?
> >>
> >> Thanks,
> >>
> >> Mark A. Matte
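
A footnote on the import question above: a plain unique constraint aborts
the whole load at the first duplicate, which is presumably why the import
"just stopped." Francisco's idea still works if the unique index is created
WITH (IGNORE_DUP_KEY = ON), which discards duplicate keys with a warning
instead of failing, and it combines naturally with Paul's staging-table
approach. A sketch only: the staging table, target table name, file path,
and format options below are placeholders, not from the thread.

-- 1. Bulk load the raw 2+ gig text file into a staging heap with no
--    constraints, so nothing can stop the import.
BULK INSERT dbo.tblCaseNote_Staging
FROM 'C:\data\casenotes.txt'                    -- placeholder path
WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n', TABLOCK);

-- 2. A unique index with IGNORE_DUP_KEY on the real table silently drops
--    rows whose (case_id, note_seq_nbr) already exists (SQL 2005 syntax).
CREATE UNIQUE INDEX IX_CaseNote_CaseId_NoteSeq
ON dbo.tblCaseNote (case_id, note_seq_nbr)
WITH (IGNORE_DUP_KEY = ON);

-- 3. Move the rows across; duplicates raise "Duplicate key was ignored."
--    and are skipped, everything else is inserted.
INSERT dbo.tblCaseNote (case_id, note_seq_nbr /* , other columns */)
SELECT case_id, note_seq_nbr /* , other columns */
FROM dbo.tblCaseNote_Staging;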






