Paul Nielsen
pauln at sqlserverbible.com
Tue Mar 18 23:17:59 CDT 2008
I'd do the BulkINsert into a staging table then eliminate the dups as part of the cleanup. Here are two popular ways to eliminate dups... Use Tempdb drop table dups go CREATE TABLE dbo.dups ( col1 INT, col2 CHAR(5), col3 CHAR(5) ) go INSERT dbo.dups (col1, col2, col3) -- 1 VALUES (1, 'abc', 'Paul') INSERT dbo.dups (col1, col2, col3) -- 2 VALUES (1, 'abc', 'Paul') --dup INSERT dbo.dups (col1, col2, col3) -- 3 VALUES (2, 'abc', 'Paul') INSERT dbo.dups (col1, col2, col3) -- 4 VALUES (2, 'xyz', 'Sue') INSERT dbo.dups (col1, col2, col3) -- 5 VALUES (2, 'xyz', 'Sue') -- dup INSERT dbo.dups (col1, col2, col3) -- 6 VALUES (2, 'xyz', 'Sue') -- dup INSERT dbo.dups (col1, col2, col3) -- 7 VALUES (3, 'xyz', 'Sue') go -- eliminate dups - select distinct method SELECT DISTINCT * INTO nodups FROM dups SELECT * FROM nodups -- eliminate dups group by adding a pk ALTER TABLE dbo.dups ADD pk INT NOT NULL IDENTITY PRIMARY KEY -- identify dups SELECT * FROM dbo.dups WHERE PK NOT IN ( SELECT min(pk) FROM dups GROUP BY col1, col2, col3 ) -- remove dups DELETE dbo.dups WHERE PK NOT IN ( SELECT min(pk) FROM dups GROUP BY col1, col2, col3 ) SELECT * FROM dbo.dups -----Original Message----- From: dba-sqlserver-bounces at databaseadvisors.com [mailto:dba-sqlserver-bounces at databaseadvisors.com] On Behalf Of Mark A Matte Sent: Tuesday, March 18, 2008 4:33 PM To: Discussion concerning MS SQL Server Subject: Re: [dba-SQLServer] Duplicate records I get the data in a 2+gig text file...how do import without dups...I tried that first...and it just stopped the import? > Date: Tue, 18 Mar 2008 14:42:07 -0700 > From: fhtapia at gmail.com > To: dba-sqlserver at databaseadvisors.com > Subject: Re: [dba-SQLServer] Duplicate records > > Cursors are slow, you could choose to place an index to prevent duplicates > so duplicate information just never makes it into the table? > > > > On Tue, Mar 18, 2008 at 1:46 PM, Mark A Matte > wrote: > >> >> Hello All, >> >> I have a large table that occasionally duplicates make it into. The 2 >> fields that make the record unique are Case_ID and Note_Seq_Nbr. The >> duplicates are identical...so I have no primary key. I used the code below. >> It doesn't run very fast...but it does the job of removing just the extra >> records... >> >> Any thoughts,comments, or suggestions on making it faster/better...or a >> totally different approach? >> >> Thanks, >> >> Mark A. Matte >> >> >> >> **********************BEGIN************************ >> --By Mark A. Matte 3/18/2008 >> declare @CaseID decimal(15,0), at NoteNum int, at NoteCount int >> declare MyCursor cursor >> for Select case_id,note_seq_nbr,count(*) as CT >> from tblCaseNote_temp >> group by case_id,note_seq_nbr >> having count(*)>1 >> order by case_id,note_seq_nbr >> open mycursor >> fetch next from mycursor >> into @CaseID, at NoteNum, at NoteCount >> while (@@fetch_status =0) >> >> Begin >> delete top(@noteCount-1) from tblCaseNote_Temp where >> case_id=@CaseID and note_seq_nbr=@noteNum >> fetch next from mycursor >> into @CaseID, at NoteNum, at NoteCount >> End >> Close mycursor >> deallocate mycursor >> ***********************END************************* >> _________________________________________________________________ >> Need to know the score, the latest news, or you need your Hotmail(R)-get >> your "fix". >> http://www.msnmobilefix.com/Default.aspx >> _______________________________________________ >> dba-SQLServer mailing list >> dba-SQLServer at databaseadvisors.com >> http://databaseadvisors.com/mailman/listinfo/dba-sqlserver >> http://www.databaseadvisors.com >> >> > > > -- > -Francisco > http://sqlthis.blogspot.com | Tsql and More... > _______________________________________________ > dba-SQLServer mailing list > dba-SQLServer at databaseadvisors.com > http://databaseadvisors.com/mailman/listinfo/dba-sqlserver > http://www.databaseadvisors.com > _________________________________________________________________ Shed those extra pounds with MSN and The Biggest Loser! http://biggestloser.msn.com/ _______________________________________________ dba-SQLServer mailing list dba-SQLServer at databaseadvisors.com http://databaseadvisors.com/mailman/listinfo/dba-sqlserver http://www.databaseadvisors.com __________ NOD32 2958 (20080318) Information __________ This message was checked by NOD32 antivirus system. http://www.eset.com