[dba-Tech] SQL Question
Stuart McLachlan
stuart at lexacorp.com.pg
Mon Jul 16 08:32:32 CDT 2018
Ok, it's not RBAR :)
They are deleting sets of 10,000 records at a time (and creating an index frist to make
selecting those 10,000 records as quick as possible)
They are grabbing the PK of 10,000 records at a time into a temporary table.then deleting all
related records in another table doing an INNER JOIN between the temp table and the
related table. They repeat this process for six different tables in each loop.
The WHILE (1=1) keeps the BEGIN....END block repeating until:
if(@DeleteRowCount > @rowsAffected)
return
That is - if the last "Select top (@DeleteRowCount) AsyncOperationId from
AsyncOperationBase" returned less than 10,000 records we have processed the last block
of matching records so exit the proecdure
Not sure why it needs a 2 second delay on each iteration, but apart from that, it all looks
good to me. Looks fairly well optimised assuming that AsyncOperationId is indexed in each
of the tables being deleted (that may be worth checking).
(Those delays only add about half an hour for 22 million records.)
On 16 Jul 2018 at 12:49, Martin Reid wrote:
> It's an available MS script Their engineer told us only supported way
> to do this. Full Script is
>
> IF EXISTS (SELECT name from sys.indexes
> WHERE name = N'CRM_AsyncOperation_CleanupCompleted')
> DROP Index
> AsyncOperationBase.CRM_AsyncOperation_CleanupCompleted
> GO
> CREATE NONCLUSTERED INDEX CRM_AsyncOperation_CleanupCompleted
> ON [dbo].[AsyncOperationBase]
> ([StatusCode],[StateCode],[OperationType]) GO
>
> while(1=1)
> begin
> declare @DeleteRowCount int = 10000
> declare @rowsAffected int
> declare @DeletedAsyncRowsTable table (AsyncOperationId
> uniqueidentifier not null primary key) insert into
> @DeletedAsyncRowsTable(AsyncOperationId) Select top (@DeleteRowCount)
> AsyncOperationId from AsyncOperationBase where
> OperationType in (1, 9, 12, 25, 27, 10)
> AND StateCode = 3
> AND StatusCode in (30, 32)
>
> select @rowsAffected = @@rowcount
> delete poa from PrincipalObjectAccess poa
> join WorkflowLogBase wlb on
> poa.ObjectId = wlb.WorkflowLogId
> join @DeletedAsyncRowsTable dart on
> wlb.AsyncOperationId = dart.AsyncOperationId
> delete WorkflowLogBase from WorkflowLogBase W, @DeletedAsyncRowsTable
> d where
> W.AsyncOperationId = d.AsyncOperationId
> delete BulkDeleteFailureBase From BulkDeleteFailureBase B,
> @DeletedAsyncRowsTable d
> where
> B.AsyncOperationId = d.AsyncOperationId
> delete BulkDeleteOperationBase From BulkDeleteOperationBase O,
> @DeletedAsyncRowsTable d where
> O.AsyncOperationId = d.AsyncOperationId
> delete WorkflowWaitSubscriptionBase from WorkflowWaitSubscriptionBase
> WS, @DeletedAsyncRowsTable d where
> WS.AsyncOperationId = d.AsyncOperationID
> delete AsyncOperationBase From AsyncOperationBase A,
> @DeletedAsyncRowsTable d
> where
> A.AsyncOperationId = d.AsyncOperationId
> /*If not calling from a SQL job, use the WAITFOR DELAY*/
> if(@DeleteRowCount > @rowsAffected)
> return
> else
> WAITFOR DELAY '00:00:02.000'
> end
>
>
>
>
> -----Original Message-----
> From: dba-Tech <dba-tech-bounces at databaseadvisors.com> On Behalf Of
> Stuart McLachlan Sent: 16 July 2018 13:47 To: Discussion of Hardware
> and Software issues <dba-tech at databaseadvisors.com> Subject: Re:
> [dba-Tech] SQL Question
>
> I assume that it looks like:
>
> WHILE (1=1)
> BEGIN
> ....
> IF..... BREAK;
> ....
> END
>
> It's a common way to create an infinite loop since SQL doen't have an
> absolute. DO_LOOP construct. But it s inefficent because (1=1) has
> to be evaluated on every iteration.
>
> Again, without knowing what's in the BEGIN... END loop, it's hard to
> tell, but from how long it takes, I'd suspect that it is using a
> cursor, stepping through a large recordset and deleting records one
> at a time. This is often called RBAR programming ( "Row By Agonising
> Row") and is notoriously slow with large recordsets. Far better to
> try using one or a series of set based deletions i.e DELETE FROM....
> WHERE...
>
> It may be worth googling SQL RBAR for lots of discussion on the
> subject. :-)
>
>
>
> On 16 Jul 2018 at 8:02, Susan Harkins wrote:
>
> > From what I've read, it's for testing -- a placeholder of sorts. I'd
> > delete it and try again.
> >
> > Susan H.
> >
> >
> > We are running a script to delete a few million records from a CRM
> > SQL Server. This bit is confusing us "While 1 =1" . We can't
> > understand how this would ever evaluate to anything other than 1 =1?
> >
> > IF EXISTS (SELECT name from sys.indexes WHERE name =
> > N'CRM_AsyncOperation_CleanupCompleted')
> > DROP Index
> > AsyncOperationBase.CRM_AsyncOperation_CleanupCompleted
> > GO
> > CREATE NONCLUSTERED INDEX CRM_AsyncOperation_CleanupCompleted
> > ON [dbo].[AsyncOperationBase]
> > ([StatusCode],[StateCode],[OperationType]) GO
> >
> > while(1=1)
> > begin
> >
> > We also ran the full script for 34 hours and it ran successfully but
> > turns out left 22million records undeleted all of which meet the
> > deletion criteria!
> >
> > It's a Microsoft script. Confusing.
> >
> > Martin
> >
> >
> > _______________________________________________
> > dba-Tech mailing list
> > dba-Tech at databaseadvisors.com
> > http://databaseadvisors.com/mailman/listinfo/dba-tech
> > Website: http://www.databaseadvisors.com
> >
> > _______________________________________________
> > dba-Tech mailing list
> > dba-Tech at databaseadvisors.com
> > http://databaseadvisors.com/mailman/listinfo/dba-tech
> > Website: http://www.databaseadvisors.com
> >
>
>
> _______________________________________________
> dba-Tech mailing list
> dba-Tech at databaseadvisors.com
> http://databaseadvisors.com/mailman/listinfo/dba-tech
> Website: http://www.databaseadvisors.com
>
> _______________________________________________
> dba-Tech mailing list
> dba-Tech at databaseadvisors.com
> http://databaseadvisors.com/mailman/listinfo/dba-tech
> Website: http://www.databaseadvisors.com
>
More information about the dba-Tech
mailing list