Search This Blog

Tuesday, January 5, 2010

All about locking , blocking and deadlocking ....Part 5

This is the last one in this series .Let me know if you have any comments : Good or Bad (",)

How to reduce blocking and deadlocks:
I can give you one answer from my own and then we will see more : Do not commit the stupidity that I did in the example in the previous post (deadlock example).

As a DBA
-> Make sure that there are proper Indexes on tables and that they are actually being used.sometimes we feel relaxed after creating the indexes .But the query might be written in such a way that , that particular index might not be used and we might be doing scan instead of seek .One Such example is not using SARGable arguments (Now dont ask whats that ..we will discuss some other day or refer BOL).

-> Make sure you rebuild the indexes regularly based on the fragmentation level (you might be having so many scripts to do that).Do not bother to rebuild if the number of pages are less than 1000.rebuilding index will recompute the stats

-> Check if the Query is doing parallelism .If yes then Try everything else before running the query with maxdop hint.Many a times under CPU pressure the ECIDs have to wait on CPUs as its been taken by other threads.Thus causing blocking and or deadlocking .

-> Although the auto-update stats is ON by default.It does not mean that you should not update the stats .I will not cover this fact in detail here .Check auto-update statistics in BOL and you will know a lot ....trust me .

-> Any query that is going to do sorts in tempdb will surely take a lot of time as compared to in database sorting .For more on this check the facts on my Blog entry http://ms-abhay.blogspot.com/2009/11/sorts-spills-in-tempdb-due-to-incorrect.html

-> make sure that you are not facing disk IOs contention even after following steps .Perfmon will show high Disk queue length .Normally DQL of 2 per spindle is OK

-> Check on what resource the query is waiting using sysprocesses .It might be Network IO as well .

-> When you do a bulk insert disable the indexes and enable them during the time of least activity so that it can catch up to the new rows and update the stats accordingly .Or better go for rebuilding it after the bulk insert or delete.

-> Do not play with the locks option in sp_configure .I would suggest to keep the default setting of 0.

-> Deadlocks can occur due to memory pressure .When concurrent requests are waiting for memory grants that cannot be satisfied with the available memory,a deadlock can occur.For example,two concurrent queries,Q1 and Q2,acquire 10MB and 20MB of memory respectively.If each query needs 30MB and the total available memory is 20MB,then Q1 and Q2 must wait for each other to release memory.This results in a deadlock.


As a Developer

-> Use Temp tables instead of temp variables .Or test both and see which one works better .There is a great debate outside on this topic.

-> Use as less Adhoc queries as possible .There might be situations where too many adhoc queries kicks off the cached plans of SPs and in turn those SPs will
again be recompiled.If you want to use Adhoc queries use EXEC or sp_executesql and the plans will be cached.E.g. exec ('select * from t2') .First time it took 10 seconds .Second time it took 5 seconds.

-> Keep Small batches and transactions .Do not run a lot of transactions from one connection or SPs and try to achieve everything in one go.

-> Try to avoid Sorts in tempdb by creating and calling the temp table from the same SP and not from other SP .Google or Bing on SORT IN TEMPDB

-> Its not always true to say that more tempdb files will improveperformance .Your SAN should be capable enough to do that .

-> Choose Isolation level as per requirement .If you can live with read-uncommitted , go for it .

-> Its not bad to go for Snapshot isolation level but it will stress your tempdb.

-> Keep your table normalized so that you use the DISTINCT clause as less as possible .But make sure you do not over normalize the tables else you will be
ending up making your query more complex to understand for optimizer as it will have more joins .

-> One of the most important reasons for blocking and deadlocking could be lock escalation .If you have a slightest doubt that your query is going for lock
escalation , use Lock:Escalation event in profiler and see if its actually happening .If you find it happening , check if the same SPID is the blocker or
being blocked as well .you can disable lock escalation by enabling trace flag 1211 But I would not suggest that.To avoid lock escalation modify your query
to take less locks.

Tip :
Mostly I have seen that, if the deadlocking issue comes right after the application goes live then its the application fault (the way it synchronizes the
access to the data) .But if the application owner says that everything was fine and he is facing the deadlocks all of a suden then some major data change
might have happened in the tables that might be causing blocking and deadlocking .One such cause is " Parameter Sniffing ".

Sometimes we do a select in such a way that we crosses the threshold of in-database sort and the sort goes to tempdb
(http://ms-abhay.blogspot.com/2009/11/sorts-spills-in-tempdb-due-to-incorrect.html) .In this case we need to keep on testing the query until we find the threshold where the query spills the sorts in the tempdb and reduce the value of the column in the where clause.

Both Developer and DBA has to work together in resolving such issues.


SET DEADLOCK_PRIORITY Option

Sometimes , you might wish that in any case a particular transaction should not be treated as a victim in the event of deadlock.
In that case you can set the deadlock_priority of the trasnaction higher than others.An example could be a case of an update and select , you might be ready
to rollback the select but you want the update to be successful .

Which session is chosen as the deadlock victim depends on each session's deadlock priority:
-If both sessions have the same deadlock priority, the instance of SQL Server chooses the session that is less expensive to roll back as the deadlock victim.
For example, if both sessions have set their deadlock priority to HIGH, the instance will choose as a victim the session it estimates is less costly to roll
back.

-If the sessions have different deadlock priorities, the session with the lowest
deadlock priority is chosen as the deadlock victim.

Unresolved Deadlocks
You need not to bother about it .If it happens open a case with Microsoft as a prooduct bug .Either they have the hotfix or they dont .
In the error log you will find the deadlock graph and the Dump information .

*******************************************************************************
2007-09-12 11:43:53.51 spid4s *
2007-09-12 11:43:53.51 spid4s * BEGIN STACK DUMP:
2007-09-12 11:43:53.51 spid4s * 09/12/07 11:43:53 spid 4
2007-09-12 11:43:53.51 spid4s *
2007-09-12 11:43:53.51 spid4s * Unresolved deadlock
2007-09-12 11:43:53.51 spid4s *
2007-09-12 11:43:53.51 spid4s *
2007-09-12 11:43:53.51 spid4s *
*******************************************************************************
You will also see Cost:(N/A) and deadlock victim=process0 in the deadlock graph.


Few references
Batch Compilation, Recompilation, and Plan Caching Issues in SQL Server 2005
http://technet.microsoft.com/hi-in/library/cc966425(en-us).aspx

How to resolve blocking problems that are caused by lock escalation in SQL Server
http://support.microsoft.com/kb/323630

All about locking , blocking and deadlocking ....Part 4

continuing from the previous 3 entries ............

Blocking
So far we have executed many queries and seen blocking as well.But I would still like to discuss blocking a bit more ...
Use sysprocesses DMV/Table to find blocking .This has 3 advanages .One that you will see only the blocking SPIDs and not all the SPIDs by using sp_lock.
Two , you will be able to see head of the blocking chain. Three , you wil be able to see on what resource the SPID is waiting for .Once you have all the knowledge above you can surely use sp_lock to get some more information .

Lab

1st User
set transaction isolation level serializable
begin tran
update t1 set c1 = 3000 where c2 < 3000
--rollback

2nd user
set transaction isolation level serializable
begin tran
select * from t1

Now check the sysprocesses DMV (I am using 2005) :
select * from sys.sysprocesses where blocked <>0

You will notice that lastwaittype is LCK_M_IS
This means that the SPID which is being blocked is actually waiting for an IS lock to grab but cannot and being blockd by another spid.
The SPID that is blocking is shown in the "Blocked" column .
You can cross check by running sp_lock .

Now if you run one more query :
set transaction isolation level serializable
begin tran
select * from t1

You will notice the now there are 2 rows in the sysprocesses DMV where blocked <>0
here you will see :

SPID A is blocked by SPID B --> SPID B is blocked by SPID C
So who is the actual culprit ..yes its C and is called as the head of the blockign chain .


Now a tough question for you :
Why SPID A is blocked by SPID B and not C ? .....Think :) ..


Deadlocks
you have already seen blocking ..
Lets try our hands at creating a deadlock.I am not going to explain what is a deadlock .

For the benefit of all I will keep the Isolation level to serializable .
It will help us in creating deadlocks easily .

Lab
First do this :
select count(*) from t2
select count(*) from t1

If the number of rows are same then we are good to go .
If not, then drop the table with less number of rows and recreate it with the help of another table .
How to Do it .......check previous updates ...



First Window :
set transaction isolation level serializable
go
begin tran
update dbo.t3 set c2 = 230 where c2 > 1000 and c3 > 700

Second Window :
set transaction isolation level serializable
go
begin tran
update dbo.t2 set c3 = 230 where c2 > 1000 and c3 > 700


First Window :
begin tran
select * from t2
--rollback

Second Window :
begin tran
select * from t3
--rollback


You will see that the SPID in second window will be the deadlock victim:

Msg 1205, Level 13, State 56, Line 2
Transaction (Process ID 52) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.


Like this you can create many deadlocks

How to capture the deadlock Graph
There are 2 ways :

- Use Traceflag 1204 <-- for 2000 and 1222 <-- for 2005 and above.
- Use WMI class deadlock_graph to capture the graph for SQL Server 2005 and above(for 2000 its a bit tricky .Hence not mentioning it here.)


Using TraceFlag 1222
dbcc traceon(1222 , -1)
go

Create the deadlock (I am creating the same one as mentioned above)

Using WMI Class and SQL Server Agent
Please go through one of my previous post : http://ms-abhay.blogspot.com/2009/09/wmi-alert-for-deadlocking-no-need-to.html

How to read the deadlock graph
Check the Error logs and copy the deadlock information on a separate notepad window.
You can find my analysis below:

deadlock-list
deadlock victim=process968d48
process-list
process id=process968d48 taskpriority=0 logused=543996 waitresource=OBJECT: 8:629577281:0 waittime=3359 ownerId=29121 transactionname=user_transaction lasttranstarted=2009-12-28T15:40:39.240 XDES=0x3e8f838 lockMode=IS schedulerid=2 kpid=6124 status=suspended spid=52 sbid=0 ecid=0 priority=0 transcount=2 lastbatchstarted=2009-12-28T15:40:56.500 lastbatchcompleted=2009-12-28T15:40:39.413 lastattention=2009-12-28T15:31:16.907 clientapp=Microsoft SQL Server Management Studio - Query hostname=abchaudh hostpid=1336 loginname=abchaudh\Abhay isolationlevel=serializable (4) xactid=29121 currentdb=8 lockTimeout=4294967295 clientoption1=671090784 clientoption2=390200
executionStack
frame procname=adhoc line=2 stmtstart=26 sqlhandle=0x02000000bea8742faf157cd9e8e65d380d705fefe8dfb182
select * from t3
inputbuf
begin tran
select * from t3
process id=process969018 taskpriority=0 logused=576116 waitresource=OBJECT: 8:597577167:0 waittime=10000 ownerId=28599 transactionname=user_transaction lasttranstarted=2009-12-28T15:40:34.780 XDES=0x3e8f290 lockMode=IS schedulerid=2 kpid=8140 status=suspended spid=53 sbid=0 ecid=0 priority=0 transcount=2 lastbatchstarted=2009-12-28T15:40:49.873 lastbatchcompleted=2009-12-28T15:40:34.907 lastattention=2009-12-28T15:27:27.483 clientapp=Microsoft SQL Server Management Studio - Query hostname=abchaudh hostpid=1336 loginname=abchaudh\Abhay isolationlevel=serializable (4) xactid=28599 currentdb=8 lockTimeout=4294967295 clientoption1=671090784 clientoption2=390200
executionStack
frame procname=adhoc line=2 stmtstart=26 sqlhandle=0x02000000b7cdeb20804433e7571f29d85516ff688d005724
select * from t2
inputbuf
begin tran
select * from t2



resource-list
objectlock lockPartition=0 objid=597577167 subresource=FULL dbid=8 objectname=deadlocktest.dbo.t2 id=lockcc7f700 mode=X associatedObjectId=597577167
owner-list
owner id=process968d48 mode=X
waiter-list
waiter id=process969018 mode=IS requestType=wait
objectlock lockPartition=0 objid=629577281 subresource=FULL dbid=8 objectname=deadlocktest.dbo.t3 id=lockcc7f780 mode=X associatedObjectId=629577281
owner-list
owner id=process969018 mode=X
waiter-list
waiter id=process968d48 mode=IS requestType=wait




Analysis :
----------
Divide the output into 2 parts : Deadlock list and resource list

process968d48
--------------
logused=543996
waitresource=OBJECT: 8:629577281:0 <-- here 8 is DBID ,629577281 is the object ID and 0 is heap
lockMode=IS
spid=52
sbid=0 <-- Its the Batch ID .It will be 0 until the application is using MARS (check BOL for MARS)
ecid=0 <-- zero means no paralelism is going on (Refer BOL for Query Parallelism)
transcount=2 <-- We are running 2 transaction under this SPID (select and update)
isolationlevel=serializable
currentdb=8
procname=adhoc (select * from t3) <-- Its an adhoc query
inputbuf =
begin tran
select * from t3


process969018
--------------
logused=576116
waitresource=OBJECT: 8:597577167:0
lockMode=IS
spid=53
sbid=0
ecid=0
transcount=2
isolationlevel=serializable
currentdb=8
procname=adhoc (Select * from t2)
inputbuf=
begin tran
select * from t2


Story
---------
-> At the time of deadlock 2 queries were running (under SPID 52 and 53) .
-> The processes were process968d48 for SPID 52 and process969018 for SPID 53
-> process968d48 took X lock on table T2 and requested IS on table T3 ..which was blocked by process969018
-> process969018 took X lock on table T3 and requested IS on table T2.. which was blocked by process968d48
-> deadlock victim is process968d48

Why process968d48 is deadlock victim ?
Because it has less undo to rollback .

How do you know if it has less undo ?
See log used values .


Rest ..........in next blog ....keep looking ...