在过去很长一段时间,不断有客人会问道:“在事先没有任何跟踪或者监控部署的情况下,阿里云RDS SQL Server有没有办法获取到历史死锁信息,供我们分析?”。在写到RDS SQL Server死锁系列文章之五时,我们就可以使用Extended Events来解决这个问题。
Extended Events是微软从SQL Server 2008版本开始引入的,其中有一个默认事件会话是system_health,它的作用是用来收集SQL Server发生的错误信息,以XML格式存储在一个名为sys.dm_xe_session_targets的DMV中,这其中就包含了死锁信息。所以,我们可以利用这个DMV关联另外一个名为sys.dm_xe_sessions的DMV来获取死锁信息。注意:存储在这两个DMV中的信息会伴随SQL Server服务重启而消失。也就是说,我们无法获取到SQL Server服务重启之前的历史死锁信息。
为了描述清楚如何获取历史死锁信息,我们可以选择其中任意一条死锁信息加以分析,代码如下:
USE master GO -- analysis Extend Event for deadlock DECLARE @deadlock_graph_nvarchar nvarchar(max) ,@deadlock_graph_xml xml ; ;WITH RingBufferTarget AS ( SELECT CAST (target_data AS XML) AS target_xml FROM sys.dm_xe_session_targets st INNER JOIN sys.dm_xe_sessions s ON s.address = st.event_session_address WHERE s.[name] = 'system_health' and st.target_name = 'ring_buffer' ) SELECT TOP 1 @deadlock_graph_nvarchar = CAST(T.C.query('.') AS NVARCHAR(MAX)) FROM RingBufferTarget AS A CROSS APPLY target_xml.nodes('./RingBufferTarget/event') AS T(C) WHERE T.C.value('./@name','varchar(200)') = 'xml_deadlock_report' SELECT @deadlock_graph_xml = CAST(REPLACE(REPLACE(@deadlock_graph_nvarchar, '<', '<'),'>', '>') as xml) ; select @deadlock_graph_xml.query('event/data/value/deadlock') ;WITH deadlock AS ( SELECT OwnerID = T.C.value('@id', 'varchar(50)') ,SPid = T.C.value('(./@spid)[1]','int') ,status = T.C.value('(./@status)[1]','varchar(10)') ,Victim = case when T.C.value('@id', 'varchar(50)') = T.C.value('(./../../victim-list/victimProcess/@id)[1]','varchar(50)') then 1 else 0 end ,LockMode = T.C.value('@lockMode', 'varchar(20)') ,Inputbuf = T.C.value('(./inputbuf/text())[1]','varchar(max)') ,SPName = T.C.value('(./executionStack/frame/@procname)[1]','sysname') ,Hostname = T.C.value('(./@hostname)[1]','sysname') ,Clientapp = T.C.value('(./@clientapp)[1]','varchar(max)') ,LoginName = T.C.value('@loginname', 'varchar(20)') ,Action = T.C.value('(./@transactionname)[1]','varchar(max)') ,TransactionTime = T.C.value('@lasttranstarted', 'datetime') --,* FROM @deadlock_graph_xml.nodes('./event/data/value/deadlock/process-list/process') AS T(C) ) , keylock AS ( SELECT OwnerID = T.C.value('./owner[1]/@id', 'varchar(50)') ,KeylockObject = T.C.value('./../@objectname', 'varchar(200)') ,Indexname = T.C.value('./../@indexname', 'varchar(200)') ,IndexLockMode = T.C.value('./../@mode', 'varchar(20)') FROM @deadlock_graph_xml.nodes('./event/data/value/deadlock/resource-list/keylock/owner-list') AS T(C) ) SELECT A.SPid ,A.TransactionTime ,is_Vitim = A.Victim ,A.SPName ,A.LockMode ,B.Indexname ,B.KeylockObject ,B.IndexLockMode ,A.Inputbuf ,A.Hostname ,A.LoginName ,A.Clientapp ,A.Action ,status FROM deadlock AS A LEFT JOIN keylock AS B ON A.OwnerID = B.OwnerID从执行查询后的结果来看,我们成功拿到了历史死锁信息(从时间字段来看,死锁发生在8天以前),这些有用的信息包含:锁进程,死锁进程,锁的类型,执行的语句,登录用户等信息,如下截图所示:
我们也可以打开这个死锁信息的xml,如下:
<deadlock> <victim-list> <victimProcess id="process15ee08" /> </victim-list> <process-list> <process id="process15ee08" taskpriority="0" logused="0" waitresource="KEY: 14:72057594038910976 (8194443284a0)" waittime="3906" ownerId="23597" transactionname="user_transaction" lasttranstarted="2017-04-19T21:28:11.050" XDES="0x87141730" lockMode="X" schedulerid="1" kpid="4784" status="suspended" spid="64" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2017-04-19T21:28:11.050" lastbatchcompleted="2017-04-19T21:28:11.027" clientapp="Microsoft SQL Server Management Studio - Query" hostname="CHERISH-PC" hostpid="4284" loginname="Cherish-PC\Cherish" isolationlevel="read committed (2)" xactid="23597" currentdb="14" lockTimeout="4294967295" clientoption1="671090784" clientoption2="390200"> <executionStack> <frame procname="" line="8" stmtstart="58" sqlhandle="0x020000008902b6141ee31ae1865c893c9823c9cf5d55fafb" /> <frame procname="" line="8" stmtstart="156" stmtend="276" sqlhandle="0x020000008e0a8d0d015ffe0258d01a670c6864df6370c807" /> </executionStack> <inputbuf> BEGIN TRAN UPDATE dbo.test_deadlock2 SET name = 'CC' WHERE id = 1 ; UPDATE dbo.test_deadlock1 SET name = 'CC' WHERE id = 1 ; COMMIT </inputbuf> </process> <process id="process15f048" taskpriority="0" logused="0" waitresource="KEY: 14:72057594038976512 (8194443284a0)" waittime="503" ownerId="23574" transactionname="user_transaction" lasttranstarted="2017-04-19T21:28:09.450" XDES="0x87140e80" lockMode="X" schedulerid="1" kpid="4864" status="suspended" spid="63" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2017-04-19T21:28:09.443" lastbatchcompleted="2017-04-19T21:28:09.440" clientapp="Microsoft SQL Server Management Studio - Query" hostname="CHERISH-PC" hostpid="4284" loginname="Cherish-PC\Cherish" isolationlevel="read committed (2)" xactid="23574" currentdb="14" lockTimeout="4294967295" clientoption1="671090784" clientoption2="390200"> <executionStack> <frame procname="" line="9" stmtstart="58" sqlhandle="0x0200000070cad20bef4ae5ada9481eea2eb28415cd7e0c04" /> <frame procname="" line="9" stmtstart="208" stmtend="328" sqlhandle="0x020000008b7a380c6bf24758d2b29f0eeb276e4f0aa76d8f" /> </executionStack> <inputbuf> BEGIN TRAN UPDATE dbo.test_deadlock1 SET name = 'CC' WHERE id = 1 ; WAITFOR DELAY '00:00:05' UPDATE dbo.test_deadlock2 SET name = 'CC' WHERE id = 1 ; ROLLBACK </inputbuf> </process> </process-list> <resource-list> <keylock hobtid="72057594038910976" dbid="14" objectname="" indexname="" id="lock80153480" mode="X" associatedObjectId="72057594038910976"> <owner-list> <owner id="process15f048" mode="X" /> </owner-list> <waiter-list> <waiter id="process15ee08" mode="X" requestType="wait" /> </waiter-list> </keylock> <keylock hobtid="72057594038976512" dbid="14" objectname="" indexname="" id="lock80154580" mode="X" associatedObjectId="72057594038976512"> <owner-list> <owner id="process15ee08" mode="X" /> </owner-list> <waiter-list> <waiter id="process15f048" mode="X" requestType="wait" /> </waiter-list> </keylock> </resource-list> </deadlock>