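Oracle Wait Events and Enqueue Waits: Handling a Deadlock Caused by TX - allocate ITL entry
Tags: Oracle, troubleshooting, wait events, deadlock, TX - allocate ITL entry, block-level deadlock, case study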
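Preface
Overview and notes
Dear tech enthusiasts, after reading this article you will have picked up the following skills, along with a few other things you may not have known ~O(∩_∩)O~:
① Resolving the enq: TX - allocate ITL entry wait event
② A general approach to resolving wait events
③ The basics of enqueue waits
④ Resolving an ITL deadlock
⑤ Simulating an ITL deadlock
⑥ Optimizing a query by rewriting a MERGE statement in its non-correlated form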
Fault analysis and resolution
Environment
Item | source db |
---|---|
DB type | RAC |
DB version | 11.2.0.3.0 |
DB storage | ASM |
OS and kernel version | AIX 7.1.0.0, 64-bit |
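Symptoms, analysis, and resolution
First thing in the morning, a colleague sent over a SQL statement and said it was running into a lock. I checked the system for locks and found none at all. Still, something had to be done about it, so let's look at the SQL statement first: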
MERGE INTO TLHR.TLHRBOKBAL S
USING (SELECT A.BOOKACCOUNT AS BOOKACCOUNT,
(A.CURRBALANCE + NVL(B.BAL, 0.00)) AS BANKAMT
FROM TLHR.TLHRBOKBAL_TMP A,
(SELECT T1.BOOKACCOUNT AS BOOKACCOUNT,
SUM(DECODE(T1.DCFLAG, 'D', -T1.AMT, 'C', T1.AMT, 0)) AS BAL
FROM TLHR.TLHRBOKBALJN T1
WHERE T1.BOOKACCOUNT LIKE '13500000%'
AND T1.TRANDATE = '20150901'
AND (T1.REASON IN ('1', '2') OR
(T1.REASON = '0' AND T1.ONLINEFLAG = '1'))
GROUP BY T1.BOOKACCOUNT) B
WHERE A.BOOKACCOUNT = B.BOOKACCOUNT(+)
AND A.BOOKACCOUNT LIKE '13500000%') T
ON (S.BOOKACCOUNT = T.BOOKACCOUNT)
WHEN MATCHED THEN
UPDATE
SET S.LASTBALANCE = T.BANKAMT,
S.CURRBALANCE = T.BANKAMT,
S.DEBITAMT = 0.00,
S.CREDITAMT = 0.00;
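As an aside, the lock check mentioned before the statement could be sketched roughly as follows. This is only an illustration, not necessarily the query that was actually run: it assumes SELECT access to `gv$session` (the cluster-wide view, since this is a RAC database), and the `enq: TX%` filter is an assumption added here to surface TX enqueue waits.

```sql
-- Sessions that are currently blocked by another session, or that are
-- waiting on any TX enqueue, across all RAC instances.
SELECT s.inst_id,
       s.sid,
       s.serial#,
       s.username,
       s.sql_id,
       s.event,
       s.blocking_instance,   -- instance of the blocker, if any
       s.blocking_session,    -- SID of the blocker, if any
       s.seconds_in_wait
  FROM gv$session s
 WHERE s.blocking_session IS NOT NULL
    OR s.event LIKE 'enq: TX%'
 ORDER BY s.seconds_in_wait DESC;
```

An empty result means only that nothing is blocked at the moment the query runs; short-lived waits can easily be missed this way.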
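The statement is a MERGE. Going by 小麦苗's past experience, this kind of SQL is usually better rewritten in the non-correlated form of MERGE. First, let's check whether the execution plan has a problem.
Start by finding the SQL_ID, which turns out to be 53qv858pwwwwb: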
SELECT a.ELAPSED_TIME,a.EXECUTIONS,a.* FROM v$sql a WHERE a.SQL_TEXT LIKE '%MERGE INTO TLHRBOKBAL S%' AND A.SQL_TEXT LIKE '%13500000%' ;
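If the cursor has already aged out of the shared pool, the lookup above returns nothing. A minimal sketch of two alternatives follows, under stated assumptions: the statement was captured by AWR (querying `dba_hist_sqltext` requires the Diagnostics Pack license), and the `%` placed between `MERGE INTO` and the table name is an addition here so the pattern matches whether or not the schema prefix `TLHR.` is present in the stored text.

```sql
-- Option 1: search every RAC instance instead of only the local one.
SELECT a.inst_id,
       a.sql_id,
       a.child_number,
       a.executions,
       ROUND(a.elapsed_time / 1000000, 1) AS elapsed_sec  -- elapsed_time is in microseconds
  FROM gv$sql a
 WHERE a.sql_text LIKE '%MERGE INTO%TLHRBOKBAL S%'
   AND a.sql_text LIKE '%13500000%';

-- Option 2: search the SQL text captured by AWR snapshots.
SELECT t.sql_id,
       DBMS_LOB.SUBSTR(t.sql_text, 200, 1) AS sql_text_head  -- first 200 characters
  FROM dba_hist_sqltext t
 WHERE t.sql_text LIKE '%MERGE INTO%TLHRBOKBAL S%'
   AND t.sql_text LIKE '%13500000%';
```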
Then query the historical execution plan:
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_AWR(SQL_ID => '53qv858pwwwwb' )) ;
Plan hash value: 2695089823
------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name            | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT                  |                 |       |       |       |   155K(100)|          |
|   1 |  MERGE                           | TLHRBOKBAL      |       |       |       |            |          |
|   2 |   VIEW                           |                 |       |       |       |            |          |
|   3 |    HASH JOIN RIGHT OUTER         |                 |   153K|    15M|       |   155K  (2)| 00:31:04 |
|   4 |     VIEW                         |                 |     1 |    31 |       |     6   (0)| 00:00:01 |
|   5 |      SORT GROUP BY               |                 |     1 |    41 |       |     6   (0)| 00:00:01 |
|   6 |       TABLE ACCESS BY INDEX ROWID| TLHRBOKBALJN    |     1 |    41 |       |     6   (0)| 00:00:01 |
|   7 |        INDEX RANGE SCAN          | PK_TLHRBOKBALJN |     2 |       |       |     4   (0)| 00:00:01 |
|   8 |     HASH JOIN                    |                 |   153K|    10M|  5848K|   155K  (2)| 00:31:04 |
|   9 |      TABLE ACCESS FULL           | TLHRBOKBAL_TMP  |   153K|  4048K|       | 85415   (2)| 00:17:05 |
|  10 |      TABLE ACCESS FULL           | TLHRBOKBAL      |   305K|    13M|       | 68755   (3)| 00:13:46 |
------------------------------------------------------------------------------------------------------------
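As the plan shows, the steps execute in the order 【7-->6-->5-->4-->9-->10-->8-->3-->2-->1-->0】. The cost is concentrated in steps 9, 10, and 8: steps 9 and 10 are full table scans, and step 8 is the hash join that consumes them. Let's first check how much data the two large tables hold (the trailing comments give matching rows / total rows):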
SELECT COUNT(*) FROM TLHR.TLHRBOKBAL_TMP A WHERE A.BOOKACCOUNT LIKE '13500000%'; --306043/38998765
SELECT COUNT(*) FROM TLHR.TLHRBOKBAL A WHERE A.BOOKACCOUNT LIKE '13500000%'; --306043/38826275
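To put those counts in proportion, the share of rows matching the `13500000%` prefix can be worked out directly, and the freshness of the optimizer statistics can be checked from the data dictionary. The following is a sketch only: the literal numbers are copied from the comments above, and the second query assumes SELECT access to `dba_tab_statistics`.

```sql
-- Percentage of rows matching the prefix, using the counts noted above.
SELECT ROUND(306043 / 38998765 * 100, 2) AS tmp_pct,  -- TLHRBOKBAL_TMP
       ROUND(306043 / 38826275 * 100, 2) AS bal_pct   -- TLHRBOKBAL
  FROM dual;

-- When the table statistics were last gathered and whether Oracle
-- currently flags them as stale.
SELECT owner,
       table_name,
       num_rows,
       last_analyzed,
       stale_stats
  FROM dba_tab_statistics
 WHERE owner = 'TLHR'
   AND table_name IN ('TLHRBOKBAL', 'TLHRBOKBAL_TMP')
   AND object_type = 'TABLE';
```

Both percentages come out below 1%. Whether an index range scan actually beats a full scan at that selectivity also depends on how the matching rows are spread across blocks, so the ratio is a hint rather than a guarantee.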
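Pulling about 300,000 rows out of more than 30 million (close to 39 million, by the counts above) is a small fraction, so the query ought to use an index. I checked the statistics, and they had been gathered recently. Fine, let's set that aside for now and first rewrite the SQL so that it uses an index scan, then see what happens: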
MERGE INTO TLHR.TLHRBOKBAL S
USING (SELECT S.ROWID ROWIDS,
A.BOOKACCOUNT AS BOOKACCOUNT,
(A.CURRBALANCE + NVL(B.BAL, 0.00)) AS BANKAMT
FROM (SELECT /*+index(NB,PK_TLHRBOKBAL_TMP)*/NB.CURRBALANCE,NB.BOOKACCOUNT
FROM TLHR.TLHRBOKBAL_TMP NB
WHERE NB.BOOKACCOUNT LIKE '13500000%') A,
TLHR.TLHRBOKBAL S,
(SELECT T1.BOOKACCOUNT AS BOOKACCOUNT,
SUM(DECODE(T1.DCFLAG, 'D', -T1.AMT, 'C', T1.AMT, 0)) AS BAL
FROM TLHR.TLHRBOKBALJN T1
WHERE T1.BOOKACCOUNT LIKE '13500000%'
AND T1.TRANDATE = '20150901'
AND (T1.REASON IN ('1', '2') OR
(T1.REASON = '0' AND T1.ONLINEFLAG = '1'))
GROUP BY T1.BOOKACCOUNT) B