上周,应用人员在数据库Dataguard端进行查询操作时发现ORA-1157错误,当时应用发送过来的报错截图如下,
什么是ORA-01157错误:
数据库无法访问数据库对应数据文件或被LOCK,可以看到Alert日志中同样出现了一致错误并伴随了ORA-1186与ORA-1110错误
(4) DBMS_SUPPORT包(同10046 level 12效果一致)
此方法使用之前需要执行如下脚本安装: SQL> @?/rdbms/admin/dbmssupp.sql SQL> @?/rdbms/admin/prvtsupp.plb exec sys.dbms_support.start_trace ; --跟踪当前SESSION exec sys.dbms_support.stop_trace ; --停止跟踪当前SESSION exec dbms_support.start_trace_in_session(4,1); --跟踪OTHER SESSION exec dbms_support.stop_trace_in_session(4,1); --停止跟踪OTHER SESSION
昨日在线上分享案例时,一个朋友询问了一个问题,等待Cursor Pin s wait on X的会话是否可以Kill掉。当数据库发生严重的等待时,我一般常用的一个脚本如下:
with blocked as
( SELECT *
FROM
( SELECT instance, sid, sess_serial#, blocker_instance, blocker_sid, blocker_sess_serial#, WAIT_EVENT_TEXT, level lv, num_waiters, blocker_chain_id
FROM v$wait_chains connect by PRIOR sid = blocker_sid
AND PRIOR sess_serial# = blocker_sess_serial#
AND PRIOR INSTANCE = BLOCKER_INSTANCE START
WITH blocker_is_valid = 'FALSE')
WHERE num_waiters > 0
OR blocker_sid is NOT null) SELECT s.username, b.instance, b.ses, b.sess_serial#, b.wait_event_text
FROM
( SELECT instance, lpad(' ', 2 * (lv - 1)) || b.sid ses, b.sess_serial#, WAIT_EVENT_TEXT, b.blocker_instance, b.blocker_sid, b.blocker_sess_serial#
FROM blocked b) b, gv$session s
WHERE s.sid=b.ses
AND s.serial#=b.sess_serial#;
read more
我们在SQL优化的过程中,避免不了的是要执行10046来跟踪SQL,获取cpu,elapsed,disk,rows等信息来定位具体的SQL消耗与执行效率,下面来总结一下,Oracle中强大的跟踪方式:
(1)oradebug 10046:
此种方式与alter sessionset events ‘10046 trace name context forever, level 12’;使我们工作中相对常用的方式,此两种方式之中,我个人偏爱oradebug方式,利用oradebug tracefile_name
继上一篇文章《子表未创建正确索引导致enq: TM – contention的案例-分析锁机制(主表操作)》后,现在来研究子表DML时,在表对象存在主表,子表关系时,子表未创建索引或未创建正确类型索引时(无cascade),会产生什么类型的enq TM的锁类型。
测试表不变:
近日,接到应用人员的反馈,一套数据库夜维程序上线后出现了应用性能缓慢,希望帮忙定位一下根因。数据库在此期间段经历了大量enq: TM – contention等待,在经过一番分析后最终解决了问题,具体的案例bisal进行相关分析及测试详情请见bisal的个人杂货铺(https://mp.weixin.qq.com/s/N2XhQhUl2MVYLkLW_iDXbQ)我们知道,主子表关系的应用环境中,子表外键列需要建立索引,否者会出现大量Lock等待,产生阻塞,而且重要的是,需要建立合适的索引。但是,本篇文章要讨论的是,如果不建立索引,为什么会产生等待,产生了什么类型的LOCK导致了会话间的阻塞?
总结了一下,OracleTM锁类型:
P.S Ora11g后,select …for update 为mode=3 的RX(SX)锁
在上一篇文章《select * from dual引发的性能问题》引发了严重SQL超时的Cursor Pin :S等待事件,通过模拟该故障现象,简单记录一下该问题的处理思路:
官方解释:
A session waits for “cursor: pin S” when it wants a specific mutex in S (share) mode on a specific cursor and there is no concurrent X holder but it could not acquire
that mutex immediately. This may seem a little strange as one might question why there should be any form of wait to get a mutex which has no session holding it in an
incompatible mode. The reason for the wait is that in order to acquire the mutex in S mode (or release it) the session has to increment (or decrement) the mutex reference
count and this requires an exclusive atomic update to the mutex structure itself. If there are concurrent sessions trying to make such an update to the mutex then only
one session can actually increment (or decrement) the reference count at a time. A wait on “cursor: pin S” thus occurs if a session cannot make that atomic change
immediately due to other concurrent requests. Mutexes are local to the current instance in RAC environments.
Parameter Description
P1 idn — SQL HASH_VALUE
P2 value
P3 where (where|sleeps in 10)
模拟故障现象:
启动两个会话,进行大量SQL的执行(软解析、软软解析)
Oracle 10gR2后,Cursor Mutex代替了Library Cache Pin的功能,其好处在于资源持有更加“轻量级”,资源结构更小,获取一个Mutex的内部指令更少,并且更重要的是:原有的Latch以一种结构的方式存在,一个Latch可能需要控制管理多个对象的并发操作,对象并发的可能性大大增大;Mutex则不然,它对应对象,每个对象中有着独立的Mutex结构,减少了不同对象对应同一Latch管理下并发争用的可能性
官方解释如下:
A session waits for “cursor: pin S” when it wants a specific mutex in S (share) mode on a specific cursor and there is no concurrent X holder but it could not acquire
that mutex immediately. This may seem a little strange as one might question why there should be any form of wait to get a mutex which has no session holding it in an
incompatible mode. The reason for the wait is that in order to acquire the mutex in S mode (or release it) the session has to increment (or decrement) the mutex reference
count and this requires an exclusive atomic update to the mutex structure itself. If there are concurrent sessions trying to make such an update to the mutex then only
one session can actually increment (or decrement) the reference count at a time. A wait on “cursor: pin S” thus occurs if a session cannot make that atomic change
immediately due to other concurrent requests. Mutexes are local to the current instance in RAC environments.
真实案例:
曾经公司一套核心数据库业务出现了SQL严重超时的现象,导致前台交易严重超时:
一. 什么是Library cache:Mutex X
官方解释:
Library cache mutexes are similar to library cache operations in earlier versions except they are now implemented using mutexes. In all these cases, waits for these resources occurs when 2 (or more) sessions are working with
the same cursors simultaneously. When one session takes and holds a resource required by another, the second session will wait and will wait on one of these events.library cache: mutex X which means that sessions are waiting
to get a library cache mutex in eXclusive mode for one or more cursors.
Parameter Description
P1 “idn” = Unique Mutex Identifier
P2 Mutex “value” = in order to find the mutex value, read note 1298015.1
upper 8-bits which is the session ID (SID) of the Mutex owner
lower 8-bits is the Mutex reference count which is used for concurrency.
P3 “where” = location in code (internal identifier) where mutex is being waited for
个人理解:Library cache:Mutex X保护Library Cache Object Hanles Object(10.2.0.5后Handle上的并发控制由Mutex所体态),当SQL解析过程中首先要持有Library Cache Mutex扫描HASH_BUCKET与HASH_TABLE中Libray Cache Objects Handle首先已S模式进行扫描Parent Curosr Handle,当无法匹配任何游标句柄则需要硬解析一对父子游标,那么此时会持有Library cache:Mutex X,硬解析扫描HASH_BUCKET与HASH_TABLE均以Library cache:Mutex X持有.保护与控制HASH_BUCKET与HASH_TABLE链表扫描时的内存结构一致性,SCAN动作均持有Library Cache Mutex :S/X,软解析先持有S,硬解析持有X。