|
|
|||||||||
|
留言 |
收藏本站 | ||||||||
|
Oracle技术站。提供各种数据库技术文档、培训教程、解决方案、案例、工具及周边新闻。同时,本站还提供oracle培训、优化、备份方案和技术支持。email: fuyuncat@gmail.com MSN: fuyuncat@hotmail.com |
|||||||||
作者:
fuyuncat
来源:
www.HelloDBA.com
近来生产系统反馈,时常有作业被异常中止,导致应用程序被hung住。经过检查日志,是某个作业在运行时发生了1555错误,导致程序无法返回结果:
*
ERROR at line 1:
ORA-01555: snapshot too old: rollback segment number 9 with name "RBS08" too small
ORA-06512: at "SQLUSER.EXT_PKG", line 4917
ORA-06512: at line 1
相关程序记录下的日志:
STAT-S.20060313185536.lg = 2 processed =20060313185536 end date Error = ORA-01555: snapshot too old: rollback segment number 9 with name "RBS08" too small BEGIN ext_pkg.main('extfiles','50','EAT'); END;
这是一个典型的1555错误。检查引发该错误的PACKAGE,发现它只有一个入口函数main(及程序日志中记录的函数),但这个函数还调用了其他N个PACKAGE里面的函数。这是一个大作业,执行时,设置它使用了一个大的回滚段:RBS_BATCH1。
先看看相关配置:rollback tablespace空间为8G,undo_retention为1800。
看看回滚段的统计数据:
SQL> Select rn.Name "Rollback Segment", rs.RSSize/1024 "Size (KB)", rs.Gets "Gets",
2 rs.waits "Waits", (rs.Waits/rs.Gets)*100 "% Waits",
3 rs.Shrinks "# Shrinks", rs.Extends "# Extends"
4 from sys.v_$RollName rn, sys.v_$RollStat rs
5 where rn.usn = rs.usn;
Rollback Segment Size (KB) Gets Waits % Waits # Shrinks # Extends
------------------------------ ---------- ---------- ---------- ---------- ---------- ----------
SYSTEM 952 1189 0 0 0 0
RBS_BATCH1 409592 681422 667 .097883544 0 0
... ...
RBS10 204792 478502 10 .002089855 0 0
RBS11 204792 477366 8 .001675863 0 0
RBS12 204792 491070 6 .001221822 0 0
RBS_BATCH2 409592 650088 644 .099063511 0 0
RBS_BATCH3 409592 243849 3 .00123027 0 0
16 rows selected.
注意到RBS_BATCHT1的wait%是0.098%,这个值应该是比较好的一个值。
回过头再来看依法错误的语句:调用ext_pkg.main函数。在程序日志中已经记录下了输入参数,这就比较好办了:作一个trace,看看到底哪条语句的性能最差:
SQL> alter system set events ‘10046 trace name context forever, level 1’;
System altered.
SQL> exec ext_pkg.main('extfiles','50','EAT');
PL/SQL procedure successfully completed.
SQL> alter system set events ‘10046 trace name context off’;
System altered.
用tkprof处理trace文件后检查trace文件:
TKPROF: Release 9.2.0.5.0 - Production on Tue Mar 14 09:21:58 2006
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Trace file: prod_ora_29225.trc
Sort options: default
********************************************************************************
count= number of times OCI procedure was executed
cpu= cpu time in seconds executing
elapsed =elapsed time in seconds executing
disk= number of physical reads of buffers from disk
query= number of buffers gotten for consistent read
current =number of buffers gotten in current mode (usually for update)
rows= number of rows processed by the fetch or execute call
********************************************************************************
altersession set events='10046 trace name context forever, level 1'
... ...
BEGIN
ext_pkg.main('extfiles','50','EAT');
END;
callcount cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.04 4 39 0 0
Execute 1 0.20 0.21 24560 37808 0 1
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total2 0.21 0.26 24564 37847 0 1
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 30
********************************************************************************
... ...
SELECT QID
FROM
CSQUE QUE, ASTP STP WHERE
QUE.CDE IN (:B3, :B2) AND QUE.TID =
STP.TID AND STP.ACDE = :B1 ORDER BY
QUE.CDE, QUE.DT
callcount cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 2 0.00 0.00 0 0 0 0
Fetch 2682 0.41 0.54 7557 10968 0 2680
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total2685 0.41 0.54 7557 10968 0 2680
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 31 (recursive depth: 1)
********************************************************************************
... ...
7 user SQL statements in session.
41 internal SQL statements in session.
48 SQL statements in session.
********************************************************************************
Trace file: prod_ora_29225.trc
Trace filecompatibility: 9.02.00
Sort options: default
1 session in tracefile.
7 user SQL statements in trace file.
41 internal SQL statements in trace file.
48 SQL statements in trace file.
20 unique SQL statements in trace file.
8626 lines in trace file.
从trace文件中,发现有一条语句性能相当差,通过对这条语句做SQL Trace,发现它的consistent gets达到80万!
于是对该语句进行优化,调整了它的写法,并建立了缺少的索引(优化过程略)。最终将consistent gets数量降低到了5000。
重新安排上线,经过一周的观察,1555错误没再发生。
其实这个案例的解决是比较简单的,最终的处理就是将一条语句进行优化。
关于ORA-01555错误产生原因和处理思路,可以参考我的另外一篇文章
《ORA-01555错误浅析》。