一、 基礎(chǔ).SQL_TRACE
1. 在全局啟用
在參數(shù)文件(pfile/spfile)中指定:
sql_trace =true
在全局啟用SQL_TRACE會(huì)導(dǎo)致所有進(jìn)程的活動(dòng)被跟蹤,包括后臺(tái)進(jìn)程及所有用戶進(jìn)程,這通常會(huì)導(dǎo)致比較嚴(yán)重的性能問(wèn)題,所以在生產(chǎn)環(huán)境中要謹(jǐn)慎使用.
提示: 通過(guò)在全局啟用sql_trace,我們可以跟蹤到所有后臺(tái)進(jìn)程的活動(dòng),很多在文檔中的抽象說(shuō)明,通過(guò)跟蹤文件的實(shí)時(shí)變化,我們可以清晰的看到各個(gè)進(jìn)程之間的緊密協(xié)調(diào).
2. 在當(dāng)前session級(jí)設(shè)置
大多數(shù)時(shí)候我們使用sql_trace跟蹤當(dāng)前進(jìn)程.通過(guò)跟蹤當(dāng)前進(jìn)程可以發(fā)現(xiàn)當(dāng)前操作的后臺(tái)數(shù)據(jù)庫(kù)遞歸活動(dòng)(這在研究數(shù)據(jù)庫(kù)新特性時(shí)尤其有效), 研究SQL執(zhí)行,發(fā)現(xiàn)后臺(tái)錯(cuò)誤等.
在session級(jí)啟用和停止sql_trace方式如下:
SQL> alter session set sql_trace=true;
SQL> SQL statements ......
SQL> alter session set sql_trace=false;
3. 跟蹤其他用戶進(jìn)程
在很多時(shí)候我們需要跟蹤其他用戶的進(jìn)程,而不是當(dāng)前用戶,這可以通過(guò)Oracle提供的系統(tǒng)包
DBMS_SYSTEM.SET_SQL_TRACE_IN_SESSION來(lái)完成
SET_SQL_TRACE_IN_SESSION過(guò)程序要提供三個(gè)參數(shù):
SQL> desc dbms_system
......
PROCEDURE SET_SQL_TRACE_IN_SESSION
Argument Name Type In/Out Default?
------------------------------ --------------------- ------ --------
SID NUMBER IN
SERIAL# NUMBER IN
SQL_TRACE BOOLEAN IN
......
通過(guò)v$session我們可以獲得sid、serial#等信息:
獲得進(jìn)程信息,選擇需要跟蹤的進(jìn)程:
SQL> select sid,serial#,username from v$session where username is not null;
SID SERIAL# USERNAME
---------- ---------- ------------------------------
10 9068 BIGBEAR
SQL> exec dbms_system.set_sql_trace_in_session(10,9068,true)
SQL> exec dbms_system.set_sql_trace_in_session(10,9068,false)
二、 基礎(chǔ).10046事件
10046事件是Oracle提供的內(nèi)部事件,是對(duì)SQL_TRACE的增強(qiáng).
10046事件可以設(shè)置以下四個(gè)級(jí)別:
1 - 啟用標(biāo)準(zhǔn)的SQL_TRACE功能,等價(jià)于sql_trace
4 - Level 1 加上綁定值(bind values)
8 - Level 1 + 等待事件跟蹤
12 - Level 1 + Level 4 + Level 8
類(lèi)似sql_trace,10046事件可以在全局設(shè)置,也可以在session級(jí)設(shè)置。
1. 在全局設(shè)置
在參數(shù)文件中增加:
event="10046 trace name context forever,level 12"
此設(shè)置對(duì)所有用戶的所有進(jìn)程生效、包括后臺(tái)進(jìn)程.
2. 對(duì)當(dāng)前session設(shè)置
通過(guò)alter session的方式修改,需要alter session的系統(tǒng)權(quán)限:
SQL> alter session set events '10046 trace name context forever';
SQL> alter session set events '10046 trace name context forever, level 8';
......
SQL> alter session set events '10046 trace name context off';
3. 對(duì)其他用戶session設(shè)置
通過(guò)DBMS_SYSTEM.SET_EV系統(tǒng)包來(lái)實(shí)現(xiàn):
SQL> desc dbms_system
...
PROCEDURE SET_EV
Argument Name Type In/Out Default?
------------------------------ ----------------------- ------ --------
SI BINARY_INTEGER IN
SE BINARY_INTEGER IN
EV BINARY_INTEGER IN
LE BINARY_INTEGER IN
NM VARCHAR2 IN
...
其中的參數(shù)SI、SE來(lái)自v$session視圖:
查詢獲得需要跟蹤的session信息:
SQL> select sid,serial#,username from v$session where username is not null;
SID SERIAL# USERNAME
---------- ---------- ------------------------------
10 9068 BIGBEAR
SQL> exec dbms_system.set_ev(10,9068,10046,8,'bigbear');
......
SQL> exec dbms_system.set_ev(10,9068,10046,0,'bigbear');
三、TKPROF格式化輸出
TKPROF bigbear_ora_27292.trc output-file
EXPLAIN=scott/tiger
TABLE=scott.temp_plan_table_a
INSERT=STOREA.SQL
SYS=NO
SORT=(EXECPU,FCHCPU)
PRINT=5
list |
description |
prscnt |
number of times parse was called 解析被調(diào)用次數(shù) |
prscpu |
cpu time parsing 解析占用CPU時(shí)間(秒) |
prsela |
elapsed time parsing 解析占用總時(shí)間(秒) |
prsdsk |
number of disk reads during parse 解析期間磁盤(pán)讀取的數(shù)據(jù)塊數(shù) |
prsqry |
number of buffers for consistent read during parse 解析期間一致讀所訪問(wèn)到的緩沖塊的數(shù)量 |
prscu |
number of buffers for current read during parse 解析期間當(dāng)前讀所訪問(wèn)到的緩沖塊的數(shù)量 |
prsmis |
number of misses in library cache during parse 解析期間從庫(kù)緩存中丟失的塊的數(shù)量 |
execnt |
number of execute was called |
execpu |
cpu time spent executing |
exeela |
elapsed time executing |
exedsk |
number of disk reads during execute |
exeqry |
number of buffers for consistent read during execute |
execu |
number of buffers for current read during execute |
exerow |
number of rows processed during execute |
exemis |
number of library cache misses during execute |
fchcnt |
number of times fetch was called |
fchcpu |
cpu time spent fetching |
fchela |
elapsed time fetching |
fchdsk |
number of disk reads during fetch |
fchqry |
number of buffers for consistent read during fetch |
fchcu |
number of buffers for current read during fetch |
fchrow |
number of rows fetched |
userid |
userid of user that parsed the cursor |
四、TKPROF輸出說(shuō)明
list |
description |
PARSE |
Translates the SQL statement into an execution plan, including checks for proper security authorization and checks for the existence of tables, columns, and other referenced objects. 將SQL語(yǔ)句轉(zhuǎn)換為執(zhí)行計(jì)劃,包括檢查安全認(rèn)證,還有表/列及其它相關(guān)對(duì)象的存在性 |
EXECUTE |
Actual execution of the statement by Oracle. For INSERT, UPDATE, and DELETE statements, this modifies the data. For SELECT statements, this identifies the selected rows. 實(shí)際提交給ORACLE執(zhí)行。對(duì)于INS/UPD/DEL這個(gè)指標(biāo)反映修改后的數(shù)據(jù),對(duì)于SELECT反映的是選中的數(shù)據(jù)。 |
FETCH |
Retrieves rows returned by a query. Fetches are only performed for SELECT statements. 對(duì)于SELECT取回?cái)?shù)據(jù)的行數(shù)。 |
column |
description |
COUNT |
Number of times a statement was parsed, executed, or fetched. 次數(shù) |
CPU |
Total CPU time in seconds for all parse, execute, or fetch calls for the statement. This value is zero (0) if TIMED_STATISTICS is not turned on. 占CUP的秒數(shù) |
ELAPSED |
Total elapsed time in seconds for all parse, execute, or fetch calls for the statement. This value is zero (0) if TIMED_STATISTICS is not turned on. 整個(gè)流逝的時(shí)間 |
DISK |
Total number of data blocks physically read from the data files on disk for all parse, execute, or fetch calls. 讀磁盤(pán)的塊數(shù) |
QUERY |
Total number of buffers retrieved in consistent mode for all parse, execute, or fetch calls. Usually, buffers are retrieved in consistent mode for queries. 一致讀所訪問(wèn)的緩沖塊數(shù) |
CURRENT |
Total number of buffers retrieved in current mode. Buffers are retrieved in current mode for statements such as INSERT, UPDATE, and DELETE. SQL所影響到的當(dāng)前的緩沖塊數(shù) |
ROWS |
Total number of rows processed by the SQL statement. This total does not include rows processed by sub-queries of the SQL statement. SQL總處理的行數(shù),但不包括子查詢所訪問(wèn)的行。 |
Timing statistics have a resolution of one hundredth of a second; therefore, any operation on a cursor that takes a hundredth of a second or less might not be timed accurately. Keep this in mind when interpreting statistics. In particular, be careful when interpreting the results from simple queries that execute very quickly.
基于時(shí)間的統(tǒng)計(jì)是每百秒一次,因此任何基于指針的操作如果小于100秒將不會(huì)被精確統(tǒng)計(jì)。特別提醒:對(duì)于那些簡(jiǎn)單的執(zhí)行速度很快的query的統(tǒng)計(jì)要十分地小心
Sometimes, in order to execute a SQL statement issued by a user, Oracle must issue additional statements. Such statements are called recursive calls or recursive SQL statements. For example, if you insert a row into a table that does not have enough space to hold that row, then Oracle makes recursive calls to allocate the space dynamically. Recursive calls are also generated when data dictionary information is not available in the data dictionary cache and must be retrieved from disk.
有時(shí)候?yàn)榱藞?zhí)行用戶的一條SQL語(yǔ)句,ORACLE需要調(diào)用其它的SQL語(yǔ)句。這樣的SQL我們稱(chēng)為遞歸調(diào)用,或者稱(chēng)為遞歸SQL語(yǔ)句。例如:當(dāng)為一張表插入一行數(shù)據(jù)時(shí),發(fā)現(xiàn)沒(méi)有足夠的空間,這時(shí)ORACLE將產(chǎn)生遞歸調(diào)用為其分配空間。當(dāng)數(shù)據(jù)字典不可用時(shí),會(huì)從磁盤(pán)上讀取,這一過(guò)程也產(chǎn)生遞歸調(diào)用。
If recursive calls occur while the SQL Trace facility is enabled, then TKPROF produces statistics for the recursive SQL statements and marks them clearly as recursive SQL statements in the output file. You can suppress the listing of Oracle internal recursive calls (for example, space management) in the output file by setting the SYS command-line parameter to NO. The statistics for a recursive SQL statement are included in the listing for that statement, not in the listing for the SQL statement that caused the recursive call. So, when you are calculating the total resources required to process a SQL statement, consider the statistics for that statement as well as those for recursive calls caused by that statement.
在遞歸調(diào)用發(fā)生在SQL TRACE中時(shí),TKPROF會(huì)統(tǒng)計(jì)這些信息,并明確地在輸出文件中表明。你可以通過(guò)設(shè)置參數(shù)禁止上述遞歸調(diào)用信息的產(chǎn)生,即:只有產(chǎn)生遞歸調(diào)用的SQL被記錄,相關(guān)的遞歸調(diào)用過(guò)程不被記錄。所以,當(dāng)要計(jì)算整個(gè)資源消耗時(shí),要考慮是否將上述信息算在內(nèi)。
Deciding Which Statements to Tune
You need to find which SQL statements use the most CPU or disk resource. If the TIMED_STATISTICS parameter is on, then you can find high CPU activity in the CPU column. If TIMED_STATISTICS is not on, then check the QUERY and CURRENT columns.
判斷哪些SQL需要調(diào)整
如果設(shè)置了TIMED_STATISTICS可以查看CPU占用,如果沒(méi)有設(shè)置TIMED_STATISTICS,查看QUERY和CURRNET。
With the exception of locking problems and inefficient PL/SQL loops, neither the CPU time nor the elapsed time is necessary to find problem statements. The key is the number of block visits, both query (that is, subject to read consistency) and current (that is, not subject to read consistency). Segment headers and blocks that are going to be updated are acquired in current mode, but all query and sub-query processing requests the data in query mode. These are precisely the same measures as the instance statistics CONSISTENT GETS and DB BLOCK GETS. You can find high disk activity in the disk column.
當(dāng)產(chǎn)生鎖異?;蛘叩托实腜L/SQL,不管是CPU time還是elapsed time都不能發(fā)現(xiàn)有問(wèn)題的SQL。這時(shí)的關(guān)鍵是訪問(wèn)的數(shù)據(jù)塊的數(shù)目,即:QUERY和CURRENT兩項(xiàng)中對(duì)應(yīng)的數(shù)值。
The following listing shows TKPROF output for one SQL statement as it appears in the output file:
SELECT *
FROM emp, dept
WHERE emp.deptno = dept.deptno;
call count cpu elapsed disk query current rows
---- ------- ------- ------- -------- -------- ------- ------
Parse 11 0.08 0.18 0 0 0 0
Execute 11 0.23 0.66 0 3 6 0
Fetch 35 6.70 6.83 100 12326 2 824
------------------------------------------------------------------
total 57 7.01 7.67 100 12329 8 826
Misses in library cache during parse: 0
If it is acceptable to have 7.01 CPU seconds and to retrieve 824 rows, then you need not look any further at this trace output. In fact, a major use of TKPROF reports in a tuning exercise is to eliminate processes from the detailed tuning phase.
You can also see that 10 unnecessary parse call were made (because there were 11 parse calls for this one statement) and that array fetch operations were performed. You know this because more rows were fetched than there were fetches performed. A large gap between CPU and elapsed timings indicates Physical I/Os (PIOs).
可以發(fā)現(xiàn)有10個(gè)無(wú)用的解析被執(zhí)行(因?yàn)閷?duì)于同一個(gè)SQL解析了11次),并且執(zhí)行的是“批取回”,因?yàn)槿』氐男袛?shù)比結(jié)果集中實(shí)際存在行數(shù)多。如果CPU與elapsed相關(guān)比較大,則認(rèn)為有物理I/O,即:(PIOs)。
五、保存 SQL_TRACE 統(tǒng)計(jì)結(jié)果
When you run TKPROF, use the INSERT parameter to specify the name of the generated SQL script. If you omit this parameter, then TKPROF does not generate a script.
如果使用INSERT參數(shù),可以生成相應(yīng)的CREATE/INSERT語(yǔ)句。
默認(rèn)條件下不會(huì)生成。
下面表中內(nèi)容是對(duì)于TKPROF_TABLE中部分字段的解釋?zhuān)?/SPAN>
column |
description |
SQL_STATEMENT |
This is the SQL statement for which the SQL Trace facility collected the row of statistics. Because this column has data-type LONG, you cannot use it in expressions or WHERE clause conditions. 這里保存的是SQL語(yǔ)句,注意保存類(lèi)型為L(zhǎng)ONG,所以不能使用WHERE條件對(duì)其做為查詢條件 |
DATE_OF_INSERT |
This is the date and time when the row was inserted into the table. This value is not exactly the same as the time the statistics were collected by the SQL Trace facility. 這個(gè)值是反映插入這個(gè)表的時(shí)間,并不是執(zhí)行TKPROF的時(shí)間。 |
DEPTH |
This indicates the level of recursion at which the SQL statement was issued. For example, a value of 0 indicates that a user issued the statement. A value of 1 indicates that Oracle generated the statement as a recursive call to process a statement with a value of 0 (a statement issued by a user). A value of n indicates that Oracle generated the statement as a recursive call to process a statement with a value of n-1. 指嵌套查詢的嵌套深度。 |
USER_ID |
This identifies the user issuing the statement. This value also appears in the formatted output file. 用戶唯一的標(biāo)識(shí),同時(shí)也會(huì)出現(xiàn)在格式輸出文件的文件名上 |
CURSOR_NUM |
Oracle uses this column value to keep track of the cursor to which each SQL statement was assigned. ORACLE用來(lái)記錄游標(biāo)數(shù),這些游標(biāo)是為每個(gè)SQL分配的 |
六、TKPROF的缺陷
避免參數(shù)(Argument)缺陷
If you are not aware of the values being bound at run time, then it is possible to fall
into the argument trap. EXPLAIN PLAN cannot determine the type of a bind variable from the text of SQL statements, and it always assumes that the type is varchar. If the bind variable is actually a number or a date, then TKPROF can cause implicit data conversions, which can cause inefficient plans to be executed. To avoid this, experiment with different data types in the query.
如果在運(yùn)行時(shí)綁定變量不小心,就有可能遇到這個(gè)缺陷。EXPLAIN PLAN從SQL語(yǔ)句中不能判斷綁定變量的類(lèi)型,并且總是將其認(rèn)為是varchar類(lèi)型的。如果SQL語(yǔ)句中實(shí)際綁定的變量是數(shù)字或者日期型的,則TKPROF將會(huì)做隱式的數(shù)據(jù)轉(zhuǎn)換,這樣會(huì)導(dǎo)致低效的計(jì)劃被執(zhí)行。為了避免這個(gè)不足,可以綁定不同的數(shù)據(jù)類(lèi)型的變量做測(cè)試。
避免讀一致性(Read Consistency)缺陷
The next example illustrates the read consistency trap. Without knowing that an uncommitted transaction had made a series of updates to the NAME column, it is very difficult to see why so many block visits would be incurred.
如果不知道一個(gè)未提交的事務(wù),其產(chǎn)生一系列的更新在NAME列上,就非常難于明白為什么有如此多的數(shù)據(jù)塊被訪問(wèn)。
Cases like this are not normally repeatable: if the process were run again, it is unlikely that another transaction would interact with it in the same way.
像這樣的一些例子不經(jīng)常出現(xiàn),如果將上述過(guò)程再運(yùn)行一次,未必另一個(gè)事務(wù)會(huì)出現(xiàn)同樣的結(jié)果。
SELECT name_id
FROM cq_names
WHERE name = ’FLOOR’;
call count cpu elapsed disk query current rows
---- ----- --- ------- ---- ----- ------- ----
Parse 1 0.10 0.18 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.11 0.21 2 101 0 1
Misses in library cache during parse: 1
Parsing user id: 01 (USER1)
Rows Execution Plan
---- --------- ----
0 SELECT STATEMENT
1 TABLE ACCESS (BY ROWID) OF ’CQ_NAMES’
2 INDEX (RANGE SCAN) OF ’CQ_NAMES_NAME’ (NON_UNIQUE)
避免讀計(jì)劃(schema)缺陷
This example shows an extreme (and thus easily detected) example of the schema trap. At first, it is difficult to see why such an apparently straightforward indexed query needs to look at so many database blocks, or why it should access any blocks at all in current mode.
SELECT name_id
FROM cq_names
WHERE name = ’FLOOR’;
call count cpu elapsed disk query current rows
-------- ------- -------- --------- ------- ------ ------- ----
Parse 1 0.06 0.10 0 0 0 0
Execute 1 0.02 0.02 0 0 0 0
Fetch 1 0.23 0.30 3131 3 1
Misses in library cache during parse: 0
Parsing user id: 02 (USER2)
Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT
2340 TABLE ACCESS (BY ROWID) OF ’CQ_NAMES’
0 INDEX (RANGE SCAN) OF ’CQ_NAMES_NAME’ (NON-UNIQUE)
Two statistics suggest that the query might have been executed with a full table scan. These statistics are the current mode block visits, plus the number of rows originating from the Table Access row source in the execution plan. The explanation is that the required index was built after the trace file had been produced, but before TKPROF had been run.
Generating a new trace file gives the following data:
SELECT name_id
FROM cq_names
WHERE name = ’FLOOR’;
call count cpu elapsed disk query current rows
----- ------ ------ -------- ----- ------ ------- -----
Parse 1 0.01 0.02 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.00 0.00 0 2 0 1
Misses in library cache during parse: 0
Parsing user id: 02 (USER2)
Rows Execution Plan
------- ---------------------------------------------------
0 SELECT STATEMENT
1 TABLE ACCESS (BY ROWID) OF ’CQ_NAMES’
2 INDEX (RANGE SCAN) OF ’CQ_NAMES_NAME’ (NON-UNIQUE)
One of the marked features of this correct version is that the parse call took 10 milliseconds of CPU time and 20 milliseconds of elapsed time, but the query apparently took no time at all to execute and perform the fetch. These anomalies arise because the clock tick of 10 milliseconds is too long relative to the time taken to execute and fetch the data. In such cases, it is important to get lots of executions of the statements, so that you have statistically valid numbers.
由于在執(zhí)行TKPORF之前沒(méi)有建立索引,一但建立索引,執(zhí)行TKPROF會(huì)是另外一種結(jié)果。
避免讀時(shí)間(time)缺陷
Sometimes, as in the following example, you might wonder why a particular query has taken so long.
UPDATE cq_names SET ATTRIBUTES = lower(ATTRIBUTES)
WHERE ATTRIBUTES = :att
call count cpu elapsed disk query current rows
-------- ------- -------- --------- -------- -------- ------- ----------
Parse 1 0.06 0.24 0 0 0 0
Execute 1 0.62 19.62 22 526 12 7
Fetch 0 0.00 0.00 0 0 0 0
Misses in library cache during parse: 1
Parsing user id: 02 (USER2)
Rows Execution Plan
------- ---------------------------------------------------
0 UPDATE STATEMENT
2519 TABLE ACCESS (FULL) OF ’CQ_NAMES’
Again, the answer is interference from another transaction. In this case, another transaction held a shared lock on the table cq_names for several seconds before and after the update was issued. It takes a fair amount of experience to diagnose that interference effects are occurring. On the one hand, comparative data is essential when the interference is contributing only a short delay (or a small increase in block visits in the previous example). On the other hand, if the interference is contributing only a modest overhead, and the statement is essentially
efficient, then its statistics might not need to be analyzed.
還是要從事務(wù)角度考慮,當(dāng)執(zhí)行這個(gè)UPDATE之前幾秒,可能有另外一個(gè)事務(wù)在同一表上分配了共享鎖。處理這樣的沖突事件需要有權(quán)衡的能力。一方面,當(dāng)這個(gè)沖突在瞬間發(fā)生時(shí)訪問(wèn)到一定量的數(shù)據(jù)(或者訪問(wèn)有小量的增長(zhǎng))。另一方面,如果這一事件不是很過(guò)份,并且這個(gè)SQL語(yǔ)句本質(zhì)上是有效率的,那么這個(gè)統(tǒng)計(jì)信息可能被忽略。
避免觸發(fā)器(trigger)缺陷
The resources reported for a statement include those for all of the SQL issued while the statement was being processed. Therefore, they include any resources used within a trigger, along with the resources used by any other recursive SQL (such as that used in space allocation). With the SQL Trace facility enabled, TKPROF reports these resources twice. Avoid trying to tune the DML statement if the resource is actually being consumed at a lower level of recursion.
SQL語(yǔ)句的資源報(bào)告包括它們被執(zhí)行時(shí)全部的SQL聲明。因此上述資源包括在觸發(fā)器中調(diào)用的任何資源,連同被其它嵌套占用的資源(例如調(diào)用空間分配的SQL)。在執(zhí)行SQL_TRACE后,TKPROF會(huì)報(bào)告這些資源兩次被使用。最好不要調(diào)整這些DML語(yǔ)句,如果這些資源確實(shí)被以低度嵌套SQL所消耗。
If a DML statement appears to be consuming far more resources than you would expect, then check the tables involved in the statement for triggers and constraints that could be greatly increasing the resource usage.
如果一個(gè)DML語(yǔ)句消耗的資料比預(yù)期的大,則檢查與這些語(yǔ)句相關(guān)的觸發(fā)器和約束,他們也許會(huì)極大地增加資源的占用。