墨墨導讀:由于執(zhí)行計劃中,對過濾謂詞順序的改變,導致SQL運行報錯。 最近,遇到了一個關于ORA-01841的報錯,起初,認為這個錯誤處理起來應該不困難,但實際上折騰了很久,才最終找到問題原因,并解決掉,下面將本次解決和分析的過程用樣例來說明。 ORA-01841的錯誤提示是“(full) year must be between -4713 and +9999, and not be 0”,翻譯過來,大意是完整的年份值需在-4712到+9999之間,并且不得為0。出現這個錯誤,通常都是數據本身存在問題導致的,但本案例中,又不僅僅是數據的問題。 下面就來回顧一下問題處理的過程。為了簡化問題,方便理解,以下描述均是在事后構建的模擬環(huán)境中進行的: 執(zhí)行以下SQL時,發(fā)生了ora-01841的報錯: SQL> select * from ( select * from test_tab1 where c1 not like 'X%' ) where to_date(c1,'yyyy-mm-dd') > date'2020-11-01' ; ERROR: ORA-01841: (full) year must be between -4713 and +9999, and not be 0
no rows selected 結合SQL和報錯信息,最初的懷疑是內層查詢的結果集的C1列上,有不正常的數據,導致出現了報錯。因此,首先檢查內層查詢的結果: SQL> select * from test_tab1 where c1 not like 'X%' ;
ID C1 ---------- -------------------------------- 1 2020-10-04 2 2020-09-17 3 2020-10-14 4 2020-11-03 5 2020-12-04 我們可以看到,內層查詢的結果集中,并沒有不正常的數據。 到此時,想了許久,也做了各種測試,但均沒有找到問題原因。決定看一下執(zhí)行計劃: SQL> set autot on SQL> select * from ( select * from test_tab1 where c1 not like 'X%' ) where to_date(c1,'yyyy-mm-dd') > date'2020-11-01' ; ERROR: ORA-01841: (full) year must be between -4713 and +9999, and not be 0
no rows selected
Execution Plan ---------------------------------------------------------- Plan hash value: 1698440217
------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 14 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_TAB1 | 1 | 14 | 3 (0)| 00:00:01 | -------------------------------------------------------------------------------
Predicate Information (identified by operation id): ---------------------------------------------------
1 - filter(TO_DATE("TEST_TAB1"."C1",'yyyy-mm-dd')>TO_DATE(' 2020-11-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "C1" NOT LIKE 'X%')
Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 4 consistent gets 0 physical reads 0 redo size 419 bytes sent via SQL*Net to client 492 bytes received via SQL*Net from client 1 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 0 rows processed
SQL> 從執(zhí)行計劃中看,CBO對該SQL做了自動改寫,將外層查詢的條件,推到了內層查詢。而且,從謂詞信息部分,我們可以看到SQL中的條件“to_date(c1,‘yyyy-mm-dd’) > date’2020-11-01’”在兩個過濾條件中,是位于靠前的位置。 也就是說,當數據庫對表中的數據做過濾時,是先用“to_date(c1,‘yyyy-mm-dd’) > date’2020-11-01’”來檢查。這樣,如果有某行數據的C1列中的值不正常,就會導致這樣的報錯。 我們來驗證一下: SQL> select * from test_tab1;
ID C1 ---------- -------------------------------- 1 2020-10-04 2 2020-09-17 3 2020-10-14 4 2020-11-03 5 2020-12-04 6 XXXXXXXXX1
6 rows selected.
未被CBO自動改寫的原始SQL,其內層查詢,會將不能正常轉換為日期的數據排除掉,然后在外層再去做TO_DATE的轉換。如果CBO按照這種方式來處理,就不會報錯了。 知道了原因,那我們要如何處理呢? SQL> select * from ( select t.*, rownum rn from test_tab1 t where c1 not like 'X%' ) where to_date(c1,'yyyy-mm-dd') > date'2020-11-01'; 2 3 4 5 6 7 8
ID C1 RN ---------- -------------------------------- ---------- 4 2020-11-03 4 5 2020-12-04 5
Execution Plan ---------------------------------------------------------- Plan hash value: 4134971776
--------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 5 | 220 | 3 (0)| 00:00:01 | |* 1 | VIEW | | 5 | 220 | 3 (0)| 00:00:01 | | 2 | COUNT | | | | | | |* 3 | TABLE ACCESS FULL| TEST_TAB1 | 5 | 70 | 3 (0)| 00:00:01 | ---------------------------------------------------------------------------------
Predicate Information (identified by operation id): ---------------------------------------------------
1 - filter(TO_DATE("C1",'yyyy-mm-dd')>TO_DATE(' 2020-11-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) 3 - filter("C1" NOT LIKE 'X%')
Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 8 consistent gets 0 physical reads 0 redo size 711 bytes sent via SQL*Net to client 492 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 2 rows processed
再將TO_DATE轉換施加到內層查詢的結果之上。 或者,在內層查詢上,對C1進行一些不影響結果值的運算。例如: SQL> select * from ( select id, c1||'' c1 from test_tab1 where c1 not like 'X%' ) where to_date(c1,'yyyy-mm-dd') > date'2020-11-01'; 2 3 4 5 6 7
ID C1 ---------- -------------------------------- 4 2020-11-03 5 2020-12-04
Execution Plan ---------------------------------------------------------- Plan hash value: 1698440217
------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 14 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_TAB1 | 1 | 14 | 3 (0)| 00:00:01 | -------------------------------------------------------------------------------
Predicate Information (identified by operation id): ---------------------------------------------------
1 - filter("C1" NOT LIKE 'X%' AND TO_DATE("C1"||'','yyyy-mm-dd')>TO_DATE(' 2020-11-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 8 consistent gets 0 physical reads 0 redo size 645 bytes sent via SQL*Net to client 492 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 2 rows processed 如上所示,這種處理方法,雖然外層的過濾條件被推入到了內層,但會放到后邊執(zhí)行,這樣,當前邊的條件已經將不正常的數據過濾掉后,也就不會報錯了。 但是,這又引起了我的一個新的疑問,如果初始SQL就是只有一層(如下所示),兩個過濾條件在一起時,CBO是先用哪個過濾條件來過濾呢? select * from test_tab1 where c1 not like 'X%' and to_date(c1,'yyyy-mm-dd') > date'2020-11-01'; 執(zhí)行后的結果如下: SQL> set autot on SQL> select * from test_tab1 where c1 not like 'X%' and to_date(c1,'yyyy-mm-dd') > date'2020-11-01'; 2 3 4 ERROR: ORA-01841: (full) year must be between -4713 and +9999, and not be 0
no rows selected
Execution Plan ---------------------------------------------------------- Plan hash value: 1698440217
------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 14 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_TAB1 | 1 | 14 | 3 (0)| 00:00:01 | -------------------------------------------------------------------------------
Predicate Information (identified by operation id): ---------------------------------------------------
1 - filter(TO_DATE("C1",'yyyy-mm-dd')>TO_DATE(' 2020-11-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "C1" NOT LIKE 'X%')
Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 4 consistent gets 0 physical reads 0 redo size 434 bytes sent via SQL*Net to client 520 bytes received via SQL*Net from client 1 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 0 rows processed
SQL> select * from test_tab1 where to_date(c1,'yyyy-mm-dd') > date'2020-11-01' and c1 not like 'X%'; 2 3 4 ERROR: ORA-01841: (full) year must be between -4713 and +9999, and not be 0
no rows selected
Execution Plan ---------------------------------------------------------- Plan hash value: 1698440217
------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 14 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_TAB1 | 1 | 14 | 3 (0)| 00:00:01 | -------------------------------------------------------------------------------
Predicate Information (identified by operation id): ---------------------------------------------------
1 - filter(TO_DATE("C1",'yyyy-mm-dd')>TO_DATE(' 2020-11-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "C1" NOT LIKE 'X%')
Statistics ---------------------------------------------------------- 1 recursive calls 4 db block gets 4 consistent gets 0 physical reads 0 redo size 434 bytes sent via SQL*Net to client 520 bytes received via SQL*Net from client 1 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 0 rows processed 如上所示,看來和條件出現的順序是無關的。 這正好給了我們一個啟示,在CBO下,在選擇先執(zhí)行哪個過濾條件時,是否會依據統(tǒng)計信息,計算并排序各個過濾條件的選擇性,選擇性越好的,則越會先被執(zhí)行呢? 我們測試驗證一下。主要測試思路如下: 2、CBO在計算NOT LIKE這類條件時,其計算思路是先計算出LIKE的選擇率(類似于相等條件,是條件列中唯一值數量的倒數),然后用1-(like的選擇率)就是NOT LIKE的選擇率。 3、向表中再插入94行形如‘XXXXXXXXX1’這樣的記錄。構造一個有100行記錄的表,其中c1列上有100個唯一值,然后收集統(tǒng)計信息(注意,不要收集列上的直方圖信息,因為在有直方圖時,其計算邏輯和方法都要復雜得多,這里,我們只用列上的非直方圖的統(tǒng)計信息)。操作過程如下: SQL> insert into test_tab1 select 6+rownum id,lpad(rownum+1,10,'X') c1 from dual connect by rownum<=94;
94 rows created.
SQL> commit; Commit complete.
SQL> exec dbms_stats.gather_table_stats('DEMO','TEST_TAB1',method_opt=>'for columns c1 size 1');
PL/SQL procedure successfully completed.
SQL> select count(*) cnt,count(distinct c1) cnt_c1 from test_tab1;
CNT CNT_C1 ---------- ---------- 100 100 分別來驗證一下施加單個條件時,CBO的估算結果 看看是否與前邊的理解是吻合的: SQL> set autot on exp SQL> select * from test_tab1 where to_date(c1,'yyyy-mm-dd') > date'2020-11-01'; 2 3 ERROR: ORA-01841: (full) year must be between -4713 and +9999, and not be 0
no rows selected
Execution Plan ---------------------------------------------------------- Plan hash value: 1698440217
------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 5 | 70 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_TAB1 | 5 | 70 | 3 (0)| 00:00:01 | -------------------------------------------------------------------------------
Predicate Information (identified by operation id): ---------------------------------------------------
1 - filter(TO_DATE("C1",'yyyy-mm-dd')>TO_DATE(' 2020-11-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) 如上所示,對條件“to_date(c1,‘yyyy-mm-dd’) > date’2020-11-01’”返回行數的估算是5行。由于表中總共有100行,所以,選擇率是5/100=5%。與我們的理解是吻合的。 再來看對NOT LIKE的選擇率: SQL> set autot traceonly exp SQL> select * from test_tab1 where c1 like 'X%'; 2 3
Execution Plan ---------------------------------------------------------- Plan hash value: 1698440217
------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 14 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_TAB1 | 1 | 14 | 3 (0)| 00:00:01 | -------------------------------------------------------------------------------
Predicate Information (identified by operation id): ---------------------------------------------------
1 - filter("C1" LIKE 'X%')
SQL> select * from test_tab1 where c1 NOT like 'X%'; 2 3
Execution Plan ---------------------------------------------------------- Plan hash value: 1698440217
------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 99 | 1386 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_TAB1 | 99 | 1386 | 3 (0)| 00:00:01 | -------------------------------------------------------------------------------
Predicate Information (identified by operation id): ---------------------------------------------------
1 - filter("C1" NOT LIKE 'X%') 如上所示,我們看到對LIKE和NOT LIKE的估算,與我們的理解也是吻合的。 SQL> set autot traceonly SQL> select * from test_tab1 where c1 not like 'X%' and to_date(c1,'yyyy-mm-dd') > date'2020-11-01'; 2 3 4 ERROR: ORA-01841: (full) year must be between -4713 and +9999, and not be 0
no rows selected
Execution Plan ---------------------------------------------------------- Plan hash value: 1698440217
------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 5 | 70 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_TAB1 | 5 | 70 | 3 (0)| 00:00:01 | -------------------------------------------------------------------------------
Predicate Information (identified by operation id): ---------------------------------------------------
1 - filter(TO_DATE("C1",'yyyy-mm-dd')>TO_DATE(' 2020-11-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "C1" NOT LIKE 'X%')
Statistics ---------------------------------------------------------- 0 recursive calls 4 db block gets 4 consistent gets 0 physical reads 0 redo size 434 bytes sent via SQL*Net to client 520 bytes received via SQL*Net from client 1 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 0 rows processed 那我們再來驗證一下,如果可以讓條件“c1 NOT like ‘X%’”的選擇率低于5%,那么我們就可能讓CBO選擇先執(zhí)行該條件了。即1-1/n<0.05,顯然,N要小于1.053,由于N表示的是唯一值的數量,所以,一定是個整數,即N只能是1了。 為了滿足這個條件,我們將表中C1列的值,全部更新為同一個值:‘XXXXXXXXX1’后,收集統(tǒng)計信息后,如下所示: SQL> set autot off SQL> update test_tab1 set c1='XXXXXXXXX1';
100 rows updated.
SQL> commit;
Commit complete.
SQL> exec dbms_stats.gather_table_stats('DEMO','TEST_TAB1',method_opt=>'for columns c1 size 1');
PL/SQL procedure successfully completed. 我們先來驗證一下前述兩個條件的選擇性是否如我們所愿,已經發(fā)生了改變: SQL> set autot traceonly exp SQL> select * from test_tab1 where to_date(c1,'yyyy-mm-dd') > date'2020-11-01'; 2 3
---------------------------------------------------------- Plan hash value: 1698440217
------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 5 | 70 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_TAB1 | 5 | 70 | 3 (0)| 00:00:01 | -------------------------------------------------------------------------------
Predicate Information (identified by operation id): ---------------------------------------------------
1 - filter(TO_DATE("C1",'yyyy-mm-dd')>TO_DATE(' 2020-11-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
SQL> select * from test_tab1 where c1 like 'X%'; 2 3
Execution Plan ---------------------------------------------------------- Plan hash value: 1698440217
------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 100 | 1400 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_TAB1 | 100 | 1400 | 3 (0)| 00:00:01 | -------------------------------------------------------------------------------
Predicate Information (identified by operation id): ---------------------------------------------------
1 - filter("C1" LIKE 'X%')
SQL> select * from test_tab1 where c1 NOT like 'X%'; 2 3
Execution Plan ---------------------------------------------------------- Plan hash value: 1698440217
------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 14 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_TAB1 | 1 | 14 | 3 (0)| 00:00:01 | -------------------------------------------------------------------------------
Predicate Information (identified by operation id): ---------------------------------------------------
1 - filter("C1" NOT LIKE 'X%') 如上所示,條件“to_date(c1,‘yyyy-mm-dd’) > date’2020-11-01’”的選擇率未變,仍然是5%,但條件“c1 NOT like ‘X%’”的選擇率已經低于5%,目前估算只有大約1行記錄滿足該條件。 那么我們再次執(zhí)行測試SQL,看看結果如何: SQL> select * from test_tab1 where c1 not like 'X%' and to_date(c1,'yyyy-mm-dd') > date'2020-11-01'; 2 3 4
Execution Plan ---------------------------------------------------------- Plan hash value: 1698440217
------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 14 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_TAB1 | 1 | 14 | 3 (0)| 00:00:01 | -------------------------------------------------------------------------------
Predicate Information (identified by operation id): ---------------------------------------------------
1 - filter("C1" NOT LIKE 'X%' AND TO_DATE("C1",'yyyy-mm-dd')>TO_DATE(' 2020-11-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
SQL> select * from test_tab1 where to_date(c1,'yyyy-mm-dd') > date'2020-11-01' and c1 not like 'X%'; 2 3 4
Execution Plan ---------------------------------------------------------- Plan hash value: 1698440217
------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 14 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_TAB1 | 1 | 14 | 3 (0)| 00:00:01 | -------------------------------------------------------------------------------
Predicate Information (identified by operation id): ---------------------------------------------------
1 - filter("C1" NOT LIKE 'X%' AND TO_DATE("C1",'yyyy-mm-dd')>TO_DATE(' 2020-11-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
SQL> 如上所示,這時,CBO已經先執(zhí)行條件“c1 NOT like ‘X%’”了。 SQL> select * from ( select * from test_tab1 where c1 not like 'X%' ) where to_date(c1,'yyyy-mm-dd') > date'2020-11-01' ; 2 3 4 5 6 7
Execution Plan ---------------------------------------------------------- Plan hash value: 1698440217
------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 14 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_TAB1 | 1 | 14 | 3 (0)| 00:00:01 | -------------------------------------------------------------------------------
Predicate Information (identified by operation id): ---------------------------------------------------
1 - filter("C1" NOT LIKE 'X%' AND TO_DATE("TEST_TAB1"."C1",'yyyy-mm-dd')>TO_DATE(' 2020-11-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) 果然是這樣。 附錄:提供上述模擬數據的生成腳本 SQL> create table test_tab1 (id number,c1 varchar2(32));
Table created.
SQL> insert into test_tab1 select rownum id,to_char(sysdate-dbms_random.value(1,100),'yyyy-mm-dd') c1 from dual connect by rownum<=5;
5 rows created.
SQL> insert into test_tab1 select 5+rownum id,lpad(rownum,10,'X') c1 from dual connect by rownum<=1;
1 row created.
SQL>commit;
SQL> exec dbms_stats.gather_table_stats('DEMO','TEST_TAB1');
PL/SQL procedure successfully completed.
墨天輪原文鏈接:https://www./db/42008(復制到瀏覽器中打開或者點擊“閱讀原文”立即查看) |
|