Index Usage – 4

2015-11-16  本文已影响15人  2548d1d6a965

Here’s a thought that came to me while I was writing up a note about identifying redundant indexes a few minutes ago. Sometimes you end up supporting applications with unexpected duplication of data and indexes and need to find ways to reduce overheads. Here’s some code modelling a scenario that I’ve seen more often than I like (actually, just once would be more often than I’d like):

create table t1
nologging
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e5
)
select
        rownum                                          id,
        trunc(sysdate,'MM') + (rownum-1)/1440           date_time,
        trunc(sysdate,'MM') + trunc((rownum-1)/1440)    date_only,
        rpad('x',100)                                   padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e5 ; begin dbms_stats.gather_table_stats( ownname => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );
end;
/

I’ve got a table holding one row per minute since the start of the month; there’s a column which holds the date and time accurate to the minute, and another column which is supposed to hold just the date part. Is it possible to create a single index that allows Oracle to handles queries relatively efficiently whether they refer to date_time or date_only ? As a starting step could we get an index range scan on the same index for both of the following queries:

select
       max(id)
from
       t1
where
       date_only between sysdate-1 and sysdate
;


select
       max(id)
from
       t1
where
       date_time between sysdate-1 and sysdate
;

As Bob the Builder likes to say: “yes we can”.
There are a few lines of SQL between the table creation and the stats gathering that I didn’t show you. The first creates the constraint that describes the relationship between date_time and date_only – one is the truncated version of the other; the second defines the index we need, and the third (unfortunately) has to be there to declare the date_time column as a mandatory column:

alter table t1
        add constraint t1_trunc_date
        check(
                  date_only = trunc(date_time)
              and (   (date_only is null and date_time is null)
                   or (date_only is not null and date_time is not null)
              )
        )
;
 
create index t1_i1 on t1(trunc(date_time)) nologging;
 
alter table t1 modify (date_time not null);

(Given the requirement for date_time to be not null to get my indexing strategy to work, we could simplify the t1_trunc_dateconstraint to just (date_only = trunc(date_time)) if we declared date_only to be not null as well).
With the extra lines of SQL included here are the resulting execution plans for the two queries (running on 11.2.0.4, but you get the same plans on 12.1.0.2):

=======================================
date_only between sysdate-1 and sysdate
=======================================
 
---------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |     1 |    21 |    92   (2)| 00:00:01 |
|   1 |  SORT AGGREGATE               |       |     1 |    21 |            |          |
|*  2 |   FILTER                      |       |       |       |            |          |
|*  3 |    TABLE ACCESS BY INDEX ROWID| T1    |  4306 | 90426 |    92   (2)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | T1_I1 |  4306 |       |    13   (0)| 00:00:01 |
---------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(SYSDATE@!>=SYSDATE@!-1)
   3 - filter("DATE_ONLY"<=SYSDATE@! AND "DATE_ONLY">=SYSDATE@!-1)
   4 - access(TRUNC(INTERNAL_FUNCTION("DATE_TIME"))>=SYSDATE@!-1 AND
              TRUNC(INTERNAL_FUNCTION("DATE_TIME"))<=SYSDATE@!)
=======================================
date_time between sysdate-1 and sysdate
=======================================
 
---------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |     1 |    21 |    92   (2)| 00:00:01 |
|   1 |  SORT AGGREGATE               |       |     1 |    21 |            |          |
|*  2 |   FILTER                      |       |       |       |            |          |
|*  3 |    TABLE ACCESS BY INDEX ROWID| T1    |  1442 | 30282 |    92   (2)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | T1_I1 |  4306 |       |    13   (0)| 00:00:01 |
---------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(SYSDATE@!>=SYSDATE@!-1)
   3 - filter("DATE_TIME"=SYSDATE@!-1)
   4 - access(TRUNC(INTERNAL_FUNCTION("DATE_TIME"))>=TRUNC(SYSDATE@!-1) AND
              TRUNC(INTERNAL_FUNCTION("DATE_TIME"))>=TRUNC(SYSDATE@!))

The optimizer has managed to generate extra predicates in both cases by applying transitive closure to the critical constraint to produce queries that can be addressed (with some inefficiencies) through the single index.
Within limits, therefore, I can reduce two indexes to a single index. The strategy isn’t ideal but it may be appropriate in a few special cases. There are several problems that should be considered carefully:

Despite these drawbacks you may decide that you have a case where the strategy is “good enough” to help you reduce the workload on your system at some critical times during the day or night.

上一篇下一篇

猜你喜欢

热点阅读