Saturday, July 14, 2012

Best way to specify columns order in an Oracle concatened index


If I have a table called Test contains 5 columns a, b, c, d and e.
1. What is the deference between index_1, and index_2 if they are defined as:-
  Index_1.
    Create index_1 on test (a, b);
  Index_2.
    Create index_2 on test (b, a).
  Is the order in columns making difference? What is the deference and how it effect.
 
 
The order of the columns can make a difference.  index_1 would be most useful if you ask 
queries such as:

select * from t where a = :a and b = :b;
select * from t where a = :a;
 
Index_2 would be most useful if you ask queries such as:

select * from t where a = :a and b = :b;
select * from t where b = :b;

(in 9i, there is a new "index skip scan" -- search for that there to read about that.  It 
makes the index (a,b) OR (b,a) useful in both of the above cases sometimes!)

So, the order of columns in your index depends on HOW YOUR QUERIES are written.  You want 
to be able to use the index for as many queries as you can (so as to cut down on the over 
all number of indexes you have) -- that will drive the order of the columns.  Nothing 
else (selectivity of a or b does not count at all)
 
An Index Skip Scan can only actually be used and considered by the CBO 
in very specific scenarios and is often an indicator there’s either 
a missing index or an exisiting index has the columns in the wrong 
order.
 
If the leading column of an index is missing, it basically means the 
values in subsequently referenced columns in the index can potentially 
appear anywhere within the index structure as the index entries are 
sorted primarily on the leading indexed column. So if we have column A 
with 100,000 distinct values and column B with 100,000 distinct values 
and an index based on (A,B), all index entries are sorted primarily on 
column A and within a specific value of column A, sorted by column B. 
Therefore if we attempt a search on just Column B = 42, these values 
could potentially appear anywhere within the index structure and so the 
index can not generally be effectively used.
 
However, what if the leading column actually contained very few distinct
 values ? Yes, the subsequent column(s) values could appear anywhere 
within the index structure BUT if these subsequent columns have 
relatively high cardinality, once we’ve referenced the required index 
entries for a specific occurrence of a leading column value, we can 
ignore all subsequent index row entries with the same leading column 
value. If the leading column has few distinct values, this means we can 
potentially “skip” several/many leaf blocks until the leading column 
value changes again, where we can again ”probe” the index looking for 
the subsequent indexed column values of interest.
 
So if we have a leading column with few distinct values, we may be able 
to use the index “relatively” efficiently by probing the index as many 
times as we have distinct leading column values.
 
 

No comments:

Post a Comment