This is a discussion on 9i and histogram index within the Oracle Miscellaneous forums, part of the Oracle Database category; --> Hi all, I have a table with 18m rows in a data warehouse. 3 m rows have a '-1' ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi all, I have a table with 18m rows in a data warehouse. 3 m rows have a '-1' value and 15 m rows have mostly distinct values. Is this a good candidate for a histogram index? I searched for info but couldn't find anything useful (a point in the direction of good documentation of this would be helpful). TIA, jabs. |
| |||
| "jabs" <no@email.com> wrote in message news:<nYqOa.33057$C83.2816401@newsread1.prod.itd.e arthlink.net>... > Hi all, > > I have a table with 18m rows in a data warehouse. 3 m rows have a '-1' value > and 15 m rows have mostly distinct values. Is this a good candidate for a > histogram index? I searched for info but couldn't find anything useful (a > point in the direction of good documentation of this would be helpful). > TIA, > > jabs. Hi jabs, The selectivity of the column where the value = -1: S = 3M / (3M + 15M) = 0.16666 (Subtract any NULLS from the 15M) If No Histogram is Used: Then the selectivity of the column is assumed to be uniformly distributed across -1 and all of the other distinct values. You need to ascertain how many other distinct values you have in the 15M. You imply that the 15M have many distinct values If so, this is fairly selective; therefore, the column may be a good choice for use as an index. If a Histogram is Used: Then the data distribution information is stored in the dictionary. This allows the optimizer to use this information and compute the correct selectivity based on the data distribution. In the above example, the selectivity, based on the histogram data, is 0.1666. This may not be high enough to indicate to the optimizer to use an index on the column in the execution plan; I'm not sure based on what you've provided. I hope this helps to indicate the general thinking in determining if histograms can help. Steve |
| ||||
| Thank you both for your help! "Stephen_CA" <stephen.bell@sympatico.ca> wrote in message news:5aeee507.0307080737.4a100942@posting.google.c om... > "jabs" <no@email.com> wrote in message news:<nYqOa.33057$C83.2816401@newsread1.prod.itd.e arthlink.net>... > > Hi all, > > > > I have a table with 18m rows in a data warehouse. 3 m rows have a '-1' value > > and 15 m rows have mostly distinct values. Is this a good candidate for a > > histogram index? I searched for info but couldn't find anything useful (a > > point in the direction of good documentation of this would be helpful). > > TIA, > > > > jabs. > > Hi jabs, > > The selectivity of the column where the value = -1: > > S = 3M / (3M + 15M) = 0.16666 (Subtract any NULLS from the 15M) > > If No Histogram is Used: Then the selectivity of the column is > assumed to be uniformly distributed across -1 and all of the other > distinct values. You need to ascertain how many other distinct values > you have in the 15M. You imply that the 15M have many distinct values > If so, this is fairly selective; therefore, the column may be a good > choice for use as an index. > > If a Histogram is Used: Then the data distribution information is > stored in the dictionary. This allows the optimizer to use this > information and compute the correct selectivity based on the data > distribution. In the above example, the selectivity, based on the > histogram data, is 0.1666. This may not be high enough to indicate to > the optimizer to use an index on the column in the execution plan; I'm > not sure based on what you've provided. > > I hope this helps to indicate the general thinking in determining if > histograms can help. > > Steve > |