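For a random variable $X$ with density $f$, the probability density function is defined as the limit of the probability of an interval around $x$, divided by the interval width, as that width shrinks to zero. That is,
\[ f(x) = \lim_{h \to 0} \frac{1}{2h} P(x - h < X < x + h). \tag{3} \]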
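This allows estimation of $P(x - h < X < x + h)$, for any $h$, by the proportion of the sample $X_1, \dots, X_n$ falling in the interval $(x - h, x + h)$. Hence, for a small $h$, $\frac{1}{2h} P(x - h < X < x + h)$ would be an approximation to $f(x)$. Replacing the probability with a relative frequency gives the naive estimator
\[ \hat{f}(x) = \frac{1}{2hn} \, \#\{ X_i : X_i \in (x - h, x + h) \}, \tag{4} \]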
see Silverman (1986).
Defining the weight function
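\[ w(u) = \begin{cases} \tfrac{1}{2} & \text{if } |u| < 1, \\ 0 & \text{otherwise,} \end{cases} \tag{5} \]
allows the naive estimator to be written
\[ \hat{f}(x) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{h} \, w\!\left( \frac{x - X_i}{h} \right). \]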
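The equivalence of the two forms of the naive estimator can be checked numerically. The following is a minimal sketch in Python, not code from this work: the function names, the sample values and the half-width h are illustrative assumptions. It computes the estimate once by counting the sample points in (x - h, x + h) and once as a sum of weight functions, each weight being 1/2 when the scaled distance is below 1 and 0 otherwise.

```python
def weight(u):
    """Weight function: 1/2 if |u| < 1, otherwise 0."""
    return 0.5 if abs(u) < 1.0 else 0.0


def naive_estimate_by_counting(x, sample, h):
    """Naive estimator as a relative frequency: count / (2 * h * n)."""
    count = sum(1 for xi in sample if x - h < xi < x + h)
    return count / (2.0 * h * len(sample))


def naive_estimate(x, sample, h):
    """Naive estimator as (1/n) * sum_i (1/h) * w((x - X_i) / h)."""
    n = len(sample)
    return sum(weight((x - xi) / h) for xi in sample) / (n * h)


if __name__ == "__main__":
    # Hypothetical data, chosen only to illustrate the calculation.
    sample = [0.0, 0.5, 2.0]
    h = 1.0
    for x in (0.25, 1.6, 5.0):
        print(x, naive_estimate_by_counting(x, sample, h), naive_estimate(x, sample, h))
```

At x = 0.25 two of the three points lie in (-0.75, 1.25), so both forms give 2 / (2 * 1 * 3) = 1/3. The estimate changes only when a sample point enters or leaves the interval, which is why it is a step function of x.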
Like the histogram, the naive estimator is a step function rather than a continuous function. For this reason it is not wholly satisfactory, either as a density estimate or for presentation. Recovering continuous densities requires a smooth estimator that is also consistent with its use in a Grand Tour (see section ), i.e. fast enough that the flow of projections is not disrupted. Kernel density estimation (KDE) is a generalization of the naive estimator that replaces the weight function with a kernel function. If this kernel function is a smooth, continuous probability density function, then the density estimate is a smooth, continuous probability density function too, which gives the required smoothness.
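The replacement of the weight function by a kernel can be sketched in the same way. The Python below is a minimal illustration under stated assumptions, not the implementation used in this work: the Gaussian kernel is one common choice that the text does not itself name, and the function names, sample and bandwidth h are hypothetical. The estimator has the same form as the naive one, (1/(n h)) times the sum of K((x - X_i) / h), with the kernel K in place of the weight function.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, an assumed choice of smooth kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_estimate(x, sample, h, kernel=gaussian_kernel):
    """Kernel density estimate: (1 / (n * h)) * sum_i K((x - X_i) / h)."""
    n = len(sample)
    return sum(kernel((x - xi) / h) for xi in sample) / (n * h)


if __name__ == "__main__":
    # Hypothetical data and bandwidth, for illustration only.
    sample = [0.0, 0.5, 2.0]
    h = 0.5
    step = 0.01
    grid = [-10.0 + step * k for k in range(2001)]
    values = [kernel_estimate(x, sample, h) for x in grid]
    # A Riemann sum over a wide grid should be close to 1,
    # because the estimate is itself a probability density.
    print(sum(values) * step)
```

Because each term is a scaled copy of a smooth density, the estimate varies smoothly with x instead of jumping when a sample point enters or leaves an interval. The cost of one evaluation is proportional to the sample size, which matters for the speed requirement mentioned above.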
danny 2009-07-23