The Naive Estimator

For a random variable $X$, the probability density function $f$ is defined as the limit of the probability of an interval around $x$, divided by the interval's width, as that width shrinks to zero. That is,

$\begin{displaymath} f(x) = \lim_{h\rightarrow 0}\frac{1}{2h} \Pr(x-h < X < x+h). \end{displaymath}$

(3)

For any given $h$, this allows $\Pr(x-h < X < x+h)$ to be estimated by the proportion of the sample falling in the interval $(x-h, x+h)$. Hence, for small $h$, $\frac{1}{2h} \Pr(x-h < X < x+h)$ is an approximation to $f(x)$. Replacing the probability with a relative frequency gives the naive estimator

$\begin{displaymath} \hat f(x) = \frac{1}{2nh} \left(\mbox{no. of } x_1,\ldots,x_n \mbox{ falling in } (x-h,\,x+h)\right) \end{displaymath}$

(4)

see Silverman (1986).

Defining the weight function

$\begin{displaymath} w(x) = \left\{ \begin{array}{lr} \frac{1}{2} & \mbox{if } \vert x\vert < 1 \\ 0 & \mbox{otherwise} \end{array} \right. \end{displaymath}$

(5)

allows the naive estimator to be written

$\begin{displaymath} \hat f(x) = \frac{1}{nh}\sum_{i=1}^n w\left(\frac{x-x_i}{h}\right). \end{displaymath}$

(6)
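As a minimal sketch of equations (4) to (6), the following Python evaluates the naive estimator both through the weight function and by direct counting. The function names `weight`, `naive_estimate` and `naive_estimate_by_count`, and the sample values, are illustrative assumptions and do not come from the text.

```python
def weight(u):
    """Weight function of equation (5): 1/2 when |u| < 1, otherwise 0."""
    return 0.5 if abs(u) < 1 else 0.0


def naive_estimate(x, sample, h):
    """Naive density estimate at x in the form of equation (6), for h > 0."""
    n = len(sample)
    return sum(weight((x - xi) / h) for xi in sample) / (n * h)


def naive_estimate_by_count(x, sample, h):
    """Equivalent counting form of equation (4)."""
    count = sum(1 for xi in sample if x - h < xi < x + h)
    return count / (2 * len(sample) * h)


# Hypothetical sample, chosen only for illustration.
sample = [1.0, 1.5, 2.0, 4.0]

# Three of the four points lie in (0.5, 2.5), so both forms give 3 / (2 * 4 * 1).
print(naive_estimate(1.5, sample, 1.0))
print(naive_estimate_by_count(1.5, sample, 1.0))
```

Each observation contributes a box of width $2h$ and height $\frac{1}{2nh}$ centred on itself, which is why the two forms agree.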

Like the histogram, the naive estimator is a step function rather than a continuous function: it jumps at the points $x_i \pm h$ and is flat everywhere else. For this reason it is not wholly satisfactory, either as a density estimate or for presentation. Recovering continuous densities requires a smooth estimator that is also consistent with its use in a Grand Tour (see section ), i.e. it should be fast enough that the flow of projections is not disrupted. KDE generalizes the naive estimator by replacing the weight function with a kernel function. If the kernel is itself a smooth, continuous probability density function, then the density estimate is a smooth, continuous probability density function too, which gives the required smoothness.
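To illustrate the generalization, here is a sketch of the estimator of equation (6) with the weight function swapped for a kernel. The Gaussian kernel is one common choice and is assumed here for illustration; the text does not prescribe a particular kernel. The names `gaussian_kernel` and `kernel_estimate` and the sample values are likewise hypothetical.

```python
import math


def gaussian_kernel(u):
    """Standard normal density: smooth, continuous, and integrates to one."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_estimate(x, sample, h, kernel=gaussian_kernel):
    """Kernel density estimate at x: equation (6) with w replaced by a kernel."""
    n = len(sample)
    return sum(kernel((x - xi) / h) for xi in sample) / (n * h)


# Hypothetical sample, chosen only for illustration.
sample = [1.0, 1.5, 2.0, 4.0]

# Riemann-sum check that the estimate integrates to approximately one.
step = 0.01
grid = [-10.0 + step * k for k in range(2501)]  # covers [-10, 15]
total = sum(kernel_estimate(x, sample, 0.5) for x in grid) * step
print(total)
```

Because the estimate is an average of rescaled copies of the kernel, it inherits the kernel's smoothness and, when the kernel is a density, integrates to one.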

danny 2009-07-23