If the weight function in (6) is replaced by a *kernel function* $K$ which satisfies

$$\int_{-\infty}^{\infty} K(x)\,dx = 1, \qquad (7)$$

then the *kernel density estimator* is defined as (see, for example, Silverman, 1986, p. 15)

$$\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right), \qquad (8)$$

where the $X_i$ are the observations and $h$ is the window width or *bandwidth* of the estimator. Note that the smoothness of the estimate depends on the bandwidth. If $h$ is small then the estimate will consist of spikes centred on the $X_i$; if $h$ is large then the
estimated density tends towards the uniform and all detail is obscured.
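The effect of the bandwidth can be illustrated with a minimal sketch of the estimator, written in the standard form $\hat{f}(x) = \frac{1}{nh}\sum_i K((x - X_i)/h)$. The Gaussian kernel, the function names `gaussian_kernel` and `kde`, and the three-point sample are assumptions made for illustration only; they are not taken from the present work.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density; it integrates to 1, as a kernel must."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kde(x, data, h, kernel=gaussian_kernel):
    """Evaluate (1 / (n h)) * sum_i K((x - X_i) / h) at each point of x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    data = np.asarray(data, dtype=float)
    # Rows index evaluation points, columns index observations.
    u = (x[:, None] - data[None, :]) / h
    return kernel(u).sum(axis=1) / (data.size * h)

# Hypothetical sample, chosen only to show the effect of the bandwidth.
data = np.array([1.0, 1.5, 4.0])
grid = np.linspace(-20.0, 25.0, 9001)

spiky = kde(grid, data, h=0.05)   # small h: narrow spikes on the X_i
smooth = kde(grid, data, h=5.0)   # large h: one broad, nearly flat bump
```

With the small bandwidth the estimate has tall, narrow peaks at the observations; with the large bandwidth the peak height drops sharply and the individual observations can no longer be distinguished.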

The estimate obtained is continuous if $K$ is continuous, and so avoids a real disadvantage, in this context, of both the naive estimator and the histogram: their lack of continuity.

The estimated density $\hat{f}$ is a sum of $n$ functions, where $n$ is the number of
items of data, each a shifted and rescaled copy of the kernel. This means that it inherits the properties of the kernel
function: if $K$ is a probability density function^{4}
then so too is $\hat{f}$.
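The claim that the estimate is itself a density can be checked directly. The following short derivation assumes the standard form of the estimator, $\hat{f}(x) = \frac{1}{nh}\sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right)$ with $h > 0$, and a kernel $K$ that integrates to one:

```latex
\begin{align*}
\int_{-\infty}^{\infty} \hat{f}(x)\,dx
  &= \frac{1}{nh}\sum_{i=1}^{n}\int_{-\infty}^{\infty}
     K\!\left(\frac{x - X_i}{h}\right)dx \\
  &= \frac{1}{nh}\sum_{i=1}^{n} h\int_{-\infty}^{\infty} K(u)\,du
     \qquad \left(u = \frac{x - X_i}{h}\right) \\
  &= \frac{1}{n}\sum_{i=1}^{n} 1 = 1 .
\end{align*}
```

If in addition $K(u) \ge 0$ for all $u$, then every term of the sum is non-negative and so $\hat{f}(x) \ge 0$ for all $x$.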

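It should also be noted that the naive estimator, as defined above, is a KDE with the discontinuous kernel

$$K(x) = \begin{cases} \tfrac{1}{2} & \text{if } |x| < 1, \\ 0 & \text{otherwise.} \end{cases} \qquad (9)$$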

However, this kernel gives an estimate with discontinuities similar to, if not so visually obvious as, the histogram.
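The discontinuities can be seen numerically in a small sketch. It assumes the usual box kernel for the naive estimator ($K(u) = 1/2$ for $|u| < 1$, zero otherwise); the names `box_kernel` and `naive_estimate` and the sample values are hypothetical.

```python
import numpy as np

def box_kernel(u):
    """Assumed naive-estimator kernel: 1/2 on (-1, 1), zero elsewhere."""
    return np.where(np.abs(u) < 1.0, 0.5, 0.0)

def naive_estimate(x, data, h):
    """Kernel estimate (1 / (n h)) * sum_i K((x - X_i) / h) with the box kernel."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    data = np.asarray(data, dtype=float)
    u = (x[:, None] - data[None, :]) / h
    return box_kernel(u).sum(axis=1) / (data.size * h)

# Hypothetical sample; the observation at 0.0 stops contributing at x = 0.0 + h.
data = np.array([0.0, 0.5, 3.0])
h = 1.0
eps = 1e-9

left = naive_estimate(1.0 - eps, data, h)[0]    # just inside the box around 0.0
right = naive_estimate(1.0 + eps, data, h)[0]   # just outside it
jump = left - right                             # size of the step at x = 1.0
```

Each observation contributes a box of height $1/(2nh)$, so the estimate steps by that amount at every point $X_i \pm h$; here `jump` equals $1/(2 \cdot 3 \cdot 1) = 1/6$ up to rounding.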

It is usual that $K(x) \ge 0$ for all $x$; however, there are arguments for sometimes using kernels which take negative values (see Silverman (1986), section 3.6). This can lead to problems and, as the advantages are not large, the kernel functions used in the present work are everywhere non-negative.

danny 2009-07-23