The kernel estimator does suffer from one significant drawback: the need to
estimate, or choose, the bandwidth $h$. If the density is such that different
values of $h$ are best for different regions of it, then the best a single
value of the bandwidth can do is a compromise. For example, when applied to
long-tailed distributions the fixed kernel estimator will tend to under-smooth
the tail. To overcome this, the bandwidth of the estimator may be allowed to
vary; for example, it is possible to have a bandwidth inversely proportional to
a power of the local density obtained from some previous estimate
(Silverman 1986, p. 100).
The steps given in Silverman (1986) for arriving at this are: first obtain a
pilot estimate $\tilde f$ satisfying $\tilde f(X_i) > 0$ for every data point
$X_i$; then define the local bandwidth factors

$$
\lambda_i = \left\{\frac{\tilde f(X_i)}{g}\right\}^{-\alpha},
\qquad
\log g = \frac{1}{n}\sum_{i=1}^{n}\log \tilde f(X_i),
\tag{11}
$$

where $g$ is the geometric mean of the pilot values and $\alpha$ is the
sensitivity parameter, a number satisfying $0 \le \alpha \le 1$; finally form
the adaptive kernel estimate

$$
\hat f(x) = \frac{1}{n}\sum_{i=1}^{n}\frac{1}{h\lambda_i}\,
K\!\left(\frac{x - X_i}{h\lambda_i}\right),
\tag{12}
$$

where $K$ is the kernel function and $h$ is the bandwidth. As in
the ordinary kernel method, $K$ is a symmetric function integrating
to unity.
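In code, the steps above can be sketched as follows (a minimal NumPy sketch
assuming a Gaussian kernel for both the pilot and the final estimate; the
function names are illustrative, not those of the Bayes 4 program):

```python
import numpy as np

def fixed_kde(x, data, h):
    """Ordinary fixed-bandwidth KDE with a Gaussian kernel."""
    u = (np.asarray(x)[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(data) * h * np.sqrt(2 * np.pi))

def adaptive_kde(x, data, h, alpha=0.5):
    """Silverman's adaptive kernel estimate built from a fixed-KDE pilot."""
    # Step 1: pilot estimate at the data points (must be positive everywhere,
    # which the Gaussian kernel guarantees).
    pilot = fixed_kde(data, data, h)
    # Step 2: local bandwidth factors lambda_i = {pilot_i / g}^(-alpha),
    # where g is the geometric mean of the pilot values.
    g = np.exp(np.mean(np.log(pilot)))
    lam = (pilot / g) ** (-alpha)
    # Step 3: kernel estimate with per-point bandwidth h * lambda_i.
    u = (np.asarray(x)[:, None] - data[None, :]) / (h * lam[None, :])
    k = np.exp(-0.5 * u**2) / (np.sqrt(2 * np.pi) * h * lam[None, :])
    return k.mean(axis=1)
```

Setting `alpha=0` makes every factor `lam` equal to one, so the call collapses
to the fixed-bandwidth estimator.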
The first step requires the use of some density estimator to obtain a
pilot estimate. This does not need to be of particular accuracy and can be as
simple as a nearest neighbour estimator; the fixed BKDE
estimate is used here for the very simple reason that it is already part of the
Bayes 4 program written to support this work. The
sensitivity parameter $\alpha$ controls the sensitivity of the method to
variations in the pilot density; setting $\alpha = 0$ gives the fixed bandwidth
KDE. Abramson (1982) gives arguments for choosing $\alpha = 1/2$ for
reasons of minimising the bias of the estimator and finds that:
Proportionally varying the bandwidths like $f^{-1/2}$ at the contributing readings lowers the bias to a vanishing fraction of the usual value, and makes for performance seen in well-known estimators that force moment conditions on the kernel (and so sacrifice positivity of the curve estimate).
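Abramson's square-root law is easy to verify numerically: with $\alpha = 1/2$
the per-point bandwidths $h\lambda_i$ vary like $\tilde f(X_i)^{-1/2}$, so
$h\lambda_i\sqrt{\tilde f(X_i)}$ is constant across the sample. A sketch
(assuming a Gaussian pilot estimate; the sample and names are illustrative):

```python
import numpy as np

def gaussian_kde(x, data, h):
    """Fixed-bandwidth Gaussian KDE, used here only as the pilot."""
    u = (np.asarray(x)[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(data) * h * np.sqrt(2 * np.pi))

rng = np.random.default_rng(1)
data = rng.lognormal(size=300)            # a long-tailed sample
h = 0.4
pilot = gaussian_kde(data, data, h)
g = np.exp(np.mean(np.log(pilot)))        # geometric mean of the pilot values
lam = (pilot / g) ** -0.5                 # alpha = 1/2
bandwidths = h * lam
# h * lam_i * sqrt(pilot_i) equals the constant h * sqrt(g), and the largest
# bandwidth sits at the point where the pilot density is thinnest.
```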
The factor $g$ in (11) means that the bandwidth factors, $\lambda_i$, are free
of the scale of the data: if the data are rescaled to $cX_i$ and the pilot
bandwidth to $ch$, the pilot estimate satisfies

$$
\tilde f_c(cx) = \frac{1}{nch}\sum_{j=1}^{n}
K\!\left(\frac{cx - cX_j}{ch}\right) = \frac{1}{c}\,\tilde f(x),
\tag{13}
$$

so that at the data points

$$
\tilde f_c(cX_i) = \frac{1}{c}\,\tilde f(X_i).
\tag{14}
$$

The geometric mean scales in the same way,

$$
\log g_c = \frac{1}{n}\sum_{i=1}^{n}\log \tilde f_c(cX_i) = \log g - \log c,
\tag{15}
$$

and hence the factors are unchanged:

$$
\lambda_i^{(c)} = \left\{\frac{\tilde f_c(cX_i)}{g_c}\right\}^{-\alpha}
= \left\{\frac{\tilde f(X_i)}{g}\right\}^{-\alpha} = \lambda_i.
\tag{16}
$$

However, the inclusion of $g$ amounts only to a renormalisation of the
smoothing parameter: since

$$
h\lambda_i = h\left\{\frac{\tilde f(X_i)}{g}\right\}^{-\alpha}
= \left(h g^{\alpha}\right)\tilde f(X_i)^{-\alpha},
\tag{17}
$$

using the unnormalised factors

$$
\lambda_i' = \tilde f(X_i)^{-\alpha}
\tag{18}
$$

in place of the $\lambda_i$
is equivalent to rescaling $h$ by a factor of
$g^{\alpha}$.
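Both points, the scale-freeness of the $\lambda_i$ and the role of $g$ as a
pure rescaling of $h$, can be checked numerically (a sketch assuming a Gaussian
pilot; the helper names are mine, not the thesis code):

```python
import numpy as np

def gaussian_kde(x, data, h):
    """Fixed-bandwidth Gaussian KDE, used as the pilot."""
    u = (np.asarray(x)[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(data) * h * np.sqrt(2 * np.pi))

def bandwidth_factors(data, h, alpha=0.5):
    """lambda_i = {pilot_i / g}^(-alpha), with g the geometric mean."""
    pilot = gaussian_kde(data, data, h)
    g = np.exp(np.mean(np.log(pilot)))
    return (pilot / g) ** (-alpha), pilot, g

rng = np.random.default_rng(2)
data = rng.standard_normal(200)
h, alpha, c = 0.5, 0.5, 10.0

lam, pilot, g = bandwidth_factors(data, h, alpha)
# Rescaling the data by c (and the pilot bandwidth with it) leaves the
# lambda_i unchanged, because pilot and g both pick up the same 1/c factor.
lam_c, _, _ = bandwidth_factors(c * data, c * h, alpha)
# Dropping g from the definition is absorbed by rescaling h:
# h * lam_i == (h * g**alpha) * pilot_i**(-alpha).
```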