
Introduction to Kernel Smoothing

[Figure: kernel density estimate of Wilcoxon scores; x-axis: Wilcoxon score (700 to 1300), y-axis: Density (0.000 to 0.006)]
M. P. Wand & M. C. Jones
Kernel Smoothing
Monographs on Statistics and Applied Probability
Chapman & Hall, 1995.

Stefanie Scheid, January 5, 2004


Introduction

[Figure: histogram of some p-values; x-axis: p-values (0 to 1), y-axis: Density (0 to 12)]



Introduction

- Estimation of functions such as regression functions or probability density functions.
- Kernel-based methods are the most popular non-parametric estimators.
- They can uncover structural features in the data which a parametric approach might not reveal.



Univariate kernel density estimator

Given a random sample $X_1, \ldots, X_n$ from a continuous, univariate density $f$, the kernel density estimator is

$$\hat{f}(x; h) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right)$$

with kernel $K$ and bandwidth $h$. Under mild conditions ($h$ must decrease with increasing $n$), the kernel estimate converges in probability to the true density.
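As a concrete illustration (not part of the original slides), here is a minimal R sketch of this estimator with a Gaussian kernel; all names are my own:

    # KDE with Gaussian kernel K = dnorm, evaluated at each
    # grid point x directly from the formula above.
    kde <- function(x, data, h) {
      n <- length(data)
      sapply(x, function(xi) sum(dnorm((xi - data) / h)) / (n * h))
    }

    # Toy usage; the result closely matches density(obs, bw = 0.5).
    set.seed(1)
    obs  <- rnorm(100, mean = 5)
    grid <- seq(0, 10, length.out = 200)
    est  <- kde(grid, obs, h = 0.5)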



The kernel K

- Can be a proper pdf; usually chosen to be unimodal and symmetric about zero.
- The center of the kernel is placed directly over each data point.
- The influence of each data point is thereby spread over its neighborhood.
- The contributions from all points are summed into the overall estimate.



Gaussian kernel density estimate
[Figure: Gaussian kernel density estimate; x-axis: 0 to 15, y-axis: 0.0 to 1.5]



The bandwidth h

- Scaling factor.
- Controls how widely the probability mass is spread around each point.
- Controls the smoothness or roughness of the density estimate.
- Bandwidth selection carries the danger of under- or oversmoothing, as the following slides and the sketch below illustrate.
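A sketch of how a sequence like the one on the next four slides can be produced with R's density(); since the actual p-values are not given here, a toy vector stands in:

    # From over- to undersmoothing: same data, shrinking bandwidth.
    set.seed(1)
    pvals <- runif(500)^2   # toy stand-in for the p-values
    for (b in c(0.1, 0.05, 0.02, 0.005)) {
      plot(density(pvals, bw = b), main = paste("KDE with b =", b))
    }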



From over- to undersmoothing

[Figure: KDE of the p-values with b = 0.1; x-axis: p-values (0 to 1), y-axis: Density (0 to 12)]



From over- to undersmoothing

[Figure: KDE of the p-values with b = 0.05; x-axis: p-values (0 to 1), y-axis: Density (0 to 12)]



From over- to undersmoothing

[Figure: KDE of the p-values with b = 0.02; x-axis: p-values (0 to 1), y-axis: Density (0 to 12)]



From over- to undersmoothing

[Figure: KDE of the p-values with b = 0.005; x-axis: p-values (0 to 1), y-axis: Density (0 to 12)]



Some kernels

[Figure: shapes of five kernels on their supports: Uniform, Epanechnikov, Biweight (top row); Gauss, Triangular (bottom row)]



Some kernels

$$K(x; p) = \frac{(1 - x^2)^p}{2^{2p+1}\, B(p+1,\, p+1)}\, \mathbf{1}_{\{|x| < 1\}}$$

with $B(a, b) = \Gamma(a)\Gamma(b)/\Gamma(a + b)$.

- p = 0: Uniform kernel.
- p = 1: Epanechnikov kernel.
- p = 2: Biweight kernel. (R sketch below.)
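A small R sketch of this kernel family (function name is my own); beta() is R's beta function B(a, b):

    # Symmetric beta kernel family:
    # p = 0 uniform, p = 1 Epanechnikov, p = 2 biweight.
    beta_kernel <- function(x, p) {
      ifelse(abs(x) < 1,
             (1 - x^2)^p / (2^(2 * p + 1) * beta(p + 1, p + 1)),
             0)
    }

    # Sanity check: each member integrates to 1 over [-1, 1].
    integrate(beta_kernel, -1, 1, p = 1)   # Epanechnikov, ~1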



Kernel efficiency

- The performance of a kernel is measured by the MISE (mean integrated squared error) or the AMISE (asymptotic MISE).
- The Epanechnikov kernel minimizes the AMISE and is therefore optimal.
- Kernel efficiency is measured relative to the Epanechnikov kernel.



Kernel Efficiency

Kernel          Efficiency
Epanechnikov    1.000
Biweight        0.994
Triangular      0.986
Normal          0.951
Uniform         0.930

Choice of kernel is not as important as choice of bandwidth.



Modified KDEs

- Local KDE: the bandwidth depends on x.
- Variable KDE: smooths out the influence of points in sparse regions.
- Transformation KDE: if f is difficult to estimate (highly skewed, high kurtosis), transform the data to obtain a pdf that is easier to estimate.



Bandwidth selection

- Simple versus high-tech selection rules.
- Objective function: the MISE or AMISE.
- The R function density() offers several selection rules; see the sketch below.
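A brief sketch of these rules on toy data; the bw argument of density() selects the rule, as discussed on the following slides:

    # Bandwidth selection rules available through density().
    set.seed(1)
    x  <- rnorm(200)
    d0 <- density(x, bw = "nrd0")     # normal scale rule (default)
    d1 <- density(x, bw = "ucv")      # unbiased cross-validation
    d2 <- density(x, bw = "bcv")      # biased cross-validation
    d3 <- density(x, bw = "SJ-ste")   # Sheather-Jones, solve-the-equation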



bw.nrd0, bw.nrd

- Normal scale rule.
- Assumes f to be normal and calculates the AMISE-optimal bandwidth in this setting (formula below).
- A reasonable first guess, but it oversmooths if f is multimodal or otherwise non-normal.
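For reference (this is what R documents for bw.nrd0, not stated on the slide), the rule of thumb is

$$h = 0.9\, \min\!\left(\hat{\sigma},\, \frac{\mathrm{IQR}}{1.34}\right) n^{-1/5},$$

where $\hat{\sigma}$ is the sample standard deviation; bw.nrd uses the factor 1.06 in place of 0.9.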



bw.ucv

- Unbiased (or least-squares) cross-validation.
- Estimates part of the MISE by a leave-one-out KDE and minimizes this estimator with respect to h (criterion below).
- Problems: several local minima, high variability.
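For concreteness, the standard form of the criterion (not spelled out on the slide) is

$$\mathrm{UCV}(h) = \int \hat{f}(x; h)^2\, dx \;-\; \frac{2}{n} \sum_{i=1}^{n} \hat{f}_{-i}(X_i; h),$$

where $\hat{f}_{-i}$ is the KDE computed with observation $X_i$ left out; its expectation equals the MISE up to a constant that does not involve $h$, hence "unbiased".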



bw.bcv

- Biased cross-validation.
- The estimation is based on optimizing the AMISE rather than the MISE (which bw.ucv uses).
- Lower variance than UCV, but a sizeable bias.



bw.SJ(method=c("ste", "dpi"))

- The AMISE optimization involves estimating density functionals such as integrated squared density derivatives.
- dpi: Direct plug-in rule. Estimates the needed functionals by KDE. Problem: the choice of the pilot bandwidth.
- ste: Solve-the-equation rule. The pilot bandwidth depends on h. (Usage sketch below.)
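Both variants are available in base R; a brief usage sketch on toy data:

    # Sheather-Jones bandwidths for a toy sample.
    set.seed(1)
    x <- rnorm(200)
    bw.SJ(x, method = "dpi")   # direct plug-in
    bw.SJ(x, method = "ste")   # solve-the-equation (the default)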



Comparison of bandwidth selectors

- Simulation results depend on the selected true densities.
- Selectors with pilot bandwidths perform quite well, but they rely on asymptotics and are less accurate for densities with sharp features (e.g. multiple modes).
- UCV has high variance but does not depend on asymptotics.
- BCV performs poorly in several simulations.
- The authors' recommendation: DPI or STE over UCV or BCV.



KDE with Epanechnikov kernel and DPI rule
[Figure: KDE of the p-values with Epanechnikov kernel and DPI bandwidth; x-axis: p-values (0 to 1), y-axis: Density (0 to 12)]
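A plausible way to reproduce such an estimate in R (pvals again stands in for the p-value data, as in the earlier sketch):

    # Epanechnikov kernel combined with the direct plug-in rule.
    set.seed(1)
    pvals <- runif(500)^2   # toy stand-in for the p-values
    plot(density(pvals, kernel = "epanechnikov", bw = "SJ-dpi"),
         main = "KDE with Epanechnikov kernel and DPI rule")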

