
Introduction to Kernel Smoothing

[Figure: kernel density estimate of Wilcoxon scores; x-axis: Wilcoxon score (700 to 1300), y-axis: Density (0.000 to 0.006)]
M. P. Wand & M. C. Jones
Kernel Smoothing
Monographs on Statistics and Applied Probability
Chapman & Hall, 1995.

Stefanie Scheid, January 5, 2004


Introduction

[Figure: histogram of some p-values; x-axis: p-values (0 to 1), y-axis: Density (0 to 12)]



Introduction

- Estimation of functions such as regression functions or probability density functions.
- Kernel-based methods are the most popular non-parametric estimators.
- They can uncover structural features in the data which a parametric approach might not reveal.



Univariate kernel density estimator

Given a random sample $X_1, \ldots, X_n$ from a continuous, univariate density $f$, the kernel density estimator is

$$\hat{f}(x; h) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right)$$

with kernel $K$ and bandwidth $h$. Under mild conditions ($h$ must decrease with increasing $n$), the kernel estimate converges in probability to the true density.
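As a concrete illustration (not part of the original slides), here is a minimal R sketch of this estimator with a Gaussian kernel; all names are my own:

    # KDE with Gaussian kernel K = dnorm, evaluated at each
    # grid point x directly from the formula above.
    kde <- function(x, data, h) {
      n <- length(data)
      sapply(x, function(xi) sum(dnorm((xi - data) / h)) / (n * h))
    }

    # Toy usage; the result closely matches density(obs, bw = 0.5).
    set.seed(1)
    obs  <- rnorm(100, mean = 5)
    grid <- seq(0, 10, length.out = 200)
    est  <- kde(grid, obs, h = 0.5)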



The kernel K

- Can be a proper pdf; usually chosen to be unimodal and symmetric about zero.
- The center of the kernel is placed directly over each data point.
- The influence of each data point is thereby spread over its neighborhood.
- The contributions from all points are summed into the overall estimate.



Gaussian kernel density estimate
[Figure: Gaussian kernel density estimate; x-axis: 0 to 15, y-axis: 0.0 to 1.5]



The bandwidth h

- Scaling factor.
- Controls how widely the probability mass is spread around each point.
- Controls the smoothness or roughness of the density estimate.
- Bandwidth selection carries the danger of under- or oversmoothing, as the following slides and the sketch below illustrate.
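A sketch of how a sequence like the one on the next four slides can be produced with R's density(); since the actual p-values are not given here, a toy vector stands in:

    # From over- to undersmoothing: same data, shrinking bandwidth.
    set.seed(1)
    pvals <- runif(500)^2   # toy stand-in for the p-values
    for (b in c(0.1, 0.05, 0.02, 0.005)) {
      plot(density(pvals, bw = b), main = paste("KDE with b =", b))
    }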



From over- to undersmoothing

[Figure: KDE of the p-values with b = 0.1; x-axis: p-values (0 to 1), y-axis: Density (0 to 12)]



From over- to undersmoothing

[Figure: KDE of the p-values with b = 0.05; x-axis: p-values (0 to 1), y-axis: Density (0 to 12)]



From over- to undersmoothing

[Figure: KDE of the p-values with b = 0.02; x-axis: p-values (0 to 1), y-axis: Density (0 to 12)]



From over- to undersmoothing

[Figure: KDE of the p-values with b = 0.005; x-axis: p-values (0 to 1), y-axis: Density (0 to 12)]



Some kernels

[Figure: shapes of five kernels on their supports: Uniform, Epanechnikov, Biweight (top row); Gauss, Triangular (bottom row)]



Some kernels

$$K(x; p) = \frac{(1 - x^2)^p}{2^{2p+1}\, B(p+1,\, p+1)}\, \mathbf{1}_{\{|x| < 1\}}$$

with $B(a, b) = \Gamma(a)\Gamma(b)/\Gamma(a + b)$.

- p = 0: Uniform kernel.
- p = 1: Epanechnikov kernel.
- p = 2: Biweight kernel. (R sketch below.)
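A small R sketch of this kernel family (function name is my own); beta() is R's beta function B(a, b):

    # Symmetric beta kernel family:
    # p = 0 uniform, p = 1 Epanechnikov, p = 2 biweight.
    beta_kernel <- function(x, p) {
      ifelse(abs(x) < 1,
             (1 - x^2)^p / (2^(2 * p + 1) * beta(p + 1, p + 1)),
             0)
    }

    # Sanity check: each member integrates to 1 over [-1, 1].
    integrate(beta_kernel, -1, 1, p = 1)   # Epanechnikov, ~1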



Kernel efficiency

- The performance of a kernel is measured by the MISE (mean integrated squared error) or the AMISE (asymptotic MISE).
- The Epanechnikov kernel minimizes the AMISE and is therefore optimal.
- Kernel efficiency is measured relative to the Epanechnikov kernel.



Kernel Efficiency

Kernel          Efficiency
Epanechnikov    1.000
Biweight        0.994
Triangular      0.986
Normal          0.951
Uniform         0.930

Choice of kernel is not as important as choice of bandwidth.



Modified KDEs

- Local KDE: the bandwidth depends on x.
- Variable KDE: smooths out the influence of points in sparse regions.
- Transformation KDE: if f is difficult to estimate (highly skewed, high kurtosis), transform the data to obtain a pdf that is easier to estimate.



Bandwidth selection

- Simple versus high-tech selection rules.
- Objective function: the MISE or AMISE.
- The R function density() offers several selection rules; see the sketch below.
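A brief sketch of these rules on toy data; the bw argument of density() selects the rule, as discussed on the following slides:

    # Bandwidth selection rules available through density().
    set.seed(1)
    x  <- rnorm(200)
    d0 <- density(x, bw = "nrd0")     # normal scale rule (default)
    d1 <- density(x, bw = "ucv")      # unbiased cross-validation
    d2 <- density(x, bw = "bcv")      # biased cross-validation
    d3 <- density(x, bw = "SJ-ste")   # Sheather-Jones, solve-the-equation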



bw.nrd0, bw.nrd

- Normal scale rule.
- Assumes f to be normal and calculates the AMISE-optimal bandwidth in this setting (formula below).
- A reasonable first guess, but it oversmooths if f is multimodal or otherwise non-normal.
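For reference (this is what R documents for bw.nrd0, not stated on the slide), the rule of thumb is

$$h = 0.9\, \min\!\left(\hat{\sigma},\, \frac{\mathrm{IQR}}{1.34}\right) n^{-1/5},$$

where $\hat{\sigma}$ is the sample standard deviation; bw.nrd uses the factor 1.06 in place of 0.9.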



bw.ucv

- Unbiased (or least-squares) cross-validation.
- Estimates part of the MISE by a leave-one-out KDE and minimizes this estimator with respect to h (criterion below).
- Problems: several local minima, high variability.
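For concreteness, the standard form of the criterion (not spelled out on the slide) is

$$\mathrm{UCV}(h) = \int \hat{f}(x; h)^2\, dx \;-\; \frac{2}{n} \sum_{i=1}^{n} \hat{f}_{-i}(X_i; h),$$

where $\hat{f}_{-i}$ is the KDE computed with observation $X_i$ left out; its expectation equals the MISE up to a constant that does not involve $h$, hence "unbiased".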



bw.bcv

- Biased cross-validation.
- The estimation is based on optimizing the AMISE rather than the MISE (which bw.ucv uses).
- Lower variance than UCV, but a sizeable bias.



bw.SJ(method=c("ste", "dpi"))

- The AMISE optimization involves estimating density functionals such as integrated squared density derivatives.
- dpi: Direct plug-in rule. Estimates the needed functionals by KDE. Problem: the choice of the pilot bandwidth.
- ste: Solve-the-equation rule. The pilot bandwidth depends on h. (Usage sketch below.)
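Both variants are available in base R; a brief usage sketch on toy data:

    # Sheather-Jones bandwidths for a toy sample.
    set.seed(1)
    x <- rnorm(200)
    bw.SJ(x, method = "dpi")   # direct plug-in
    bw.SJ(x, method = "ste")   # solve-the-equation (the default)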



Comparison of bandwidth selectors

- Simulation results depend on the selected true densities.
- Selectors with pilot bandwidths perform quite well, but they rely on asymptotics and are less accurate for densities with sharp features (e.g. multiple modes).
- UCV has high variance but does not depend on asymptotics.
- BCV performs poorly in several simulations.
- The authors' recommendation: DPI or STE over UCV or BCV.



KDE with Epanechnikov kernel and DPI rule
[Figure: KDE of the p-values with Epanechnikov kernel and DPI bandwidth; x-axis: p-values (0 to 1), y-axis: Density (0 to 12)]
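A plausible way to reproduce such an estimate in R (pvals again stands in for the p-value data, as in the earlier sketch):

    # Epanechnikov kernel combined with the direct plug-in rule.
    set.seed(1)
    pvals <- runif(500)^2   # toy stand-in for the p-values
    plot(density(pvals, kernel = "epanechnikov", bw = "SJ-dpi"),
         main = "KDE with Epanechnikov kernel and DPI rule")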

