Gerhard Dikta
Marsel Scheer

Bootstrap Methods
With Applications in R

Gerhard Dikta
Department of Medical Engineering and Technomathematics
FH Aachen – University of Applied Sciences
Jülich, Nordrhein-Westfalen, Germany

Marsel Scheer
Bayer AG
Cologne, Nordrhein-Westfalen, Germany

ISBN 978-3-030-73479-4 ISBN 978-3-030-73480-0 (eBook)


https://doi.org/10.1007/978-3-030-73480-0

RStudio is a trademark of RStudio, PBC

© Springer Nature Switzerland AG 2021


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, expressed or implied, with respect to the material contained
herein or for any errors or omissions that may have been made. The publisher remains neutral with regard
to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
To our families:
Renate and Jan
Natalie, Nikolas and Alexander
for their support and patience
Preface

Efron’s introduction of the classical bootstrap in 1979 was the starting point of an
immense and lasting research activity. Accompanied and supported by the
improvement of PCs’ computing power, these methods are now an established
approach in applied statistics. The appealing simplicity makes it easy to use this
approach in different fields of science where statistics is applied.
The intention of this manuscript is to discuss the bootstrap concept in the context
of statistical testing, with a focus on its use or support in statistical modeling.
Furthermore, we would like to address different reader preferences with the content.
Specifically, we have thought of two types of readers. On the one hand, users of
statistics who have a solid basic knowledge of probability theory and who would
like to have a goal-oriented and short-term problem solution provided. On the other
hand, however, this book is also intended for readers who are more interested in the
theoretical background of a problem solution and who have advanced knowledge of
probability theory and mathematical statistics.
In most cases, we start a topic with some introductory examples, basic mathe-
matical considerations, and simple implementations of the corresponding algorithm.
A reader who is mainly interested in applying a particular approach may stop after
such a section and apply the discussed procedures and implementations to the
problem in mind. This introductory part to a topic is mainly addressed to the first
type of reader. It can also be used just to motivate bootstrap approximations and to
apply them in simulation studies on a computer. The second part of a topic covers
the mathematical framework and further background material. This part is mainly
written for those readers who have a strong background in probability theory and
mathematical statistics.
Throughout all chapters, computational procedures are provided in R. R is a
powerful statistical computing environment, which is freely available and can be
downloaded from the R-Project website at www.r-project.org. We focus only on a
few but very popular packages from the so-called tidyverse, mainly ggplot2 and
dplyr. This hopefully helps readers who are not familiar with R to understand the
implementations more easily, first because the packages make the source code quite
intuitive to read and second because, owing to their popularity, a lot of helpful information
can be found on the Internet. However, the repository of additional R-packages that
have been created by the R-community is immense, also with respect to
non-statistical aspects, which makes it worthwhile to learn and work with R. The
R-programs considered in the text are made available on the website
https://www.springer.com/gp/book/9783030734794.
The first three chapters provide introductory material and are mainly intended for
readers who have never come into contact with bootstrapping. Chapter 1 gives a
short introduction to the bootstrap idea and some notes on R. In Chap. 2, we
summarize some results about the generation of random numbers. Chapter 3 lists
some well-known results of the classical bootstrap method.
In Chap. 4, we discuss the first basic statistical tests using the bootstrap method.
Chapters 5 and 6 cover bootstrap applications in the context of linear and gener-
alized linear regression. The focus is on goodness-of-fit tests, which can be used to
detect contradictions between the data and the fitted model. We discuss the work of
Stute on marked empirical processes and transfer parts of his results into the
bootstrap context in order to approximate p-values for the individual
goodness-of-fit tests. Some of the results here are new, at least to the best of our
knowledge. Although the mathematics behind these applications is quite complex,
we consider these tests as useful tools in the context of statistical modeling and
learning. Some of the subsections focus exactly on this modeling topic.
In the appendix, some additional aspects of R with respect to bootstrap appli-
cations are illustrated. In the first part of this appendix, some applications of the
“boot” R-package of Brian Ripley, which can be obtained from the R-project’s
website, are demonstrated. The second part describes the “simTool” R-package of
Marsel Scheer, which was written to simplify the implementation of simulation
studies like bootstrap replications in R. This package also covers applications of
parallel programming issues. Finally, the usage of our “bootGOF” R-package is
illustrated, which provides a tool to perform goodness-of-fit tests for (linear) models
as discussed in Chap. 6.

Jülich, Germany Gerhard Dikta


January 2021 Marsel Scheer
Acknowledgements

The first three chapters of this manuscript were written during the time when the
first author was employed as a Research Assistant at the Chair for Mathematical
Stochastics of the Justus Liebig University in Gießen. They were prepared for a
summer course at the Department of Mathematical Sciences at the University of
Wisconsin-Milwaukee, which the first author taught in 1988 (and several times later
on) after completing his doctorate.
Special thanks must be given to Prof. Dr. Winfried Stute, who supervised the
first author in Giessen. Professor Stute realized the importance of the bootstrap
method at a very early stage and inspired and encouraged the first author's
interest in it. In addition, Prof. Stute together with Prof. Gilbert Walter from the
Department of Mathematical Science of the University of Wisconsin-Milwaukee
initiated a cooperation between the two departments, which ultimately formed the
basis for the long-lasting collaboration between the first author and his colleagues
from the statistics group in Milwaukee.
Financially, this long-term cooperation was later on supported by the Department
of Medical Engineering and Technomathematics of the Fachhochschule Aachen and
by the Department of Mathematical Sciences of the University of Wisconsin-
Milwaukee, and we would like to thank Profs. Karen Brucks, Allen Bell, Thomas
O’Bryan, and Richard Stockbridge for their kind assistance.
Finally, the first author would like to thank his colleagues from the statistics
group in Milwaukee, Jay Beder, Vytaras Brazauskas, and especially Jugal Ghorai,
and, from Fachhochschule Aachen, Martin Reißel for their helpful discussions and
support.
Also, the second author gives special thanks to Prof. Dr. Josef G. Steinebach
from the Department of Mathematics of the University of Cologne for his excellent
lectures in Statistics and Probability Theory.
We are both grateful to Dr. Andreas Kleefeld, who kindly provided us with
many comments and corrections to a preliminary version of the book.

Contents

1 Introduction
  1.1 Basic Idea of the Bootstrap
  1.2 The R-Project for Statistical Computing
  1.3 Usage of R in This Book
    1.3.1 Further Non-Statistical R-Packages
  References
2 Generating Random Numbers
  2.1 Distributions in the R-Package Stats
  2.2 Uniform df. on the Unit Interval
  2.3 The Quantile Transformation
  2.4 The Normal Distribution
  2.5 Method of Rejection
  2.6 Generation of Random Vectors
  2.7 Exercises
  References
3 The Classical Bootstrap
  3.1 An Introductory Example
  3.2 Basic Mathematical Background of the Classical Bootstrap
  3.3 Discussion of the Asymptotic Accuracy of the Classical Bootstrap
  3.4 Empirical Process and the Classical Bootstrap
  3.5 Mathematical Framework of Mallow’s Metric
  3.6 Exercises
  References
4 Bootstrap-Based Tests
  4.1 Introduction
  4.2 The One-Sample Test
  4.3 Two-Sample Tests
  4.4 Goodness-of-Fit (GOF) Test
  4.5 Mathematical Framework of the GOF Test
  4.6 Exercises
  References
5 Regression Analysis
  5.1 Homoscedastic Linear Regression under Fixed Design
    5.1.1 Model-Based Bootstrap
    5.1.2 LSE Asymptotic
    5.1.3 LSE Bootstrap Asymptotic
  5.2 Linear Correlation Model and the Bootstrap
    5.2.1 Classical Bootstrap
    5.2.2 Wild Bootstrap
    5.2.3 Mathematical Framework of LSE
    5.2.4 Mathematical Framework of Classical Bootstrapped LSE
    5.2.5 Mathematical Framework of Wild Bootstrapped LSE
  5.3 Generalized Linear Model (Parametric)
    5.3.1 Mathematical Framework of MLE
    5.3.2 Mathematical Framework of Bootstrap MLE
  5.4 Semi-parametric Model
    5.4.1 Mathematical Framework of LSE
    5.4.2 Mathematical Framework of Wild Bootstrap LSE
  5.5 Exercises
  References
6 Goodness-of-Fit Test for Generalized Linear Models
  6.1 MEP in the Parametric Modeling Context
    6.1.1 Implementation
    6.1.2 Bike Sharing Data
    6.1.3 Artificial Data
  6.2 MEP in the Semi-parametric Modeling Context
    6.2.1 Implementation
    6.2.2 Artificial Data
  6.3 Comparison of the GOF Tests under the Parametric and Semi-parametric Setup
  6.4 Mathematical Framework: Marked Empirical Processes
    6.4.1 The Basic MEP
    6.4.2 The MEP with Estimated Model Parameters Propagating in a Fixed Direction
    6.4.3 The MEP with Estimated Model Parameters Propagating in an Estimated Direction
  6.5 Mathematical Framework: Bootstrap of Marked Empirical Processes
    6.5.1 Bootstrap of the BMEP
    6.5.2 Bootstrap of the EMEP
  6.6 Exercises
  References
Appendix A: boot Package
Appendix B: simTool Package
Appendix C: bootGOF Package
Appendix D: Session Info
Index
Abbreviations

a.e. Almost everywhere


a.s. Almost sure
BMEP Basic marked empirical process
CLT Central limit theorem
CvM Cramér-von Mises
df. Distribution function
edf. Empirical distribution function of an i.i.d. sample
EMEP Estimated marked empirical process
EMEPE Estimated marked empirical process in estimated direction
GA General assumptions
GC Glivenko-Cantelli theorem
GLM Generalized linear model
GOF Goodness-of-fit
i.i.d. Independent and identically distributed
KS Kolmogorov-Smirnov
MEP Marked empirical process
MLE Maximum likelihood estimate
pdf. Probability density function
PRNG Pseudo-random number generators
qf. Quantile function
rv. Random variable
RSS Resampling scheme
SLLN Strong law of large numbers
W.l.o.g. Without loss of generality
WLLN Weak law of large numbers
w.p.1 With probability one


Notations

$A := B$  A is defined by B
$A \Leftrightarrow B$  A and B are equivalent
$\mathcal{B}_n$  Borel $\sigma$-algebra on $\mathbb{R}^n$
$C[0,1]$  Space of continuous, real-valued functions on the unit interval
$D[0,1]$  Skorokhod space on the unit interval
$E(X)$  Expectation of the random variable $X$
$E_n^*(X^*)$  Expectation of the bootstrap random variable $X^*$
$\mathrm{EXP}(a)$  Exponential distribution with parameter $a > 0$
$F_n$  Empirical distribution function
$I_{\{x \in A\}}$  Indicator function
$I_{\{A\}}(x)$  Indicator function
$I_p$  Identity matrix of size $p \times p$
$\langle \cdot, \cdot \rangle$  Inner product of a Hilbert space
$a \wedge b$  Minimum of $a$ and $b$
$N(\mu, \sigma^2)$  Normal distribution with expectation $\mu$ and variance $\sigma^2$
$P_n^*$  Probability measure corresponding to bootstrap rvs. based on $n$ original observations
P  Probability measure corresponding to the wild bootstrap
$R_n$  Basic marked empirical process (BMEP)
$R_n^1$  Marked empirical process with estimated parameters propagating in fixed direction (EMEP)
 1n
R Marked empirical process with estimated parameters propagating in
an estimated direction (EMEPE)
$\mathrm{UNI}(a, b)$  Uniform distribution on the interval $[a, b]$
UNI  Standard uniform distribution on the unit interval, i.e., $\mathrm{UNI}(0, 1)$
$\mathrm{VAR}(X)$  Variance of the random variable $X$
$\mathrm{VAR}_n^*(X^*)$  Variance of the bootstrap random variable $X^*$
$\mathrm{WEIB}(a, b)$  Weibull distribution with parameters $a$ and $b$
$X \sim F$  Random variable $X$ is distributed according to $F$
Chapter 1
Introduction

In this introduction, we discuss the basic idea of the bootstrap procedure using a
simple example. Furthermore, the statistical software R and its use in the context of
this manuscript are briefly covered. Readers who are familiar with this material can
skip this chapter.
A short summary of the contents of this manuscript can be found in the Preface
and is not listed here again.

1.1 Basic Idea of the Bootstrap

Typical statistical methods, such as constructing a confidence interval for the expected
value of a random variable or determining critical values for a hypothesis test, require
knowledge of the underlying distribution. However, this distribution is usually only
partially known at most. The statistical method we use to perform the task depends
on our knowledge of the underlying distribution.
Electronic supplementary material: The online version of this chapter
(https://doi.org/10.1007/978-3-030-73480-0_1) contains supplementary material, which is
available to authorized users.

Let us be more precise and assume that

\[ X_1, \ldots, X_n \sim F \]

is a sequence of independent and identically distributed (i.i.d.) random variables with
common distribution function (df.) $F$. Consider the statistic

\[ \bar{X}_n := \frac{1}{n} \sum_{i=1}^{n} X_i \]

to estimate the parameter $\mu_F = E(X)$, that is, the expectation of $X$.


To construct a confidence interval for $\mu_F$ or to perform a hypothesis test on $\mu_F$,
we consider the df. of the studentized version of $\bar{X}_n$, that is,

\[ P_F\left( \sqrt{n}\,(\bar{X}_n - \mu_F) / s_n \le x \right), \quad x \in \mathbb{R}, \tag{1.1} \]

where

\[ s_n^2 := \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X}_n)^2 \]

is the unbiased estimator of $\sigma^2 = \mathrm{VAR}(X)$, that is, the variance of $X$. Note that we
write $P_F$ here to indicate that $F$ is the data generating df.
If we know that $F$ comes from the class of normal distributions, then the df. under
(1.1) is that of a $t_{n-1}$-distribution, i.e., a Student's t distribution with $n - 1$ degrees
of freedom. Using the known quantiles of the $t_{n-1}$-distribution, an exact confidence
interval can be determined. For example, an exact 90% confidence interval for $\mu_F$ is
given by

\[ \left[ \bar{X}_n - \frac{s_n\, q_{0.95}}{\sqrt{n}},\; \bar{X}_n + \frac{s_n\, q_{0.95}}{\sqrt{n}} \right], \tag{1.2} \]

where $q_{0.95}$ is the 95% quantile of the $t_{n-1}$-distribution.


But in most situations we are not able to specify a parametric distribution class
for $F$. In such a case, we have to look for a suitable approximation for (1.1). If it is
ensured that $E(X^2) < \infty$, the central limit theorem (CLT) guarantees that

\[ \sup_{x \in \mathbb{R}} \left| P_F\left( \sqrt{n}\,(\bar{X}_n - \mu_F) / s_n \le x \right) - \Phi(x) \right| \longrightarrow 0, \quad \text{for } n \to \infty, \tag{1.3} \]

where $\Phi$ denotes the standard normal df. Based on the CLT, we can now construct
an asymptotic confidence interval. For example, the 90% confidence interval under
(1.2) has the same structure when we construct it using the CLT. However, $q_{0.95}$ now
is the 95% quantile of the standard normal distribution. The interval constructed in
this way is no longer an exact confidence interval. It can only be guaranteed that the
confidence level of 90% is reached as $n \to \infty$. It should also be noted that for $q_{0.95}$
the 95% quantile of the $t_{n-1}$-distribution can also be chosen, because for $n \to \infty$,
the $t_{n-1}$-df. converges to the standard normal df.
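The two versions of the interval (1.2) can be compared numerically. The following is a minimal sketch, assuming simulated normal data; the sample size, mean, and standard deviation are arbitrary choices for illustration and are not taken from the book.

```r
# Simulated data; distribution and sample size are illustrative assumptions.
set.seed(1)
x <- rnorm(25, mean = 10, sd = 2)

n     <- length(x)
x_bar <- mean(x)
s_n   <- sd(x)  # square root of the unbiased variance estimator s_n^2

# Exact 90% interval (1.2) under normality:
# q_0.95 is the 95% quantile of the t-distribution with n - 1 degrees of freedom.
q_t  <- qt(0.95, df = n - 1)
ci_t <- c(lower = x_bar - s_n * q_t / sqrt(n),
          upper = x_bar + s_n * q_t / sqrt(n))

# Asymptotic 90% interval based on the CLT:
# q_0.95 is now the 95% quantile of the standard normal distribution.
q_z  <- qnorm(0.95)
ci_z <- c(lower = x_bar - s_n * q_z / sqrt(n),
          upper = x_bar + s_n * q_z / sqrt(n))

print(ci_t)
print(ci_z)
```

Both intervals are centered at the sample mean. For finite n the t-quantile exceeds the normal quantile, so the t-based interval is slightly wider, and the difference vanishes as n grows.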
So far we have concentrated exclusively on the studentized mean. Let us generalize
this to a statistic of the type

\[ T_n(F) = T_n(X_1, \ldots, X_n; F), \]

where $X_1, \ldots, X_n \sim F$ are i.i.d. Then the question arises how to approximate the
df.

\[ P_F\left( T_n(F) \le x \right), \quad x \in \mathbb{R} \tag{1.4} \]

if $F$ is unknown. This is where Efron's bootstrap enters the game. The basic idea of
the bootstrap method is the assumption that the df. of $T_n$ is about the same when the
data generating distribution $F$ is replaced by another data generating distribution $\hat{F}$
which is close to $F$ and which is known to us. If we can find such a df. $\hat{F}$,

\[ P_{\hat{F}}\left( T_n(\hat{F}) \le x \right), \quad x \in \mathbb{R} \tag{1.5} \]

may also be an approximation of Eq. (1.4). We call this df. for the moment a bootstrap
approximation of the df. given under Eq. (1.4). However, this approach only makes
sense if we can guarantee that

\[ \sup_{x \in \mathbb{R}} \left| P_F\left( T_n(F) \le x \right) - P_{\hat{F}}\left( T_n(\hat{F}) \le x \right) \right| \longrightarrow 0, \quad \text{for } n \to \infty. \tag{1.6} \]

Now let us go back to construct a 90% confidence interval for $\mu_F$ based on the
bootstrap approximation. For this, we take the studentized mean for $T_n$ and assume
that we have a data generating df. $\hat{F}$ that satisfies (1.6). Since $\hat{F}$ is known, we can
now, at least theoretically, calculate the 5% and 95% quantiles of the df.

\[ P_{\hat{F}}\left( \sqrt{n}\,(\bar{X}_n - \mu_{\hat{F}}) / s_n \le x \right), \]

which we denote by $q_{n,0.05}$ and $q_{n,0.95}$, respectively, to derive

\[ \left[ \bar{X}_n - \frac{s_n\, q_{n,0.95}}{\sqrt{n}},\; \bar{X}_n - \frac{s_n\, q_{n,0.05}}{\sqrt{n}} \right], \tag{1.7} \]

an asymptotic 90% confidence interval for $\mu_F$.


If we want to use such a bootstrap approach, we have
(A) to choose the data generating df. F̂ such that the bootstrap approximation (1.6)
holds,
(B) to calculate the df. of Tn , where the sample is generated under F̂.
Certainly (A) is the more demanding part, in particular, the proof of the approximation
(1.6). Fortunately, a lot of work has been done on this in the last decades. Also, the
calculation of the df. under (B) may turn out to be very complex. However, this is of
minor importance, because the bootstrap df. in Eq. (1.6) can be approximated very
well by a Monte Carlo approach. It is precisely this opportunity to perform a Monte
Carlo approximation, together with the rapid development of powerful PCs, that has
led to the great success of the bootstrap approach.
To demonstrate such a Monte Carlo approximation for the df. of Eq. (1.5), we
proceed as follows:

(a) Construct $m$ i.i.d. (bootstrap) samples independent of one another of the type

\[ \begin{array}{ccc} X^*_{1;1} & \ldots & X^*_{1;n} \\ \vdots & \vdots & \vdots \\ X^*_{m;1} & \ldots & X^*_{m;n} \end{array} \]

with common df. $\hat{F}$.

(b) Calculate for each sample $k \in \{1, 2, \ldots, m\}$

\[ T^*_{k;n} := T_n(X^*_{k;1}, \ldots, X^*_{k;n}; \hat{F}) \]

to obtain $T^*_{1;n}, \ldots, T^*_{m;n}$.

(c) Since the $T^*_{1;n}, \ldots, T^*_{m;n}$ are i.i.d., the Glivenko-Cantelli theorem (GC) guarantees

\[ \sup_{x \in \mathbb{R}} \left| P_{\hat{F}}\left( T_n(\hat{F}) \le x \right) - \frac{1}{m} \sum_{k=1}^{m} I_{\{T^*_{k;n} \le x\}} \right| \longrightarrow 0, \quad \text{for } m \to \infty, \tag{1.8} \]

where $I_{\{x \in A\}} \equiv I_{\{A\}}(x)$ denotes the indicator function of the set $A$, that is,

\[ I_{\{x \in A\}} = \begin{cases} 1 & : x \in A \\ 0 & : x \notin A \end{cases}. \]
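Steps (a) to (c) can be sketched in a few lines of R. This is only an illustration under stated assumptions: the known df. is taken to be the exponential distribution with rate 1, so that its expectation is 1, and the statistic is the studentized mean. The names `r_fhat`, `mu_fhat`, and `studentized_mean` are hypothetical helpers introduced here, not functions from the book.

```r
set.seed(42)
n <- 20     # sample size
m <- 2000   # number of Monte Carlo replications

# Assumed known df. F_hat: exponential with rate 1, hence mu_{F_hat} = 1.
r_fhat  <- function(n) rexp(n, rate = 1)
mu_fhat <- 1

# T_n: studentized mean of a sample x, centered at mu.
studentized_mean <- function(x, mu) {
  sqrt(length(x)) * (mean(x) - mu) / sd(x)
}

# (a) m independent samples of size n from F_hat.
#     replicate() returns an n x m matrix, i.e., one sample per column.
x_star <- replicate(m, r_fhat(n))

# (b) Evaluate the statistic for every sample.
t_star <- apply(x_star, 2, studentized_mean, mu = mu_fhat)

# (c) The edf. of the m values approximates the df. in Eq. (1.5).
mc_df <- ecdf(t_star)
mc_df(0)                                 # approximates P_{F_hat}(T_n <= 0)
quantile(t_star, probs = c(0.05, 0.95))  # approximate 5% and 95% quantiles
```

Increasing `m` tightens the Monte Carlo approximation in (1.8). It does not affect the quality of the bootstrap approximation (1.6), which depends on n and on the choice of the known df.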

The choice of an appropriate $\hat{F}$ depends on the underlying problem, as we will
see in the following chapters. In the context of this introduction, $F_n$, the empirical
df. (edf.) of the sample $X_1, \ldots, X_n$, defined by

\[ F_n(x) := \frac{1}{n} \sum_{i=1}^{n} I_{\{X_i \le x\}}, \quad x \in \mathbb{R}, \tag{1.9} \]

is a good choice for $\hat{F}$ since, by the Glivenko-Cantelli theorem, we get with
probability one (w.p.1)

\[ \sup_{x \in \mathbb{R}} \left| F_n(x) - F(x) \right| \underset{n \to \infty}{\longrightarrow} 0. \]

If we choose $F_n$ for $\hat{F}$, then we are talking about the classical bootstrap, which was
historically the first to be studied.
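For the classical bootstrap, drawing from the edf. amounts to sampling with replacement from the observed data, and the expectation under the edf. is the sample mean. The following sketch combines this with the Monte Carlo steps to obtain the interval (1.7). The simulated data and the number of replications are assumptions made for illustration, and studentizing each bootstrap sample with its own standard deviation is one common reading of the bootstrap version of the statistic.

```r
set.seed(123)
x <- rexp(30, rate = 0.5)  # simulated data, chosen only for illustration
n <- length(x)
m <- 2000                  # number of bootstrap replications

x_bar <- mean(x)           # also the expectation under F_n
s_n   <- sd(x)

# Monte Carlo approximation of the df. of the studentized mean under F_n.
t_star <- replicate(m, {
  x_star <- sample(x, size = n, replace = TRUE)  # i.i.d. draws from F_n
  sqrt(n) * (mean(x_star) - x_bar) / sd(x_star)
})

# Approximate 5% and 95% quantiles q_{n,0.05} and q_{n,0.95}.
q <- quantile(t_star, probs = c(0.05, 0.95), names = FALSE)

# Interval (1.7): the upper quantile enters the lower bound and vice versa.
ci_boot <- c(lower = x_bar - s_n * q[2] / sqrt(n),
             upper = x_bar - s_n * q[1] / sqrt(n))
ci_boot
```

Unlike the intervals based on t or normal quantiles, this interval need not be symmetric around the sample mean, because the two bootstrap quantiles generally differ in absolute value.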

1.2 The R-Project for Statistical Computing

The programming language R, see R Core Team (2019), is a widely used open-source
software tool for data analysis and graphics which runs on the commonly
used operating systems. It can be downloaded from the R-project’s website at
www.r-project.org. The R Development Core Team also offers some documentation on this
website:
• R installation and administration,
• An introduction to R,
• The R language definition,
• R data import/export, and
• The R reference index.
In addition to this material, there is a large and rapidly growing number of textbooks
available covering the R programming language and the applications of R in
different fields of data analysis, for instance, Beginning R or Advanced R.
Besides the R software, one should also install an editor or an integrated development
environment (IDE) to work with R conveniently. Several open-source products
are available on the web, like
• RStudio, see RStudio Team (2020), at www.rstudio.org;
• RKWard, at http://rkward.sourceforge.net;
• Tinn-R, at http://www.sciviews.org/Tinn-R; and
• Eclipse-based StatET, at http://www.walware.de/goto/statet.

1.3 Usage of R in This Book

Throughout the book we implement, for instance, different resampling schemes and
simulation studies in R. Our implementations do not check their function
arguments. We provide R-code that focuses solely on an understandable implementation
of a certain algorithm. Therefore, there is plenty of room to improve the implementations.
Some of these improvements will be discussed within the exercises.
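To illustrate what such an improvement might look like, here is a small sketch that contrasts a resampling helper without argument checks with a variant that validates its input. Both functions, `resample` and `resample_checked`, are hypothetical examples written for this illustration and do not appear in the book.

```r
# Hypothetical helper without any argument checking:
# draws one bootstrap sample (with replacement) from x.
# Index-based resampling avoids the special behavior of sample()
# when x is a single number.
resample <- function(x) {
  x[sample.int(length(x), replace = TRUE)]
}

# The same helper with basic argument checks added.
resample_checked <- function(x) {
  stopifnot(is.numeric(x), length(x) > 0, !anyNA(x))
  x[sample.int(length(x), replace = TRUE)]
}

set.seed(1)
resample_checked(c(2.3, 1.7, 4.1))

# An invalid argument now fails early instead of silently returning
# a resampled character vector.
res <- try(resample_checked("a"), silent = TRUE)
inherits(res, "try-error")
```

The unchecked version is shorter and easier to read, which matches the stated focus on understandable code. The checked version is what one would typically prefer in a package or in production code.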
R is organized in packages. A new installation of R comes with some pre-installed
packages, and the packages provided by the R-community make this programming
language really powerful. More than 15000 packages (as of February 2020) are available,
and the number is still growing. Especially for people starting with R, however, this
abundance is also a problem. The
CRAN Task View https://cran.r-project.org/web/views summarizes certain packages
within categories like “Graphics”, “MachineLearning”, or “Survival”. We decided
to use only a handful of packages that are directly related to the main objective
of this book, like the boot-package for bootstrapping, or (in the opinion of the
authors) are too important and helpful to be ignored, like ggplot2, dplyr, and
tidyr. In addition, we have often used the simTool package from Marsel Scheer
to carry out simulations. This package is explained in the appendix. Furthermore,
we decided to use the pipe operator, i.e., %>%. There are a few critical voices about
this operator, but the authors, like most R users, find it very comfortable to work
with the pipe operator. People familiar with Unix systems will recognize the concept
and probably appreciate it. A small example will demonstrate how the pipe operator
works. Suppose we want to apply a function A to the object x and the result of this
operation should be processed further by the function B. Without the pipe operator
one could use
B(A(x))
# or
tmp = A(x)
B(tmp)

With the pipe operator this becomes


A(x) %>%
B
# or
x %>%
A %>%
B

Especially with longer chains of functions, using pipes may help to obtain R-code
that is easier to understand.

1.3.1 Further Non-Statistical R-Packages

There are a lot of packages that are worth a look. Again, the CRAN Task View may
be a good starting point. The following list is focused on writing reports, developing
R-packages, and increasing the speed of R-code itself. This list is by no means exhaustive:
• knitr for writing reports (this book was written with knitr);
• readxl for the import of Excel files;
• testthat for creating automated unit tests. It is also helpful for checking
function arguments;
• covr for assessing test coverage of the unit tests;
• devtools for creating/writing packages;
• data.table for amazingly fast aggregation, joins, and various manipulations of
large datasets;
• roxygen2 for creating help pages within packages;
• Rcpp for a simple integration of C++ into R;
• profvis, a profiling tool that assesses at which line of code R spends its time;
• checkpoint, renv for managing package dependencies.
Of course, further packages for importing datasets, connecting to databases, cre-
ating interactive graphs and user interfaces, and so on exist. Again, the packages
provided by the R-community make this programming language really powerful.

Finally, we want to strongly recommend the R-package drake. According to the
package manual: “It analyzes your workflow, skips steps with up-to-date results, and
orchestrates the rest with optional distributed computing.” We want to briefly describe
how this works in principle. One defines a plan with the steps one wants to perform:
plan <- drake::drake_plan(
  raw = import_data("/foo/bar/data.csv"),
  wrangled = preprocess(raw),
  model1 = fit1(wrangled),
  model2 = fit2(wrangled)
)

This plan can then be executed/processed by drake.


drake::make(plan)

This creates the four objects raw, wrangled, model1, and model2. Assume now
that we change the underlying source code for one of the model-fitting functions. Then
there is, of course, no need to rerun the preprocess step. Since drake has analyzed our
plan, it automatically skips the import and preprocessing for us. This can be
extremely helpful if the preprocess step is computationally intensive. Or imagine the
situation that we refactor the data-import function. If these changes do not modify the
raw object created in the first step, then again there is no need to rerun the preprocess
step or to refit the models. Again, drake automatically detects that and skips these
subsequent steps. Furthermore, looking at the definition of model1 and model2, we see
that there is no logical need to process them sequentially, and with drake one can
easily do the computation in parallel. The package also does a lot of other helpful
things in the background; for instance, it measures the time used to perform a single
step of the plan. Although we do not use drake in this book, we encourage the reader
to try out the package. A good starting point is the excellent user manual accessible
under https://books.ropensci.org/drake.

Chapter 2
Generating Random Numbers

To perform a Monte Carlo approximation, we have to generate random variables
(rv.) on a computer according to a given df. F. In this chapter, we will discuss some
commonly used procedures and their application under R.
Since most of the widely used distributions are implemented in R, random variables
according to these distributions can easily be generated directly in R through
the corresponding built-in R functions. In the first section of this chapter, we will
give a brief overview of those distributions which are implemented in the R stats
package.
However, if a specific distribution is needed which is neither supported by R itself
nor by any additional package, one can try the “quantile transformation method”
or the “method of rejection”. Both approaches are considered in this chapter. For a
detailed discussion of random number generation, we refer to Devroye (1986) and
Ripley (1987). In Eubank and Kupresanin (2011, Chapter 4), this is also considered
in the R-context.

2.1 Distributions in the R-Package Stats

The standard R-package stats contains several standard probability distributions.
We can list them from an R workspace by typing the command
help(distributions)

For all these distributions, the corresponding cumulative distribution function,
density function, quantile function, and random generation function are implemented
and can be called by
• dxxx(...): density function;
• pxxx(...): distribution function;
• qxxx(...): quantile function; and
• rxxx(...): random number generator function.

© Springer Nature Switzerland AG 2021
G. Dikta and M. Scheer, Bootstrap Methods,
https://doi.org/10.1007/978-3-030-73480-0_2

In the notation above, “xxx” is the name in R of the corresponding distribution and
(...) a placeholder for the required parameters of the function call. The following
example lists some calls regarding a normal distribution with expected value $\mu = 2$
and variance $\sigma^2 = 4$, here abbreviated as $N(2, 4)$.
R-Example 2.1 Note that in the corresponding function calls under R, the standard
deviation (sd = 2) is used while in the notation $N(2, 4)$ the variance $\sigma^2 = 4$ is given.
The R name “xxx” of the normal distribution is “norm”.
#call the help for rnorm
help(rnorm)
#density function at x = 2
dnorm(x = 2, mean = 2, sd = 2)

## [1] 0.1994711

#distribution function at q = 2
pnorm(q = 2, mean = 2, sd = 2)

## [1] 0.5

#0.5-quantile
qnorm(p = 0.5, mean = 2, sd = 2)

## [1] 2

#3 normal random variables


rnorm(n = 3, mean = 2, sd = 2)

## [1] 1.008104 3.768502 2.064348
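
The same naming scheme applies to the other distributions listed under help(distributions). As a small sketch, the following calls use the exponential distribution (R name “exp”) with rate 2. The evaluation point 0.3 and the seed are arbitrary choices made for this illustration.

```r
# density, distribution function and quantile function of EXP(2)
d <- dexp(x = 0.3, rate = 2)   # equals 2 * exp(-2 * 0.3)
p <- pexp(q = 0.3, rate = 2)   # equals 1 - exp(-2 * 0.3)
q <- qexp(p = p, rate = 2)     # qexp inverts pexp, so q is 0.3 up to rounding

# three exponential random variables
set.seed(123)
r <- rexp(n = 3, rate = 2)
```

Note that the prefix (d, p, q, r) selects the kind of function, while the suffix selects the distribution.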

2.2 Uniform df. on the Unit Interval

A rv. $U$ is uniformly distributed on the interval $[a, b]$, where $-\infty < a < b < \infty$, if

$$P(U \le u) = \begin{cases} 0 & : u < a \\ (u - a)/(b - a) & : a \le u \le b \\ 1 & : u > b. \end{cases}$$

We denote this distribution here by $\mathrm{UNI}(a, b)$ and use $\mathrm{UNI}$ to abbreviate
$\mathrm{UNI}(0, 1)$, the standard uniform distribution, which is also referred to as the
uniform distribution.
The uniform distribution is the most important one in generating rvs. As we will
see in the next section, we can generate a rv. $X \sim F$ for every df. $F$ if we can generate
a rv. $U \sim \mathrm{UNI}$.
There is a large literature on generating sequences of independent and uniformly
distributed rvs. which we will not discuss here. Eubank and Kupresanin (2011,
Chapter 4) is a good reference for pseudo-random number generators (PRNG), which
specifically addresses R.
Remark 2.2 In this manuscript, we usually take “Mersenne twister” as PRNG. If a
normally distributed rv. is to be created, this is done using the “inversion” method. For
reasons of reproducibility, a starting value (“set.seed”) is set before each simulation.
This call also specifies the name of the PRNG used and the name of the method for
generating normally distributed rvs. A typical call looks like
set.seed(123,kind ="Mersenne-Twister",normal.kind ="Inversion")

With the simTool package, simulations can also be run in parallel. In this case,
“L’Ecuyer-CMRG” is set globally as PRNG!
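
To illustrate the effect of the call in Remark 2.2, here is a minimal sketch: setting the same seed twice reproduces the same random numbers. The variable names first and second are chosen for this illustration only.

```r
set.seed(123, kind = "Mersenne-Twister", normal.kind = "Inversion")
first <- rnorm(3)

# resetting the same seed restores the state of the generator
set.seed(123, kind = "Mersenne-Twister", normal.kind = "Inversion")
second <- rnorm(3)

identical(first, second)   # the two vectors coincide
RNGkind()                  # reports the PRNG and the normal generator in use
```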

2.3 The Quantile Transformation

The following theorem says that we can generate a rv. $X$ according to an arbitrary
df. $F$ if we apply a certain transformation to a generated rv. $U \sim \mathrm{UNI}$.
Theorem 2.3 Let $F$ be the df. of a rv. $X$ and define for $0 < u < 1$

$$F^{-1}(u) = \inf\{x \in \mathbb{R} \mid F(x) \ge u\} \qquad (2.1)$$

the quantile function (qf.). If $U \sim \mathrm{UNI}$ then:

$$X := F^{-1}(U) \sim F.$$

Proof At first note that the qf. equals the inverse function of $F$ if $F$ is strictly
increasing. If this is not the case, $F^{-1}$ is still well defined and therefore the qf. is a
generalized inverse of an increasing function.

We have to show that

$$P(F^{-1}(U) \le x) = F(x), \quad \forall\, x \in \mathbb{R}.$$

For this choose $x \in \mathbb{R}$ and $0 < u < 1$ arbitrarily. Then the following equivalence
holds:

$$F^{-1}(u) \le x \iff u \le F(x) \qquad (2.2)$$

“⇐:” If $u \le F(x)$, apply the definition of $F^{-1}$ to get $F^{-1}(u) \le x$.
“⇒:” Assume now $F^{-1}(u) \le x$ and continue indirectly. For this assume further
that $u > F(x)$. Since $F$ is continuous from above (i.e., right-continuous) there exists
$\varepsilon > 0$ such that $u > F(x + \varepsilon)$. Apply the definition of $F^{-1}$ to get
$F^{-1}(u) \ge x + \varepsilon > x$. This contradicts $F^{-1}(u) \le x$, and therefore $u \le F(x)$.
Now, apply (2.2) to get for arbitrary $x \in \mathbb{R}$:

$$P(F^{-1}(U) \le x) = P(U \le F(x)) = F(x) = P(X \le x),$$

where the second equality follows from $U \sim \mathrm{UNI}$. This finally proves the
theorem. □
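
Theorem 2.3 does not require $F$ to be continuous or strictly increasing. As a rough sketch, the following code applies the generalized inverse (2.1) to a discrete df. The support points, the probabilities, and the name q.discrete are hypothetical choices for this illustration.

```r
# hypothetical discrete distribution:
# P(X = 1) = 0.25, P(X = 2) = 0.5, P(X = 5) = 0.25
support <- c(1, 2, 5)
probs <- c(0.25, 0.5, 0.25)
cdf <- cumsum(probs)   # F evaluated at the support points

# F^{-1}(u) = inf{x : F(x) >= u}: the smallest support point with F(x) >= u
q.discrete <- function(u) {
  sapply(u, function(ui) support[which(cdf >= ui)[1]])
}

set.seed(123, kind = "Mersenne-Twister", normal.kind = "Inversion")
obs <- q.discrete(runif(10000))

# relative frequencies, to be compared with probs
table(obs) / length(obs)
```

The relative frequencies should be close to the probabilities 0.25, 0.5, and 0.25.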

Example 2.4 Let $U \sim \mathrm{UNI}$ and

$$F(x) := \begin{cases} 0 & : x \le 0 \\ 1 - \exp(-\alpha x) & : x > 0 \end{cases}$$

the df. of the exponential distribution with parameter $\alpha > 0$, abbreviated by $\mathrm{EXP}(\alpha)$.
Calculate the inverse of $F$ to get

$$F^{-1}(u) = -\frac{\ln(1 - u)}{\alpha}.$$

The last theorem guarantees that $F^{-1}(U) \sim \mathrm{EXP}(\alpha)$.

R-Example 2.5 This example shows the generation of 1000 EXP(2) variables with
R based on the quantile transformation derived in Example 2.4.
gen.exp <- function(n, alpha){
  # n - number of observations
  # alpha - distribution parameter
  return(-log(1 - runif(n)) / alpha)
}

# set the seed for the pseudo random number generator
# for reproducible results
set.seed(123, kind = "Mersenne-Twister", normal.kind = "Inversion")

# generate 1000 EXP(2) random variables
obs <- gen.exp(n = 1000, alpha = 2)

# draw a histogram with 50 cells
hist(obs, breaks = 50, freq = FALSE,
     main = "Histogram of 1000 EXP(2)",
     xlab = "", ylab = "density",
     xlim = c(0, 4),
     ylim = c(0, 2))

# add the density function of an EXP(2) distributed random
# variable to the plot
curve(dexp(x, rate = 2), add = TRUE, col = "red")

Fig. 2.1 Histogram of 1000 EXP(2) distributed random variables and the EXP(2)-density

In the first statement, the R-function “gen.exp” is defined with two parameters, n
and alpha. It implements the result derived in Example 2.4 and returns a vector of
n independent realizations of the EXP(alpha) distribution. In the second statement,
the seed is set for the pseudo-random number generator, which is here “Mersenne-Twister”,
to obtain reproducible results. In statement three, “gen.exp” is applied with
n = 1000 and alpha = 2, and the resulting vector is stored in the variable “obs”.
With the fourth statement a histogram of the generated variables is produced and
with the last statement this histogram is overlaid with the true density function of the
EXP(2) distribution, see Fig. 2.1.
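
Besides the visual comparison in Fig. 2.1, the generator can also be checked numerically. The following is a minimal sketch, not part of the original example: it compares the closed-form inverse with the built-in qexp, and the generated sample with the EXP(2) distribution. The sample size 100000 and the grid u are arbitrary choices.

```r
gen.exp <- function(n, alpha) {
  -log(1 - runif(n)) / alpha
}

# the closed-form inverse agrees with R's built-in quantile function
u <- c(0.1, 0.5, 0.9)
all.equal(-log(1 - u) / 2, qexp(u, rate = 2))

# the sample mean should be close to the expected value 1 / alpha = 0.5
set.seed(123, kind = "Mersenne-Twister", normal.kind = "Inversion")
obs <- gen.exp(n = 100000, alpha = 2)
mean(obs)

# Kolmogorov-Smirnov distance between the empirical df. and pexp
ks.stat <- ks.test(obs, "pexp", rate = 2)$statistic
ks.stat
```

A small Kolmogorov-Smirnov distance indicates that the empirical df. of the sample is close to the df. of EXP(2).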
In the following lemma, some further properties of the quantile function are listed:

Lemma 2.6 Let $F$ be an arbitrary df. and denote by $F^{-1}$ the corresponding quantile
function. We have for $x, x_1, x_2 \in \mathbb{R}$ and $0 < u < 1$:
(i) $F(x) \ge u \iff F^{-1}(u) \le x$.
(ii) $F(x) < u \iff F^{-1}(u) > x$.
(iii) $F(x_1) < u \le F(x_2) \iff x_1 < F^{-1}(u) \le x_2$.

Proof (i) Already shown under (2.2) of Theorem 2.3.
(ii) Consequence of part (i).
(iii) Consequence of parts (i) and (ii). □



Lemma 2.7 Let $F$ be an arbitrary df. and $0 < u < 1$. Then

$$F \circ F^{-1}(u) \ge u.$$

If $u \in F(\mathbb{R})$ the inequality above changes to an equality.

Proof The inequality can be obtained from Lemma 2.6 (ii), since $F \circ F^{-1}(u) < u$
would result in the obvious contradiction $F^{-1}(u) > F^{-1}(u)$.
Now assume in addition that $u \in F(\mathbb{R})$, i.e., there exists $x \in \mathbb{R}$ such that $u = F(x)$.
Therefore, by definition of $F^{-1}$, we get $F^{-1}(u) \le x$. Applying $F$ to both sides
of this inequality, the monotonicity of $F$ implies that $F \circ F^{-1}(u) \le F(x) = u$. Thus,
$F \circ F^{-1}(u) > u$ is not possible and according to the first part of the proof we get
$F \circ F^{-1}(u) = u$. □

Corollary 2.8 Let $X$ be a rv. with continuous df. $F$. Then

$$F(X) \sim \mathrm{UNI}.$$

Proof According to Theorem 2.3, we can assume that

$$X = F^{-1}(U),$$

where $U \sim \mathrm{UNI}$. Thus, it remains to show that $F \circ F^{-1}(U) \sim \mathrm{UNI}$. For this
choose $0 < u < 1$ arbitrarily. Since $F$ is continuous, every value in $(0, 1)$ lies in
$F(\mathbb{R})$, so the last lemma yields $F \circ F^{-1}(U) = U$ and therefore

$$P(F \circ F^{-1}(U) \le u) = P(U \le u) = u$$

which proves the corollary. □
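
Corollary 2.8 can be illustrated with a short simulation. This is only a sketch under the assumption $X \sim \mathrm{EXP}(2)$, chosen because its df. is continuous and available as pexp. The sample size is arbitrary.

```r
set.seed(123, kind = "Mersenne-Twister", normal.kind = "Inversion")
x <- rexp(100000, rate = 2)   # X ~ EXP(2), continuous df. F
u <- pexp(x, rate = 2)        # F(X)

mean(u)   # close to 1/2, the expected value of UNI
var(u)    # close to 1/12, the variance of UNI
```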

We conclude the section with another inequality for the quantile function.

Lemma 2.9 For each df. $F$ and $x \in \mathbb{R}$, we have

$$F^{-1} \circ F(x) \le x.$$

If in addition $x$ fulfills the extra condition that for all $y < x$, $F(y) < F(x)$ holds,
then the inequality above changes to an equality.

Proof If $F^{-1} \circ F(x) > x$ for $x \in \mathbb{R}$, then Lemma 2.6 (ii) immediately yields the
contradiction $F(x) < F(x)$. Thus, the inequality stated above is correct.
Now, assume the extra condition of the lemma for the point $x \in \mathbb{R}$. According to the
part just shown, we have to prove that $F^{-1}(F(x)) < x$ cannot be correct. Assuming
that this inequality is correct, Lemma 2.7 and the extra condition imply

$$F(x) \le F \circ F^{-1}(F(x)) < F(x)$$

which is obviously a contradiction. □

2.4 The Normal Distribution

Theorem 2.3 of the last section shows how the quantile function can be used to
generate a rv. according to a given df. $F$. However, the quantile function might
be difficult to calculate. Therefore, the procedure suggested under Theorem 2.3 is
only used in standard situations where $F$ is invertible and the inverse can easily
be obtained. In those cases where it is not possible to calculate $F^{-1}$ directly, other
procedures should be applied.
In the case of the standard normal distribution, i.e., the rv. $X \sim N(0, 1)$, the df. $\Phi$
has the density $\varphi$ with

$$\Phi(x) = P(X \le x) = \int_{-\infty}^{x} \varphi(t)\, dt = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left(-\frac{t^2}{2}\right) dt$$

which can be obtained only numerically. Thus, quantile transformation is not directly
applicable to generate such a rv.
As the next lemma will show, we can generate a rv. $Z \sim N(\mu, \sigma^2)$, i.e., $Z$ has df.
$F$ with

$$F(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{x} \exp\left(-\frac{(t - \mu)^2}{2\sigma^2}\right) dt, \qquad (2.3)$$

through a linearly transformed rv. $X \sim N(0, 1)$.


Lemma 2.10 Let $X \sim N(0, 1)$. Then $Z := \sigma \cdot X + \mu$ is distributed according to
$N(\mu, \sigma^2)$.
Proof Let $\mu \in \mathbb{R}$, $\sigma > 0$, and $z \in \mathbb{R}$ be given. Then

$$P(Z \le z) = P(\sigma X + \mu \le z) = P(X \le (z - \mu)/\sigma) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{(z - \mu)/\sigma} \exp\left(-\frac{t^2}{2}\right) dt.$$

Now, differentiate both sides w.r.t. $z$ to obtain by the chain rule and the Fundamental
Theorem of Calculus the density function

$$f(z) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(z - \mu)^2}{2\sigma^2}\right).$$

But $f$ is precisely the density function of a rv. which is $N(\mu, \sigma^2)$ distributed. □
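
The linear transformation of Lemma 2.10 can be sketched in R as follows. The values $\mu = 2$ and $\sigma = 2$ are taken from R-Example 2.1, while the sample size and the variable names are assumptions made for this illustration.

```r
mu <- 2
sigma <- 2

set.seed(123, kind = "Mersenne-Twister", normal.kind = "Inversion")
x <- rnorm(100000)     # X ~ N(0, 1)
z <- sigma * x + mu    # Z ~ N(mu, sigma^2) according to Lemma 2.10

mean(z)   # close to mu = 2
sd(z)     # close to sigma = 2

# Kolmogorov-Smirnov distance between the empirical df. of z and N(2, 4)
ks.stat <- ks.test(z, "pnorm", mean = mu, sd = sigma)$statistic
ks.stat
```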


In the next theorem, the Box-Muller algorithm to generate $N(0, 1)$ distributed
rvs. is given.
Theorem 2.11 Box-Muller algorithm. Let $U, V \sim \mathrm{UNI}$ be two independent rvs.
uniformly distributed on the unit interval. Then the rvs.

$$X = \sqrt{-2 \log(U)}\, \cos(2\pi V), \qquad Y = \sqrt{-2 \log(U)}\, \sin(2\pi V)$$

are independent of one another and both are $N(0, 1)$ distributed.
Proof The proof is omitted here but can be found in Box and Muller (1958). □
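
A minimal sketch of the Box-Muller algorithm in R could look as follows. The function name gen.norm.bm and the sample size are assumptions made for this illustration; the code relies on the documented behavior that runif does not return the endpoints 0 and 1, so log(u) is finite.

```r
gen.norm.bm <- function(n) {
  # n - number of pairs (X, Y) to generate
  u <- runif(n)
  v <- runif(n)
  r <- sqrt(-2 * log(u))
  cbind(x = r * cos(2 * pi * v), y = r * sin(2 * pi * v))
}

set.seed(123, kind = "Mersenne-Twister", normal.kind = "Inversion")
obs <- gen.norm.bm(50000)

colMeans(obs)               # both close to 0
apply(obs, 2, sd)           # both close to 1
cor(obs[, "x"], obs[, "y"]) # close to 0, in line with the independence of X and Y
```

A correlation close to zero is of course only a necessary condition for independence, not a proof of it.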

2.5 Method of Rejection

As already discussed in the last section, quantile transformation is not always
applicable in practice. In this section, we discuss a method which is applicable in a
situation where the df. $F$ has a density function $f$.
Theorem 2.12 Method of Rejection. Let $F, G$ be df. with probability density
functions $f, g$. Furthermore, let $M > 0$ be such that

$$f(x) \le M g(x), \quad \forall\, x \in \mathbb{R}.$$

To generate a rv. $X \sim F$ perform the following steps:
(i) Generate $Y \sim G$.
(ii) Generate $U \sim \mathrm{UNI}$ independent of $Y$.
(iii) If $U \le f(Y)/(M \cdot g(Y))$, return $Y$. Else reject $Y$ and start again with step (i).
Proof We have to prove that $X \sim F$. Note first that

$$P(X \le x) = P\left(Y \le x \,\middle|\, U \le \frac{f(Y)}{M \cdot g(Y)}\right) = \frac{P\left(Y \le x,\ U \le \frac{f(Y)}{M \cdot g(Y)}\right)}{P\left(U \le \frac{f(Y)}{M \cdot g(Y)}\right)}.$$

For the numerator on the right-hand side, we obtain by conditioning w.r.t. $Y$

$$P\left(Y \le x,\ U \le \frac{f(Y)}{M \cdot g(Y)}\right) = \int_{-\infty}^{x} P\left(U \le \frac{f(Y)}{M \cdot g(Y)} \,\middle|\, Y = y\right) G(dy) = \int_{-\infty}^{x} P\left(U \le \frac{f(y)}{M \cdot g(y)}\right) G(dy),$$

where the last equality follows from the independence of $U$ and $Y$. Since $U \sim \mathrm{UNI}$,
the last integral is equal to

$$\int_{-\infty}^{x} \frac{f(y)}{M \cdot g(y)}\, g(y)\, dy = \frac{1}{M} \int_{-\infty}^{x} f(y)\, dy = \frac{F(x)}{M}.$$

Since the denominator is the limit of the numerator for $x \to \infty$ and $F(x) \to 1$
for $x \to \infty$, the denominator must be identical to $1/M$. This finally proves the
theorem. □

Generally, one chooses the rv. Y ∼ G in such a way that Y can be easily generated
by quantile transformation. The constant M > 0 should then be chosen as small as
possible to minimize the cases of rejection.
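
The proof shows that a proposal is accepted with probability $1/M$, which is why a small $M$ is desirable. As a rough sketch, the following code applies steps (i) to (iii) to a target that is not taken from the text: the Beta(2, 2) density $f(x) = 6x(1 - x)$ on $[0, 1]$ with $\mathrm{UNI}$ as proposal distribution, so that $M = 1.5$ can be used. The function name rejection.beta22 is hypothetical.

```r
rejection.beta22 <- function(n) {
  # n - number of observations
  M <- 1.5              # max of dbeta(x, 2, 2) is 1.5, attained at x = 0.5
  out <- numeric(0)
  trials <- 0
  while (length(out) < n) {
    y <- runif(n)       # step (i): Y ~ G = UNI
    u <- runif(n)       # step (ii): U ~ UNI, independent of Y
    keep <- u <= dbeta(y, 2, 2) / (M * dunif(y))   # step (iii)
    out <- c(out, y[keep])
    trials <- trials + n
  }
  list(sample = out[1:n], acceptance = length(out) / trials)
}

set.seed(123, kind = "Mersenne-Twister", normal.kind = "Inversion")
res <- rejection.beta22(10000)

mean(res$sample)   # close to 0.5, the expected value of Beta(2, 2)
res$acceptance     # close to 1 / M = 2 / 3
```

In contrast to a loop over single proposals, this sketch generates the proposals in batches of size n, which is usually faster in R but otherwise follows the same steps.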
In the following example, we apply the rejection method to generate a rv.
$X \sim N(0, 1)$. For the df. $G$, we choose the Cauchy distribution given under
Exercise 2.16.
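Example 2.13 At first, we have to find a proper constant $M$:

$$\frac{f(x)}{g(x)} = \frac{\frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right)}{\frac{1}{\pi(1 + x^2)}} = \sqrt{\pi/2}\, \exp\left(-\frac{x^2}{2}\right)(1 + x^2).$$

The function $\exp(-x^2/2)(1 + x^2)$ is symmetric around 0 and attains its global
maximum at $x = 1$ (and, by symmetry, at $x = -1$). Thus, the constant

$$M := \frac{2\sqrt{\pi/2}}{\sqrt{e}} = \sqrt{\frac{2\pi}{e}}$$

can be used.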
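
The constant of Example 2.13 can be cross-checked numerically. The following sketch maximizes the ratio of the two densities with the base R function optimize; the search interval $[0, 5]$ is an assumption that exploits the symmetry of the ratio around 0.

```r
# ratio f(x) / g(x) of the N(0, 1) density and the Cauchy density
ratio <- function(x) dnorm(x) / dcauchy(x)

# numerical maximization on [0, 5]
opt <- optimize(ratio, interval = c(0, 5), maximum = TRUE)

opt$maximum                 # location of the maximum, close to 1
opt$objective               # numerical value of the maximum
sqrt(2 * pi * exp(-1))      # closed-form constant M from Example 2.13
```

Both values should agree up to the numerical tolerance of optimize.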

R-Example 2.14 The results of the last example can be implemented in R like
set.seed(123, kind = "Mersenne-Twister", normal.kind = "Inversion")
gen.norm.rm <- function(n){
  # n - number of observations

  # constant used during the method of rejection
  M = sqrt(2 * pi * exp(-1))

  # actual method of rejection, returning one observation
  MethodOfRejection <- function() {
    repeat{
      Y = rcauchy(1)
      if(runif(1) <= dnorm(Y) / (M * dcauchy(Y)))
        return(Y)
    }
  }

  # calling MethodOfRejection n times
  replicate(n, MethodOfRejection())
}
obs <- gen.norm.rm(n = 10000)
hist(obs, breaks = 50, freq = FALSE, xlab = "", xlim = c(-4, 4),
     ylab = "density",
     main = "Rejection-Method")
curve(dnorm(x), col = "red", add = TRUE)

Fig. 2.2 Histogram of 10000 N(0, 1) rvs. generated with the rejection method and the N(0, 1)-density

In the source code above, we define the function “gen.norm.rm” which returns a
vector of n independent standard normal rvs. by applying the rejection method as
described in Example 2.13. The function is called with n = 10000 and the result is
stored in the variable “obs”. The last two statements produce the histogram in Fig. 2.2.
Within “gen.norm.rm” the functions “rcauchy”, “runif”, “dnorm”, and “dcauchy”
from the stats package are called. For the meaning of these functions, compare
Sect. 2.1.
Random documents with unrelated
content Scribd suggests to you:
NEW HAVEN 522 FIFTH AVENUE
CONNECTICUT NEW YORK CITY

YALE UNIVERSITY PRESS


Announces the Publication of

Poems of Arthur O’Shaughnessy


Selected and Edited by
WILLIAM ALEXANDER PERCY
Mr. Percy says in his remarkable Introduction: “The Yale
University Press, thinking perhaps, with me, that even the most
beautiful things perish if the opportunity for reading or seeing or
hearing them is not offered the vexed and hurrying children of
men, has undertaken here the pious task of making
O’Shaughnessy’s finest poems accessible to readers of English
poetry.... His best is unique, of a haunting beauty, a very precious
heritage.... He had, as Palgrave put it, ‘The exquisite tenderness
of touch, the melody and delicacy’ of his favorite composer,
Chopin.... If I were passing the Siren Isles, one of the songs I
know I should hear drifting across the waves would be that which
Sarrazine sang to her dead lover in Chaitivel:

‘Hath any loved you well, down there,


Summer or winter through?
Down there, have you found any fair
Laid in the grave with you?
Is death’s long kiss a richer kiss
Than mine was wont to be—
Or have you gone to some far bliss
And quite forgotten me?’”

O’Shaughnessy died in 1881. Until the publication of this


admirably edited volume, no considerable part of his work has
been commonly available for many years.
Price $2.00.
*** END OF THE PROJECT GUTENBERG EBOOK THE YALE
LITERARY MAGAZINE (VOL. LXXXVIII, NO. 6, MARCH 1923) ***

Updated editions will replace the previous one—the old editions


will be renamed.

Creating the works from print editions not protected by U.S.


copyright law means that no one owns a United States copyright
in these works, so the Foundation (and you!) can copy and
distribute it in the United States without permission and without
paying copyright royalties. Special rules, set forth in the General
Terms of Use part of this license, apply to copying and
distributing Project Gutenberg™ electronic works to protect the
PROJECT GUTENBERG™ concept and trademark. Project
Gutenberg is a registered trademark, and may not be used if
you charge for an eBook, except by following the terms of the
trademark license, including paying royalties for use of the
Project Gutenberg trademark. If you do not charge anything for
copies of this eBook, complying with the trademark license is
very easy. You may use this eBook for nearly any purpose such
as creation of derivative works, reports, performances and
research. Project Gutenberg eBooks may be modified and
printed and given away—you may do practically ANYTHING in
the United States with eBooks not protected by U.S. copyright
law. Redistribution is subject to the trademark license, especially
commercial redistribution.

START: FULL LICENSE


THE FULL PROJECT GUTENBERG LICENSE
PLEASE READ THIS BEFORE YOU DISTRIBUTE OR USE THIS WORK

To protect the Project Gutenberg™ mission of promoting the


free distribution of electronic works, by using or distributing this
work (or any other work associated in any way with the phrase
“Project Gutenberg”), you agree to comply with all the terms of
the Full Project Gutenberg™ License available with this file or
online at www.gutenberg.org/license.

Section 1. General Terms of Use and


Redistributing Project Gutenberg™
electronic works
1.A. By reading or using any part of this Project Gutenberg™
electronic work, you indicate that you have read, understand,
agree to and accept all the terms of this license and intellectual
property (trademark/copyright) agreement. If you do not agree to
abide by all the terms of this agreement, you must cease using
and return or destroy all copies of Project Gutenberg™
electronic works in your possession. If you paid a fee for
obtaining a copy of or access to a Project Gutenberg™
electronic work and you do not agree to be bound by the terms
of this agreement, you may obtain a refund from the person or
entity to whom you paid the fee as set forth in paragraph 1.E.8.

1.B. “Project Gutenberg” is a registered trademark. It may only


be used on or associated in any way with an electronic work by
people who agree to be bound by the terms of this agreement.
There are a few things that you can do with most Project
Gutenberg™ electronic works even without complying with the
full terms of this agreement. See paragraph 1.C below. There
are a lot of things you can do with Project Gutenberg™
electronic works if you follow the terms of this agreement and
help preserve free future access to Project Gutenberg™
electronic works. See paragraph 1.E below.
1.C. The Project Gutenberg Literary Archive Foundation (“the
Foundation” or PGLAF), owns a compilation copyright in the
collection of Project Gutenberg™ electronic works. Nearly all the
individual works in the collection are in the public domain in the
United States. If an individual work is unprotected by copyright
law in the United States and you are located in the United
States, we do not claim a right to prevent you from copying,
distributing, performing, displaying or creating derivative works
based on the work as long as all references to Project
Gutenberg are removed. Of course, we hope that you will
support the Project Gutenberg™ mission of promoting free
access to electronic works by freely sharing Project
Gutenberg™ works in compliance with the terms of this
agreement for keeping the Project Gutenberg™ name
associated with the work. You can easily comply with the terms
of this agreement by keeping this work in the same format with
its attached full Project Gutenberg™ License when you share it
without charge with others.

1.D. The copyright laws of the place where you are located also
govern what you can do with this work. Copyright laws in most
countries are in a constant state of change. If you are outside
the United States, check the laws of your country in addition to
the terms of this agreement before downloading, copying,
displaying, performing, distributing or creating derivative works
based on this work or any other Project Gutenberg™ work. The
Foundation makes no representations concerning the copyright
status of any work in any country other than the United States.

1.E. Unless you have removed all references to Project


Gutenberg:

1.E.1. The following sentence, with active links to, or other


immediate access to, the full Project Gutenberg™ License must
appear prominently whenever any copy of a Project
Gutenberg™ work (any work on which the phrase “Project
Gutenberg” appears, or with which the phrase “Project
Gutenberg” is associated) is accessed, displayed, performed,
viewed, copied or distributed:

This eBook is for the use of anyone anywhere in the United


States and most other parts of the world at no cost and with
almost no restrictions whatsoever. You may copy it, give it
away or re-use it under the terms of the Project Gutenberg
License included with this eBook or online at
www.gutenberg.org. If you are not located in the United
States, you will have to check the laws of the country where
you are located before using this eBook.

1.E.2. If an individual Project Gutenberg™ electronic work is


derived from texts not protected by U.S. copyright law (does not
contain a notice indicating that it is posted with permission of the
copyright holder), the work can be copied and distributed to
anyone in the United States without paying any fees or charges.
If you are redistributing or providing access to a work with the
phrase “Project Gutenberg” associated with or appearing on the
work, you must comply either with the requirements of
paragraphs 1.E.1 through 1.E.7 or obtain permission for the use
of the work and the Project Gutenberg™ trademark as set forth
in paragraphs 1.E.8 or 1.E.9.

1.E.3. If an individual Project Gutenberg™ electronic work is


posted with the permission of the copyright holder, your use and
distribution must comply with both paragraphs 1.E.1 through
1.E.7 and any additional terms imposed by the copyright holder.
Additional terms will be linked to the Project Gutenberg™
License for all works posted with the permission of the copyright
holder found at the beginning of this work.

1.E.4. Do not unlink or detach or remove the full Project


Gutenberg™ License terms from this work, or any files
containing a part of this work or any other work associated with
Project Gutenberg™.
1.E.5. Do not copy, display, perform, distribute or redistribute
this electronic work, or any part of this electronic work, without
prominently displaying the sentence set forth in paragraph 1.E.1
with active links or immediate access to the full terms of the
Project Gutenberg™ License.

1.E.6. You may convert to and distribute this work in any binary,
compressed, marked up, nonproprietary or proprietary form,
including any word processing or hypertext form. However, if
you provide access to or distribute copies of a Project
Gutenberg™ work in a format other than “Plain Vanilla ASCII” or
other format used in the official version posted on the official
Project Gutenberg™ website (www.gutenberg.org), you must, at
no additional cost, fee or expense to the user, provide a copy, a
means of exporting a copy, or a means of obtaining a copy upon
request, of the work in its original “Plain Vanilla ASCII” or other
form. Any alternate format must include the full Project
Gutenberg™ License as specified in paragraph 1.E.1.

1.E.7. Do not charge a fee for access to, viewing, displaying,


performing, copying or distributing any Project Gutenberg™
works unless you comply with paragraph 1.E.8 or 1.E.9.

1.E.8. You may charge a reasonable fee for copies of or


providing access to or distributing Project Gutenberg™
electronic works provided that:

• You pay a royalty fee of 20% of the gross profits you derive from
the use of Project Gutenberg™ works calculated using the
method you already use to calculate your applicable taxes. The
fee is owed to the owner of the Project Gutenberg™ trademark,
but he has agreed to donate royalties under this paragraph to
the Project Gutenberg Literary Archive Foundation. Royalty
payments must be paid within 60 days following each date on
which you prepare (or are legally required to prepare) your
periodic tax returns. Royalty payments should be clearly marked
as such and sent to the Project Gutenberg Literary Archive
Foundation at the address specified in Section 4, “Information
about donations to the Project Gutenberg Literary Archive
Foundation.”

• You provide a full refund of any money paid by a user who


notifies you in writing (or by e-mail) within 30 days of receipt that
s/he does not agree to the terms of the full Project Gutenberg™
License. You must require such a user to return or destroy all
copies of the works possessed in a physical medium and
discontinue all use of and all access to other copies of Project
Gutenberg™ works.

• You provide, in accordance with paragraph 1.F.3, a full refund of


any money paid for a work or a replacement copy, if a defect in
the electronic work is discovered and reported to you within 90
days of receipt of the work.

• You comply with all other terms of this agreement for free
distribution of Project Gutenberg™ works.

1.E.9. If you wish to charge a fee or distribute a Project


Gutenberg™ electronic work or group of works on different
terms than are set forth in this agreement, you must obtain
permission in writing from the Project Gutenberg Literary
Archive Foundation, the manager of the Project Gutenberg™
trademark. Contact the Foundation as set forth in Section 3
below.

1.F.

1.F.1. Project Gutenberg volunteers and employees expend


considerable effort to identify, do copyright research on,
transcribe and proofread works not protected by U.S. copyright
law in creating the Project Gutenberg™ collection. Despite
these efforts, Project Gutenberg™ electronic works, and the
medium on which they may be stored, may contain “Defects,”
such as, but not limited to, incomplete, inaccurate or corrupt
data, transcription errors, a copyright or other intellectual
property infringement, a defective or damaged disk or other
medium, a computer virus, or computer codes that damage or
cannot be read by your equipment.

1.F.2. LIMITED WARRANTY, DISCLAIMER OF DAMAGES -


Except for the “Right of Replacement or Refund” described in
paragraph 1.F.3, the Project Gutenberg Literary Archive
Foundation, the owner of the Project Gutenberg™ trademark,
and any other party distributing a Project Gutenberg™ electronic
work under this agreement, disclaim all liability to you for
damages, costs and expenses, including legal fees. YOU
AGREE THAT YOU HAVE NO REMEDIES FOR NEGLIGENCE,
STRICT LIABILITY, BREACH OF WARRANTY OR BREACH
OF CONTRACT EXCEPT THOSE PROVIDED IN PARAGRAPH
1.F.3. YOU AGREE THAT THE FOUNDATION, THE
TRADEMARK OWNER, AND ANY DISTRIBUTOR UNDER
THIS AGREEMENT WILL NOT BE LIABLE TO YOU FOR
ACTUAL, DIRECT, INDIRECT, CONSEQUENTIAL, PUNITIVE
OR INCIDENTAL DAMAGES EVEN IF YOU GIVE NOTICE OF
THE POSSIBILITY OF SUCH DAMAGE.

1.F.3. LIMITED RIGHT OF REPLACEMENT OR REFUND - If


you discover a defect in this electronic work within 90 days of
receiving it, you can receive a refund of the money (if any) you
paid for it by sending a written explanation to the person you
received the work from. If you received the work on a physical
medium, you must return the medium with your written
explanation. The person or entity that provided you with the
defective work may elect to provide a replacement copy in lieu
of a refund. If you received the work electronically, the person or
entity providing it to you may choose to give you a second
opportunity to receive the work electronically in lieu of a refund.
If the second copy is also defective, you may demand a refund
in writing without further opportunities to fix the problem.

1.F.4. Except for the limited right of replacement or refund set
forth in paragraph 1.F.3, this work is provided to you ‘AS-IS’,
WITH NO OTHER WARRANTIES OF ANY KIND, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR
ANY PURPOSE.

1.F.5. Some states do not allow disclaimers of certain implied
warranties or the exclusion or limitation of certain types of
damages. If any disclaimer or limitation set forth in this
agreement violates the law of the state applicable to this
agreement, the agreement shall be interpreted to make the
maximum disclaimer or limitation permitted by the applicable
state law. The invalidity or unenforceability of any provision of
this agreement shall not void the remaining provisions.

1.F.6. INDEMNITY - You agree to indemnify and hold the
Foundation, the trademark owner, any agent or employee of the
Foundation, anyone providing copies of Project Gutenberg™
electronic works in accordance with this agreement, and any
volunteers associated with the production, promotion and
distribution of Project Gutenberg™ electronic works, harmless
from all liability, costs and expenses, including legal fees, that
arise directly or indirectly from any of the following which you do
or cause to occur: (a) distribution of this or any Project
Gutenberg™ work, (b) alteration, modification, or additions or
deletions to any Project Gutenberg™ work, and (c) any Defect
you cause.

Section 2. Information about the Mission of Project Gutenberg™

Project Gutenberg™ is synonymous with the free distribution of
electronic works in formats readable by the widest variety of
computers including obsolete, old, middle-aged and new
computers. It exists because of the efforts of hundreds of
volunteers and donations from people in all walks of life.

Volunteers and financial support to provide volunteers with the
assistance they need are critical to reaching Project
Gutenberg™’s goals and ensuring that the Project Gutenberg™
collection will remain freely available for generations to come. In
2001, the Project Gutenberg Literary Archive Foundation was
created to provide a secure and permanent future for Project
Gutenberg™ and future generations. To learn more about the
Project Gutenberg Literary Archive Foundation and how your
efforts and donations can help, see Sections 3 and 4 and the
Foundation information page at www.gutenberg.org.

Section 3. Information about the Project Gutenberg Literary Archive Foundation

The Project Gutenberg Literary Archive Foundation is a non-profit
501(c)(3) educational corporation organized under the
laws of the state of Mississippi and granted tax exempt status by
the Internal Revenue Service. The Foundation’s EIN or federal
tax identification number is 64-6221541. Contributions to the
Project Gutenberg Literary Archive Foundation are tax
deductible to the full extent permitted by U.S. federal laws and
your state’s laws.

The Foundation’s business office is located at 809 North 1500
West, Salt Lake City, UT 84116, (801) 596-1887. Email contact
links and up-to-date contact information can be found at the
Foundation’s website and official page at
www.gutenberg.org/contact

Section 4. Information about Donations to the Project Gutenberg Literary Archive Foundation

Project Gutenberg™ depends upon and cannot survive without
widespread public support and donations to carry out its mission
of increasing the number of public domain and licensed works
that can be freely distributed in machine-readable form
accessible by the widest array of equipment including outdated
equipment. Many small donations ($1 to $5,000) are particularly
important to maintaining tax exempt status with the IRS.

The Foundation is committed to complying with the laws
regulating charities and charitable donations in all 50 states of
the United States. Compliance requirements are not uniform
and it takes a considerable effort, much paperwork and many
fees to meet and keep up with these requirements. We do not
solicit donations in locations where we have not received written
confirmation of compliance. To SEND DONATIONS or
determine the status of compliance for any particular state visit
www.gutenberg.org/donate.

While we cannot and do not solicit contributions from states
where we have not met the solicitation requirements, we know
of no prohibition against accepting unsolicited donations from
donors in such states who approach us with offers to donate.

International donations are gratefully accepted, but we cannot
make any statements concerning tax treatment of donations
received from outside the United States. U.S. laws alone swamp
our small staff.

Please check the Project Gutenberg web pages for current
donation methods and addresses. Donations are accepted in a
number of other ways including checks, online payments and
credit card donations. To donate, please visit:
www.gutenberg.org/donate.

Section 5. General Information About Project Gutenberg™ electronic works

Professor Michael S. Hart was the originator of the Project
Gutenberg™ concept of a library of electronic works that could
be freely shared with anyone. For forty years, he produced and
distributed Project Gutenberg™ eBooks with only a loose
network of volunteer support.

Project Gutenberg™ eBooks are often created from several
printed editions, all of which are confirmed as not protected by
copyright in the U.S. unless a copyright notice is included. Thus,
we do not necessarily keep eBooks in compliance with any
particular paper edition.

Most people start at our website which has the main PG search
facility: www.gutenberg.org.

This website includes information about Project Gutenberg™,
including how to make donations to the Project Gutenberg
Literary Archive Foundation, how to help produce our new
eBooks, and how to subscribe to our email newsletter to hear
about new eBooks.
