
Statistical Methods for Machine Learning

Discover How to Transform Data into Knowledge with Python

Jason Brownlee

Disclaimer
The information contained within this eBook is strictly for educational purposes. If you wish to apply
ideas contained in this eBook, you are taking full responsibility for your actions.
The author has made every effort to ensure that the information within this book was accurate
at the time of publication. The author does not assume, and hereby disclaims, any liability to any
party for any loss, damage, or disruption caused by errors or omissions, whether such errors or
omissions result from accident, negligence, or any other cause.
No part of this eBook may be reproduced or transmitted in any form or by any means, electronic or
mechanical, recording or by any information storage and retrieval system, without written permission
from the author.

Acknowledgements
Special thanks to my copy editor Sarah Martin and my technical editors Arun Koshy and Andrei
Cheremskoy.

Copyright

© Copyright 2019 Jason Brownlee. All Rights Reserved.


Statistical Methods for Machine Learning

Edition: v1.4
Contents

Copyright i

Contents ii

Preface iii

I Introduction v

II Statistics 1
1 Introduction to Statistics 2
1.1 Statistics is Required Prerequisite . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Why Learn Statistics? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 What is Statistics? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.4 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

2 Statistics vs Machine Learning 7


2.1 Machine Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 Predictive Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.3 Statistical Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.4 Two Cultures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.5 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

3 Examples of Statistics in Machine Learning 12


3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.2 Problem Framing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.3 Data Understanding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.4 Data Cleaning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.5 Data Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.6 Data Preparation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.7 Model Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.8 Model Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.9 Model Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15


3.10 Model Presentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15


3.11 Model Predictions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.12 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

III Foundation 17
4 Gaussian and Summary Stats 18
4.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
4.2 Gaussian Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.3 Sample vs Population . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.4 Test Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.5 Central Tendency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
4.6 Variance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.7 Describing a Gaussian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.8 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.9 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

5 Simple Data Visualization 31


5.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
5.2 Data Visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5.3 Introduction to Matplotlib . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5.4 Line Plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
5.5 Bar Chart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
5.6 Histogram Plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
5.7 Box and Whisker Plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
5.8 Scatter Plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.9 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
5.10 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
5.11 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

6 Random Numbers 44
6.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
6.2 Randomness in Machine Learning . . . . . . . . . . . . . . . . . . . . . . . . . . 45
6.3 Pseudorandom Number Generators . . . . . . . . . . . . . . . . . . . . . . . . . 45
6.4 Random Numbers with Python . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
6.5 Random Numbers with NumPy . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
6.6 When to Seed the Random Number Generator . . . . . . . . . . . . . . . . . . . 54
6.7 How to Control for Randomness . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
6.8 Common Questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
6.9 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
6.10 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
6.11 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

7 Law of Large Numbers 57


7.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
7.2 Law of Large Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
7.3 Worked Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
7.4 Implications in Machine Learning . . . . . . . . . . . . . . . . . . . . . . . . . . 61
7.5 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
7.6 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
7.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

8 Central Limit Theorem 64


8.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
8.2 Central Limit Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
8.3 Worked Example with Dice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
8.4 Impact on Machine Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
8.5 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
8.6 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
8.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

IV Hypothesis Testing 70
9 Statistical Hypothesis Testing 71
9.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
9.2 Statistical Hypothesis Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
9.3 Statistical Test Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
9.4 Errors in Statistical Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
9.5 Degrees of Freedom in Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
9.6 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
9.7 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
9.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

10 Statistical Distributions 78
10.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
10.2 Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
10.3 Gaussian Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
10.4 Student’s t-Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
10.5 Chi-Squared Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
10.6 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
10.7 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
10.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

11 Critical Values 90
11.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
11.2 Why Do We Need Critical Values? . . . . . . . . . . . . . . . . . . . . . . . . . 90
11.3 What Is a Critical Value? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
11.4 How to Use Critical Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
11.5 How to Calculate Critical Values . . . . . . . . . . . . . . . . . . . . . . . . . . 93

11.6 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
11.7 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
11.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96

12 Covariance and Correlation 97


12.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
12.2 What is Correlation? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
12.3 Test Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
12.4 Covariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
12.5 Pearson’s Correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
12.6 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
12.7 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
12.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

13 Significance Tests 104


13.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
13.2 Parametric Statistical Significance Tests . . . . . . . . . . . . . . . . . . . . . . 105
13.3 Test Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
13.4 Student’s t-Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
13.5 Paired Student’s t-Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
13.6 Analysis of Variance Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
13.7 Repeated Measures ANOVA Test . . . . . . . . . . . . . . . . . . . . . . . . . . 109
13.8 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
13.9 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
13.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

14 Effect Size 112


14.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
14.2 The Need to Report Effect Size . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
14.3 What Is Effect Size? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
14.4 How to Calculate Effect Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
14.5 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
14.6 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
14.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

15 Statistical Power 120


15.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
15.2 Statistical Hypothesis Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
15.3 What Is Statistical Power? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
15.4 Power Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
15.5 Student’s t-Test Power Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
15.6 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
15.7 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
15.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128

V Resampling Methods 129


16 Introduction to Resampling 130
16.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
16.2 Statistical Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
16.3 Statistical Resampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
16.4 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
16.5 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
16.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135

17 Estimation with Bootstrap 136


17.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
17.2 Bootstrap Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
17.3 Configuration of the Bootstrap . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
17.4 Worked Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
17.5 Bootstrap in Python . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
17.6 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
17.7 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
17.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141

18 Estimation with Cross-Validation 143


18.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
18.2 k-Fold Cross-Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
18.3 Configuration of k . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
18.4 Worked Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
18.5 Cross-Validation in Python . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
18.6 Variations on Cross-Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
18.7 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
18.8 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
18.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149

VI Estimation Statistics 150


19 Introduction to Estimation Statistics 151
19.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
19.2 Problems with Hypothesis Testing . . . . . . . . . . . . . . . . . . . . . . . . . . 152
19.3 Estimation Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
19.4 Effect Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
19.5 Interval Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
19.6 Meta-Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
19.7 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
19.8 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
19.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156

20 Tolerance Intervals 157


20.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
20.2 Bounds on Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
20.3 What Are Statistical Tolerance Intervals? . . . . . . . . . . . . . . . . . . . . . . 158
20.4 How to Calculate Tolerance Intervals . . . . . . . . . . . . . . . . . . . . . . . . 159
20.5 Tolerance Interval for Gaussian Distribution . . . . . . . . . . . . . . . . . . . . 159
20.6 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
20.7 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
20.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164

21 Confidence Intervals 165


21.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
21.2 What is a Confidence Interval? . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
21.3 Interval for Classification Accuracy . . . . . . . . . . . . . . . . . . . . . . . . . 167
21.4 Nonparametric Confidence Interval . . . . . . . . . . . . . . . . . . . . . . . . . 170
21.5 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
21.6 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
21.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

22 Prediction Intervals 175


22.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
22.2 Why Calculate a Prediction Interval? . . . . . . . . . . . . . . . . . . . . . . . . 176
22.3 What Is a Prediction Interval? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
22.4 How to Calculate a Prediction Interval . . . . . . . . . . . . . . . . . . . . . . . 177
22.5 Prediction Interval for Linear Regression . . . . . . . . . . . . . . . . . . . . . . 178
22.6 Worked Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
22.7 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
22.8 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
22.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186

VII Nonparametric Methods 187


23 Rank Data 188
23.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
23.2 Parametric Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
23.3 Nonparametric Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
23.4 Ranking Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
23.5 Working with Ranked Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
23.6 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
23.7 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
23.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193

24 Normality Tests 194


24.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
24.2 Normality Assumption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
24.3 Test Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195

24.4 Visual Normality Checks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196


24.5 Statistical Normality Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
24.6 What Test Should You Use? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
24.7 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
24.8 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
24.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204

25 Make Data Normal 205


25.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
25.2 Gaussian and Gaussian-Like . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
25.3 Sample Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
25.4 Data Resolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
25.5 Extreme Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
25.6 Long Tails . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
25.7 Power Transforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
25.8 Use Anyway . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
25.9 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
25.10 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
25.11 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218

26 5-Number Summary 220


26.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
26.2 Nonparametric Data Summarization . . . . . . . . . . . . . . . . . . . . . . . . 220
26.3 Five-Number Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
26.4 How to Calculate the Five-Number Summary . . . . . . . . . . . . . . . . . . . 222
26.5 Use of the Five-Number Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 223
26.6 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
26.7 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
26.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224

27 Rank Correlation 225


27.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
27.2 Rank Correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
27.3 Test Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
27.4 Spearman’s Rank Correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
27.5 Kendall’s Rank Correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
27.6 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
27.7 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
27.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232

28 Rank Significance Tests 233


28.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
28.2 Nonparametric Statistical Significance Tests . . . . . . . . . . . . . . . . . . . . 234
28.3 Test Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
28.4 Mann-Whitney U Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
28.5 Wilcoxon Signed-Rank Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
28.6 Kruskal-Wallis H Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237

28.7 Friedman Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239


28.8 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
28.9 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240

29 Independence Test 243


29.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
29.2 Contingency Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
29.3 Pearson’s Chi-Squared Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
29.4 Example Chi-Squared Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
29.5 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
29.6 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
29.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249

VIII Appendix 250


A Getting Help 251
A.1 Statistics on Wikipedia . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
A.2 Statistics Textbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
A.3 Python API Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
A.4 Ask Questions About Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
A.5 How to Ask Questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
A.6 Contact the Author . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253

B How to Setup a Workstation for Python 254


B.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
B.2 Download Anaconda . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
B.3 Install Anaconda . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
B.4 Start and Update Anaconda . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
B.5 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
B.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261

C Basic Math Notation 262


C.1 Tutorial Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
C.2 The Frustration with Math Notation . . . . . . . . . . . . . . . . . . . . . . . . 263
C.3 Arithmetic Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
C.4 Greek Alphabet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
C.5 Sequence Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
C.6 Set Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
C.7 Other Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
C.8 Tips for Getting More Help . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
C.9 Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
C.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270

IX Conclusions 271
How Far You Have Come 272
Preface

Statistics is Important
Statistics is important to machine learning practitioners.

- Statistics is a prerequisite in most courses and books on applied machine learning.

- Statistical methods are used at each step in an applied machine learning project.

- Statistical learning is the applied statistics equivalent of predictive modeling in machine learning.

A machine learning practitioner cannot be effective without an understanding of basic


statistical concepts and statistics methods, and an effective practitioner cannot excel without
being aware of and leveraging the terminology and methods used in the sister field of statistical
learning.

Practitioners Don’t Know Stats


Developers don’t know statistics and this is a huge problem. Programmers don’t need to know
and use statistical methods in order to develop software. Software engineering and computer
science courses generally don’t include courses on statistics, let alone advanced statistical tests.
As such, it is common for machine learning practitioners coming from the computer science
or developer tradition to not know and not value statistical methods. This is a problem given
the pervasive use of statistical methods and statistical thinking in the preparation of data,
evaluation of learned models, and all other steps in a predictive modeling project.

Practitioners Study The Wrong Stats


Eventually, machine learning practitioners realize the need for skills in statistics. This might
start with a need to better interpret descriptive statistics or data visualizations and may progress
to the need to start using sophisticated hypothesis tests. The problem is, they don’t seek out
the statistical information they need. Instead, they try to read through a textbook on statistics or work through the material for an undergraduate course on statistics. This approach is slow and boring, and it covers a breadth and depth of material on statistics that is beyond the needs of the machine learning practitioner.


Practitioners Study Stats The Wrong Way


It’s worse than this. Regardless of the medium used to learn statistics, be it books, videos,
or course material, machine learning practitioners study statistics the wrong way. Because
the material is intended for undergraduate students who need to pass a test, it is focused on theory, proofs, and derivations. This is great for testing students but terrible for practitioners who need results. Practitioners need methods that clearly state when they are
appropriate and instruction on how to interpret the result. They need code examples that they
can use immediately on their project.

A Better Way
I set out to write a playbook for machine learning practitioners that gives them only those parts
of statistics that they need to know in order to work through a predictive modeling project. I
set out to present statistical methods in the way that practitioners learn: with simple language and working code examples. Statistics is important to machine learning, and I believe that if it is taught at the right level for practitioners, it can be a fascinating, fun, directly applicable, and immeasurably useful area of study. I hope that you agree.

Jason Brownlee
2019
Part I

Introduction

Welcome

Welcome to Statistical Methods for Machine Learning. The field of statistics is hundreds of
years old and statistical methods are central to working through predictive modeling problems
with machine learning. Statistical methods refer to a range of techniques from simple summary
statistics intended to help better understand data, to statistical hypothesis tests and estimation
statistics that can be used to interpret the results of experiments and predictions from models.
I designed this book to teach you step-by-step the basics of statistical methods with concrete
and executable examples in Python.

Who Is This Book For?


Before we get started, let’s make sure you are in the right place. This book is for developers who may know some applied machine learning. Maybe you know how to work through a predictive modeling problem end-to-end, or at least most of the main steps, with popular tools. The lessons in this book do assume a few things about you, such as:

- You know your way around basic Python for programming.

- You may know some basic NumPy for array manipulation.

- You want to learn statistical methods to deepen your understanding and application of machine learning.

This guide was written in the top-down and results-first machine learning style that you’re used to from Machine Learning Mastery.

About Your Outcomes


This book will teach you the basics of statistical methods that you need to know as a machine
learning practitioner. After reading and working through this book, you will know:

- About the field of statistics, how it relates to machine learning, and how to harness statistical methods on a machine learning project.

- How to calculate and interpret common summary statistics and how to present data using standard data visualization techniques.

- Findings from mathematical statistics that underlie much of the field, such as the central limit theorem and the law of large numbers.

- How to evaluate and interpret the relationship between variables and the independence of variables.

- How to calculate and interpret parametric statistical hypothesis tests for comparing two or more data samples.

- How to calculate and interpret interval statistics for distributions, population parameters, and observations.

- How to use statistical resampling to make good economic use of available data in order to evaluate predictive models.

- How to calculate and interpret nonparametric statistical hypothesis tests for comparing two or more data samples that do not conform to the expectations of parametric tests.

This new basic understanding of statistical methods will impact your practice of machine learning in the following ways:

- Use descriptive statistics and data visualizations to quickly and more deeply understand the shape and relationships in data.

- Use inferential statistical tests to quickly and effectively quantify the relationships between samples, such as the results of experiments with different predictive algorithms or differing configurations.

- Use estimation statistics to quickly and effectively quantify the confidence in estimated model skill and model predictions.

This book is not a substitute for an undergraduate course in statistics or a textbook for such a course, although it could complement such materials. For a good list of top courses, textbooks, and other resources on statistics, see the Further Reading section at the end of each tutorial.

How to Read This Book


This book was written to be read linearly, from start to finish. That being said, if you know the
basics and need help with a specific notation or operation, then you can flip straight to that
section and get started. This book was designed for you to read on your workstation, on the
screen, not on a tablet or eReader. My hope is that you have the book open right next to your
editor and run the examples as you read about them.
This book is not intended to be read passively or be placed in a folder as a reference text. It
is a playbook, a workbook, and a guidebook intended for you to learn by doing and then apply
your new understanding with working Python examples. To get the most out of the book, I
would recommend playing with the examples in each tutorial. Extend them, break them, then
fix them. Try some of the extensions presented at the end of each lesson and let me know how
you do.

About the Book Structure


This book was designed around major statistical techniques that are directly relevant to applied
machine learning. There are a lot of things you could learn about statistics, from theory to
abstract concepts to APIs. My goal is to take you straight to developing an intuition for the
elements you must understand with laser-focused tutorials. I designed the tutorials to focus on
how to get things done with statistics. They give you the tools to both rapidly understand and
apply each technique or operation.
Each of the tutorials is designed to take you about one hour to read through and complete, excluding the extensions and further reading. You can choose to work through the lessons one per day, one per week, or at your own pace. I think momentum is critically important, and this book is intended to be read and used, not to sit idle. I would recommend picking a schedule and sticking to it. The tutorials are divided into six parts:

- Part 1: Statistics. Provides a gentle introduction to the field of statistics, the relationship to machine learning, and the importance that statistical methods have when working through a predictive modeling problem.

- Part 2: Foundation. Introduces descriptive statistics, data visualization, random numbers, and important findings in statistics such as the law of large numbers and the central limit theorem.

- Part 3: Hypothesis Testing. Covers statistical hypothesis tests for comparing populations of samples and the interpretation of tests with p-values and critical values.

- Part 4: Resampling. Covers methods from statistics used to economically use small samples of data to evaluate predictive models, such as k-fold cross-validation and the bootstrap.

- Part 5: Estimation Statistics. Covers an alternative to hypothesis testing called estimation statistics, including tolerance intervals, confidence intervals, and prediction intervals.

- Part 6: Nonparametric Methods. Covers nonparametric statistical hypothesis testing methods for use when data does not meet the expectations of parametric tests.

Each part targets a specific learning outcome, and so does each tutorial within each part. This acts as a filter to ensure you are only focused on the things you need to know to get to a specific result and do not get bogged down in the math or the near-infinite number of digressions. The tutorials were not designed to teach you everything there is to know about each of the theories or techniques of statistics. They were designed to give you an understanding of how they work, how to use them, and how to interpret the results the fastest way I know how: to learn by doing.

About Python Code Examples


The code examples were carefully designed to demonstrate the purpose of a given lesson. Code
examples are complete and standalone. The code for each lesson will run as-is with no code
from prior lessons or third-parties required beyond the installation of the required packages. A
complete working example is presented with each tutorial for you to inspect and copy-and-paste.
All source code is also provided with the book and I would recommend running the provided
files whenever possible to avoid any copy-paste issues.
The provided code was developed in a text editor and intended to be run on the command
line. No special IDE or notebooks are required. If you are using a more advanced development
environment and are having trouble, try running the example from the command line instead.
All code examples were tested on a POSIX-compatible machine with Python 3.

About Further Reading


Each lesson includes a list of further reading resources. This may include:

- Books and book chapters.

- API documentation.

- Articles and webpages.

Wherever possible, I try to list and link to the relevant API documentation for key functions used in each lesson so you can learn more about them. I have tried to link to books on Amazon so that you can learn more about them. I don’t know everything, and if you discover a good resource related to a given lesson, please let me know so I can update the book.

About Getting Help


You might need help along the way. Don’t worry; you are not alone.

- Help with a Technique? If you need help with the technical aspects of a specific operation or technique, see the Further Reading section at the end of each lesson.

- Help with Python APIs? If you need help with using the NumPy or SciPy libraries, see the list of resources in the Further Reading section at the end of each lesson, and also see Appendix A.

- Help with your workstation? If you need help setting up your environment, I would recommend using Anaconda and following my tutorial in Appendix B.

- Help with the math? I provide a list of locations where you can search for answers and ask questions about statistics math in Appendix A. You can also see Appendix C for a crash course on math notation.

- Help in general? You can shoot me an email. My details are in Appendix A.

Summary
Are you ready? Let’s dive in! Next up you will discover a gentle introduction to the field of
statistics.
Part II

Statistics

Chapter 1

Introduction to Statistics

Statistics is a collection of tools that you can use to get answers to important questions about
data. You can use descriptive statistical methods to transform raw observations into information
that you can understand and share. You can use inferential statistical methods to reason from
small samples of data to whole domains. In this chapter, you will discover clearly why statistics
is important in general and for machine learning and generally the types of methods that are
available. After reading this chapter, you will know:

- Statistics is generally considered a prerequisite to the field of applied machine learning.

- We need statistics to help transform observations into information and to answer questions about samples of observations.

- Statistics is a collection of tools developed over hundreds of years for summarizing data and quantifying properties of a domain given a sample of observations.

Let’s get started.

1.1 Statistics is Required Prerequisite


Machine learning and statistics are two tightly related fields of study. So much so that
statisticians refer to machine learning as applied statistics or statistical learning rather than the
computer-science-centric name. Machine learning is almost universally presented to beginners
assuming that the reader has some background in statistics. We can make this concrete with a
few cherry picked examples. Take a look at this quote from the beginning of a popular applied
machine learning book titled Applied Predictive Modeling:

... the reader should have some knowledge of basic statistics, including variance,
correlation, simple linear regression, and basic hypothesis testing (e.g. p-values and
test statistics).

— Page vii, Applied Predictive Modeling, 2013.

Here’s another example from the popular Introduction to Statistical Learning book:

We expect that the reader will have had at least one elementary course in statistics.


— Page 9, An Introduction to Statistical Learning with Applications in R, 2013.

Even when statistics is not a prerequisite, some primitive prior knowledge is required as can
be seen in this quote from the widely read Programming Collective Intelligence:

... this book does not assume you have any prior knowledge of [...] or statistics.
[...] but having some knowledge of trigonometry and basic statistics will help you
understand the algorithms.

— Page xiii, Programming Collective Intelligence: Building Smart Web 2.0 Applications, 2007.

In order to be able to understand machine learning, some basic understanding of statistics is required. To see why this is the case, we must first understand why we need the field of statistics in the first place.

1.2 Why Learn Statistics?


Raw observations alone are data, but they are not information or knowledge. Data raises
questions, such as:

- What is the most common or expected observation?

- What are the limits on the observations?

- What does the data look like?

Although they appear simple, these questions must be answered in order to turn raw
observations into information that we can use and share. Beyond raw data, we may design
experiments in order to collect observations. From these experimental results we may have more
sophisticated questions, such as:

- What variables are most relevant?

- What is the difference in an outcome between two experiments?

- Are the differences real or the result of noise in the data?

Questions of this type are important. The results matter to the project, to stakeholders, and
to effective decision making. Statistical methods are required to find answers to the questions
that we have about data. We can see that in order both to understand the data used to train a machine learning model and to interpret the results of testing different machine learning models, statistical methods are required. This is just the tip of the iceberg, as each step in
a predictive modeling project will require the use of a statistical method.

1.3 What is Statistics?


Statistics is a subfield of mathematics. It refers to a collection of methods for working with
data and using data to answer questions.

Statistics is the art of making numerical conjectures about puzzling questions. [...]
The methods were developed over several hundred years by people who were looking
for answers to their questions.

— Page xiii, Statistics, Fourth Edition, 2007.

It is because the field comprises a grab bag of methods for working with data that it
can seem large and amorphous to beginners. It can be hard to see the line between methods
that belong to statistics and methods that belong to other fields of study. Often a technique
can be both a classical method from statistics and a modern algorithm used for feature selection
or modeling. Although a working knowledge of statistics does not require deep theoretical
knowledge, some important and easy-to-digest theorems from the relationship between statistics
and probability can provide a valuable foundation.
Two examples include the law of large numbers and the central limit theorem; the first aids
in understanding why bigger samples are often better and the second provides a foundation for
how we can compare the expected values between samples (e.g. mean values). When it comes
to the statistical tools that we use in practice, it can be helpful to divide the field of statistics
into two large groups of methods: descriptive statistics for summarizing data and inferential
statistics for drawing conclusions from samples of data.

Statistics allow researchers to collect information, or data, from a large number of people and then summarize their typical experience. [...] Statistics are also used to
reach conclusions about general differences between groups. [...] Statistics can also
be used to see if scores on two variables are related and to make predictions.

— Pages ix-x, Statistics in Plain English, Third Edition, 2010.

1.3.1 Descriptive Statistics


Descriptive statistics refer to methods for summarizing raw observations into information that
we can understand and share. Commonly, we think of descriptive statistics as the calculation of
statistical values on samples of data in order to summarize properties of the sample of data, such
as the common expected value (e.g. the mean or median) and the spread of the data (e.g. the
variance or standard deviation). Descriptive statistics may also cover graphical methods that
can be used to visualize samples of data. Charts and graphics can provide a useful qualitative
understanding of both the shape or distribution of observations as well as how variables may
relate to each other.
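As a minimal sketch of the summary statistics described above (the data here are made up for illustration), NumPy can compute the expected value and spread of a small sample directly:

```python
import numpy as np

# A small made-up sample of observations.
data = np.array([1.2, 2.3, 2.9, 3.1, 3.8, 4.4, 5.0, 5.6])

print('mean     :', data.mean())        # common expected value
print('median   :', np.median(data))    # robust expected value
print('variance :', data.var(ddof=1))   # sample variance
print('std. dev.:', data.std(ddof=1))   # sample standard deviation
```

The `ddof=1` argument requests the sample (rather than population) variance and standard deviation, which is the usual choice when the data are a sample from a larger domain.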

1.3.2 Inferential Statistics


Inferential statistics is a fancy name for methods that aid in quantifying properties of the domain
or population from a smaller set of obtained observations called a sample. Commonly, we think
of inferential statistics as the estimation of quantities from the population distribution, such as
the expected value or the amount of spread.
More sophisticated statistical inference tools can be used to quantify the likelihood of
observing data samples given an assumption. These are often referred to as tools for statistical
hypothesis testing, where the base assumption of a test is called the null hypothesis. There are
many examples of inferential statistical methods given the range of hypotheses we may assume
and the constraints we may impose on the data in order to increase the power or likelihood that
the finding of the test is correct.
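A concrete sketch of such a hypothesis test (assuming SciPy is installed; the two samples here are hypothetical, e.g. skill scores from two models) is Student's t-test, where the null hypothesis is that both samples were drawn from populations with the same mean:

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(seed=1)

# Two hypothetical samples, e.g. skill scores from two models.
sample1 = rng.normal(loc=50.0, scale=5.0, size=100)
sample2 = rng.normal(loc=52.0, scale=5.0, size=100)

# Null hypothesis: both samples come from populations with the same mean.
stat, p = ttest_ind(sample1, sample2)
print(f'statistic={stat:.3f}, p-value={p:.4f}')

alpha = 0.05
if p <= alpha:
    print('Reject the null hypothesis: the means likely differ.')
else:
    print('Fail to reject the null hypothesis.')
```

Here `alpha` is the conventional 5% significance level: a p-value at or below it indicates that a difference in means this large would be unlikely if the null hypothesis were true.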

1.4 Further Reading


This section provides more resources on the topic if you are looking to go deeper.

1.4.1 Books

- Applied Predictive Modeling, 2013.
  https://amzn.to/2InAS0T

- An Introduction to Statistical Learning with Applications in R, 2013.
  https://amzn.to/2Gvhkqz

- Programming Collective Intelligence: Building Smart Web 2.0 Applications, 2007.
  https://amzn.to/2GIN9jc

- Statistics, Fourth Edition, 2007.
  https://amzn.to/2pUA0tU

- All of Statistics: A Concise Course in Statistical Inference, 2004.
  https://amzn.to/2H224Tp

- Statistics in Plain English, Third Edition, 2010.
  https://amzn.to/2Gv0A2V

1.4.2 Articles

- Statistics on Wikipedia.
  https://en.wikipedia.org/wiki/Statistics

- Portal:Statistics on Wikipedia.
  https://en.wikipedia.org/wiki/Portal:Statistics

- List of statistics articles on Wikipedia.
  https://en.wikipedia.org/wiki/List_of_statistics_articles

- Mathematical statistics on Wikipedia.
  https://en.wikipedia.org/wiki/Mathematical_statistics

- History of statistics on Wikipedia.
  https://en.wikipedia.org/wiki/History_of_statistics

- Descriptive Statistics on Wikipedia.
  https://en.wikipedia.org/wiki/Descriptive_statistics

- Statistical Inference on Wikipedia.
  https://en.wikipedia.org/wiki/Statistical_inference

1.5 Summary
In this chapter, you discovered clearly why statistics is important in general and for machine
learning, and generally the types of methods that are available. Specifically, you learned:

- Statistics is generally considered a prerequisite to the field of applied machine learning.

- We need statistics to help transform observations into information and to answer questions about samples of observations.

- Statistics is a collection of tools developed over hundreds of years for summarizing data and quantifying properties of a domain given a sample of observations.

1.5.1 Next
In the next section, you will discover the tight relationship and differences between machine
learning and statistics.
Chapter 2

Statistics vs Machine Learning

The machine learning practitioner has a tradition of algorithms and a pragmatic focus on results
and model skill above other concerns such as model interpretability. Statisticians work on
much the same type of modeling problems under the names of applied statistics and statistical
learning. Coming from a mathematical background, they have more of a focus on the behavior
of models and explainability of predictions.
The very close relationship between the two approaches to the same problem means that
both fields have a lot to learn from each other. The need for statisticians to consider algorithmic methods was called out in the classic two cultures paper. Machine learning practitioners must
also take heed, keep an open mind, and learn both the terminology and relevant methods
from applied statistics. In this chapter, you will discover that machine learning and statistical
learning are two closely related but different perspectives on the same problem. After reading
this chapter, you will know:

- Machine learning and predictive modeling are a computer science perspective on modeling data with a focus on algorithmic methods and model skill.

- Statistics and statistical learning are a mathematical perspective on modeling data with a focus on data models and on goodness of fit.

- Machine learning practitioners must keep an open mind and leverage methods and understand the terminology from the closely related fields of applied statistics and statistical learning.

Let’s get started.

2.1 Machine Learning


Machine learning is a subfield of artificial intelligence and is related to the broader field of
computer science. When it comes to developing machine learning models in order to make
predictions, there is a heavy focus on algorithms, code, and results. Machine learning is a lot
broader than developing models in order to make predictions, as can be seen by the definition
in the classic 1997 textbook by Tom Mitchell.

The field of machine learning is concerned with the question of how to construct
computer programs that automatically improve with experience.

7
Random documents with unrelated
content Scribd suggests to you:
It is this citta which appears as the particular states of
consciousness in which both the knower and the known are reflected,
and it comprehends them both in one state of consciousness. It must,
however, be remembered that this citta is essentially a modification
of prakṛti, and as such is non-intelligent; but by the seeming
reflection of the purusha it appears as the knower knowing a certain
object, and we therefore see that in the states themselves are
comprehended both the knower and the known. This citta is not,
however, a separate tattva, but is the sum or unity of the eleven
senses and the ego and also of the five prāṇas or biomotor forces
(Nāgeśa, IV. 10). It thus stands for all that is psychical in man: his
states of consciousness including the living principle in man
represented by the activity of the five prāṇas.
It is the object of Yoga gradually to restrain the citta from its
various states and thus cause it to turn back to its original cause, the
kāraṇacitta, which is all-pervading. The modifications of the
kāraṇacitta into such states as the kāryyacitta is due to its being
overcome by its inherent tamas and rajas; so when the
transformations of the citta into the passing states are arrested by
concentration, there takes place a backward movement and the all-
pervading state of the citta being restored to itself and all tamas
being overcome, the Yogin acquires omniscience, and finally when
this citta becomes as pure as the form of purusha itself, the purusha
becomes conscious of himself and is liberated from the bonds of
prakṛti.
The Yoga philosophy in the first chapter describes the Yoga for
him whose mind is inclined towards trance-cognition. In the second
chapter is described the means by which one with an ordinary
worldly mind (vyutthāna citta) may also acquire Yoga. In the third
chapter are described those phenomena which strengthen the faith of
the Yogin on the means of attaining Yoga described in the second
chapter. In the fourth chapter is described kaivalya, absolute
independence or oneness, which is the end of all the Yoga practices.
The Bhāshya describes the five classes of cittas and comments
upon their fitness for the Yoga leading to kaivalya. Those are I.
kshipta (wandering), II. mūḍha (forgetful), III. vikshipta
(occasionally steady), IV. ekāgra (one-pointed), niruddha
(restrained).
I. The kshiptacitta is characterised as wandering, because it is
being always moved by the rajas. This is that citta which is always
moved to and fro by the rise of passions, the excess of which may
indeed for the time overpower the mind and thus generate a
temporary concentration, but it has nothing to do with the
contemplative concentration required for attaining absolute
independence. The man moved by rajas, far from attaining any
mastery of himself, is rather a slave to his own passions and is always
being moved to and fro and shaken by them (see Siddhānta-
candrikā, I. 2, Bhojavṛtti, I. 2).
II. The mūḍhacitta is that which is overpowered by tamas, or
passions, like that of anger, etc., by which it loses its senses and
always chooses the wrong course. Svāmin Hariharāraṇya suggests a
beautiful example of such concentration as similar to that of certain
snakes which become completely absorbed in the prey upon which
they are about to pounce.
III. The vikshiptacitta, or distracted or occasionally steady citta, is
that mind which rationally avoids the painful actions and chooses the
pleasurable ones. Now none of these three kinds of mind can hope to
attain that contemplative concentration called Yoga. This last type of
mind represents ordinary people, who sometimes tend towards good
but relapse back to evil.
IV. The one-pointed (ekāgra) is that kind of mind in which true
knowledge of the nature of reality is present and the afflictions due to
nescience or false knowledge are thus attenuated and the mind better
adapted to attain the nirodha or restrained state. All these come
under the saṃprajñāta (concentration on an object of knowledge)
type.
V. The nirodha or restrained mind is that in which all mental
states are arrested. This leads to kaivalya.
Ordinarily our minds are engaged only in perception, inference,
etc.—those mental states which we all naturally possess. These
ordinary mental states are full of rajas and tamas. When these are
arrested, the mind flows with an abundance of sattva in the
saṃprajñāta samādhi; lastly when even the saṃprajñāta state is
arrested, all possible states become arrested.
Another important fact which must be noted is the relation of the
actual states of mind called the vṛttis with the latent states called the
saṃskāras—the potency. When a particular mental state passes away
into another, it is not altogether lost, but is preserved in the mind in
a latent form as a saṃskāra, which is always trying to manifest itself
in actuality. The vṛttis or actual states are thus both generating the
saṃskāras and are also always tending to manifest themselves and
actually generating similar vṛttis or actual states. There is a
circulation from vṛttis to saṃskāras and from them again to vṛttis
(saṃskārāḥ vṛttibhiḥ kriyante, saṃskāraiśca vṛttayaḥ evaṃ
vṛttisaṃskāracakramaniśamāvarttate). So the formation of
saṃskāras and their conservation are gradually being strengthened
by the habit of similar vṛttis or actual states, and their continuity is
again guaranteed by the strength and continuity of these saṃskāras.
The saṃskāras are like roots striking deep into the soil and growing
with the growth of the plant above, but even when the plant above
the soil is destroyed, the roots remain undisturbed and may again
shoot forth as plants whenever they obtain a favourable season. Thus
it is not enough for a Yogin to arrest any particular class of mental
states; he must attain such a habit of restraint that the saṃskāra thus
generated is able to overcome, weaken and destroy the saṃskāra of
those actual states which he has arrested by his contemplation.
Unless restrained by such a habit, the saṃskāra of cessation
(nirodhaja saṃskāra) which is opposed to the previously acquired
mental states become powerful and destroy the latter, these are sure
to shoot forth again in favourable season into their corresponding
actual states.
The conception of avidyā or nescience is here not negative but has
a definite positive aspect. It means that kind of knowledge which is
opposed to true knowledge (vidyāviparītaṃ jñānāntaramavidyā).
This is of four kinds: (1) The thinking of the non-eternal world,
which is merely an effect, as eternal. (2) The thinking of the impure
as the pure, as for example the attraction that a woman’s body may
have for a man leading him to think the impure body pure. (3) The
thinking of vice as virtue, of the undesirable as the desirable, of pain
as pleasure. We know that for a Yogin every phenomenal state of
existence is painful (II. 15). A Yogin knows that attachment (rāga) to
sensual and other objects can only give temporary pleasure, for it is
sure to be soon turned into pain. Enjoyment can never bring
satisfaction, but only involves a man further and further in sorrows.
(4) Considering the non-self, e.g. the body as the self. This causes a
feeling of being injured on the injury of the body.
At the moment of enjoyment there is always present suffering from
pain in the form of aversion to pain; for the tendency to aversion
from pain can only result from the incipient memory of previous
sufferings. Of course this is also a case of pleasure turned into pain
(pariṇāmaduḥkhatā), but it differs from it in this that in the case of
pariṇāmaduḥkha pleasure is turned into pain as a result of change or
pariṇāma in the future, whereas in this case the anxiety as to pain is
a thing of the present, happening at one and the same time that a
man is enjoying pleasure.
Enjoyment of pleasure or suffering from pain causes those
impressions called saṃskāra or potencies, and these again when
aided by association naturally create their memory and thence comes
attachment or aversion, then again action, and again pleasure and
pain and hence impressions, memory, attachment or aversion, and
again action and so forth.
All states are modifications of the three guṇas; in each one of them
the functions of all the three guṇas are seen, contrary to one another.
These contraries are observable in their developed forms, for the
guṇas are seen to abide in various proportions and compose all our
mental states. Thus a Yogin who wishes to be released from pain
once for all is very sensitive and anxious to avoid even our so-called
pleasures. The wise are like the eye-ball. Just as a thread of wool thrown
into the eye gives pain by merely touching it, though not when it comes into
contact with any other organ, so the Yogin is as tender as the eye-
ball where others are insensible of pain. Ordinary persons, however,
who have again and again suffered pains as the consequence of their
own karma, and who again seek them after having given them up,
are all round pierced through as it were by nescience, their minds
become full of afflictions, variegated by the eternal residua of the
passions. They follow in the wake of the “I” and the “Mine” in
relation to things that should be left apart, pursuing threefold pain in
repeated births, due to external and internal causes. The Yogin
seeing himself and the world of living beings surrounded by the
eternal flow of pain, turns for refuge to right knowledge, cause of the
destruction of all pains (Vyāsa-bhāshya, II. 15).
Thinking of the mind and body and the objects of the external
world as the true self and feeling affected by their change is avidyā
(false knowledge).
The modifications that this avidyā suffers may be summarised
under four heads.
I. The ego, which, as described above, springs from the
identification of the buddhi with the purusha.
II. From this ego springs attachment (rāga) which is the
inclination towards pleasure and consequently towards the means
necessary for attaining it in a person who has previously experienced
pleasures and remembers them.
III. Repulsion from pain also springs from the ego and is of the
nature of anxiety for its removal; anger at pain and the means which
produces pain, remains in the mind in consequence of the feeling of
pain, in the case of him who has felt and remembers pain.
IV. Love of life also springs from the ego. This feeling exists in all
persons and appears in a positive aspect in the form “would that I
were never to cease.” This is due to the painful experience of death in
some previous existence, which abides in us as a residual potency
(vāsanā) and causes the instincts of self-preservation, fear of death
and love of life. These modifications including avidyā are called the
five kleśas or afflictions.
We are now in a position to see the far-reaching effects of the
identification of the purusha with the buddhi. We have already seen
how it has generated the macrocosm or external world on the one
hand, and manas and the senses on the other. Now we see that from
it also spring attachment to pleasure, aversion from pain and love of
life, motives observable in most of our states of consciousness, which
are therefore called the klishṭa vṛtti or afflicted states. The five
afflictions (false knowledge and its four modifications spoken of above)
just mentioned are all comprehended in avidyā, since avidyā or false
knowledge is at the root of all worldly experiences. The sphere of
avidyā is all false knowledge generally, and that of asmitā is also
inseparably connected with all our experiences which consist in the
identification of the intelligent self with the sensual objects of the
world, the attainment of which seems to please us and the loss of
which is so painful to us. It must, however, be remembered that
these five afflictions are only different aspects of avidyā and cannot
be conceived separately from avidyā. These always lead us into the
meshes of the world, far from our final goal—the realisation of our
own self—emancipation of the purusha.
Opposed to these are the vṛttis or states which are called unafflicted,
aklishṭa, the habit of steadiness (abhyāsa) and non-attachment to
pleasures (vairāgya) which being antagonistic to the afflicted states,
are helpful towards achieving true knowledge. These represent such
thoughts as tend towards emancipation and are produced from our
attempts to conceive rationally our final state of emancipation, or to
adopt suitable means for this. They must not, however, be confused
with puṇyakarma (virtuous action), for both puṇya and pāpa karma
are said to have sprung from the kleśas. There is no hard and fast
rule with regard to the appearance of these klishṭa and aklishṭa
states, so that in the stream of the klishṭa states or in the intervals
thereof, aklishṭa states may also appear—as practice and
desirelessness born from the study of the Veda, reasoning and
precepts—and remain quite distinct in themselves, unmixed with the
klishṭa states. A Brahman being in a village which is full of the
Kirātas, does not himself become a Kirāta (a forest tribe) for that
reason.
Each aklishṭa state produces its own potency or saṃskāra, and
with the frequency of these states their saṃskāra is strengthened,
which in due course suppresses the klishṭa states.
These klishṭa and aklishṭa modifications are of five descriptions:
pramāṇa (real cognition), viparyyaya (unreal cognition), vikalpa
(logical abstraction and imagination), nidrā (sleep), smṛti (memory).
These vṛttis or states, however, must be distinguished from the six
kinds of mental activity mentioned in Vyāsa-bhāshya, II. 18:
grahaṇa (reception or presentative ideation), dhāraṇa (retention),
ūha (assimilation), apoha (differentiation), tattvajñāna (right
knowledge), abhiniveśa (decision and determination), of which these
states are the products.
We have seen that from avidyā spring all the kleśas or afflictions,
which are therefore seen to be the source of the klishṭa vṛttis as well.
Abhyāsa and vairāgya—the aklishṭa vṛttis, which spring from
precepts, etc., lead to right knowledge, and as such are antagonistic
to the modification of the guṇas on the avidyā side.
We know also that both these sets of vṛttis—the klishṭa and the
aklishṭa—produce their own kinds of saṃskāras, the klishṭa
saṃskāra and the aklishṭa or prajñā saṃskāra. All these
modifications of citta as vṛtti and saṃskāra are the dharmas of citta,
considered as the dharmin or substance.
CHAPTER IX
THE THEORY OF KARMA
The vṛttis are called the mānasa karmas (mental work) as different
from the bāhya karmas (external work) achieved in the exterior
world by the five motor or active senses. These may be divided into
four classes: (1) kṛshṇa (black), (2) śukla (white), (3) śuklakṛshṇa
(white and black), (4) aśuklākṛshṇa (neither white nor black). (1) The
kṛshṇa karmas are those committed by the wicked and, as such, are
wicked actions called also adharma (demerit). These are of two
kinds, viz. bāhya and mānasa, the former being of the nature of
speaking ill of others, stealing others’ property, etc., and the latter of
the nature of such states as are opposed to śraddhā, vīrya, etc., which
are called the śukla karma. (2) The śukla karmas are virtuous or
meritorious deeds. These can only occur in the form of mental states,
and as such can take place only in the mānasa karma. These are
śraddhā (faith), vīrya (strength), smṛti (meditation), samādhi
(absorption), and prajñā (wisdom), which are infinitely superior to
actions achieved in the external world by the motor or active senses.
The śukla karma belongs to those who resort to study and
meditation. (3) The śuklakṛshṇa karma are the actions achieved in
the external world by the motor or active senses. These are called
white and black, because actions achieved in the external world,
however good (śukla) they might be, cannot be altogether devoid of
wickedness (kṛshṇa), since all external actions entail some harm to
other living beings.
Even the Vedic duties, though meritorious, are associated with
sins, for they entail the sacrificing of animals.[40]
The white side of these actions, viz.: that of helping others and
doing good is therefore called dharma, as it is the cause of the
enjoyment of pleasure and happiness for the doer. The kṛshṇa or
black side of these actions, viz. that of doing injury to others is called
adharma, as it is the cause of the suffering of pain to the doer. In all
our ordinary states of existence we are always under the influence of
dharma and adharma, which are therefore called vehicles of actions
(āśerate sāṃsārikā purushā asminniti āśayaḥ). That in which some
thing lives is its vehicle. Here the purushas in evolution are to be
understood as living in the sheath of actions (which is for that reason
called a vehicle or āśaya). Merit or virtue, and sin or demerit are the
vehicles of actions. All śukla karma, therefore, either mental or
external, is called merit or virtue and is productive of happiness; all
kṛshṇa karma, either mental or external, is called demerit, sin or vice
and is productive of pain.
(4) The karma called aśuklākṛshṇa (neither black nor white) is of
those who have renounced everything, whose afflictions have been
destroyed and whose present body is the last one they will have.
Those who have renounced actions, the karma-sannyāsis (and not
those who belong to the sannyāsāśrama merely), are nowhere found
performing actions which depend upon external means. They have
not got the black vehicle of actions, because they do not perform such
actions. Nor do they possess the white vehicle of actions, because
they dedicate to Īśvara the fruits of all vehicles of action, brought
about by the practice of Yoga.
Returning to the question of karmāśaya again for review, we see
that being produced from desire (kāma), avarice (lobha), ignorance
(moha), and anger (krodha) it has really got at its root the kleśas
(afflictions) such as avidyā (ignorance), asmitā (egoism), rāga
(attachment), dvesha (antipathy), abhiniveśa (love of life). It will be
easily seen that the passions named above, desire, lust, etc., are not
in any way different from the kleśas or afflictions previously
mentioned; and as all actions, virtuous or sinful, have their springs
in the said sentiments of desire, anger, covetousness, and
infatuation, it is quite evident that all these virtuous or sinful actions
spring from the kleśas.
Now this karmāśaya ripens into life-state, life-experience and life-
time, if the roots—the afflictions—exist. Not only is it true that when
the afflictions are rooted out, no karmāśaya can accumulate, but
even when many karmāśayas of many lives are accumulated, they are
rooted out when the afflictions are destroyed. Otherwise it would be
difficult to conceive how the karmāśaya accumulated over an infinite
number of years, whose time of ripeness is uncertain, could ever be
rooted out; and even if no fresh karmāśaya accrued after the rise of
true knowledge, the purusha could not be liberated, but would be
required to suffer an endless cycle of births and rebirths to exhaust
the already accumulated karmāśayas of endless lives. For this reason
the mental plane becomes a field for the production of the fruits of
action only when it is watered by the stream of afflictions. Hence the afflictions
help the vehicle of actions (karmāśaya) in the production of their
fruits also. It is for this reason that when the afflictions are destroyed
the power which helps to bring about the manifestation also
disappears; and on that account the vehicles of actions although
existing in innumerable quantities have no time for their fruition and
do not possess the power of producing fruit, because their seed-
powers are destroyed by intellection.
Karmāśaya is of two kinds: (1) that ripening in the same life
(dṛshṭajanmavedanīya); (2) that ripening in another, unknown life
(adṛshṭajanmavedanīya). That
puṇya karmāśaya, which is generated by intense purificatory action,
trance and repetition of mantras, and that pāpa karmāśaya, which is
generated by repeated evil done either to men who are suffering the
extreme misery of fear, disease and helplessness, or to those who
place confidence in them or to those who are high-minded and
perform tapas, ripen into fruit in the very same life, whereas other
kinds of karmāśayas ripen in some unknown life.
Living beings in hell have no dṛshṭajanma karmāśaya, for this life
is intended for suffering only and their bodies are called the bhoga-
śarīras intended for suffering alone and not for the accumulation of
any karmāśaya which could take effect in that very life.
There are others whose afflictions have been spent and exhausted
and thus they have no such karmāśaya, the effect of which they will
have to reap in some other life. They are thus said to have no
adṛshṭa-janmavedanīya karma.
The karmāśaya of both kinds described above ripens into life-state,
life-time and life-experience. These are called the three ripenings or
vipākas of the karmāśaya; and they are conducive to pleasure or
pain, according as they are products of puṇyakarmāśaya (virtue) or
pāpa karmāśaya (vice or demerit). Many karmāśayas combine to
produce one life-state; for it is not possible that each karma should
produce one or many life-states, for then there would be no
possibility of experiencing the effects of the karmas, because if for
each one of the karmas we had one or more lives, karmas, being
endless, space for obtaining lives in which to experience effects
would not be available, for it would take endless time to exhaust the
karmas already accumulated. It is therefore held that many karmas
unite to produce one life-state or birth (jāti) and to determine also its
particular duration (āyush) and experience (bhoga). The virtuous
and sinful karmāśayas accumulated in one life, in order to produce
their effects, cause the death of the individual and manifest
themselves in producing his rebirth, his duration of life and
particular experiences, pleasurable or painful. The order of
undergoing the experiences is the order in which the karmas
manifest themselves as effects, the principal ones being manifested
earlier in life. The principal karmas here refer to those which are
quite ready to generate their effects. Thus it is said that those karmas
which produce their effects immediately are called primary, whereas
those which produce effects after some delay are called secondary.
Thus we see that there is continuity of existence throughout; when
the karmas of this life ripen jointly they tend to fructify by causing
another birth as a means to which death is caused, and along with it
life is manifested in another body (according to the dharma and
adharma of the karmāśaya) formed by the prakṛtyāpūra (cf. the citta
theory described above); and the same karmāśaya regulates the life-
period and experiences of that life, the karmāśayas of which again
take a similar course and manifest themselves in the production of
another life and so on.
We have seen that the karmāśaya has three fructifications, viz. jāti,
āyush and bhoga. Now generally the karmāśaya is regarded as
ekabhavika or unigenital, i.e. it accumulates in one life. Ekabhava
means one life and ekabhavika means the product of one life, or
accumulated in one life. Regarded from this point of view, it may be
contrasted with the vāsanās which remain accumulated from
thousands of previous lives since eternity, the mind being pervaded
all over with them as a fishing-net is covered all over with knots.
This vāsanā results from memory of the experiences of a life
generated by the fructification of the karmāśaya and kept in the citta
in the form of potency or impressions (saṃskāra). Now we have
previously seen that the citta remains constant in all the births and
rebirths that an individual has undergone from eternity; it therefore
keeps the memory of those various experiences of thousands of lives
in the form of saṃskāra or potency and is therefore compared with a
fishing-net pervaded all over with knots. The vāsanās therefore are
not the results of the accumulation of experiences or their memory in
one life but in many lives, and are therefore called anekabhavika as
contrasted with the karmāśaya representing virtuous and vicious
actions which are accumulated in one life and which produce another
life, its experiences and its life-duration as a result of fructification
(vipāka). This vāsanā is the cause of the instinctive tendencies, or
habits of deriving pleasures and pains peculiar to different animal
lives.
Thus the habits of a dog-life and its peculiar modes of taking its
experiences and of deriving pleasures and pains are very different in
nature from those of a man-life; they must therefore be explained on
the basis of an incipient memory in the form of potency, or
impressions (saṃskāra) of the experiences that an individual must
have undergone in a previous dog-life.
Now when by the fructification of the karmāśaya a dog-life is
settled for a person, his corresponding vāsanās of a previous dog-life
are at once revived and he begins to take interest in his dog-life in the
manner of a dog; the same principle applies to the lives of
individuals as men or as gods (IV. 8).
If there was not this law of vāsanās, then any vāsanā would be
revived in any life, and with the manifestation of the vāsanā of
animal life a man would take interest in eating grass and derive
pleasure from it. Thus Nāgeśa says: “Now if those karmas which
produce a man-life should manifest the vāsanās of animal lives, then
one might be inclined to eat grass as a man, and it is therefore said
that only the vāsanās corresponding to the karmas are revived.”
Now as the vāsanās are of the nature of saṃskāras or impressions,
they lie ingrained in the citta and nothing can prevent their being
revived. The intervention of other births has no effect. For this
reason, the vāsanās of a dog-life are at once revived in another dog-
life, though between the first dog-life and the second dog-life, the
individual may have passed through many other lives, as a man, a
bull, etc., though the second dog-life may take place many hundreds
of years after the first dog-life and in quite different countries. The
difference between saṃskāra or impression and smṛti or memory is
simply this, that the former is the latent state whereas the latter is the
manifested state; so we see that the memory and the impressions are
identical in nature, so that whenever a saṃskāra is revived, it means
nothing but the manifestation of the memory of the same
experiences conserved in the saṃskāra in a latent state. Experiences,
when they take place, keep their impressions in the mind, though
thousands of other experiences, lapse of time, etc., may intervene.
They are revived in one moment with the proper cause of their
revival, and the other intervening experiences can in no way hinder
this revival. So it is with the vāsanās, which are revived at once
according to the particular fructification of the karmāśaya, in the
form of a particular life, as a man, a dog, or anything else.
It is now clear that the karmāśaya tending towards fructification is
the cause of the manifestation of the vāsanās already existing in the
mind in a latent form. Thus the Sūtra says:—“When two similar lives
are separated by many births, long lapses of time and remoteness of
space, even then for the purpose of the revival of the vāsanās, they
may be regarded as immediately following each other, for the
memories and impressions are the same” (Yoga-sūtra, IV. 9). The
Bhāshya says: “the vāsanā is like the memory (smṛti), and so there
can be memory from the impressions of past lives separated by many
lives and by remote tracts of country. From these memories the
impressions (saṃskāras) are derived, and the memories are revived
by manifestation of the karmāśayas, and though memories from past
impressions may have many lives intervening, these interventions do
not destroy the causal antecedence of those past lives” (IV. 9).
These vāsanās are, however, beginningless, for a baby just after
birth is seen to feel the fear of death instinctively, and it could not
have derived it from its experience in this life. Again, if a small baby
is thrown upwards, it is seen to shake and cry like a grown-up man,
and from this it may be inferred that it is afraid of falling down on
the ground and is therefore shaking through fear. Now this baby has
never learnt in this life from experience that a fall on the ground will
cause pain, for it has never fallen on the ground and suffered pain
therefrom; so the cause of this fear cannot be sought in the
experiences of this life, but in the memory of past experiences of fall
and pain arising therefrom, which is innate in this life as vāsanā and
causes this instinctive fear. So this innate memory which causes
instinctive fear of death from the very time of birth, has not its origin
in this life but is the memory of the experience of some previous life,
and in that life, too, it existed as innate memory of some other
previous life, and in that again as the innate memory of some other
life and so on to beginningless time. This goes to show that the
vāsanās are without beginning.
We come now to the question of unigenitality—ekabhavikatva—of
the karmāśaya and its exceptions. We find that great confusion has
occurred among the commentators about the following passage in
the Bhāshya which refers to this subject: The Bhāshya according to
Vācaspati in II. 13 reads: tatra dṛshṭajanmavedanīyasya
niyatavipākasya, etc. Here Bhikshu and Nāgeśa read
tatrādṛshṭajanmavedanīyasya niyatavipākasya, etc. There is thus a
divergence of meaning on this point between Vijñāna Bhikshu (the
author of the Yoga-vārttika) and his follower Nāgeśa on one side, and
Vācaspati on the other.
Vācaspati says that the dṛshṭajanmavedanīya (to be fructified in
the same visible life) karma is the only true karma where the
karmāśaya is ekabhavika, unigenital, for here these effects are
positively not due to the karma of any other previous lives, but to the
karma of that very life. Thus these are the only true cases of
ekabhavika karmāśaya.
Thus according to Vācaspati we see that the adṛshṭajanmavedanīya
karma (to be fructified in another life) of unappointed fruition is
never an ideal instance of ekabhavikatva or unigenital character; for it may
have three different courses: (1) It may be destroyed without fruition.
(2) It may become merged in the ruling action. (3) It may exist for a
long time overpowered by the ruling action whose fruition has been
appointed.
Vijñāna Bhikshu and his follower Nāgeśa, however, say that the
dṛshṭajanmavedanīya karma (to be fructified in the same visible life)
can never be ekabhavika or unigenital for there is no bhava, or
previous birth there, whose product is being fructified in that life, for
this karma is of that same visible life and not of some other previous
bhava or life; and they agree in holding that it is for that reason that
the Bhāshya makes no mention of this dṛshṭajanmavedanīya karma;
it is clear that no karmāśaya of any other bhava is being fructified
here. Thus we see that about dṛshṭajanmavedanīya karma, Vācaspati
holds that it is the typical case of ekabhavika karma (karma of the
same birth), whereas Vijñāna Bhikshu holds just the opposite view,
viz. that the dṛshṭajanmavedanīya karma should by no means be
considered as ekabhavika since there is here no bhava or birth, it
being fructified in the same life.
The adṛshṭajanmavedanīya karma (works to be fructified in
another life) of unfixed fruition has three different courses: (I) As we
have observed before, by the rise of aśuklākṛshṇa (neither black nor
white) karma, the other karmas—śukla (white), kṛshṇa (black) and
śuklakṛshṇa (both white and black)—are rooted out. The śukla
karmāśaya again arising from study and asceticism destroys the
kṛshṇa karmas without their being able to generate their effects.
These therefore can never be styled ekabhavika, since they are
destroyed without producing any effect. (II) When the effects of
minor actions are merged in the effects of the major and ruling
action. The sins originating from the sacrifice of animals at a holy
sacrifice are sure to produce bad effects, though they may be minor
and small in comparison with the good effects arising from the
performance of the sacrifice in which they are merged. Thus it is said
that the experts being immersed in floods of happiness brought
about by their sacrifices bear gladly particles of the fire of sorrow
brought about by the sin of killing animals at sacrifice. So we see that
here also the minor actions having been performed with the major do
not produce their effects independently, and so all their effects are
not fully manifested, and hence these secondary karmāśayas cannot
be regarded as ekabhavika. (III) Again the adṛshṭajanmavedanīya
karma (to be fructified in another life) of unfixed fruition (aniyata
vipāka) remains overcome for a long time by another
adṛshṭajanmavedanīya karma of fixed fruition. A man may for
example do some good actions and some extremely vicious ones, so
that at the time of death, the karmāśaya of those vicious actions
becoming ripe and fit for appointed fruition, generates an animal
life. His good action, whose benefits are such as may be reaped only
in a man-life, will remain overcome until the man is born again as a
man: so this also cannot be said to be ekabhavika (to be reaped in
one life). We may summarise the classification of karmas according
to Vācaspati as follows:—

I. Dṛshṭajanmavedanīya (fructifying in the same life), of fixed
fruition (niyatavipāka): the typical ekabhavika karmāśaya.
II. Adṛshṭajanmavedanīya (fructifying in another life):
(a) of fixed fruition (niyatavipāka): ekabhavika;
(b) of unfixed fruition (aniyatavipāka): not ekabhavika, since it
may (1) be destroyed without fruition, (2) become merged in the
ruling action, or (3) remain long overpowered by a ruling action
of appointed fruition.
Thus the karmāśaya may be viewed from two sides, one being that
of fixed as against unfixed fruition, and the other that of
dṛshṭajanmavedanīya as against adṛshṭajanmavedanīya. Now the theory is
that the niyatavipāka (of fixed fruition) karmāśaya is always
ekabhavika, i.e. it does not remain separated by other lives, but
directly produces its effects in the succeeding life.
Ekabhavika means that which is produced from the accumulation
of karmas in one life in the life which succeeds it. Vācaspati,
however, takes it also to mean that action which attains fruition in
the same life in which it is performed, whereas what Vijñāna Bhikshu
understands by ekabhavika is that action alone which is produced in
the life immediately succeeding the life in which it was accumulated.
So according to Vijñāna Bhikshu, the niyata vipāka (of fixed fruition)
dṛshṭajanmavedanīya (to be fructified in the same life) action is not
ekabhavika, since it has no bhava, i.e. it is not the production of a
preceding life. Neither can it be anekabhavika; thus this
niyatavipākadṛshṭajanmavedanīya action is neither ekabhavika nor
anekabhavika, whereas Vācaspati is inclined to call this also
ekabhavika. About the niyatavipāka-adṛshṭajanmavedanīya action
being called ekabhavika (unigenital) there seems to be no dispute.
The aniyatavipāka-adṛshṭajanmavedanīya action cannot be called
ekabhavika as it undergoes three different courses described above.
CHAPTER X
THE ETHICAL PROBLEM
We have described avidyā and its special forms as the kleśas, from
which also proceed the actions virtuous and vicious, which in their
turn again produce as a result of their fruition, birth, life and
experiences of pleasure and pain and the vāsanās or residues of the
memory of these experiences. Again every new life or birth is
produced from the fructification of actions of a previous life; a man is
made to perform actions good or bad by the kleśas which are rooted
in him, and these actions, as a result of their fructification, produce
another life and its experiences, in which life again new actions are
earned by virtue of the kleśas, and thus the cycle is continued. When
there is pralaya or involution of the cosmical world-process the
individual cittas of the separate purushas return back to the prakṛti
and lie within it, together with their own avidyās, and at the time of
each new creation or evolution these are created anew with such
changes as are due according to their individual avidyās, with which
they had to return back to their original cause, the prakṛti, and spend
an indivisible inseparable existence with it. The avidyās of some
other creation, being merged in the prakṛti along with the cittas,
remain in the prakṛti as vāsanās, and prakṛti, being under the
influence of these avidyās as vāsanās, creates as modifications of itself
the corresponding minds for the individual purushas connected with
them before the last pralaya or dissolution.
cittas had returned to their original causes with their individual
nescience (avidyā), the avidyā was not lost but was revived at the
time of the new creation and created such minds as should be
suitable receptacles for it. These minds (buddhi) are found to be
modified further into their specific cittas or mental planes by the
same avidyā which is manifested in them as the kleśas, and these
again in the karmāśaya, jāti, āyush and bhoga, and so on; the
individual, however, is just in the same position as he was or would
have been before the involution of pralaya. The avidyās of the cittas
which had returned to the prakṛti, being revived at the time of the
new creation, re-create their own buddhis of the previous creation, and by
their connection with the individual purushas are the causes of the
saṃsāra or cosmic evolution—the evolution of the microcosm, the
cittas, and the macrocosm or the exterior world.
In this new creation, the creative agencies of God and avidyā are
thus distinguished in that the latter represents the end or purpose of
the prakṛti—the ever-evolving energy transforming itself into its
modifications as the mental and the material world; whereas the
former represents that intelligent power which abides outside the
pale of prakṛti, but removes obstructions offered by the prakṛti.
Though unintelligent and not knowing how and where to yield so as
to form the actual modifications necessary for the realisation of the
particular and specific objects of the numberless purushas, these
avidyās hold within themselves the serviceability of the purushas,
and are the cause of the connection of the purusha and the prakṛti, so
that when these avidyās are rooted out it is said that the
purushārthatā or serviceability of the purusha is at an end and the
purusha becomes liberated from the bonds of prakṛti, and this is
called the final goal of the purusha.
The ethical problem of the Pātañjala philosophy is the uprooting of
this avidyā by the attainment of true knowledge of the nature of the
purusha, which will be succeeded by the liberation of the purusha
and his absolute freedom or independence—kaivalya—the last
realisation of the purusha—the ultimate goal of all the movements of
the prakṛti.
This final uprooting of the avidyā with its vāsanās directly follows
the attainment of true knowledge called prajñā, in which state the
seed of false knowledge is altogether burnt and cannot be revived
again. Before this state, the discriminative knowledge which arises as
the recognition of the distinct natures of purusha and buddhi
remains shaky; but when by continual practice this discriminative
knowledge becomes strengthened in the mind, its potency gradually
grows stronger and stronger, and roots out the potency of the
ordinary states of mental activity, and thus the seed of false
knowledge becomes burnt up and incapable of fruition, and the
impurity of the energy of rajas being removed, the sattva as the
manifesting entity becomes of the highest purity, and in that state
flows on the stream of the notion of discrimination—the recognition
of the distinct natures of purusha and buddhi—free from impurity.
Thus when the state of buddhi becomes almost as pure as the
purusha itself, all self-enquiry subsides, the vision of the real form of
the purusha arises, and false knowledge, together with the kleśas and
the consequent fruition of actions, ceases once for all. This is that
state of citta which, far from tending towards the objective world,
tends towards the kaivalya of the purusha.
In the first stages, when the mind attains discriminative
knowledge, the prajñā is not deeply seated, and occasionally
phenomenal states of consciousness are seen to intervene in the form
of “I am,” “Mine,” “I know,” “I do not know,” because the old
potencies, though becoming weaker and weaker are not finally
destroyed, and consequently occasionally produce their
corresponding conscious manifestation as states which impede the
flow of discriminative knowledge. But constant practice in rooting
out the potency of this state destroys the potencies of the outgoing
activity, and finally no intervention occurs in the flow of the stream
of prajñā through the destructive influence of phenomenal states of
consciousness. In this higher state when the mind is in its natural,
passive, and objectless stream of flowing prajñā, it is called the
dharmamegha-samādhi. When nothing is desired even from dhyāna
arises the true knowledge which distinguishes prakṛti from purusha
and is called the dharmamegha-samādhi (Yoga-sūtra, IV. 29). The
potency, however, of this state of consciousness lasts until the
purusha is finally liberated from the bonds of prakṛti and is
absolutely free (kevalī). Now this is the state when the citta becomes
infinite, and all its tamas being finally overcome, it shines forth like
the sun, which can reflect all, and in comparison to which the
crippled insignificant light of objective knowledge shrinks altogether,
and thus an infinitude is acquired, which has absorbed within itself
all finitude, which cannot have any separate existence or
manifestation through this infinite knowledge. All finite states of
knowledge are only a limitation of true infinite knowledge, in which
there is no limitation of this and that. It absorbs within itself all these
limitations.
The purusha in this state may be called the emancipated being,
jīvanmukta. Nāgeśa in explaining Vyāsa-bhāshya, IV. 31, describing
the emancipated life says: “In this jīvanmukta stage, being freed
from all impure afflictions and karmas, the consciousness shines in
its infinity. The infiniteness of consciousness is different from the
infiniteness of materiality veiled by tamas. In those stages there
could be consciousness only with reference to certain things with
reference to which the veil of tamas was raised by rajas. When all
veils and impurities are removed, then little is left which is not
known. If there were other categories besides the 25 categories, these
also would then have been known” (Chāyāvyākhyā, IV. 31).
Now with the rise of such dharmamegha the succession of the
changes of the qualities is over, inasmuch as they have fulfilled their
object by having achieved experience and emancipation, and their
succession having ended, they cannot stay even for a moment. And
now comes absolute freedom, when the guṇas return back to the
pradhāna their primal cause, after performing their service for the
purusha by providing his experience and his salvation, so that they
lose all their hold on purusha and purusha remains as he is in
himself, and never again has any connection with the buddhi. The
purusha remains always in himself in absolute freedom.
The order of the return of the guṇas for a kevalī purusha is
described below in the words of Vācaspati: The guṇas as cause and
effect, involving ordinary experiences, samādhi and nirodha, become
submerged in the manas; the manas becomes submerged in the
asmitā, the asmitā in the liṅga, and the liṅga in the aliṅga.
This state of kaivalya must be distinguished from the state of
mahāpralaya in which also the guṇas return back to prakṛti, for that
state is again succeeded by later connections of prakṛti with purushas
through the buddhis, but the state of kaivalya is an eternal state
which is never again disturbed by any connection with prakṛti, for
now the separation of prakṛti from purusha is eternal, whereas that
in the mahāpralaya state was only temporary.
We shall conclude this section by noting two kinds of eternity of
purusha and of prakṛti, and by offering a criticism of the prajñā state.
The former is said to be perfectly and unchangeably eternal
(kūṭastha nitya), and the latter is only eternal in an evolutionary
form. The permanent or eternal reality is that which remains
unchanged amid its changing appearances; and from this point of
view both purusha and prakṛti are eternal. It is indeed true, as we
have seen just now, that the succession of changes of qualities with
regard to buddhi, etc., comes to an end when kaivalya is attained, but
this is with reference to purusha, for the changes of qualities in the
guṇas themselves never come to an end. So the guṇas in themselves
are eternal in their changing or evolutionary character, and are
therefore said to possess evolutionary eternity (pariṇāminityatā).
Our phenomenal conception cannot be free from change, and
therefore it is that in our conception of the released purushas we
affirm their existence, as for example when we say that the released
purushas exist eternally. But it must be carefully noted that this is
due to the limited character of our thoughts and expressions, not to
the real nature of the released purushas, which remain for ever
unqualified by any changes or modifications, pure and colourless as
the very self of shining intelligence (see Vyāsa-bhāshya, IV. 33).
We shall conclude this section by giving a short analysis of the
prajñā state from its first appearance to the final release of purusha
from the bondage of prakṛti. Patañjali says that this prajñā state
being final in each stage is sevenfold. Of these the first four stages are
due to our conscious endeavour, and when these conscious states of
prajñā (supernatural wisdom) flow in a stream and are not hindered
or interfered with in any way by other phenomenal conscious states
of pratyayas the purusha becomes finally liberated through the
natural backward movement of the citta to its own primal cause, and
this backward movement is represented by the other three stages.
The seven prajñā stages may be thus enumerated:—
I. The pain to be removed is known. Nothing further remains to be
known of it.
This is the first aspect of the prajñā, in which the person willing to
be released knows that he has exhausted all that is knowable of the
pains.
II. The cause of the pains has been removed and nothing further
remains to be removed of it. This is the second stage or aspect of the
rise of prajñā.
III. The nature of the extinction of pain has already been perceived
by me in the state of samādhi, so that I have come to learn that the
final extinction of my pain will be something like it.
IV. The final discrimination of prakṛti and purusha, the true and
immediate means of the extinction of pain, has been realised.
After this stage, nothing remains to be done by the purusha
himself. For this is the attainment of final true knowledge. It is also
called the para vairāgya. It is the highest consummation, in which
the purusha has no further duties to perform. This is therefore called
the kārya vimukti (or salvation depending on the endeavour of the
purusha) or jīvanmukti.
After this follows the citta vimukti or the process of release of the
purusha from the citta, in three stages.
V. The aspect of the buddhi, which has finally finished its services
to purusha by providing scope for purusha’s experiences and release;
so that it has nothing else to perform for purusha. This is the first
stage of the retirement of the citta.
VI. As soon as this state is attained, like the falling of stones
thrown from the summit of a hill, the guṇas cannot remain even for a
moment to bind the purusha, but at once return back to their primal
cause, the prakṛti; for the avidyā being rooted out, there is no tie or
bond which can keep it connected with purusha and make it suffer
changes for the service of purusha. All the purushārthatā being
ended, the guṇas disappear of themselves.
VII. The seventh and last aspect of the guṇas is that they never
return back to bind purusha again, their teleological purpose being
fulfilled or realised. It is of course easy to see that, in these last three
stages, purusha has nothing to do; but the guṇas of their own nature
suffer these backward modifications and return back to their own
primal cause and leave the purusha kevalī (for ever solitary)
(Vyāsa-bhāshya, II. 15).
Vyāsa says that as the science of medicine has four divisions: (1)
disease, (2) the cause of disease, (3) recovery, (4) medicines; so this
Yoga philosophy has also four divisions, viz.: (I) Saṃsāra (the
evolution of the prakṛti in connection with the purusha). (II) The
cause of saṃsāra. (III) Release. (IV) The means of release.
Of these the first three have been described at some length above.
We now direct our attention to the fourth. We have shown above that
the ethical goal, the ideal to be realised, is absolute freedom or
kaivalya, and we shall now consider the line of action that must be
adopted to attain this goal—the summum bonum. All actions which
tend towards the approximate realisation of this goal for man are
called kuśala, and the man who achieves this goal is called kuśalī. It
is in the inherent purpose of prakṛti that man should undergo pains
which include all phenomenal experiences of pleasures as well, and
ultimately adopt such a course of conduct as to avoid them altogether
and finally achieve the true goal, the realisation of which will
extinguish all pains for him for ever. The motive therefore which
prompts a person towards this ethico-metaphysical goal is the
avoidance of pain. An ordinary man feels pain only in actual pain,
but a Yogin who is as highly sensitive as the eye-ball, feels pain in
pleasure as well, and therefore is determined to avoid all
experiences, painful or so-called pleasurable. The extinguishing of all
experiences, however, is not the true ethical goal, being only a means
to the realisation of kaivalya or the true self and nature of the
purusha. But this means represents the highest end of a person, the
goal beyond which all his duties cease; for after this comes kaivalya
which naturally manifests itself on the necessary retirement of the
prakṛti. Purusha has nothing to do in effectuating this state, which
comes of itself. The duties of the purusha cease with the thorough
extinguishing of all his experiences. This therefore is the means of
extinguishing all his pains, which are the highest end of all his duties;
but the complete extinguishing of all pains is identical with the
extinguishing of all experiences, the states or vṛttis of consciousness,
and this again is identical with the rise of prajñā or true
discriminative knowledge of the difference in nature of prakṛti and
its effects from the purusha—the unchangeable. These three sides are
only the three aspects of the same state which immediately precedes
kaivalya. The prajñā aspect is the aspect of the highest knowledge; the
suppression of the states of consciousness or experiences is the aspect
of the cessation of all conscious activity; and painlessness or the
extinguishing of all pains is the feeling aspect of the same nirvīja
samādhi state. But when the student directs his
attention to this goal in his ordinary states of experience, he looks at
it from the side of the feeling aspect, viz. that of acquiring a state of
painlessness, and as a means thereto he tries to purify the mind and
be moral in all his actions, and begins to restrain and suppress his
mental states, in order to acquire this nirvīja or seedless state. This is
the sphere of conduct which is called Yogāṅga.
Of course there is a division of duties according to the
advancement of the individual, as we shall have occasion to show
hereafter. This suppression of mental states which has been
described as the means of attaining final release, the ultimate ethical
goal of life, is called Yoga. We have said before that of the five kinds
of mind—kshipta, mūḍha, vikshipta, ekāgra, niruddha—only the last
two are fit for the process of Yoga and ultimately acquire absolute
freedom. In the other three, though concentration may occasionally
happen, yet there is no extrication of the mind from the afflictions of
avidyā and consequently there is no final release.
CHAPTER XI
YOGA PRACTICE

The Yoga which, after weakening the hold of the afflictions and
causing the real truth to dawn upon our mental vision, gradually
leads us towards the attainment of our final goal, is only possible for
the last two kinds of minds and is of two kinds: (1) samprajñāta
(cognitive) and (2) asamprajñāta (ultra-cognitive). The samprajñāta
Yoga is that in which the mind is concentrated upon some object,
external or internal, in such a way that it does not oscillate or move
from one object to another, but remains fixed and settled in the
object that it holds before itself. At first, the Yogin holds a gross
material object before his view, but when he can make himself steady
in doing this, he tries with the subtle tanmātras, the five causes of the
grosser elements, and when he is successful in this he takes his
internal senses as his object and last of all, when he has fully
succeeded in these attempts, he takes the great egohood as his object,
in which stage his object gradually loses all its determinate character
and he is said to be in a state of suppression in himself, although
devoid of any object. This state, like the other previous states of the
samprajñāta type, is a positive state of the mind and not a mere state
of vacuity of objects or negativity. In this state, all determinate
character of the states disappears and their potencies only remain
alive. In the first stages of a Yogin practising samādhi conscious
states of the lower stages often intervene, but gradually, as the mind
becomes fixed, the potencies of the lower stages are overcome by the
potencies of this stage, so that the mind flows in a calm current and
at last the higher prajñā dawns, whereupon the potencies of this state
also are burnt and extinguished, the citta returns back to its own
primal cause, prakṛti, and purusha attains absolute freedom.
The first four stages of the samprajñāta state are called
madhumatī, madhupratīka, viśoka and the saṃskāraśesha and also
vitarkānugata, vicārānugata, ānandānugata and asmitānugata.
True knowledge begins to dawn from the first stage of this
samprajñāta state, and when the Yogin reaches the last stage the
knowledge reaches its culminating point, but still so long as the
potencies of the lower stages of relative knowledge remain, the
knowledge cannot obtain absolute certainty and permanency, as it
will always be threatened with a possible encroachment by the other
states of the past phenomenal activity now existing as the
subconscious. But the last stage of asamprajñāta samādhi represents
the stage in which the ordinary consciousness has been altogether
surpassed and the mind is in its own true infinite aspect, and the
potencies of the stages in which the mind was full of finite knowledge
are also burnt, so that with the return of the citta to its primal cause,
final emancipation is effected. The last state of samprajñāta samādhi
is called saṃskāraśesha, only because here the residua of the
potencies of subconscious thought only remain and the actual states
of consciousness become all extinct. It is now easy to see that no
mind which is not in the ekāgra or one-pointed state can be fit for the
samprajñāta samādhi in which it has to settle itself on one object
and that alone. So also no mind which has not risen to the state of
highest suppression is fit for the asamprajñāta or nirvīja state.
It is now necessary to come down to a lower level and examine the
obstructions, on account of which a mind cannot easily become one-
pointed or ekāgra. These, nine in number, are the following:—
Disease, languor, indecision, want of the mental requirements
necessary for samādhi, idleness of body and mind, attachment to
objects of sense, false and illusory knowledge, non-attainment of the
state of concentrated contemplation, unsteadiness and instability of
the mind in a samādhi state even if it can somehow attain it. These
are again seen to be accompanied with pain and despair owing to the
non-fulfilment of desire, physical shakiness or unsteadiness of the
limbs, taking in of breath and giving out of it, which are seen to
follow the nine distractions of a distracted mind described above.
To prevent these distractions and their accompaniments it is
necessary that we should practise concentration on one truth.
Vācaspati says that this one truth on which the mind should be
settled and fixed is Īśvara, and Rāmānanda Sarasvatī and Nārāyaṇa
Tīrtha agree with him. Vijñāna Bhikshu, however, says that one truth
means any object, gross or fine, and Bhoja supports Vijñāna
Bhikshu, saying that here “one truth” might mean any desirable
object.
Abhyāsa means the steadiness of the mind in one state and not
complete absence of any state; for the Bhāshyakāra himself has said
in the samāpattisūtra, that samprajñāta trance comes after this
steadiness. As we shall see later, it means nothing but the application
of the five means, śraddhā, vīrya, smṛti, samādhi and prajñā; it is an
endeavour to settle the mind on one state, and as such does not differ
from the application of the five means of Yoga with a view to settle
and steady the mind (Yoga-vārttika, I. 13). This effort becomes
firmly rooted, being well attended to for a long time without
interruption and with devotion.
Now it does not matter very much whether this one truth is Īśvara
or any other object; for the true principle of Yoga is the setting of the
mind on one truth, principle or object. But for an ordinary man this
is no easy matter; for in order to be successful the mind must be
equipped with śraddhā or faith—the firm conviction of the Yogin in
the course that he adopts. This keeps the mind steady, pleased, calm
and free from doubts of any kind, so that the Yogin may proceed to
the realisation of his object without any vacillation. Unless a man has
a firm hold on the course that he pursues, all the steadiness that he
may acquire will constantly be threatened with the danger of a
sudden collapse. It will be seen that vairāgya or desirelessness is only
the negative aspect of this śraddhā. For by it the mind is restrained
from the objects of sense, with an aversion or dislike towards the
objects of sensual pleasure and worldly desires; this aversion
towards worldly joys is only the other aspect of the faith of the mind
and the calmness of its currents (cittaprasāda) towards right
knowledge and absolute freedom. So it is said that the vairāgya is the
effect of śraddhā and its product (Yoga-vārttika, I. 20). In order to
make a person suitable for Yoga, vairāgya represents the cessation of
the mind from the objects of sense and their so-called pleasures, and
śraddhā means the positive faith of the mind in the path of Yoga that
one adopts, and the right aspiration towards attaining the highest
goal of absolute freedom.
In its negative aspect, vairāgya is of two kinds, apara and para. The
apara is that of a mind free from attachment to worldly enjoyments,
such as women, food, drinks and power, as also from thirst for
heavenly pleasures attainable by practising the vedic rituals and
sacrifices. Those who are actuated by apara vairāgya do not desire to
remain in a bodiless state (videha) merged in the senses or merged
in the prakṛti (prakṛtilīna). It is a state in which the mind is
indifferent to all kinds of pleasures and pains. This vairāgya may be
said to have four stages: (1) Yatamāna—in which sensual objects are
discovered to be defective and the mind recoils from them. (2)
Vyatireka—in which the senses to be conquered are noted. (3)
Ekendriya—in which attachment towards internal pleasures and
aversion towards external pains, being removed, the mind sets
before it the task of removing attachment and aversion towards
mental passions for obtaining honour or avoiding dishonour, etc. (4)
The fourth and last stage of vairāgya called vaśīkāra is that in which
the mind has perceived the futility of all attractions towards external
objects of sense and towards the pleasures of heaven, and having
suppressed them altogether feels no attachment, even should it come
into connection with them.
With the consummation of this last stage of apara vairāgya, comes
the para vairāgya which is identical with the rise of the final prajñā
leading to absolute independence. This vairāgya, śraddhā and the
abhyāsa represent the unafflicted states (aklishṭavṛtti) which
suppress gradually the klishṭa or afflicted mental states. These lead
the Yogin from one stage to another, and thus he proceeds higher
and higher until the final state is attained.
As vairāgya advances, śraddhā also advances; from śraddhā comes
vīrya, energy, or power of concentration (dhāraṇā); and from it
again springs smṛti—or continuity of one object of thought; and from
it comes samādhi or cognitive and ultra-cognitive trance; after which
follows prajñā and final release. Thus by the inclusion of śraddhā
within vairāgya, its effect, and the other products of śraddhā with
abhyāsa, we see that the abhyāsa and vairāgya are the two internal
means for achieving the final goal of the Yogin, the supreme
suppression and extinction of all states of consciousness, of all
afflictions and the avidyā—the last state of supreme knowledge or
prajñā.
As śraddhā, vīrya, smṛti and samādhi, which are not different from
vairāgya and abhyāsa (they being only their other aspects or
simultaneous products), are the means of attaining Yoga, it is
possible to make a classification of the Yogins according to the
strength of these with each, and the strength of the quickness
(saṃvega) with which they may be applied towards attaining the
goal of the Yogin. Thus in point of energy Yogins are of three kinds:—
(1) mildly energetic, (2) of medium energy, (3) of intense energy.
Each of these may vary in a threefold way according to the mildness,
medium state, or intensity of the quickness or readiness with which the
Yogin may apply the means of attaining Yoga, so that Yogins are of nine
kinds in all. Of these the best is he whose mind is most intensely
engaged and whose practice is also the strongest.
There is a difference of opinion here about the meaning of the
word saṃvega, between Vācaspati and Vijñāna Bhikshu. The former
says that saṃvega means vairāgya here, but the latter holds that
saṃvega cannot mean vairāgya, and vairāgya being the effect of
śraddhā cannot be taken separately from it. “Saṃvega” means
quickness in the performance of the means of attaining Yoga; some
say that it means “vairāgya.” But that is not true, for if vairāgya is an
effect of the due performance of the means of Yoga, there cannot be
the separate ninefold classification of Yoga apart from the various
degrees of intensity of the means of Yoga practice. Further, the word
“saṃvega” does not mean “vairāgya” etymologically (Yoga-vārttika,
I. 20).
We have just seen that śraddhā, etc., are the means of attaining
Yoga, but we have not discussed what purificatory actions an
ordinary man must perform in order to attain śraddhā, from which
the other requisites are derived. Of course these purificatory actions
are not the same for all, since they must necessarily depend upon the
conditions of purity or impurity of each mind; thus a person already
in an advanced state, may not need to perform those purificatory
actions necessary for a man in a lower state. We have just said that
Yogins are of nine kinds, according to the strength of their mental
acquirements—śraddhā, etc.—the requisite means of Yoga and the
degree of rapidity with which they may be applied. Neglecting
division by strength or quickness of application along with these
