Python Data Analytics
With Pandas, NumPy, and Matplotlib

Third Edition

Fabio Nelli
Python Data Analytics: With Pandas, NumPy, and Matplotlib
Fabio Nelli
Rome, Italy

ISBN-13 (pbk): 978-1-4842-9531-1
ISBN-13 (electronic): 978-1-4842-9532-8


https://doi.org/10.1007/978-1-4842-9532-8

Copyright © 2023 by Fabio Nelli


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the
material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now
known or hereafter developed.
Trademarked names, logos, and images may appear in this book. Rather than use a trademark symbol with
every occurrence of a trademarked name, logo, or image we use the names, logos, and images only in an
editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the
trademark.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not
identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to
proprietary rights.
While the advice and information in this book are believed to be true and accurate at the date of publication,
neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or
omissions that may be made. The publisher makes no warranty, express or implied, with respect to the
material contained herein.
Managing Director, Apress Media LLC: Welmoed Spahr
Acquisitions Editor: Celestin Suresh John
Development Editor: James Markham
Coordinating Editor: Mark Powers
Copy Editor: Kezia Endsley
Cover image by Tyler B on Unsplash (www.unsplash.com)
Distributed to the book trade worldwide by Springer Science+Business Media New York, 233 Spring Street,
6th Floor, New York, NY 10013. Phone 1-800-SPRINGER, fax (201) 348-4505, e-mail
[email protected], or visit www.springeronline.com. Apress Media, LLC is a California LLC and the
sole member (owner) is Springer Science + Business Media Finance Inc (SSBM Finance Inc). SSBM Finance
Inc is a Delaware corporation.
For information on translations, please e-mail [email protected]; for reprint,
paperback, or audio rights, please e-mail [email protected].
Apress titles may be purchased in bulk for academic, corporate, or promotional use. eBook versions and
licenses are also available for most titles. For more information, reference our Print and eBook Bulk Sales
web page at http://www.apress.com/bulk-sales.
Any source code or other supplementary material referenced by the author in this book is available to readers
on GitHub (github.com/apress). For more detailed information, please visit https://www.apress.com/
gp/services/source-code.
Paper in this product is recyclable
“Science leads us forward in knowledge, but only analysis makes us more aware”

This book is dedicated to all those who are constantly looking for awareness
Table of Contents

About the Author���������������������������������������������������������������������������������������������������xvii


About the Technical Reviewer��������������������������������������������������������������������������������xix
Preface�������������������������������������������������������������������������������������������������������������������xxi


■Chapter 1: An Introduction to Data Analysis��������������������������������������������������������� 1
Data Analysis�������������������������������������������������������������������������������������������������������������������� 1
Knowledge Domains of the Data Analyst������������������������������������������������������������������������� 2
Computer Science���������������������������������������������������������������������������������������������������������������������������������� 2
Mathematics and Statistics�������������������������������������������������������������������������������������������������������������������� 3
Machine Learning and Artificial Intelligence������������������������������������������������������������������������������������������ 3
Professional Fields of Application����������������������������������������������������������������������������������������������������������� 3

Understanding the Nature of the Data������������������������������������������������������������������������������ 4


When the Data Become Information������������������������������������������������������������������������������������������������������� 4
When the Information Becomes Knowledge������������������������������������������������������������������������������������������� 4
Types of Data������������������������������������������������������������������������������������������������������������������������������������������ 4

The Data Analysis Process����������������������������������������������������������������������������������������������� 4


Problem Definition���������������������������������������������������������������������������������������������������������������������������������� 5
Data Extraction��������������������������������������������������������������������������������������������������������������������������������������� 6
Data Preparation������������������������������������������������������������������������������������������������������������������������������������� 6
Data Exploration/Visualization���������������������������������������������������������������������������������������������������������������� 7
Predictive Modeling�������������������������������������������������������������������������������������������������������������������������������� 7
Model Validation������������������������������������������������������������������������������������������������������������������������������������� 8
Deployment�������������������������������������������������������������������������������������������������������������������������������������������� 8


Quantitative and Qualitative Data Analysis����������������������������������������������������������������������� 9


Open Data������������������������������������������������������������������������������������������������������������������������� 9
Python and Data Analysis����������������������������������������������������������������������������������������������� 12
Conclusions�������������������������������������������������������������������������������������������������������������������� 13

■Chapter 2: Introduction to the Python World������������������������������������������������������ 15
Python—The Programming Language��������������������������������������������������������������������������� 15
The Interpreter and the Execution Phases of the Code������������������������������������������������������������������������ 16
Installing Python������������������������������������������������������������������������������������������������������������� 18
Python Distributions����������������������������������������������������������������������������������������������������������������������������� 19
Using Python����������������������������������������������������������������������������������������������������������������������������������������� 23
Writing Python Code����������������������������������������������������������������������������������������������������������������������������� 26
IPython�������������������������������������������������������������������������������������������������������������������������������������������������� 30

PyPI—The Python Package Index���������������������������������������������������������������������������������� 36


The IDEs for Python������������������������������������������������������������������������������������������������������������������������������ 37

SciPy������������������������������������������������������������������������������������������������������������������������������ 42
NumPy�������������������������������������������������������������������������������������������������������������������������������������������������� 42
Pandas�������������������������������������������������������������������������������������������������������������������������������������������������� 43
matplotlib��������������������������������������������������������������������������������������������������������������������������������������������� 43

Conclusions�������������������������������������������������������������������������������������������������������������������� 43

■Chapter 3: The NumPy Library����������������������������������������������������������������������������� 45
NumPy: A Little History��������������������������������������������������������������������������������������������������� 45
The NumPy Installation�������������������������������������������������������������������������������������������������� 46
ndarray: The Heart of the Library����������������������������������������������������������������������������������� 47
Create an Array������������������������������������������������������������������������������������������������������������������������������������� 48
Types of Data���������������������������������������������������������������������������������������������������������������������������������������� 49
The dtype Option���������������������������������������������������������������������������������������������������������������������������������� 50
Intrinsic Creation of an Array���������������������������������������������������������������������������������������������������������������� 50

Basic Operations������������������������������������������������������������������������������������������������������������ 51
Arithmetic Operators���������������������������������������������������������������������������������������������������������������������������� 52
The Matrix Product������������������������������������������������������������������������������������������������������������������������������� 53


Increment and Decrement Operators��������������������������������������������������������������������������������������������������� 54


Universal Functions (ufunc)������������������������������������������������������������������������������������������������������������������ 54
Aggregate Functions���������������������������������������������������������������������������������������������������������������������������� 55

Indexing, Slicing, and Iterating��������������������������������������������������������������������������������������� 55


Indexing������������������������������������������������������������������������������������������������������������������������������������������������ 55
Slicing��������������������������������������������������������������������������������������������������������������������������������������������������� 57
Iterating an Array���������������������������������������������������������������������������������������������������������������������������������� 59

Conditions and Boolean Arrays�������������������������������������������������������������������������������������� 60


Shape Manipulation������������������������������������������������������������������������������������������������������� 61
Array Manipulation��������������������������������������������������������������������������������������������������������� 62
Joining Arrays��������������������������������������������������������������������������������������������������������������������������������������� 62
Splitting Arrays������������������������������������������������������������������������������������������������������������������������������������� 63

General Concepts����������������������������������������������������������������������������������������������������������� 64
Copies or Views of Objects������������������������������������������������������������������������������������������������������������������� 64
Vectorization����������������������������������������������������������������������������������������������������������������������������������������� 65
Broadcasting���������������������������������������������������������������������������������������������������������������������������������������� 66

Structured Arrays����������������������������������������������������������������������������������������������������������� 68
Reading and Writing Array Data on Files������������������������������������������������������������������������ 70
Loading and Saving Data in Binary Files���������������������������������������������������������������������������������������������� 70
Reading Files with Tabular Data����������������������������������������������������������������������������������������������������������� 70

Conclusions�������������������������������������������������������������������������������������������������������������������� 72

■Chapter 4: The pandas Library—An Introduction������������������������������������������������ 73
pandas: The Python Data Analysis Library��������������������������������������������������������������������� 73
Installation of pandas����������������������������������������������������������������������������������������������������� 74
Installation from Anaconda������������������������������������������������������������������������������������������������������������������� 74
Installation from PyPI���������������������������������������������������������������������������������������������������������������������������� 78

Getting Started with pandas������������������������������������������������������������������������������������������� 78


Introduction to pandas Data Structures������������������������������������������������������������������������� 79


The Series��������������������������������������������������������������������������������������������������������������������������������������������� 80
The Dataframe�������������������������������������������������������������������������������������������������������������������������������������� 87
The Index Objects��������������������������������������������������������������������������������������������������������������������������������� 94

Other Functionalities on Indexes������������������������������������������������������������������������������������ 96


Reindexing�������������������������������������������������������������������������������������������������������������������������������������������� 96
Dropping����������������������������������������������������������������������������������������������������������������������������������������������� 98
Arithmetic and Data Alignment������������������������������������������������������������������������������������������������������������� 99

Operations Between Data Structures��������������������������������������������������������������������������� 100


Flexible Arithmetic Methods��������������������������������������������������������������������������������������������������������������� 100
Operations Between Dataframes and Series�������������������������������������������������������������������������������������� 101

Function Application and Mapping������������������������������������������������������������������������������� 102


Functions by Element������������������������������������������������������������������������������������������������������������������������� 102
Functions by Row or Column�������������������������������������������������������������������������������������������������������������� 102
Statistics Functions���������������������������������������������������������������������������������������������������������������������������� 103

Sorting and Ranking����������������������������������������������������������������������������������������������������� 104


Correlation and Covariance������������������������������������������������������������������������������������������ 107
“Not a Number” Data��������������������������������������������������������������������������������������������������� 108
Assigning a NaN Value������������������������������������������������������������������������������������������������������������������������ 108
Filtering Out NaN Values��������������������������������������������������������������������������������������������������������������������� 109
Filling in NaN Occurrences����������������������������������������������������������������������������������������������������������������� 110

Hierarchical Indexing and Leveling������������������������������������������������������������������������������ 110


Reordering and Sorting Levels����������������������������������������������������������������������������������������������������������� 112
Summary Statistics with groupby Instead of with Level�������������������������������������������������������������������� 113

Conclusions������������������������������������������������������������������������������������������������������������������ 114

■Chapter 5: pandas: Reading and Writing Data��������������������������������������������������� 115
I/O API Tools������������������������������������������������������������������������������������������������������������������ 115
CSV and Textual Files��������������������������������������������������������������������������������������������������� 116
Reading Data in CSV or Text Files��������������������������������������������������������������������������������� 116
Using Regexp to Parse TXT Files�������������������������������������������������������������������������������������������������������� 119
Reading TXT Files Into Parts��������������������������������������������������������������������������������������������������������������� 121
Writing Data in CSV���������������������������������������������������������������������������������������������������������������������������� 121

Reading and Writing HTML Files���������������������������������������������������������������������������������� 123


Writing Data in HTML�������������������������������������������������������������������������������������������������������������������������� 124
Reading Data from an HTML File�������������������������������������������������������������������������������������������������������� 126

Reading Data from XML����������������������������������������������������������������������������������������������� 127


Reading and Writing Data on Microsoft Excel Files������������������������������������������������������ 129
JSON Data�������������������������������������������������������������������������������������������������������������������� 131
The HDF5 Format��������������������������������������������������������������������������������������������������������� 135
Pickle—Python Object Serialization����������������������������������������������������������������������������� 136
Serialize a Python Object with cPickle����������������������������������������������������������������������������������������������� 136
Pickling with pandas�������������������������������������������������������������������������������������������������������������������������� 137

Interacting with Databases������������������������������������������������������������������������������������������ 137


Loading and Writing Data with SQLite3���������������������������������������������������������������������������������������������� 138
Loading and Writing Data with PostgreSQL in a Docker Container���������������������������������������������������� 140

Reading and Writing Data with a NoSQL Database: MongoDB������������������������������������� 146


Conclusions������������������������������������������������������������������������������������������������������������������ 148

■Chapter 6: pandas in Depth: Data Manipulation������������������������������������������������ 149
Data Preparation���������������������������������������������������������������������������������������������������������� 149
Merging���������������������������������������������������������������������������������������������������������������������������������������������� 150

Concatenating�������������������������������������������������������������������������������������������������������������� 154
Combining������������������������������������������������������������������������������������������������������������������������������������������ 156
Pivoting����������������������������������������������������������������������������������������������������������������������������������������������� 157
Removing�������������������������������������������������������������������������������������������������������������������������������������������� 160

Data Transformation����������������������������������������������������������������������������������������������������� 161


Removing Duplicates�������������������������������������������������������������������������������������������������������������������������� 161
Mapping���������������������������������������������������������������������������������������������������������������������������������������������� 162

Discretization and Binning������������������������������������������������������������������������������������������� 166


Detecting and Filtering Outliers���������������������������������������������������������������������������������������������������������� 168

Permutation������������������������������������������������������������������������������������������������������������������ 169
Random Sampling������������������������������������������������������������������������������������������������������������������������������ 170


String Manipulation������������������������������������������������������������������������������������������������������ 170


Built-in Methods for String Manipulation������������������������������������������������������������������������������������������� 170
Regular Expressions��������������������������������������������������������������������������������������������������������������������������� 172

Data Aggregation���������������������������������������������������������������������������������������������������������� 173


GroupBy���������������������������������������������������������������������������������������������������������������������������������������������� 174
A Practical Example���������������������������������������������������������������������������������������������������������������������������� 175
Hierarchical Grouping������������������������������������������������������������������������������������������������������������������������� 176
Group Iteration������������������������������������������������������������������������������������������������������������� 176
Chain of Transformations�������������������������������������������������������������������������������������������������������������������� 177
Functions on Groups��������������������������������������������������������������������������������������������������������������������������� 178

Advanced Data Aggregation����������������������������������������������������������������������������������������� 179


Conclusions������������������������������������������������������������������������������������������������������������������ 181

■Chapter 7: Data Visualization with matplotlib and Seaborn������������������������������ 183
The matplotlib Library�������������������������������������������������������������������������������������������������� 183
Installation�������������������������������������������������������������������������������������������������������������������� 184
The matplotlib Architecture������������������������������������������������������������������������������������������ 185
Backend Layer������������������������������������������������������������������������������������������������������������������������������������ 186
Artist Layer����������������������������������������������������������������������������������������������������������������������������������������� 186
Scripting Layer (pyplot)���������������������������������������������������������������������������������������������������������������������� 188
pylab and pyplot��������������������������������������������������������������������������������������������������������������������������������� 188

pyplot��������������������������������������������������������������������������������������������������������������������������� 189
The Plotting Window��������������������������������������������������������������������������������������������������������������������������� 189

Data Visualization with Jupyter Notebook�������������������������������������������������������������������� 191


Set the Properties of the Plot�������������������������������������������������������������������������������������������������������������� 192
matplotlib and NumPy������������������������������������������������������������������������������������������������������������������������ 194

Using kwargs���������������������������������������������������������������������������������������������������������������� 196


Working with Multiple Figures and Axes�������������������������������������������������������������������������������������������� 196

Adding Elements to the Chart��������������������������������������������������������������������������������������� 198


Adding Text����������������������������������������������������������������������������������������������������������������������������������������� 198
Adding a Grid�������������������������������������������������������������������������������������������������������������������������������������� 202
Adding a Legend��������������������������������������������������������������������������������������������������������������������������������� 203

Saving Your Charts������������������������������������������������������������������������������������������������������� 206


Saving the Code���������������������������������������������������������������������������������������������������������������������������������� 206
Saving Your Notebook as an HTML File or as Other File Formats������������������������������������������������������� 207
Saving Your Chart Directly as an Image���������������������������������������������������������������������������������������������� 208

Handling Date Values��������������������������������������������������������������������������������������������������� 208


Chart Typology�������������������������������������������������������������������������������������������������������������� 211
Line Charts������������������������������������������������������������������������������������������������������������������� 211
Line Charts with pandas��������������������������������������������������������������������������������������������������������������������� 217

Histograms������������������������������������������������������������������������������������������������������������������� 218
Bar Charts�������������������������������������������������������������������������������������������������������������������� 219
Horizontal Bar Charts�������������������������������������������������������������������������������������������������������������������������� 222
Multiserial Bar Charts������������������������������������������������������������������������������������������������������������������������� 223
Multiseries Bar Charts with a pandas Dataframe������������������������������������������������������������������������������� 225
Multiseries Stacked Bar Charts���������������������������������������������������������������������������������������������������������� 227
Stacked Bar Charts with a pandas Dataframe������������������������������������������������������������������������������������ 229
Other Bar Chart Representations�������������������������������������������������������������������������������������������������������� 230

Pie Charts��������������������������������������������������������������������������������������������������������������������� 231


Pie Charts with a pandas Dataframe�������������������������������������������������������������������������������������������������� 234

Advanced Charts���������������������������������������������������������������������������������������������������������� 235


Contour Plots�������������������������������������������������������������������������������������������������������������������������������������� 235
Polar Charts���������������������������������������������������������������������������������������������������������������������������������������� 236

The mplot3d Toolkit������������������������������������������������������������������������������������������������������ 237


3D Surfaces���������������������������������������������������������������������������������������������������������������������������������������� 238
Scatter Plots in 3D������������������������������������������������������������������������������������������������������������������������������ 239
Bar Charts in 3D��������������������������������������������������������������������������������������������������������������������������������� 240

Multipanel Plots������������������������������������������������������������������������������������������������������������ 241


Display Subplots Within Other Subplots��������������������������������������������������������������������������������������������� 241
Grids of Subplots�������������������������������������������������������������������������������������������������������������������������������� 243

The Seaborn Library����������������������������������������������������������������������������������������������������� 245


Conclusions������������������������������������������������������������������������������������������������������������������ 257



■Chapter 8: Machine Learning with scikit-learn������������������������������������������������� 259
The scikit-learn Library������������������������������������������������������������������������������������������������ 259
Machine Learning��������������������������������������������������������������������������������������������������������� 259
Supervised and Unsupervised Learning��������������������������������������������������������������������������������������������� 259
Training Set and Testing Set��������������������������������������������������������������������������������������������������������������� 260

Supervised Learning with scikit-learn������������������������������������������������������������������������� 260


The Iris Flower Dataset������������������������������������������������������������������������������������������������ 261
The PCA Decomposition��������������������������������������������������������������������������������������������������������������������� 264

K-Nearest Neighbors Classifier������������������������������������������������������������������������������������ 267


Diabetes Dataset���������������������������������������������������������������������������������������������������������� 271
Linear Regression: The Least Square Regression�������������������������������������������������������� 272
Support Vector Machines (SVMs)��������������������������������������������������������������������������������� 276
Support Vector Classification (SVC)���������������������������������������������������������������������������������������������������� 277
Nonlinear SVC������������������������������������������������������������������������������������������������������������������������������������� 281
Plotting Different SVM Classifiers Using the Iris Dataset�������������������������������������������������������������������� 283
Support Vector Regression (SVR)�������������������������������������������������������������������������������������������������������� 285

Conclusions������������������������������������������������������������������������������������������������������������������ 287

■Chapter 9: Deep Learning with TensorFlow������������������������������������������������������� 289
Artificial Intelligence, Machine Learning, and Deep Learning�������������������������������������� 289
Artificial Intelligence��������������������������������������������������������������������������������������������������������������������������� 289
Machine Learning Is a Branch of Artificial Intelligence���������������������������������������������������������������������� 290
Deep Learning Is a Branch of Machine Learning�������������������������������������������������������������������������������� 290
The Relationship Between Artificial Intelligence, Machine Learning, and Deep Learning������������������ 290

Deep Learning�������������������������������������������������������������������������������������������������������������� 291


Neural Networks and GPUs����������������������������������������������������������������������������������������������������������������� 291
Data Availability: Open Data Source, Internet of Things, and Big Data����������������������������������������������� 292
Python������������������������������������������������������������������������������������������������������������������������������������������������� 292
Deep Learning Python Frameworks��������������������������������������������������������������������������������������������������� 292

Artificial Neural Networks�������������������������������������������������������������������������������������������� 293


How Artificial Neural Networks Are Structured���������������������������������������������������������������������������������� 293


Single Layer Perceptron (SLP)������������������������������������������������������������������������������������������������������������ 294
Multilayer Perceptron (MLP)��������������������������������������������������������������������������������������������������������������� 296
Correspondence Between Artificial and Biological Neural Networks������������������������������������������������� 297

TensorFlow������������������������������������������������������������������������������������������������������������������� 298
TensorFlow: Google’s Framework������������������������������������������������������������������������������������������������������� 298
TensorFlow: Data Flow Graph������������������������������������������������������������������������������������������������������������� 298

Start Programming with TensorFlow���������������������������������������������������������������������������� 299


TensorFlow 2.x vs TensorFlow 1.x������������������������������������������������������������������������������������������������������ 299
Installing TensorFlow�������������������������������������������������������������������������������������������������������������������������� 300
Programming with the Jupyter Notebook������������������������������������������������������������������������������������������� 300
Tensors����������������������������������������������������������������������������������������������������������������������������������������������� 300
Loading Data Into a Tensor from a pandas Dataframe����������������������������������������������������������������������� 303
Loading Data in a Tensor from a CSV File������������������������������������������������������������������������������������������� 304
Operation on Tensors�������������������������������������������������������������������������������������������������������������������������� 306

Developing a Deep Learning Model with TensorFlow��������������������������������������������������� 307


Model Building������������������������������������������������������������������������������������������������������������� 307
Model Compiling���������������������������������������������������������������������������������������������������������� 308
Model Training and Testing������������������������������������������������������������������������������������������� 309
Prediction Making�������������������������������������������������������������������������������������������������������� 309
Practical Examples with TensorFlow 2.x���������������������������������������������������������������������� 310
Single Layer Perceptron with TensorFlow������������������������������������������������������������������������������������������ 310
Multilayer Perceptron (with One Hidden Layer) with TensorFlow������������������������������������������������������� 317
Multilayer Perceptron (with Two Hidden Layers) with TensorFlow����������������������������������������������������� 319

Conclusions������������������������������������������������������������������������������������������������������������������ 321

■Chapter 10: An Example—Meteorological Data������������������������������������������������ 323
A Hypothesis to Be Tested: The Influence of the Proximity of the Sea������������������������� 323
The System in the Study: The Adriatic Sea and the Po Valley������������������������������������������������������������� 323

Finding the Data Source����������������������������������������������������������������������������������������������� 327


Data Analysis on Jupyter Notebook������������������������������������������������������������������������������ 328


Analysis of Processed Meteorological Data����������������������������������������������������������������� 332


The RoseWind�������������������������������������������������������������������������������������������������������������� 343
Calculating the Mean Distribution of the Wind Speed������������������������������������������������������������������������ 347

Conclusions������������������������������������������������������������������������������������������������������������������ 348

■Chapter 11: Embedding the JavaScript D3 Library in the IPython Notebook���� 349
The Open Data Source for Demographics�������������������������������������������������������������������� 349
The JavaScript D3 Library�������������������������������������������������������������������������������������������� 352
Drawing a Clustered Bar Chart������������������������������������������������������������������������������������� 355
The Choropleth Maps��������������������������������������������������������������������������������������������������� 358
The Choropleth Map of the U.S. Population in 2022����������������������������������������������������� 362
Conclusions������������������������������������������������������������������������������������������������������������������ 366

■Chapter 12: Recognizing Handwritten Digits���������������������������������������������������� 367
Handwriting Recognition���������������������������������������������������������������������������������������������� 367
Recognizing Handwritten Digits with scikit-learn�������������������������������������������������������� 367
The Digits Dataset�������������������������������������������������������������������������������������������������������� 368
Learning and Predicting����������������������������������������������������������������������������������������������� 370
Recognizing Handwritten Digits with TensorFlow�������������������������������������������������������� 372
Learning and Predicting with an SLP��������������������������������������������������������������������������� 376
Learning and Predicting with an MLP�������������������������������������������������������������������������� 381
Conclusions������������������������������������������������������������������������������������������������������������������ 384

■Chapter 13: Textual Data Analysis with NLTK���������������������������������������������������� 385
Text Analysis Techniques���������������������������������������������������������������������������������������������� 385
The Natural Language Toolkit (NLTK)�������������������������������������������������������������������������������������������������� 386
Import the NLTK Library and the NLTK Downloader Tool��������������������������������������������������������������������� 386
Search for a Word with NLTK�������������������������������������������������������������������������������������������������������������� 389
Analyze the Frequency of Words�������������������������������������������������������������������������������������������������������� 390
Select Words from Text����������������������������������������������������������������������������������������������������������������������� 392
Bigrams and Collocations������������������������������������������������������������������������������������������������������������������� 393
Preprocessing Steps��������������������������������������������������������������������������������������������������������������������������� 394


Use Text on the Network��������������������������������������������������������������������������������������������������������������������� 397


Extract the Text from the HTML Pages������������������������������������������������������������������������������������������������ 398
Sentiment Analysis����������������������������������������������������������������������������������������������������������������������������� 399

Conclusions������������������������������������������������������������������������������������������������������������������ 401

■Chapter 14: Image Analysis and Computer Vision with OpenCV����������������������� 403
Image Analysis and Computer Vision��������������������������������������������������������������������������� 403
OpenCV and Python������������������������������������������������������������������������������������������������������ 404
OpenCV and Deep Learning������������������������������������������������������������������������������������������ 404
Installing OpenCV��������������������������������������������������������������������������������������������������������� 404
First Approaches to Image Processing and Analysis���������������������������������������������������� 404
Before Starting����������������������������������������������������������������������������������������������������������������������������������� 404
Load and Display an Image���������������������������������������������������������������������������������������������������������������� 405
Work with Images������������������������������������������������������������������������������������������������������������������������������� 406
Save the New Image��������������������������������������������������������������������������������������������������������������������������� 407
Elementary Operations on Images������������������������������������������������������������������������������������������������������ 407
Image Blending����������������������������������������������������������������������������������������������������������������������������������� 411

Image Analysis������������������������������������������������������������������������������������������������������������� 412


Edge Detection and Image Gradient Analysis��������������������������������������������������������������� 413
Edge Detection����������������������������������������������������������������������������������������������������������������������������������� 413
The Image Gradient Theory���������������������������������������������������������������������������������������������������������������� 413
A Practical Example of Edge Detection with the Image Gradient Analysis����������������������������������������� 415
A Deep Learning Example: Face Detection������������������������������������������������������������������� 420
Conclusions������������������������������������������������������������������������������������������������������������������ 422

■Appendix A: Writing Mathematical Expressions with LaTeX����������������������������� 423


■Appendix B: Open Data Sources������������������������������������������������������������������������ 435

Index��������������������������������������������������������������������������������������������������������������������� 437

About the Author

Fabio Nelli is a data scientist and Python consultant who designs and develops Python applications for
data analysis and visualization. He also has experience in the scientific world, having performed various
data analysis roles in pharmaceutical chemistry for private research companies and universities. He has
been a computer consultant for many years at IBM, EDS, and Hewlett-Packard, along with several banks
and insurance companies. He holds a master’s degree in organic chemistry and a bachelor’s degree in
information technologies and automation systems, with many years of experience in life sciences (as a tech
specialist at Beckman Coulter, Tecan, and SCIEX).
For further info and other examples, visit his page at www.meccanismocomplesso.org and the GitHub
page at https://github.com/meccanismocomplesso.

About the Technical Reviewer

Akshay R. Kulkarni is an artificial intelligence (AI) and machine learning


(ML) evangelist and thought leader. He has consulted with several Fortune
500 and global enterprises to drive AI and data science–led strategic
transformations. He is a Google developer, an author, and a regular
speaker at major AI and data science conferences, including the O’Reilly
Strata Data & AI Conference and the Great International Developer
Summit (GIDS). He has been a visiting faculty member at some of the
top graduate institutes in India. In 2019, he was featured as one of India’s
“top 40 under 40” data scientists. In his spare time, Akshay enjoys reading,
writing, coding, and helping aspiring data scientists. He lives in Bangalore
with his family.

Preface

About five years have passed since the last edition of this book. In drafting this third edition, I made some
necessary changes, both to the text and to the code. First, all the Python code has been ported to 3.8 and
greater, and all references to Python 2.x versions have been dropped. Some chapters required a total
rewrite because the content was no longer compatible. I'm referring to TensorFlow 2.x which, compared
to TensorFlow 1.x (covered in the previous edition), has completely revamped its entire reference system.
In five years, the deep learning modules and code developed with version 1.x have proven completely
unusable. Keras and all its modules have been incorporated into the TensorFlow library, replacing all the
classes, functions, and modules that performed similar functions. The construction of neural network
models, their learning phases, and the functions they use have all completely changed. In this edition,
therefore, you have the opportunity to learn the methods of TensorFlow 2.x and to become familiar with
the concepts and new paradigms of the new version.
Regarding data visualization, I decided to add information about the Seaborn library to the matplotlib
chapter. Seaborn, although still in version 0.x, is proving to be a very useful matplotlib extension for data
analysis, thanks to its statistical display of plots and its compatibility with pandas dataframes. I hope that,
with this completely updated third edition, I can further entice you to study and deepen your data analysis
with Python. This book will be a valuable learning tool for you now, and serve as a dependable reference in
the future.
—Fabio Nelli

CHAPTER 1

An Introduction to Data Analysis

In this chapter, you’ll take your first steps in the world of data analysis, learning in detail the concepts and
processes that make up this discipline. The concepts discussed in this chapter are helpful background
for the following chapters, where these concepts and procedures are applied in the form of Python code,
through the use of several libraries that are discussed in later chapters.

Data Analysis
In a world increasingly centered on information technology, huge amounts of data are produced
and stored every day. Often these data come from automatic detection systems, sensors, and scientific
instrumentation, or you produce them yourself, often without realizing it, every time you withdraw money
from the bank, purchase something, write a blog post, or even post on social networks.
But what are data? Data, by themselves, are not information, at least not in their raw form. In a
formless stream of bytes, it is difficult at first glance to grasp their essence unless they are plainly
numbers, words, or dates. Information is instead the result of processing: starting from a given dataset, it
extracts conclusions that can be used in various ways. This process of extracting information
from raw data is called data analysis.
The purpose of data analysis is to extract information that is not easily deducible but, when understood,
enables you to carry out studies on the mechanisms of the systems that produced the data. This in turn
allows you to forecast possible responses of these systems and their evolution in time.
Starting as a simple, methodical approach to handling data, data analysis has become a true
discipline, leading to the development of methodologies that generate models. A model is in fact
a translation of the system under study into mathematical form. Once there is a mathematical or logical form that can
describe system responses at different levels of precision, you can predict its development or its response
to certain inputs. Thus, the aim of data analysis is not the model itself, but the quality of its predictive power.
The predictive power of a model depends not only on the quality of the modeling techniques but also
on the ability to choose a good dataset upon which to build the entire analysis process. So the search for
data, their extraction, and their subsequent preparation, while representing preliminary activities of an
analysis, also belong to data analysis itself, because of their importance in the success of the results.
So far I have spoken of data, their handling, and their processing through calculation procedures. In
parallel to all the stages of data analysis processing, various methods of data visualization have also been
developed. In fact, to understand the data, both individually and in terms of the role they play in the dataset,
there is no better system than to develop the techniques of graphical representation. These techniques are
capable of transforming information, sometimes implicitly hidden, into figures, which help you more easily
understand the meaning of the data. Over the years, many modes of graphical representation, called charts,
have been developed for different kinds of data.

© Fabio Nelli 2023


F. Nelli, Python Data Analytics, https://doi.org/10.1007/978-1-4842-9532-8_1
Chapter 1 ■ An Introduction to Data Analysis

At the end of the data analysis process, you have a model and a set of graphical displays and you can
predict the responses of the system under study; after that, you move to the test phase. The model is tested
using another set of data for which the system response is known; these data were not used to build the
predictive model. Depending on the ability of the model to replicate the real, observed responses, you
obtain an error estimate and learn the validity of the model and its operating limits.
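The train/test idea described here can be sketched with scikit-learn, one of the libraries covered later in this book. The dataset and model below are invented purely for illustration: part of the data builds the model, and the held-out part measures its predictive power.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic dataset: y depends (almost) linearly on x, plus noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * x[:, 0] + 1.0 + rng.normal(0, 0.5, size=100)

# Hold out 25% of the data; these points do NOT define the model
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0)

model = LinearRegression().fit(x_train, y_train)

# Evaluating on the held-out set gives an estimate of the model's validity
r2 = model.score(x_test, y_test)
print(round(r2, 2))  # close to 1.0 for a good fit
```

The score on the test set, not on the training set, is what tells you how well the model generalizes.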
These results can be compared to any other models to understand if the newly created one is
more efficient than the existing ones. Once you have assessed that, you can move to the last phase of
data analysis—deployment. This phase consists of implementing the results produced by the analysis,
namely, implementing the decisions to be made based on the predictions generated by the model and its
associated risks.
Data analysis is well suited to many professional activities. So, knowledge of it and how it can be put
into practice is relevant. It allows you to test hypotheses and understand the systems you’ve analyzed
more deeply.

Knowledge Domains of the Data Analyst


Data analysis is basically a discipline suitable to the study of problems that occur in several fields of
applications. Moreover, data analysis includes many tools and methodologies and requires knowledge of
computing, mathematical, and statistical concepts.
A good data analyst must be able to move and act in many disciplinary areas. Many of these disciplines
are the basis of the data analysis methods, and proficiency in them is almost necessary. Knowledge of other
disciplines is necessary, depending on the area of application and the particular data analysis project. More
generally, sufficient experience in these areas can help you better understand the issues and the type of data
you need.
For major data analysis problems, it is often necessary to have an interdisciplinary team of
experts, each contributing in the best possible way from their respective field of competence. For
smaller problems, a good analyst must be able to recognize the issues that arise during data analysis,
determine which disciplines and skills are needed to solve them, study those disciplines, and
perhaps even consult the most knowledgeable people in the sector. In short, the analyst must be able to search not
only for data, but also for information on how to treat that data.

Computer Science
Knowledge of computer science is a basic requirement for any data analyst. Only with
good knowledge of and experience in computer science can you efficiently manage the necessary tools for
data analysis. Indeed, every step of data analysis involves using calculation software (such as IDL,
MATLAB, etc.) and programming languages (such as C++, Java, and Python).
The large amount of data available today, thanks to information technology, requires specific skills in
order to be managed as efficiently as possible. Data research and extraction require familiarity with the
various formats in which data are structured and stored, in files or database tables.
XML, JSON, or simply XLS or CSV files are now the common formats for storing and collecting data, and
many applications allow you to read and manage the data stored in them. Extracting data
contained in a database is not so immediate: you need to know the SQL query language or use
software specially developed to extract data from a given database.
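As a small taste of reading one of these formats, here is a CSV fragment loaded into a tabular structure with pandas, one of the libraries covered later in this book. The data are invented, and the fragment is inlined via `StringIO` so the example is self-contained; normally you would pass a filename to `read_csv`.

```python
import io
import pandas as pd

# A small CSV fragment, as it might arrive from a sensor log
csv_data = io.StringIO(
    "sensor,value,timestamp\n"
    "t1,21.5,2023-01-01\n"
    "t2,19.8,2023-01-01\n"
    "t1,22.1,2023-01-02\n"
)

# read_csv parses the text into rows and typed columns
df = pd.read_csv(csv_data, parse_dates=["timestamp"])
print(df.shape)            # (3, 3): three rows, three columns
print(df["value"].mean())  # average of the measured values
```

The same dataframe interface works for JSON (`read_json`), Excel (`read_excel`), and SQL queries (`read_sql`), which is why pandas is the workhorse of the data extraction step.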
Moreover, for some specific types of data research, the data are not available in an explicit format, but
are present in text files (documents and log files) or web pages, or shown as charts, measures, numbers of
visitors, or HTML tables. This requires specific technical expertise to parse and ultimately extract these data
(a process called web scraping).
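As a minimal sketch of the idea behind web scraping, Python's standard library can already pull the cells out of an HTML table embedded in a page. Real pages usually call for a dedicated library such as Beautiful Soup; the HTML fragment here is invented.

```python
from html.parser import HTMLParser

# A minimal parser that collects the text of <td> cells from an
# HTML fragment: the essence of scraping data published as tables
class CellExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.cells.append(data.strip())

html = "<table><tr><td>Rome</td><td>2873000</td></tr></table>"
parser = CellExtractor()
parser.feed(html)
print(parser.cells)  # ['Rome', '2873000']
```

Once the cells are extracted as strings, they can be converted to numbers and fed into the same analysis pipeline as data from any other source.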


Knowledge of information technology is necessary for using the various tools made available by
contemporary computer science, such as applications and programming languages. These tools, in turn, are
needed to perform data analysis and data visualization.
The purpose of this book is to provide all the necessary knowledge, as far as possible, regarding the
development of methodologies for data analysis. The book uses the Python programming language and
specialized libraries that contribute to the performance of the data analysis steps, from data research to data
mining, to publishing the results of the predictive model.

Mathematics and Statistics


As you will see throughout the book, data analysis requires a lot of complex math to treat and process the
data. You need to be competent in all of this, at least enough to understand what you are doing. Some
familiarity with the main statistical concepts is also necessary because the methods applied to the analysis
and interpretation of data are based on these concepts. Just as you can say that computer science gives you
the tools for data analysis, you can also say that statistics provide the concepts that form the basis of data
analysis.
This discipline provides many tools to the analyst, and a good knowledge of how to best use them
requires years of experience. Among the most commonly used statistical techniques in data analysis are
• Bayesian methods
• Regression
• Clustering
Having to deal with these cases, you’ll discover how mathematics and statistics are closely related.
Thanks to the special Python libraries covered in this book, you will be able to manage and handle them.
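As a first glimpse of how one of these statistical techniques translates into code, here is an ordinary least-squares linear regression done with NumPy alone; the numbers are invented for illustration.

```python
import numpy as np

# Fit y = a*x + b by least squares; the data lie roughly on y = 2x
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.0, 8.1, 9.9])

# Design matrix with a column of ones for the intercept
A = np.vstack([x, np.ones_like(x)]).T
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
print(round(a, 2), round(b, 2))  # slope ~1.98, intercept ~0.06
```

The same fit can be done in one line with higher-level libraries; spelling it out with `lstsq` makes the underlying linear algebra visible.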

Machine Learning and Artificial Intelligence


One of the most advanced tools in the data analysis camp is machine learning. Beyond
data visualization and techniques such as clustering and regression, which help you find information about
a dataset, during this phase of research you may often prefer to use special procedures that are highly
specialized in finding patterns within the dataset.
Machine learning is a discipline that uses a whole series of procedures and algorithms that analyze the
data in order to recognize patterns, clusters, or trends and then extracts useful information for analysis in an
automated way.
This discipline is increasingly becoming a fundamental tool of data analysis, and thus knowledge of it,
at least in general, is of fundamental importance to the data analyst.
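An automated pattern-finding procedure of this kind can be sketched with k-means clustering from scikit-learn, covered in Chapter 8. The points below are invented: two obvious groups that the algorithm recovers with no labels supplied.

```python
import numpy as np
from sklearn.cluster import KMeans

# Six points forming two well-separated groups
points = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                   [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]])

# Unsupervised: k-means is told only how many clusters to look for
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print(km.labels_)  # two groups of three points each
```

Which numeric label each group receives is arbitrary; what matters is that the first three points share one label and the last three share the other.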

Professional Fields of Application


Another very important point is the domain of competence of the data (their source: biology, physics, finance, materials testing, population statistics, etc.). In fact, although analysts have specialized preparation in the field of statistics, they must also be able to document the source of the data, with the aim of better understanding the mechanisms that generated the data. In fact, the data are not simple strings
or numbers; they are the expression, or rather the measure, of any parameter observed. Thus, a better
understanding of where the data came from can improve their interpretation. Often, however, this is too
costly for data analysts, even ones with the best intentions, and so it is good practice to find consultants or
key figures to whom you can pose the right questions.

Chapter 1 ■ An Introduction to Data Analysis

Understanding the Nature of the Data


The object of data analysis is basically the data. The data then will be the key player in all processes of data
analysis. The data constitute the raw material to be processed, and thanks to their processing and analysis,
it is possible to extract a variety of information in order to increase the level of knowledge of the system
under study.

When the Data Become Information


Data are the events recorded in the world. Anything that can be measured or categorized can be converted
into data. Once collected, these data can be studied and analyzed, both to understand the nature of events
and very often also to make predictions or at least to make informed decisions.

When the Information Becomes Knowledge


You can speak of knowledge when the information is converted into a set of rules that helps you better
understand certain mechanisms and therefore make predictions on the evolution of some events.

Types of Data
Data can be divided into two distinct categories:
• Categorical (nominal and ordinal)
• Numerical (discrete and continuous)
Categorical data are values or observations that can be divided into groups or categories. There are two
types of categorical values: nominal and ordinal. A nominal variable has no intrinsic order that is identified
in its category. An ordinal variable instead has a predetermined order.
Numerical data are values or observations that come from measurements. There are two types of
numerical values: discrete and continuous numbers. Discrete values can be counted and are distinct and
separated from each other. Continuous values, on the other hand, are values produced by measurements or
observations that assume any value within a defined range.
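These distinctions can be made explicit in pandas, the library covered later in this book. The following sketch uses invented column names and values purely for illustration:

```python
import pandas as pd

# Each column illustrates one of the four data types described above
df = pd.DataFrame({
    "blood_type": ["A", "B", "O"],          # categorical, nominal (no order)
    "severity": ["low", "high", "medium"],  # categorical, ordinal
    "visits": [3, 1, 7],                    # numerical, discrete (countable)
    "temp_c": [36.6, 38.2, 37.0],           # numerical, continuous (measured)
})

# Nominal: unordered categories
df["blood_type"] = df["blood_type"].astype("category")

# Ordinal: categories with a predetermined order
df["severity"] = pd.Categorical(
    df["severity"], categories=["low", "medium", "high"], ordered=True
)

print(df.dtypes)
```

Marking a column as an ordered categorical allows pandas to compare and sort its values according to the declared order rather than alphabetically.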

The Data Analysis Process


Data analysis can be described as a process consisting of several steps in which the raw data are transformed and processed in order to produce data visualizations and make predictions, thanks to a mathematical model based on the collected data. Data analysis is thus nothing more than a sequence of steps, each of which plays a key role in the subsequent ones. It can be schematized as a process chain consisting of the following sequence of stages:
• Problem definition
• Data extraction
• Data preparation - data cleaning
• Data preparation - data transformation
• Data exploration and visualization


• Predictive modeling
• Model validation/testing
• Visualization and interpretation of results
• Deployment of the solution (implementation of the solution in the real world)
Figure 1-1 shows a schematic representation of all the processes involved in data analysis.

Figure 1-1. The data analysis process

Problem Definition
The process of data analysis actually begins long before the collection of raw data. In fact, data analysis
always starts with a problem to be solved, which needs to be defined.
The problem can be defined only after you have focused on the system you want to study; this may be a mechanism, an application, or a process in general. Generally, the study aims to better understand the system's operation, and in particular to understand the principles of its behavior, in order to make predictions or informed choices.
The definition step and the corresponding documentation (deliverables) of the scientific or business problem are both very important in order to focus the entire analysis strictly on getting results. In fact, a comprehensive or exhaustive study of the system is sometimes complex, and you do not always have enough information to start with. So the definition of the problem, and especially its planning, can determine the guidelines for the whole project.


Once the problem has been defined and documented, you can move to the project planning stage of
data analysis. Planning is needed to understand which professionals and resources are necessary to meet
the requirements to carry out the project as efficiently as possible. You consider the issues involving the
resolution of the problem. You look for specialists in various areas of interest and install the software needed
to perform data analysis.
Also during the planning phase, you choose an effective team. Generally, these teams should be cross-disciplinary in order to solve the problem by looking at the data from different perspectives. So, building a good team is certainly one of the key factors leading to success in data analysis.

Data Extraction
Once the problem has been defined, the first step is to obtain the data in order to perform the analysis.
The data must be chosen with the basic purpose of building the predictive model, and so data selection is
crucial for the success of the analysis as well. The sample data collected must reflect as much as possible
the real world, that is, how the system responds to stimuli from the real world. For example, if you’re using
huge datasets of raw data and they are not collected competently, these may portray false or unbalanced
situations.
Thus, a poor choice of data, or performing analysis on a dataset that's not perfectly representative of the system, will lead to models that diverge from the system under study.
The search and retrieval of data often require a form of intuition that goes beyond mere technical
research and data extraction. This process also requires a careful understanding of the nature and form of
the data, which only good experience and knowledge in the problem’s application field can provide.
Regardless of the quality and quantity of data needed, another issue is using the best data sources.
If the study environment is a laboratory (technical or scientific) and the data generated are experimental, then the data source is easily identifiable. In this case, the problems will only concern the experimental setup.
But not every field of application allows data to be gathered in a strictly experimental way. Many fields require searching for data from the surrounding world, often relying on external experimental data, or even more often collecting them through interviews
world, often relying on external experimental data, or even more often collecting them through interviews
or surveys. So in these cases, finding a good data source that is able to provide all the information you need
for data analysis can be quite challenging. Often it is necessary to retrieve data from multiple data sources to
supplement any shortcomings, to identify any discrepancies, and to make the dataset as general as possible.
When you want to get the data, a good place to start is the web. But most of the data on the web can be
difficult to capture; in fact, not all data are available in a file or database, but might be content that is inside
HTML pages in many different formats. To this end, a methodology called web scraping allows the collection of data through the recognition of specific occurrences of HTML tags within web pages. There is software
specifically designed for this purpose, and once an occurrence is found, it extracts the desired data. Once the
search is complete, you will get a list of data ready to be subjected to data analysis.
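As a rough sketch of the idea, Python's built-in html.parser module can recognize occurrences of a specific HTML tag and extract the data it contains. The HTML snippet and the choice of a `<td class="price">` cell below are invented for illustration; real pages require inspecting their actual structure, and real scraping is usually done with dedicated libraries such as BeautifulSoup or Scrapy:

```python
from html.parser import HTMLParser

# Illustrative HTML fragment, standing in for a downloaded web page
PAGE = """
<table>
  <tr><td class="name">widget</td><td class="price">9.99</td></tr>
  <tr><td class="name">gadget</td><td class="price">19.50</td></tr>
</table>
"""

class PriceScraper(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # Recognize the specific occurrence of the tag we care about
        self.in_price = tag == "td" and ("class", "price") in attrs

    def handle_data(self, data):
        if self.in_price:
            self.prices.append(float(data))
            self.in_price = False

scraper = PriceScraper()
scraper.feed(PAGE)
print(scraper.prices)  # a list of data ready for analysis
```

The pattern is always the same: detect the tag occurrence, extract its content, and accumulate the results into a dataset.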

Data Preparation
Among all the steps involved in data analysis, data preparation, although seemingly less problematic, in
fact requires more resources and more time to be completed. Data are often collected from different data
sources, each of which has data in it with a different representation and format. So, all of these data have to
be prepared for the process of data analysis.
The preparation of the data is concerned with obtaining, cleaning, normalizing, and transforming
data into an optimized dataset, that is, in a prepared format that’s normally tabular and is suitable for the
methods of analysis that have been scheduled during the design phase.
Many potential problems can arise, including invalid, ambiguous, or missing values, replicated fields,
and out-of-range data.
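As a minimal illustration of these problems, the pandas sketch below (the sensor readings and the valid temperature range are invented) removes a replicated row and then filters out missing and out-of-range values:

```python
import pandas as pd

# Hypothetical raw data with typical defects: a missing value,
# a replicated row, and an out-of-range measurement
raw = pd.DataFrame({
    "sensor": ["A", "B", "B", "C"],
    "temp_c": [21.5, None, None, 999.0],
})

clean = raw.drop_duplicates()  # drop replicated rows
# Keep only plausible measurements; missing (NaN) values are dropped
# as well, since between() evaluates to False for them
clean = clean[clean["temp_c"].between(-50, 60)]
print(clean)
```

Real data preparation usually also involves normalizing formats and transforming the data into the tabular layout expected by the analysis methods.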


Data Exploration/Visualization
Exploring the data involves essentially searching the data in a graphical or statistical presentation in order
to find patterns, connections, and relationships. Data visualization is the best tool to highlight possible
patterns.
In recent years, data visualization has been developed to such an extent that it has become a real
discipline in itself. In fact, numerous technologies are utilized exclusively to display data, and many display
types are applied to extract the best possible information from a dataset.
Data exploration consists of a preliminary examination of the data, which is important for understanding the type of information that has been collected and what it means. In combination with the information acquired during the problem definition step, this categorization determines which method of data analysis is most suitable for arriving at a model definition.
Generally, this phase, in addition to a detailed study of charts through the visualization data, may
consist of one or more of the following activities:
• Summarizing data
• Grouping data
• Exploring the relationship between the various attributes
• Identifying patterns and trends
Generally, data analysis requires making summary statements regarding the data under study. Summarization is a process by which the data are reduced for interpretation without sacrificing important information.
Clustering is a method of data analysis that is used to find groups united by common attributes (also
called grouping).
Another important step of the analysis focuses on the identification of relationships, trends, and anomalies in the data. In order to find this kind of information, you often have to resort to other tools and perform another round of data analysis, this time on the data visualization itself.
Other methods of data mining, such as decision trees and association rules, automatically extract
important facts or rules from the data. These approaches can be used in parallel with data visualization to
uncover relationships between the data.
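The first three exploration activities listed above can be sketched in a few lines of pandas; the data here are invented:

```python
import pandas as pd

df = pd.DataFrame({
    "group": ["x", "x", "y", "y"],
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.1, 5.9, 8.0],
})

print(df.describe())                    # summarizing data
print(df.groupby("group")["a"].mean())  # grouping data
print(df["a"].corr(df["b"]))            # relationship between attributes
```

A correlation close to 1 between two columns, for example, is exactly the kind of pattern that this preliminary phase is designed to surface.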

Predictive Modeling
Predictive modeling is a process used in data analysis to create or choose a suitable statistical model to
predict the probability of a result.
After exploring the data, you have all the information needed to develop the mathematical model that
encodes the relationship between the data. These models are useful for understanding the system under
study, and in a specific way they are used for two main purposes. The first is to make predictions about the
data values produced by the system; in this case, you will be dealing with regression models if the result is
numeric or with classification models if the result is categorical. The second purpose is to classify newly produced data, and in this case, you will be using classification models if the results are identified by classes or clustering models if the results are identified by segmentation. In fact, it is possible to divide the models
according to the type of result they produce:
• Classification models: If the result obtained by the model type is categorical.
• Regression models: If the result obtained by the model type is numeric.
• Clustering models: If the result obtained by the model type is a segmentation.


Simple methods to generate these models include techniques such as linear regression, logistic
regression, classification and regression trees, and k-nearest neighbors. But the methods of analysis are
numerous, and each has specific characteristics that make it excellent for some types of data and analysis.
Each of these methods will produce a specific model, and then their choice is relevant to the nature of the
product model.
Some of these models will provide values corresponding to the real system and, thanks to their structure, will explain some characteristics of the system under study in a simple and clear way. Other models will continue to give good predictions, but their structure will be no more than a "black box" with limited ability to explain characteristics of the system.
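Of the simple methods just mentioned, k-nearest neighbors is compact enough to sketch from scratch. The training points and labels below are toy values; in practice you would use a library such as scikit-learn:

```python
import math

# Toy training set: (feature vector, class label) pairs
train = [
    ((1.0, 1.0), "small"),
    ((1.2, 0.8), "small"),
    ((8.0, 9.0), "large"),
    ((9.0, 8.5), "large"),
]

def predict(point):
    # 1-nearest neighbor: classify by the label of the closest sample
    nearest = min(train, key=lambda sample: math.dist(point, sample[0]))
    return nearest[1]

print(predict((1.1, 0.9)))  # → small
print(predict((8.5, 9.2)))  # → large
```

This is a classification model in the sense defined above: the result it produces is categorical.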

Model Validation
Validation of the model, that is, the test phase, is an important step that allows you to validate the model built on the basis of the starting data. It is important because it allows you to assess the validity of the data produced by the model by comparing them directly with the actual system. This time, however, the comparison is made against data drawn from the same set of starting data on which the entire analysis was established.
Generally, you refer to the data as the training set when you are using them to build the model, and as
the validation set when you are using them to validate the model.
Thus, by comparing the data produced by the model with those produced by the system, you can
evaluate the error, and using different test datasets, you can estimate the limits of validity of the generated
model. In fact the correctly predicted values could be valid only within a certain range, or they could have
different levels of matching depending on the range of values taken into account.
This process allows you not only to numerically evaluate the effectiveness of the model but also to
compare it with any other existing models. There are several techniques in this regard; the most famous is cross-validation. This technique is based on the division of the training set into different parts. Each of these parts, in turn, is used as the validation set, while the others form the training set. In this iterative manner, you will have an increasingly refined model.
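The idea behind this division can be sketched in a few lines of plain Python; the data and the number of folds here are arbitrary:

```python
def k_fold_splits(data, k):
    """Yield (training, validation) pairs; each fold is used once
    as the validation set while the others form the training set."""
    folds = [data[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield training, validation

data = list(range(10))
for training, validation in k_fold_splits(data, k=5):
    # Here you would fit the model on `training` and measure its
    # error on `validation`, then average the error over the folds
    print(len(training), len(validation))
```

Averaging the error over all folds gives a more robust estimate of the model's limits of validity than a single train/validation split.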

Deployment
This is the final step of the analysis process, which aims to present the results, that is, the conclusions of the
analysis. In the deployment process of the business environment, the analysis is translated into a benefit
for the client who has commissioned it. In technical or scientific environments, it is translated into design
solutions or scientific publications. That is, the deployment basically consists of putting into practice the
results obtained from the data analysis.
There are several ways to deploy the results of data analysis or data mining. Normally, a data analyst’s
deployment consists of writing a report for management or for the customer who requested the analysis.
This document conceptually describes the results obtained from the analysis of data. The report should
be directed to the managers, who are then able to make decisions. Then, they will put into practice the
conclusions of the analysis.
In the documentation supplied by the analyst, each of these four topics is discussed in detail:
• Analysis results
• Decision deployment
• Risk analysis
• Measuring the business impact
When the results of the project include the generation of predictive models, these models can be
deployed as stand-alone applications or can be integrated into other software.


Quantitative and Qualitative Data Analysis


Data analysis is completely focused on data. Depending on the nature of the data, it is possible to make
some distinctions.
When the analyzed data have a strictly numerical or categorical structure, then you are talking about
quantitative analysis, but when you are dealing with values that are expressed through descriptions in
natural language, then you are talking about qualitative analysis.
Precisely because of the different nature of the data processed by the two types of analyses, you can
observe some differences between them.
Quantitative analysis has to do with data with a logical order or that can be categorized in some way.
This leads to the formation of structures within the data. The order, categorization, and structures in turn
provide more information and allow further processing of the data in a more mathematical way. This leads
to the generation of models that provide quantitative predictions, thus allowing the data analyst to draw
more objective conclusions.
Qualitative analysis instead has to do with data that generally do not have a structure, at least not one
that is evident, and their nature is neither numeric nor categorical. For example, data under qualitative study could include written text, visual, or audio data. This type of analysis must therefore be based on
methodologies, often ad hoc, to extract information that will generally lead to models capable of providing
qualitative predictions. That means the conclusions to which the data analyst can arrive may also include
subjective interpretations. On the other hand, qualitative analysis can explore more complex systems and
draw conclusions that are not possible using a strictly mathematical approach. Often this type of analysis
involves the study of systems that are not easily measurable, such as social phenomena or complex
structures.
Figure 1-2 shows the differences between the two types of analyses.

Figure 1-2. Quantitative and qualitative analyses

Open Data
In support of the growing demand for data, a huge number of data sources are now available on the Internet.
These data sources freely provide information to anyone in need, and they are called open data.


Here is a list of some open data available online covering different topics. You can find a more complete
list and details of the open data available online in Appendix B.
• Kaggle (www.kaggle.com/datasets) is a huge community of apprentices and expert
data scientists who provide a vast amount of datasets and code that they use for
their analyses. The extensive documentation and the introduction to every aspect
of machine learning are also excellent. They also hold interesting competitions
organized around the resolution of various problems.
• DataHub (datahub.io/search) is a community that makes a huge amount of
datasets freely available, along with tools for their command-line management. The
dataset topics cover various fields, ranging from the financial market, to population
statistics, to the prices of cryptocurrencies.
• NASA Earth Observations (https://neo.gsfc.nasa.gov/dataset_index.php/)
provides a wide range of datasets that contain data collected from global climate and
environmental observations.
• World Health Organization (www.who.int/data/collections) manages and
maintains a wide range of data collections related to global health and well-being.
• World Bank Open Data (https://data.worldbank.org/) provides a listing of
available World Bank datasets covering financial and banking data, development
indicators, and information on the World Bank’s lending projects from 1947 to the
present.
• Data.gov (https://data.gov) is intended to collect and provide access to the
U.S. government’s Open Data, a broad range of government information collected at
different levels (federal, state, local, and tribal).
• European Union Open Data Portal (https://data.europa.eu/en) collects and
makes publicly available a wide range of datasets concerning the public sector of the
European member states.
• Healthdata.gov (www.healthdata.gov/) provides data about health and health care
for doctors and researchers so they can carry out clinical studies and solve problems
regarding diseases, virus spread, and health practices, as well as improve the level of
global health.
• Google Trends Datastore (https://googletrends.github.io/data/) collects and
makes available the collected data divided by topic of the famous and very useful
Google Trends, which is used to carry out analyses on its own account.
Finally, recently Google has made available a search page dedicated to datasets,
where you can search for a topic and obtain a series of datasets (or even data
sources) that correspond as much as possible to what you are looking for. For
example, in Figure 1-3, you can see how, when researching the price of houses, a
series of datasets or data sources are suggested in real time.


Figure 1-3. Example of a search for a dataset regarding the prices of houses on Google Dataset Search

As an idea of the open data sources available online, you can look at the LOD cloud diagram (http://cas.lod-cloud.net), which displays the connections of the data link among several open data sources currently available on the network (see Figure 1-4). The diagram contains a series of circular elements corresponding
to the available data sources; their color corresponds to a specific topic of the data provided. The legend
indicates the topic-color correspondence. When you click an element on the diagram, you see a page
containing all the information about the selected data source and how to access it.


Figure 1-4. Linked open data cloud diagram 2023, by Max Schmachtenberg, Christian Bizer, Anja Jentzsch,
and Richard Cyganiak. http://cas.lod-cloud.net [CC-BY license]

Python and Data Analysis


The main argument of this book is to develop all the concepts of data analysis by treating them in terms of
Python. The Python programming language is widely used in scientific circles because of its large number of
libraries that provide a complete set of tools for analysis and data manipulation.


Compared to other programming languages generally used for data analysis, such as R and MATLAB,
Python not only provides a platform for processing data, but it also has features that make it unique
compared to other languages and specialized applications. The development of an ever-increasing number
of support libraries, the implementation of algorithms of more innovative methodologies, and the ability to
interface with other programming languages (C and Fortran) all make Python unique among its kind.
Furthermore, Python is not specialized only for data analysis; it also has many other applications, such as generic programming, scripting, interfacing to databases, and more recently web development, thanks to web frameworks like Django. So it is possible to develop data analysis projects that are compatible with a web server and can be integrated directly on the web.
For those who want to perform data analysis, Python, with all its packages, is considered the best choice
for the foreseeable future.

Conclusions
In this chapter, you learned what data analysis is and, more specifically, the various processes that comprise it. You have also begun to see the role that data play in building a prediction model and how their careful selection is at the basis of a careful and accurate data analysis.
In the next chapter, you will turn to Python itself and the tools it provides to perform data analysis.

CHAPTER 2

Introduction to the Python World

The Python language, and the world around it, is made up of interpreters, tools, editors, libraries, notebooks, and so on. This Python world has expanded greatly in recent years, taking on forms that developers who approach it for the first time can sometimes find complicated and somewhat misleading. Thus, as a newcomer, you might feel lost among so many choices, especially about where to start.
This chapter gives you an overview of the entire Python world. You’ll first gain an introduction to the
Python language and its unique characteristics. You’ll learn where to start, what an interpreter is, and how to
begin writing your first lines of code in Python before being presented with some new and more advanced
forms of interactive writing with respect to shells, such as IPython and the IPython Notebook.

Python—The Programming Language


The Python programming language was created by Guido van Rossum in 1991, starting from a previous language called ABC. This language can be characterized by a series of adjectives:
• Interpreted
• Portable
• Object-oriented
• Interactive
• Interfaced
• Open source
• Easy to understand and use
Python is an interpreted programming language, that is, it’s pseudo-compiled. Once you write the
code, you need an interpreter to run it. An interpreter is a program that is installed on each machine; it
interprets and runs the source code. Unlike with languages such as C, C++, and Java, there is no compile
time with Python.
Python is a highly portable programming language. The decision to use an interpreter as an interface
for reading and running code has a key advantage: portability. In fact, you can install an interpreter on any
platform (Linux, Windows, and Mac) and the Python code will not change. Because of this, Python is often
used with many small-form devices, such as the Raspberry Pi and other microcontrollers.

© Fabio Nelli 2023
F. Nelli, Python Data Analytics, https://doi.org/10.1007/978-1-4842-9532-8_2

Python is an object-oriented programming language. In fact, it allows you to specify classes of objects
and implement their inheritance. But unlike C++ and Java, there are no constructors or destructors. Python
also allows you to implement specific constructs in your code to manage exceptions. However, the structure
of the language is so flexible that it allows you to program with alternative approaches with respect to the object-oriented one. For example, you can use functional or vectorial approaches.
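The points above can be illustrated briefly: class inheritance with no explicit constructors or destructors required, an exception-handling construct, and a functional alternative to the object-oriented style. The class names and values are invented for the example:

```python
# Object-oriented approach: classes and inheritance
class Sensor:
    def read(self):
        raise NotImplementedError

class Thermometer(Sensor):  # Thermometer inherits from Sensor
    def read(self):
        return 21.5

# Exception handling: the base class refuses to be read directly
try:
    value = Sensor().read()
except NotImplementedError:
    value = Thermometer().read()

# Functional approach to the same kind of transformation
readings = [20.0, 21.5, 23.0]
fahrenheit = list(map(lambda c: c * 9 / 5 + 32, readings))
print(value, fahrenheit)
```

The same conversion could equally be written with a list comprehension or, later in this book, as a vectorized NumPy operation.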
Python is an interactive programming language. Thanks to the fact that Python uses an interpreter to
be executed, this language can take on very different aspects depending on the context in which it is used.
In fact, you can write long lines of code, similar to what you might do in languages like C++ or Java, and then
launch the program, or you can enter the command line at once and execute a command, immediately
getting the results. Then, depending on the results, you can decide what command to run next. This highly
interactive way to execute code makes the Python computing environment similar to MATLAB. This feature
of Python is one reason it’s popular with the scientific community.
Python is a programming language that can be interfaced. In fact, this programming language can be
interfaced with code written in other programming languages such as C/C++ and FORTRAN. This too was a winning choice. In fact, thanks to this aspect, Python can compensate for what is perhaps its only weak point: the speed of execution. The nature of Python, as a highly dynamic programming language, can sometimes lead to execution of programs up to 100 times slower than the corresponding static programs compiled with other languages. The solution to this kind of performance problem is to interface Python with the compiled code of other languages, using it as if it were Python's own.
Python is an open-source programming language. CPython, which is the reference implementation
of the Python language, is completely free and open source. Additionally, every module or library in the ecosystem is open source and their code is available online. Every month, an extensive developer community contributes improvements that make this language and all its libraries even richer and more efficient. CPython is
managed by the nonprofit Python Software Foundation, which was created in 2001 and has given itself the
task of promoting, protecting, and advancing the Python programming language.
Finally, Python is a simple language to use and learn. This aspect is perhaps the most important,
because it is the most direct aspect that a developer, even a novice, faces. The high intuitiveness and ease of
reading of Python code often leads to “sympathy” for this programming language, and consequently most
newcomers to programming choose to use it. However, its simplicity does not mean narrowness, since
Python is a language that is spreading in every field of computing. Furthermore, Python is doing all of this
very simply, in comparison to existing programming languages such as C++, Java, and FORTRAN, which by
their nature are very complex.

The Interpreter and the Execution Phases of the Code


Unlike programming languages such as Java or C, whose code must be compiled before being executed,
Python is a language that allows direct execution of instructions. In fact, it is possible to execute code written in Python in two ways. You can execute entire programs (.py files) by running the python command followed by the file name, or you can open a session through a special command console, characterized by a >>> prompt (running the python command with no arguments). In this console, you can enter one instruction at a time, obtaining the result immediately by executing it directly.
In both cases, you have the immediate execution of the inserted code, without having to go through
explicit compilation or other operations.
This direct execution operation can be schematized in four phases:
• Lexing or tokenization
• Parsing
• Compiling
• Interpreting


Lexing, or tokenization, is the initial phase in which the Python (human-readable) code is converted
into a sequence of logical entities, the so-called lexical tokens (see Figure 2-1).
Parsing is the next stage in which the syntax and grammar of the lexical tokens are checked by a parser,
which produces an abstract syntax tree (AST) as a result.
Compiling is the phase in which the compiler reads the AST and, based on this information, generates
the Python bytecode (.pyc or .pyo files), which contains very basic execution instructions. Although this
is a compilation phase, the generated bytecode is still platform-independent, which is very similar to what
happens in the Java language.
The last phase is interpreting, in which the generated bytecode is executed by a Python virtual
machine (PVM).

Figure 2-1. The steps performed by the Python interpreter

You can find good documentation on this process at www.ics.uci.edu/~pattis/ICS-31/lectures/tokens.pdf.
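These four phases can actually be observed from Python itself, using only standard library modules: tokenize for lexing, ast for parsing, compile for bytecode generation, and dis for inspecting it. The one-line source statement below is arbitrary:

```python
import ast
import dis
import io
import tokenize

source = "x = 1 + 2"

# 1. Lexing: the source becomes a stream of lexical tokens
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
print([t.string for t in tokens if t.string])

# 2. Parsing: the tokens become an abstract syntax tree (AST)
tree = ast.parse(source)
print(ast.dump(tree.body[0]))

# 3. Compiling: the AST becomes platform-independent bytecode
code = compile(tree, "<example>", "exec")
dis.dis(code)

# 4. Interpreting: the bytecode is executed by the Python virtual machine
exec(code)
```

Running this shows the token strings, the Assign node of the AST, and the bytecode instructions that the Python virtual machine finally executes.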
All these phases are performed by the interpreter, which in the case of Python is a fundamental component. When referring to the Python interpreter, this usually means the /usr/bin/python binary. In reality, there are currently several versions of this Python interpreter, each of which is profoundly different in its nature and specifications.

CPython
The standard Python interpreter is CPython, which was written in C. This makes it possible to use C-based libraries from Python. CPython is available on a variety of platforms, including ARM, iOS, and RISC. Despite this, CPython has been optimized for portability and other specifications, but not for speed.

Cython
The close relationship between CPython and the C language has been taken a step further by the Cython
project. Cython provides a compiler that translates Python code into C, which is then compiled and executed
at runtime. This approach makes it possible to introduce C semantics directly into Python code to make it
even more efficient. The result is a merging of the two languages: Cython can be considered a programming
language in its own right.
You can find documentation about it online. I advise you to visit cython.readthedocs.io/en/latest/.

Pyston
Pyston (www.pyston.org/) is a fork of the CPython interpreter that implements performance optimizations.
The project arose from the need for an interpreter that could, over time, replace CPython and remedy its
poor execution speed. Recent results seem to confirm these expectations, reporting a 30 percent performance
improvement in the case of large, real-world applications. Unfortunately, due to the lack of compatible
binary packages, Pyston packages have to be rebuilt after they are downloaded.

17
Chapter 2 ■ Introduction to the Python World

Jython
Parallel to CPython, there is a version built and compiled in Java, called Jython. It was created by Jim
Hugunin in 1997 (www.jython.org/). Jython is an implementation of the Python programming language in
Java; it is further characterized by using Java classes instead of Python modules to implement the extensions
and packages of Python.

IronPython
The .NET framework also makes it possible to execute Python code. For this
purpose, you can use the IronPython interpreter (https://ironpython.net/). This interpreter allows .NET
developers to develop Python programs on the Visual Studio platform, integrating perfectly with the other
development tools of the .NET platform.
Initially built by Jim Hugunin in 2006 with the release of version 1.0, the project was later supported by a
small team at Microsoft until version 2.7 in 2010. Since then, numerous other versions have been released up
to the current 3.4, all ported forward by a group of volunteers on Microsoft’s CodePlex repository.

PyPy
The PyPy interpreter is a JIT (just-in-time) compiler: it converts Python code directly to machine
code at runtime, a choice made to speed up the execution of Python. PyPy itself is implemented in
RPython, a smaller, restricted subset of Python. For more information on this, consult
the official website at www.pypy.org/.

RustPython
As the name suggests, RustPython (rustpython.github.io/) is a Python interpreter written in Rust. This
programming language is quite new but it is gaining popularity. RustPython is an interpreter like CPython
but can also be used as a JIT compiler. It also allows you to run Python code embedded in Rust programs
and compile the code into WebAssembly, so you can run Python code directly from web browsers.

Installing Python
In order to develop programs in Python, you have to install it on your operating system. Linux distributions
and macOS machines usually come with a preinstalled version of Python. If not, or if you want to replace that
version with another, you can easily install it. The process for installing Python differs from operating system
to operating system, but it is a rather simple operation.
On Debian-Ubuntu Linux systems, the first thing to do is to check whether Python is already installed
on your system and what version is currently in use.
Open a terminal (by pressing ALT+CTRL+T) and enter the following command:

python3 --version

If you get the version number as output, then Python is already present on the Ubuntu system. If you get
an error message, Python hasn’t been installed yet.
In that case, you can install it with the following command:

sudo apt install python3


If, on the other hand, the current version is old, you can update it with the latest version of your Linux
distribution by entering the following command:

sudo apt --only-upgrade install python3

Finally, if instead you want to install a specific version on your system, you have to explicitly indicate it
in the following way:

sudo apt install python3.10

On Red Hat and CentOS Linux systems working with rpm packages, run this command instead:

yum install python3

If you are running Windows or macOS, you can go to the official Python site (www.python.org) and
download the version you prefer. In this case, the packages are installed automatically.
However, today there are distributions that bundle a number of tools making the management and
installation of Python, its libraries, and associated applications easier. I strongly recommend you choose one
of the distributions available online.

Python Distributions
Due to the success of the Python programming language, many Python tools have been developed to meet
various functionalities over the years. There are so many that it’s virtually impossible to manage all of them
manually.
In this regard, many Python distributions efficiently manage hundreds of Python packages. Instead
of individually downloading the interpreter, which includes only the standard library, and then
separately installing all the additional libraries you need, it is much easier to install a Python distribution.
At the heart of these distributions are the package managers, which are nothing more than applications
that automatically manage, install, upgrade, configure, and remove Python packages that are part of the
distribution.
This functionality is very useful: the user simply makes a request regarding a particular package
(an installation, for example), and the package manager, usually via the Internet, performs
the operation, working out the required version and all dependencies on other packages, and
downloading anything that is not already present.

Anaconda
Anaconda is a free distribution of Python packages distributed by Continuum Analytics (www.anaconda.com).
This distribution supports the Linux, Windows, and macOS operating systems. Anaconda, in addition to
providing the latest packages released in the Python world, comes bundled with most of the tools you need
to set up a Python development environment.
Indeed, when you install the Anaconda distribution on your system, you can use many tools and
applications described in this chapter, without worrying about having to install and manage them
separately. The basic distribution includes Spyder, an IDE for developing complex Python programs;
Jupyter Notebook, a wonderful tool for working interactively with Python in a graphical and orderly way; and
Anaconda Navigator, a graphical panel for managing packages and virtual environments.


The management of the entire Anaconda distribution is performed by an application called conda. This
is the package manager and environment manager of the Anaconda distribution, and it handles all of the
packages and their versions. For example, installing a package is as simple as:

conda install <package name>

One of the most interesting aspects of this distribution is the ability to manage multiple development
environments, each with its own version of Python. With Anaconda, you can work independently with
several Python versions at the same time by creating separate virtual environments.
You can create, for instance, an environment based on Python 3.11 even if the current Python version is still
3.10 on your system. To do this, enter the following command in the console:

conda create -n py311 python=3.11 anaconda

This will generate a new Anaconda virtual environment with all the packages related to the Python
3.11 version. This installation will not affect the Python version installed on your system and won’t generate
any conflicts. When you no longer need the new virtual environment, you can simply uninstall it, leaving
the Python system installed on your operating system completely unchanged. Once it’s installed, you can
activate the new environment by entering the following command:

source activate py311

On Windows, use this command instead:

C:\Users\Fabio>activate py311
(py311) C:\Users\Fabio>

You can create as many environments as you want; you need only change the value passed
with the python option in the conda create command. When you want to return to working with the original
Python version, use the following command:

source deactivate

On Windows, use this command:

(py311) C:\Users\Fabio>deactivate
Deactivating environment "py311"...
C:\Users\Fabio>

Anaconda Navigator
Although the conda command underlies the Anaconda distribution's management of packages and
virtual environments, working through the command console is not always practical and efficient.
As you will see in the following chapters of the book, Anaconda also provides Anaconda Navigator,
a graphical tool that lets you manage virtual environments and their packages in a very
simplified way (see Figure 2-2).


Figure 2-2. Home panel of Anaconda Navigator

Anaconda Navigator is mainly composed of four panels:


• Home
• Environments
• Learning
• Community
Each of them is selectable through the list of buttons clearly visible on the left.
The Home panel presents all the Python (and also R) development applications installed (or available)
for a given virtual environment. By default, Anaconda Navigator shows the base operating system
environment, referred to as base (root) in the top-center drop-down menu (see Figure 2-2).
The second panel, called Environments, shows all the virtual environments created in the distribution
(see Figure 2-3). From there, it is possible to activate a virtual environment by clicking it directly.
Doing so displays all the packages installed (or available) in that virtual environment, with their respective versions.


Figure 2-3. Environments panel on Anaconda Navigator

From the Environments panel, it is also possible to create new virtual environments, selecting the base
Python version. Similarly, virtual environments can be deleted, cloned, backed up, or imported
using the menu shown in Figure 2-4.

Figure 2-4. Button menu for managing virtual environments in Anaconda Navigator

But that is not all. Anaconda Navigator is not only a useful application for managing Python
applications, virtual environments, and packages. In the third panel, called Learning (see Figure 2-5), it
provides links to the main sites of many useful Python libraries (including those covered in this book). By
clicking one of these links, you can access a lot of documentation. This is always useful to have on hand if
you program in Python on a daily basis.


Figure 2-5. Learning panel of Anaconda Navigator

The next panel, called Community, is organized identically. It too contains links, but this time to
forums of the main Python development and Data Analytics communities.
The Anaconda platform, with its multiple applications and Anaconda Navigator, gives developers a
simple, organized work environment well suited to developing Python code. It is no coincidence that
this platform has become almost a standard in the field.

Using Python
Python is rich, yet simple and very flexible. It allows you to expand your development activities into many
areas of work (data analysis, scientific computing, graphical interfaces, etc.). Precisely for this reason, Python can be used
in many different contexts, often according to the taste and ability of the developer. This section presents
the various approaches to using Python that appear in the course of the book. Depending on the topics discussed
in different chapters, the approach best suited to the task at hand will be used.

Python Shell
The easiest way to approach the Python world is to open a session in the Python shell, a terminal
running a command line, where you can enter one command at a time and test its operation immediately.
This mode makes clear the interpreted nature of Python: the interpreter reads
one command at a time, keeping the status of the variables specified in the previous lines, a behavior similar
to that of MATLAB and other calculation software.


This approach is helpful when you are approaching Python for the first time: you can test commands one at a time
without having to write, edit, and run an entire program, which could be composed of many lines of code.
This mode is also good for testing and debugging Python code line by line, or simply for making
calculations. To start a session on the terminal, simply type this on the command line:

C:\Users\nelli>python
Python 3.10 | packaged by Anaconda, Inc. | (main, Mar  1 2023, 18:18:21) [MSC v.1916 64 bit
(AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>

The Python shell is now active and the interpreter is ready to receive commands in Python. Start by
entering the simplest of commands, a classic for getting started with programming.

>>> print("Hello World!")


Hello World!

If you have the Anaconda platform available on your system, you can open a Python shell related to a
specific virtual environment you want to work on. In this case, from Anaconda Navigator, in the Home panel,
activate the virtual environment from the drop-down menu and click the Launch button of the CMD.exe
Prompt application, as shown in Figure 2-6.

Figure 2-6. CMD.exe Prompt application in Anaconda Navigator


A command console will open with the name of the active virtual environment prefixed in brackets in
the prompt. From there, you can run the python command to activate the Python shell.

(Edition3) C:\Users\nelli>python
Python 3.11.0 | packaged by Anaconda, Inc. | (main, Mar  1 2023, 18:18:21) [MSC v.1916 64
bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>

Run an Entire Program


The best way to become familiar with Python is to write an entire program and then run it from the terminal.
First write a program using a simple text editor. For example, you can use the code shown in Listing 2-1 and
save it as MyFirstProgram.py.

Listing 2-1. MyFirstProgram.py

myname = input("What is your name?\n")


print("Hi %s, I'm glad to say: Hello world!" %myname)

Now you’ve written your first program in Python, and you can run it directly from the command line by
calling the python command and then the name of the file containing the program code.

python MyFirstProgram.py

When you run the program, it asks for your name. Once you enter it, it says hello.

What is your name?


Fabio Nelli
Hi Fabio Nelli, I'm glad to say: Hello world!
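Incidentally, the % operator used in Listing 2-1 is the older string-formatting style; since Python 3.6, the same greeting is usually written with an f-string. A minimal equivalent (my own addition, not part of the original listing, with the name hardcoded in place of input() for brevity):

```python
# f-string equivalent of the print in Listing 2-1
myname = "Fabio Nelli"  # in the real program this value comes from input()
greeting = f"Hi {myname}, I'm glad to say: Hello world!"
print(greeting)
```

The expression inside the braces is evaluated and inserted directly into the string, with no separate format operator.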

Implement the Code Using an IDE


A more comprehensive approach than the previous ones is to use an IDE (an Integrated Development
Environment). These editors provide a work environment on which to develop your Python code. They are
rich in tools that make developers’ lives easier, especially when debugging. In the following sections, you see
in detail which IDEs are currently available.

Interact with Python


The last approach to using Python, and in my opinion perhaps the most innovative, is the interactive one.
In addition to the three previous approaches, it gives you the opportunity to
interact directly with the Python code.
In this regard, the Python world has been greatly enriched by the introduction of IPython. IPython is
a very powerful tool, designed specifically to meet the needs of interaction between the Python interpreter
and the developer, who under this approach takes on the role of analyst, engineer, or researcher. IPython and
its features are explained in more detail in a later section.


Writing Python Code


In the previous section, you saw how to write a simple program in which the string "Hello World" was
printed. Now in this section, you get a brief overview of the basics of the Python language.
This section is not intended to teach you to program in Python, or to illustrate syntax rules of the
programming language, but just to give you a quick overview of some basic principles of Python necessary to
continue with the topics covered in this book.
If you already know the Python language, you can safely skip this introductory section. Instead, if you
are not familiar with programming and you find it difficult to understand the topics, I highly recommend
that you visit online documentation, tutorials, and courses of various kinds.

Make Calculations
You have already seen that the print() function can print almost anything. Python, however, is not just
a printing tool; it is also a great calculator. Start a session in the Python shell and perform these
mathematical operations:

>>> 1 + 2
3
>>> (1.045 * 3)/4
0.78375
>>> 4 ** 2
16
>>> ((4 + 5j) * (2 + 3j))
(-7+22j)
>>> 4 < (2*3)
True

Python can work with many types of data, including complex numbers, and can evaluate conditions that
produce Boolean values. As you can see from these calculations, the Python interpreter directly returns the result of a
calculation without the need to use the print() function. The same applies to values contained in
variables: it's enough to enter the variable's name to see its contents.

>>> a = 12 * 3.4
>>> a
40.8

Import New Libraries and Functions


You saw that Python is characterized by the ability to extend its functionality by importing numerous
packages and modules. To import a module in its entirety, you have to use the import command.

>>> import math

In this way, all the functions contained in the math module are available in your Python session, so you
can call them directly. Thus, you have extended the standard set of functions available when you start a
Python session. These functions are called with the following expression:

library_name.function_name()


For example, you can now calculate the sine of the value contained in the variable a.

>>> math.sin(a)
0.040693257349864856

As you can see, the function is called along with the name of the library. Sometimes you might find the
following expression for declaring an import.

>>> from math import *

Even though this works properly, it should be avoided as a matter of good practice. Writing an import this way
brings in all of a module's functions without the code having to specify the library they belong to.

>>> sin(a)
0.040693257349864856

This form of import can lead to serious errors, especially when many libraries are imported. It is
not unlikely that different libraries contain functions with the same name, and importing all of them
would override any previously imported functions with those names. As a result, the program could
produce numerous errors or, worse, behave abnormally.
In practice, this style of import is best reserved for a limited number of functions, that is, functions
that are strictly necessary for the program, thus avoiding the import of an entire
library when it is unnecessary.

>>> from math import sin
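Another common middle ground, not covered above but worth knowing, is importing a module under a short alias. Calls stay brief while their origin remains explicit (a sketch of my own, not from the book's examples):

```python
import math as m  # alias: short name, but the namespace stays explicit

a = 12 * 3.4
print(m.sin(a))   # same result as math.sin(a)
print(m.pi)
```

Aliases avoid the name-collision risk of `from math import *` while saving keystrokes compared with writing the full module name each time.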

Data Structure
You saw in the previous examples how to use simple variables containing a single value. Python provides a
number of extremely useful data structures. These data structures can contain lots of data simultaneously
and sometimes even data of different types. The various data structures provided are defined differently
depending on how their data are structured internally.
• List
• Set
• Strings
• Tuples
• Dictionary
• Deque
• Heap
This is only a small part of all the data structures that can be made with Python. Among all these data
structures, the most commonly used are dictionaries and lists.
The dictionary type, also known as dict, is a data structure in which each value is associated
with a particular label, called a key. The data collected in a dictionary have no internal order; they are simply
key/value pairs.

>>> dict = {'name':'William', 'age':25, 'city':'London'}


If you want to access a specific value within the dictionary, you have to indicate the name of the
associated key.

>>> dict["name"]
'William'

If you want to iterate over the key/value pairs in a dictionary, you use the for-in construct together
with the items() function.

>>> for key, value in dict.items():


...    print(key,value)
...
name William
age 25
city London

The list type is a data structure that holds a number of objects in a precise order, forming a sequence
to which elements can be added and removed. Each item is marked with a number corresponding to its
position in the sequence, called the index.

>>> list = [1,2,3,4]


>>> list
[1, 2, 3, 4]

To access an individual element, it is sufficient to specify its index in square brackets (the
first item in the list has index 0). To extract a portion of the list (a slice), it is sufficient
to specify a range with the indices i and j corresponding to its extremes.

>>> list[2]
3
>>> list[1:3]
[2, 3]

Negative indices count from the end of the list instead, so -1 refers to the last item, -2 to the
second-to-last, and so on.

>>> list[-1]
4

To scan the elements of a list, you can use the for-in construct.

>>> items = [1,2,3,4,5]


>>> for item in items:
...        print(item + 1)
...
2
3
4
5
6
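The examples above cover dictionaries and lists, the two most common structures. As a brief sketch (my own addition, not from the original text), here is how the other structures named in the list earlier look in code:

```python
from collections import deque
import heapq

s = "Hello"                 # string: immutable sequence of characters
t = (1, 2, 3)               # tuple: immutable ordered sequence
st = {1, 2, 2, 3}           # set: unordered, duplicates removed automatically
dq = deque([1, 2, 3])       # deque: fast appends and pops at both ends
dq.appendleft(0)

h = [5, 1, 4, 2]
heapq.heapify(h)            # heap: list reordered so h[0] is always smallest
smallest = heapq.heappop(h)

print(s[0], t[0], st, dq, smallest)
```

Tuples and strings support the same indexing and slicing shown for lists; deque and heapq come from the standard library rather than being built-in literals.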

28
Other documents randomly have
different content
reminds us of the doings of Justice, when she did act, in the reign of
James VI. Hunter and Strachan, a notary, were hanged on the 18th of
February, ‘as an example to the terror of others,’ says Fountainhall.
Three other persons, including a notary, were glad to save
themselves from a trial, by voluntary banishment. ‘Some moved that
they might be delivered to a captain of the recruits, to serve as
soldiers in Flanders; but the other method was judged more
legal.’[397]

The parish of Spott, in East Lothian, Dec. 30.


having no communion-cups of its own, was
accustomed to borrow those of the neighbouring parish of Stenton,
when required. The Stenton kirk-session latterly tired of this
benevolence, and resolved to charge half-a-crown each time their
cups were borrowed by Spott. Spott then felt a little ashamed of its
deficiency of communion-cups, and resolved to provide itself with a
pair. Towards the sum required, the minister was directed to take all
the foreign coin now in the box, as it was to be no longer current, and
such further sum as might be necessary.
The parish is soon after found sanctioning the account of Thomas
Kerr, an Edinburgh goldsmith, for ‘ane pair of communion-cups,
weighing 33 oz. 6 drops, at £3, 16s. per oz.,’ 1707.
being £126, 12s. in all, Scots money, besides
‘two shillings sterling of drink-money given to the goldsmith’s
men.’[398]

The Union produced some immediate 1708.


effects of a remarkable nature on the
industry and traffic of Scotland—not all of them good, it must be
owned, but this solely by reason of the erroneous laws in respect of
trade which existed in England, and to which Scotland was obliged to
conform.
Scotland had immediately to cease importing wines, brandy, and
all things produced by France; with no remeed but what was
supplied by the smuggler. This was one branch of her public or
ostensible commerce now entirely destroyed. She had also, in
conformity with England, to cease exporting her wool. This, however,
was an evil not wholly unalleviated, as will presently be seen.
Before this time, as admitted by Defoe, the Scotch people had
‘begun to come to some perfection in making broad cloths, druggets,
and [woollen] stuffs of all sorts.’ Now that there was no longer a
prohibition of English goods of the same kinds, these began to come
in in such great quantity, and at such prices, as at once extinguished
the superior woollen manufacture in Scotland. There remained the
manufacture of coarse cloths, as Stirling serges, Musselburgh stuffs,
and the like; and this now rather flourished, partly because the wool,
being forbidden to be sent abroad, could be had at a lower price, and
partly because these goods came into demand in England. Of course,
the people at large were injured by not getting the best price for their
wool, and benefited by getting the finer English woollen goods at a
cheaper rate than they had formerly paid for their own manufactures
of the same kinds; but no one saw such matters in such a light at that
time. The object everywhere held in view was to benefit trade—that
is, everybody’s peculium, as distinguished from the general good.
The general good was left to see after itself, after everybody’s
peculium had been served; and small enough were the crumbs
usually left to it.
On the other hand, duties being taken off Scottish linen introduced
into England, there was immediately a large increase to that branch
of the national industry. Englishmen came down and established
works for sail-cloth, for damasks, and other linen articles heretofore
hardly known in the north; and thus it was 1708.
remarked there was as much employment
for the poor as in the best days of the woollen manufacture.
The colonial trade being now, moreover, open to Scottish
enterprise, there was an immediate stimulus to the building of ships
for that market. Cargoes of Scottish goods went out in great quantity,
in exchange for colonial products brought in. According to Defoe,
‘several ships were laden for Virginia and Barbadoes the very first
year after the Union.’[399]
We get a striking idea of the small scale on which the earlier
commercial efforts were conducted, from a fact noted by Wodrow, as
to a loss made by the Glasgow merchants in the autumn of 1709. ‘In
the beginning of this month [November],’ says he, ‘Borrowstounness
and Glasgow have suffered very much by the fleet going to Holland,
its being taken by the French. It’s said that in all there is about eighty
thousand pounds sterling lost there, whereof Glasgow has lost ten
thousand pounds. I wish trading persons may see the language of
such a providence. I am sure the Lord is remarkably frowning upon
our trade, in more respects than one, since it was put in the room of
religion, in the late alteration of our constitution.’[400]
When one thinks of the present superb wealth and commercial
distinction of the Queen of the West, it is impossible to withhold a
smile at Wodrow’s remarks on its loss of ten thousand pounds. Yet
the fact is, that up to this time Glasgow had but a petty trade, chiefly
in sugar, herrings, and coarse woollen wares. Its tobacco-trade, the
origin of its grandeur, is understood to date only from 1707, and it
was not till 1718 that Glasgow sent any vessel belonging to itself
across the Atlantic. Sir John Dalrymple, writing shortly before 1788,
says: ‘I once asked the late Provost Cochrane of Glasgow, who was
eminently wise, and who has been a merchant there for seventy
years, to what causes he imputed the sudden rise of Glasgow. He said
it was all owing to four young men of talents and spirit, who started
at one time in business, and whose success gave example to the rest.
The four had not ten thousand pounds amongst them when they
began.’[401]
Defoe tells us that, within little more than 1708.
a year after the Union, Scotland felt the
benefit of the liberation of her commerce in one article to a most
remarkable extent. In that time, she sent 170,000 bolls of grain into
England, besides a large quantity which English merchants bought
up and shipped directly off for Portugal. The hardy little cattle of her
pastures, which before the Union had been sent in large droves into
England, being doubtless the principal article represented in the two
hundred thousand pounds which Scotland was ascertained to obtain
annually from her English customers, were now transmitted in still
larger numbers, insomuch that men of birth and figure went into the
trade. Even a Highland gentleman would think it not beneath him to
engage in so lucrative a traffic, however much in his soul he might
despise the Saxons whose gluttony he considered himself as
gratifying. It has often been told that the Honourable Patrick Ogilvie,
whom the reader has already seen engaged in a different career of
activity, took up the cattle-trade, and was soon after remonstrated
with by his brother, the Earl of Seafield, who, as Chancellor of
Scotland, had been deeply concerned in bringing about the Union.
The worthy scion of nobility drily remarked in answer: ‘Better sell
nowte than sell nations.’[402]
A sketch given of a cattle-fair at Crieff in 1723 by an intelligent
traveller, shews that the trade continued to prosper. ‘There were,’
says he, ‘at least thirty thousand cattle sold there, most of them to
English drovers, who paid down above thirty thousand guineas in
ready money to the Highlanders; a sum they had never before seen.
The Highland gentlemen were mighty civil, dressed in their slashed
waistcoats, a trousing (which is, breeches and stockings of one piece
of striped stuff), with a plaid for a cloak, and a blue bonnet. They
have a poniard knife and fork in one sheath, hanging at one side of
their belt, their pistol at the other, and their snuff-mill before; with a
great broadsword by their side. Their attendance was very numerous,
all in belted plaids, girt like women’s petticoats down to the knee;
their thighs and half of the leg all bare. They had also each their
broadsword and poniard, and spake all Irish, an unintelligible
language to the English. However, these poor creatures hired
themselves out for a shilling a day, to drive the cattle to England, and
to return home at their own charge.’[403]
Previous to the Union, the Customs and 1708. May 1.
Excise of Scotland were farmed respectively
at £30,000 and £35,000 per annum,[404] which, after every
allowance is made for smuggling, must be admitted as indicative of a
very restricted commercial system, and a simple and meagre style of
living on the part of the people. At the Union, the British government
took the Customs and Excise of Scotland into its own hands, placing
them severally under commissions, partly composed of Englishmen,
and also sending English officers of experience down to Scotland, to
assist in establishing proper arrangements for collection. We learn
from Defoe that all these new fiscal arrangements were unpopular.
The anti-union spirit delighted in proclaiming them as the outward
symptoms of that English tyranny to which poor Scotland had been
sold. Smuggling naturally flourished, for it became patriotic to cheat
the English revenue-officers. The people not only assisted and
screened the contrabandist, but if his goods chanced to be captured,
they rose in arms to rescue them. Owing to the close of the French
trade, the receiving of brandy became a favourite and flourishing
business. It was alleged that, when a Dutch fleet approached the
Scottish shores some months after the Union, several thousands of
small casks of that liquor were put ashore, with hardly any effort at
concealment.
Assuming the Excise as a tolerably fair index to the power of a
people to indulge in what they feel as comforts and luxuries, the
progress of this branch of the public revenue may be esteemed as a
history of wealth in Scotland during the remarkable period following
upon the Union. The summations it gives us are certainly of a kind
such as no Scotsman of the reign of Queen Anne, adverse or friendly
to the incorporation of the two countries, could have dreamed of.
The items in the account of the first year ending at May 1, 1708, are
limited to four—namely, for beer, ale, and vinegar, £43,653; spirits,
£901; mum,[405] £50; fines and forfeitures, £58; giving—when £6350
for salaries, and some other deductions, were allowed for—a net total
of £34,898, as a contribution to the revenue of the country.
The totals, during the next eleven years, go on thus: £41,096,
£37,998, £46,795, £51,609, £61,747, £46,979, £44,488, £45,285—this
refers to the year of the Rebellion—£48,813, £46,649, £50,377. On this last
sum the charges of management amounted to £15,400. After this,
the total net produce of the Excise, exclusive of malt, never again
came up to fifty thousand pounds, till the year 1749. The malt tax,
which was first imposed in 1725, then amounted to £22,627, making
the entire Excise revenue of Scotland in the middle of the eighteenth
century no more than £75,987. It is to be feared that increase of
dexterity and activity in the smuggler had some concern in keeping
down these returns at so low an amount;[406] yet when large
allowance is made on that score, we are still left to conclude that the
means of purchasing luxuries remained amongst our people at a very
humble point.
I am informed by a gentleman long connected with the Excise
Board in Scotland, that the books exhibited many curious indications
of the simplicity, as well as restrictedness, of all monetary affairs as
relating to our country in the reigns of Anne and the first George.
According to a recital which he has been kind enough to
communicate in writing, ‘The remittances were for the most part
made in coin, and various entries in the Excise accounts shew that
what were called broad pieces frequently formed a part of the
moneys sent. The commissioners were in the habit of availing
themselves of the opportunity of persons of rank travelling to
London, to make them the bearers of the money; and it is a curious
historical fact, that the first remittance out of the Excise duties,
amounting to £20,000, was sent by the Earl of Leven, who delivered
£19,000 of the amount at the proper office in London, retaining the
other thousand pounds for his trouble and risk in the service. As the
Board in Scotland could only produce to their comptroller a voucher
for the sum actually delivered in London, he could not allow them
credit for more. The £1000 was therefore placed “insuper” upon the
accounts, and so remained for several years; until at last a warrant
was issued by the Treasury, authorising the sum to be passed to the
credit of the commissioners.’
After the middle of the century, the progress is such as to shew
that, whether by the removal of repressive influences, or the
imparting of some fresh spring of energy, the means of the people
were at length undergoing a rapid increase. In 1761, including part of
the first year of George III., the net total Excise revenue had
sprung up to £100,985.
It included taxes on glass (£1151), candles (£6107), leather (£8245),
soap and paper (£2992), and wheel-carriages (£2308). The total had,
however, receded fully fourteen thousand pounds by 1775. After that
time, war increased the rate of taxation, and we therefore need not
be surprised to find the Scottish Excise producing £200,432 in 1781.
In 1790, when Robert Burns honoured this branch of the revenue by
taking an office in it, it had reached but to the comparatively
insignificant sum of £331,117. In 1808, being the hundredth year of
its existence, it yielded £1,793,430, being rather more than fifty-one
times its produce during the first year.[407]

June 1.
The Duke of Argyle resigning his place as an extraordinary Lord of
Session, in order to follow his charge in the army, his younger
brother, the Earl of Ilay, succeeded him, though under twenty-five
years of age; not apparently that he might take part in the decisions
of the bench, but rather that he might be a learner there, it ‘being,’
says Fountainhall, ‘the best school for the nobility to learn that is
in Europe.’

June 26.
The election of a knight to represent Ross-shire in the British
parliament took place at Fortrose, under the presidency of the
sheriff, Hugh Rose, of Kilravock. There was much dissension in the
county, and the sheriff,
whose son was elected, had probably reasons of his own for
appointing the last day of the week for the ceremony. This, however,
having led to travelling on Sunday, was taken into consideration by
the synod some months later, as a breach of decorum on the part of
the sheriff, who consequently received a letter from one of their
number who had been appointed to administer their censure. It set
forth how, even if the meeting had been dissolved on the Saturday
evening, many could not have got home without breaking the fourth
commandment; but Kilravock had caused worse than this, for, by
making the meeting late in the day, he had ‘occasioned the affair to
be protracted till the Sabbath began more than to dawn [two
o’clock],’ and there had been ‘gross disorders,’ in consequence of late
drinking in taverns. ‘Some,’ says the document, ‘who were in your
own company, are said to have sung, shott, and danced in their
progress to the ferry, without any check or restraint, as if they
meant to spit in the face of all sacred and civil laws.’ The synod
had found it impossible to keep
silence and allow such miscarriages to remain unreproved.
It is to be feared that Kilravock was little benefited by their
censure, as he left the paper docketed in his repositories as ‘a comical
synodical rebuke.’[408]

Aug. 18.
That remarkable property of human nature—the anxiety everybody is
under that all other people should be virtuous—had worked itself out
in sundry
famous acts of parliament, general assembly, and town-council,
throughout our history subsequent to the Reformation. There was an
act of Queen Mary against adultery, and several of Charles II. against
profaneness, drunkenness, and other impurities of life. There was
not one of William and Mary for the enforcement of the fifth
commandment; but the general principle operated in their reign very
conspicuously nevertheless, particularly in regard to profaneness and
profanation of the Lord’s Day. King William had also taken care in
1698 to issue a proclamation containing an abbreviate of all the acts
against immorality, and in which that of Charles II. against cursing
and beating of parents was certainly not overlooked, as neither were
those against adultery. So far had the anxiety for respectable conduct
in others gone in the present reign, that sheriffs and magistrates
were now enjoined by proclamation to hold courts, once a month at
least, for taking notice of vice and immorality, fining the guilty, and
rewarding informers; moreover, all naval and military officers were
ordered to exemplify the virtues for the sake of those under them,
and, above all, see that the latter duly submitted themselves to kirk
discipline.
An act of the town-council of Edinburgh ‘anent prophaneness,’ in
August 1693, threatened a rigorous execution of all the public
statutes regarding immoral conduct, such as swearing, sitting late in
taverns, and desecration of the Lord’s Day. It strictly prohibited all
persons within the city and suburbs ‘to brew, or to work any other
handiwork, on the Lord’s Day, or to be found on the streets, standing
or walking idly, or to go in company or vague to the Castlehill [the
only open space then within the city walls], public yards, or
fields.’ It discharged all
going to taverns on that day, unseasonably or unnecessarily, and
forbade ‘all persons to bring in water from the wells to houses in
greater quantities than single pints.’ By another act in 1699, tavern-
keepers were forbidden to have women for servants who had not
heretofore been of perfectly correct conduct. All these denunciations
were renewed in an act of February 1701, in which, moreover, there
was a severe threat against barbers who should shave or trim any one
on Sunday, and against all who should be found on that day carrying
periwigs, clothes, or other apparel through the streets.
Not long after this, the Edinburgh council took into their
consideration three great recent calamities—namely, the fire in the
Kirk-heugh in February 1700; another fire ‘which happened on the
north side of the Land market, about mid-day upon the 28th of
October 1701, wherein several men, and women, and children were
consumed in the flames, and lost by the fall of ruinous walls;’ and
finally, ‘that most tremendous and terrible blowing up of gunpowder
in Leith, upon the 3d of July last;’ and, reflecting on these things as
tokens of God’s wrath, came to the resolution, ‘to be more watchful
over our hearts and ways than formerly, and each of us in our several
capacities to reprove vice with zeal and prudence, and promote the
execution of the laws for punishing the vicious.’
All originality is taken from a notorious parliamentary enactment
of our time by a council act of April 1704, wherein, after reference to
the great decay of virtue and piety, and an acknowledgment that ‘all
manner of scandals and immoralities do daily abound,’ it is ordered
that taverners, under strong penalties, shall shut at ten o’clock at
night, all persons harbouring there at a later hour to be likewise
punished.
Inordinate playing at cards and dice in taverns is instanced in a
council act of about the same period, as one of the most flagrant vices
of the time.
It is to be understood that the discipline of the church over the
morals of congregations was at the same time in full vigour, although
not now fortified by a power of excommunication, inferring loss of
civil rights, as had been the case before the Revolution. Much was
done in this department by fines, proportioned to the quality of
offenders, and for the application of these to charitable uses there
was a lay-officer, styled the Kirk-treasurer, who naturally became a
very formidable person. The poems of Ramsay and others during the
earlier half of
the eighteenth century are full of waggish allusions to the terrible
powers of even the ‘man’ or servant of the Kirk-treasurer; and in a
parody of the younger Ramsay on the Integer Vitæ of Horace, this
personage is set forth as the analogue of the Sabine wolf:
‘For but last Monday, walking at noon-day,
Conning a ditty, to divert my Betty,
By me that sour Turk (I not frighted) our Kirk-
Treasurer’s man passed.

And sure more horrid monster in the Torrid
Zone cannot be found, sir, though for snakes renowned, sir;
Nor does Czar Peter’s empire boast such creatures,
Of bears the wet-nurse.’[409]
Burt, who, as an English stranger, viewed the moral police of
Scotland with a curious surprise, broadly asserts that the Kirk-
treasurer employed spies to track out and report upon private
individuals; so that ‘people lie at the mercy of villains who would
perhaps forswear themselves for sixpence.’ Sometimes, a brother and
sister, or a man and his wife, walking quietly together, would find
themselves under the observation of emissaries of the Kirk-treasurer.
Burt says he had known the town-guard in Edinburgh under arms
for a night besetting a house into which two persons had been seen
to enter. He at the same time remarks the extreme anxiety about
Sabbath observance. It seemed as if the Scotch recognised no other
virtue. ‘People would startle more at the humming or whistling of a
tune on a Sunday, than if anybody should tell them you had ruined a
family.’[410]
It must have been a great rejoicement to the gay people, when a
Kirk-treasurer—as we are told by Burt[411]—‘having a round sum of
money in his keeping, the property of the kirk, marched off with the
cash, and took his neighbour’s wife along with him to bear him
company and partake of the spoil.’
The very imperfect success of acts and statutes for improving the
habits of the people, is strongly hinted at by their frequent repetition
or renewal. We find it acknowledged by the Town Council of
Edinburgh, in June 1709, that the Lord’s Day is still ‘profaned by
people standing on the streets, and vaguing to fields and gardens,
and to the Castlehill; also by standing idle gazing out at windows,
and children, apprentices, and other servants playing on the
streets.’[412]

Nov. 22.
James Stirling of Keir, Archibald Seton of Touch, Archibald Stirling
of Carden,
Charles Stirling of Kippendavie, and Patrick Edmondstone of
Newton, were tried for high treason in Edinburgh, on the ground of
their having risen in arms in March last, in connection with the
French plan of invasion, and marched about for several days,
encouraging others to rise in like manner, and openly drinking the
health of the Pretender. Considering the openness of this treason, the
charges against the five gentlemen were remarkably ill supported by
evidence, the only witnesses being David Fenton, a tavern-keeper at
Dunkeld; John Macleran, ‘change-keeper’ at Bridge of Turk; and
Daniel Morison and Peter Wilson, two servants of the Laird of Keir.
These persons were all free to testify that the gentlemen carried
swords and pistols, which few people travelled without in that age;
but as for treasonable talk, or drinking of treasonable healths, their
memories were entirely blank. Wilson knew of no reason for Keir
leaving his own house but dread of being taken up on suspicion by
the soldiers in Stirling Castle. A verdict of Not Proven unavoidably
followed.[413]
It has been constantly remembered since in Keir’s family, that as
he was riding home after the trial, with his servant behind him—
probably Wilson—he turned about, and asked from mere curiosity,
how it came to pass that his friend had forgotten so much of what
passed at their parade for the Chevalier in March last, when the man
responded: ‘I ken very weel what you mean, laird; but my mind was
clear to trust my saul to the mercy o’ Heaven, rather than your
honour’s body to the mercy o’ the Whigs.’

Nov.
Sir James Hall of Dunglass was proprietor of a barony called Old
Cambus.
Within it was a ‘room’ or small piece of land belonging to Sir Patrick
Home of Renton, a member of a family of whose hotness of blood we
have already seen some evidences. To save a long roundabout, it had
been the custom for the tenants of the ‘room’ to drive peats from
Coldingham Muir through the Old Cambus grounds, but only on
sufferance, and when the corn was off the fields, nor even then
without a quart of ale
to make matters pleasant with Sir James’s tenants. Some dispute
having now arisen between the parties, the tenant of Headchester
forbade Sir Patrick Home’s people to pass through his farm any more
with their peats; and they, on the other hand, determined that they
should go by that short passage as usual. The winter stock of fuel
being now required, the time had come for making good their
assumed right. Mr John Home, eldest son of Sir Patrick,
accompanied the carts, with a few servants to assist in making way. A
collision took place, attended with much violence on both sides, but
with no exhibition of weapons that we hear of, excepting Mr John’s
sword, which, he alleged, he did not offer to draw till his horse had
been ‘beat in the face with a great rung [stick].’ The affair was
nevertheless productive of serious consequences, for a blacksmith
was trod to death, and several persons were hurt. Had it happened
eighty years earlier, there would have been both swords and pistols
used, and probably a dozen people would have been killed.
The justices of the peace for Berwickshire took up the matter, and
imposed a fine of fifty pounds upon Mr John Home, as the person
chiefly guilty of the riot. He appealed to the Court of Session, setting
forth several objections to the sentence. The Earl of Marchmont,
whose daughter had married Sir James Hall, and two other members
of the justice-court, ought to be held as disqualified by affinity to sit
in judgment in the case. To this it was answered, that Sir James was
not the complainer, and his lady was dead. Home then alleged a right
to the passage. It was shewn, on the other hand, that there never had
been a passage save by tolerance and on consideration of the quart of
ale; and though it had been otherwise, he ought to have applied to
the magistrates, and not taken the law into his own hands: ‘however
one enters into possession, though cast in with a sling-stone, yet he
must be turned out by order of law.’ The Lords would not hear of
reversing the award of the justices; but they reduced the fine to
thirty pounds.[414]

1709, Mar.
The family of the antiquary, Sir James Balfour, to whom we owe the
preservation
of so many historical manuscripts, appears to have been a very
unfortunate one. We have seen that his youngest son and successor,
Sir Robert, was slaughtered in the reign of Charles II. by M‘Gill of
Rankeillour.[415] The
head of a succeeding generation of the family, Sir Michael Balfour,
was a quiet country gentleman, with a wife and seven children,
residing at the semi-castellated old manor-house, which we now see
standing a melancholy ruin, in a pass through the Fife hills near
Newburgh. He appears to have had debts; but we do not anywhere
learn that they were of serious extent, and we hear of nothing else to
his disadvantage. One day in this month, Sir Michael rode forth at an
early hour ‘to visit some friends and for other business,’ attended by
a servant, whom, on his return home, he despatched on an errand to
Cupar, telling him he would be home before him. From that hour,
Denmill was never again seen. He was searched for in the
neighbourhood. Inquiries were made for him in the towns at a
distance. There were even advertisements inserted in London and
continental newspapers, offering rewards for any information that
might enable his friends to ascertain his fate. All in vain. ‘There were
many conjectures about him,’ says a contemporary judge of the Court
of Session, ‘for some have been known to retire and go abroad upon
melancholy and discontent; others have been said to be transported
and carried away by spirits; a third set have given out they were lost,
to cause their creditors compound, as the old Lord Belhaven was said
to be drowned in Solway Sands, and so of Kirkton, yet both of them
afterwards appeared. The most probable opinion was, that Denmill
and his horse had fallen under night into some deep coal-pit, though
these were also searched which lay in his way home.’ At the distance
of ten months from his disappearance, his wife applied to the Court
of Session, setting forth that her husband’s creditors were ‘falling
upon his estate, and beginning to use diligence,’ and she could not
but apprehend serious injury to the means of the family, though
these far exceeded the debts, unless a factor were appointed. We
learn that the court could better have interposed if the application
had come from the creditors; but, seeing ‘the case craved some pity
and compassion,’ they appointed a factor for a year, to manage the
estate for both creditors and relict, hoping that, before that time
elapsed, it would be ascertained whether Denmill were dead or
alive.[416]

The year passed, and many more years after it, without clearing up
the mystery. We find no trace of further legal proceedings regarding
the missing gentleman, his family, or property. The fact itself
remained green in the popular remembrance, particularly in the
district to
which Sir Michael belonged. In November 1724, the public curiosity
was tantalised by a story published on a broadside, entitled Murder
will Out, and professing to explain how the lost gentleman had met
his death. The narrative was said to proceed on the death-bed
confession of a woman who had, in her infancy, seen Sir Michael
murdered by her parents, his tenants, in order to evade a debt which
they owed him, and of which he had called to crave payment on the
day of his disappearance. Stabbing him with his own sword as he sat
at their fireside, they were said to have buried his body and that of
his horse, and effectually concealed their guilt while their own lives
lasted. Now, it was said, their daughter, who had involuntarily
witnessed a deed she could not prevent, had been wrought upon to
disclose all the particulars, and these had been verified by the finding
of the bones of Sir Michael, which were now transferred to the
sepulchre of his family. But this story was merely a fiction trafficking
on the public curiosity. On its being alluded to in the Edinburgh
Evening Courant as an actual occurrence, ‘the son and heir of the
defunct Sir Michael’ informed the editor of its falsity, which was also
acknowledged by the printer of the statement himself; and pardon
was craved of the honourable family and their tenants for putting it
into circulation. On making inquiry in the district, I have become
satisfied that the disappearance of this gentleman from the field of
visible life was never explained, as it now probably never will be. In
time, the property was bought by a neighbouring gentleman, who did
not require to use the mansion as his residence. Denmill Castle
accordingly fell out of order, and became a ruin. The fathers of
people still living thereabouts remembered seeing the papers of the
family—amongst which were probably some that had belonged to the
antiquarian Sir James—scattered in confusion about a garret
pervious to the elements, under which circumstances they were
allowed to perish.

May.
There was at this time a dearth of victual in Scotland, and it was
considered to be
upon the increase. The magistrates and justices of Edinburgh
arranged means for selling meal in open market, though in quantities
not exceeding a firlot, at twelve shillings Scots per peck. They also
ordered all possessors of grain to have it thrashed out and brought to
market before the 20th of May, reserving none to themselves, and
forbade, on high penalties, any one to buy up grain upon the road to
market.[417]
A well-disposed person offered in print an expedient for
preventing the dearth of victual. He discommended the fixing of a
price at market, for when this plan was tried in the last dearth,
farmers brought only some inferior kind of grain to market, ‘so that
the remedy was worse than the disease.’ Neither could he speak in
favour of the plan of the French king—namely, the confiscating of all
grain remaining after harvest—for it had not succeeded in France,
and would still less suit a country where the people were accustomed
to more liberty. He suggested the prohibition of exportation; the
recommending possessors of grain to sell it direct to the people,
instead of victual-mongers; and the use of strict means for fining all
who keep more than a certain quantity in reserve. This writer
thought that the corn was in reality not scarce; all that was needed
was, to induce possessors of the article to believe it to be best for
their interest to sell immediately.[418]

July 21.
There is an ancient and well-known privilege, still kept up, in
connection with
the palace and park of Holyroodhouse, insuring that a debtor
otherwise than fraudulent, and who has not the crown for his
creditor, cannot have diligence executed against him there;
consequently, may live there in safety from his creditors. At this
time, the privilege was taken advantage of by Patrick Haliburton,
who was in debt to the extraordinary amount of nearly £3000
sterling, and who was believed to have secretly conveyed away his
goods.
It being also part of the law of Scotland that diligence cannot be
proceeded with on Sunday, the Abbey Lairds, as they were jocularly
called, were enabled to come forth on that day and mingle in their
wonted society.
It pleased Patrick Haliburton to come to town one Sunday, and call
upon one of his creditors named Stewart, in order to treat with him
regarding some proposed accommodation of the matters that stood
between them. Mr Stewart received Patrick with apparent kindness,
asked him to take supper, and so plied his hospitality as to detain
him till past twelve o’clock, when, as he was leaving the house, a
messenger appeared with a writ of caption, and conducted him to
prison. Patrick considered himself as trepanned, and presented a
complaint to
the Court of Session, endeavouring to shew that a caption, of which
all the preparatory steps had been executed on the Sunday, was the
same as if it had been executed on the Sunday itself; that he had been
treacherously dealt with; and that he was entitled to protection
under the queen’s late indemnity. The Lords repelled the latter plea,
but ‘allowed trial to be taken of the time of his being apprehended,
and the manner how he was detained, or if he offered to go back to
the Abbey, and was enticed to stay and hindered to go out.’[419] The
termination of the affair does not appear.
A case with somewhat similar features occurred in 1724. Mrs Dilks
being a booked inmate of the Abbey sanctuary, one of her creditors
formed a design of getting possession of her person. He sent a
messenger-at-law, who, planting himself in a tavern within the
privileged ground, but close upon its verge, sent for the lady to come
and speak with him. She, obeying, could not reach the house without
treading for a few paces beyond ‘the girth,’ and the messenger’s
concurrents took the opportunity to lay hold of her. This, however,
was too much to be borne by a fairplay-loving populace. The very
female residents of the Abbey rose at the news, and, attacking the
party, rescued Mrs Dilks, and bore her back in triumph within the
charmed circle.[420]

The Rev. James Greenshields, an Irish curate, but of Scottish birth


and ordination—having received this rite at the hands of the deposed
Bishop of Ross in 1694—set up a meeting-house in a court near the
Cross of Edinburgh, where he introduced the English liturgy, being
the first time a prayer-book had been publicly presented in Scotland
since the Jenny Geddes riot of July 1637. Greenshields was to be
distinguished from the nonjurant Scottish Episcopalian clergy, for he
had taken the oath of abjuration (disclaiming the ‘Pretender’), and he
prayed formally for the queen; but he was perhaps felt to be, on this
account, only the more dangerous to the Established Church. It was
necessary that something should be done to save serious people from
the outrage of having a modified idolatry practised so near them. The
first effort consisted of a process raised by the landlord of the house
against Mr Greenshields, in the Dean of Guild’s court, on account of
his having used
part of the house, which he took for a dwelling, as a chapel, and for
that purpose broken down certain partitions. The Dean readily
ordained that the house should be restored to its former condition.
Mr Greenshields having easily procured accommodation elsewhere,
it became necessary to try some other method for extinguishing the
nuisance. A petition to the presbytery of Edinburgh, craving their
interference, was got up and signed by two or three hundred persons
in a few hours. The presbytery, in obedience to their call, cited Mr
Greenshields to appear before them. He declined their jurisdiction,
and they discharged him from continuing to officiate, under high
pains and penalties.
Sep.
Mr Greenshields having persisted, next Sunday, in reading prayers to
his
congregation, the magistrates, on the requirement of the presbytery,
called him before them, and formally demanded that he should
discontinue his functions in their city. Daniel Defoe, who could so
cleverly expose the intolerance of the Church of England to the
dissenters, viewed an Episcopalian martyrdom with different
feelings. He tells us that Greenshields conducted himself with
‘haughtiness’ before the civic dignitaries—what his own people of
course regarded as a heroic courage. He told them positively that he
would not obey them; and accordingly, next Sunday, he read the
service as usual in his obscure chapel. Even now, if we are to believe
Defoe, the magistrates would not have committed him, if he had
been modest in his recusancy; but, to their inconceivable disgust,
this insolent upstart actually appeared next day at the Cross, among
the gentlemen who were accustomed to assemble there as in an
Exchange, and thus seemed to brave their authority! For its
vindication, they were, says Defoe, ‘brought to an absolute necessity
to commit him;’ and they committed him accordingly to the
Tolbooth.
Here he lay till the beginning of November, when, the Court of
Session sitting down, he presented a petition, setting forth the
hardship of his case, seeing that there was no law forbidding any one
to read the English liturgy, and he had fully qualified to the civil
government by taking the necessary oaths. It was answered for the
magistrates, that ‘there needs no law condemning the English
service, for the introducing the Presbyterian worship explodes it as
inconsistent,’ and the statute had only promised that the oath-taking
should protect ministers who had been in possession of charges. ‘The
generality of the Lords,’ says Fountainhall,[421] ‘regretted the
man’s case;’ but they
refused to set him at liberty, unless he would engage to ‘forbear the
English service.’ Amongst his congregation there was a considerable
number of English people, who had come to Edinburgh as officers of
Customs and Excise. It must have bewildered them to find what was
so much venerated in their own part of the island, a subject of such
wrathful hatred and dread in this.
Greenshields, continuing a prisoner in the Tolbooth, determined,
with the aid of friends, to appeal to the House of Lords against the
decision of the Court of Session. Such appeals had become possible
only two years ago by the Union, and they were as yet a novelty in
Scotland. The local authorities had never calculated on such a step
being taken, and they were not a little annoyed by it. They persisted,
nevertheless, in keeping the clergyman in his loathsome prison, till,
after a full year, an order of the House of Lords came for his release.
Meanwhile, other troubles befell the church, for a Tory ministry
came into power, who, like the queen herself, did not relish seeing
the Episcopalian clergy and liturgy treated contumeliously in
Scotland. The General Assembly desired to have a fast on account of
‘the crying sins of the land, irreligion, popery, many errors and
delusions;’ and they chafed at having to send for authority to
Westminster, where it was very grudgingly bestowed. It seemed as if
they had no longer a barrier for the protection of that pure faith
which it was the happy privilege of Scotland, solely of all nations on
the face of the earth, to enjoy. Their enemies, too, well saw the
advantage that had been gained over them, and eagerly supported
Greenshields in his tedious and expensive process, which ended
(March 1711) in the reversal of the Session’s decision. ‘It is a tacit
rescinding,’ says Wodrow, ‘of all our laws for the security of our
worship, and that unhappy man [an Irish curate of fifteen pounds a
year, invited to Edinburgh on a promise of eighty] has been able to
do more for the setting up of the English service in Scotland than
King Charles the First was able to do.’

Nov. 9.
The Lords of Session decided this day on a critical question,
involving the use of a
word notedly of uncertain meaning. John Purdie having committed
an act of immorality on which a parliamentary act of 1661 imposed a
penalty of a hundred pounds in the case of ‘a gentleman,’ the
justices of peace fined
him accordingly, considering him a gentleman within the
construction of the act, as being the son of ‘a heritor,’ or land-
proprietor. ‘When charged for payment by Thomas Sandilands,
collector of these fines, he suspended, upon this ground that the fine
was exorbitant, in so far as he was but a small heritor, and, as all
heritors are not gentlemen, so he denied that he had the least
pretence to the title of a gentleman. The Lords sustained the reason
of suspension to restrict the fine to ten pounds Scots, because the
suspender had not the face or air of a gentleman: albeit it was
alleged by the charger [Sandilands] that the suspender’s
profligateness and debauchery, the place of the country where he
lives, and the company haunted by him, had influenced his mien.’[422]

An anonymous gentleman of Scotland, writing to the Earl of Seafield,
on the improvement of the salmon-fishing in Scotland,
informs us how the fish were then, as now, massacred in their
pregnant state, by country people. ‘I have known,’ he says, ‘a fellow
not worth a groat kill with a spear in one night’s time a hundred
black fish or kipper, for the most part full of rawns unspawned.’ He
adds: ‘Even a great many gentlemen, inhabitants by the rivers, are
guilty of the same crimes,’ little reflecting on ‘the prodigious treasure
thus miserably dilapidated.’
Notwithstanding these butcheries, he tells us that no mean profit
was then derived from the salmon-fishing in Scotland; he had known
from two to three thousand barrels, worth about six pounds sterling
each, exported in a single year. ‘Nay, I know Sir James Calder of
Muirton alone sold to one English merchant a thousand barrels in
one year’s fishing.’ He consequently deems himself justified in
estimating the possible product of the salmon-fishing, if rightly
protected and cultivated, at forty thousand barrels, yielding
£240,000 sterling, per annum.[423]

At Inchinnan, in Renfrewshire, there fell out (Feb. 1710) a ‘pretty
peculiar accident.’ One Robert
Hall, an elder, and reputed as an estimable man, falling into debt
with his landlord, the Laird of Blackston, was deprived of all he had,
and left the place. Two months before this date, he returned secretly,
and being unable to live contentedly
without going to church, he disguised himself for that purpose in
women’s clothes. It was his custom to go to Eastwood church, but
curiosity one day led him to his own old parish-church of Inchinnan.
As he crossed a ferry, he was suspected by the boatman and a beadle
of being a man in women’s clothes, and traced on to the church. The
minister, apprised of the suspicion, desired them not to meddle with
him; but on a justice of peace coming up, he was brought forward for
examination. He readily owned the fact, and desired to be taken to
the minister, who, he said, would know him. The minister protected
him for the remainder of the day, that he might escape the rudeness
of the mob; and on the ensuing day, he was taken to Renfrew, and
liberated, at the intercession of his wife’s father.[424]

The General Assembly passed an act (May 1710) declaring the marriage
of Robert Hunter, in
the bounds of the presbytery of Biggar, with one John [Joan]
Dickson to be incestuous, the woman having formerly been the
mother of a child, the father of which was grand-uncle to her present
husband. The act discharged the parties from remaining united
under pain of highest censure.
The church kept up long after this period a strict discipline
regarding unions which involved real or apparent relationship. In
May 1730, we find John Baxter, elder in Tealing parish, appealing
against a finding of the synod, that his marriage with his deceased
wife’s brother’s daughter’s daughter, was incestuous. Two years later,
the General Assembly had under its attention a case, which, while
capable of being stated in words, is calculated to rack the very brain
of whoever would try to realise it in his conceptions. A Carrick man,
named John M‘Taggart, had unluckily united himself to a woman
named Janet Kennedy, whose former husband, Anthony M‘Harg,
‘was a brother to John M‘Taggart’s grandmother, which
grandmother was said to be natural daughter of the said Anthony
M‘Harg’s father!’ The presbytery of Ayr took up the case, and
M‘Taggart was defended by a solicitor, in a paper full of derision and
mockery at the law held to have been offended; ‘a new instance,’ says
Wodrow, ‘of the unbounded liberty that lawyers take.’[425] The
presbytery having condemned the marriage as incestuous, M‘Taggart
appealed in wonted form to the synod, which affirmed the former