Probability and Statistics for
Computer Scientists
Third Edition
Michael Baron
Department of Mathematics and Statistics
College of Arts and Sciences
American University
Washington DC
MATLAB is a trademark of The MathWorks, Inc. and is used with permission. The MathWorks does
not warrant the accuracy of the text or exercises in this book. This book’s use or discussion of MATLAB
software or related products does not constitute endorsement or sponsorship by The MathWorks of a par-
ticular pedagogical approach or particular use of the MATLAB software.
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts
have been made to publish reliable data and information, but the author and publisher cannot assume
responsibility for the validity of all materials or the consequences of their use. The authors and publishers
have attempted to trace the copyright holders of all material reproduced in this publication and apologize
to copyright holders if permission to publish in this form has not been obtained. If any copyright material
has not been acknowledged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, trans-
mitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter
invented, including photocopying, microfilming, and recording, or in any information storage or retrieval
system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com
(http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive,
Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registra-
tion for a variety of users. For organizations that have been granted a photocopy license by the CCC, a
separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are
used only for identification and explanation without intent to infringe.
Contents
Preface
2 Probability
4 Continuous Distributions
11 Regression
Appendix
Index
Preface
Starting with the fundamentals of probability, this text leads readers to computer simula-
tions and Monte Carlo methods, stochastic processes and Markov chains, queuing systems,
statistical inference, and regression. These areas are heavily used in modern computer sci-
ence, computer engineering, software engineering, and related fields.
The book is primarily intended for junior undergraduate to beginning graduate level stu-
dents majoring in computer-related fields – computer science, software engineering, infor-
mation systems, data science, information technology, telecommunications, etc. At the same
time, it can be used by electrical engineering, mathematics, statistics, natural science, and
other majors for a standard calculus-based introductory statistics course. Standard topics
in probability and statistics are covered in Chapters 1–4 and 8–9.
Graduate students can use this book to prepare for probability-based courses such as queu-
ing theory, artificial neural networks, computer performance, etc.
The book can also be used as a standard reference on probability and statistical methods,
simulation, and modeling tools.
Recommended courses
The text is recommended for a one-semester course with several open-ended options available.
At the same time, with the new material added in the second and the third editions, the
book can serve as a text for a full two-semester course in Probability and Statistics.
After introducing probability and distributions in Chapters 1–4, instructors may choose the
following continuations, see Figure 1.
Probability-oriented course. Proceed to Chapters 6–7 for Stochastic Processes, Markov
Chains, and Queuing Theory. Computer science majors will find it attractive to supple-
ment such a course with computer simulations and Monte Carlo methods. Students can
learn and practice general simulation techniques in Chapter 5, then advance to the simu-
lation of stochastic processes and rather complex queuing systems in Sections 6.4 and 7.6.
Chapter 5 is highly recommended but not required for the rest of the material.
Statistics-focused course. Proceed to Chapters 8–9 directly after the probability core, fol-
lowed by additional topics in Statistics selected from Chapters 10 and 11. Such a curriculum
is more standard, and it is suitable for a wide range of majors. Chapter 5 remains optional
but recommended; it discusses statistical methods based on computer simulations. Modern
bootstrap techniques in Section 10.3 will attractively continue this discussion.
A course satisfying ABET requirements. Topics covered in this book satisfy ABET (Accred-
itation Board for Engineering and Technology) requirements for probability and statistics.
[Figure 1: Flow chart of chapters. Chapters 1–4 (Probability Core) lead to Chapter 6
(Stochastic Processes) and Chapter 7 (Queuing Theory); to Chapter 5 (Monte Carlo Methods),
Section 6.4 (Simulation of Stochastic Processes), and Section 7.6 (Simulation of Queuing
Systems); and to Chapters 8–9 (Statistics Core), followed by Chapter 10 (Advanced
Statistics) and Chapter 11 (Regression).]
To meet the requirements, instructors should choose topics from Chapters 1–11. All or some
of Chapters 5–7 and 10–11 may be considered optional, depending on the program’s ABET
objectives.
A two-semester course will cover all Chapters 1–11, possibly skipping some sections. The
material presented in this book splits evenly between Probability topics for the first semester
(Chapters 1–7) and Statistics topics for the second semester (Chapters 8–11).
Working differentiation and integration skills are required starting from Chapter 4. They
are usually covered in one semester of university calculus.
As a refresher, the appendix has a very brief summary of the minimum calculus techniques
required for reading this book (Section A.4). Certainly, this section cannot be used to learn
calculus “from scratch”. It only serves as a reference and student aid.
Next, Chapters 6–7 and Sections 11.3–11.4 rely on very basic matrix computations. Es-
sentially, readers should be able to multiply matrices, solve linear systems (Chapters 6–7),
and compute inverse matrices (Section 11.3). A basic refresher of these skills with some
examples is in the appendix, Section A.5.
The book is written in a lively style and reasonably simple language that students find easy
to read and understand. Reading this book, students should feel as if an experienced and
enthusiastic lecturer is addressing them in person.
Besides computer science applications and multiple motivating examples, the book contains
related interesting facts, paradoxical statements, wide applications to other fields, etc. I
expect prepared students to enjoy the course, benefit from it, and find it attractive and
useful for their careers.
Every chapter contains multiple examples with explicit solutions, many of them motivated
by computer science applications. Every chapter is concluded with a short summary and
more exercises for homework assignments and self-training. Over 270 problems can be as-
signed from this book.
Frequent self-explaining figures help readers understand and visualize concepts, formulas,
and even some proofs. Moreover, instructors and students are invited to use included short
programs for computer demonstrations. Randomness, uncertainty, behavior of random vari-
ables and stochastic processes, convergence results such as the Central Limit Theorem, and
especially Monte Carlo simulations can be nicely visualized by animated graphics.
These short computer codes contain very basic and simple commands, written in R and
MATLAB with detailed commentary. Preliminary knowledge of these languages is not
necessary. Readers can also choose other software and reproduce the given commands in
it line by line or use them as a flow-chart.
Instructors have options of teaching the course with R, MATLAB, both of them, some other
software, or with no software at all.
Having understood computerized examples in the text, students will use similar codes to
complete the projects and mini-projects proposed in this book.
For educational purposes, data sets used in the book are not large. They are printed in the
book, typically at the first place where they are used, and are also placed in our data inventory
on the web site http://fs2.american.edu/baron/www/Book/. Students can either type
them as part of their computer programs, or they can download the data files given in text
and comma-separated value formats. All data sets are listed in Section A.1, where we also
teach how to read them into R and MATLAB.
Broad feedback coming from professors who use this book for their courses in different
countries motivated me to work on the second edition. As a result, the Statistical Inference
chapter expanded and split into Chapters 9 and 10. The added material is organized in the
new sections, according to Table 0.1.
Also, the 2nd edition has about 60 additional exercises. Enjoy practicing, dear students,
you will only benefit from extra training!
The main news in the 3rd edition is the use of R, a popular software for statistical com-
puting. MATLAB has fulfilled its mission in earlier editions as a tool for computer demon-
strations, simulations, animated graphics, and basic statistical methods with easy-to-read
codes. At the same time, expansion of Statistics chapters, adoption of this book in a number
of universities across four continents, and broad feedback from course instructors (thank
you, colleagues!) prompted me to add R examples parallel to MATLAB.
R, an unbelievably popular and still developing statistical software, is freely available
for a variety of operating systems for anyone to install from the site
https://www.r-project.org/. The same site contains links to support, news, and other
information about R. Supplementary to basic R, one can invoke numerous additional pack-
ages written by different users for various statistical methods. Some of them will be used in
this book.
The author is grateful to Taylor & Francis Group for their constant professional help,
responsiveness, support, and encouragement. Special thanks are due to David Grubbs, Bob Stern,
Marcus Fontaine, Jill Jurgensen, Barbara Johnson, Rachael Panthier, and Shashi Kumar.
Many thanks go to my colleagues at American University, the University of Texas at Dal-
las, and other universities for their inspiring support and invaluable feedback, especially to
Professors Stephen Casey, Betty Malloy, and Nathalie Japkowicz from American University,
Farid Khafizov and Pankaj Choudhary from UT-Dallas, Joan Staniswalis and Amy Wagler
from UT-El Paso, Lillian Cassel from Villanova, Alan Sprague from the University of Al-
abama, Katherine Merrill from the University of Vermont, Alessandro Di Bucchianico from
Eindhoven University of Technology, Marc Aerts from Hasselt University, Pratik Shah from
the Indian Institute of Information Technology in Vadodara, and Magagula Vusi Mpendulo
from the University of Eswatini. I am grateful to Elena Baron for creative illustrations; Kate
Pechekhonova for interesting examples; and last but not least, to Eric, Anthony, Masha,
and Natasha Baron for their amazing patience and understanding.
This course is about uncertainty, measuring and quantifying uncertainty, and making de-
cisions under uncertainty. Loosely speaking, by uncertainty we mean the condition when
results, outcomes, and the near and remote future are not completely determined; their
development depends on a number of factors and on pure chance.
Simple examples of uncertainty appear when you buy a lottery ticket, turn a wheel of
fortune, or toss a coin to make a choice.
Uncertainty appears in virtually all areas of Computer Science and Software Engineering.
Installation of software requires uncertain time and often uncertain disk space. Newly
released software contains an uncertain number of defects. When a computer program is
executed, the amount of required memory may be uncertain. When a job is sent to a printer,
it takes uncertain time to print, and there is always a different number of jobs in a queue
ahead of it. Electronic components fail at uncertain times, and the order of their failures
cannot be predicted exactly. Viruses attack a system at unpredictable times and affect an
unpredictable number of files and directories.
Uncertainty surrounds us in everyday life, at home, at work, in business, and in leisure. To
take a snapshot, let us listen to the evening news.
Example 1.1. We may find out that the stock market had several ups and downs today
which were caused by new contracts being made, financial reports being released, and other
events of this sort. Many turns of stock prices remained unexplained. Clearly, nobody would
have ever lost a cent in stock trading had the market contained no uncertainty.
We may find out that a launch of a space shuttle was postponed because of weather
conditions. Why did they not know it in advance, when the event was scheduled? Forecasting
weather precisely, with no error, is not a solvable problem, again, due to uncertainty.
To support these words, a meteorologist predicts, say, a 60% chance of rain. Why can she
not tell us exactly whether it will rain, so that we know whether to take our umbrellas?
Again, because of uncertainty: she cannot always know for sure what future precipitation
will be.
We may find out that eruption of an active volcano has suddenly started, and it is not clear
which regions will have to evacuate.
We may find out that a heavily favored home team unexpectedly lost to an outsider, and
a young tennis player won against expectations. Existence and popularity of totalizators,
where participants place bets on sports results, show that uncertainty enters sports, results
of each game, and even the final standing.
We may also hear reports of traffic accidents, crimes, and convictions. Of course, if that
driver knew about the coming accident ahead of time, he would have stayed home. ♦
Certainly, this list can be continued (at least one thing is certain!). Even when you drive
to your college tomorrow, you will see an unpredictable number of green lights when you
approach them, you will find an uncertain number of vacant parking slots, you will reach
the classroom at an uncertain time, and you cannot be certain now about the number of
classmates you will find in the classroom when you enter it.
Realizing that many important phenomena around us bear uncertainty, we have to un-
derstand it and deal with it. Most of the time, we are forced to make decisions under
uncertainty. For example, we have to deal with internet and e-mail knowing that we may
not be protected against all kinds of viruses. New software has to be released even if its
testing probably did not reveal all the defects. Some memory or disk quota has to be allo-
cated for each customer by servers, internet service providers, etc., without knowing exactly
what portion of users will be satisfied with these limitations. And so on.
This book is about measuring and dealing with uncertainty and randomness. Through basic
theory and numerous examples, it teaches
– how to evaluate probabilities, or chances of different results (when the exact result is
uncertain),
– how to select a suitable model for a phenomenon containing uncertainty and use it in
subsequent decision making,
– how to evaluate performance characteristics and other important parameters for new
devices and servers,
[Figure: A queuing system in which arriving jobs are served by Server I, Server II, or
Server III and then depart.]
When direct computation is too complicated, resource consuming, too approximate, or sim-
ply not feasible, we shall use Monte Carlo methods. The book contains standard examples
of computer codes simulating rather complex queuing systems and evaluating their vital
characteristics. The codes are written in R and MATLAB, with detailed explanations of
steps, and most of them can be directly translated to other computer languages.
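As a rough illustration of the Monte Carlo idea, and not one of the book's R or MATLAB codes, the sketch below estimates a probability by the long-run frequency of an event over many simulated runs. The scenario is an assumption made only for this example: three print jobs, each taking an Exponential time with a mean of 3 minutes, and we ask how often they take more than 10 minutes in total. The function names and parameters are hypothetical.

```python
import random


def estimate_probability(event, n_runs=100_000, seed=1):
    """Estimate P(event) as the fraction of simulated runs in which it occurs."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(1 for _ in range(n_runs) if event(rng))
    return hits / n_runs


def three_jobs_take_over_10_minutes(rng):
    # Assumed model (not from the text): three print jobs, each with an
    # Exponential service time of mean 3 minutes, i.e. rate 1/3 per minute.
    total_time = sum(rng.expovariate(1 / 3) for _ in range(3))
    return total_time > 10


p_hat = estimate_probability(three_jobs_take_over_10_minutes)
print(f"Estimated probability: {p_hat:.3f}")
```

The same helper can be reused for any event that is easy to simulate but awkward to compute directly; only the function describing one simulated run changes.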
Next, we turn to Statistical Inference. While in Probability, we usually deal with more or
less clearly described situations (models), in Statistics, all the analysis is based on collected
and observed data. Given the data, a suitable model (say, a family of distributions) is fitted,
its parameters are estimated, and conclusions are drawn concerning the entire totality of
observed and unobserved subjects of interest that should follow the same model.
A typical Probability problem sounds like this.
Example 1.2. A folder contains 50 executable files. When a computer virus or a hacker
attacks the system, each file is affected with probability 0.2. Compute the probability that
during a virus attack, more than 15 files get affected. ♦
Notice that the situation is rather clearly described, in terms of the total number of files
and the chance of affecting each file. The only uncertain quantity is the number of affected
files, which cannot be predicted for sure.
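One way to carry out the computation in Example 1.2 is sketched below. It assumes that files are affected independently of each other, so that the number of affected files follows a Binomial distribution with 50 trials and success probability 0.2; the independence assumption and the function name `binomial_tail` are ours, not stated in the example.

```python
from math import comb


def binomial_tail(n, p, k):
    """Return P(X > k) for X ~ Binomial(n, p)."""
    return sum(
        comb(n, j) * p**j * (1 - p) ** (n - j)
        for j in range(k + 1, n + 1)
    )


# Example 1.2 under the independence assumption: n = 50 files, p = 0.2,
# and we want the probability that more than 15 files are affected.
prob_more_than_15 = binomial_tail(50, 0.2, 15)
print(f"P(more than 15 files affected) = {prob_more_than_15:.4f}")
```

Summing the probability mass function directly is adequate for n = 50; for much larger n, a library routine or a Normal approximation would be the more usual choice.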
Example 1.3. A folder contains 50 executable files. When a computer virus or a hacker
attacks the system, each file is affected with probability p. It has been observed that during
a virus attack, 15 files got affected. Estimate p. Is there a strong indication that p is greater
than 0.2? ♦
This is a practical situation. A user only knows the objectively observed data: the number
of files in the folder and the number of files that got affected. Based on that, he needs to
estimate p, the proportion of all the files, including the ones in his system and any similar
systems. One may provide a point estimator of p, a real number, or may opt to construct a
confidence interval of “most probable” values of p. Similarly, a meteorologist may predict,
say, a temperature of 70°F, which, realistically, does not exclude a possibility of 69 or 72
degrees, or she may give us an interval by promising, say, between 68 and 72 degrees.
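A minimal sketch of how Example 1.3 might be approached follows. It uses the sample proportion as a point estimate, a Normal-approximation (Wald) confidence interval, and a one-sided Z-test of p = 0.2 against p > 0.2. These are common textbook choices that we assume here for illustration; they are not necessarily the methods the book applies to this example, and the variable names are ours.

```python
from math import sqrt
from statistics import NormalDist

n_files = 50      # files in the folder (from Example 1.3)
n_affected = 15   # files observed to be affected (from Example 1.3)
p_null = 0.2      # value of p being questioned

# Point estimate: the observed proportion of affected files.
p_hat = n_affected / n_files

# Approximate 95% confidence interval (Normal approximation, an assumption).
z_crit = NormalDist().inv_cdf(0.975)
margin = z_crit * sqrt(p_hat * (1 - p_hat) / n_files)
ci_lower, ci_upper = p_hat - margin, p_hat + margin

# One-sided Z-test of H0: p = 0.2 against HA: p > 0.2.
z_stat = (p_hat - p_null) / sqrt(p_null * (1 - p_null) / n_files)
p_value = 1 - NormalDist().cdf(z_stat)

print(f"point estimate: {p_hat:.2f}")
print(f"approximate 95% interval: ({ci_lower:.3f}, {ci_upper:.3f})")
print(f"Z statistic: {z_stat:.3f}, one-sided P-value: {p_value:.4f}")
```

With only 50 files the Normal approximation is rough, so an exact Binomial calculation may lead to a somewhat different P-value.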
Most forecasts are being made from a carefully and suitably chosen model that fits the data.
A widely used method is regression that utilizes the observed data to find a mathematical
form of relationship between two variables (Chapter 11). One variable is called predictor,
the other is response. When the relationship between them is established, one can use the
predictor to infer about the response. For example, one can more or less accurately estimate
the average installation time of a software given the size of its executable files. An even more
accurate inference about the response can be made based on several predictors such as the
size of executable files, amount of random access memory (RAM), and type of processor
and operating system. This type of data analysis will require multivariate regression.
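To make the predictor and response idea concrete, here is a small sketch of simple linear regression fitted by least squares with one predictor. The data are invented purely for illustration (sizes of executable files in megabytes and installation times in seconds), and the function name `fit_line` is hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for the line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data, not from the book: predictor = size of executable files (MB),
# response = installation time (seconds).
sizes_mb = [10, 20, 30, 40, 50]
install_sec = [25, 44, 67, 85, 104]

intercept, slope = fit_line(sizes_mb, install_sec)

# Use the fitted line to predict the response for a new predictor value.
predicted_time = intercept + slope * 35
print(f"fitted line: time = {intercept:.2f} + {slope:.2f} * size")
print(f"predicted installation time for 35 MB: {predicted_time:.2f} seconds")
```

With several predictors, the same least-squares principle applies but the computation is done with matrices, which is why matrix skills are needed for the later regression sections.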
Each method will be illustrated by numerous practical examples and exercises. As the ulti-
mate target, by the end of this course, students should be able to read a word problem or a
corporate report, realize the uncertainty involved in the described situation, select a suitable
probability model, estimate and test its parameters based on real data, compute probabil-
ities of interesting events and other vital characteristics, make meaningful conclusions and
forecasts, and explain these results to other people.
Exercises
1.1. List 20 situations involving uncertainty that happened to you yesterday.
1.2. Name 10 random variables that you observed or dealt with yesterday.
1.3. Name 5 stochastic processes that played a role in your actions yesterday.
1.4. In a famous joke, a rather lazy student tosses a coin in order to decide what to do
next. If it turns up heads, play a computer game. If tails, watch a video. If it stands on its
edge, do the homework. If it hangs in the air, study for an exam.
(a) Which events should be assigned probability 0, probability 1, and some probability
strictly between 0 and 1?
(b) What probability between 0 and 1 would you assign to the event “watch a video”,
and how does it help you to define “a fair coin”?
1.5. A new software package is being tested by specialists. Every day, a number of defects
are found and corrected. It is planned to release this software in 30 days. Is it possible to
predict how many defects per day specialists will be finding at the time of the release? What
data should be collected for this purpose, what is the predictor, and what is the response?
1.6. Mr. Cheap plans to open a computer store to trade hardware. He would like to stock
an optimal number of different hardware products in order to optimize his monthly profit.
Data are available on similar computer stores opened in the area. What kind of data should
Mr. Cheap collect in order to predict his monthly profit? What should he use as a predictor
and as a response in his analysis?
Part I