
Computer Science I

Dr. Chris Bourke

[email protected]
Department of Computer Science & Engineering
University of Nebraska–Lincoln
Lincoln, NE 68588, USA

2018/08/09 16:16:18
Version 1.3.6

Copyleft (Copyright)

The entirety of this book is free and is released under a Creative Commons Attribution-
ShareAlike 4.0 International License (see http://creativecommons.org/licenses/by-sa/4.0/
for details).

Draft Notice
This book is a draft that has been released for evaluation and comment. Some of the later
chapters are included as placeholders and indicators for the intended scope of the final
draft, but are intentionally left blank. The author encourages people to send feedback
including suggestions, corrections, and reviews to inform and influence the final draft.
Thank you in advance to anyone helping out or sending constructive criticisms.

Preface

“If you really want to understand something, the best way is to try and
explain it to someone else. That forces you to sort it out in your own mind...
that’s really the essence of programming. By the time you’ve sorted out a
complicated idea into little steps that even a stupid machine can deal with,
you’ve certainly learned something about it yourself.” —Douglas Adams,
Dirk Gently’s Holistic Detective Agency [8]

“The world of A.D. 2014 will have few routine jobs that cannot be done better
by some machine than by any human being. Mankind will therefore have
become largely a race of machine tenders. Schools will have to be oriented in
this direction. All the high-school students will be taught the fundamentals
of computer technology, will become proficient in binary arithmetic and will
be trained to perfection in the use of the computer languages that will have
developed out of those like the contemporary Fortran” —Isaac Asimov 1964

I’ve been teaching Computer Science since 2008 and was a Teaching Assistant long
before that. Before that I was a student. During that entire time I’ve been continually
disappointed in the value (note, not quality) of textbooks, particularly Computer Science
textbooks and especially introductory textbooks. Of primary concern are the costs,
which have far outstripped inflation over the last decade [30] while not providing any
real additional value. New editions with trivial changes are released on a regular basis in
an attempt to nullify the used book market. Publishers engage in questionable business
practices and unfortunately many institutions are complicit in this process.

In established fields such as mathematics and physics, new textbooks are especially
questionable as the material and topics don’t undergo many changes. However, in
Computer Science, new languages and technologies are created and change at breakneck
speeds. Faculty and students are regularly trying to give away stacks of textbooks
(“Learn Java 4!,” “Introduction to Cold Fusion,” etc.) that are only a few years old and
yet are completely obsolete and worthless. The problem is that such books have built-in
obsolescence by focusing too much on technological specifics and not enough on concepts.
There are dozens of introductory textbooks for Computer Science; add in the multiple
languages and the many gimmicks (“Learn Multimedia Java,” “Gaming
with JavaScript,” “Build a Robot with C!”), and it is a publisher’s paradise: hundreds of
variations, a growing market, and customers with few alternatives.

That’s why I like organizations like OpenStax (http://openstaxcollege.org/) that
attempt to provide free and “open” learning materials. Though they have textbooks for
a variety of disciplines, Computer Science is not one of them (currently, that is). This
might be because a huge amount of resources is already available
online, such as tutorials, videos, online open courses, and even interactive code learning
tools. With such a huge amount of resources, why write this textbook then? Firstly,
layoff. Secondly, I don’t really expect this book to have much impact beyond my own
courses or department. I wanted a resource that presented an introduction to Computer
Science the way I teach it in my courses, and none was available. However, if it does find its
way into another instructor’s classes or into the hands of an aspiring student who wants
to learn, then great!

Several years ago our department revamped our introductory courses in a “Renaissance
in Computing” initiative in which we redeveloped several different “flavors” of Computer
Science I (one intended for Computer Science majors, one for Computer Engineering
majors, one for non-CE engineering majors, one for humanities majors, etc.). The courses
are intended to be equivalent in content but have a broader appeal to those in different
disciplines. The intent was to provide multiple entry points into Computer Science. Once
a student had a solid foundation, they could continue into Computer Science II and pick
up a second programming language with little difficulty.

This basic idea informed how I structured this book. There is a separation of concepts
and programming language syntax. The first part of this book uses pseudocode with
a minimum of language-specific elements. Subsequent parts of the book recapitulate
these concepts but in the context of a specific programming language. This allows for a
“plug-in” style approach to Computer Science: the same book could theoretically be used
for multiple courses or the book could be extended by adding another part for a new
language with minimal effort.

Another inspiration for the structure of this book is the Computer Science I Honors course
that I developed. Usually Computer Science majors take CS1 using Java as the primary
language while CE students take CS1 using C. Since the honors course consists of both
majors (as well as some of the top students), I developed the Honors version to cover
both languages at the same time in parallel. This has led to many interesting teaching
moments: by covering two languages, it provides opportunities to highlight fundamental
differences and concepts in programming languages. It also keeps concepts as the focus of
the course, emphasizing that syntax and idiosyncrasies of individual languages are only of
secondary concern. Finally, actively using multiple languages in the first class provides a
better opportunity to extend knowledge to other programming languages: once a student
has a solid foundation in one language, learning a new one should be relatively easy.

The exercises in this book are drawn from those I’ve used in my courses over the
years. They have been made as generic as possible so that they could be assigned using
any language. While some have emphasized the use of “real-world” exercises (whatever
that means), my exercises have focused more on solving problems of a mathematical
nature (most of my students have been Engineering students). Some of them are more
easily understood if students have had Calculus but it is not absolutely necessary.

It may be cliché, but the two quotes above exemplify what I believe a Computer Science
I course is about. The second is from Isaac Asimov who was asked at the 1964 World’s
Fair what he thought the world of 2014 would look like. His prediction didn’t
entirely come true, but I do believe we are on the verge of a fundamental social change that
will be caused by more and more automation. Like the industrial revolution, but on
a much smaller time scale and to a far greater extent, automation will fundamentally
change how we live and not work (I say “not work” because automation will very easily
destroy the vast majority of today’s jobs; this represents a huge economic and political
challenge that will need to be addressed). The time is quickly approaching when being
able to program and develop software will be considered a fundamental skill as essential
as arithmetic. I hope this book plays some small role in helping students adjust to that
coming world.

The first quote describes programming, or more fundamentally Computer Science and
“problem solving.” Computers do not solve problems, humans do. Computers only make
it possible to automate solutions on a large scale. At the end of the day, the human race
is still responsible for tending the machines and will be for some time despite what Star
Trek and the most optimistic of AI advocates think.

I hope that people find this book useful. If value is a ratio of quality vs cost then this
book has already succeeded in having infinite value.1 If you have suggestions on how to
improve it, please feel free to contact me. If you end up using it and finding it useful,
please let me know that too!

Footnote 1: Or it might be undefined, or NaN, or this book is Exceptional, depending on which language sections you read.
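
For readers who want to see what the footnote is alluding to, here is a minimal Java sketch of dividing "quality" by a cost of zero. The class and method names (ValueRatio, value, describe, describeInt) are made up for this illustration and are not taken from the book. It assumes value is modeled as a plain division; with floating-point operands Java yields Infinity or NaN, while integer division by zero throws an ArithmeticException. (In C, by contrast, integer division by zero is undefined behavior, which is presumably the "undefined" case.)

```java
public class ValueRatio {
    // Hypothetical helper: "value" modeled as quality divided by cost.
    static double value(double quality, double cost) {
        return quality / cost;
    }

    // Classifies the floating-point result of quality / cost.
    static String describe(double quality, double cost) {
        double v = value(quality, cost);
        if (Double.isNaN(v)) {
            return "NaN";
        }
        if (Double.isInfinite(v)) {
            return "infinite";
        }
        return "finite";
    }

    // Integer division by zero throws in Java instead of returning a value.
    static String describeInt(int quality, int cost) {
        try {
            return "finite: " + (quality / cost);
        } catch (ArithmeticException e) {
            return "exception";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(10.0, 0.0)); // positive quality, zero cost
        System.out.println(describe(0.0, 0.0));  // zero quality, zero cost
        System.out.println(describeInt(10, 0));  // integer operands
    }
}
```

Which outcome applies depends on the operand types as much as on the language, which is part of why numerical edge cases deserve explicit checks.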

Acknowledgements
I’d like to thank the Department of Computer Science & Engineering at the University
of Nebraska–Lincoln for their support during my writing and maintaining this book.

This book is dedicated to my family.

Contents

Copyleft (Copyright)

Draft Notice

Preface

Acknowledgements

1. Introduction
1.1. Problem Solving
1.2. Computing Basics
1.3. Basic Program Structure
1.4. Syntax Rules & Pseudocode
1.5. Documentation, Comments, and Coding Style

2. Basics
2.1. Control Flow
2.1.1. Flowcharts
2.2. Variables
2.2.1. Naming Rules & Conventions
2.2.2. Types
2.2.3. Declaring Variables: Dynamic vs. Static Typing
2.2.4. Scoping
2.3. Operators
2.3.1. Assignment Operators
2.3.2. Numerical Operators
2.3.3. String Concatenation
2.3.4. Order of Precedence
2.3.5. Common Numerical Errors
2.3.6. Other Operators
2.4. Basic Input/Output
2.4.1. Standard Input & Output
2.4.2. Graphical User Interfaces
2.4.3. Output Using printf()-style Formatting
2.4.4. Command Line Input
2.5. Debugging
2.5.1. Types of Errors
2.5.2. Strategies
2.6. Examples
2.6.1. Temperature Conversion
2.6.2. Quadratic Roots
2.7. Exercises

3. Conditionals
3.1. Logical Operators
3.1.1. Comparison Operators
3.1.2. Negation
3.1.3. Logical And
3.1.4. Logical Or
3.1.5. Compound Statements
3.1.6. Short Circuiting
3.2. The If Statement
3.3. The If-Else Statement
3.4. The If-Else-If Statement
3.5. Ternary If-Else Operator
3.6. Examples
3.6.1. Meal Discount
3.6.2. Look Before You Leap
3.6.3. Comparing Elements
3.6.4. Life & Taxes
3.7. Exercises

4. Loops
4.1. While Loops
4.1.1. Example
4.2. For Loops
4.2.1. Example
4.3. Do-While Loops
4.4. Foreach Loops
4.5. Other Issues
4.5.1. Nested Loops
4.5.2. Infinite Loops
4.5.3. Common Errors
4.5.4. Equivalency of Loops
4.6. Problem Solving With Loops
4.7. Examples
4.7.1. For vs While Loop
4.7.2. Primality Testing
4.7.3. Paying the Piper
4.8. Exercises

5. Functions
5.1. Defining & Using Functions
5.1.1. Function Signatures
5.1.2. Calling Functions
5.1.3. Organizing
5.2. How Functions Work
5.2.1. Call By Value
5.2.2. Call By Reference
5.3. Other Issues
5.3.1. Functions as Entities
5.3.2. Function Overloading
5.3.3. Variable Argument Functions
5.3.4. Optional Parameters & Default Values
5.4. Exercises

6. Error Handling
6.1. Error Handling
6.2. Error Handling Strategies
6.2.1. Defensive Programming
6.2.2. Exceptions
6.3. Exercises

7. Arrays, Collections & Dynamic Memory
7.1. Basic Usage
7.2. Static & Dynamic Memory
7.2.1. Dynamic Memory
7.2.2. Shallow vs. Deep Copies
7.3. Multidimensional Arrays
7.4. Other Collections
7.5. Exercises

8. Strings
8.1. Basic Operations
8.2. Comparisons
8.3. Tokenizing
8.4. Exercises

9. File Input/Output
9.1. Processing Files
9.1.1. Paths
9.1.2. Error Handling
9.1.3. Buffered and Unbuffered
9.1.4. Binary vs Text Files
9.2. Exercises

10. Encapsulation & Objects
10.1. Objects
10.1.1. Defining
10.1.2. Creating
10.1.3. Using Objects
10.2. Design Principles & Best Practices
10.3. Exercises

11. Recursion
11.1. Writing Recursive Functions
11.1.1. Tail Recursion
11.2. Avoiding Recursion
11.2.1. Memoization
11.3. Exercises

12. Searching & Sorting
12.1. Searching
12.1.1. Linear Search
12.1.2. Binary Search
12.1.3. Analysis
12.2. Sorting
12.2.1. Selection Sort
12.2.2. Insertion Sort
12.2.3. Quick Sort
12.2.4. Merge Sort
12.2.5. Other Sorts
12.2.6. Comparison & Summary
12.3. Searching & Sorting In Practice
12.3.1. Using Libraries and Comparators
12.3.2. Preventing Arithmetic Errors
12.3.3. Avoiding the Difference Trick
12.3.4. Importance of a Total Order
12.3.5. Artificial Ordering
12.3.6. Sorting Stability
12.4. Exercises

13. Graphical User Interfaces & Event Driven Programming

14. Introduction to Databases & Database Connectivity

I. The C Programming Language

15. Basics
15.1. Getting Started: Hello World
15.2. Basic Elements
15.2.1. Basic Syntax Rules
15.2.2. Preprocessor Directives
15.2.3. Comments
15.2.4. The main() Function
15.3. Variables
15.3.1. Declaration & Assignment
15.4. Operators
15.5. Basic I/O
15.6. Examples
15.6.1. Converting Units
15.6.2. Computing Quadratic Roots

16. Conditionals
16.1. Logical Operators
16.1.1. Order of Precedence
16.1.2. Comparing Strings and Characters
16.2. If, If-Else, If-Else-If Statements
16.3. Examples
16.3.1. Computing a Logarithm
16.3.2. Life & Taxes
16.3.3. Quadratic Roots Revisited

17. Loops
17.1. While Loops
17.2. For Loops
17.3. Do-While Loops
17.4. Other Issues
17.5. Examples
17.5.1. Normalizing a Number
17.5.2. Summation
17.5.3. Nested Loops
17.5.4. Paying the Piper

18. Functions
18.1. Defining & Using Functions
18.1.1. Declaration: Prototypes
18.1.2. Void Functions
18.1.3. Organizing Functions
18.1.4. Calling Functions
18.2. Pointers
18.2.1. Passing By Reference
18.2.2. Function Pointers
18.3. Examples
18.3.1. Generalized Rounding
18.3.2. Quadratic Roots

19. Error Handling
19.1. Language Supported Error Codes
19.1.1. POSIX Error Codes
19.2. Error Handling By Design
19.3. Enumerated Types
19.4. Using Enumerated Types for Error Codes

20. Arrays
20.1. Basic Usage
20.2. Dynamic Memory
20.3. Using Arrays with Functions
20.4. Multidimensional Arrays
20.4.1. Contiguous 2-D Arrays
20.5. Dynamic Data Structures

21. Strings
21.1. Character Arrays
21.2. String Library
21.3. Arrays of Strings
21.4. Comparisons
21.5. Conversions
21.6. Tokenizing

22. File I/O
22.1. Opening Files
22.2. Reading & Writing
22.2.1. Plaintext Files
22.2.2. Binary Files
22.3. Closing Files

23. Structures
23.1. Defining Structures
23.1.1. Alternative Declarations
23.1.2. Nested Structures
23.2. Usage
23.2.1. Declaration & Initialization
23.2.2. Selection Operators
23.3. Arrays of Structures
23.4. Using Structures With Functions
23.4.1. Factory Functions
23.4.2. To String Functions
23.4.3. Passing Arrays of Structures

24. Recursion

25. Searching & Sorting
25.1. Comparator Functions
25.2. Function Pointers
25.3. Searching & Sorting
25.3.1. Searching
25.3.2. Sorting
25.3.3. Examples
25.4. Other Considerations
25.4.1. Sorting Pointers to Elements

II. The Java Programming Language

26. Basics
26.1. Getting Started: Hello World
26.2. Basic Elements
26.2.1. Basic Syntax Rules
26.2.2. Program Structure
26.2.3. The main() Method
26.2.4. Comments
26.3. Variables
26.3.1. Declaration & Assignment
26.4. Operators
26.5. Basic I/O
26.6. Examples
26.6.1. Converting Units
26.6.2. Computing Quadratic Roots

27. Conditionals
27.1. Logical Operators
27.1.1. Order of Precedence
27.1.2. Comparing Strings and Characters
27.2. If, If-Else, If-Else-If Statements
27.3. Examples
27.3.1. Computing a Logarithm
27.3.2. Life & Taxes
27.3.3. Quadratic Roots Revisited

28. Loops
28.1. While Loops
28.2. For Loops
28.3. Do-While Loops
28.4. Enhanced For Loops
28.5. Examples
28.5.1. Normalizing a Number
28.5.2. Summation
28.5.3. Nested Loops
28.5.4. Paying the Piper

29.Methods 423
29.1. Defining Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 424
29.1.1. Void Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 426
29.1.2. Using Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . 426
29.1.3. Passing By Reference . . . . . . . . . . . . . . . . . . . . . . . . . 427
29.2. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 428
29.2.1. Generalized Rounding . . . . . . . . . . . . . . . . . . . . . . . . 428

30.Error Handling & Exceptions 431

30.1. Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 431
30.1.1. Catching Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . 431
30.1.2. Throwing Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . 433
30.1.3. Creating Custom Exceptions . . . . . . . . . . . . . . . . . . . . . 433
30.1.4. Checked Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . 434
30.2. Enumerated Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 436
30.2.1. More Tricks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 437

31. Arrays 439
31.1. Basic Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
31.2. Dynamic Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 442
31.3. Using Arrays with Methods . . . . . . . . . . . . . . . . . . . . . . . . . 442
31.4. Multidimensional Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . 443
31.5. Dynamic Data Structures . . . . . . . . . . . . . . . . . . . . . . . . . . 444

32. Strings 449
32.1. Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449
32.2. String Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
32.3. Arrays of Strings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452
32.4. Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 453
32.5. Tokenizing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455


33. File I/O 457
33.1. File Input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457
33.2. File Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459

34. Objects 461
34.1. Data Visibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462
34.2. Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
34.2.1. Accessor & Mutator Methods . . . . . . . . . . . . . . . . . . . . 464
34.3. Constructors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
34.4. Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
34.5. Common Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
34.6. Composition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
34.7. Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 474

35. Recursion 479

36. Searching & Sorting 483
36.1. Comparators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 483
36.2. Searching & Sorting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 487
36.2.1. Searching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 487
36.2.2. Sorting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 488
36.3. Other Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 488
36.3.1. Sorted Collections . . . . . . . . . . . . . . . . . . . . . . . . . . . 488
36.3.2. Handling null values . . . . . . . . . . . . . . . . . . . . . . . . 490
36.3.3. Importance of equals() and hashCode() Methods . . . . . . . 491
36.3.4. Java 8: Lambda Expressions . . . . . . . . . . . . . . . . . . . . . 493

III. The PHP Programming Language 495

37. Basics 497
37.1. Getting Started: Hello World . . . . . . . . . . . . . . . . . . . . . . . . 498
37.2. Basic Elements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 499
37.2.1. Basic Syntax Rules . . . . . . . . . . . . . . . . . . . . . . . . . . 499
37.2.2. PHP Tags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 500
37.2.3. Libraries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 500
37.2.4. Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 500
37.2.5. Entry Point & Command Line Arguments . . . . . . . . . . . . . 502
37.3. Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 502
37.3.1. Using Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503
37.4. Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 504
37.4.1. Type Juggling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505
37.4.2. String Concatenation . . . . . . . . . . . . . . . . . . . . . . . . . 508
37.5. Basic I/O . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 508


37.6. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 509
37.6.1. Converting Units . . . . . . . . . . . . . . . . . . . . . . . . . . . 509
37.6.2. Computing Quadratic Roots . . . . . . . . . . . . . . . . . . . . . 512

38. Conditionals 515
38.1. Logical Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 515
38.1.1. Order of Precedence . . . . . . . . . . . . . . . . . . . . . . . . . 517
38.2. If, If-Else, If-Else-If Statements . . . . . . . . . . . . . . . . . . . . . . . 517
38.3. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 519
38.3.1. Computing a Logarithm . . . . . . . . . . . . . . . . . . . . . . . 519
38.3.2. Life & Taxes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 520
38.3.3. Quadratic Roots Revisited . . . . . . . . . . . . . . . . . . . . . . 522

39. Loops 527
39.1. While Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 527
39.2. For Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 528
39.3. Do-While Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 529
39.4. Foreach Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 529
39.5. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 530
39.5.1. Normalizing a Number . . . . . . . . . . . . . . . . . . . . . . . . 530
39.5.2. Summation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 530
39.5.3. Nested Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 531
39.5.4. Paying the Piper . . . . . . . . . . . . . . . . . . . . . . . . . . . 531

40. Functions 535
40.1. Defining & Using Functions . . . . . . . . . . . . . . . . . . . . . . . . . 535
40.1.1. Declaring Functions . . . . . . . . . . . . . . . . . . . . . . . . . . 535
40.1.2. Organizing Functions . . . . . . . . . . . . . . . . . . . . . . . . . 537
40.1.3. Calling Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 537
40.1.4. Passing By Reference . . . . . . . . . . . . . . . . . . . . . . . . . 538
40.1.5. Optional & Default Parameters . . . . . . . . . . . . . . . . . . . 539
40.1.6. Function Pointers . . . . . . . . . . . . . . . . . . . . . . . . . . . 540
40.2. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 540
40.2.1. Generalized Rounding . . . . . . . . . . . . . . . . . . . . . . . . 540
40.2.2. Quadratic Roots . . . . . . . . . . . . . . . . . . . . . . . . . . . 541

41. Error Handling & Exceptions 543
41.1. Throwing Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543
41.2. Catching Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543
41.3. Creating Custom Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . 544

42. Arrays 547
42.1. Creating Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547


42.2. Indexing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547
42.2.1. Strings as Indices . . . . . . . . . . . . . . . . . . . . . . . . . . . 549
42.2.2. Non-Contiguous Indices . . . . . . . . . . . . . . . . . . . . . . . 549
42.2.3. Key-Value Initialization . . . . . . . . . . . . . . . . . . . . . . . 549
42.3. Useful Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 550
42.4. Iteration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 551
42.5. Adding Elements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 552
42.6. Removing Elements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 553
42.7. Using Arrays in Functions . . . . . . . . . . . . . . . . . . . . . . . . . . 554
42.8. Multidimensional Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . 555

43. Strings 557
43.1. Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 557
43.2. String Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 558
43.3. Arrays of Strings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 560
43.4. Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 560
43.5. Tokenizing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 561

44. File I/O 563
44.1. Opening Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 563
44.2. Reading & Writing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 564
44.2.1. Using URLs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 565
44.2.2. Closing Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 565

45. Objects 567
45.1. Data Visibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 568
45.2. Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 568
45.2.1. Accessor & Mutator Methods . . . . . . . . . . . . . . . . . . . . 570
45.3. Constructors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 571
45.4. Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 572
45.5. Common Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573
45.6. Composition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573
45.7. Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574

46. Recursion 577

47. Searching & Sorting 581
47.1. Comparator Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 581
47.1.1. Searching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 584
47.1.2. Sorting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 584

Glossary 587

Acronyms 599


Index 610

References 613

List of Algorithms

1.1. An example of pseudocode: finding a minimum value . . . . . . . . . . . . 13

2.1. Assignment Operator Demonstration . . . . . . . . . . . . . . . . . . . . . 34

2.2. Addition and Subtraction Demonstration . . . . . . . . . . . . . . . . . . 35

2.3. Multiplication and Division Demonstration . . . . . . . . . . . . . . . . . 36

2.4. Temperature Conversion Program . . . . . . . . . . . . . . . . . . . . . . 50

2.5. Quadratic Roots Program . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

3.1. An if-statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

3.2. An if-else Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

3.3. Example If-Else-If Statement . . . . . . . . . . . . . . . . . . . . . . . . . 79

3.4. General If-Else-If Statement . . . . . . . . . . . . . . . . . . . . . . . . . . 81

3.5. If-Else-If Statement With a Bug . . . . . . . . . . . . . . . . . . . . . . . 81

3.6. A simple receipt program . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

3.7. Preventing Division By Zero Using an If Statement . . . . . . . . . . . . . 83

3.8. Comparing Students by Name . . . . . . . . . . . . . . . . . . . . . . . . . 84

3.9. Computing Tax Liability with If-Else-If . . . . . . . . . . . . . . . . . . . 86

3.10. Computing Tax Credit with If-Else-If . . . . . . . . . . . . . . . . . . . . . 86

4.1. Counter-Controlled While Loop . . . . . . . . . . . . . . . . . . . . . . . . 97

4.2. Normalizing a Number With a While Loop . . . . . . . . . . . . . . . . . 99

4.3. A General For Loop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99


4.4. Counter-Controlled For Loop . . . . . . . . . . . . . . . . . . . . . . . . . 100

4.5. Summation of Numbers in a For Loop . . . . . . . . . . . . . . . . . . . . 100

4.6. Counter-Controlled Do-While Loop . . . . . . . . . . . . . . . . . . . . . . 101

4.7. Flag-Controlled Do-While Loop . . . . . . . . . . . . . . . . . . . . . . . . 102

4.8. Example Foreach Loop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

4.9. Foreach Loop Computing Grades . . . . . . . . . . . . . . . . . . . . . . . 103

4.10. Nested For Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

4.11. Infinite Loop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104

4.12. Computing the Geometric Series Using a For Loop . . . . . . . . . . . . . 107

4.13. Computing the Geometric Series Using a While Loop . . . . . . . . . . . . 108

4.14. Determining if a Number is Prime or Composite . . . . . . . . . . . . . . 109

4.15. Counting the number of primes. . . . . . . . . . . . . . . . . . . . . . . . . 109

4.16. Computing a loan amortization schedule . . . . . . . . . . . . . . . . . . . 111

4.17. Scaling a Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

5.1. A function in pseudocode . . . . . . . . . . . . . . . . . . . . . . . . . . . 135

5.2. Using a function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137

11.1. Recursive CountDown(n) Function . . . . . . . . . . . . . . . . . . . . . 203

11.2. Recursive Fibonacci(n) Function . . . . . . . . . . . . . . . . . . . . . . 204

11.3. Recursive Fibonacci(n) Function With Memoization . . . . . . . . . . . 208

12.1. Linear Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212

12.2. Recursive Binary Search Algorithm, BinarySearch(A, l, r, e_k) . . . . . . 214

12.3. Iterative Binary Search Algorithm, BinarySearch(A, e_k) . . . . . . . . . 215

12.4. Selection Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222

12.5. Insertion Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225

12.6. QuickSort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229


12.7. In-Place Partition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229

12.8. MergeSort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233

12.9. Merge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234

List of Code Samples
1.1. A simple program in C . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.2. A simple program in C, compiled to assembly . . . . . . . . . . . . . . . 10
1.3. A simple program in C, resulting machine code formatted in hexadecimal
(partial) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

2.1. Example of variable scoping in C . . . . . . . . . . . . . . . . . . . . . . 32

2.2. Compound Assignment Operators in C . . . . . . . . . . . . . . . . . . . 41
2.3. printf() examples in C . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.4. Output Result . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

4.1. Zune Bug . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

15.1. Hello World Program in C . . . . . . . . . . . . . . . . . . . . . . . . . . 254

15.2. Fahrenheit-to-Celsius Conversion Program in C . . . . . . . . . . . . . . 268
15.3. Quadratic Roots Program in C . . . . . . . . . . . . . . . . . . . . . . . 269

16.1. Examples of Conditional Statements in C . . . . . . . . . . . . . . . . . . 275

16.2. Logarithm Calculator Program in C . . . . . . . . . . . . . . . . . . . . . 280
16.3. Tax Program in C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
16.4. Quadratic Roots Program in C With Error Checking . . . . . . . . . . . 282

17.1. While Loop in C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283

17.2. Flag-controlled While Loop in C . . . . . . . . . . . . . . . . . . . . . . . 284
17.3. For Loop in C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
17.4. Do-While Loop in C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
17.5. Normalizing a Number with a While Loop in C . . . . . . . . . . . . . . 287
17.6. Summation of Numbers using a For Loop in C . . . . . . . . . . . . . . . 287
17.7. Nested For Loops in C . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
17.8. Loan Amortization Program in C . . . . . . . . . . . . . . . . . . . . . . 290

19.1. Using the errno.h library . . . . . . . . . . . . . . . . . . . . . . . . . 307

23.1. A Student structure declaration . . . . . . . . . . . . . . . . . . . . . . 344

25.1. C Function Pointer Syntax Examples . . . . . . . . . . . . . . . . . . . . 371

25.2. C Search Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375
25.3. C Sort Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
25.4. C Comparator Function for Strings . . . . . . . . . . . . . . . . . . . . . 377


25.5. Sorting Structures via Pointers . . . . . . . . . . . . . . . . . . . . . . . 378

25.6. Handling Null Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379

26.1. Hello World Program in Java . . . . . . . . . . . . . . . . . . . . . . . . 384

26.2. Basic Input/Output in Java . . . . . . . . . . . . . . . . . . . . . . . . . 396
26.3. Fahrenheit-to-Celsius Conversion Program in Java . . . . . . . . . . . . . 399
26.4. Quadratic Roots Program in Java . . . . . . . . . . . . . . . . . . . . . . 402

27.1. Examples of Conditional Statements in Java . . . . . . . . . . . . . . . . 407

27.2. Logarithm Calculator Program in Java . . . . . . . . . . . . . . . . . . . 412
27.3. Tax Program in Java . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
27.4. Quadratic Roots Program in Java With Error Checking . . . . . . . . . . 414

28.1. While Loop in Java . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415

28.2. Flag-controlled While Loop in Java . . . . . . . . . . . . . . . . . . . . . 416
28.3. For Loop in Java . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
28.4. Do-While Loop in Java . . . . . . . . . . . . . . . . . . . . . . . . . . . . 418
28.5. Enhanced For Loops in Java Example 1 . . . . . . . . . . . . . . . . . . . 418
28.6. Enhanced For Loops in Java Example 2 . . . . . . . . . . . . . . . . . . . 419
28.7. Normalizing a Number with a While Loop in Java . . . . . . . . . . . . . 419
28.8. Summation of Numbers using a For Loop in Java . . . . . . . . . . . . . 420
28.9. Nested For Loops in Java . . . . . . . . . . . . . . . . . . . . . . . . . . . 420
28.10. Loan Amortization Program in Java . . . . . . . . . . . . . . . . . . . . . 422

34.1. The completed Java Student class. . . . . . . . . . . . . . . . . . . . . 477

36.1. Java Search Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 489

36.2. Using Java Collection’s Sort Method . . . . . . . . . . . . . . . . . . . . 490
36.3. Handling Null Values in Java Comparators . . . . . . . . . . . . . . . . . 491

37.1. Hello World Program in PHP . . . . . . . . . . . . . . . . . . . . . . . . 498

37.2. Hello World Program in PHP with HTML . . . . . . . . . . . . . . . . . 498
37.3. Type Juggling in PHP . . . . . . . . . . . . . . . . . . . . . . . . . . . . 506
37.4. Fahrenheit-to-Celsius Conversion Program in PHP . . . . . . . . . . . . . 511
37.5. Quadratic Roots Program in PHP . . . . . . . . . . . . . . . . . . . . . . 513

38.1. Examples of Conditional Statements in PHP . . . . . . . . . . . . . . . . 518

38.2. Logarithm Calculator Program in PHP . . . . . . . . . . . . . . . . . . . . 523
38.3. Tax Program in PHP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 524
38.4. Quadratic Roots Program in PHP With Error Checking . . . . . . . . . 525

39.1. While Loop in PHP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 527

39.2. Flag-controlled While Loop in PHP . . . . . . . . . . . . . . . . . . . . . 528
39.3. For Loop in PHP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 529
39.4. Do-While Loop in PHP . . . . . . . . . . . . . . . . . . . . . . . . . . . . 529
39.5. Normalizing a Number with a While Loop in PHP . . . . . . . . . . . . . 530


39.6. Summation of Numbers using a For Loop in PHP . . . . . . . . . . . . . 531

39.7. Nested For Loops in PHP . . . . . . . . . . . . . . . . . . . . . . . . . . 531
39.8. Loan Amortization Program in PHP . . . . . . . . . . . . . . . . . . . . 533

44.1. Processing a file line-by-line in PHP . . . . . . . . . . . . . . . . . . . . . 564

45.1. The completed PHP Student class. . . . . . . . . . . . . . . . . . . . . 576

47.1. Using PHP’s usort() Function . . . . . . . . . . . . . . . . . . . . . . . 585

List of Figures

1.1. Depiction of Computer Memory . . . . . . . . . . . . . . . . . . . . . . . 6

1.2. A Compiling Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

2.1. Types of Flowchart Nodes . . . . . . . . . . . . . . . . . . . . . . . . . . 18

2.2. Example of a flowchart for a simple ATM process . . . . . . . . . . . . . 19
2.3. Elements of a printf() statement in C . . . . . . . . . . . . . . . . . . 44
2.4. Intersection of two circles. . . . . . . . . . . . . . . . . . . . . . . . . . . 63

3.1. Control flow diagrams for sequential control flow and an if-statement. . . 77
3.2. An if-else Flow Chart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
3.3. Control Flow for an If-Else-If Statement . . . . . . . . . . . . . . . . . . 80
3.4. Quadrants of the Cartesian Plane . . . . . . . . . . . . . . . . . . . . . . 87
3.5. Three types of triangles . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
3.6. Intersection of Two Rectangles . . . . . . . . . . . . . . . . . . . . . . . . 91
3.7. Examples of Floor Tiling . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

4.1. A Typical Loop Flow Chart . . . . . . . . . . . . . . . . . . . . . . . . . 96

4.2. A Do-While Loop Flow Chart. The continuation condition is checked after
the loop body. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
4.3. Plot of $f(x) = \frac{\sin x}{x}$ . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
4.4. A rectangle for the interval [−5, 5]. . . . . . . . . . . . . . . . . . . . . . 117
4.5. Follow the bouncing ball . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
4.6. Sampling points in a circle . . . . . . . . . . . . . . . . . . . . . . . . . . 119
4.7. Regular polygons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
4.8. A polygon and its centroid. Whoo! . . . . . . . . . . . . . . . . . . . . . 130

5.1. A function declaration (prototype) in the C programming language with
the return type, identifier, and parameter list labeled. . . . . . . . . . . . 135
5.2. Program Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
5.3. Demonstration of Pass By Value . . . . . . . . . . . . . . . . . . . . . . . 141
5.4. Demonstration of Pass By Reference . . . . . . . . . . . . . . . . . . . . 143

7.1. Example of an Array . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160

7.2. Example returning a static array . . . . . . . . . . . . . . . . . . . . . . 163
7.3. Pitfalls of Returning Static Arrays . . . . . . . . . . . . . . . . . . . . . . 174
7.4. Depiction of Application Memory. . . . . . . . . . . . . . . . . . . . . . . 175
7.5. Shallow vs. Deep Copies . . . . . . . . . . . . . . . . . . . . . . . . . . . 176


9.1. Linux Tree Directory Structure . . . . . . . . . . . . . . . . . . . . . . . 186

9.2. An example polygon for n = 5 . . . . . . . . . . . . . . . . . . . . . . . . 188
9.3. A Word Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
9.4. A solved Sudoku puzzle . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
9.5. A DNA Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
9.6. Codon Table for RNA to Protein Translation . . . . . . . . . . . . . . . . 195

11.1. Recursive Fibonacci Computation Tree . . . . . . . . . . . . . . . . . . . 207

12.1. Array of Integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212

12.2. A Sorted Array . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
12.3. Binary Search Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
12.4. Example of the benefit of ordered (indexed) elements in Windows 7 . . . 220
12.5. Selection Sort Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
12.6. Insertion Sort Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
12.7. Partitioning Example 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
12.8. Partitioning Example 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
12.9. Partitioning Example 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
12.10. Merge Sort Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
12.11. Merge Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
12.12. Generalized Sorting with a Comparator . . . . . . . . . . . . . . . . . . . 240

18.1. Pointer Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298

20.1. Dynamically Allocating Multidimensional Arrays . . . . . . . . . . . . . 321

20.2. Contiguous Two Dimensional Array . . . . . . . . . . . . . . . . . . . . . 324

21.1. Example of a character array (string) in C. . . . . . . . . . . . . . . . . . 325

23.1. Contiguous Structure Array . . . . . . . . . . . . . . . . . . . . . . . . . 350

23.2. Array of Structure Pointers . . . . . . . . . . . . . . . . . . . . . . . . . 350
23.3. Hybrid Array of Structures . . . . . . . . . . . . . . . . . . . . . . . . . . 352

1. Introduction

Computers are awesome. The human race has seen more advancements in the last 50 years
than in the preceding 10,000 years of human history. Technology has transformed the way we
live our daily lives, how we interact with each other, and has changed the course of our
history. Today, everyone carries smartphones that have more computational power than
supercomputers from even 20 years ago. Computing has become ubiquitous; the “internet
of things” will soon become a reality in which every device will become interconnected
and data will be collected and available even about the smallest of minutiae.

However, computers are also dumb. Despite the most fantastical of depictions in science
fiction and hopes of Artificial Intelligence, computers can only do what they are told
to do. The fundamental art of Computer Science is problem solving. Computers are
not good at problem solving; you are the problem solver. It is still up to you, the user,
to approach a complex problem, study it, understand it, and develop a solution to it.
Computers are only good at automating solutions once you have solved the problem.

Computational sciences have become a fundamental tool of almost every discipline.
Scholars have used textual analysis and data mining techniques to analyze classical
literature and historic texts, providing new insights and opening new areas of study.
Astrophysicists have used computational analysis to detect dozens of new exoplanets.
Complex visualizations and models can predict astronomical collisions on a galactic scale.
Physicists have used big data analytics to push the boundaries of our understanding of
matter in the search for the Higgs boson and study of elementary particles. Chemists
simulate the interaction of millions of combinations of compounds without the need for
expensive and time-consuming physical experiments. Biologists use massively distributed
computing models to simulate protein folding and other complex processes. Meteorologists
can predict weather and climatic changes with ever greater accuracy.

Technology and data analytics have changed how political campaigns are run, how
products are marketed and even delivered. Social networks can be data mined to track
and predict the spread of flu epidemics. Computing and automation will only continue
to grow. The time is coming soon when basic computational thinking and the ability
to develop software will be considered a basic skill necessary to every discipline, a
requirement for many jobs and an essential skill akin to arithmetic.

Computer Science is not programming. Programming is a necessary skill, but it is only
the beginning. This book is intended to get you started on your journey.


1.1. Problem Solving

At its heart, Computer Science is about problem solving. That is not to say that only
Computer Science is about problem solving. It would be hubris to think that Computer
Science holds a monopoly on “problem solving.” Indeed, it would be hard to find any
discipline in which solving problems was not a substantial aspect or motivation if not
integral. Instead, Computer Science is the study of computers and computation. It
involves studying and understanding computational processes and the development of
algorithms and techniques and how they apply to problems.

Problem solving skills are not something that can be distilled down into a single step-
by-step process. Each area and each problem comes with its own unique challenges and
considerations. General problem solving techniques can be identified, studied and taught,
but problem solving skills are something that come with experience, hard work, and most
importantly, failure. Problem solving is part and parcel of the human experience.

That doesn’t mean we can’t identify techniques and strategies for approaching problems,
in particular problems that lend themselves to computational solutions. A prerequisite
to solving a problem is understanding it. What is the problem? Who or what entities
are involved in the problem? How do those entities interact with each other? What are
the problems or deficiencies that need to be addressed? Answering these questions, we
get an idea of where we are.

Ultimately, what is desired in a solution? What are the objectives that need to be
achieved? What would an ideal solution look like or what would it do? Who would use
the solution and how would they use it? By answering these questions, we get an idea of
where we want to be. Once we know where we are and where we want to be, the problem
solving process can begin: how do we get from point A to point B?

One of the first things a good engineer asks is: does a solution already exist? If a solution
already exists, then the problem is already solved! Ideally the solution is an “off-the-shelf”
solution: something that already exists and which may have been designed for a different
purpose but that can be repurposed for our problem. However, there may be exceptions
to this. The existing solution may be infeasible: it may be too resource intensive or
expensive. It may be too difficult or too expensive to adapt to our problem. It may solve
most of our problem, but may not work in some corner cases. It may need to be heavily
modified in order to work. Still, this basic question may save a lot of time and effort in
many cases.

In a very broad sense, the problem solving process is one that involves

1. Design

2. Implementation


3. Testing

4. Refinement

After one has a good understanding of a problem, they can start designing a solution. A
design is simply a plan for the construction of a solution. A design “on paper” allows
you to see what the potential solution would look like before investing the resources in
building it. It also allows you to identify possible impediments or problems that were
not readily apparent. A design gives you an opportunity to think through possible
alternative solutions and weigh the advantages and disadvantages of each. Designing a
solution also allows you to understand the problem better. Design can involve gathering
requirements and developing use cases. How would an individual use the proposed
solution? What features would they need or want?

Implementations can involve building prototype solutions to test the feasibility of the
design. It can involve building individual components and integrating them together.

Testing involves finding, designing, and developing test cases: actual instances of the
problem that can be used to test your solution. Ideally, a test case involves
not only the “input” of the problem, but also the “output” of the problem: a feasible or
optimal solution that is known to be correct via other means. Test cases allow us to test
our solution to see if it gives correct and perhaps optimal solutions.

Refinement is a process by which we can redesign, reimplement and retest our solution.
We may want to make the solution more efficient, cheaper, simpler or more elegant. We
may find there are components that are redundant or unnecessary and try to eliminate
them. We may find errors or bugs in our solution that fail to solve the problem for some
or many instances. We may have misinterpreted requirements or there may have been
miscommunication, misunderstanding or differing expectations in the solution between
the designers and stakeholders. Situations may change or requirements may have been
modified or new requirements created and the solution needs to be adapted. Each of
these steps may need to be repeated many times until an ideal, or at least
acceptable, solution is achieved.

Yet another phase of problem solving is maintenance. The solution we create may need
to be maintained in order to remain functional and stay relevant. Design flaws or bugs
may become apparent that were missed in previous phases. The solution may need to be
updated to adapt to new technology or requirements.

In software design there are two general techniques for problem solving: top-down and
bottom-up design. A top-down design strategy approaches a problem by breaking it
down into smaller and smaller problems until either a solution is obvious or trivial or a
preexisting solution (the aforementioned “off-the-shelf” solution) exists. The solutions to
the subproblems are combined and interact to solve the overall problem.

A bottom-up strategy attempts to first completely define the smallest components or
entities that make up a system. Once these have been defined and implemented,
they are combined and interactions between them are defined to produce a more complex
system.

1.2. Computing Basics

Everyone has some level of familiarity with computers and computing devices just as
everyone has familiarity with automotive basics. However, just because you drive a car
every day doesn’t mean you can tell the difference between a crankshaft and a piston. To
get started, let’s familiarize ourselves with some basic concepts.

A computer is a device, usually electronic, that stores, receives, processes, and outputs
information. Modern computing devices include everything from simple sensors to mobile
devices, tablets, desktops, mainframes/servers, supercomputers and huge grid clusters
consisting of multiple computers networked together.

Computer hardware usually refers to the physical components in a computing system,
which include input devices such as a mouse/touchpad, keyboard, or touchscreen; output
devices such as monitors; storage devices such as hard disks and solid state drives; as
well as electronic components such as graphics cards, main memory, motherboards,
and the chips that make up the Central Processing Unit (CPU).

Computer processors are complex electronic circuits (referred to as Very Large Scale
Integration (VLSI)) which contain millions or even billions of microscopic transistors: electronic
“gates” that can perform logical operations and complex instructions. A CPU
typically contains an Arithmetic and Logic Unit (ALU) that performs
arithmetic operations such as addition, multiplication, division, etc.

Computer Software usually refers to the actual machine instructions that are run on a
processor. Software is usually written in a high-level programming language such as C or
Java and then converted to machine code that the processor can execute.

Computers “speak” in binary code. Binary is nothing more than a structured collection
of 0s and 1s. A single 0 or 1 is referred to as a bit. Bits can be collected to form larger
chunks of information: 8 bits form a byte, 1024 bytes is referred to as a kilobyte, etc.
Table 1.1 contains several more binary units. Each unit is in terms of a power of 2
instead of a power of 10. As humans, we are more familiar with decimal (base-10) numbers
and so units are usually expressed as powers of 10: kilo- refers to 10^3, mega- is 10^6, etc.
However, since binary is base-2 (0 or 1), units are associated with the closest power
of 2. Computers are binary machines because it is the most practical to implement in
electronic devices. 0s and 1s can be easily represented by low/high voltage; low/high
frequency; on-off; etc. It is much easier to design and implement systems that switch
between only two states.

Unit             2^n     Number of bytes

Kilobyte (KB)    2^10    1,024
Megabyte (MB)    2^20    1,048,576
Gigabyte (GB)    2^30    1,073,741,824
Terabyte (TB)    2^40    1,099,511,627,776
Petabyte (PB)    2^50    1,125,899,906,842,624
Exabyte (EB)     2^60    1,152,921,504,606,846,976
Zettabyte (ZB)   2^70    1,180,591,620,717,411,303,424
Yottabyte (YB)   2^80    1,208,925,819,614,629,174,706,176

Table 1.1.: Various units of digital information with respect to bytes. Memory is usually
measured using powers of two.

Computer memory can refer to secondary memory, which typically consists of long-term storage
devices such as hard disks, flash drives, SD cards, optical disks (CDs, DVDs), etc. These
generally have a large capacity but are slower (the time it takes to access a chunk of data
is longer). Or, it can refer to main memory (or primary memory): data stored on chips
that is much faster but also more expensive and thus generally smaller.

The first hard disk (IBM 350) was developed in 1956 by IBM and had a capacity of
3.75MB and cost $3,200 ($27,500 in 2015 dollars) per month to lease. For perspective,
the first commercially available TB hard drive was released in 2007. As of 2015, terabyte
hard disks can be commonly purchased for $50–$100.

Main memory, sometimes referred to as Random Access Memory (RAM), consists of
a collection of addresses along with contents. An address usually refers to a single
byte of memory (called byte-addressing). The content, that is the byte of data that is
stored at an address, can be anything. It can represent a number, a letter, etc. To the
computer it is all just a bunch of 0s and 1s. For convenience, memory addresses are
represented using hexadecimal, which is a base-16 counting system using the symbols
0, 1, ..., 9, a, b, c, d, e, f. Numbers are prefixed with 0x to indicate they represent
hexadecimal numbers. Figure 1.1 depicts memory and its address/contents.

Separate computing devices can be connected to each other through a network. Networks
can be wired with electrical signals or light as in fiber optics which provide large bandwidth
(the amount of data that can be sent at any one time), but can be expensive to build and
maintain. They can also be wireless, but provide shorter range and lower bandwidth.

1.3. Basic Program Structure

Programs start out as source code, a collection of instructions usually written in a
high-level programming language. A source file containing source code is nothing more than
a plain text file that can be edited by any text editor. However, many developers and
programmers utilize modern Integrated Development Environments (IDEs) that provide a
text editor with code highlighting: various elements are displayed in different colors to
make the code more readable and elements can be easily identified. Mistakes such as
unclosed comments or curly brackets can be readily apparent with such editors. IDEs can
also provide automated compile/build features and other tools that make the development
process easier and faster.

Address           Contents
...               ...
0x7fff58310b8f
0x7fff58310b8b    0x32
0x7fff58310b8a    0x3e
0x7fff58310b89    0xcf
0x7fff58310b88    0x23
0x7fff58310b87    0x01
0x7fff58310b86    0x32
0x7fff58310b85    0x7c
0x7fff58310b84    0xff
0x7fff58310b83
0x7fff58310b82
0x7fff58310b81
0x7fff58310b80
0x7fff58310b7f
0x7fff58310b7e
0x7fff58310b7d
0x7fff58310b7c    3.14159265359
0x7fff58310b7b
0x7fff58310b7a
0x7fff58310b79
0x7fff58310b78    32,321,231
0x7fff58310b77
0x7fff58310b76
0x7fff58310b75
0x7fff58310b74    1,458,321
0x7fff58310b73    \0
0x7fff58310b72    o
0x7fff58310b71    l
0x7fff58310b70    l
0x7fff58310b6f    e
0x7fff58310b6e    H
0x7fff58310b6d    0xfa
0x7fff58310b6c    0xa8
0x7fff58310b6b    0xba
...               ...

Figure 1.1.: Depiction of Computer Memory. Each address refers to a byte, but different
types of data (integers, floating-point numbers, characters) may require
different amounts of memory. Memory addresses and some data are represented
in hexadecimal.

Some languages are compiled languages meaning that a source file must be translated
into machine code that a processor can understand and execute. This is actually a
multistep process. A compiler may first preprocess the source file(s) and perform some
pre-compiler operations. It may then transform the source code into another language
such as an assembly language, a lower-level more machine-like language. Ultimately, the
compiler transforms the source code into object code, a binary format that the machine
can understand.

To produce an executable file that can actually be run, a linker may then take the object
code and link in any other necessary objects or precompiled library code necessary to
produce a final program. Finally, an executable file (still just a bunch of binary code) is
produced.

Once an executable file has been produced we can run the program. When a program is
executed, a request is sent to the operating system to load and run the program. The
operating system loads the executable file into memory and may set up additional memory
for its variables as well as its call stack (memory to enable the program to make function
calls). Once loaded and set up, the operating system begins executing the instructions at
the program’s entry point.

In many languages, a program’s entry point is defined by a main function or method.
A program may contain many functions and pieces of code, but this special function is
defined as the one that gets invoked when a program starts. Without a main function,
the code may still be useful: libraries contain many useful functions and procedures so
that you don’t have to write a program from scratch. However, these functions are not
intended to be run by themselves. Instead, they are written so that other programs can
use them. A program becomes executable only when a main entry point is provided.

This compile-link-execute process is roughly depicted in Figure 1.2. An example of
a simple C program can be found in Code Sample 1.1 along with the resulting assembly
code produced by a compiler in Code Sample 1.2 and the final machine code represented in
hexadecimal in Code Sample 1.3.

In contrast, some languages are interpreted, not compiled. The source code is contained
in a file usually referred to as a script. Rather than being run directly by an operating
system, the operating system loads and executes another program called an interpreter.
The interpreter then loads the script, parses it, and executes its instructions. Interpreted

[Figure: flowchart. Text Editor or IDE -> Source File -> Compiler -> (Syntax Error(s), or on success) Object File -> Linker (together with Other Object Files & Libraries) -> Executable File -> run with Input -> Results & Output]

Figure 1.2.: A Compiling Process

#include <stdlib.h>
#include <stdio.h>
#include <math.h>

int main(int argc, char **argv) {

    // expect exactly one command line argument: the number x
    if(argc != 2) {
        fprintf(stderr, "Usage: %s x\n", argv[0]);
        exit(1);
    }

    double x = atof(argv[1]);
    double result = sqrt(x);

    // negative inputs have no real square root
    if(x < 0) {
        fprintf(stderr, "Cannot handle complex roots\n");
        exit(2);
    }

    printf("square root of %f = %f\n", x, result);

    return 0;
}

Code Sample 1.1.: A simple program in C

.section __TEXT,__text,regular,pure_instructions
.globl _main
.align 4, 0x90
_main: ## @main
.cfi_startproc
## BB#0:
pushq %rbp
Ltmp2:
.cfi_def_cfa_offset 16
Ltmp3:
.cfi_offset %rbp, -16
movq %rsp, %rbp
Ltmp4:
.cfi_def_cfa_register %rbp
subq $48, %rsp
movl $0, -4(%rbp)
movl %edi, -8(%rbp)
movq %rsi, -16(%rbp)
cmpl $2, -8(%rbp)
je LBB0_2
## BB#1:
leaq L_.str(%rip), %rsi
movq ___stderrp@GOTPCREL(%rip), %rax
movq (%rax), %rdi
movq -16(%rbp), %rax
movq (%rax), %rdx
movb $0, %al
callq _fprintf
movl $1, %edi
movl %eax, -36(%rbp) ## 4-byte Spill
callq _exit
LBB0_2:
movq -16(%rbp), %rax
movq 8(%rax), %rdi
callq _atof
xorps %xmm1, %xmm1
movsd %xmm0, -24(%rbp)
movsd -24(%rbp), %xmm0
sqrtsd %xmm0, %xmm0
movsd %xmm0, -32(%rbp)
ucomisd -24(%rbp), %xmm1
jbe LBB0_4
## BB#3:
leaq L_.str1(%rip), %rsi
movq ___stderrp@GOTPCREL(%rip), %rax
movq (%rax), %rdi
movb $0, %al
callq _fprintf
movl $2, %edi
movl %eax, -40(%rbp) ## 4-byte Spill
callq _exit
LBB0_4:
leaq L_.str2(%rip), %rdi
movsd -24(%rbp), %xmm0
movsd -32(%rbp), %xmm1
movb $2, %al
callq _printf
movl $0, %ecx
movl %eax, -44(%rbp) ## 4-byte Spill
movl %ecx, %eax
addq $48, %rsp
popq %rbp
retq
.cfi_endproc

.section __TEXT,__cstring,cstring_literals
L_.str: ## @.str
.asciz "Usage: %s x\n"

L_.str1: ## @.str1
.asciz "Cannot handle complex roots\n"

L_.str2: ## @.str2
.asciz "square root of %f = %f\n"

.subsections_via_symbols

Code Sample 1.2.: A simple program in C, compiled to assembly

00000e40 55 48 89 e5 48 83 ec 30 c7 45 fc 00 00 00 00 89 |UH..H..0.E......|
00000e50 7d f8 48 89 75 f0 81 7d f8 02 00 00 00 0f 84 2c |}.H.u..}.......,|
00000e60 00 00 00 48 8d 35 f2 00 00 00 48 8b 05 9f 01 00 |...H.5....H.....|
00000e70 00 48 8b 38 48 8b 45 f0 48 8b 10 b0 00 e8 94 00 |.H.8H.E.H.......|
00000e80 00 00 bf 01 00 00 00 89 45 dc e8 81 00 00 00 48 |........E......H|
00000e90 8b 45 f0 48 8b 78 08 e8 6e 00 00 00 0f 57 c9 f2 |.E.H.x..n....W..|
00000ea0 0f 11 45 e8 f2 0f 10 45 e8 f2 0f 51 c0 f2 0f 11 |..E....E...Q....|
00000eb0 45 e0 66 0f 2e 4d e8 0f 86 25 00 00 00 48 8d 35 |E.f..M...%...H.5|
00000ec0 a5 00 00 00 48 8b 05 45 01 00 00 48 8b 38 b0 00 |....H..E...H.8..|
00000ed0 e8 41 00 00 00 bf 02 00 00 00 89 45 d8 e8 2e 00 |.A.........E....|
00000ee0 00 00 48 8d 3d 9d 00 00 00 f2 0f 10 45 e8 f2 0f |..H.=.......E...|
00000ef0 10 4d e0 b0 02 e8 22 00 00 00 b9 00 00 00 00 89 |.M....".........|
00000f00 45 d4 89 c8 48 83 c4 30 5d c3 ff 25 08 01 00 00 |E...H..0]..%....|
00000f10 ff 25 0a 01 00 00 ff 25 0c 01 00 00 ff 25 0e 01 |.%.....%.....%..|
00000f20 00 00 00 00 4c 8d 1d dd 00 00 00 41 53 ff 25 cd |....L......AS.%.|
00000f30 00 00 00 90 68 00 00 00 00 e9 e6 ff ff ff 68 0c |....h.........h.|
00000f40 00 00 00 e9 dc ff ff ff 68 18 00 00 00 e9 d2 ff |........h.......|
00000f50 ff ff 68 27 00 00 00 e9 c8 ff ff ff 55 73 61 67 |..h'........Usag|
00000f60 65 3a 20 25 73 20 78 0a 00 43 61 6e 6e 6f 74 20 |e: %s x..Cannot |
00000f70 68 61 6e 64 6c 65 20 63 6f 6d 70 6c 65 78 20 72 |handle complex r|
00000f80 6f 6f 74 73 0a 00 73 71 75 61 72 65 20 72 6f 6f |oots..square roo|
00000f90 74 20 6f 66 20 25 66 20 3d 20 25 66 0a 00 00 00 |t of %f = %f....|
00000fa0 01 00 00 00 1c 00 00 00 00 00 00 00 1c 00 00 00 |................|
00000fb0 00 00 00 00 1c 00 00 00 02 00 00 00 40 0e 00 00 |............@...|
00000fc0 34 00 00 00 34 00 00 00 0b 0f 00 00 00 00 00 00 |4...4...........|
00000fd0 34 00 00 00 03 00 00 00 0c 00 01 00 10 00 01 00 |4...............|
00000fe0 00 00 00 00 00 00 00 01 14 00 00 00 00 00 00 00 |................|
00000ff0 01 7a 52 00 01 78 10 01 10 0c 07 08 90 01 00 00 |.zR..x..........|
00001000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00001010 00 00 00 00 00 00 00 00 34 0f 00 00 01 00 00 00 |........4.......|
00001020 3e 0f 00 00 01 00 00 00 48 0f 00 00 01 00 00 00 |>.......H.......|
00001030 52 0f 00 00 01 00 00 00 00 00 00 00 00 00 00 00 |R...............|
00001040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
...(cut for room)...
00002000 11 22 18 54 00 00 00 00 11 40 5f 5f 5f 73 74 64 |.".T.....@___std|
00002010 65 72 72 70 00 51 72 10 90 40 64 79 6c 64 5f 73 |errp.Qr..@dyld_s|
00002020 74 75 62 5f 62 69 6e 64 65 72 00 80 e8 ff ff ff |tub_binder......|
00002030 ff ff ff ff ff 01 90 00 72 18 11 40 5f 61 74 6f |........r..@_ato|
00002040 66 00 90 00 72 20 11 40 5f 65 78 69 74 00 90 00 |f...r .@_exit...|
00002050 72 28 11 40 5f 66 70 72 69 6e 74 66 00 90 00 72 |r(.@_fprintf...r|
00002060 30 11 40 5f 70 72 69 6e 74 66 00 90 00 00 00 00 |0.@_printf......|
00002070 00 01 5f 00 05 00 02 5f 6d 68 5f 65 78 65 63 75 |.._...._mh_execu|
00002080 74 65 5f 68 65 61 64 65 72 00 21 6d 61 69 6e 00 |te_header.!main.|
00002090 25 02 00 00 00 03 00 c0 1c 00 00 00 00 00 00 00 |%...............|
000020a0 c0 1c 00 00 00 00 00 00 fa de 0c 05 00 00 00 14 |................|
000020b0 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000020c0 02 00 00 00 0f 01 10 00 00 00 00 00 01 00 00 00 |................|
000020d0 16 00 00 00 0f 01 00 00 40 0e 00 00 01 00 00 00 |........@.......|
000020e0 1c 00 00 00 01 00 00 01 00 00 00 00 00 00 00 00 |................|
000020f0 27 00 00 00 01 00 00 01 00 00 00 00 00 00 00 00 |'...............|
00002100 2d 00 00 00 01 00 00 01 00 00 00 00 00 00 00 00 |-...............|
00002110 33 00 00 00 01 00 00 01 00 00 00 00 00 00 00 00 |3...............|
00002120 3c 00 00 00 01 00 00 01 00 00 00 00 00 00 00 00 |<...............|
00002130 44 00 00 00 01 00 00 01 00 00 00 00 00 00 00 00 |D...............|
00002140 03 00 00 00 04 00 00 00 05 00 00 00 06 00 00 00 |................|
00002150 07 00 00 00 00 00 00 40 02 00 00 00 03 00 00 00 |.......@........|
00002160 04 00 00 00 05 00 00 00 06 00 00 00 20 00 5f 5f |............ .__|
00002170 6d 68 5f 65 78 65 63 75 74 65 5f 68 65 61 64 65 |mh_execute_heade|
00002180 72 00 5f 6d 61 69 6e 00 5f 5f 5f 73 74 64 65 72 |r._main.___stder|
00002190 72 70 00 5f 61 74 6f 66 00 5f 65 78 69 74 00 5f |rp._atof._exit._|
000021a0 66 70 72 69 6e 74 66 00 5f 70 72 69 6e 74 66 00 |fprintf._printf.|
000021b0 64 79 6c 64 5f 73 74 75 62 5f 62 69 6e 64 65 72 |dyld_stub_binder|
000021c0 00 00 00 00 |....|
000021c4

Code Sample 1.3.: A simple program in C, resulting machine code formatted in
hexadecimal (partial)

languages may still have a predefined main function, but in general, a script begins
executing at the first instruction in the script file. Adhering to the syntax
rules is still important, but since interpreted languages are not compiled, syntax errors
become runtime errors. A program may run fine until its first syntax error at which
point it fails.

There are other ways of compiling and running programs. Java for example represents a
compromise between compiled and interpreted languages. Java source code is compiled
into Java bytecode which is not actually machine code that the operating system and
hardware can run directly. Instead, it is compiled code for a Java Virtual Machine (JVM).
This allows a developer to write highly portable code, compile it once and it is runnable
on any JVM on any system (write-once, compile-once, run-anywhere).

In general, interpreted languages are slower than compiled languages because they are
being run through another program (the interpreter) instead of being executed directly
by the processor. Modern tools have been introduced to solve this problem. Just In Time
(JIT) compilers have been developed that take scripts that are not usually compiled,
and compile them to a native machine code format which has the potential to run much
faster than when interpreted. Modern web browsers typically do this for JavaScript code
(Google Chrome’s V8 JavaScript engine for example).

Transpilers are another related technology. Transpilers are source-to-source compilers.
They don’t produce assembly or machine code; instead they translate code in one
high-level programming language to another high-level programming language. This
is sometimes done to ensure that scripting languages like JavaScript are backwards
compatible with previous versions of the language. Transpilers can also be used to
translate one language into the same language but with different aspects (such as parallel
or synchronized code) automatically added. They can also be used to translate older
languages such as Pascal to more modern languages as a first step in updating a legacy
system.

1.4. Syntax Rules & Pseudocode

Programming languages are a lot like human languages in that they have syntax rules.
These rules dictate the appropriate arrangements of words, punctuation, and other
symbols that form valid statements in the language. For example, in many programming
languages, commands or statements are terminated by semicolons (just as most sentences
are ended with a period). This is an example of “punctuation” in a programming language.
In English, paragraphs are separated by blank lines; in many programming languages, blocks of
code are delimited by curly brackets. Variables are comparable to nouns and operations and
functions are comparable to verbs. Complex documents often have footnotes that provide
additional explanations; code has comments that provide documentation and explanation
for important elements. English is read top-to-bottom, left-to-right. Programming
languages are similar: individual executable commands are typically written one per line. When a
program executes, each command executes one after the other, top-to-bottom. This is
known as sequential control flow.

A block of code is a section of code that has been logically grouped together. Many
languages allow you to define a block by enclosing the grouped code between opening and
closing curly brackets. Blocks can be nested within each other to form sub-blocks.

Most languages also have reserved words and symbols that have special meaning. For
example, many languages assign special meaning to keywords such as for, if, while,
etc. that are used to define various control structures such as conditionals and loops.
Special symbols include operators such as + and * for performing basic arithmetic.

Failure to adhere to the syntax rules of a particular language will lead to bugs and
programs that fail to compile and/or run. Natural languages such as English are very
forgiving: we can generally discern what someone is trying to say even if they speak
in broken English (to a point). However, a compiler or interpreter isn’t as smart as a
human. Even a small syntax error (an error in the source code that does not conform
to the language’s rules) will cause a compiler to completely fail to understand the code
you have written. Learning a programming language is a lot like learning a new spoken
language (but, fortunately, a lot easier).

In subsequent parts of this book we focus on particular languages. However, in order
to focus on concepts, we’ll avoid specific syntax rules by using pseudocode: informal,
high-level descriptions of algorithms and processes. Good pseudocode makes use of plain
English and mathematical notation, making it more readable and abstract. A small
example can be found in Algorithm 1.1.

Input : A collection of numbers, A = {a_1, a_2, ..., a_n}
Output : The minimal element in A

Let min be equal to a_1
foreach element a_i in A do
    if a_i < min then
        // a_i is less than the smallest element we’ve found so far
        Update min to be equal to a_i
    end
end
output min

Algorithm 1.1: An example of pseudocode: finding a minimum value

1.5. Documentation, Comments, and Coding Style

Good code is not just functional, it is also beautiful. Good code is organized, easy to
read, and well documented. Organization can be achieved by separating code into useful
functions and collecting functions into modules or libraries. Good organization means
that at any one time, we only need to focus on a small part of a program.

It would be difficult to read an essay that contained random line breaks, unindented
paragraphs, inconsistent spacing, or a mix of different fonts. Likewise, code
should be legible. Well written code is consistent and makes good use of whitespace
and indentation. Code within the same code block should be indented at the same level.
Nested blocks should be further indented just like the outline of an essay or table of
contents.

Code should also be well documented. Each line or segment of code should be clear
enough that it tells the user what the code does and how it does it. This is referred to as
“self-documenting” code. A person familiar with the particular language should be able
to read your code and immediately understand what it does. In addition, well-written
code should contain sufficient and clear comments. A comment in a program is intended
for a human user to read. A comment is ultimately ignored by the compiler/interpreter
and has no effect on the actual program. Good comments tell the user why the code was
written or why it was written the way it was. Comments provide a high-level description
of what a block of code, function, or program does. If the particular method or algorithm
is of interest, it should also be documented.

There are typically two ways to write comments. Single line comments usually begin
with two forward slashes, //.¹ Everything after the slashes until the end of the line is ignored.
Multiline comments begin with a /* and end with a */; everything between them is
ignored even if it spans multiple lines. This syntax is shared among many languages
including C, Java, PHP and others. Some examples:

double x = sqrt(y); //this is a single line comment

/*
   This is a multiline comment
   each line is ignored, but allows
   for better formatting
*/

/**
 * This is a doc-style comment, usually placed in
 * front of major portions of code such as a function
 * to provide documentation
 * It begins with a forward-slash-star-star
 */

¹ You can remember the difference between a forward slash / and a backslash \ by thinking of a
person facing right and either leaning backwards (backslash) or forwards (forward slash).

The last example above is a doc-style comment. It originated with Java, but has since
been adopted by many other programming languages. Syntactically it is a normal
multiline comment, but begins with a /**. Asterisks are aligned together on each
line. Certain commenting systems allow you to place other marked up data inside these
comments such as labeling parameters (@param x) or use HTML code to provide links
and style. These doc-style comments are used to provide documentation for major parts
of the code especially functions and data structures. Though not part of the language,
other documentation tools can be used to gather the information in doc-style comments
to produce formatted documentation such as web pages or Portable Document Format
(PDF) documents.

Comments should not be trivial: they should not explain something that should be
readily apparent to an experienced user or programmer. For example, if a piece of code
adds two numbers together and stores the result, there should not be a comment that
explains the process. It is a simple and common enough operation that is self-evident.
However, if a function uses a particular process or algorithm such as a Fourier Transform
to perform an operation, it would be appropriate to document it in a series of comments.

Comments can also detail how a function or piece of code should be used. This is
typically done when developing an Application Programming Interface (API) for use
by other programmers. The API’s available functions should be well documented so
that users will know how and when to use a particular function. The documentation can describe the
function’s expectations and behavior such as how it handles bad input or error situations.

2. Basics

2.1. Control Flow

The flow of control (or simply control flow) is how a program processes its instructions.
Typically, programs operate in a linear or sequential flow of control. Executable statements
or instructions in a program are performed one after another. In source code, the order
that instructions are written defines their order. Just like English, a program is “read”
top to bottom. Each statement may modify the state of a program. The state of a
program is the value of all its variables and other information/data stored in memory
at a given moment during its execution. Further, an executable statement may instead
invoke (or call or execute) another procedure (also called subroutine, function, method,
etc.) which is another unit of code that has been encapsulated into one unit so that it
can be reused.

This type of control flow is usually associated with a procedural programming paradigm
(which is closely related to imperative or structured programming paradigms). Though
this text will mostly focus on languages that are procedural (or that have strong procedural
aspects), it is important to understand that there are other programming language
paradigms. Functional programming languages such as Scheme and Haskell achieve
computation through the evaluation of mathematical functions with little or no
state (“pure” functional languages have none at all). Declarative languages such as those used in database languages
like SQL or in spreadsheets like Excel specify computation by expressing the logic of
computation rather than explicitly specifying control flow. For a more formal introduction
to programming language paradigms, a good resource is Seven Languages in Seven Weeks:
A Pragmatic Guide to Learning Programming Languages by Tate [36].

2.1.1. Flowcharts

Sometimes processes are described using diagrams called flowcharts. A flowchart is a
visual representation of an algorithm or process consisting of boxes or “nodes” connected
by directed edges. Boxes can represent an individual step or a decision to be made. The
edges establish an order of operations in the diagram.

Some boxes represent decisions to be made which may have one or more alternate routes
(more than one directed edge going out of the box) depending on the result of the
decision. Decision boxes are usually depicted with a diamond-shaped box.

[Figure: three example nodes: (a) a decision node labeled “Decision Node”, (b) a control node labeled “Control to Perform”, (c) an action node labeled “Action to Perform”]

Figure 2.1.: Types of Flowchart Nodes. Control and action nodes are distinguished
by color. Control nodes are automated steps while action nodes are steps
performed as part of the algorithm being depicted.

Other boxes represent a process, operation, or action to be performed. Boxes representing
a process are usually rectangles. We will further distinguish two types of processes using
two different colorings: we’ll use green to represent boxes that are steps directly related
to the algorithm being depicted. We’ll use blue for actions that are necessary to the
control flow of the algorithm such as assigning a value to a variable or incrementing a
value as part of a loop. Figure 2.1 depicts the three types of boxes we’ll use. Figure 2.2
depicts a simple ATM (Automated Teller Machine) process as an example.

2.2. Variables

In mathematics, variables are used as placeholders for values that aren’t necessarily
known. For example, in the equation,

x = 3y + 5

the variables x and y represent numbers that can take on a number of different values.

Similarly, in a computer program, we also use variables to store values. A variable is
essentially a memory location in which a value can be stored. Typically, a variable is
referred to by a name or identifier (like x, y, z in mathematics). In mathematics variables
are usually used to hold numerical values. However, in programming, variables can
usually hold different types of values such as numbers, strings (a collection of characters),
Booleans (true or false values), or more complex types such as arrays or objects.


[Figure 2.2 flowchart, summarized: Input PIN → Is PIN correct? (no: Eject Card; yes: Get amount to withdraw from User Input) → Sufficient Funds? (no: Eject Card; yes: Dispense amount)]

Figure 2.2.: Example of a flowchart for a simple ATM process

2.2.1. Naming Rules & Conventions

Most programming languages have very specific rules as to what you can use as variable
identifiers (names). For example, most programming languages do not allow you to use
whitespace characters (space, tab, etc.) in a variable’s identifier. Allowing spaces would
make variable names ambiguous: where does the variable’s name end and the rest of the
program continue? How could you tell the difference between “average score” and two
separate variables named “average” and “score”? Many programming languages also
have reserved words: words or terms that are used by the programming language itself
and have special meaning. Variable names cannot be the same as any reserved word as
the language wouldn’t be able to distinguish between them.

For similar reasons, many programming languages do not allow you to start a variable
name with a number as it would make it more difficult for a compiler or interpreter to
parse a program’s source code. Yet other languages require that variables begin with a
specific character (PHP for example requires that all variables begin with a dollar sign,
$ ).

In general, most programming languages allow you to use a combination of uppercase
letters (A-Z), lowercase letters (a-z), digits (0-9), and certain special characters
such as underscores _ or dollar signs, $ . Moreover, most programming languages (like
English) are case sensitive meaning that a variable name using lowercase letters is not
the same variable as one that uses uppercase letters. For example, the variables x and
X are different; the variables average , Average and AVERAGE are all different as well.
A few languages are case-insensitive meaning that they do not recognize differences in
lower and uppercase letters when used in variable identifiers. Even in these languages,
however, using a mixture of lowercase and uppercase letters to refer to the same variable
is discouraged: it is difficult to read, inconsistent, and just plain ugly.
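
The naming rules and case sensitivity described above can be seen in a short C sketch. The variable values and the helper name `names_are_distinct` are made up for illustration; the commented-out lines show identifiers that a C compiler would reject.

```c
#include <stdbool.h>

/* C is case sensitive, so these are three separate variables even
   though the identifiers spell the same word. */
int average = 1;
int Average = 2;
int AVERAGE = 3;

/* The following declarations would NOT compile in C:
 *   int average score;   whitespace is not allowed in an identifier
 *   int 2fast;           an identifier cannot start with a digit
 *   int while;           "while" is a reserved word
 */

/* Returns true when the three identifiers hold different values,
   showing that each name refers to its own variable. */
bool names_are_distinct(void) {
    return average != Average && Average != AVERAGE && average != AVERAGE;
}
```

Calling `names_are_distinct()` returns true because the three names refer to three different memory locations.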

Beyond the naming rules that languages may enforce, most languages have established
naming conventions; a set of guidelines and best-practices for choosing identifier names
for variables (as well as functions, methods, and class names). Conventions may be
widely adopted on a per-language basis or may be established within a certain library,
framework or by an organization that may have official style guides. Naming conventions
are intended to give source code consistency which ultimately improves readability and
makes it easier to understand. Following a consistent convention can also greatly reduce
the chance for errors and mistakes. Good naming conventions also have an aesthetic
appeal; code should be beautiful.

There are several general conventions when it comes to variables. An early convention
that is still in common use is underscore casing, in which variable names consisting of more
than one word have words separated by underscore characters with all other characters
being lowercase. For example:

average_score , number_of_students , miles_per_hour

A variation on this convention is to use all uppercase letters such as MILES_PER_HOUR .

A more modern convention is to use lower camel casing (or just camel casing) in which
variable names with multiple words are written as one long word with the first letter in
each new word capitalized but with the first word’s first letter lowercase. For example:

averageScore , numberOfStudents , milesPerHour

The convention refers to the capitalized letters resembling the humps of a camel. One
advantage that camel casing has over underscore casing is that you’re not always straining
to type the underscore character. Yet another similar convention is upper camel casing,
also known as PascalCase,¹ which is like camel casing, but the first letter in the first word
is also capitalized:

AverageScore , NumberOfStudents , MilesPerHour

Each of these conventions is used in various languages in different contexts which we’ll
explore more fully in subsequent sections (usually underscore lowercasing and camel
casing are used to denote variables and functions, PascalCase is used to denote user
defined types such as classes or structures, and underscore uppercasing is used to denote
static and constant variables). However, for our purposes, we’ll use lower camel casing
for variables in our pseudocode.
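
As a rough illustration of how these conventions can coexist in one program, here is a small C sketch. The names `MAX_STUDENTS`, `CourseSection`, and `total_points` are hypothetical, and which convention applies to which kind of identifier varies by language and style guide.

```c
/* Underscore uppercasing: a constant */
#define MAX_STUDENTS 30

/* PascalCase: a user-defined type */
typedef struct {
    /* lower camel casing: variables (here, fields of the structure) */
    int numberOfStudents;
    double averageScore;
} CourseSection;

/* Underscore (lowercase) casing: a function name, as is common in C */
double total_points(const CourseSection *section) {
    return section->numberOfStudents * section->averageScore;
}
```

For a section with 20 students and an average score of 85.5, `total_points` returns 1710.0.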
¹ Rarely, this is referred to as DromedaryCase; a Dromedary is an Arabian camel.

There are exceptions and special cases to each of these conventions such as when a variable
name involves an acronym or a hyphenated word, etc. In such cases sensible extensions or
compromises are employed. For example, xmlString or priorityXMLParser (involving
the acronym Extensible Markup Language (XML)) may be used which keep all letters in
the acronym consistent (all lowercase or all uppercase).

In addition to these conventions, there are several best-practice principles when deciding
on identifiers.

• Be descriptive, but not verbose – Use variable names that describe what the
variable represents. The examples above, averageScore , numberOfStudents ,
milesPerHour clearly indicate what the variable is intended to represent. Using
good, descriptive names makes your code self-documenting (a reader can make
sense of it without having to read extensive supplemental documentation).

Avoid meaningless variable names such as value , aVariable , or some cryptic
combination like v10 (it’s the 10th variable I’ve used!). Ambiguous variables such
as name should also be avoided unless the context makes it clear what you are
referring to (as when used inside of a Person object).

Single character variables are commonly used, but only in a context in which their
meaning is clearly understood. For example, variable names such as x , y are okay
if they are used to refer to points in the Euclidean plane. Single character variables
such as i , j are often used as index variables when iterating over arrays. In this
case, terseness is valued over descriptiveness as the context is very well understood.

As a general rule, the more a variable is used, the shorter it should be. For
example, the variable numStudents may be preferred over the full variable
numberOfStudents .

• Avoid abbreviations (or at least use them sparingly) – You’re not being charged by
the character in your code; you can afford to write out full words. Abbreviations can
help to write shorter variable names, but not all abbreviations are the same. The
word “abbreviation” itself could be abbreviated as “abbr.”, “abbrv.” or “abbrev.”
for example. Abbreviations are not always universally understood by all users,
may be ambiguous and non-standard. Moreover, modern IDEs provide automatic
code completion, relieving you of the need to type longer variable names. If the
abbreviation is well-known or understood from context, then it may make sense to
use it.

• Avoid acronyms (or at least use them sparingly) – Using acronyms in variable
names comes with many of the same problems as abbreviations. However, if it makes
sense in the context of your code and has little chance of being misunderstood or
mistaken, then go for it. For example, in the context of a financial application,
APR (Annual Percentage Rate) would be a well-understood acronym in which case
the variable apr may be preferred over the longer annualPercentageRate .

• Avoid pluralizations, use singular forms – English is not a very consistent language
when it comes to rules like pluralizations. For most cases you simply add “s”; for
others you add “es” or change the “y” to “i” and add “es”. Some words are the
same form for singular and plural such as “glasses.”² Other words have completely
different forms (“focus” becomes “foci”). Still yet there are instances in which
multiple words are acceptable: the plural of “person” can be “persons” or “people”.
Avoiding plural forms keeps things simple and consistent: you don’t need to be
a grammarian in order to easily read code. One potential exception to this is when
using a collection such as an array to hold more than one element or when the variable
represents a quantity that is pluralized (as with numberOfStudents above).

Though the guidelines above provide a good framework from which to write good variable
names, reasonable people can and do disagree on best practice because at some point as
you go from generalities to specifics, conventions become more of a matter of personal
preference and subjective aesthetics. Sometimes an organization may establish its own
coding standards or style guide that must be followed which of course trumps any of the
guidelines above.

In the end, a good balance must be struck between readability and consistency. Rules
and conventions should be followed, until they get in the way of good code that is.

2.2.2. Types

A variable’s type (or data type) is the characterization of the data that it represents. As
mentioned before, a computer only “speaks” in 0s and 1s (binary). A variable is merely
a memory location in which a series of 0s and 1s is stored. That binary string could
represent a number (either an integer or a floating point number), a single alphanumeric
character or series of characters (string), a Boolean type or some other, more complex
user-defined type.

The type of a variable is important because it affects how the raw binary data stored
at a memory location is interpreted. Moreover, some types take a different amount of
memory to store. For example, an integer type could take 32 bits while a floating point
type could take 64 bits. Programming languages may support different types and may
do so in different ways. In the next few sections we’ll describe some common types that
are supported by many languages.
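
To see how much memory a type takes on a particular machine, C provides the `sizeof` operator. The sketch below (helper names are illustrative) converts those sizes to bits. On many desktop platforms an `int` is 32 bits and a `double` is 64 bits, but the C standard only guarantees minimum sizes, so the results may differ elsewhere.

```c
#include <limits.h>
#include <stddef.h>

/* Number of bits used to store one value of each type on this platform.
   sizeof gives the size in bytes; CHAR_BIT is the number of bits per byte. */
size_t bits_in_char(void)   { return sizeof(char) * CHAR_BIT; }
size_t bits_in_int(void)    { return sizeof(int) * CHAR_BIT; }
size_t bits_in_double(void) { return sizeof(double) * CHAR_BIT; }
```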

² These are called plurale tantum (nouns with no singular form) and singulare tantum (nouns with no
plural form) for you grammarians. Words like “sheep” are unchanging irregular plurals: words whose
singular and plural forms are the same.

Numeric Types

At their most basic, computers are number crunching machines. Thus, the most basic
type of variable that can be used in a computer program is a numeric type. There are
several numeric types that are supported by various programming languages. The
simplest is an integer type which can represent whole numbers 0, 1, 2, etc. and their
negations, −1, −2, . . .. Floating point numeric types represent decimal numbers such
as 0.5, 3.14, 4.0, etc. However, neither integer nor floating point numbers can represent
every possible number since they use a finite number of bits to represent the number. We
will examine this in detail below. For now, let’s understand how a computer represents
both integers and floating point numbers in memory.

As humans, we “think” in base-10 (decimal) because we have 10 fingers and 10 toes.
When we write a number with multiple digits in base-10 we do so using “places” (ones
place, tens place, hundreds place, etc.). Mathematically, a number in base-10 can be
broken down into powers of ten; for example:

$$3{,}201 = 3 \times 10^3 + 2 \times 10^2 + 0 \times 10^1 + 1 \times 10^0$$

In general, any number in base-10 can be written as the summation of powers of 10
multiplied by numbers 0–9,

$$c_k \times 10^k + c_{k-1} \times 10^{k-1} + \cdots + c_1 \times 10^1 + c_0$$

In binary, numbers are represented in the same way, but in base-2 in which we only have
0 and 1 as symbols. To illustrate, let’s consider counting from 0: in base-10, we would
count 0, 1, 2, . . . , 9 at which point we “carry-over” a 1 to the tens spot and start over at
0 in the ones spot, giving us 10, 11, 12, . . . , 19 and repeat the carry-over to 20.

With only two symbols, the carry-over occurs much more frequently: we count 0, 1 and
then carry over and have 10. It is important to understand that this is not “ten”: we are
counting in base-2, so 10 is actually equivalent to 2 in base-10. Continuing, we have 11
and again carry over, but we carry it over twice giving us 100 (just like we’d carry over
twice when going from 99 to 100 in base-10). A full count from 0 to 16 in binary can be
found in Table 2.1. In many programming languages, a prefix of 0b is used to denote a
number represented in binary. We use this convention in the table.

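As a fuller example, consider again the number 3,201. This can be represented in binary
as follows.

$$\begin{aligned}
\texttt{0b110010000001} &= 1 \times 2^{11} + 1 \times 2^{10} + 0 \times 2^{9} + 0 \times 2^{8} \\
&\quad + 1 \times 2^{7} + 0 \times 2^{6} + 0 \times 2^{5} + 0 \times 2^{4} \\
&\quad + 0 \times 2^{3} + 0 \times 2^{2} + 0 \times 2^{1} + 1 \times 2^{0} \\
&= 2^{11} + 2^{10} + 2^{7} + 2^{0} \\
&= 2{,}048 + 1{,}024 + 128 + 1 \\
&= 3{,}201
\end{aligned}$$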
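
The expansion above can also be computed mechanically. The following is a minimal sketch in C, using a hypothetical helper named `to_binary` (not a standard library function), that produces the base-2 digits of a non-negative integer by repeated division by 2.

```c
#include <limits.h>
#include <stddef.h>

/* Writes the base-2 digits of value into out (most significant bit first,
   null-terminated) and returns the number of digits written, or 0 if the
   output buffer is too small. */
size_t to_binary(unsigned int value, char *out, size_t out_size) {
    char reversed[sizeof(unsigned int) * CHAR_BIT];
    size_t count = 0;

    /* Each remainder after dividing by 2 is the next bit, produced from
       least significant to most significant. */
    do {
        reversed[count++] = (char)('0' + value % 2);
        value /= 2;
    } while (value > 0);

    if (count + 1 > out_size) {
        return 0;
    }

    /* Reverse the digits so the most significant bit comes first. */
    for (size_t i = 0; i < count; i++) {
        out[i] = reversed[count - 1 - i];
    }
    out[count] = '\0';
    return count;
}
```

Given the value 3201 and a sufficiently large buffer, `to_binary` writes the twelve digits `110010000001`, matching the expansion above.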

Base-10   Binary
0         0b0
1         0b1
2         0b10
3         0b11
4         0b100
5         0b101
6         0b110
7         0b111
8         0b1000
9         0b1001
10        0b1010
11        0b1011
12        0b1100
13        0b1101
14        0b1110
15        0b1111
16        0b10000

Table 2.1.: Counting in Binary

Representing negative numbers is a bit more complicated and is usually done using a scheme
called two’s complement. We omit the details, but essentially the first bit in the
representation serves as a sign bit: zero indicates positive, while 1 indicates negative.
Negative values are represented as a complement with respect to $2^n$ (a complement is where
0s and 1s are “flipped” to 1s and 0s; in two’s complement, 1 is then added to the result).

When represented using two’s complement, binary numbers with n bits can represent numbers
x in the range

$$-2^{n-1} \le x \le 2^{n-1} - 1$$

Note that the upper bound follows from the fact that

$$\texttt{0b}\underbrace{011\ldots11}_{n \text{ bits}} = \sum_{i=0}^{n-2} 2^i = 2^{n-1} - 1$$

The −1 captures the idea that we start at zero. The exponent in the upper bound is n − 1
since we need one bit to represent the sign. The lower bound represents the idea that we
have $2^{n-1}$ possible values (n − 1 since we need one bit for the sign bit) and we don’t
need to start at zero, we can start at −1. Table 2.2 contains ranges for common integer
types using various number of bits.

n (number of bits)   minimum                       maximum
8                    -128                          127
16                   -32,768                       32,767
32                   -2,147,483,648                2,147,483,647
64                   -9,223,372,036,854,775,808    9,223,372,036,854,775,807
128                  ≈ -1.7014 × 10^38             ≈ 1.7014 × 10^38

Table 2.2.: Ranges for various signed integer types
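
The bounds in Table 2.2 can be checked against the limits that C defines in `<stdint.h>`. The helpers `signed_min` and `signed_max` below are illustrative names; this sketch assumes 1 ≤ n ≤ 63 so that the shifts stay within a 64-bit integer.

```c
#include <stdint.h>

/* Smallest value representable with n bits in two's complement: -2^(n-1).
   Assumes 1 <= n <= 63. */
int64_t signed_min(unsigned n) {
    return -(INT64_C(1) << (n - 1));
}

/* Largest value representable with n bits in two's complement: 2^(n-1) - 1.
   Assumes 1 <= n <= 63. */
int64_t signed_max(unsigned n) {
    return (INT64_C(1) << (n - 1)) - 1;
}
```

For n = 8, 16, and 32 the results agree with the standard constants `INT8_MIN`/`INT8_MAX`, `INT16_MIN`/`INT16_MAX`, and `INT32_MIN`/`INT32_MAX`.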


Some programming languages allow you to define variables that are unsigned in which
the sign bit is not used to indicate positive/negative. With the extra bit we can represent
numbers twice as big; using n bits we can represent numbers x in the range

$$0 \le x \le 2^n - 1$$
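
The unsigned upper bound can be computed the same way. In this sketch `unsigned_max` is a hypothetical helper, valid for 1 ≤ n ≤ 63.

```c
#include <stdint.h>

/* Largest value representable with n unsigned bits: 2^n - 1.
   Assumes 1 <= n <= 63 so the shift stays within 64 bits. */
uint64_t unsigned_max(unsigned n) {
    return (UINT64_C(1) << n) - 1;
}
```

For n = 8 this gives 255, which is one more than twice the signed maximum of 127.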

Floating point numbers in binary are represented in a manner similar to scientific notation.
Recall that in scientific notation, a number is normalized by multiplying it by some power
of 10 so that its most significant digit (between 1 and 9) is the only digit to the left
of the decimal point. The resulting normalized
number is called the significand while the power of ten that the number was scaled by is
called the exponent (and since we are in base-10, 10 is the base). In general, a number in
scientific notation is represented as:

$$\text{significand} \times \text{base}^{\text{exponent}}$$

For example,

$$14326.123 = \underbrace{1.4326123}_{\text{significand}} \times \underbrace{10}_{\text{base}}{}^{\overbrace{4}^{\text{exponent}}}$$

Sometimes the notations 1.4326123e+4, 1.4326123e4 or 1.4326123E4 are used. As before,
we can see that a fractional number in base-10 can be seen as a summation of powers of
10:

$$\begin{aligned}
1.4326123 &= 1 \times 10^{0} + 4 \times 10^{-1} + 3 \times 10^{-2} + 2 \times 10^{-3} \\
&\quad + 6 \times 10^{-4} + 1 \times 10^{-5} + 2 \times 10^{-6} + 3 \times 10^{-7}
\end{aligned}$$

In binary, floating point numbers are represented in a similar way, but the base is 2,
consequently a fractional number in binary is a summation of powers of 2. For example,

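$$\begin{aligned}
110.011 &= 1 \times 2^{2} + 1 \times 2^{1} + 0 \times 2^{0} + 0 \times 2^{-1} + 1 \times 2^{-2} + 1 \times 2^{-3} \\
&= 1 \times 4 + 1 \times 2 + 0 \times 1 + 0 \times \tfrac{1}{2} + 1 \times \tfrac{1}{4} + 1 \times \tfrac{1}{8} \\
&= 4 + 2 + 0 + 0 + \tfrac{1}{4} + \tfrac{1}{8} \\
&= 6.375
\end{aligned}$$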
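
The same summation of powers of 2 can be written as a short C function. This is a sketch with a hypothetical name, `parse_binary`, and it assumes the input contains only the characters `0`, `1`, and at most one radix point.

```c
#include <stddef.h>

/* Evaluates a string of binary digits with an optional radix point,
   such as "110.011", as a sum of powers of 2. */
double parse_binary(const char *digits) {
    double value = 0.0;
    double place = 0.5;   /* weight of the first fractional digit, 2^-1 */
    int seen_point = 0;

    for (size_t i = 0; digits[i] != '\0'; i++) {
        if (digits[i] == '.') {
            seen_point = 1;
        } else if (!seen_point) {
            /* Integer part: shift existing digits one place left,
               then add the new bit. */
            value = value * 2.0 + (digits[i] - '0');
        } else {
            /* Fractional part: add bit * 2^-k, then halve the weight. */
            value += (digits[i] - '0') * place;
            place /= 2.0;
        }
    }
    return value;
}
```

Calling `parse_binary("110.011")` returns 6.375, the value derived above.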

In binary, the significand is often referred to as a mantissa. We also normalize a binary
floating point number so that the mantissa is between $\frac{1}{2}$ and 1. This is where the term
floating point comes from: the decimal point (more generally called a radix point) “floats”
left and right to ensure that the number is always normalized. The example above would
be normalized to
$$0.110011 \times 2^3$$
Here, 0.110011 is the mantissa and 3 is the exponent (which in binary would be 0b11).
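
Normalization can be sketched as repeated halving or doubling until the mantissa falls in the range from 1/2 (inclusive) to 1 (exclusive). The function name `normalize` is illustrative and the sketch assumes a positive input; the standard C library function `frexp` in `<math.h>` performs the same decomposition.

```c
/* Splits a positive value into a mantissa in [1/2, 1) and an exponent
   such that value == mantissa * 2^exponent. */
double normalize(double value, int *exponent) {
    int e = 0;

    /* Too large: move the radix point left (divide by 2). */
    while (value >= 1.0) {
        value /= 2.0;
        e++;
    }
    /* Too small: move the radix point right (multiply by 2). */
    while (value < 0.5) {
        value *= 2.0;
        e--;
    }

    *exponent = e;
    return value;
}
```

For 6.375 this yields a mantissa of 0.796875 (which is 0.110011 in binary) and an exponent of 3.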

Name        Bits   Exponent   Mantissa   Significant Digits   Approximate
                   Bits       Bits       of Precision         Range
Half        16     5          10         ≈ 3.3                10^3 ∼ 10^4.5
Single      32     8          23         ≈ 7.2                10^-38 ∼ 10^38
Double      64     11         52         ≈ 15.9               10^-308 ∼ 10^308
Quadruple   128    15         112        ≈ 34.0               10^-4931 ∼ 10^4931

Table 2.3.: Summary of Floating-point Precisions in the IEEE 754 Standard. Half and
quadruple are not widely adopted.

Most modern programming languages implement floating point numbers according to the
Institute of Electrical and Electronics Engineers (IEEE) 754 Standard [20] (also called
the International Electrotechnical Commission (IEC) 60559 [19]). When represented
in binary, a fixed number of bits must be used to represent the sign, mantissa and
exponent. The standard defines several precisions that each use a fixed number of bits
with a resulting number of significant digits (base-10) of precision. Table 2.3 contains a
summary of a few of the most commonly implemented precisions.

Just as with integers, the finite precision of floating point numbers results in several
limitations. First, irrational numbers such as π = 3.14159 . . . can only be approximated out
to a certain number of digits. For example, with single precision π ≈ 3.1415927, which is
accurate only to the 6th decimal place, and with double precision π ≈ 3.1415926535897931,
which is accurate to only 15 decimal places.³ In fact, regardless of how many bits we allow in
our representation, an irrational number like π (whose decimal expansion never repeats and
never terminates) will only ever be an approximation. Representing a number like π exactly
would require infinite precision, but computers are only finite machines.

Even rational numbers such as $\frac{1}{3} = 0.333\ldots$, which can be written exactly as a
fraction, are not represented exactly when using floating point numbers. In double precision binary,

$$\frac{1}{3} = \texttt{0b1.0101010101010101010101010101010101010101010101010101} \times 2^{-2}$$

which when represented in scientific notation in decimal is

$$3.3333333333333331 \times 10^{-1}$$

That is, there are only 16 digits of precision, after which the remaining (infinite) sequence
of 3s gets cut off.
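
These limits are easy to observe by printing the stored values. The sketch below uses two hypothetical helpers that format π as a single-precision `float` and 1/3 as a `double`; it assumes the platform implements `float` and `double` as IEEE 754 single and double precision, which is typical but not required by C.

```c
#include <stdio.h>

/* Formats pi, stored in single precision, with 7 decimal places. */
int format_float_pi(char *out, size_t out_size) {
    float pi = 3.14159265358979323846f;
    return snprintf(out, out_size, "%.7f", pi);
}

/* Formats 1/3, stored in double precision, with the requested number
   of decimal places. */
int format_third(char *out, size_t out_size, int decimals) {
    return snprintf(out, out_size, "%.*f", decimals, 1.0 / 3.0);
}
```

With 16 decimal places, `format_third` produces only 3s; asking for more places exposes digits that are no longer 3, which is where the stored value departs from the true value of 1/3.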

³ The first 80 digits of π are
3.14159265358979323846264338327950288419716939937510582097494459230781640628620899
though only 39 digits of π are required to accurately calculate the volume of the known universe to
within one atom.

Programming languages usually only support the common single and double precisions
defined by the IEEE 754 standard as those are commonly supported by hardware.
However, there are languages that support arbitrary precision (also called multiprecision)
numbers and yet other languages that have many libraries to support “big number”
arithmetic. Arbitrary precision is still not infinite: instead, as more digits are needed,
more memory is allocated. If you want to compute 10 more digits of π, you can, but at a
cost: the additional digits require more memory, and each arithmetic operation is carried
out in software as a sequence of many hardware operations, which can be much slower than
performing fixed-precision arithmetic directly in hardware. Still, there are many
applications where such accuracy or large numbers are absolutely essential.

Characters & Strings

Another type of data is textual data which can either be single characters or a sequence
of characters which are called strings. Strings are sometimes used for human readable
data such as messages or output, but may also model general data. For example, DNA
is usually encoded using strings consisting of the characters C, G, A, T (corresponding
to the nucleobases cytosine, guanine, adenine, and thymine). Numerical characters and
punctuation can also be used in strings in which case they do not represent numbers,
but instead may represent textual versions of numerical data.

Different programming languages implement characters and strings in different ways (or
may even treat them the same). Some languages implement strings by defining arrays
of characters. Other languages may treat strings as dynamic data types. However, all
languages use some form of character encoding to represent strings. Recall that computers
only speak in binary: 0s and 1s. To represent a character like the capital letter “A”, the
binary sequence 0b1000001 is used. In fact, the most common alphanumeric characters
are encoded according to the American Standard Code for Information Interchange
(ASCII) text standard. The basic ASCII text standard assigns characters to the decimal
values 0–127 using 7 bits to encode each character as a number. Table 2.4 contains a
complete listing of the standard ASCII character set.

The ASCII table was designed to enforce a lexicographic ordering: letters are in alphabetic
order, uppercase precede lowercase versions, and numbers precede both. This design
allows for an easy and natural comparison among strings, “alpha” would come before
“beta” because they differ in the first letter. The characters have numerical values 97
and 98 respectively; since 97 < 98, the order follows. Likewise, “Alpha” would come
before “alpha” (since 65 < 97), and “alpha” would come before “alphanumeric”: the
sixth character is empty in the first string (usually treated as the null character with
value 0) while it is “n” in the second (value of 110). This is the ordering that we would
expect in a dictionary.
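
In C, the standard library function `strcmp` compares strings character by character using their numeric codes, so on an ASCII-based system it reproduces the ordering described above. The wrapper `comes_before` is a hypothetical convenience function.

```c
#include <string.h>

/* Returns 1 if a is ordered strictly before b when the strings are
   compared character by character using their numeric codes. */
int comes_before(const char *a, const char *b) {
    return strcmp(a, b) < 0;
}
```

With this helper, `comes_before("alpha", "beta")`, `comes_before("Alpha", "alpha")`, and `comes_before("alpha", "alphanumeric")` are all true.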

Binary Dec Character    Binary Dec Character    Binary Dec Character

0b000 0000 0 \0 Null character    0b010 1011 43 +    0b101 0110 86 V
0b000 0001 1 Start of Header    0b010 1100 44 ,    0b101 0111 87 W
0b000 0010 2 Start of Text    0b010 1101 45 -    0b101 1000 88 X
0b000 0011 3 End of Text    0b010 1110 46 .    0b101 1001 89 Y
0b000 0100 4 End of Transmission    0b010 1111 47 /    0b101 1010 90 Z
0b000 0101 5 Enquiry    0b011 0000 48 0    0b101 1011 91 [
0b000 0110 6 Acknowledgment    0b011 0001 49 1    0b101 1100 92 \
0b000 0111 7 \a Bell    0b011 0010 50 2    0b101 1101 93 ]
0b000 1000 8 \b Backspace    0b011 0011 51 3    0b101 1110 94 ^
0b000 1001 9 \t Horizontal Tab    0b011 0100 52 4    0b101 1111 95 _
0b000 1010 10 \n Line feed    0b011 0101 53 5    0b110 0000 96 `
0b000 1011 11 \v Vertical Tab    0b011 0110 54 6    0b110 0001 97 a
0b000 1100 12 \f Form feed    0b011 0111 55 7    0b110 0010 98 b
0b000 1101 13 \r Carriage return    0b011 1000 56 8    0b110 0011 99 c
0b000 1110 14 Shift Out    0b011 1001 57 9    0b110 0100 100 d
0b000 1111 15 Shift In    0b011 1010 58 :    0b110 0101 101 e
0b001 0000 16 Data Link Escape    0b011 1011 59 ;    0b110 0110 102 f
0b001 0001 17 Device Control 1    0b011 1100 60 <    0b110 0111 103 g
0b001 0010 18 Device Control 2    0b011 1101 61 =    0b110 1000 104 h
0b001 0011 19 Device Control 3    0b011 1110 62 >    0b110 1001 105 i
0b001 0100 20 Device Control 4    0b011 1111 63 ?    0b110 1010 106 j
0b001 0101 21 Negative Ack    0b100 0000 64 @    0b110 1011 107 k
0b001 0110 22 Synchronous idle    0b100 0001 65 A    0b110 1100 108 l
0b001 0111 23 End of Trans. Block    0b100 0010 66 B    0b110 1101 109 m
0b001 1000 24 Cancel    0b100 0011 67 C    0b110 1110 110 n
0b001 1001 25 End of Medium    0b100 0100 68 D    0b110 1111 111 o
0b001 1010 26 Substitute    0b100 0101 69 E    0b111 0000 112 p
0b001 1011 27 Escape    0b100 0110 70 F    0b111 0001 113 q
0b001 1100 28 File Separator    0b100 0111 71 G    0b111 0010 114 r
0b001 1101 29 Group Separator    0b100 1000 72 H    0b111 0011 115 s
0b001 1110 30 Record Separator    0b100 1001 73 I    0b111 0100 116 t
0b001 1111 31 Unit Separator    0b100 1010 74 J    0b111 0101 117 u
0b010 0000 32 (space)    0b100 1011 75 K    0b111 0110 118 v
0b010 0001 33 !    0b100 1100 76 L    0b111 0111 119 w
0b010 0010 34 "    0b100 1101 77 M    0b111 1000 120 x
0b010 0011 35 #    0b100 1110 78 N    0b111 1001 121 y
0b010 0100 36 $    0b100 1111 79 O    0b111 1010 122 z
0b010 0101 37 %    0b101 0000 80 P    0b111 1011 123 {
0b010 0110 38 &    0b101 0001 81 Q    0b111 1100 124 |
0b010 0111 39 '    0b101 0010 82 R    0b111 1101 125 }
0b010 1000 40 (    0b101 0011 83 S    0b111 1110 126 ~
0b010 1001 41 )    0b101 0100 84 T    0b111 1111 127 Delete
0b010 1010 42 *    0b101 0101 85 U

Table 2.4.: ASCII Character Table. The first and second column indicate the binary and
decimal representation respectively. The third column visualizes the resulting
character when possible. Characters 0–31 and 127 are control characters that
are not printable or print whitespace. The encoding is designed to impose a
lexicographic ordering: A–Z are in order, uppercase letters precede lowercase
letters, numbers precede letters and are also in order.

There are several other nice design features built into the ASCII table. For example, to
convert between uppercase and lowercase versions, you only need to “flip” the second
bit from the left of the 7-bit code, the bit with value 32 (0 for uppercase, 1 for
lowercase). There are also several special characters that
need to be escaped to be defined. For example, though your keyboard has a tab and an
enter key, if you wanted to code those characters, you would need to specify them in
some way other than using those keys (since typing those keys will affect what you are
typing rather than specifying a character). The standard way to escape characters is to
use a backslash along with another, single character. The three most common are the
(horizontal) tab, \t, the endline character, \n, and the null terminating character, \0.
The tab and endline characters specify their respective whitespace characters.
The null character is used in some languages to denote the end of a string and is not
printable.
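
The case-conversion design noted above can be demonstrated directly. In the 7-bit ASCII code the uppercase and lowercase forms of a letter differ only in the bit with value 32 (0x20). The function `toggle_case` is a hypothetical helper and assumes an ASCII-based character set; the C standard library also offers `toupper` and `tolower` in `<ctype.h>`.

```c
/* Switches an ASCII letter between uppercase and lowercase by flipping
   the bit with value 32 (0x20). Non-letters are returned unchanged. */
char toggle_case(char c) {
    int is_upper = (c >= 'A' && c <= 'Z');
    int is_lower = (c >= 'a' && c <= 'z');
    if (is_upper || is_lower) {
        return (char)(c ^ 0x20);
    }
    return c;
}
```

For example, `toggle_case('A')` gives `'a'` (65 becomes 97) and `toggle_case('z')` gives `'Z'` (122 becomes 90).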

ASCII is quite old, originally developed in the early sixties. President Johnson first
mandated that all computers purchased by the federal government support ASCII in 1968.
However, it is quite limited with only 128 possible characters. Since then, additional
extensions have been developed. The Extended ASCII character set adds support for
128 additional characters (numbered 128 through 255) by adding 1 more bit (8 total).
Included in the extension is support for common international characters, including
letters with diacritics, such as ü, ñ and £ (which are characters 129, 164, and 156 respectively).

Even 256 possible characters are not enough to represent the wide array of international
characters when you consider languages like Chinese, Japanese, and Korean (CJK). Unicode
was developed to solve this problem by establishing a standard encoding that supports
1,112,064 possible characters, though only a fraction of these are actually currently
assigned.⁴ Unicode is backward compatible, so it works with plain ASCII characters. In
fact, the most common encoding for Unicode, UTF-8, uses a variable number of bytes to
encode characters: 1-byte encodings correspond to plain ASCII, and there are also 2-, 3-, and
4-byte encodings.
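
In UTF-8, the first byte of each encoded character indicates how many bytes the encoding uses. The following sketch, with a hypothetical function name, inspects the high-order bits of that first byte. It only classifies the byte pattern and does not fully validate UTF-8 input.

```c
#include <stddef.h>

/* Returns the number of bytes in the UTF-8 encoding that starts with the
   given lead byte, or 0 if the byte does not match a lead-byte pattern
   (for example, a continuation byte of the form 10xxxxxx). */
size_t utf8_sequence_length(unsigned char lead) {
    if (lead < 0x80) return 1;             /* 0xxxxxxx: plain ASCII */
    if ((lead & 0xE0) == 0xC0) return 2;   /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;   /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0) return 4;   /* 11110xxx */
    return 0;
}
```

For example, the letter A (byte 0x41) is a 1-byte encoding, while ü is encoded in UTF-8 as the two bytes 0xC3 0xBC, so its lead byte 0xC3 reports a length of 2.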

In most programming languages, string literals are defined by using either single or
double quotes to indicate where the string begins and ends. For example, one may be
able to define the string "Hello World" . The double quotes are not part of the string,
but instead specify where the string begins and ends. Some languages allow you to use
either single or double quotes. PHP for example would allow you to also define the same
string as 'Hello World' . Yet other languages, such as C distinguish the usage of single
and double quotes: single quotes are for single characters such as 'A' or '\n' while
double quotes are used for full strings such as "Hello World" .

⁴ As of 2012, 110,182 are assigned to characters, 137,468 are reserved for private use (they are valid
characters, but not defined so that organizations can use them for their own purposes), with 2,048
surrogates and 66 non-character control codes. 864,348 are left unassigned meaning that we are
well-prepared for encoding alien languages when they finally get here.

In any case, if you want a single or double quote to appear in your string you need to
escape it similar to how the tab and endline characters are escaped. For example, in C
'\'' would refer to the single quote character and "Dwayne \"The Rock\" Johnson"
would allow you to use double quotes within a string. In our pseudocode we’ll use the
stylized double quotes, “Hello World” in any strings that we define. We will examine
string types more fully in Chapter 8.
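
The quoting and escaping rules for C described above can be collected into a short sketch. The constant names and the helper `count_char` are illustrative only.

```c
#include <stddef.h>

/* Single quotes delimit one character; the quote itself must be escaped. */
const char QUOTE_CHAR = '\'';

/* Double quotes delimit a string; embedded double quotes are escaped. */
const char *const NICKNAME = "Dwayne \"The Rock\" Johnson";

/* \t and \n each stand for a single character (tab and endline). */
const char *const TABLE_ROW = "name\tscore\n";

/* Counts how many times target appears in text. */
size_t count_char(const char *text, char target) {
    size_t count = 0;
    for (size_t i = 0; text[i] != '\0'; i++) {
        if (text[i] == target) {
            count++;
        }
    }
    return count;
}
```

Here `count_char(NICKNAME, '"')` is 2, since the escaped quotes are ordinary characters in the string, and `TABLE_ROW` is 11 characters long because each escape sequence is stored as one character.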

Boolean Types

A Boolean is another type of variable that is used to hold a truth value, either true or
false, of a logical statement. Some programming languages explicitly support a built-in
Boolean type while others implicitly support them. For languages that have explicit
Boolean types, typically the keywords true and false are used, but logical expressions
such as x ≤ 10 can also be evaluated and assigned to Boolean variables.

Some languages do not have an explicit Boolean type and instead support Booleans
implicitly, sometimes by using numeric types. For example, in C, false is associated with
zero while any non-zero value is associated with true. In either case, Boolean values are
used to make decisions and control the flow of operations in a program (see Chapter 3).
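
Both styles can be seen in C, which treats integers as truth values and, since the C99 standard, also provides `bool`, `true`, and `false` through `<stdbool.h>`. The function names below are illustrative.

```c
#include <stdbool.h>

/* A logical expression evaluated and returned as a Boolean value. */
bool is_at_most_ten(int x) {
    return x <= 10;
}

/* In C, zero is treated as false and any non-zero value as true. */
bool is_truthy(int value) {
    if (value) {
        return true;
    }
    return false;
}
```

For example, `is_at_most_ten(10)` is true, `is_truthy(-5)` is true, and `is_truthy(0)` is false.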

Object & Reference Types

Not everything is a number or string. Often, we wish to model real-world entities such
as people, locations, accounts, or even interactions such as exchanges or transactions.
Most programming languages allow you to create user-defined types by using objects or
structures. Objects and structures allow you to group multiple pieces of data together
into one logical entity; this is known as encapsulation. For example, a Student object may
consist of a first-name, last-name, GPA, year, major, etc. Grouping these separate pieces
of data together allows us to define a more complex type. We explore these concepts in
more depth in Chapter 10.
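
In C, such a grouping can be sketched with a structure. The field names follow the example in the text; the helper `is_honors` and its 3.5 GPA cutoff are made up purely for illustration.

```c
/* Groups several pieces of data about one student into a single type. */
typedef struct {
    const char *firstName;
    const char *lastName;
    double gpa;
    int year;
    const char *major;
} Student;

/* Hypothetical rule: a student with a GPA of at least 3.5 is "honors". */
int is_honors(const Student *student) {
    return student->gpa >= 3.5;
}
```

A `Student` value can then be passed around as one unit instead of five separate variables.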

In contrast to the built-in numeric, character/string, and Boolean types (also called
primitive data types) user-defined types do not necessarily take a fixed amount of memory
to represent. Since they are user-defined, it is up to the programmer to specify how
they get created and how they are represented in memory. A variable that refers to an
object or structure is usually a reference or pointer: a reference to where the object is
stored in memory on a computer. Many programming languages use the keyword null
(or sometimes NULL or some variation) to indicate an invalid reference. The null
keyword is often used to refer to uninitialized or “missing” data.
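
In C, the invalid reference is written `NULL`, and code that receives a pointer typically checks for it before using the pointer. The `Account` type, the helper `account_name`, and the placeholder text below are hypothetical.

```c
#include <stddef.h>

typedef struct {
    const char *name;
} Account;

/* Returns the account's name, or a placeholder string when the pointer
   does not refer to any account. */
const char *account_name(const Account *account) {
    if (account == NULL) {
        return "(missing)";
    }
    return account->name;
}
```

Passing `NULL` returns the placeholder rather than attempting to read memory through an invalid reference.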

Another common user-defined type is an enumerated type which allows a user to define a
list of keywords associated with integers. For example, the cardinal directions, “north”,
“south”, “east”, and “west” could be associated with the integers 0, 1, 2, 3 respectively.
Defining an enumerated type then allows you to use these keywords in your program
directly without having to rely on mysterious numerical values, making a program more
readable and less prone to error.
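
The cardinal directions example can be written as a C enumerated type. By default C numbers the keywords starting from 0 in the order listed; the helper `opposite` is a hypothetical function added to show the keywords in use.

```c
/* NORTH = 0, SOUTH = 1, EAST = 2, WEST = 3 */
typedef enum { NORTH, SOUTH, EAST, WEST } Direction;

/* Returns the direction facing the other way. */
Direction opposite(Direction d) {
    switch (d) {
        case NORTH: return SOUTH;
        case SOUTH: return NORTH;
        case EAST:  return WEST;
        case WEST:  return EAST;
    }
    return d;
}
```

Writing `opposite(NORTH)` is easier to read and harder to get wrong than passing a bare 0.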


2.2.3. Declaring Variables: Dynamic vs. Static Typing

In some languages, variables must be declared before they can be referred to or used.
When you declare a variable, you not only give it an identifier, but also define its type.
For example, you can declare a variable named numberOfStudents and define it to be
an integer. For the life of that variable, it will always be an integer type. You can only
give that variable integer values. Attempts to assign, say, a string type to an integer
variable may result in a compile-time error, a runtime error when the program is
executed, or unexpected or undefined behavior. A language that requires you to
declare a variable and its type is a statically typed language.

The declaration of a variable is typically achieved by writing a statement that includes
the variable’s type (using a built-in keyword of the language) along with the variable
name. For example, in C-style languages, a line like

int x;

would create an integer variable associated with the identifier x .

In other languages, typically interpreted languages, you do not have to declare a variable
before using it. Such languages are generally referred to as dynamically typed languages.
Instead of declaring a variable to have a particular type, the type of a variable is
determined by the type of value that is assigned to it. If you assign an integer to a
variable it becomes an integer. If you assign a string to it, it becomes a string type.
Moreover, a variable’s type can change during the execution of a program. If you reassign
a value to a variable, it dynamically changes its type to match the type of the value
assigned.

In PHP for example, a line like

$x = 10;

would create an integer variable associated with the identifier $x. In this example, we
did not declare that $x was an integer. Instead, it was inferred from the value that we
assigned to it (10).

At first glance it may seem that dynamically typed languages are better. Certainly
they are more flexible (and allow you to write less so-called “boilerplate” code), but
that flexibility comes at a cost. Dynamically typed variables are generally less efficient.
Moreover, dynamic typing opens the door to a lot of potential type mismatching errors.
For example, you may have a variable that is assumed to always be an integer. In a
dynamically typed language, no such assumption is valid as a reassignment can change
the variable’s type. The language itself cannot enforce this assumption, so you may
need a lot of extra code to check a variable’s type and deal with “type safety”
issues. The advantages and disadvantages of each continue to be debated.


    {
      int a;
      {
        //this is a new code block inside the outer block
        int b;
        //at this point in the code, both a and b are in-scope
      }
      //at this point, only a is in-scope, b is out-of-scope
    }

Code Sample 2.1.: Example of variable scoping in C

2.2.4. Scoping

The scope of a variable is the section of code in which a variable is valid or “known.”
In a statically typed language, a variable must be declared before it can be used. The
code block in which the variable is declared is therefore its scope. Outside of this code
block, the variable is invalid. Attempts to reference or use a variable that is out-of-scope
typically result in a syntax error. An example using the C programming language is
depicted in Code Sample 2.1.

Scoping in a dynamically typed language is similar, but since you don’t declare a variable,
the scope is usually defined by the block of code where you first use or reference the
variable. In some languages using a variable may cause that variable to become globally
scoped.

A globally scoped variable is valid throughout the entirety of a program. A global variable
can be accessed and referenced on every line of code. Sometimes this is a good thing:
for example, we could define a variable to represent π and then use it anywhere in our
program. We would then be assured that every computation involving π would be using
the same definition of π (rather than one line of coding using the approximation 3.14
while another uses 3.14159).

By the same token, however, global variables make the state and execution of a program
less predictable: if any piece of code can access a global variable, then potentially any
piece of code could change that variable. Imagine some questionable code changing the
value of our global π variable to 3. For this reason, using global variables is generally
considered bad practice.⁵ Even if no code performs such an egregious operation, the
fact that anything can change the value means that when testing, you must test for
the potential that anything will change the value, greatly increasing the complexity
of software testing. To capture the advantages of a global variable while avoiding the
disadvantages, it is common to only allow global constants: variables whose values cannot
be changed once set.

⁵ Coders often say “globals are evil” and indeed have often demonstrated that they have low moral
standards. Global variables that is. Coders are always above reproach.

Another argument against globally scoped variables is that once the identifier has been
used, it cannot be reused or redefined for other purposes (a floating point variable with
the identifier pi means we cannot use the identifier pi for any other purpose) as
it would lead to conflicts. Defining many globally scoped variables (or functions, or
other elements) starts to pollute the namespace by reserving more and more identifiers.
Problems arise when one attempts to use multiple libraries that have both used the same
identifiers for different variables or functions. Resolving the conflict can be difficult or
impossible if you have no control over the offending libraries.

2.3. Operators

Now that we have variables, we need a way to work with variables. That is, given two
variables we may wish to add them together. Or we may wish to take two strings and
combine them to form a new string. In programming languages this is accomplished
through operators which operate on one or more operands. An operator takes the values
of its operands and combines them in some way to produce a new value. If an operator
is applied to variable(s), then the values used in the operation are the values stored in
the variable at the time that the operator is evaluated.

Many common operators are binary in that they operate on two operands such as common
arithmetic operations like addition and multiplication. Some operators are unary in that
they only operate on one operand. The first operator that we look at is a binary operator
that allows us to assign values to variables.

2.3.1. Assignment Operators

The assignment operator is a binary operator that allows you to take a value and assign
it to a variable. The assignment operator usually takes the following form: the value is
placed on the right-hand-side of the operator while the variable to which we are assigning
the value is placed on the left-hand-side of the operator. For our pseudocode, we’ll use a
generic “left-arrow” notation:
a ← 10
which should be read as “place the value 10 into the variable a.” Many C-style
programming languages commonly use a single equal sign for the assignment operator. The
example above might be written as


a = 10;

It is important to realize that when this notation is used, it is not an algebraic statement
like a = b, which asserts that the variables a and b are equal. An
assignment operator is different: it means place the value on the right-hand-side into the
variable on the left-hand-side. For that reason, writing something like

10 = a;

is invalid syntax. The left-hand-side must be a variable.

The right-hand-side, however, may be a literal, another variable, or even a more complex
expression. In the example before,
a ← 10
the value 10 was acting as a numerical literal: a way of expressing a (human-readable)
value that the computer can then interpret as a binary value. In code, we can conveniently
write numbers in base-10; when compiled or interpreted, the numerical literals are
converted into binary data that the computer understands and placed in a memory
location corresponding to the variable. This entire process is automatic and transparent
to the user. Literals can also be strings or other values. For example:

message ← “hello world”

We can also “copy” values from one variable to another. Assuming that we’ve assigned
the value 10 to the variable a, we can then copy it to another variable b:

b ← a

This does not mean that a and b are the same variable. The value that is stored in the
variable a at the time that this statement is executed is copied into the variable b. There
are now two different variables with the same value. If we reassign the value in a, the
value in b is unaffected. This is illustrated in Algorithm 2.1.

a ← 10
b ← a
//a and b both store the value 10 at this point
a ← 20
//now a has the value 20, but b still has the value 10
b ← 25
//a still stores a value of 20, b now has a value of 25

Algorithm 2.1: Assignment Operator Demonstration

The right-hand-side can also be a more complex expression, for example the result of
summing two numbers together.


2.3.2. Numerical Operators

Numerical operators allow you to create complex expressions involving numerical
literals, numerical variables, or both. For most numerical operators, it doesn’t matter if
the operands are integers or floating point numbers. For example, integers can be added
to floating point numbers without much additional code.

The most basic numerical operator is the unary negation operator. It allows you to
negate a numerical literal or variable. For example,

a ← −10

or
a ← −b
The usage of a negation is so common that it is often not perceived to be an operator
but it is.

Addition & Subtraction

You can also add (sum) two numbers using the + (plus) operator and subtract using the
− (minus) operator in a straightforward way. Note that most languages distinguish
the minus operator from the negation operator by how it is used, just as in a mathematical
expression. If applied to one operand, it is interpreted as a negation operator. If applied
to two operands, it represents subtraction. Some examples can be found in Algorithm
2.2.

a ← 10
b ← 20
c ← a + b
d ← a − b
//c has the value 30 while d has the value −10
c ← a + 10
d ← −d
//c now has the value 20 and d now has the value 10

Algorithm 2.2: Addition and Subtraction Demonstration

Multiplication & Division

You can also multiply and divide literals and variables. In mathematical expressions
multiplication is represented as a × b or a · b or simply just ab and division is represented
as a ÷ b or a/b or $\frac{a}{b}$. In our pseudocode, we’ll generally use a · b and $\frac{a}{b}$, but in programming
languages it is difficult to type these symbols. Usually programming languages use *
for multiplication and / for division. Similar examples are provided in Algorithm 2.3.

a ← 10
b ← 20
c ← a · b
d ← a/b
//c has the value 200 while d has the value 0.5

Algorithm 2.3: Multiplication and Division Demonstration

Careful! Some languages specify that the type of the result of an arithmetic operation
must match the type of its operands. That is, an integer plus an integer results in an integer. A
floating point number divided by a floating point number results in a floating point number.
When we mix types, say an integer and a floating point number, the result is generally a
floating point number. For the most part this is straightforward. The one tricky case is
when we have an integer divided by another integer, 3/2 for example.

Since both operands are integers, the result must be an integer. Normally, 3/2 = 1.5,
but since the result must be an integer, the fractional part gets truncated (cut-off) and
only the integral part is kept for the final result. This can lead to weird results such as
1/3 = 0 and 99/100 = 0. The result is not rounded down or up; instead the fractional
part is completely thrown out. Care must be taken when dividing integer variables in a
statically typed language. Type casting can be used to force variables to change their
type for the purposes of certain operations so that the full answer is preserved. For
example, in C we can write

    int a = 10;
    int b = 20;
    double c;
    int d;
    c = (double) a / (double) b;
    d = a / b;
    //the value in c is correctly 0.5 but the value in d is 0


Integer Division

Recall that in arithmetic, when you divide integers a/b, b might not go into a evenly in
which case you get a remainder. For example, 13/5 = 2 with a remainder r = 3. More
generally we have that
a = qb + r
where a is the dividend, b is the divisor, q is the quotient (the result) and r is the
remainder. We can also perform integer division in most programming languages. In
particular, the remainder operator is the operator that gives us the remainder of
the integer division operation a/b. In mathematics this is the modulo operator and is
denoted
a mod b
For example,
13 mod 5 = 3
It is possible that the remainder is zero, for example,

10 mod 5 = 0

Many programming languages support this operation using the percent sign. For example,

c = a % b;

2.3.3. String Concatenation

Strings can also be combined to form new strings. In fact, strings can often be combined
with non-string variables to form new strings. You would typically do this in order to
convert a numerical value to a string representation so that it can be output to the user
or to a file for long-term storage. The operation of combining strings is referred to as
string concatenation. Some languages support this through the same plus operator that
is used with addition. For example,

message ← “hello ” + “world!”

which combines the two strings to form one string containing the characters “hello world!”,
storing the value into the message variable. For our pseudocode we’ll adopt the plus
operator for string concatenation.

The string concatenation operator can also sometimes be combined with non-string
types; numerical types for example. This allows you to easily convert numbers to a
human-readable, base-10 format so that they can be printed to the output. For example
suppose that the variable b contains the value 20, then

message ← “the answer is ” + b


might result in the string “the answer is 20” being stored in the variable message.

Other languages use different symbols to distinguish concatenation and addition. Still
other languages do not directly support an operator for string concatenation, which
must instead be done using a function.

2.3.4. Order of Precedence

In mathematics, when you write an expression such as:

a+b·c

you interpret it as “multiply b and c and then add a.” This is because multiplication has
a higher order of precedence than addition. The order of precedence (sometimes referred
to as order of operations) is a set of rules which define the order in which operations
should be evaluated. In this case, multiplication is performed before addition. If, instead,
we had written
(a + b) · c
we would have a different interpretation: “add a and b and then multiply the result by
c.” That is, the inclusion of parentheses changes the order in which we evaluate the
operations. Adding parentheses can have no effect (if we wrote a + (bc) for example), or
it can cause operations with a lower order of precedence to be evaluated first as in the
example above.

Numerical operators are similar when used in most programming languages. The same
order of precedence is used and parentheses can be used to change the order of evaluation.

2.3.5. Common Numerical Errors

When dealing with numeric types it is important to know and understand their limitations.
In mathematics, the following operations might be considered invalid.

• Division by zero: a/b where b = 0. This is an undefined operation in mathematics
and also in programming languages. Depending on the language, any number of
things may happen. It may be a fatal error or exception; the program may continue
executing but give “garbage” results from then on; the result may be a special value
such as null, “NaN” (not-a-number) or “INF” (a special representation of infinity).
It is best to avoid such an operation entirely using conditional statements and
defensive programming (see Chapter 3).

• Other potentially invalid operations involve common mathematical functions. For
example, $\sqrt{-1}$ would be a complex result, i, which some languages do support.
However, many do not. Similarly, the natural logarithm of zero, ln(0), and of negative
values, such as ln(−1), is undefined. In either case you could expect a result like “NaN” or
“INF.”

• Still other operations seem like they should be valid, but because of how numbers
are represented in binary, the results are invalid. Recall that for a 32-bit signed,
two’s complement number, the maximum representable value is 2,147,483,647.
Suppose this maximum value is stored in a variable, b. Now suppose we attempt
to add one more,
c ← b + 1
Mathematically we’d expect the result to be 2,147,483,648, but that is more than
the maximum representable integer. What happens is something called arithmetic
overflow. The actual number stored in binary in memory for 2,147,483,647 is
$\texttt{0b0}\underbrace{11\ldots11}_{31\text{ 1s}}$
When we add 1 to this, it is carried over all the way to the 32nd bit, giving the
result
$\texttt{0b1}\underbrace{00\ldots00}_{31\text{ 0s}}$
in binary. However, the 32nd bit is the sign bit, so this is a negative number.
In particular, if this is a two’s complement integer, it has the decimal value
−2,147,483,648 which is obviously wrong. Another example would be if we have a
“large” number, say 2 billion, and attempt to double it (multiply by 2). We would
expect 4 billion as a result, but again overflow occurs and the result (using 32-bit
signed two’s complement integers) is −294,967,296.

• A similar phenomenon can happen with floating point numbers. If an operation
(say multiplying two “small” numbers together) results in a number that is smaller
than the smallest floating point number that can be represented, the result is said
to have resulted in underflow. The result can essentially be zero, or an error can
be raised to indicate that underflow has occurred. The consequences of underflow
can be very complex.

• Floating-point operations can also result in a loss of precision even if no overflow
or underflow occurs. For example, when adding a very large number a and a very
small number b, the result might be no different from the value of a. This is because
(for example) double precision floating point numbers only have about 16 significant
digits of precision with the least significant digits being cut off in order to preserve
the magnitude.
As another example, suppose we compute $\sqrt{2}$ = 1.41421356 . . .. If we squared the
result, mathematically we would expect to get 2. However, since we only have a
certain number of digits of precision, squaring the result in a computer may result
in a value slightly different from 2 (such as 1.9999998 or 2.0000001).


2.3.6. Other Operators

Many programming languages support other “convenience” operators that allow you to
perform common operations using less code. These operators are generally syntactic
sugar: they don’t add any functionality. The same operation could be achieved using
other operators. However, they do add simpler or more terse syntax for doing so.

Increment Operators

Adding one to or subtracting one from a variable is a very common operation. So common, that
most programming languages define increment operators such as i++ and i-- which
add one to or subtract one from the variable to which they are applied. The same effect could be achieved
by writing
i ← (i + 1) and i ← (i − 1)
but the increment operators provide a shorthand way of expressing the operation.

The operators i++ and i-- are postfix operators: the operator is written after (post)
the operand. Some languages define similar prefix increment operators, ++i and --i.
The effect is similar: each adds or subtracts one from the variable i. The
difference appears when the operator is used in a larger expression. A postfix operator retains
the original value for the expression, while a prefix operator takes on the new, incremented
value in the expression.

To illustrate, suppose the variable i has the value 10. In the following line of code, i is
incremented and used in an expression that adds 5 and stores the result in a variable x:

x = 5 + (i++);

The value of x after this code is 15 while the value of i is now 11. This is because
the postfix operator increments i, but i++ retains the value 10 in the expression. In
contrast, with the line

x = 5 + (++i);

the variable i (again starting from 10) now has the value 11, but the value of x is 16 since ++i takes
on the new, incremented value of 11. Appropriately using each can lead to some very
concise code, but it is important to remember the difference.

Compound Assignment Operators

If we want to increment or decrement a variable by an amount other than 1 we can do
so using compound assignment operators that combine an arithmetic operator and an
assignment operator into one. For example, a += 10 would add 10 to the variable a.


    int a = 10;
    a += 5; //adds 5 to a
    a -= 3; //subtracts 3 from a
    a *= 2; //multiplies a by 2
    a /= 4; //divides a by 4

    //you can also use compound assignment operators with variables:
    int b = 5;
    a += b; //adds the value stored in b to a
    a -= b; //subtracts the value stored in b from a
    a *= b; //multiplies a by b
    a /= b; //divides a by b

Code Sample 2.2.: Compound Assignment Operators in C

The same could be achieved by coding a = a + 10, but the former is a bit shorter as
we don’t have to repeat the variable.

You can do the same with subtraction, multiplication, and division. More examples
using the C programming language can be found in Code Sample 2.2. It is important to
note that these operators are not, strictly speaking, equivalent. That is, a += 10 is not
equivalent to a = a + 10. They have the same effect, but the first involves only one
operator while the second involves two operators.

2.4. Basic Input/Output

Not all variables can be coded using literals. Sometimes a program needs to read in
values as input from a user who can give different values on different runs of a program.
Likewise, a computer program often needs to produce output to the user to be of any
use.

The most basic types of programs are interactive programs that interact with a human
user. Generally, the program may prompt the user to enter some input value(s) or
make some choices. It may then compute some values and respond to the user with some
output. In the following sections we’ll overview the various types of input and output
(I/O for short) that are available.


2.4.1. Standard Input & Output

The standard input (stdin for short), standard output (stdout) and standard error (stderr)
are three standard communication streams that are defined by most computer systems.

Though perhaps an oversimplification, the keyboard usually serves as a standard input
device while the monitor (or the system console) serves as a standard output device. The
standard error is usually displayed in the same display but may be displayed differently
on some systems (it is typeset in red in some consoles that support color to indicate that
the output is communicating an error).

As a program is executing, it may prompt a user to enter input. A program may wait
(called blocking) until a user has typed whatever input they want to provide. The user
typically hits the enter key to indicate their input is done and the program resumes,
reading the input provided via the standard input. The program may also produce
output which is displayed to the user.

The standard input and output are generally universal: almost any language and
operating system will support them and they are the most basic types of input/output.
However, the type of input and output is somewhat limited (usually limited to text-based
I/O) and doesn’t provide much in the way of input validation. As an example, suppose
that a program prompts a user to enter a number. Since the input device (keyboard)
does not really restrict the user, a more obstinate user may enter a non-numeric value,
say “hello”. The program may crash or provide garbage output with such input.

2.4.2. Graphical User Interfaces

A much more user-oriented way of reading input and displaying output is to use a
Graphical User Interface (GUI). GUIs can be implemented as traditional “thick-client”
applications (programs that are installed locally on your machine) or as “thin-client”
applications such as a web application. They typically support general “widgets” such as
input boxes, buttons, sliders, etc. that allow a user to interact with the program in a
more visual way. They also allow the programmer to do better input validation. Widgets
could be designed so that only good input is allowed by creating modal restrictions: the
user is only allowed to select one of several “radio” buttons for example. GUIs also
support visual feedback cues to the user: popups, color coding, and other elements can
be used to give feedback on errors and indicate invalid selections.

Graphical user interfaces can also make use of more modern input devices: mice, touch
screens with gestures, even gaming devices such as the Kinect allow users to use a full
body motion as an input mechanism. We discuss GUIs in more detail in Chapter 13. To
begin, we’ll focus more on plain textual input and output.


Language    Standard Output        String Output

C           printf()               sprintf()
Java        System.out.printf()    String.format()
PHP         printf()               sprintf()

Table 2.5.: printf()-style Methods in Several Languages. Languages support formatting
directly to the Standard Output as well as to strings that can be further
used or manipulated. Most languages also support printf()-style formatting
to other output mechanisms (streams, files, etc.).

2.4.3. Output Using printf()-style Formatting

Recall that many languages allow you to concatenate a string and a non-string type in
order to produce a string that can then be output to the standard output. However,
concatenation doesn’t provide much in the way of customizability when it comes to
formatting output. We may want to format a floating point number so that it only prints
two decimal places (as with US currency). We may want to align a column of data so
that number places match up. Or we may want to justify text either left or right.

Such data formatting can be achieved through the use of a printf()-style formatting
function. The ideas date back to the mid-60s, but the modern printf() comes from
the C programming language. Numerous programming languages support this style
of formatted output (printf() stands for print formatted). Most support
printing the resulting formatted output to the standard output as well as to strings and
other output mechanisms (files, streams, etc.). Table 2.5 contains a small sampling of
printf()-style functions supported in several languages. We’ll illustrate this usage
using the C programming language for our examples, but the concepts are generally
universal across most languages.

The function works by providing it a number of arguments. The first argument is always
a string that specifies the formatting of the result using several placeholders (flags that
begin with a percent sign) which will be replaced with values stored in variables but
in a formatted manner. Subsequent arguments to the function are the list of variables
to be printed; each argument is delimited by a comma. Figure 2.3 gives an example
of a printf() statement with two placeholders. The placeholders are ultimately
replaced with the values stored in the provided variables a, b. If a, b held the values 10
and 2.718281, the code would end up printing

The value of a = 10, the value of b is 2.718281

Though there are dozens of placeholders that are supported, we will focus only on a few:

• %d formats an integer variable or literal

• %f formats a floating point variable or literal

• %c formats a single character variable or literal

• %s formats a string variable or literal

printf("The value of a = %d, the value of b is %f\n", a, b);

Figure 2.3.: Elements of a printf() statement in C. The first argument is the format
string, which contains the placeholders %d and %f; the remaining arguments, a and b,
form the print list.

Misuse of placeholders may result in garbage output. For example, if you use an integer
placeholder, %d, but provide a string argument, the output will not be correct since
strings cannot be (directly) converted to integers.

In addition to these placeholders, you can also add modifiers. A number n between
the percent sign and character (%nd, %nf, %ns) specifies that the result should be
formatted with a minimum of n columns. If the output takes fewer than n columns,
printf() will pad out the result with spaces so that there are n columns. If the output
takes n or more columns, then the modifier will have no effect (it specifies a minimum
not a maximum).

Floating-point numbers have a second modifier that allows you to specify the number of
digits of precision to be formatted. In particular, you can use the placeholder %n.mf in
which n has the same meaning, but m specifies the number of decimals to be displayed.
By default, 6 decimals of precision are displayed. If m is greater than the precision of the
number, zeros are usually used for subsequent digits; if m is smaller than the precision of
the number, rounding may occur. Note that the n modifier includes the decimal point
as a column. Both modifiers are optional.

Finally, each of these modifiers can be made negative (example: %-20d ) to left-justify
the result. By default, justification is to the right. Several examples are illustrated in
Code Sample 2.3 with the results in Code Sample 2.4.

2.4.4. Command Line Input

Not all programs are interactive. In fact, the vast majority of software is developed to
interact with other software and does not expect that a user is sitting at the console


    int a = 4567;
    double b = 3.14159265359;

    printf("a=%d\n", a);
    printf("a=%2d\n", a);
    printf("a=%4d\n", a);
    printf("a=%8d\n", a);

    //by default, prints 6 decimals of precision
    printf("b=%f\n", b);
    //the .m modifier is optional:
    printf("b=%10f\n", b);
    //the n modifier is also optional:
    printf("b=%.2f\n", b);
    //note that this rounds!
    printf("b=%10.3f\n", b);
    //zeros are added so that 15 decimals are displayed
    printf("b=%20.15f\n", b);

Code Sample 2.3.: printf() examples in C

    a=4567
    a=4567
    a=4567
    a=␣␣␣␣4567
    b=3.141593
    b=␣␣3.141593
    b=3.14
    b=␣␣␣␣␣3.142
    b=␣␣␣3.141592653590000

Code Sample 2.4.: Result of computation in Code Sample 2.3. Spaces are highlighted
with a ␣ for clarity.


constantly providing it with input. Most languages and operating systems support
non-interactive input from the Command Line Interface (CLI). This is input that is
provided at the command line when the program is executed. Input provided from the
command line are usually referred to as command line arguments. For example, if we
invoke a program named myProgram from the command line prompt using something
like the following:

~>./myProgram a 10 3.14

Then we would have provided 4 command line arguments. The first argument is usually
the program’s name; all subsequent arguments are separated by whitespace. Command
line arguments are provided to the program as strings and it is the program’s responsibility
to convert them if needed and to validate them to ensure that the correct expected
number and type of arguments were provided.

Within a program, command line arguments are usually referred to as an argument vector
(sometimes in a variable named argv ) and argument count (sometimes in a variable
named argc ). We explore how each language supports this in subsequent chapters.

2.5. Debugging

Making mistakes in programming is inevitable. Even the most expert of software
developers make mistakes.⁶ Errors in computer programs are usually referred to as
bugs. The term was popularized by Grace Hopper in 1947 while working on a Mark
II Computer at a US Navy research lab. Literally, a moth stuck in the computer was
impeding its operation. Removing the moth or “debugging” the computer fixed it. In
this section we will identify general types of errors and outline ways to address them.

2.5.1. Types of Errors

When programming, there are several types of errors that can occur. Some can be easily
detected (or even easily fixed) by compilers and other modern code analysis tools such
as IDEs.

⁶ A severe security bug in the popular Unix bash shell utility went undiscovered for 25 years before it
was finally fixed in September 2014, missed by thousands of experts and some of the best coders in
the world.


Syntax Errors

Syntax errors are errors in the usage of a programming language itself. A syntax error
can be a failure to adhere to the rules of the language such as misspelling a keyword or
forgetting proper “punctuation” (such as missing an ending semicolon). When you have a
syntax error, you’re essentially not “speaking the same language.” You wouldn’t be very
comprehensible if you started injecting nonsense words or words from a different language
when speaking to someone in English. Similarly, a computer can’t understand what
you’re trying to say (or what directions you’re trying to give it) if you’re not speaking
the same language.

Typically syntax errors prevent you from even compiling a program, though syntax
errors can be a problem at runtime with interpreted languages. When a syntax error is
encountered, a compiler will fail to complete the compilation process and will generally
quit. Ideally, the compiler will give reasons for why it was unable to compile and will
hopefully identify the line number where the syntax error was encountered with a hint on
what was wrong. Unfortunately, many times a compiler’s error message isn’t too helpful
or may indicate a problem on one line where the root cause of the problem is earlier
in the program. One cannot expect too much from a compiler after all. If a compiler
were able to correctly interpret and fix our errors for us, we’d have “natural language”
programming where we could order the computer to execute our commands in plain
English. If we had this science fiction-level of computer interaction we wouldn’t need
programming languages at all.

Fixing syntax errors involves reading and interpreting the compiler error messages,
reexamining the program and fixing any and all issues to conform to the syntax of
the programming language. Fixing one syntax error may enable the compiler to find
additional syntax errors that it had not found before. Only once all syntax errors have
been resolved can a program actually compile. For interpreted languages, the program
may be able to run up to where it encounters a syntax error and then exits with a fatal
error. It may take several test runs to resolve such errors.
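
A small Python sketch can make this concrete. It hands a deliberately broken snippet to the built-in compile function, which raises a SyntaxError without running anything. The helper name check_syntax and the sample source text are hypothetical. The reported line number and message vary between Python versions, which echoes the point above that the reported location may differ from the root cause.

```python
def check_syntax(source):
    """Return a short report on whether the given source text parses."""
    try:
        # compile() only parses and compiles the text; it does not execute it.
        compile(source, "<example>", "exec")
    except SyntaxError as error:
        return f"syntax error near line {error.lineno}: {error.msg}"
    return "no syntax errors"


# Hypothetical two-line program whose first line is missing a closing parenthesis.
broken_source = "total = (1 + 2\nprint(total)\n"

print(check_syntax("total = (1 + 2)\nprint(total)\n"))
print(check_syntax(broken_source))
```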

Runtime Errors

Once a program is free of syntax errors it can be compiled and be run. However, that
doesn’t mean that the program is completely free of bugs, just that it is free of the types
of bugs (syntax errors) that the compiler is able to detect. A compiler is not able to
predict every action or possible event that could occur when a program is actually run.
A runtime error is an error that occurs while a program is being executed. For example,
a program could attempt to access a file that does not exist, or attempt to connect to a
remote database, but the computer has lost its network connection, or a user could enter
bad data that results in an invalid arithmetic operation, etc.


A compiler cannot be expected to detect such errors because, by definition, the conditions
that cause runtime errors arise at runtime, not at compile time. One run of
a program could execute successfully, while another subsequent run could fail because
the system conditions have changed. That doesn’t mean that we should not attempt to
mitigate the consequences of runtime errors.

As a programmer it is important to think about the potential problems and runtime
errors that could occur and make contingency plans accordingly. We can make reasonable
assumptions that certain kinds of errors may occur in the execution of our program and
add code to handle those errors if they occur. This is known as error handling (which
we discuss in detail in Chapter 6). For example, we could add code that checks if a user
enters bad input and then re-prompt them to enter good input. If a file is missing, we
could add code to create it as needed. By checking for these errors and preventing illegal,
potentially fatal operations, we practice defensive programming.
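
The idea can be sketched in Python. Both helper names below are hypothetical, and returning None to signal a problem is just one convention among several; it is used here only to keep the example short.

```python
def parse_positive_number(text):
    """Convert user-supplied text to a positive float, or return None for bad input."""
    try:
        value = float(text)
    except ValueError:
        return None  # not a number at all, e.g. "abc"
    if value <= 0:
        return None  # a number, but outside the range we accept
    return value


def safe_divide(numerator, denominator):
    """Divide only when it is legal to do so; return None instead of crashing."""
    if denominator == 0:
        return None
    return numerator / denominator


print(parse_positive_number("12.5"))
print(parse_positive_number("abc"))
print(safe_divide(10, 0))
```

A calling program could re-prompt the user whenever parse_positive_number returns None, which mirrors the re-prompting strategy described above.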

Logic Errors

Other errors may be a result of bad code or bad design. Computers do exactly as they
are told to do. Logic errors can occur if we tell the computer to do something that we
didn’t intend for it to do. For example, if we tell the computer to execute command
A under condition X, but we meant to have the computer execute command B under
condition Y , we have caused a logical error. The computer will perform the first set of
instructions, not the second as we intended. The program may be free of syntax errors
and may execute without any problems, but we certainly don’t get the results that we
expected.

Logic errors are generally only detected and addressed by rigorous software testing. When
developing software, we can also design a collection of test cases: a set of inputs along
with correct outputs that we would expect the program or code to produce. We can then
test the program with these inputs to see if they produce the same output as in the test
cases. If they don’t, then we’ve uncovered a logical error that needs to be addressed.
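
As a minimal sketch of this process in Python, consider a hypothetical averaging function that contains a logic error (an operator-precedence mistake) alongside a corrected version. The function names, the test inputs, and the small run_tests helper are all invented for illustration; real projects would normally use an established testing framework.

```python
def buggy_average(a, b):
    # Logic error: division binds tighter than addition, so this is a + (b / 2).
    return a + b / 2


def fixed_average(a, b):
    return (a + b) / 2


# Each test case pairs a tuple of inputs with the output we expect.
TEST_CASES = [
    ((0, 0), 0.0),
    ((2, 4), 3.0),
    ((-1, 1), 0.0),
]


def run_tests(function):
    """Return the (inputs, expected, actual) triples for every failing test case."""
    failures = []
    for inputs, expected in TEST_CASES:
        actual = function(*inputs)
        if actual != expected:
            failures.append((inputs, expected, actual))
    return failures


print("buggy failures:", run_tests(buggy_average))
print("fixed failures:", run_tests(fixed_average))
```

The buggy version runs without any error and even passes the first test case; only the other test cases expose the problem.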

Rigorous testing can be just as complex as (or even more complex than) writing the program
itself. Testing alone cannot guarantee that a program is free of bugs (in general, the
number of possible inputs is infinite; it is impossible to test all possibilities). However,
the more test cases that we design and pass the higher the confidence we have that the
program is correct.

Testing can also be very tedious. Modern software engineering techniques can help
streamline the process. Many testing frameworks have been developed and built that
attempt to automate the testing process. Test cases can be randomly generated and
test suites can be repeatedly run and verified throughout the development process.
Frameworks can perform regression testing to see if fixing one bug caused or uncovered
another, etc.


2.5.2. Strategies

A common beginner’s way of debugging a program is to insert temporary print statements
throughout their program to see what values variables have at certain points in an attempt
to isolate where an error is occurring. This is an okay strategy for extremely simple
programs, but it’s the “poor man’s” way of debugging. As soon as you start writing
more complex programs you quickly realize that this strategy is slow, inefficient, and
can actually hide the real problems. The standard output is not guaranteed to work as
expected if an error has occurred, so print statements may actually mislead you into
thinking the problem occurs at one point in the program when it actually occurs in a
different part.

Instead, it is much better to use a proper debugging tool in order to isolate the problem.
A debugger is a program that allows you to “simulate” an execution of your program.
You can set break points in your program on certain lines and the debugger will execute
your program up to those points. It then pauses and allows you to look at the program’s
state: you can examine the contents of memory, look at the values stored in certain
variables, etc. Debuggers will also allow you to resume the execution of your program
to the next break point or allow you to “step” through the execution of your program
line by line. This allows you to examine the execution of a program at human speed in
order to diagnose the exact point in execution where the problem occurs. IDEs allow
you to do this visually with a graphical user interface and easy visualization of variables.
However, there are also command line debuggers such as GDB (the GNU Debugger) that you
interact with using text commands.

In general, debugging strategies attempt to isolate a problem to the smallest possible code
segment. Thus, it is best practice to design your code using good procedural abstraction
and place your code into functions and methods (see Chapter 5). It is also good practice
to create test cases and test suites as you develop these small pieces of code.

It can also help to diagnose a problem by looking at the nature of the failure. If some
test cases pass and others fail you can get a hint as to what’s wrong by examining the
key differences between the test cases. If one value passes and another fails, you can
trace that value as it propagates through the execution of your program to see how it
affects other values.

In the end, good debugging skills, just like good coding skills, come from experience. A
seasoned expert may be able to look at an error message and immediately diagnose the
problem. Or, a bug can escape the detection of hundreds of the best developers and
software tools and end up costing millions of dollars and thousands of man-hours.


2.6. Examples

Let’s apply these concepts by developing several prompt-and-compute style programs.
That is, the programs will prompt the user for input, perform some calculations, and
then output a result.

To write these programs, we’ll use pseudocode, an informal, abstract description of a
program/algorithm. Pseudocode does not use any language-specific syntax. Instead,
it describes processes at a high-level, making use of plain English and mathematical
notation. This allows us to focus on the actual process/program rather than worrying
about the particular syntax of a specific language. Good pseudocode is easily translated
into any programming language.

2.6.1. Temperature Conversion

Temperature can be measured in several different scales. The most common for everyday
use are Celsius and Fahrenheit. Let’s write a program to convert from Fahrenheit to
Celsius using the following formula:

$$C = \frac{5}{9} \cdot (F - 32)$$

The basic outline of the program will be three simple steps:

1. Read in a Fahrenheit value from the user

2. Compute a Celsius value using the formula above

3. Output the result to the user

This is actually pretty good pseudocode already, but let’s be a little more specific using
some of the operators and notation we’ve established above. The full program can be
found in Algorithm 2.4.

1 prompt the user to enter a temperature in Fahrenheit
2 F ← read input from user
3 $C \leftarrow \frac{5}{9} \cdot (F - 32)$
4 Output C to the user

Algorithm 2.4: Temperature Conversion Program
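
To show how readily such pseudocode translates, here is one possible rendering of Algorithm 2.4 in Python. The function name fahrenheit_to_celsius is an assumption of this sketch, and the prompt-and-read steps are replaced by a fixed sample value so that the example runs without user interaction; an interactive program would read the value from the user instead.

```python
def fahrenheit_to_celsius(fahrenheit):
    """Convert a Fahrenheit temperature to Celsius using C = 5/9 * (F - 32)."""
    # Multiplying before dividing gives the same result mathematically.
    return (fahrenheit - 32) * 5 / 9


# Steps 1-2 of the algorithm (prompt and read) are replaced by a sample value.
sample_fahrenheit = 212
celsius = fahrenheit_to_celsius(sample_fahrenheit)  # step 3: compute
print(celsius)                                      # step 4: output, prints 100.0
```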


2.6.2. Quadratic Roots

A common math exercise is to find the roots of a quadratic equation with coefficients
$a, b, c$,
$$ax^2 + bx + c = 0$$
using the quadratic formula,
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$
Following the same basic outline, we’ll read in the coefficients from the user, compute
each of the roots, and output the results to the user. We need two computations, one for
each of the roots which we label $r_1, r_2$. The full procedure is presented in Algorithm 2.5.

1 prompt the user to enter a
2 a ← read input from user
3 prompt the user to enter b
4 b ← read input from user
5 prompt the user to enter c
6 c ← read input from user
7 $r_1 \leftarrow \frac{-b + \sqrt{b^2 - 4ac}}{2a}$
8 $r_2 \leftarrow \frac{-b - \sqrt{b^2 - 4ac}}{2a}$
9 Output “the roots of $ax^2 + bx + c$ are $r_1, r_2$”

Algorithm 2.5: Quadratic Roots Program
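
Algorithm 2.5 might be rendered in Python roughly as follows. The function name quadratic_roots is hypothetical, and fixed sample coefficients stand in for the prompt-and-read steps. Like the pseudocode, this sketch performs no input validation: with a = 0 the division fails with a ZeroDivisionError, and a negative discriminant makes math.sqrt raise a ValueError.

```python
import math


def quadratic_roots(a, b, c):
    """Return both roots (r1, r2) of a*x^2 + b*x + c = 0 via the quadratic formula."""
    discriminant = b * b - 4 * a * c
    root = math.sqrt(discriminant)  # fails for a negative discriminant
    r1 = (-b + root) / (2 * a)      # fails for a == 0
    r2 = (-b - root) / (2 * a)
    return r1, r2


# Sample coefficients for x^2 - 3x + 2 = 0, which factors as (x - 1)(x - 2).
r1, r2 = quadratic_roots(1, -3, 2)
print(f"the roots of x^2 - 3x + 2 are {r1}, {r2}")
```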

2.7. Exercises

Exercise 2.1. Write a program that calculates mileage deduction for income tax using
the standard rate of $0.575 per mile. Your program will read in a beginning and ending
odometer reading and calculate the difference and total deduction. Take care that your
output is in whole cents. An example run of the program may look like the following.

INCOME TAX MILEAGE CALCULATOR

Enter beginning odometer reading--> 13505.2
Enter ending odometer reading--> 13810.6
You traveled 305.4 miles. At $.575 per mile,
your reimbursement is $175.61


Exercise 2.2. Write a program to compute the total “cost” C of a loan. That is, the
total amount of interest paid over the life of a loan. To compute this value, use the
following formula.
$$C = \frac{p \cdot i \cdot (1 + i)^{12n}}{(1 + i)^{12n} - 1} \cdot 12n - p$$
where

• $p$ is the starting principal amount

• $i = \frac{r}{12}$ where $r$ is the APR on the interval $[0, 1]$

• n is the number of years the loan is to be paid back

Exercise 2.3. Write a program to compute the annualized appreciation of an asset (say
a house). The program should read in a purchase price p, a sale price s and compute
their difference d = s − p (it should support a loss or gain). Then, it should compute an
appreciation rate: $r = \frac{d}{p}$ along with an (average) annualized appreciation rate (that is,
what was the appreciation rate in each year that the asset was held, compounded):
$$(1 + r)^{\frac{1}{y}} - 1$$

where $y$ is the number of years (possibly fractional) the asset was held (and $r$ is on the
scale $[0, 1]$).
Exercise 2.4. The annual percentage yield (APY) is a much more accurate measure of
the true cost of a loan or savings account that compounds interest on a monthly or daily
basis. For a large enough number of compounding periods, it can be calculated as:

$$\mathrm{APY} = e^{i} - 1$$

where i is the nominal interest rate (6% = 0.06). Write a program that prompts the user
for the nominal interest rate and outputs the APY.
Exercise 2.5. Write a program that calculates the speed of sound (v, feet-per-second)
in the air of a given temperature T (in Fahrenheit). Use the formula,
$$v = 1086 \sqrt{\frac{5T + 297}{247}}$$
Be sure your program does not lose the fractional part of the quotient in the formula
shown and format the output to three decimal places.
Exercise 2.6. Write a program to convert from radians to degrees using the formula
$$\mathrm{deg} = \frac{180 \cdot \mathrm{rad}}{\pi}$$
However, radians are on the scale [0, 2π). After reading input from the user be sure to
do some error checking and give an error message if their input is invalid.


Exercise 2.7. Write a program to compute the Euclidean Distance between two points,
$(x_1, y_1)$ and $(x_2, y_2)$ using the formula:
$$\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$

Exercise 2.8. Write a program that will compute the value of sin(x) using the first 4
terms of the Taylor series:

$$\sin(x) \approx x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!}$$
In addition, your program will compute the absolute difference between this calculation
and a standard implementation of the sine function supported in your language. Your
program should prompt the user for an input value x and display the appropriate output.
Your output should look something like the following.

Sine Approximation
===================
Enter x: 1.15
Sine approximation: 0.912754
Sine value: 0.912764
Difference: 0.000010

Exercise 2.9. Write a program to compute the roots of a quadratic equation:

$$ax^2 + bx + c = 0$$

using the well-known quadratic formula:

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$
Your program will prompt the user for the values, a, b, c and output each real root.
However, for “invalid” input (a = 0 or values that would result in complex roots), the
program will instead output a message that informs the user why the inputs are
invalid (with a specific reason).

Exercise 2.10. One of Ohm’s laws can be used to calculate the amount of power in
Watts (the rate of energy conversion; 1 joule per second) in terms of Amps (a measure of
current, 1 amp = $6.241 \times 10^{18}$ electrons per second) and Ohms (a measure of electrical
resistance). Specifically:
$$W = A^2 \cdot O$$
Develop a simple program to read in two of the terms from the user and output the third.


Exercise 2.11. Ohm’s Law models the current through a conductor as follows:
$$I = \frac{V}{R}$$
where V is the voltage (in volts), R is the resistance (in Ohms) and I is the current (in
amps). Write a program that, given two of these values computes the third using Ohm’s
Law.

The program should work as follows: it prompts the user for units of the first value:
the user should be prompted to enter V, R, or I and should then be prompted for the
value. It should then prompt for the second unit (same options) and then the value. The
program will then output the third value depending on the input. An example run of
the program:

Current Calculator
==============
Enter the first unit type (V, R, I): V
Enter the voltage: 25.75
Enter the second unit type (V, R, I): I
Enter the current: 72
The corresponding resistance is 0.358 Ohms

Exercise 2.12. Consider the following linear system of equations in two unknowns:
ax + by = c
dx + ey = f

Write a program that prompts the user for the coefficients in such a system (prompt for
a, b, c, d, e, f ). Then output a solution to the system (the values for x, y). Take care to
handle situations in which the system is inconsistent.
Exercise 2.13. The surface area of a sphere of radius r is

$$4\pi r^2$$

and the volume of a sphere with radius $r$ is

$$\frac{4}{3}\pi r^3$$
Write a program that prompts the user for a radius r and outputs the surface area
and volume of the corresponding sphere. If the radius entered is invalid, print an error
message and exit. Your output should look something like the following.


Sphere Statistics
=================
Enter radius r: 2.5
area: 78.539816
volume: 65.449847

Exercise 2.14. Write a program that prompts for the latitude and longitude of two
locations (an origin and a destination) on the globe. These numbers are in the range
[−180, 180] (negative values correspond to the western and southern hemispheres). Your
program should then compute the air distance between the two points using the Spherical
Law of Cosines. In particular, the distance d is

d = arccos (sin(ϕ1 ) sin(ϕ2 ) + cos(ϕ1 ) cos(ϕ2 ) cos(∆)) · R

• ϕ1 is the latitude of location A, ϕ2 is the latitude of location B

• ∆ is the difference between location B’s longitude and location A’s longitude

• R is the (average) radius of the earth, 6,371 kilometers

Note: the formula above assumes that latitude and longitude are measured in radians r,
−π ≤ r ≤ π. See Exercise 2.6 for how to convert between them. Your program output
should look something like the following.

City Distance
========================
Enter latitude of origin: 41.9483
Enter longitude of origin: -87.6556
Enter latitude of destination: 40.8206
Enter longitude of destination: -96.7056
Air distance is 764.990931

Exercise 2.15. Write a program that prompts the user to enter in a number of days.
Your program should then compute the number of years, weeks, and days that number
represents. For this exercise, ignore leap years (thus all years are 365 days). Your output
should look something like the following.

Day Converter
=============


Enter number of days: 1000

That is
2 years
38 weeks
4 days

Exercise 2.16. The derivative of a function f (x) can be estimated using the difference
function:
$$f'(x) \approx \frac{f(x + \Delta x) - f(x)}{\Delta x}$$
That is, this gives us an estimate of the slope of the tangent line at the point x. Write a
program that prompts the user for an x value and a ∆x value and outputs the value of
the difference function for all three of the following functions:

$$f(x) = x^2$$
$$f(x) = \sin(x)$$
$$f(x) = \ln(x)$$

Your output should look something like the following.

Derivative Approximation
===================
Enter x: 2
Enter delta-x: 0.1
(x^2)' ~= 4.100000
sin'(x) ~= -0.460881
ln'(x) ~= 0.487902

In addition, your program should check for invalid inputs: ∆x cannot be zero, and ln(x)
is undefined for x ≤ 0. If given invalid inputs, appropriate error message(s) should be
output instead.

Exercise 2.17. Write a program that prompts the user to enter two points in the plane,
$(x_1, y_1)$ and $(x_2, y_2)$, which define a line segment $\ell$. Your program should then compute and
output an equation for the perpendicular line intersecting the midpoint of $\ell$. You should
take care that invalid inputs (horizontal or vertical lines) are handled appropriately. An
example run of your program would look something like the following.


Perpendicular Line
====================
Enter x1: 2.5
Enter y1: 10
Enter x2: 3.5
Enter y2: 11
Original Line:
y = 1.0000 x + 7.5000
Perpendicular Line:
y = -1.0000 x + 13.5000

Exercise 2.18. Write a program that computes the total for a bill. The program should
prompt the user for a sub-total. It should then prompt whether or not the customer is
entitled to an employee discount (of 15%) by having them enter 1 for yes, 2 for no. It
should then compute the new sub-total and apply a 7.35% sales tax, and print the receipt
details along with the grand total. Take care that you properly round each operation.

An example run of the program should look something like the following.

Please enter a sub-total: 100

Apply employee discount (1=yes, 2=no)? 1

Receipt
========================
Sub-Total $ 100.00
Discount $ 15.00
Taxes $ 6.25
Total $ 91.25

Exercise 2.19. The ROI (Return On Investment) is computed by the following formula:

$$\mathrm{ROI} = \frac{\text{Gain from Investment} - \text{Cost of Investment}}{\text{Cost of Investment}}$$
Write a program that prompts the user to enter the cost and gain (how much it was sold
for) from an investment and computes and outputs the ROI. For example, if the user
enters $100,000 and $120,000 respectively, the output should look similar to the following.


Cost of Investment: $100000.00

Gain of Investment: $120000.00
Return on Investment: 20.00%

Exercise 2.20. Write a program to compute the real cost of driving. Gas mileage (in
the US) is usually measured in miles per gallon but the real cost should be measured in
how much it costs to drive a mile, that is, dollars per mile. Write a program to assist a
user in figuring out the real cost of driving. Prompt the user for the following inputs.

• Beginning odometer reading

• Ending odometer reading

• Number of gallons it took to fill the tank

• Cost of gas in dollars per gallon

For example, if the user enters 50,125, 50,430, 10 (gallons), and $3.25 (per gallon), then
your output should be something like the following.

Miles driven: 305

Miles per gallon: 30.50
Cost per mile: $0.11

Exercise 2.21. A bearing can be measured in degrees on the scale of [0, 360) with 0◦
being due north, 90◦ due east, etc. The (initial) directional bearing from location A to
location B can be computed using the following formula.

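$$\theta = \operatorname{atan2}\bigl(\sin(\Delta) \cdot \cos(\varphi_2),\ \cos(\varphi_1) \cdot \sin(\varphi_2) - \sin(\varphi_1) \cdot \cos(\varphi_2) \cdot \cos(\Delta)\bigr)$$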

Where

• ϕ1 is the latitude of location A

• ϕ2 is the latitude of location B

• ∆ is the difference between location B’s longitude and location A’s longitude

• atan2 is the two-argument arctangent function

Note: the formula above assumes that latitude and longitude are measured in radians $r$,
$-\pi < r < \pi$. To convert from degrees $d$ ($-180 < d < 180$) to radians $r$, you can use the
simple formula:
$$r = \frac{d}{180} \pi$$
Write a program to prompt a user for a latitude/longitude of two locations (an origin and
a destination) and computes the directional bearing (in degrees) from the origin to the
destination. For example, if the user enters: 40.8206, −96.7056 (40.8206◦ N, 96.7056◦ W)
and 41.9483, −87.6556 (41.9483◦ N, 87.6556◦ W), your program should output something
like the following.

From (40.8206, -96.7056) to (41.9483, -87.6556):

bearing 77.594671 degrees

Exercise 2.22. Special relativity tells us that time is relative to your velocity. As you
approach the speed of light ($c = 299{,}792$ km/s), time slows down relative to objects
traveling at a slower velocity. This time dilation is quantified by the Lorentz equation
$$t' = \frac{t}{\sqrt{1 - \frac{v^2}{c^2}}}$$

where $t$ is the time duration on the traveling space ship and $t'$ is the time duration on
(say) Earth.

For example, if we were traveling at 50% the speed of light relative to Earth, one hour in
our space ship (t = 1) would correspond to
$$t' = \frac{1}{\sqrt{1 - (0.5)^2}} = 1.1547$$

hours on Earth (about 1 hour, 9.28 minutes).

Write a program that prompts the user for a velocity which represents the percentage
$p$ of the speed of light (that is, $p = \frac{v}{c}$) and a time duration $t$ in hours and outputs the
relative time duration on Earth.

For example, if the user enters 0.5 and 1 respectively as in our example, it should output
something like the following:

Traveling at 1 hour(s) in your space ship at

50.00% the speed of light, your friends on
Earth would experience:
1 hour(s)
9.28 minute(s)


Your output should be able to handle years, weeks, days, hours, and minutes. So if the
user inputs something like 0.9999 and 168, your output should look something like:

Traveling at 168.00 hour(s) in your space ship at

99.99% the speed of light, your friends on
Earth would experience:
1 year(s)
18 week(s)
3 day(s)
17 hour(s)
41.46 minute(s)

Exercise 2.23. Radioactive isotopes decay into other isotopes at a rate that is measured
by a half-life, H. For example, Strontium-90 has a half-life of 28.79 years. If you started
with 10 kilograms of Strontium-90, 28.79 years later you would have only 5 kilograms
(with the remaining 5 kilograms being Yttrium-90 and Zirconium-90, Strontium-90’s
decay products).

Given a mass m of an isotope with half-life H we can determine how much of the isotope
remains after y years using the formula,
$$r = m \cdot \left(\frac{1}{2}\right)^{(y/H)}$$

For example, if we have $m = 10$ kilograms of Strontium-90 with $H = 28.79$, after $y = 2$
years we would have
$$r = 10 \cdot \left(\frac{1}{2}\right)^{(2/28.79)} = 9.5298$$
kilograms of Strontium-90 left.

Write a program that prompts the user for an amount m (mass, in kilograms) of an
isotope and its half-life H as well as a number of years y and outputs the amount of
the isotope remaining after y years. For the example above your output should look
something like the following.

Starting with 10.00kg of an isotope with half-life

28.79 years, after 2.00 years you would have
9.5298 kilograms left.


Exercise 2.24. In sports, the magic number is a number that indicates the combination
of the number of games that a leader in a division must win and/or the 2nd place team
must lose for the leader to clinch the division. The magic number can be computed using
the following formula:
$$G + 1 - W_A - L_B$$
where $G$ is the number of games played in the season, $W_A$ is the number of wins the
leader currently has and $L_B$ is the number of losses the 2nd place team currently has.

Write a program that prompts the user to enter:

• $G$, the total number of games in the season

• $W_A$, the number of wins of the leading team

• $L_A$, the number of losses of the leading team

• $W_B$, the number of wins of the second place team

• $L_B$, the number of losses of the second place team

Your program will then output the current win percentages of both teams, the magic
number of the leading team as well as the percentage of the remaining games that must
go in team A’s favor to satisfy the magic number (for this, we will assume A and B do
not play each other).

For example, if a user enters the values 162, 96, 58, 93, 62, the output should look
something like the following.

Team Wins Loss Perc Magic Number

A 96 58 62.34% 5
B 93 62 60.00%
With 15 total games remaining, 33.33% must go in Team A's favor

Exercise 2.25. The Doppler Effect is a change in the observed spectral frequency of
light when objects in space are moving toward or away from us. If the spectral frequency
of the object is known, then the relative velocity can be estimated based on either the
blue-shift (for objects traveling toward the observer) or the red-shift (for objects traveling
away from the observer) of the observed frequency.

The blue-shift equation to determine velocity is given by

$$v_b = c \left(\frac{\lambda}{\lambda_b} - 1\right)$$


The red-shift equation to determine velocity is given by

$$v_a = c \left(1 - \frac{\lambda}{\lambda_r}\right)$$
where

• c is the speed of light (299,792.458 km/s)

• λ is the actual spectral line of the object (ex: hydrogen is 434nm)

• $\lambda_r$ is the observed (red-shifted) spectral line and $\lambda_b$ is the observed (blue-shifted)
spectral line

Write a program that prompts the user to enter which spectral shift they want to compute
(1 for blue-shift, 2 for red-shift). The program should then prompt for the spectral line
of an object and the observed (shifted) frequency and output the velocity of the distant
object. For example, if a user enters the values 1 (blue-shifted), 434 (nm) and 487 (nm),
the output should look something like the following.

Spectral Line: 434nm

Observed Line: 487nm
Relative Velocity: 32626.28 km/s

Exercise 2.26. Radiometric dating is a technique by which the age of rocks, minerals,
and other material can be estimated by measuring the proportion of radioactive isotopes
it still has to its decay products. It can be computed with the following formula:

$$D = D_0 + N(e^{\lambda t} - 1)$$

where

• $t$ is the age of the sample,

• $D$ is the number of atoms of the daughter isotope in the sample,

• $D_0$ is the number of atoms of the daughter isotope in the original composition,

• $N$ is the number of atoms of the parent isotope in the sample,

• $\lambda$ is the decay constant of the parent isotope,

$$\lambda = \frac{\ln 2}{t_{1/2}}$$

where $t_{1/2}$ is the half-life of the parent isotope (in years).


Write a program that prompts the user to enter $D$, $D_0$, $N$, and $t_{1/2}$ and computes the
approximate age of the material, t.

For example, if the user were to enter 150, 50, 300, 28.8 (Strontium-90’s half-life) then
the program should output something like the following.

The sample appears to be 11.953080 years old.

Exercise 2.27. Suppose you have two circles centered at $(0, 0)$ and $(d, 0)$ with radii
of $R$ and $r$, respectively. These circles may intersect at two points, forming an asymmetric
“lens” as in Figure 2.4.

The area of this lens can be computed using the following formula:
$$A = r^2 \cos^{-1}\left(\frac{d^2 + r^2 - R^2}{2dr}\right) + R^2 \cos^{-1}\left(\frac{d^2 + R^2 - r^2}{2dR}\right) - \frac{1}{2}\sqrt{(-d + r + R)(d + r - R)(d - r + R)(d + r + R)}$$

Write a program that prompts the user for the two radii and the x-offset d and computes
the area of this lens. Your program should handle the special cases where the two circles
do not intersect and when they intersect at a single point (the area is zero).

Figure 2.4.: Intersection of two circles.

3. Conditionals

When writing code, it’s important to be able to distinguish between two or more situations.
Based on some condition being true or false, you may want to perform some action if
it’s true, while performing another, different action if it is false. Alternatively, you may
simply want to perform one action if and only if the condition is true, and do nothing
(move forward in your program) if it is false.

Normally, the control flow of a program is sequential: each statement is executed top-to-bottom,
one after the other. A conditional statement (sometimes called a selection control
structure) interrupts this normal control flow and executes statements only if some
specified condition holds. The usual way of achieving this in a programming language is
through the use of conditional statements such as the if statement, if-else statement, and
if-else-if statement.

By using conditional statements, we can design more expressive programs whose behavior
depends on their state: if the value of some variable is greater than some threshold, we
can perform action A, otherwise, we can perform action B. You do this on a daily basis
as you make decisions for yourself. At a cafe you may want to purchase the grande coffee
which costs $2. If you have $2 or more, then you’ll buy it. Otherwise, if you have less
than $2, you can settle for the normal coffee which costs $1. Yet still, if you have less
than $1 you’ll not be able to make a purchase. The value of your pocket book determines
your decision and subsequent actions that you take.
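
The cafe decision can be sketched as an if-else-if chain in Python. The function name choose_coffee and the returned strings are invented for this illustration; the prices ($2 and $1) come from the example above.

```python
def choose_coffee(dollars):
    """Decide what to buy given the amount of money on hand."""
    if dollars >= 2:
        return "grande coffee"  # enough for the $2 grande
    elif dollars >= 1:
        return "normal coffee"  # less than $2, but enough for the $1 coffee
    else:
        return "nothing"        # less than $1: no purchase


for amount in (5, 1.50, 0.75):
    print(amount, "->", choose_coffee(amount))
```

Because the conditions are checked from top to bottom, the second branch is only reached when the first condition is false, so it does not need to re-check that the amount is below $2.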

Similarly, our programs need to be able to “make decisions” based on various conditions
(they don’t actually make decisions for themselves as computers are not really “intelligent”,
we are simply specifying what should occur based on the conditions). Conditions in a
program are specified by coding logical statements using logical operators.

3.1. Logical Operators

In logic, everything is black and white: a logical statement is an assertion that is either
true or it is false. As previously discussed, some programming languages allow you to
define and use Boolean variables that can be assigned the value true or false. We can
also formulate statements that involve other types of variables whose truth values are
determined by the values of the variables at run time.


3.1.1. Comparison Operators

Suppose we have a variable age representing the age of an individual. Suppose we wish
to execute some code if the person is an adult, age ≥ 18 and a different piece of code if
they are not an adult, age < 18. To achieve this, we need to be able to make comparisons
between variables, constants, and even more complex expressions. Such logical statements
may not have a fixed truth value. That is, they could be true or false depending on the
value of the variables involved when the program is run.

Such comparisons are common in mathematics and likewise in programming languages.

Comparison operators are usually binary operators in that they are applied to two
operands: a left operand and a right operand. For example, if a, b are variables (or
constants or expressions), then the comparison,

a≤b

is true if the value stored in a is less than or equal to the value stored in b. Otherwise, if
the value stored in b is strictly less than the value stored in a, the expression is false.
Further, a, b are the operands and ≤ is the binary operator.

In general, operators do not commute. That is,

a ≤ b and b ≤ a

are not equivalent, just as they are not in mathematics. However,

a ≤ b and b ≥ a

are equivalent. Thus, the order of operands is important and can change the meaning
and truth value of an expression.

A full listing of comparison operators can be found in Table 3.1. In this table, we present both
the mathematical notation used in our pseudocode examples as well as the most common
ways of representing these comparison operators in programming languages. Alternative
representations are needed because the mathematical symbols are not part of
the ASCII character set common to most keyboards.

When using comparison operators, either operand can be a variable, a constant, or a
more complex expression. For example, you can make comparisons between two variables,

a < b, a > b, a ≤ b, a ≥ b, a = b, a ≠ b

or they can be between a variable and a constant

a < 10, a > 10, a ≤ 10, a ≥ 10, a = 10, a ≠ 10

or

10 < b, 10 > b, 10 ≤ b, 10 ≥ b, 10 = b, 10 ≠ b

Pseudocode   Code   Meaning                    Type

<            <      less than                  relational
>            >      greater than               relational
≤            <=     less than or equal to      relational
≥            >=     greater than or equal to   relational
=            ==     equal to                   equality
≠            !=     not equal to               equality

Table 3.1.: Comparison Operators
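
The operators in the Code column of Table 3.1 can be tried directly in most languages. The following is a minimal sketch in Python, used here purely for illustration; the function names is_adult and compare are hypothetical and do not come from the text. Each comparison produces a truth value that depends on the values of the operands when the code runs.

```python
def is_adult(age: int) -> bool:
    """Return True when age >= 18, the threshold used in the text."""
    return age >= 18


def compare(a: int, b: int) -> dict:
    """Evaluate all six comparison operators from Table 3.1 on a and b."""
    return {
        "<": a < b,
        ">": a > b,
        "<=": a <= b,
        ">=": a >= b,
        "==": a == b,
        "!=": a != b,
    }


# The same expression has a different truth value for different operands.
print(is_adult(17), is_adult(18))
print(compare(3, 7))
```

Swapping the operands of a comparison generally changes its result, while a ≤ b and b ≥ a always agree, which can be confirmed by calling compare(a, b) and compare(b, a) on the same pair of values.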

Comparisons can also be used with more complex expressions such as

√(b² − 4ac) < 0

which could commonly be expressed in code as

sqrt(b*b - 4*a*c) < 0

Observe that both operands could be constants, such as 5 ≤ 10, but there would be little
point. Since both are constants, the truth value of the expression is already determined
before the program runs. Such an expression could easily be replaced with a simple true
or false value. Expressions that are always true or always false are referred to as
tautologies and contradictions, respectively. We’ll examine them in more detail below.

Pitfalls

Sometimes you may want to check that a variable falls within a certain range. For
example, we may want to test that x lies in the interval [0, 10] (between 0 and 10 inclusive
on both ends). Mathematically we could express this as

0 ≤ x ≤ 10

and in code, we may try to do something like

0 <= x <= 10

However, when used in code, the operator <= is binary and must be applied to two
operands. In most languages, the first inequality, 0 <= x, would be evaluated first and
would result in either true or false. That result is then used as the left operand of the
second comparison, which results in a question such as true ≤ 10 or false ≤ 10.

Some languages would treat this as a syntax error and not allow such an expression to
be compiled since you cannot compare a Boolean value to a numerical value. However,
other languages may allow this, typically representing true with some nonzero value such
as 1 and false with 0. In either case, the expression would evaluate to true since both
0 ≤ 10 and 1 ≤ 10. However, this is clearly wrong: if x had a value of 20, for example, the
first comparison, 0 ≤ 20, would evaluate to true (represented as 1), making the entire
expression true since 1 ≤ 10, but 20 ≰ 10.
The solution is to use logical operators to express the same logic using two comparison
operators (see Section 3.1.3).
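
The fix described above can be sketched as two separate comparisons joined by a logical And. The example below uses Python for illustration, where the pseudocode And is spelled `and`; the function name in_range and its parameter names are hypothetical. Python happens to be one of the languages that defines the chained form 0 <= x <= 10 to mean what it means in mathematics, so the explicit version is shown because it is the form that carries over to languages in which <= is strictly binary.

```python
def in_range(x: float, low: float, high: float) -> bool:
    """Return True when x lies in the closed interval [low, high].

    The range test is written as two binary comparisons combined with
    a logical And rather than as a single chained comparison.
    """
    return (low <= x) and (x <= high)


print(in_range(5, 0, 10))
print(in_range(20, 0, 10))
```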

Another common pitfall when programming is to confuse the assignment operator
(typically only one equals sign, = ) with the equality operator (typically two equals signs,
== ). As before, some languages will not allow it: the expression a = 10 would not have
a truth value associated with it, and attempts to use it in a logical statement
would be a syntax error. Other languages may permit the expression and would give it a
truth value based on the value assigned to the variable. For example, a = 10 would take on the
value 10 and be treated as true (nonzero value) while a = 0 would take on the value 0
and be treated as false (zero). In either case, we probably do not get the result that we
want. Take care that you use proper equality comparison operators.

Other Considerations

The comparison operators that we’ve examined are generally used for comparing numerical
types. However, sometimes we wish to compare non-numeric types such as single
characters or strings. Some languages allow you to use numeric operators with these
types as well.

Some dynamically typed languages (PHP, JavaScript, etc.) have additional rules when
comparison operators are used with mixed types (that is, we compare a string with a
numeric type). They may even have additional “strict” comparison operators such as
(a === b) and (a !== b) which are true only if the values and types match. So, for
example, (10 == "10") may be true because the values match, but (10 === "10")
would be false since the types do not match (one is an integer, the other a string). We
discuss specifics in subsequent chapters as they pertain to specific languages.

3.1.2. Negation

The negation operator is an operator that “flips” the truth value of the expression that
it is applied to. It is very much like the numerical negation operator which when applied
to positive numbers results in their negation and vice versa. When the logical negation
operator is applied to a variable or statement, it negates its truth value. If the variable
or statement was true, its negation is false and vice versa.

Also like the numerical negation operator, the logical negation operator is a unary
operator as it applies to only one operand. In modern logic, the symbol ¬ is used to
denote the negation operator.¹ Examples:

¬p, ¬(a > 10), ¬(a ≤ b)

a       ¬a
false   true
true    false

Table 3.2.: Logical Negation, ¬ Operator

We will adopt this notation in our pseudocode; however, most programming languages use
the exclamation mark, !, for the negation operator, similar to its usage in the inequality
comparison operator, != . The negation operator applies to the variable or statement
immediately following it, thus

¬(a ≤ b) and ¬a ≤ b

are not the same thing (indeed, the second expression may not even be valid depending
on the language). Further, when used with comparison operators, it is better to use the
“opposite” comparison. For example,

¬(a ≤ b) and (a > b)

are equivalent, but the second expression is preferred as it is simpler. Likewise,

¬(a = b) and (a ≠ b)

are equivalent, but the second expression is preferred.

3.1.3. Logical And

The logical and operator (also called a conjunction) is a binary operator that is true if
and only if both of its operands are true. If one of its operands is false, or if both of them
are false, then the result of the logical and is false.

Many programming languages use two ampersands, a && b, to denote the logical And
operator.² However, for our pseudocode we will adopt the notation And and we will
use expressions such as a And b. Table 3.3 contains a truth table representation of the
logical And operator.
¹ This notation was first used by Heyting, 1930 [16]; prior to that the tilde symbol was used (∼p for
example) by Peano [33] and Whitehead & Russell [37]. However, the tilde operator has been adopted
to mean bit-wise negation in programming languages.

² In logic, the “wedge” symbol, p ∧ q, is used to denote the logical And. It was also first used by
Heyting, 1930 [16], but should not be confused with the keyboard caret symbol, ^. Many programming
languages do use the caret as an operator, but it is usually the exclusive-or operator, which is true if
and only if exactly one of its operands is true.

a       b       a And b
false   false   false
false   true    false
true    false   false
true    true    true

Table 3.3.: Logical And Operator

The logical And is used to combine logical statements to form more complex logical
statements. Recall that we couldn’t directly use two comparison operators to check that
a variable falls within a range, 0 ≤ x ≤ 10. However, we can now use a logical And to
express this:

(0 ≤ x) And (x ≤ 10)

This expression is true only if both comparisons are true.

Though the And operator is a binary operator, we can write statements that involve
more than one variable or expression by using multiple instances of the operator. For
example,

b² − 4ac ≥ 0 And a ≠ 0 And c > 0

The above statement would be evaluated left-to-right; the first two operands would be
evaluated and the result would be either true or false. Then the result would be used as
the first operand of the second logical And. In this case, if any of the operands evaluated
to false, the entire expression would be false. Only if all three were true would the
statement be true.

3.1.4. Logical Or

The logical or operator is the binary operator that is true if at least one of its operands
is true. If both of its operands are false, then the logical or is false. This is in contrast
to what is usually meant by “or” colloquially. If someone says “you can have cake
or ice-cream,” usually they implicitly also mean, “but not both.” With the logical or
operator, if both operands are true, the result is still true.

Many programming languages use two vertical bars (also referred to as Sheffer strokes),
||, to denote the logical Or operator.³ However, for our pseudocode we will adopt the
notation Or, thus the logical or can be expressed as a Or b. Table 3.4 contains a truth
table representation of the logical Or operator.

³ In logic, the “vee” symbol, p ∨ q, is used to denote the logical Or. It was first used by Russell, 1906
[35].

a       b       a Or b
false   false   false
false   true    true
true    false   true
true    true    true

Table 3.4.: Logical Or Operator

As with the logical And, the logical Or is used to combine logical statements to make
more complex statements. For example,

(age ≥ 18) Or (year = “senior”)

which is true if the individual is aged 18 or older, is a senior, or is both 18 or older and
a senior. If the individual is aged less than 18 and is not a senior, then the statement
would be false.

We can also write statements with multiple Or operators,

a > b Or b > c Or a > c

which will be evaluated left-to-right. If any of the three operands is true, the statement
will be true. The statement is only false when all three of the operands are false.

3.1.5. Compound Statements

The logical And and Or operators can be combined to express even more complex logical
statements. For example, you can express the following statements involving both of the
operators:

a And (b Or c)

a Or (b And c)

As an example, consider the problem of deciding whether or not a given year is a leap
year. The Gregorian calendar defines a year as a leap year if it is divisible by 4. However,
every year that is divisible by 100 is not a leap year unless it is also divisible by 400.
Thus, 2012 is a leap year (4 goes into 2012 503 times); however, 1900 was not a leap year:
though it is divisible by 4 (1900/4 = 475 with no remainder), it is also divisible by 100 but
not by 400. The year 2000 was a leap year: though it was divisible by 100, it was also
divisible by 400.

When generalizing these rules into logical statements we can follow a similar process: A
year is a leap year if it is divisible by 400 or it is divisible by 4 and not by 100. This
logic can be modeled with the following expression.

year mod 400 = 0 Or (year mod 4 = 0 And year mod 100 ≠ 0)
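
As a minimal sketch, the leap year expression translates almost symbol for symbol into code. The example below uses Python for illustration, where mod is written `%` and the pseudocode And and Or are spelled `and` and `or`; the function name is_leap_year is hypothetical.

```python
def is_leap_year(year: int) -> bool:
    """Apply the Gregorian rule: divisible by 400, or by 4 but not by 100."""
    return year % 400 == 0 or (year % 4 == 0 and year % 100 != 0)


# The three years discussed in the text.
for y in (2012, 1900, 2000):
    print(y, is_leap_year(y))
```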

When writing logical statements in programs it is generally best practice to keep things
simple. Logical statements should be written in the simplest and most succinct (but
correct) way possible.

Tautologies and Contradictions

Some logical statements have the same truth value regardless of the values of the
variables involved. For example,

a Or ¬a

is always true regardless of the value of a. To see this, suppose that a is true, then the
statement becomes

a Or ¬a = true Or false

which is true. Now suppose that a is false, then the statement is

a Or ¬a = false Or true

which again is true. A statement that is always true regardless of the truth values of its
variables is a tautology.

Similarly, the statement

a And ¬a
is always false (at least one of the operands will always be false). A statement that is
always false regardless of the truth values of its variables is a contradiction.

In most cases, it is pointless to program a conditional statement with tautologies or
contradictions: if an if-statement is predicated on a tautology, its code block will always be
executed. Likewise, the code block of an if-statement predicated on a contradiction will never
be executed. In either case, many compilers or code analysis tools may indicate and warn
about these situations and encourage you to modify the code or to remove “dead code.” Some
languages may not even allow you to write such statements.

There are always exceptions to the rule. Sometimes you may wish to intentionally write
an infinite loop (see Section 4.5.2), for example, in which case a statement similar to the
following may be written.

while true do
    //some computation
end

De Morgan’s Laws

Another tool to simplify your logic is De Morgan’s Laws. The negation of a logical And
statement is equivalent to a logical Or of the negated operands, and vice versa. That is,

¬(a And b) and ¬a Or ¬b

are equivalent to each other;

¬(a Or b) and ¬a And ¬b

are also equivalent to each other. Though equivalent, it is generally preferable to write
the simpler statement. From one of our previous examples, we could write

¬ ((0 ≤ x) And (x ≤ 10))

or we could apply De Morgan’s Law and simplify this to

(0 > x) Or (x > 10)

which is more concise and arguably more readable.

Order of Precedence

Recall that numerical operators have a well defined order of precedence that is taken from
mathematics (multiplication is performed before addition for example, see Section 2.3.4).
When working with logical operators, we also have an order of precedence that somewhat
mirrors those of numerical operators. In particular, negations are always applied first,
followed by And operators, and then lastly Or operators. These rules are summarized in
Table 3.5.

Order   Operator
1       ¬
2       And
3       Or

Table 3.5.: Logical Operator Order of Precedence

For example, the statement

a Or b And c

is somewhat ambiguous. We don’t just evaluate it left-to-right since the And operator
has a higher order of precedence (this is similar to the mathematical expression a + b · c
where the multiplication would be evaluated first). Instead, this statement would be
evaluated by evaluating the And operator first and then the result would be applied to
the Or operator. Equivalently,

a Or (b And c)

If we had meant that the Or operator should be evaluated first, then we should have
explicitly written parentheses around the operator and its operands like

(a Or b) And c

In fact, it’s best practice to write parentheses even if it is not necessary. Writing
parentheses is often clearer and easier to read and, more importantly, communicates intent.
By writing

a Or (b And c)

the intent is clear: we want the And operator to be evaluated first. By not writing the
parentheses we leave our meaning somewhat ambiguous and force whoever is reading the
code to recall the rules for order of precedence. By explicitly writing parentheses, we
reduce the chance for error both in writing and in reading. Besides, it’s not like we’re
paying by the character.

Operators of the same precedence are evaluated left-to-right, thus

a Or b Or c is equivalent to ((a Or b) Or c)

and

a And b And c is equivalent to ((a And b) And c)
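
Equivalences such as De Morgan’s Laws and the precedence rules in this section can be checked exhaustively, since each variable has only two possible truth values. The following is a minimal sketch in Python, used for illustration only, where ¬, And, and Or are spelled `not`, `and`, and `or` and follow the same order of precedence. The helper name holds_for_all and the result variable names are hypothetical.

```python
from itertools import product


def holds_for_all(statement, num_vars: int) -> bool:
    """Return True if statement is true for every assignment of truth values."""
    return all(
        statement(*values) for values in product([False, True], repeat=num_vars)
    )


# De Morgan's Laws: both sides agree on every row of the truth table.
de_morgan_and = holds_for_all(
    lambda a, b: (not (a and b)) == ((not a) or (not b)), 2
)
de_morgan_or = holds_for_all(
    lambda a, b: (not (a or b)) == ((not a) and (not b)), 2
)

# Precedence: And binds tighter than Or, so a Or b And c means a Or (b And c).
and_before_or = holds_for_all(
    lambda a, b, c: (a or b and c) == (a or (b and c)), 3
)

# Moving the parentheses changes the meaning for at least one assignment.
grouping_matters = not holds_for_all(
    lambda a, b, c: (a or b and c) == ((a or b) and c), 3
)

# A tautology holds for every assignment; a contradiction holds for none.
tautology = holds_for_all(lambda a: a or not a, 1)
contradiction_never_true = holds_for_all(lambda a: not (a and not a), 1)

print(de_morgan_and, de_morgan_or, and_before_or, grouping_matters)
```

Checking every row of the truth table is only practical for a handful of variables, but that covers most conditions that appear in an if-statement.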

3.1.6. Short Circuiting

Consider the following statement:

a And b
As we evaluate this statement, suppose that we find that a is false. Do we need to
examine the truth value of b? The answer is no: since a is false, regardless of the truth
value of b, the statement is false because it is a logical And. Both operands must be
true for an And to be true. Since the first is false, the second is irrelevant.

Now imagine evaluating this statement in a computer. If the first operand of an And
statement is false, we don’t need to examine/evaluate the second. This has some potential
for improved efficiency: if the second operand does not need to be evaluated, a program
could ignore it and save a few CPU cycles. In general, the speed up for most operations
would be negligible, but in some cases the second operand could be very “expensive” to
compute (it could be a complex function call, require a database query to determine,
etc.) in which case it could make a substantial difference.

Historically, avoiding even a few operations in old computers meant a difference on
the order of milliseconds or even seconds. Thus, it made sense to avoid unnecessary
operations. This is now known as short circuiting and to this day is still supported in
most programming languages.⁴ Though the differences are less stark in terms of CPU
resources, most developers and programmers have come to expect this behavior and write
statements under the assumption that short-circuiting will occur.
⁴ Historically, the short-circuited version of the And operator was known as McCarthy’s sequential
conjunction operation which was formally defined by John McCarthy (1962) as “if p then q, else
false”, eliminating the evaluation of q if p is false [23].

Short circuiting is commonly used to “check” for, and thereby prevent, invalid operations.
For example, consider the following statement:

(d ≠ 0 And 1/d > 1)

The first operand is checking to see if d is not zero and the second checks to see if its
reciprocal is greater than 1. With short-circuiting, if d = 0, then the second operand
will not be evaluated and the division by zero will be prevented. If d ≠ 0 then the
first operand is true and so the second operand will be evaluated as normal. Without
short-circuiting, both operands would be evaluated, leading to a division by zero error.

There are many other common patterns that rely on short-circuiting to avoid invalid or
undefined operations. For example, short-circuiting is used to check that a variable is
valid (defined or not Null) before using it, or to check that an index variable is within
the range of an array’s size before accessing a value.
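
The guard patterns described above can be sketched as follows. The example uses Python for illustration, where And is spelled `and` and is short-circuited, and where None plays the role of Null; all three function names are hypothetical. In each case the first operand protects the operands after it, so the order in which the operands are written matters.

```python
def reciprocal_exceeds_one(d: float) -> bool:
    """Evaluate (d != 0 And 1/d > 1); the division only happens when d != 0."""
    return d != 0 and 1 / d > 1


def element_is_positive(values: list, index: int) -> bool:
    """Check that index is within the list's bounds before reading the element."""
    return index >= 0 and index < len(values) and values[index] > 0


def has_text(name) -> bool:
    """Check that name is defined (not None) before asking for its length."""
    return name is not None and len(name) > 0


print(reciprocal_exceeds_one(0))
print(element_is_positive([3, -1], 5))
print(has_text(None))
```

Reversing the operands in any of these functions would perform the unsafe operation first, which is exactly the situation the guard is meant to avoid.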

Because of short-circuiting, the logical And is effectively not commutative. An operator
is commutative if the order of its operands is irrelevant. For example, addition and
multiplication are both commutative,

x + y = y + x    x · y = y · x

but subtraction and division are not,

x − y ≠ y − x    x/y ≠ y/x

In logic, the And and Or operators are commutative, but when used in most programming
languages they are not,

(a And b) ≠ (b And a) and (a Or b) ≠ (b Or a)

It is important to emphasize that they are still logically equivalent, but they are not
effectively equivalent: because of short-circuiting, each of these statements has a
potentially different effect.

The Or operator is also short-circuited: if the first operand is true, then the truth value
of the expression is already determined to be true and so the second operand will not be
evaluated. In the expression,
a Or b
if a evaluates to true, then b is not evaluated (since if either operand is true, the entire
expression is true).
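
One way to observe short-circuiting is to record which operands actually get evaluated. The following minimal sketch uses Python for illustration, where And and Or are spelled `and` and `or`; the function evaluated_operands and its inner helper operand are hypothetical names introduced only for this demonstration.

```python
def evaluated_operands(operator: str, first: bool, second: bool) -> list:
    """Return the names of the operands that were evaluated, in order.

    operator is either "and" or "or"; first and second are the truth
    values that the two operands produce when they are evaluated.
    """
    calls = []

    def operand(name: str, value: bool) -> bool:
        # Record that this operand was evaluated, then yield its value.
        calls.append(name)
        return value

    if operator == "and":
        _ = operand("first", first) and operand("second", second)
    else:
        _ = operand("first", first) or operand("second", second)
    return calls


print(evaluated_operands("and", False, True))
print(evaluated_operands("or", True, False))
```

The truth value of each expression is the same whichever operand is written first, but the list of evaluated operands is not, which is the sense in which the operators are not effectively commutative.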

3.2. The If Statement

Normally, the flow of control (or control flow) in a program is sequential. Each instruction
is executed, one after the other, top-to-bottom and in individual statements left-to-right
just as one reads in English. Moreover, in most programming languages, each statement
executes completely before the next statement begins. A visualization of this sequential
control flow can be found in the control flow diagram in Figure 3.1(a).

However, it is often necessary for a program to “make decisions.” Some segments of code
may need to be executed only if some condition is satisfied. The if statement is a control
structure that allows us to write a snippet of code predicated on a logical statement.
The code executes if the logical statement is true, and does not execute if the logical
statement is false. This control flow is featured in Figure 3.1(b).

An example using pseudocode can be found in Algorithm 3.1. The use of the keyword
“if” is common to most programming languages. The logical statement associated with
the if-statement immediately follows the “if” keyword and is usually surrounded by
parentheses. The code block immediately following the if-statement is bound to the
if-statement.

if (⟨condition⟩) then
    Code Block
end

Algorithm 3.1: An if-statement

As in the flow chart, if the ⟨condition⟩ evaluates to true, then the code block bound to
the statement executes in its entirety. Otherwise, if the condition evaluates to false, the
code block bound to the statement is skipped in its entirety.

A simple if-statement can be viewed as a “do this if and only if the condition holds.”
Alternatively, “if this condition holds do this, otherwise don’t.” In either case, once the
if-statement finishes execution, the program returns to the normal sequential control
flow.

3.3. The If-Else Statement

An if-statement allows you to specify a code segment that is executed or is not executed.
An if-else statement allows you to specify an alternative: you define a condition such that
if the condition is true, one code block executes and if the condition is false, an entirely
different code block executes.

The control flow of an if-else statement is presented in Figure 3.2. Note that Code
Block A and Code Block B are mutually exclusive. That is, one and only one of them is
executed depending on the truth value of the ⟨condition⟩. A presentation of a generic
if-else statement in our pseudocode can be found in Algorithm 3.2.

[Figure: two flow charts. (a) Sequential Flow Chart: Statement 1, Statement 2, and Statement 3 execute in order. (b) If-Statement Flow Chart: ⟨condition⟩ is tested; if true, the Code Block executes; if false, it is skipped; either way control continues to the Remaining Program.]

Figure 3.1.: Control flow diagrams for sequential control flow and an if-statement. In
sequential control, statements are executed one after the other as they are
written. In an if-statement, the normal flow of control is interrupted and a
Code Block is only executed if the given condition is true, otherwise it is
not. After the if-statement, normal sequential control flow resumes.

[Figure: flow chart. ⟨condition⟩ is tested; if true, Code Block A executes; if false, Code Block B executes; either way control continues to the Remaining Program.]

Figure 3.2.: An if-else Flow Chart

Just as with an if-statement, the keyword “if” is used. In fact, the if-statement is simply
an if-else statement with the else block omitted (equivalently, we could have defined
an empty else block, but since it would have no effect, a simple if-statement with no
else block is preferred). It is common to most programming languages to use the “else”
keyword to denote the else block of code. Since there is only one ⟨condition⟩ to evaluate
and it can only be true or false, it is not necessary to specify the conditions under which
the else block executes. It is assumed that if the ⟨condition⟩ evaluates to false, the else
block executes.

As with an if-statement, the block of code associated with the if-statement as well as the
block of code associated with the else-statement are executed in their entirety or not at
all. Whichever block of code executes, normal flow of control returns and the remaining
program continues executing sequentially.

3.4. The If-Else-If Statement

An if-statement allows you to define a “do this or do not” and an if-else statement allows
you to define a “do this or do that” statement. Yet another generalization is an if-else-if
statement. Using such a statement you can define any number of mutually exclusive
code blocks.

if (⟨condition⟩) then
    Code Block A
else
    Code Block B
end

Algorithm 3.2: An if-else Statement

To illustrate, consider the case in which we have exactly three mutually exclusive
possibilities. At a particular university, there are three possible semesters depending
on the month. January through May is the Spring semester, June/July is the Summer
semester, and August through December is the Fall semester. These possibilities are
mutually exclusive because it cannot be both Spring and Summer at the same time
for example. Suppose we have the current month stored in a variable named month.
Algorithm 3.3 expresses the logic for determining which semester it is using an if-else-if
statement.

if (month ≥ January) And (month ≤ May) then
    semester ← “Spring”
else if (month > May) And (month ≤ July) then
    semester ← “Summer”
else
    semester ← “Fall”
end

Algorithm 3.3: Example If-Else-If Statement

Let’s understand how this code works. First, the “if” and “else” keywords are used
just as in the two previous control structures, but we are now also using the “else if”
keyword combination to specify an additional condition. Each condition, starting with
the condition associated with the if-statement, is checked in order. If and when one of the
conditions is satisfied (evaluates to true), the code block associated with that condition
is executed and all other code blocks are ignored.

The code blocks in an if-else-if control structure are mutually exclusive. One
and only one of the code blocks will ever execute. As with sequential control flow,
the conditions are checked in order, and the code block of the first condition that is
satisfied is the one that is executed. If none of the conditions
is satisfied, then the code block associated with the else-statement is the one that is
executed.
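
The semester logic of Algorithm 3.3 can be sketched in code as follows. The example uses Python for illustration, where “else if” is written `elif` and And is spelled `and`. It assumes months are represented by the integers 1 through 12, which the text does not specify; the constant names and the function name semester are hypothetical.

```python
# Assumed representation: months are the integers 1 (January) to 12 (December).
JANUARY, MAY, JULY = 1, 5, 7


def semester(month: int) -> str:
    """Return the semester for a month using an if-else-if statement."""
    if month >= JANUARY and month <= MAY:
        return "Spring"
    elif month > MAY and month <= JULY:
        return "Summer"
    else:
        # Reached only when neither condition above was satisfied.
        return "Fall"


for m in (1, 6, 12):
    print(m, semester(m))
```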

In our example, we only identified three possibilities. You can generalize an if-else-if
statement to specify as many conditions as you like. This generalization is depicted in
Algorithm 3.4 and visualized in Figure 3.3. Similar to the if-statement, the else-statement
and subsequent code block are optional. If omitted, then it may be possible that none of
the code blocks is executed.

[Figure: flow chart. ⟨condition 1⟩ through ⟨condition n⟩ are tested in sequence, corresponding to if, else if, ..., else if. A true result executes the matching code block (Code Block A, B, C, ..., N); if every condition is false, the else block (Code Block M) executes. Every path then continues to the Remaining Program.]

Figure 3.3.: Control Flow for an If-Else-If Statement. Each condition is evaluated in
sequence. The first condition that evaluates to true results in the corresponding
code block being executed. After executing, the program continues.
Thus, each code block is mutually exclusive: at most one of them is executed.

if (⟨condition 1⟩) then
    Code Block A
else if (⟨condition 2⟩) then
    Code Block B
else if (⟨condition 3⟩) then
    Code Block C
...
else
    Code Block
end

Algorithm 3.4: General If-Else-If Statement

The design of if-else-if statements must be done with care to ensure that your statements
are each mutually exclusive and capture the logic you intend. Since the first condition
that evaluates to true is the one that is executed, the order of the conditions is important.
A poorly designed if-else-if statement can lead to bugs and logical errors.

As an example, consider describing the loudness of a sound by its decibel level in Algorithm
3.5.

if decibel ≤ 70 then
    comfort ← “intrusive”
else if decibel ≤ 50 then
    comfort ← “quiet”
else if decibel ≤ 90 then
    comfort ← “annoying”
else
    comfort ← “dangerous”
end

Algorithm 3.5: If-Else-If Statement With a Bug

Suppose that decibel = 20, which should be described as a “quiet” sound. However, in the
algorithm, the first condition, decibel ≤ 70, evaluates to true and the sound is categorized
as “intrusive”. The bug is that the second condition, decibel ≤ 50, should have come first
in order to capture all decibel levels less than or equal to 50.

Alternatively, we could have followed the example in Algorithm 3.3 and completely
specified both lower bounds and upper bounds in our conditions. For example, the
condition for “intrusive” could have been

(decibel > 50) And (decibel ≤ 70)

However, doing this is unnecessary if we order our conditions appropriately, and we can
potentially write simpler conditions if we remember that the code blocks of an if-else-if
statement are mutually exclusive.
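
A corrected ordering of the decibel conditions might look like the following minimal sketch, written in Python for illustration (the function name describe_loudness is hypothetical). Because the thresholds are tested from smallest to largest, each branch can rely on all earlier conditions having been false, so only an upper bound is needed in each condition.

```python
def describe_loudness(decibel: float) -> str:
    """Categorize a sound level, testing thresholds in increasing order."""
    if decibel <= 50:
        return "quiet"
    elif decibel <= 70:
        # Reaching this branch already implies decibel > 50.
        return "intrusive"
    elif decibel <= 90:
        # Reaching this branch already implies decibel > 70.
        return "annoying"
    else:
        return "dangerous"


for level in (20, 60, 85, 100):
    print(level, describe_loudness(level))
```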

3.5. Ternary If-Else Operator

Another conditional operator is the ternary if-then-else operator. It is often used to
write an expression that can take on one of two values depending on the truth value of a
logical expression. Most programming languages support this operator which has the
following syntax:

E ? X : Y

Here, E is a Boolean expression. If E evaluates to true, the expression takes on the
value X, which does not need to be a Boolean value: it can be anything (an integer,
string, etc.). If E evaluates to false, the expression takes on the value Y.

A simple usage of this expression is to find the minimum of two values:

min = ( (a < b) ? a : b );

If a < b is true, then min will take on the value a. Otherwise it will take on the value b
(in which case a ≥ b and so b is minimal). Most programming languages support this
special syntax as it provides a nice convenience (yet another example of syntactic sugar).
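
Not every language spells the ternary operator with ? and :. As an illustration, Python writes the same idea as `X if E else Y`, placing the condition in the middle. The following minimal sketch computes the minimum of two values this way; the function name minimum is hypothetical.

```python
def minimum(a, b):
    """Return the smaller of a and b using a conditional expression."""
    # Equivalent to (a < b) ? a : b in languages with the ?: syntax.
    return a if a < b else b


print(minimum(3, 7))
print(minimum(7, 3))
```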

3.6. Examples

3.6.1. Meal Discount

Consider the problem of computing a receipt for a meal. Suppose we have the subtotal
cost of all items in the meal. Further, suppose that we want to compute a discount
(senior citizen discount, student discount, or employee discount, etc.). We can then apply
the discount, compute the sales tax, and sum a total, reporting each detail to the user.

To do this, we first prompt the user to enter a subtotal. We can then ask the user if
there is a discount to be computed. If the user answers yes, then we again prompt them
for an amount (to allow different types of discounts). Otherwise, the discount will be
zero. We can then proceed to calculate each of the amounts above. To do this we’ll need
an if-statement. We could also use a conditional statement to check to see if the input
makes sense: we wouldn’t want a discount amount that is greater than 100%. The full
algorithm is presented in Algorithm 3.6.

Prompt the user for a subtotal
subTotal ← read input from user
discountPercent ← 0
Ask the user if they want to apply a discount
hasDiscount ← get user input
if hasDiscount = “yes” then
    Prompt the user for a discount amount
    discountPercent ← read user input
end
if discountPercent > 100 then
    Error! Discount cannot be more than 100%
end
discount ← subTotal × (discountPercent / 100)
discountTotal ← subTotal − discount
tax ← taxRate × discountTotal
grandTotal ← discountTotal + tax
output subTotal, discountTotal, tax, grandTotal to user

Algorithm 3.6: A simple receipt program
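
The calculation part of the receipt program can be sketched as a function, leaving the prompting of the user aside. The example below is in Python for illustration and rests on two assumptions that the text does not spell out: the discount is entered as a percentage between 0 and 100 (so it is divided by 100 before being applied), and the tax rate is supplied as a fraction such as 0.10. The function name compute_receipt and the dictionary keys are hypothetical.

```python
def compute_receipt(sub_total: float, discount_percent: float, tax_rate: float) -> dict:
    """Compute the discounted total, tax, and grand total for a meal.

    discount_percent is assumed to be on a 0 to 100 scale and tax_rate
    is assumed to be a fraction (for example, 0.10 for 10%).
    """
    if discount_percent > 100:
        # Mirrors the error check in the algorithm: reject impossible discounts.
        raise ValueError("Discount cannot be more than 100%")

    discount = sub_total * (discount_percent / 100)
    discount_total = sub_total - discount
    tax = tax_rate * discount_total
    grand_total = discount_total + tax
    return {
        "sub_total": sub_total,
        "discount_total": discount_total,
        "tax": tax,
        "grand_total": grand_total,
    }


print(compute_receipt(100.0, 20.0, 0.10))
```

Raising an error stops the computation as soon as the input is found to be invalid, rather than continuing on to produce a negative total.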

3.6.2. Look Before You Leap

Recall that dividing by zero is an invalid operation in most programming languages (see
Section 2.3.5). Now that we have a means by which numerical values can be checked, we
can prevent such errors entirely.

Suppose that we were going to compute a quotient of two variables x/y. If y = 0, this
would be an invalid operation and lead to undefined, unexpected or erroneous behavior.
However, if we checked whether or not the denominator is zero before we compute the
quotient then we could prevent such errors. We present this idea in Algorithm 3.7.

if y ≠ 0 then
    q ← x/y
end

Algorithm 3.7: Preventing Division By Zero Using an If Statement

This approach to programming is known as defensive programming. We are essentially
checking the conditions for an invalid operation before performing that operation. In
the example above, we simply chose not to perform the operation. Alternatively, we
could use an if-else statement to perform alternate operations or handle the situation
differently. Defensive programming is akin to “looking before leaping”: before taking a
potentially dangerous step, you look to see if you are at the edge of a cliff, and if so you
don’t take that dangerous step.
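As a small illustration of looking before leaping, the sketch below checks the denominator first and uses an if-else statement to handle the invalid case differently. The function name safe_quotient and the choice to return None when no quotient exists are assumptions made for this example, not the only reasonable design.

```python
def safe_quotient(x, y):
    """Return x / y, or None when the quotient is not defined."""
    if y != 0:
        # Safe: the denominator was checked before dividing.
        return x / y
    else:
        # Alternate handling: report "no result" instead of dividing by zero.
        return None
```

A caller would then test the result (for example, with `if q is None`) before using it.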

3.6.3. Comparing Elements

Suppose we have two students, student A and student B and we want to compare them:
we want to determine which one should be placed first in a list and which should be
placed second. For this exercise let’s suppose that we want to order them first by their
last names (so that Anderson comes before Zadora). What if they have the same last
name, like Jane Smith and John Smith? If the last names are equal, then we’ll want to
order them by their first names (Jane before John). If both their first names and last
names are the same, we’ll say either order is okay.

Names will likely be represented using strings, so let’s say that <, = and > apply to
strings, ordering them lexicographically (which is consistent with alphabetic ordering).
We’ll first need to compare their last names. If equal, then we’ll need another conditional
construct. This is achieved by nesting conditional statements as in Algorithm 3.8.

if A’s last name < B’s last name then
    output A comes first
else if A’s last name > B’s last name then
    output B comes first
else
    //last names are equal, so compare their first names
    if A’s first name < B’s first name then
        output A comes first
    else if A’s first name > B’s first name then
        output B comes first
    else
        Either ordering is fine
    end
end

Algorithm 3.8: Comparing Students by Name
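A possible Python rendering of this nested comparison is sketched below. The function name compare_students and the convention of returning -1, 1, or 0 are assumptions for illustration. Python's < and > on strings compare character by character using character codes, so the ordering is case-sensitive; the sketch assumes names are capitalized consistently.

```python
def compare_students(a_last, a_first, b_last, b_first):
    """Return -1 if A comes first, 1 if B comes first, 0 if either order is fine."""
    if a_last < b_last:
        return -1
    elif a_last > b_last:
        return 1
    else:
        # Last names are equal, so compare first names.
        if a_first < b_first:
            return -1
        elif a_first > b_first:
            return 1
        else:
            return 0
```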

3.6.4. Life & Taxes

Another example in which there are several cases that have to be considered is computing
an income tax liability using marginal tax brackets. Table 3.6 contains the 2014 US
Federal tax margins and marginal rates for a married couple filing jointly based on the
Adjusted Gross Income (income after deductions).

AGI is over   But not over   Tax

$0            $18,150        10% of the AGI
$18,150       $73,800        $1,815 plus 15% of the AGI in excess of $18,150
$73,800       $148,850       $10,162.50 plus 25% of the AGI in excess of $73,800
$148,850      $226,850       $28,925 plus 28% of the AGI in excess of $148,850
$226,850      $405,100       $50,765 plus 33% of the AGI in excess of $226,850
$405,100      $457,600       $109,587.50 plus 35% of the AGI in excess of $405,100
$457,600      no limit       $127,962.50 plus 39.6% of the AGI in excess of $457,600

Table 3.6.: 2014 Tax Brackets for Married Couples Filing Jointly

In addition, one of the tax credits (which offset tax liability) that taxpayers can take is the
child tax credit. The rules are as follows:

• If the AGI is $110,000 or more, they cannot claim a credit (the credit is $0)

• Each child is worth a $1,000 credit, however at most $3,000 can be claimed

• The credit is not refundable: if the credit results in a negative tax liability, the tax
liability is simply $0

As an example: suppose that a couple has $35,000 AGI (placing them in the second tax
bracket) and has two children. Their tax liability is

$1,815 + 0.15 × ($35,000 − $18,150) = $4,342.50

However, the two children represent a $2,000 credit, so their total tax liability would be
$2,342.50.

Let’s first design some code that computes the tax liability based on the margins and
rates in Table 3.6. We’ll assume that the AGI is stored in a variable named income.
Using a series of if-else-if statements as presented in Algorithm 3.9, the variable tax will
contain our initial tax liability.

if income ≤ 18,150 then
    tax ← 0.10 · income
else if income > 18,150 And income ≤ 73,800 then
    tax ← 1,815 + 0.15 · (income − 18,150)
else if income > 73,800 And income ≤ 148,850 then
    tax ← 10,162.50 + 0.25 · (income − 73,800)
else if income > 148,850 And income ≤ 226,850 then
    tax ← 28,925 + 0.28 · (income − 148,850)
else if income > 226,850 And income ≤ 405,100 then
    tax ← 50,765 + 0.33 · (income − 226,850)
else if income > 405,100 And income ≤ 457,600 then
    tax ← 109,587.50 + 0.35 · (income − 405,100)
else
    tax ← 127,962.50 + 0.396 · (income − 457,600)
end

Algorithm 3.9: Computing Tax Liability with If-Else-If

We can then compute the amount of a tax credit and adjust the tax accordingly by using
similar if-else-if and if-else statements as in Algorithm 3.10.

if income ≥ 110,000 then
    credit ← 0
else if numberOfChildren ≤ 3 then
    credit ← numberOfChildren ∗ 1,000
else
    credit ← 3,000
end
//Now adjust the tax, taking care that it’s a nonrefundable credit
if credit > tax then
    tax ← 0
else
    tax ← (tax − credit)
end

Algorithm 3.10: Computing Tax Credit with If-Else-If
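The two steps can be combined in a short Python sketch. The function names tax_liability, child_tax_credit, and net_tax are hypothetical. The sketch makes two choices worth stating: each test checks only an upper bound, since reaching a branch of an if-else-if chain means the lower brackets have already been ruled out, and the 33% bracket is taken to begin at $226,850, the value at which the bracket amounts join up (28,925 + 0.28 × 78,000 = 50,765).

```python
def tax_liability(income):
    """2014 federal tax for a married couple filing jointly, before credits."""
    # Only upper bounds are tested: earlier branches rule out lower incomes.
    if income <= 18_150:
        tax = 0.10 * income
    elif income <= 73_800:
        tax = 1_815 + 0.15 * (income - 18_150)
    elif income <= 148_850:
        tax = 10_162.50 + 0.25 * (income - 73_800)
    elif income <= 226_850:
        tax = 28_925 + 0.28 * (income - 148_850)
    elif income <= 405_100:
        tax = 50_765 + 0.33 * (income - 226_850)
    elif income <= 457_600:
        tax = 109_587.50 + 0.35 * (income - 405_100)
    else:
        tax = 127_962.50 + 0.396 * (income - 457_600)
    return tax


def child_tax_credit(income, number_of_children):
    """Credit of $1,000 per child, capped at $3,000, and $0 at $110,000 or more."""
    if income >= 110_000:
        return 0
    elif number_of_children <= 3:
        return number_of_children * 1_000
    else:
        return 3_000


def net_tax(income, number_of_children):
    """Apply the nonrefundable credit: the liability never drops below $0."""
    tax = tax_liability(income)
    credit = child_tax_credit(income, number_of_children)
    if credit > tax:
        return 0
    else:
        return tax - credit
```

Calling net_tax(35_000, 2) corresponds to the worked example of a couple with a $35,000 AGI and two children.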

3.7. Exercises

Exercise 3.1. Write a program that prompts the user for an x and a y coordinate in
the Cartesian plane and prints out a message indicating if the point (x, y) lies on an axis
(x or y axis, or both) or what quadrant it lies in (see Figure 3.4).

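[Figure: the Cartesian plane divided by the x and y axes, with Quadrant I at the upper right, Quadrant II at the upper left, Quadrant III at the lower left, and Quadrant IV at the lower right.]

Figure 3.4.: Quadrants of the Cartesian Plane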

Exercise 3.2. A BOGO (Buy-One, Get-One) sale is a promotion in which a person
buys two items and receives a 50% discount on the less expensive one. Write a program
that prompts the user for the cost of two items, computes a 50% discount on the less
expensive one, and then computes a grand total.

Exercise 3.3. Price Per Mile. Write a program to determine which type of gas is the
better deal with respect to price-per-mile driven. For example, suppose unleaded costs
$2.50 per gallon and your vehicle is able to get an average of 30 miles per gallon. The
true cost of unleaded is thus 8.33 cents per mile. Now suppose that the ethanol fuel costs
only $2.25 per gallon but only yields 25 miles per gallon, thus 9 cents per mile, a worse
deal.

Write a program that prompts the user to enter the price (per gallon) of one type of gas
as well as the miles-per-gallon. Then prompt for the same two values for a second type
of gas and compute the true cost of each type. Print a message indicating which type of
gas is the better deal. For example, if the user enters the values above the output would
look something like:

Gas A: $0.0833 per mile
Gas B: $0.0900 per mile
Gas A is the better deal.

Exercise 3.4. Various substances have different boiling points. A selection of substances
and their boiling points can be found in Table 3.7. Write a program that prompts the
user for the observed boiling point of a substance in degrees Celsius and identifies the
substance if the observed boiling point is within 5% of the expected boiling point. If the
data input is more than 5% higher or lower than any of the boiling points in the table, it
should output Unknown substance.

Substance Boiling Point (C)

Methane -161.7
Butane -0.5
Water 100
Nonane 150.8
Mercury 357
Copper 1187
Silver 2193
Gold 2660

Table 3.7.: Expected Boiling Points

Exercise 3.5. Electrical resistance in various metals can be measured using nano-ohm
metres (nΩ · m). Table 3.8 gives the resistivity of several metals.

Material Resistivity (nΩ · m)

Copper 16.78
Aluminum 26.50
Beryllium 35.6
Potassium 72.0
Iron 96.10

Table 3.8.: Resistivity of several metals

Write a program that prompts the user for an observed resistivity of an unknown material
(as nano-ohm metres) and identifies the substance if the observed resistivity is within
±3% of the known resistivity of any of the materials in Table 3.8. If the input value lies
outside the ±3% range, output Unknown substance.

Exercise 3.6. The visible light spectrum is measured in terms of wavelength, in nanometers (nm).
Ranges of wavelengths roughly correspond to visible colors as depicted in Table 3.9.

Color     Wavelength range (nm)

Violet    380 – 450
Blue      450 – 475
Indigo    476 – 495
Green     495 – 570
Yellow    570 – 590
Orange    590 – 620
Red       620 – 750

Table 3.9.: Visible Light Spectrum Ranges

Write a program that takes an integer corresponding to a wavelength and outputs the
corresponding color. If the value lies outside the ranges it should output Not a visible
wavelength. If a value lies within multiple color ranges it should print all that apply
(for example, a wavelength of 495 is “Indigo-green”).

Exercise 3.7. A certain production of steel is graded according to the following conditions:

(i) Hardness must be greater than 50

(ii) Carbon content must be less than 0.7

(iii) Tensile strength must be greater than 5600

A grade of 5 through 10 is assigned to the steel according to the conditions in Table 3.10.
Write a program that will read in the hardness, carbon content, and tensile strength as
inputs and output the corresponding grade of the steel.

Grade   Conditions
10      All three conditions are met
9       Conditions (i) and (ii) are met
8       Conditions (ii) and (iii) are met
7       Conditions (i) and (iii) are met
6       If only 1 of the three conditions is met
5       If none of the conditions are met

Table 3.10.: Grades of Steel

Exercise 3.8. A triangle can be characterized in terms of the length of its three sides.
In particular, an equilateral triangle is a triangle with all three sides being equal. A
triangle such that two sides have the same length is isosceles and a triangle with all three
sides having a different length is scalene. Examples of each can be found in Figure 3.5.

In addition, the three sides of a triangle are valid only if the sum of any two sides is
strictly greater than the third length.

Write a program to read in three numbers as the three sides of a triangle. If the three
sides do not form a valid triangle, you should indicate so. Otherwise, if valid, your
program should output whether or not the triangle is equilateral, isosceles or scalene.

(a) Equilateral Triangle (b) Isosceles Triangle (c) Scalene Triangle

Figure 3.5.: Three types of triangles

Exercise 3.9. Body Mass Index (BMI) is a health statistic based on a person’s mass
and height. For a healthy adult male BMI is calculated as

\[ \mathrm{BMI} = \frac{m}{h^{2}} \cdot 703.069579 \]

where m is the person’s mass (in lbs) and h is the person’s height (in whole inches).
Write a program that reads in a person’s mass and height as input and outputs a
characterization of the person’s health with respect to the categories in Table 3.11.

Range Category
BMI < 15 Very severely underweight
15 ≤ BMI < 16 Severely underweight
16 ≤ BMI < 18.5 Underweight
18.5 ≤ BMI < 25 Normal
25 ≤ BMI < 30 Overweight
30 ≤ BMI < 35 Obese Class I
35 ≤ BMI < 40 Obese Class II
BMI ≥ 40 Obese Class III

Table 3.11.: BMI Categories

Exercise 3.10. Let R1 and R2 be rectangles in the plane defined as follows. Let (x1, y1)
be the point corresponding to the lower-left corner of R1 and let (x2, y2) be the point of its
upper-right corner. Let (x3, y3) be the point corresponding to the lower-left corner of R2 and
let (x4, y4) be the point of its upper-right corner.

Write a program to determine the intersection of these two rectangles. In general, the
intersection of two rectangles is another rectangle. However, if the two rectangles abut
each other, the intersection could be a horizontal or vertical line segment (or even a
point). It is also possible that the intersection is empty. Your program will need to
distinguish between these cases.

[Figure: two overlapping rectangles in the plane, one with corners (2, 1) and (6, 7.5) and the other with corners (4, 5.5) and (8.5, 8.25).]

Figure 3.6.: Intersection of Two Rectangles

If the intersection of R1, R2 is a rectangle, R3, your program should output two points
(the lower-left and upper-right corners of R3) as well as the area of R3. If the intersection
is a line segment, your program should output the two end-points and whether it is a
vertical or horizontal line segment. Finally, if the intersection is empty your program
should output “empty intersection”. Your program should also be robust enough to
check that the input is valid (it should not accept empty or “reversed” rectangles).

Your program should read in x1, y1, x2, y2, x3, y3, x4, y4 from the user and perform the
computation above. As an example, the values 2, 1, 6, 7.5, 4, 5.5, 8.5, 8.25 would correspond
to the two rectangles in Figure 3.6.

The output for this instance should look something like the following.

Intersecting rectangle: (4, 5.5), (6, 7.5)
Area: 4.00

Exercise 3.11. Write an app to help people track their cell phone usage. Cell phone
plans for this particular company give you a certain number of minutes every 30 days
which must be used or they are lost (no rollover). We want to track the average number
of minutes used per day and inform the user if they are using too many minutes or can
afford to use more.

Write a program that prompts the user to enter the following pieces of data:

• Number of minutes in the plan per 30 day period, m

• The current day in the 30 day period, d

• The total number of minutes used so far, u

The program should then compute whether the user is over, under, or right on the average
daily usage under the plan. It should also inform them of how many minutes are left
and how many, on average, they can use per day for the rest of the month. Of course, if
they’ve run out of minutes, it should inform them of that too.

For example, if the user enters m = 250, d = 10, and u = 150, your program should print
out something similar to the following.

10 days used, 20 days remaining
Average daily use: 15 min/day
You are EXCEEDING your average daily use (8.33 min/day),
continuing this high usage, you'll exceed your minute plan by
200 minutes.
To stay below your minute plan, use no more than 5 min/day.

Of course, if the user is under their average daily use, a different message should be
presented. You are allowed/encouraged to compute any other stats for the user that you
feel would be useful.

Exercise 3.12. Write a program to help a floor tile company determine how many tiles
they need to send to a work site to tile a floor in a room. For simplicity, assume that all
rooms are perfectly rectangular with no obstructions; we will also omit any additional
measurements related to grouting.

Further, we will assume that all tile is laid in a grid pattern centered at the center of the
room. That is, four tiles will meet at their corners at the center of the room with tiles
laid out to the edge of the room. Thus, it may be the case that the final row and/or
column at the edge may need to be cut. Also note that if the cut is short enough, the
remaining tile can be used on the other end of the room (same goes for the corners).

The program will take the following input:

• w - the width of the room

• l - the length of the room

• t - width/length of the tile (all tiles are perfectly square)

[Figure: two floor plans tiled outward from the center of the room. (a) Example 1: a room 9.8 wide and 10.0 long, with 0.9 left over on either side. (b) Example 2: a room 8.8 wide and 10.0 long, with 0.4 left over on either side.]

Figure 3.7.: Examples of Floor Tiling

If we can use whole tiles to perfectly fit the room, then we do so. For example, on
the input (10, 10, 1), we could perfectly tile a 10 × 10 room with 100 1 × 1 tiles. If the
tiles don’t perfectly fit, then we have to consider the possibility of waste and/or reuse.
Consider the examples in Figure 3.7.

The first example is from the input (9.8, 10, 1). In this case, we lay the tiles from the
center of the room (8 full tile lengths) but are left with 0.9 on either side. If we cut a
tile to fit the left side, we are left with only 0.1 of a tile, which is too short for the right side.
Therefore, we are forced to waste the 0.1 length and cut a full tile for the right side. In
all, 100 tiles are required.

The second example is from the input (8.8, 10, 1). In this case, we again lay tiles from
the center of the room (8 full tile lengths) and are left with 0.4 lengths on either side.
Here, we can reuse the cut tile: cut a tile on one side 0.4 with 0.6 remaining, and cut 0.4
on the other side of the tile (with the center 0.2 length of the tile being waste). Thus,
both sides can be tiled with a single tile, meaning only 90 full tiles are needed to tile this
room.

You may further assume that tiles used on the length-side end of the room cannot be
used to tile the width-side of the room (and vice versa). Your program will compute and
output the number of tiles required.

4. Loops

Computers are really good at automation. A key aspect of automation is the ability to
repeat a process over and over on different pieces of data until some condition is met.
For example, if we have a collection of numbers and we want to find their sum we would
iterate over each number, adding it to a total, until we have examined every number.
Another example may include sending an email message to each student in a course. To
automate the process, we could iterate over each student record and for each student we
would generate and send the email.

Automated repetition is where loops come in handy. Computers are perfectly suited for
performing such repetitive tasks. We can write a single block of code that performs some
action or processes a single piece of data, then we can write a loop around that block of
code to execute it a number of times.

Loops provide a much better alternative than repeating (cut-paste-cut-paste) the same
code over and over with different variables. Indeed, we wouldn’t even do this in real
life. Suppose that you took a 100 mile trip. How would you describe it? Likely, you
wouldn’t say, “I drove a mile, then I drove a mile, then I drove a mile, . . .” repeated 100
times. Instead, you would simply state “I drove 100 miles” or maybe even, “I drove until
I reached my destination.”

Loops allow us to write concise, repeatable code that can be applied to each element in
a collection or perform a task over and over again until some condition is met. When
writing a loop, there are three essential components:

• An initialization statement that specifies how the loop begins

• A continuation (or termination) condition that specifies whether the loop should
continue to execute or terminate

• An iteration statement that makes progress toward the termination condition

The initialization statement is executed before the loop begins and serves as a way to set
the loop up. Typically, the initialization statement involves setting the initial value of
some variable.

The continuation condition is a logical statement (that evaluates to true or false) that
specifies if the loop should continue (if the value is true) or should terminate (if the value
is false). Upon termination, code returns to a sequential control flow and the program
continues.

[Figure: flow chart. Initialization (i ← 1) leads to the continuation check (i ≤ 10?); if true, the loop body executes, followed by the iteration (i ← (i + 1)), and control repeats from the continuation check; if false, control passes to the remaining program.]

Figure 4.1.: A Typical Loop Flow Chart

The iteration statement is intended to update the state of a program to make progress
toward the termination condition. If we didn’t make such progress, the loop would
continue on forever as the termination condition would never be satisfied. This is known
as an infinite loop, and results in a program that never terminates.

As a simple example, consider the following outline.

• Initialize the value of a variable i to 1

• While the value of i is less than or equal to 10 . . . (continuation condition)

• Perform some action (this is sometimes referred to as the loop body)

• Iterate the variable i by adding one to its value

The code outline above specifies that some action is to be performed once for each value:
i = 1, i = 2, . . . , i = 10, after which the loop terminates. Overall, the loop executes a
total of 10 times. Prior to each of the 10 executions, the value of i is checked; as it is less
than or equal to 10, the action is performed. At the end of each of the 10 iterations, the
variable i is incremented by 1 and the termination condition is checked again, repeating
the process. There are several different types of loops that vary in syntax and style but
they all have the same three basic components.

4.1. While Loops

A while loop is a type of loop that places the three components in their logical order. The
initialization statement is written before the loop code. Typically the keyword while is
used to specify the continuation/termination condition. Finally, the iteration statement
is usually performed at the end of the loop inside the code block associated with the loop.
A small, counter-controlled while loop is presented in Algorithm 4.1 which illustrates the
previous example of iterating a variable i from 1 to 10.

i ← 1 //Initialization statement
while (i ≤ 10) do
    Perform some action
    i ← (i + 1) //Iteration statement
end

Algorithm 4.1: Counter-Controlled While Loop

Prior to the while statement, the variable i is initialized to 1. This action is performed
only once, before the loop code. Then, before the loop code is
executed, the continuation condition is checked. Since i = 1 ≤ 10, the condition evaluates
to true and the loop code block is executed. The last line of the code block is the iteration
statement, where i is incremented by 1 and now has a value of 2. The code returns to
the top of the loop and again evaluates the continuation condition (which is still true as
i = 2 ≤ 10).

On the 10th iteration of the loop when i = 10, the loop will execute for the last time. At
the end of the loop, i is incremented to 11. The loop still returns to the top and the
continuation condition is still checked one last time. However, since i = 11 ≰ 10, the
condition is now false and the loop terminates. Regular sequential control flow returns
and the program continues executing whatever code is specified after the loop.
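The walk-through above can be traced with a short Python sketch. The function name run_counter_loop and the executions counter (which stands in for "perform some action") are assumptions added for illustration; the function returns how many times the body ran together with the final value of i.

```python
def run_counter_loop(limit=10):
    """Run a counter-controlled while loop; return (executions, final i)."""
    executions = 0
    i = 1                      # initialization statement, performed once
    while i <= limit:          # continuation condition, checked before each pass
        executions += 1        # stands in for "perform some action"
        i = i + 1              # iteration statement
    # The loop ends the first time the condition is false.
    return executions, i
```

With the default limit of 10 the body runs ten times and i ends at 11, the first value for which the continuation condition fails.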

4.1.1. Example

In the previous example we knew that we wanted the loop to execute ten times. Though
you can use a while loop in counter-controlled situations, while loops are typically used
in scenarios when you may not know how many iterations you want the loop to execute
for. Instead of a straightforward iteration, the loop itself may update a variable in a
less-than-predictable manner.

As an example, consider the problem of normalizing a number as is typically done in
scientific notation. Given a number x (for simplicity, we’ll consider x ≥ 1), we divide
it by 10 until its value is in the interval [1, 10), keeping track of how many times we’ve
divided by 10. For example, if we have the number x = 32,145.234, we would divide by
10 four times, resulting in 3.2145234 so that we could express it as

\[ 3.2145234 \times 10^{4} \]

A simple realization of this process is presented in Algorithm 4.2. Rather than some
fixed number of iterations, the number of times the loop executes depends on how large
x is. For the example mentioned, it executes 4 times; for an input of x = 10,000,000 it
would execute 7 times. A while loop allows us to specify the repetition process without
having to know up front how many times it will execute.

Input : A number x, x ≥ 1
Output : x normalized, k its exponent
k ← 0
while x ≥ 10 do
    x ← (x/10)
    k ← (k + 1)
end
output x, k

Algorithm 4.2: Normalizing a Number With a While Loop
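A minimal Python sketch of this normalization follows, assuming x ≥ 1 as in the discussion above; the function name normalize is hypothetical. The loop continues while x is at least 10, so that the result ends up in the interval [1, 10).

```python
def normalize(x):
    """Return (x, k) with the result x in [1, 10); assumes the input x >= 1."""
    k = 0
    while x >= 10:       # stop as soon as x drops below 10
        x = x / 10
        k = k + 1        # count how many times we divided by 10
    return x, k
```

Because the arithmetic is floating point, the normalized value may differ from the exact decimal result in its last few digits.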

4.2. For Loops

A for loop is similar to a while loop but allows you to specify the three components on
the same line. In many cases, this results in a loop that is more readable; if the code
block in a while loop is long it may be difficult to see the initialization, continuation,
and iteration statements clearly. For loops are typically used to iterate over elements
stored in a collection such as an array (see Chapter 7). Usually the keyword for is used
to identify all three components. A general example is given in Algorithm 4.3.

for ( ⟨initialization⟩; ⟨continuation⟩; ⟨iteration⟩ ) do
    Perform some action
end

Algorithm 4.3: A General For Loop

Note the additional syntax: in many programming languages, semicolons are used at
the end of executable statements. Semicolons are also used to delimit each of the three
loop components in a for-loop (otherwise there may be some ambiguity as to where
each of the components begins and ends). However, the semicolons are typically only
placed after the initialization statement and continuation condition and are omitted after
the iteration statement. A more concrete example is given in Algorithm 4.4 which is
equivalent to the counter-controlled while loop we examined earlier.

Though all three components are written on the same line, the initialization statement
is only ever executed once, at the beginning of the loop. The continuation condition is
checked prior to each and every execution of the loop. Only if it evaluates to true does
the loop body execute. The iteration statement is performed at the end of each loop
iteration.

for ( i ← 1; i ≤ 10; i ← (i + 1) ) do
    Perform some action
end

Algorithm 4.4: Counter-Controlled For Loop

4.2.1. Example

As a more concrete example, consider Algorithm 4.5 in which we do the same iteration
(i will take on the values 1, 2, 3, . . . , 10), but in each iteration we add the value of i for
that iteration to a running total, sum.

sum ← 0
for ( i ← 1; i ≤ 10; i ← (i + 1) ) do
    sum ← (sum + i)
end

Algorithm 4.5: Summation of Numbers in a For Loop

Again, the initialization of i = 1 is only performed once. On the first iteration of the loop,
i = 1 and so sum will be given the value sum + i = 0 + 1 = 1. At the end of the loop, i will
be incremented and will have a value of 2. The continuation condition is still satisfied, so
once again the loop body executes and sum will be given the value sum + i = 1 + 2 = 3.
On the 10th (last) iteration, sum will have a value 1 + 2 + 3 + · · · + 9 = 45 and i = 10.
Thus sum + i = 45 + 10 = 55 after which i will be incremented to 11. The continuation
condition is still checked, but since 11 ≰ 10, the loop body will not be executed and the
loop will terminate, resulting in a final sum value of 55.
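The same summation can be sketched in Python. Python's for statement does not list the three components separately; instead, range(1, n + 1) supplies the starting value, the stopping point, and the step of 1. The function name sum_to and the variable name total are assumptions for this example (total is used because sum is the name of a built-in Python function).

```python
def sum_to(n):
    """Add up the integers 1, 2, ..., n with a counter-controlled for loop."""
    total = 0
    # range(1, n + 1) yields 1, 2, ..., n; the upper bound itself is excluded.
    for i in range(1, n + 1):
        total = total + i
    return total
```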

4.3. Do-While Loops

Yet another type of loop is the do-while loop. One major difference between this type
of loop and the others is that it is always executed at least once. The way that this is
achieved is that the continuation condition is checked at the end of the loop rather than
prior to its execution. The same counter-controlled example can be found in Algorithm
4.6.

In contrast to the previous examples, the loop body is executed on the first iteration
without checking the continuation condition. Only after the loop body, including the
incrementing of the iteration variable i is the continuation condition checked. If true, the
loop repeats at the beginning of the loop body.

i ← 1
do
    Perform some action
    i ← (i + 1)
while i ≤ 10

Algorithm 4.6: Counter-Controlled Do-While Loop

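[Figure: flow chart. Initialization (i ← 1) leads directly to the loop body, followed by the iteration (i ← (i + 1)) and then the continuation check (i ≤ 10?); if true, control returns to the loop body; if false, control passes to the remaining program.]

Figure 4.2.: A Do-While Loop Flow Chart. The continuation condition is checked after
the loop body.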

Do-while loops are typically used in scenarios in which the first iteration is used to “set up”
the continuation condition (thus, it needs to be executed at least once). A common
example is if the loop body performs an operation that may result in an error code (or
flag) that is either true (an error occurred) or false (no error occurred).

From this perspective, a do-while loop can also be seen as a do-until loop: perform a
task until some condition is no longer satisfied. The subtle wording difference implies
that we’ll perform the action before checking to see if it should be performed again.
A flag-controlled example is given in Algorithm 4.7.

do
    Read some data
    isError ← result of reading
while isError

Algorithm 4.7: Flag-Controlled Do-While Loop
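Python has no do-while statement, so a flag-controlled do-while loop is usually emulated with a loop that exits at the bottom of its body. The sketch below is one such emulation under stated assumptions: the function name read_until_valid is hypothetical, "reading data" is simulated by taking strings from a list rather than from a user, and an error is taken to mean that the text is not a whole number.

```python
def read_until_valid(attempts):
    """Read from `attempts` until one parses as an integer.

    `attempts` stands in for successive user inputs. Returns the value and
    the number of reads. Raises StopIteration if no attempt is valid.
    """
    source = iter(attempts)
    tries = 0
    while True:                  # the body always runs at least once
        text = next(source)      # "read some data"
        tries += 1
        try:
            value = int(text)
            is_error = False
        except ValueError:
            is_error = True
        if not is_error:         # the condition is checked after the body
            break
    return value, tries
```

The first pass through the body is what sets up is_error, which is why the check has to come at the end.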

4.4. Foreach Loops

Many languages support a special type of loop for iterating over individual elements
in a collection (such as a set, list, or an array). In general, such loops are referred to
as foreach loops. These types of loops are essentially syntactic sugar: iterating over a
collection could be achieved with a for loop or a while loop, but foreach loops provide a
more convenient way to iterate over a collection. We will revisit these loops when we
examine arrays in Chapter 7. For now, we look at a simple example in Algorithm 4.8.

foreach element a in the collection A do
    process the element a
end

Algorithm 4.8: Example Foreach Loop

How the elements are stored in the collection and how they are iterated over is not our
(primary) concern. We simply want to apply the same block of code to each element;
the foreach loop handles the details on how each element is iterated over. The syntax
also provides a way to refer to each element (the a variable in the algorithm). On each
iteration of the loop, the foreach loop updates the reference a to the next element in
the collection. The loop terminates after it has iterated through each and every one of the
elements. In this way, a foreach loop simplifies the syntax: we don’t have to specify any
of the three components ourselves. As a more concrete example, consider iterating over
each student in a course roster. For each student, we wish to compute their grade and
then email them the results. The foreach loop allows us to do this without worrying
about the iteration details (see Algorithm 4.9).

foreach (student s in the class C) do
    g ← compute s’s grade
    send s an email informing them of their grade g
end

Algorithm 4.9: Foreach Loop Computing Grades
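In Python, the for statement is itself a foreach loop. The sketch below illustrates the roster example under several assumptions: the function name grade_report is hypothetical, the roster is represented as a list of (name, scores) pairs, a grade is taken to be the average of the scores, and "sending an email" is replaced by building the message text.

```python
def grade_report(roster):
    """Build one message per student; roster is a list of (name, scores) pairs."""
    messages = []
    # The loop hands us each student in turn; no index or counter is needed.
    for name, scores in roster:
        grade = sum(scores) / len(scores)   # assumed grading rule: the average
        messages.append(f"{name}: your grade is {grade:.1f}")
    return messages
```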

4.5. Other Issues

4.5.1. Nested Loops

Just as with conditional statements, we can nest loops within loops to perform more
complex processes. Though you can do this with any type of loop, we present a simple
example using for loops in Algorithm 4.10.

n ← 10
m ← 20
for (i ← 1; i ≤ m; i ← (i + 1)) do
    for (j ← 1; j ≤ n; j ← (j + 1)) do
        output (i, j)
    end
end

Algorithm 4.10: Nested For Loops

The outer for loop executes a total of 20 times while the inner for loop executes 10
times. Since the inner for loop is nested inside the outer loop, the entire inner loop
executes all 10 iterations for each of the 20 iterations of the outer loop. Thus, in
total the innermost output operation executes 10 × 20 = 200 times. Specifically, it
outputs (1, 1), (1, 2), . . . , (1, 10), (2, 1), (2, 2), . . . , (2, 10), (3, 1), . . . , (20, 10). Nested loops
are commonly used when iterating over elements in two-dimensional arrays such as
tabular data or matrices. Nested loops can also be used to process all pairs in a collection
of elements.
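The count of 200 can be checked with a short Python sketch of the same nested loops. The function name all_pairs is hypothetical, and the pairs are collected in a list rather than printed so that they can be inspected.

```python
def all_pairs(m, n):
    """Collect every pair (i, j) with 1 <= i <= m and 1 <= j <= n."""
    pairs = []
    for i in range(1, m + 1):          # outer loop: m iterations
        for j in range(1, n + 1):      # inner loop: n iterations per outer pass
            pairs.append((i, j))
    return pairs
```

Calling all_pairs(20, 10) produces the pairs in the order listed above, with the inner index j varying fastest.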

4.5.2. Infinite Loops

Sometimes a simple mistake in the design of a loop can make it execute forever, for
example, if we accidentally iterate a variable in the wrong direction or write the opposite
termination/continuation condition. Such a loop is referred to as an infinite loop. As an
example, suppose we forgot the increment operation from a previous example.

sum ← 0
i ← 1
while i ≤ 10 do
    sum ← (sum + i)
end

Algorithm 4.11: Infinite Loop

In Algorithm 4.11 we never make progress toward the terminating condition! Thus, the
loop will execute forever: i will continue to have the value 1 and, since 1 ≤ 10, the loop
body will continue to execute. Care is needed in the design of your loops to ensure that
they make progress toward the termination condition.

Most of the time an infinite loop is not something you want and usually you must terminate
your buggy program externally (sometimes referred to as “killing” it). However, infinite
loops do have their uses. A poll loop is a loop that is intended to not terminate. At a
system level, for example, a computer may poll devices (such as input/output devices)
one-by-one to see if there is any active input/output request. Instead of terminating,
the poll loop simply repeats itself, returning back to the first device. As long as the
computer is in operation, we don’t want this process to stop. This can be viewed as an
infinite loop as it doesn’t have any termination condition.

The Zune Bug

Though proper testing and debugging should reduce the likelihood of such bugs, there
are several notable instances in which an infinite loop impacted real software. One
such instance was the Microsoft Zune bug. The Zune was a portable music player, a
competitor to the iPod. At about midnight on the night of December 31st, 2008, Zunes
everywhere failed to start up properly. A firmware clock driver designed by a 3rd party
company contained the following code.

1 while(days > 365) {
2   if(IsLeapYear(year)) {
3     if(days > 366) {
4       days -= 366;
5       year += 1;
6     }
7   } else {
8     days -= 365;
9     year += 1;
10   }
11 }

Code Sample 4.1.: Zune Bug

2008 was a leap year, so the check on line 2 evaluated to true. However, though December
31st, 2008 was the 366th day of the year ( days = 366 ), the condition on line 3 evaluated
to false and the loop was repeated without any of the program state being updated. The
problem was “fixed” 24 hours later when it was the 367th day and line 3 worked. The
underlying problem was that the code had no case for days = 366 in a leap year: when the
check on line 3 fails, the loop needs to stop rather than repeat. (Simply changing line 3 to
days >= 366 would end the loop, but it would leave days at 0 rather than 366.)

The failure was that this code was never tested on the “corner cases” that it was designed
for. No one thought to test the driver to see if it worked on the last day of a leap year.
The code worked the vast majority of the time, but this illustrates the need for rigorous
testing.
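To make the corner case concrete, here is a Python sketch of the same loop with one possible
repair: an else branch that leaves the loop when days is exactly 366 in a leap year. This is
an illustration only, not the fix that was actually shipped; the function names and the
explicit start-year parameter are assumptions of the sketch.

```python
def is_leap_year(year):
    """Gregorian leap year rule."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def year_and_day(days, year):
    """Convert a 1-based day count starting on January 1 of `year`
    into a (year, day-of-year) pair."""
    while days > 365:
        if is_leap_year(year):
            if days > 366:
                days -= 366
                year += 1
            else:
                break   # days == 366: the last day of a leap year
        else:
            days -= 365
            year += 1
    return year, days


# The corner cases the original code was never tested on:
print(year_and_day(366, 2008))   # (2008, 366)
print(year_and_day(367, 2008))   # (2009, 1)
```

Tests that target the last day of a leap year and the day after it exercise exactly the
branch that failed.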

4.5.3. Common Errors

When writing loops it’s important to use the proper syntax in the language in which you
are coding. Many languages use semicolons to terminate executable statements. However,
the while statements are not executable: they are part of the control structure of the
language and do not have semicolons at the end. A misplaced semicolon could be a
syntax error, or it could be syntactically correct but lead to incorrect results. A common
error is to place a semicolon at the end of a while statement as in

while(count <= 10); //WRONG!!!

In this example, the while loop binds to an empty executable statement and results in
an infinite loop!

Other common errors are the result of misidentifying either the initialization statement
or the continuation condition: for example, starting a counter at 1 instead of zero, or
using a ≤ comparison instead of a <. These can lead to a loop being off-by-one, resulting
in a logic error.

Other errors are the result of using improper variable types. Recall that operations
involving floating-point numbers can have round off and precision errors: 1/3 + 1/3 + 1/3
may not be equal to one, for example. It is best to avoid using floating-point numbers or
comparisons in the control of your loops. Boolean and integer types are much less error
prone.
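The following Python sketch shows the kind of surprise involved. It uses 0.1 rather than 1/3
because the effect is easy to reproduce with IEEE double-precision arithmetic (which Python
floats use on typical platforms); the function name tenth_steps is hypothetical.

```python
import math


def tenth_steps():
    """Add 0.1 ten times, using an integer counter to control the loop."""
    total = 0.0
    for _ in range(10):
        total += 0.1
    return total


total = tenth_steps()
print(total == 1.0)               # False: round-off error has accumulated
print(math.isclose(total, 1.0))   # True: equal within a small tolerance
```

A loop written as "while total != 1.0" would never terminate here, which is why the loop
above is controlled by an integer counter instead.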

Finally, you must always ensure that your loops are making progress toward the termina-
tion condition. A failure to properly increment a counter can lead to incorrect results or
even an infinite loop.

4.5.4. Equivalency of Loops

It might not seem obvious at first, but in fact, any type of loop can be re-written as
another type of loop and perform equivalent operations. That is, any while loop can be
rewritten as an equivalent for loop. Any do-while loop can be rewritten as an equivalent
while loop!
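As a small illustration of this equivalence, the two hypothetical Python functions below
perform the same countdown, one with a for loop and one with a while loop. The names and
the use of a list to hold the results are assumptions of the sketch.

```python
def countdown_for(n):
    """Count down from n to 1 using a for loop."""
    values = []
    for i in range(n, 0, -1):
        values.append(i)
    return values


def countdown_while(n):
    """The same loop rewritten as a while loop."""
    values = []
    i = n              # initialization
    while i > 0:       # continuation condition
        values.append(i)
        i -= 1         # iteration
    return values


print(countdown_for(5) == countdown_while(5))   # True
```

The while version simply spells out the initialization, continuation condition, and
iteration that the for loop bundles together.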

So why do we have different types of loops? The short answer is that we want our
programming languages to be flexible. We could design a language in which every loop
had to be a while loop for example, but there are some situations in which it would be
more “natural” to write code with a for loop. By providing several options, programmers
have the choice of which type of loop to write.

In general, there are no “rules” as to which loop to apply to which situation. There
are general trends, best practices, and situations where it is more common to use one
loop rather than another, but in the end it does come down to personal choice and style.
Some software projects or organizations may have established guidelines or a style guide
that dictates such choices in the interest of consistency and uniformity.

4.6. Problem Solving With Loops

Loops can be applied to any problem that requires repetition of some sort or to simplify
repeated code. When designing loops, it is important to identify the components of the
loop by asking the following questions:

• Where does the loop start? What variables or other state may need to be initialized
or set up prior to the beginning of the loop?

• What code needs to be repeated? How can it be generalized to depend on loop
control variables? This helps you to identify and write the loop body.

• When should the loop end? How many times do we want it to execute? This helps
you to identify the continuation and/or termination condition.

• How do we make progress toward the termination condition? What variable(s)
need to be incremented and how?

4.7. Examples

4.7.1. For vs While Loop

Let’s consider how to write a loop to compute the classic geometric series,

\[ \frac{1}{1 - x} = \sum_{k=0}^{\infty} x^k = 1 + x + x^2 + x^3 + \cdots \]

Obviously a computer cannot compute an infinite series as it is required to terminate in
a finite number of steps. Thus, we can approach this problem in a number of different
ways.

One way we could approximate the series is to compute it out to a fixed number of terms.
To do so, we could initialize a sum variable to zero, then iteratively compute and add
terms to the sum until we have computed n terms. To keep track of the terms, we can
define a counter variable, k as in the summation.

Following our strategy, we can identify the initialization: k should start at 0. The iteration
is also easy: k should be incremented by 1 each time. The continuation condition should
continue the loop until we have computed n terms. However, since k starts at 0, we
would want to continue while k < n. We would not want to continue the iteration when
k = n as that would make n + 1 iterations (again since k starts at 0). Further, since
we know the number of iterations we want to execute, a for loop is arguably the most
appropriate loop for this problem. Our solution is presented in Algorithm 4.12.

Input : x, n ≥ 0
Output : An approximated value of 1/(1 − x) using a geometric series
1 sum ← 0
2 for (k ← 0; k < n; k ← (k + 1)) do
3   sum ← (sum + x^k)
4 end
5 output sum

Algorithm 4.12: Computing the Geometric Series Using a For Loop
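A direct Python rendering of Algorithm 4.12 might look like the following sketch. The
function name is hypothetical, and the example call assumes |x| < 1, the range in which
the geometric series converges to 1/(1 − x).

```python
def geometric_series_for(x, n):
    """Approximate 1 / (1 - x) by summing the first n terms x^0, ..., x^(n-1)."""
    total = 0.0
    for k in range(n):      # k = 0, 1, ..., n - 1: exactly n iterations
        total += x ** k
    return total


print(geometric_series_for(0.5, 10))   # 1.998046875
print(1 / (1 - 0.5))                   # 2.0, the value being approximated
```

Because k starts at 0, range(n) plays the role of the condition k < n and produces exactly
n terms.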

As an alternative, consider the following approach: instead of computing a predefined
number of terms, what if we computed terms until the difference between the value in the
previous iteration and the value in the current iteration is negligible, say less than some
small amount ε? We could stop our computation because any further iterations would
only affect the summation less and less. That is, the current value represents a “good
enough” approximation. That way, if someone wanted an even better approximation,
they could specify a smaller ε.

This approach will be more straightforward with a while loop since the continuation
condition will be more along the lines of “while the estimation is not yet good enough,
continue the summation.” It will also be easier to compute and check the difference if we
keep track of both a current and a previous value of the summation.

Input : x, ε > 0
1 sum_prev ← 0
2 sum_curr ← 1
3 k ← 1
4 while |sum_prev − sum_curr| ≥ ε do
5   sum_prev ← sum_curr
6   sum_curr ← (sum_curr + x^k)
7   k ← (k + 1)
8 end
9 output sum_curr

Algorithm 4.13: Computing the Geometric Series Using a While Loop

On lines 1–2 we initialize our values to ensure that the while loop will execute at least
once. In the continuation condition, we use the absolute value of the difference as the
series can oscillate between negative and positive values.
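The while-loop approach of Algorithm 4.13 could be sketched in Python as below. The
function name is hypothetical, and the sketch assumes |x| < 1 and epsilon ≤ 1 so that the
loop is entered and the terms shrink; for |x| ≥ 1 the difference never becomes small and
the loop would not terminate.

```python
def geometric_series_while(x, epsilon):
    """Approximate 1 / (1 - x), stopping once successive sums differ by less than epsilon.

    Assumes |x| < 1 and epsilon <= 1.
    """
    sum_prev = 0.0
    sum_curr = 1.0   # the k = 0 term, so the first difference is 1
    k = 1
    while abs(sum_prev - sum_curr) >= epsilon:
        sum_prev = sum_curr
        sum_curr = sum_curr + x ** k
        k += 1
    return sum_curr


print(geometric_series_while(0.5, 1e-9))    # close to 2
print(geometric_series_while(-0.5, 1e-9))   # close to 2/3
```

A smaller epsilon yields more iterations and a better approximation, mirroring the role of
ε in the pseudocode.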

4.7.2. Primality Testing

An integer n > 1 is called prime if the only integers that divide it evenly are 1 and itself.
Otherwise it is called composite. For example, 30 is composite as it is divisible by 2, 3,
and 5 among others. However, 31 is prime as it is only divisible by 1 and 31.

Consider the problem of determining whether a given integer n is prime or composite,
referred to as primality testing. A straightforward way of determining this is to simply
try dividing by every integer from 2 up to √n: if any of these integers divides n, then n is
composite. Otherwise, if none of them do, n is prime. Observe that we only need to go
up to √n since any divisor greater than √n corresponds to some divisor less than √n.

A simple for loop can be constructed to capture this idea. Our initialization clearly starts
at i = 2, incrementing by 1 each time until i has exceeded √n. This solution is presented
in Algorithm 4.14. Of course this is certainly not the most efficient way to solve this
problem, but we will not go into more advanced algorithms here.

Input : n > 1
1 for (i ← 2; i ≤ √n; i ← (i + 1)) do
2   if i divides n then
3     output composite
4   end
5 end
6 output prime

Algorithm 4.14: Determining if a Number is Prime or Composite

Now consider this more general problem: given an integer m > 1, determine how many
prime numbers ≤ m there are. A key observation is that we’ve already solved part of
the problem: determining if a given number is prime in the previous example. To solve
this more general problem, we could reuse or adapt our previous solution. In particular,
we could surround the previous solution in an outer loop and iterate over integers from
2 up to m. The inner loop would then determine if the integer is prime and instead of
outputting a result, could increment a counter of the number of primes it has found so
far. This solution is presented in Algorithm 4.15.

Input : m > 1
1 numberOfPrimes ← 0
2 for (j ← 2; j ≤ m; j ← (j + 1)) do
3   isPrime ← true
4   for (i ← 2; i ≤ √j; i ← (i + 1)) do
5     if (i divides j) then
6       isPrime ← false
7     end
8   end
9   if (isPrime) then
10     numberOfPrimes ← (numberOfPrimes + 1)
11   end
12 end
13 output numberOfPrimes

Algorithm 4.15: Counting the number of primes.
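The two algorithms above could be sketched in Python as follows. The function names are
hypothetical, and the sketch uses math.isqrt (available in Python 3.8 and later) to get an
integer square root so that no floating-point value controls the loop. Note that the return
statement leaves is_prime as soon as a divisor is found.

```python
import math


def is_prime(n):
    """Trial division: return True if n > 1 has no divisor in 2..floor(sqrt(n))."""
    for i in range(2, math.isqrt(n) + 1):
        if n % i == 0:
            return False   # found a divisor, so n is composite
    return True


def count_primes(m):
    """Count the primes less than or equal to m, for m > 1."""
    number_of_primes = 0
    for j in range(2, m + 1):
        if is_prime(j):
            number_of_primes += 1
    return number_of_primes


print(is_prime(30), is_prime(31))   # False True
print(count_primes(100))            # 25
```

Wrapping the primality test in its own function is one way to "reuse" the earlier solution;
the pseudocode instead inlines it as a nested loop with a Boolean flag.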

4.7.3. Paying the Piper

Banks issue loans to customers as one lump sum called a principal P that the borrower
must pay back over a number of terms. Usually payments are made on a monthly basis.
Further, banks charge an amount of interest on a loan measured as an Annual Percentage
Rate (APR). Given these conditions, the borrower makes monthly payments determined
by the following formula.

\[ \text{monthlyPayment} = \frac{iP}{1 - (1 + i)^{-n}} \]

where \( i = \frac{apr}{12} \) is the monthly interest rate, and n is the number of terms (in months).

For simplicity, suppose we borrow P = $1,000 at 5% interest (apr = 0.05) to be paid
back over a term of 2 years (n = 24). Our monthly payment would (rounded) be

\[ \text{monthlyPayment} = \frac{\frac{0.05}{12} \cdot 1000}{1 - \left(1 + \frac{0.05}{12}\right)^{-24}} = \$43.87 \]

When the borrower makes the first month’s payment, some of it goes to interest, some of
it goes to paying down the balance. Specifically, one month’s interest on $1,000 is

\[ \$1{,}000 \cdot \frac{0.05}{12} = \$4.17 \]

and so $43.87 − $4.17 = $39.70 goes to the balance, making the new balance $960.30.
The next month, this new balance is used to compute the new interest payment,

\[ \$960.30 \cdot \frac{0.05}{12} = \$4.00 \]

and so on until the balance is fully paid. This process is known as loan amortization.

Let’s write a program that will calculate a loan amortization schedule given the inputs as
described above. To start, we’ll need to compute the monthly payment using the formula
above and for that we’ll need a monthly interest rate. The balance will be updated
month-to-month, so we’ll use another variable to represent the balance. Finally, we’ll
want to track the current month in the loan schedule process.

Once we have these variables set up, we can start a loop that will repeat once for each
month in the loan schedule. We could do this using either type of loop, but for this
exercise, let’s use a while loop. Using our month variable, we’ll start by initializing it to
1 and run the loop through the last month, n.

On each iteration we compute that month’s interest and principal payments as above,
update the balance, and also be sure to update our month counter variable to ensure
we’re making progress toward the termination condition. On each iteration we’ll also
output each of these variables to the user. The full program can be found in Algorithm
4.16.

If we were to actually implement this we’d need to be more careful. This outlines the basic
process, but keep in mind that US dollars are only accurate to cents. A monthly payment
can’t be $43.871. We’ll need to take care to round properly. This introduces another
issue: because of rounding, the final month’s payment may not match the expected monthly
payment (we may over or under pay in the final month). An actual implementation may
need to handle the final month’s payment separately, with different logic and operations
that are outside the loop.

Input : A principal, P, a number of terms, n, an APR, apr
Output : A loan amortization schedule
1 balance ← P //The initial balance is the principal
2 i ← apr/12 //monthly interest rate
3 monthlyPayment ← iP / (1 − (1 + i)^(−n))
4 month ← 1 //A month counter
5 while (month ≤ n) do
6   monthInterest ← i · balance
7   monthPrincipal ← monthlyPayment − monthInterest
8   balance ← balance − monthPrincipal
9   output month, monthInterest, monthPrincipal, balance
10   month ← (month + 1)
11 end

Algorithm 4.16: Computing a loan amortization schedule
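One possible Python sketch of this schedule, including the rounding concerns raised above,
is given below. Several things here are assumptions made for illustration: the function
name, the use of floats rounded to two places (a production program would more likely use a
decimal or integer-cents type), a nonzero APR, and the choice to let the final payment clear
whatever balance remains (handled inside the loop here for brevity).

```python
def amortization_schedule(principal, n, apr):
    """Return (monthly_payment, rows); each row is
    (month, interest, principal_paid, balance). Assumes apr > 0 and n >= 1."""
    i = apr / 12   # monthly interest rate
    monthly_payment = round(i * principal / (1 - (1 + i) ** -n), 2)
    balance = principal
    rows = []
    month = 1
    while month <= n:
        month_interest = round(i * balance, 2)
        if month == n:
            month_principal = balance   # final payment clears the remainder
        else:
            month_principal = round(monthly_payment - month_interest, 2)
        balance = round(balance - month_principal, 2)
        rows.append((month, month_interest, month_principal, balance))
        month += 1   # progress toward the termination condition
    return monthly_payment, rows


payment, rows = amortization_schedule(1000, 24, 0.05)
print(payment)        # 43.87
for row in rows[:2]:
    print(row)        # first two months of the schedule
```

Each row is recorded before the month counter is incremented, so the month printed matches
the month the figures belong to.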

4.8. Exercises

Exercise 4.1. Write a for-loop and a while-loop that accomplish each of the following.

(a) Prints all integers 1 thru 100 on the same line delimited by a single space

(b) Prints all even integers 0 up to n in reverse order

(d) Prints all positive powers of two up to \(2^{30}\): 1, 2, 4, . . . , 1073741824 one value per
line (try computing up to \(2^{31}\) and \(2^{32}\) and discern reasons for why it may fail)

(e) Prints all even integers 2 thru 200 on 10 different lines (10 numbers on each line)
in reverse order

(f) Prints the following pattern of numbers (hint: use two nested loops; the result can
be computed using some value of i + 10j)


11 21 31 41 51 61 71 81 91 101
12 22 32 42 52 62 72 82 92 102
13 23 33 43 53 63 73 83 93 103
14 24 34 44 54 64 74 84 94 104
15 25 35 45 55 65 75 85 95 105
16 26 36 46 56 66 76 86 96 106
17 27 37 47 57 67 77 87 97 107
18 28 38 48 58 68 78 88 98 108
19 29 39 49 59 69 79 89 99 109
20 30 40 50 60 70 80 90 100 110

Exercise 4.2. Civil engineers have come up with two different models on how a city’s
population will grow over the next several years. The first projection assumes a 10%
annual growth rate while the second projection assumes a linear growth rate of 50,000
additional citizens per year. Write a program to project the population growth under
both models. Take, as input, the initial population of the city along with a number of
years to project the population.

In addition, compute how many years it would take to double the population under each
model.

Exercise 4.3. Write a loan program similar to the amortization schedule program we
developed in Section 4.7.3. However, give the user an option to specify an extra monthly
payment amount in order to pay off the loan early. Calculate how much quicker the loan
gets paid off and how much they save in interest.

Exercise 4.4. The rate of decay of a radioactive isotope is given in terms of its half-life
H, the time lapse required for the isotope to decay to one-half of its original mass.
For example, the isotope Strontium-90 (⁹⁰Sr) has a half-life of 28.9 years. If we start
with 10kg of Strontium-90 then 28.9 years later we would expect to have only 5kg of
Strontium-90 (and 5kg of Yttrium-90 and Zirconium-90, isotopes which Strontium-90
decays into).

Write a program that takes the following input:

• Atomic Number (integer)

• Element Name

• Element Symbol

• H (half-life of the element)

• m, an initial mass in grams


Your program will then produce a table detailing the amount of the element that remains
after each year until less than 50% of the original amount remains. This amount can be
computed using the following formula:

\[ r = m \times \left(\frac{1}{2}\right)^{y/H} \]

where y is the number of years elapsed, and H is the half-life of the isotope in years.

For example, using your program on Strontium-90 (symbol: Sr, atomic number: 38)
with a half-life of 28.9 years and an initial amount of 10 grams would produce a table
something like the following.

Strontium-90 (38-Sr)
Elapsed Years Amount
-------------------------
- 10g
1 9.76g
2 9.53g
3 9.30g
...
28 5.11g
29 4.99g

Exercise 4.5. Write a program that computes various statistics on a collection of
numbers that can be read in from the command line, as command line arguments, or via
other means. In particular, given n numbers,

\[ x_1, x_2, \ldots, x_n \]

your program should compute the following statistics:

• The minimum number

• The maximum number

• The mean,
\[ \mu = \frac{1}{n} \sum_{i=1}^{n} x_i \]

• The variance,
\[ \sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2 \]

• And the standard deviation,
\[ \sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2} \]

where n is the number of numbers that was provided. For example, with the numbers,

3.14, 2.71, 42, 3, 13

your output should look something like:

Minimum: 2.71
Maximum: 42.00
Mean: 12.77
Variance: 228.77
Standard Deviation: 15.13

Exercise 4.6. The ancient Greek mathematician Euclid developed a method for finding
the greatest common divisor of two positive integers, a and b. His method is as follows:

1. If the remainder of a/b is 0 then b is the greatest common divisor.

2. If it is not 0, then find the remainder r of a/b and assign b to a and the remainder
r to b.

3. Return to step (1) and repeat the process.

Write a program that uses a function to perform this procedure. Display the two integers
and the greatest common divisor.

Exercise 4.7. Write a program to estimate the value of e ≈ 2.718281 . . . using the series:

\[ e = \sum_{k=0}^{\infty} \frac{1}{k!} \]

Obviously, you will need to restrict the summation to a finite number of terms, n.

Exercise 4.8. The value of π can be expressed by the following infinite series:

\[ \pi = 4 \cdot \left(1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \frac{1}{11} + \frac{1}{13} - \cdots\right) \]

An approximation can be made by taking the first n terms of the series. For n = 4, the
approximation is

\[ \pi \approx 4 \cdot \left(1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7}\right) = 2.8952 \]

Write a program that takes n as input and outputs an approximation of π according to
the series above.

Exercise 4.9. The sine function can be approximated using the following Taylor series.

\[ \sin(x) = \sum_{i=0}^{\infty} \frac{(-1)^i}{(2i+1)!} x^{2i+1} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots \]

Write a function that takes x and n as inputs and approximates sin x by computing the
first n terms in the series above.

Exercise 4.10. One way to compute π is to use Machin’s formula:

\[ \frac{\pi}{4} = 4 \arctan\frac{1}{5} - \arctan\frac{1}{239} \]

To compute the arctan function, you could use the following series:

\[ \arctan x = \frac{x}{1} - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots = \sum_{i=0}^{\infty} \frac{(-1)^i}{2i+1} x^{2i+1} \]

Write a program to estimate π using these formulas but allowing the user to specify how
many terms to use in the series to compute it. Compare the estimate with the built-in
definition of π in your language.

Exercise 4.11. The arithmetic-geometric mean of two numbers x, y, denoted M(x, y) (or
agm(x, y)) can be computed iteratively as follows. Initially, \( a_1 = \frac{1}{2}(x + y) \) and
\( g_1 = \sqrt{xy} \) (i.e. the normal arithmetic and geometric means). Then, compute

\[ a_{n+1} = \frac{1}{2}(a_n + g_n) \]
\[ g_{n+1} = \sqrt{a_n g_n} \]

The two sequences will converge to the same number which is the arithmetic-geometric
mean of x, y. Obviously we cannot compute an infinite sequence, so we compute until
\( |a_n - g_n| < \epsilon \) for some small number ε.

Exercise 4.12. The integral of a function is a measure of the area under its curve. One
numerical method for computing the integral of a function f(x) on an interval [a, b] is
the rectangle rule. Specifically, an interval [a, b] is split up into n equal subintervals of
size \( h = \frac{b-a}{n} \). Then the integral is approximated by computing:

\[ \int_a^b f(x)\,dx \approx \sum_{i=0}^{n-1} f(a + ih) \cdot h \]

Write a program to approximate an integral using the rectangle method. For this
particular exercise you will integrate the function

\[ f(x) = \frac{\sin x}{x} \]

For reference, the function is depicted in Figure 4.3. Write a program that will read the
end points a, b and the number of subintervals n and computes the integral of f using
the rectangle method. It should then output the approximation.

Figure 4.3.: Plot of f(x) = sin(x)/x

Exercise 4.13. Another way to compute an integral is to use a technique called Monte
Carlo Integration, a randomized numerical integration method.

Given the interval [a, b], we enclose the function in a region of interest with a rectangle
of a known area \(A_r\). We then randomly select n points within the rectangle and count
the number of random points that are within the function’s curve. If m of the n points
are within the curve, we can estimate the integral to be

\[ \int_a^b f(x)\,dx \approx \frac{m}{n} A_r \]

Consider again the function f(x) = sin(x)/x. Note that the global maximum and minimum
of this function are 1 and ≈ −0.2172 respectively. Therefore, we can also restrict the
rectangle along the y-axis from −.25 to 1. That is, the lower left of the rectangle will be
(a, −.25) and the upper right will be (b, 1) for a known area of

\[ A_r = |a - b| \times 1.25 \]

Figure 4.4 illustrates the rectangle for the interval [−5, 5].

Write a program that will take as input interval values a, b, and an integer n and perform
a Monte Carlo estimate of the integral of the function above. Realize that this is just an
approximation and it is randomized so your answers may not match exactly and may be
different on various executions of your program. Take care that you handle points within
the curve but under the x-axis correctly.


Figure 4.4.: A rectangle for the interval [−5, 5].

Exercise 4.14. Consider a ball trapped in a 2-D box. Suppose that it has an initial
position (x, y) within the box (the box’s dimensions are specified by its lower left
\((x_\ell, y_\ell)\) and upper right \((x_r, y_r)\) points) along with an initial angle of travel θ
in the range [0, 2π). As the ball travels in this direction it will eventually collide with one
of the sides of the box and bounce off. For this model, we will assume no loss of velocity
(it keeps going) and its angle of reflection is perfect.

Write a program that takes as input x, y, θ, \(x_\ell\), \(y_\ell\), \(x_r\), \(y_r\), and an integer n and
computes the first n − 1 Euclidean points on the box’s perimeter that the ball bounces off
of in its travel (include the initial point in your printout for a total of n points). You may
assume that the input will always be “good” (the ball will always begin somewhere inside
the box and the lower left and upper right points will not be reversed).

As an example, consider the inputs:

\[ x = 1,\ y = 1,\ \theta = .392699,\ x_\ell = 0,\ y_\ell = 0,\ x_r = 4,\ y_r = 3,\ n = 20 \]

Starting at (1, 1), the ball travels up and to the right bouncing off the right wall. Figure
4.5 illustrates this and the subsequent bounces back and forth.

Your output should simply be the points and should look something like the following.

(1.000000, 1.000000)
(4.000000, 2.242640)
(2.171572, 3.000000)
(0.000000, 2.100506)
(4.000000, 0.443652)
(2.928929, 0.000000)
(0.000000, 1.213202)
(4.000000, 2.870056)
(3.686287, 3.000000)
(0.000000, 1.473090)
(3.556355, 0.000000)
(4.000000, 0.183764)
(0.000000, 1.840617)
(2.798998, 3.000000)
(4.000000, 2.502529)
(0.000000, 0.845675)
(2.041640, 0.000000)
(4.000000, 0.811179)
(0.000000, 2.468033)
(1.284282, 3.000000)

Figure 4.5.: Follow the bouncing ball

Exercise 4.15. An integer n ≥ 2 is prime if its only divisors are 1 and itself, n. For
example, 2, 3, 5, 7, 11, . . . are primes. Write a program that outputs all prime numbers 2
up to m where m is read as input.

Exercise 4.16. An integer n ≥ 2 is prime if the only integers that evenly divide it are 1
and n itself, otherwise it is composite. The prime factorization of an integer is a list of
its prime divisors along with their multiplicities. For example, the prime factorization
of 188,760 is:

188,760 = 2 · 2 · 2 · 3 · 5 · 11 · 11 · 13

Write a program that takes an integer n as input and outputs the prime factorization of
n. If n is invalid, an appropriate error message should be displayed instead. Your output
should look something like the following.

1001 = 7 * 11 * 13

Exercise 4.17. One way of estimating π is to randomly sample points within a 2 × 2
square centered at the origin. If the distance between the randomly chosen point (x, y)
and the origin is less than or equal to 1, then the point lies inside the unit circle centered
at the origin and we count it. If the point lies outside the circle then we can ignore it. If
we sample n points and m of them lie within the circle, then π can be estimated as

\[ \pi \approx \frac{4m}{n} \]

Given a point (x, y), its distance from the origin is simply

\[ \sqrt{x^2 + y^2} \]

This idea is illustrated in Figure 4.6. Example code is given to randomly generate
numbers within a bound. Write a program that takes an integer n as input and randomly
samples n points within the 2 × 2 square and outputs an approximation of π.

Figure 4.6.: Sampling points in a circle

Of course, you’ll need a way to generate random numbers within the range [−1, 1]. Since
you are using some randomization, the result is just an approximation and may not
match exactly or even be the same between two different runs of your program.


Figure 4.7.: Regular polygons

Exercise 4.18. A regular polygon is a polygon that is equiangular and equilateral. That
is, it has n sides of equal length and n points whose angles from the center are all equal in
measure. Examples for n = 3 through n = 8 can be found in Figure 4.7.

Write a program that takes n and a radius r as inputs and computes the points of a
regular n-sided polygon centered at the origin (0, 0). Each point should be distance r
from the origin and the first point should lie on the positive x-axis. Each subsequent
point should be at an angle θ equal to \( \frac{2\pi}{n} \) from the previous point. Recall that given
the polar coordinates θ, r we can convert to cartesian coordinates (x, y) using the following.

\[ x = r \cdot \cos\theta \]
\[ y = r \cdot \sin\theta \]

Your program should be robust enough to check for invalid inputs. If invalid, an error
message should be printed and the program should exit.

For example, running your program with n = 5, r = 6 should produce the points of a
pentagon with “radius” 6. The output should look something like:

Regular 5-sided polygon with radius 6.0:

(6.0000, 0.0000)
(1.8541, 5.7063)
(-4.8541, 3.5267)
(-4.8541, -3.5267)
(1.8541, -5.7063)

Exercise 4.19. Let \(p_1 = (x_1, y_1)\) and \(p_2 = (x_2, y_2)\) be two points in the cartesian plane
which define a line segment. Suppose we travel along this line starting at \(p_1\) taking n
steps that are an equal distance apart until we reach \(p_2\). We wish to know which points
correspond to each of these steps and which step along this path is closest to another
point \(p_3 = (x_3, y_3)\). Recall that the distance between two points can be computed using
the Euclidean distance formula:

\[ \delta = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} \]

Write a program that takes three points and an integer n as inputs and outputs a sequence
of points along the line defined by \(p_1, p_2\) that are distance \( \frac{\delta}{n} \) apart from each other. It
should also indicate which of these computed points is the closest to the third point.
For example, the execution of your program with inputs 0, 2, −5.5, 7.75, −2, 3, 10 should
produce output that looks something like:

(0.00, 2.00) to (-5.50, 7.75) distance: 7.9569

(0.00, 2.00)
(-0.55, 2.58)
(-1.10, 3.15)
(-1.65, 3.72) <-- Closest point to (-2, 3)
(-2.20, 4.30)
(-2.75, 4.88)
(-3.30, 5.45)
(-3.85, 6.02)
(-4.40, 6.60)
(-4.95, 7.17)
(-5.50, 7.75)

Exercise 4.20. The natural log of a number x is usually computed using some numerical
approximation method. One such method is to use the following Taylor series.

\[ \ln x = (x - 1) - \frac{(x - 1)^2}{2} + \frac{(x - 1)^3}{3} - \frac{(x - 1)^4}{4} + \cdots \]

However, this only works for |x − 1| ≤ 1 (except for x = 0) and diverges otherwise. For
x such that |x| > 1, we can use the series

\[ \ln\frac{y}{y - 1} = \frac{1}{y} + \frac{1}{2y^2} + \frac{1}{3y^3} + \cdots \]

where \( y = \frac{x}{x - 1} \). Of course such an infinite computation cannot be performed by a
computer. Instead, we approximate ln x by computing the series out to a finite number
of terms, n. Your program should print an error message and exit for x ≤ 0; otherwise it
should use the first series for 0 < x ≤ 1 and the second for x > 1.

Another series that has better convergence properties and works for any x > 0 is as
follows

\[ \ln x = \ln\frac{1 + y}{1 - y} = 2y\left(1 + \frac{1}{3}y^2 + \frac{1}{5}y^4 + \frac{1}{7}y^6 + \cdots\right) \]

where \( y = \frac{x - 1}{x + 1} \).

You will write a program that approximates ln x using these two methods computed to n
terms. You will also compute the error of each method by comparing the approximated
value to the standard math library’s log function.

Your program should accept x and n as inputs. It should be robust enough to reject
any invalid inputs (ln x is not defined for x = 0; you may also print an error for any
negative value; n must be at least one). It will then compute an approximation using
both methods and print the relative error of each method.

For example, the execution of your program with inputs 3.1415, 6 should produce output
that looks something like:

Taylor Series: ln(3.1415) ~= 1.11976
Error: 0.02494
Other Series: ln(3.1415) ~= 1.14466
Error: 0.00004

Exercise 4.21. There are many different numerical methods to compute the square root
of a number. In this exercise, you will implement several of these methods.

(a) The Exponential Identity Method involves the following identity:

\[ \sqrt{x} = e^{\frac{1}{2}\ln(x)} \]

which assumes the use of built-in (or math-library) functions for e and the natural
log, ln.

(b) The Babylonian Method involves iteratively computing the following recurrence:

\[ a_i = \frac{1}{2}\left(a_{i-1} + \frac{x}{a_{i-1}}\right) \]

where \(a_1 = 1.0\). Computation is repeated until \( |a_i - a_{i-1}| \leq \delta \) where δ is some
small constant value.

(c) A method developed for one of the first electronic computers (EDSAC [29]) involves
the following iteration. Let \(a_0 = x\), \(c_0 = x - 1\). Then compute

\[ a_{i+1} = a_i - \frac{a_i c_i}{2} \]
\[ c_{i+1} = \frac{c_i^2 (c_i - 3)}{4} \]

The iteration is performed for as many iterations as specified (n), or until the
change in a is negligible. The resulting value for a is used as an approximation,
√x ≈ a. However, this method only works for values of x such that 0 < x < 3. We
can easily overcome this by scaling x by some power of 4 so that the scaled value
of x satisfies 1/2 ≤ x < 2. After applying the method we can then scale back up by
the appropriate power of 2 (since √4 = 2). Algorithm 4.17 describes how to scale x.

Write a program to compute the square root of an input number using these methods
and compare your results.

1 power ← 0
2 while x < 1/2 do
  //Scale up
3   x ← (x · 4)
4   power ← (power − 1)
5 end
6 while x ≥ 2 do
  //Scale down
7   x ← x/4
8   power ← (power + 1)
9 end

Algorithm 4.17: Scaling a value x so that it satisfies 1/2 ≤ x < 2. After execution,
power indicates the power of 4 that x was scaled by (and thus the power of 2 by which
the resulting square root must be scaled back).
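The scaling step of Algorithm 4.17 could be sketched in Python as follows. The function name
is hypothetical and the sketch assumes x > 0 (for x ≤ 0 the first loop would never
terminate). The library square root is used in the example only to check the scaling, not
as one of the methods the exercise asks for.

```python
import math


def scale_for_sqrt(x):
    """Scale x by powers of 4 until 1/2 <= x < 2; assumes x > 0.

    Returns (scaled_x, power) with original x = scaled_x * 4**power.
    """
    power = 0
    while x < 0.5:    # scale up
        x = x * 4
        power -= 1
    while x >= 2:     # scale down
        x = x / 4
        power += 1
    return x, power


scaled, power = scale_for_sqrt(100.0)
print(scaled, power)                    # 1.5625 3
print(math.sqrt(scaled) * 2 ** power)   # 10.0
```

Since the original x equals the scaled value times 4 raised to power, its square root is
the square root of the scaled value times 2 raised to power.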

Exercise 4.22. There are many different numerical methods to compute the natural
logarithm of a number, ln x. In this exercise, you will implement several of these methods.

(a) A formula version approximates the natural logarithm as:

π
ln(x) ≈ − m ln(2)
2M (1, 4/s)

Where M (a, b) is the arithmetic-geometric mean and s = x2m . In this formula, m

is a parameter (a larger m provides more precision).

(b) The standard Taylor Series for the natural logarithm is:
∞
X (−1)n+1
ln(x) = (x − 1)n
n=1
n

As we cannot compute an infinite series, we will simply compute the series to the
first m terms. Also note that this series is not convergent for values x > 1

(c) Borchardt’s algorithm is an iterative method that works as follows. Let

$$a_0 = \frac{1 + x}{2}, \qquad b_0 = \sqrt{x}$$

Then repeat:

$$a_{k+1} = \frac{a_k + b_k}{2}$$
$$b_{k+1} = \sqrt{a_{k+1} b_k}$$

4. Loops

until the absolute difference between $a_k$ and $b_k$ is small; that is, $|a_k - b_k| < \epsilon$. Then the
logarithm is approximated as

$$\ln(x) \approx \frac{2(x - 1)}{a_k + b_k}$$

(d) Newton’s method works if $x$ is sufficiently close to 1. It works by setting $y_0 = 1$
and then computing

$$y_{n+1} = y_n + 2\,\frac{x - e^{y_n}}{x + e^{y_n}}$$

The iteration is performed $m$ times.

To ensure that some of the methods above work, you may need to scale the number x to
be as close as possible to 1. One way to do this is to divide or multiply by e until x is
close to 1. Suppose we divided by e a total of k times; that is, $x = z \cdot e^k$ where z is
close to 1. Then

$$\ln(x) = \ln(z \cdot e^k) = \ln(z) + \ln(e^k) = \ln(z) + k$$

Thus, we can apply Newton’s method to z and add k to the result to get ln(x). A similar
trick can be used to ensure that the Taylor Series method is convergent.

Exercise 4.23. Write a program that takes a positive integer n as a command line
argument and outputs a list of all pairs of integers x, y such that 1 ≤ x, y ≤ n whose
sum, x + y is prime. For example, if n = 5, the output should look something like the
following.

1 + 2 = 3 is prime
1 + 4 = 5 is prime
2 + 3 = 5 is prime
2 + 5 = 7 is prime
3 + 4 = 7 is prime

Exercise 4.24. Consider the following variation on the classical “FizzBuzz” challenge.
Write a program that will print out numbers 1 through n where n is provided as a
command line argument. However, if the number is a perfect square (that is, the square
of some integer; for example $1 = 1^2$, $4 = 2^2$, $9 = 3^2$, etc.) print “Go Huskers” instead. If
the number is a prime (2, 3, 5, 7, etc.) print “Go Cubs” instead.

Exercise 4.25. Implement a program to solve the classic “Rainfall Problem” (which
has been used in CS education studies to assess programming knowledge and skills).
Write an interactive program that repeatedly prompts the user to enter an integer (which
represents the amount of rainfall on a particular day) until it reads the integer 99999.
After 99999 is entered, it should print out the correct average. That is, it should not count
the final 99999. Negative values should also be ignored. For example, if the user entered
the sequence 4 0 -1 10 99999 the output should look something like the following.

Average rainfall: 4.67

Exercise 4.26. Write a program that takes an integer n and a subsequent list of integers
as command line arguments and determines which number(s) between 1 and n are
missing from the list. For example, if the following numbers are given to the program:
10 5 2 3 9 2 8 8 your output should look something like:

Missing numbers 1 thru 10:

1, 4, 6, 7, 10

Exercise 4.27. Write a program that takes a list of pairs of numbers representing
latitudes/longitudes (latitudes on the scale [−90, 90] and longitudes on the scale
[−180, 180]; negative values correspond to the southern and western hemispheres).
Then, starting with the first pair, calculate the intermediate
air distances between each location as well as a final total distance.

To compute air distance from location A to a location B, use the Spherical Law of
Cosines:
d = arccos (sin(ϕ1 ) sin(ϕ2 ) + cos(ϕ1 ) cos(ϕ2 ) cos(∆)) · R
where

• ϕ1 is the latitude of location A, ϕ2 is the latitude of location B

• ∆ is the difference between location B’s longitude and location A’s longitude

• R is the (average) radius of the earth, 6,371 kilometers

Note: the formula above assumes that latitude and longitude are measured in radians r,
−π ≤ r ≤ π. To convert from degrees deg (−180 ≤ deg ≤ 180) to radians r, you can use
the simple formula:
$$r = \frac{\text{deg}}{180}\,\pi$$

For example, if the command line arguments were

40.8206 -96.756 41.8806 -87.6742 41.9483 -87.6556 28.0222 -81.7329


your output should look something like:

(40.8206, -96.7560) to (41.8806, -87.6742): 766.8053km
(41.8806, -87.6742) to (41.9483, -87.6556): 7.6836km
(41.9483, -87.6556) to (28.0222, -81.7329): 1638.7151km
Total Distance: 2413.2040km

Exercise 4.28. A DNA sequence is made up of a sequence of four nucleotide bases, A,
C, G, T (adenine, cytosine, guanine, thymine). One particularly interesting statistic of a
DNA sequence is finding a CG island: a subsequence that contains the highest frequency
of guanine and cytosine.

For simplicity, we will be interested in subsequences of a particular length, n, that will be
provided as part of the input.

Write a program that takes, as command line arguments, an integer n and a DNA
sequence. The program should then find all subsequences of the given DNA string of
length n with the maximal frequency of C and G in it. For example, if the DNA sequence
is

ACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGC

and the “window” size that we’re interested in is n = 5 then you would scan the sequence
and find every subsequence with the maximum number of C or G bases. Your output
should include all CG Islands (by indices) in the sequence similar to the following.

n = 5
highest frequency: 5 / 5 = 100.00%
CG Islands:
15 thru 20: CCCCC
16 thru 21: CCCCG
17 thru 22: CCCGG
18 thru 23: CCGGC
19 thru 24: CGGCC
42 thru 47: CCGGG
43 thru 48: CGGGG
44 thru 49: GGGGC
45 thru 50: GGGCC


Exercise 4.29. Write a program that will assist people in saving for retirement using a
tax-deferred 401k program.

Your program will read the following inputs as command line arguments.

• An initial starting balance

• A monthly contribution amount (we’ll assume it’s the same over the life of the
savings plan)

• An (average) annual rate of return (on the scale [0, 1])

• An (average) annual rate of inflation (on the scale [0, 1])

• A number of years until retirement

Your program will then compute a monthly savings table detailing the (inflation-adjusted)
interest earned each month, contribution, and new balance. The inflation-adjusted rate
of return can be computed with the following formula.

$$\frac{1 + \text{rate of return}}{1 + \text{inflation rate}} - 1$$

To get the monthly rate, simply divide by 12. Each month, interest is applied to the
balance at this rate (prior to the monthly deposit) and the monthly contribution is added.
Thus, the earnings compound month to month.

Be sure that your program handles bad inputs as well as it can. It should also round to
the nearest cent for every figure. Finally, as of 2014, annual 401k contributions cannot
exceed $17,500. If the user’s proposed savings schedule violates this limit, display an
error message instead of the savings table.

For inputs 10000 500 0.09 0.012 10 your output should look something like the
following:

Month     Interest        Balance
    1   $    64.23   $   10564.23
    2   $    67.85   $   11132.08
    3   $    71.50   $   11703.58
    4   $    75.17   $   12278.75
    5   $    78.87   $   12857.62
    6   $    82.58   $   13440.20
    7   $    86.33   $   14026.53
    8   $    90.09   $   14616.62
    9   $    93.88   $   15210.50
  ...
  116   $   678.19   $  106767.24
  117   $   685.76   $  107953.00
  118   $   693.37   $  109146.37
  119   $   701.04   $  110347.41
  120   $   708.75   $  111556.16
Total Interest Earned: $  41556.16
Total Nest Egg:        $ 111556.16

Exercise 4.30. An affine cipher is an encryption scheme that encrypts messages using
the following function:
$$e_k(x) = (ax + b) \bmod n$$
where n is some integer and 0 ≤ a, b, x ≤ n − 1. That is, we fix n, which will be used to
encode an alphabet as in Table 4.1.

  x     character
  0     (space)
  1     A
  2     B
  3     C
  ...   ...
  25    Y
  26    Z
  27    .
  28    !

Table 4.1.: Character Mapping for n = 29

Then we choose integers a, b to define the encryption function. Suppose a = 10, b = 13,
then
$$e_k(x) = (10x + 13) \bmod 29$$
So to encrypt “HELLO!” we would encode it as 8, 5, 12, 12, 15, 28, then encrypt each value,

$$e_k(8) = (10 \cdot 8 + 13) \bmod 29 = 6$$
$$e_k(5) = (10 \cdot 5 + 13) \bmod 29 = 5$$
$$e_k(12) = (10 \cdot 12 + 13) \bmod 29 = 17$$
$$e_k(12) = (10 \cdot 12 + 13) \bmod 29 = 17$$
$$e_k(15) = (10 \cdot 15 + 13) \bmod 29 = 18$$
$$e_k(28) = (10 \cdot 28 + 13) \bmod 29 = 3$$

which, when mapped back to characters using our encoding, is “FEQQRC.”

To decrypt a message we need to invert the encryption function, that is,

$$d_k(y) = a^{-1} \cdot (y - b) \bmod n$$

where $a^{-1}$ is the inverse of a modulo n. The inverse of an integer a is the value such that

$$(a \cdot a^{-1}) \bmod n = 1$$

so for a = 10, n = 29, the inverse is $10^{-1} \bmod 29 = 3$ since $3 \cdot 10 \bmod 29 = 1$. Given a and
n, how can we find an inverse, $a^{-1}$? Obviously it cannot be zero, and it can only be 1
when a = 1 (1 is its own inverse). There is a simple algorithm (the Extended Euclidean
Algorithm) that can solve this problem, but n = 29 is small enough that a brute-force
strategy of testing all possibilities will suffice.

Write a program that takes a, b and an encrypted message as command line arguments
and decrypts the message. Your program should print the decrypted message and other
cipher information to the standard output. For example:

a = 10
b = 13
a^-1 = 3
Encrypted Message: FEQQRC
Decrypted Message: HELLO!

Exercise 4.31. A centroid (or barycenter) of a plane figure is the arithmetic mean of all
the points in the shape. The centroid of a non-self-intersecting closed polygon defined by
n vertices $(x_0, y_0), (x_1, y_1), \ldots, (x_{n-1}, y_{n-1})$ is the point $(C_x, C_y)$ which can be computed
using:

$$C_x = \frac{1}{6A} \sum_{i=0}^{n-1} (x_i + x_{i+1})(x_i y_{i+1} - x_{i+1} y_i)$$

$$C_y = \frac{1}{6A} \sum_{i=0}^{n-1} (y_i + y_{i+1})(x_i y_{i+1} - x_{i+1} y_i)$$

with A being the polygon’s signed area:


[Figure: a polygon with vertices (1, 1.5), (2, 3.6), (4.25, 2.1), (3.7, 1.5), and (2.25, 0.25); its centroid is marked at (2.468, 1.902).]

Figure 4.8.: A polygon and its centroid. Whoo!

$$A = \frac{1}{2} \sum_{i=0}^{n-1} (x_i y_{i+1} - x_{i+1} y_i)$$

In these formulas, the vertices are assumed to be numbered in order of their occurrence
along the polygon’s perimeter. Furthermore, the vertex $(x_n, y_n) = (x_0, y_0)$.

Write a program that takes n pairs of x-y coordinates as command line arguments that
represent a polygon and computes its centroid. For example, an input such as

1 1.5 2 3.6 4.25 2.1 3.7 1.5 2.25 0.25

would correspond to the polygon in Figure 4.8 with a centroid of approximately
(2.47, 1.90). The output should look something like the following.

Centroid: (2.468405, 1.902874)

Exercise 4.32. Suppose we are given a set of n points $(x_0, y_0), (x_1, y_1), \ldots, (x_{n-1}, y_{n-1})$
in the plane. Write a program that outputs instructions for someone to “walk” to each
point in the order that they are given starting at the origin by only going in the cardinal
directions (not “as the bird flies”). For example, if the following input is given,

3 5 2 7 9 5

Then the output should look like the following.

Go north 5.00 units
Go east 3.00 units
Go north 2.00 units
Go west 1.00 units
Go south 2.00 units
Go east 7.00 units

Exercise 4.33. A histogram is a graphical representation of the distribution of numerical
data. Typically, a histogram is represented as a (vertical) bar graph. However, we’ll limit
our attention to graphing a horizontal ASCII histogram.

In particular, write a program that takes a sequence of numerical values representing
grades (on a 0–100 scale; reject any invalid values). The program will then display
a horizontal graph of the distribution of these values over 6 fixed ranges that looks
something like the following.

 0 -  49 (2, 11.11%) ******
50 -  59 (1,  5.56%) ***
60 -  69 (1,  5.56%) ***
70 -  79 (5, 27.78%) **************
80 -  89 (4, 22.22%) ************
90 - 100 (5, 27.78%) **************

In the diagram, each star, *, represents (at least) 2%.

Exercise 4.34. Shannon entropy is a measure of information in a piece of data; specifically
it is the expected value (average) of the information the data contains. A higher
entropy value corresponds to more random data while 0 indicates highly structured data.
Entropy is calculated using the following formula.

$$H(X) = -\sum_{i=1}^{n} p(x_i) \cdot \log_2 p(x_i)$$

where $x_1, x_2, \ldots, x_n$ are the distinct symbols in the data sequence X and $p(x_i)$ is the
probability of the symbol $x_i$. If the length of X is N, then this is simply

$$p(x_i) = \frac{c(x_i)}{N}$$

where $c(x_i)$ is a count of the number of times $x_i$ appears in X. For example, the string
“Hello World” has 8 distinct symbols with the probability of “l” being .2727, “o” being
.1818 and the remaining being .0909. Thus the entropy would be

$$H(\text{'Hello World'}) = 2.845351$$


Write a program that takes a string as command line input (which may contain spaces)
and computes its entropy.

5. Functions
In mathematics, a function is a mapping from a set of inputs to a set of outputs such
that each input is mapped to exactly one output. For example, the function

$$f(x) = x^2$$

maps numeric values to their squares. The input is a variable x. When we assign an
actual value to x and evaluate the function, then the function has a value, its output. For
example, setting x = 2 as input, the output would be $2^2 = 4$. Mathematical functions
can have multiple inputs,

$$f(x, y) = x^2 + y^2 \qquad f(x, y, z) = 2x + 3y - 4z$$

However, a function will only ever have a single output value.

In programming languages, a function (sometimes called subroutine or procedure) can
take multiple inputs and produce one output value. We’ve already seen some examples
of these functions. For example, most languages provide a math library that you can
use to evaluate the square root or sine of a value x. We’ve also seen some functions
with multiple input values such as the “power” function that allows you to compute
$f(x, y) = x^y$. Finally, the main entry point to many programs is defined by a main
function.

More formally, a function is a sequence of instructions (code) that is packaged into a
unit that can be reused. A function performs a specific task: given a number of inputs,
it executes some sequence of operations (executes some code) and “returns” (outputs) a
result. The output can be captured into a variable or used in an expression by whatever
code invoked or “called” the function.

Defining and using functions in programming has numerous advantages. The most
obvious advantage is that it gives you a way to organize code. By separating a program
into distinct units of code, it becomes more organized and it is clearer what each piece or
segment of code does. This also facilitates top-down design: one way to approach a
problem is to split it up into a series of subproblems until each subproblem is either
trivial to deal with, or an existing, “off-the-shelf” solution already exists for it. Functions
may serve as the logical unit for each subproblem.

Another advantage is that by organizing code into functions, those functions can be reused
in multiple places either in your program/project or even in other programs/projects.
Prime examples of this are the standard libraries available in most programming
languages that provide functions to perform standard input/output or mathematical
functions. These standard libraries provide functions that are used by thousands of
different programs across multiple different platforms.

Functions also form an isolated unit of code. This allows for better and easier testing.
By isolating pieces of code, we can rigorously test those pieces of code by themselves
without worrying about the larger program or contexts.

Functions facilitate procedural abstraction. Placing code into functions allows you to
abstract the details of how the function computes its answer. As an example, consider
a standard math library’s square root function: it may use some interpolation method,
a Taylor series, or some other method entirely to compute the square root of a given
number. However, by putting this functionality into a function, we, as programmers, do
not need to concern ourselves with these details. Instead, we simply use the function,
allowing us to focus on the larger issues at hand in our program.

5.1. Defining & Using Functions

Like variables, many programming languages may require that you declare a function
before you can use it. A function declaration may simply include a description of the
function’s input/output and name. A function declaration may require you to define the
function’s body at the same time or separately. Functions can also have scope: some
areas of the code may be able to “see” the function or know about it and be able to
invoke the function while other areas of the code may not be able to see the function
and therefore may not be able to invoke it.

Some interpreted programming languages use function hoisting which allows you to
use/invoke functions before you declare them. This works because the interpreter does
an initial scan of the code and identifies all function declarations. Only after it has
“hoisted” all functions into scope does it start to execute the program. Thus, a function
declaration can appear after it has been used and it will still work.

5.1.1. Function Signatures

A function is usually defined by its signature: every function can be identified by its
name (also called an identifier), its list of input parameters, and its output. A function
signature allows the programming language to uniquely identify each function so that
when you invoke a function there is no ambiguity in which function you are calling.

1 Function sum(a, b)
2     x ← a + b
3     return x
4 end

Algorithm 5.1: A function in pseudocode. In this case, the name (identifier) of
the function is sum and it has two parameters, a and b. Its body is contained in
lines 2–3. Its return value is indicated by the return statement on line 3.

A function declaration in pseudocode is presented in Algorithm 5.1. In the pseudocode,
explicit variable types are omitted, and thus the return type is inferred from the return
statement. In Figure 5.1 we have provided an example of a function declaration in the C
programming language with each element labeled.

double getDistance(double x1, double y1, double x2, double y2);

Figure 5.1.: A function declaration (prototype) in the C programming language with the
return type (double), identifier (getDistance), and parameter list (x1, y1, x2, y2)
labeled.

Some languages only allow you to use one identifier for one function (like variables) while
other languages allow you to define multiple functions with the same identifier as long
as the parameter list is different (see Section 5.3.2 below). In general, like variables,
function names are case sensitive. Also similar to variables, modern lower camel casing
is used with function names.

When defining the parameters to a function (its input), you usually provide a comma
delimited list of variable names. In the case of statically typed languages, the types of
the variable parameters are also specified. The order is important as when you invoke
the function, the number of inputs must match the number of parameters in the function
declaration. The variable types may also need to match. In some dynamically typed
languages, you may be able to call functions with different types or you may be able to
omit some of the parameters (see Section 5.3.4 below).

Similarly, the return type of the function may need to be specified in statically typed
languages while with dynamic languages, functions may conditionally return different
types. We generally refer to the “return value” or “return type” because when a function
is done executing, it “returns” the control flow back to the line of code that invoked it,
returning its computed value.

You can also define functions that may not have any inputs or may not have any output.
Some languages use the keyword void to indicate no return value and such functions are
known as “void functions.” When a function doesn’t have any input values, its parameter
list is usually empty.

The function signature may be accompanied by the function body which contains the actual
code that specifies what the function does. Typically the function body is demarcated
with a code block using opening and closing curly brackets, { ... } . Within the
function you can generally write any valid code including declaring variables. When you
declare a variable inside a function, however, it is local to that function. That is, the
variable’s scope is only defined within the function. A local variable cannot be accessed
outside the function; indeed, the local variable does not usually survive when the function
ends its execution and returns control back to the line of code that called it. Function
parameters are essentially locally scoped variables as well and can usually be treated as
such.

5.1.2. Calling Functions

When a function has been defined and is in scope, you can invoke or “call” the function
by coding the function name and providing the input parameters which can be either
variables or literals. When provided as inputs, parameters are referred to as arguments
to the function. The arguments are typically provided as a comma delimited list and
placed inside parentheses.

Invoking a function changes the usual flow of control. When invoked, control flow is
transferred over to the function. When the function finishes executing the code in its
body, control flow returns to the point in the code that invoked it. It is common for a
program to be written so that a function calls another function and that function calls
another. This can form a deep chain of function calls in which the flow of control is
transferred multiple times. Upon the completion of each function, control is returned
back to the function that called it, referred to as the calling function.

If a function returns a value, it can either be captured in a variable using an assignment
operator or used directly in an expression.

    a ← 10
    b ← 20
    c ← sum(a, b)

Algorithm 5.2: Using a function. We invoke a function by indicating its name
(identifier) and passing it arguments.

5.1.3. Organizing

Functions provide code organization, but functions themselves should also be organized.
We’ve seen this with standard libraries. Functions that provide basic input/output are
all grouped together into one library. Functions that involve math functions are grouped
together into a separate math library.

Some languages allow you to define and “import” individual libraries which organize
similar functions together. Some languages do this by collecting functions into “utility”
classes or modules. Only when you import these modules do the functions come into
scope and can be used in your code. If you do not import these modules, then the
functions are out of scope and cannot be used.

In some languages once a function is imported it is part of the global scope and can be
“seen” by any part of the code. This can cause conflicts: if you import modules from two
different libraries each with different functions that have the same name or signature,
then the two function definitions may be in conflict or it may make your code ambiguous
as to which function you intend to invoke. This is sometimes referred to as “polluting the
namespace.” There are several techniques that can avoid this situation. Some languages
allow you to place functions into a namespace to keep functions with the same name in
different “spaces.” Other languages allow you to place functions into different classes and
then invoke them by explicitly specifying which class’s function you want to call. Yet
other languages don’t have great support for function organization and it is the library
designer’s responsibility to avoid naming conflicts, typically by adding a library-specific
prefix to every function.

5.2. How Functions Work

To understand how functions work in practice, it is necessary to understand how a
program operates at a lower level. In particular, each program has a program stack (also
called a call stack). A stack is a data structure that holds elements in a Last-In First-Out
(LIFO) manner. Elements are added to the “top” of the stack in an operation called
push and elements can be removed from the top of the stack in an operation called pop.
In general, elements cannot be inserted or removed from the middle or “bottom” of the
stack.

In the context of a program, a call stack is used to keep track of the flow of control.
Depending on the operating system, compiler and architecture, the details of how
elements are stored in the program stack may vary. In general when a program begins,
the operating system loads it into memory at the bottom of the call stack. Global
variables (static data) are stored on top of the main program. Each time a function is
called, a new stack frame is created and pushed on top of the stack. This frame contains
enough space to hold values for the arguments passed to the function, local variables
declared and used by the function, as well as a space for a return value and a return
address. The return address is a memory location that the program should return to after
the execution of the function. That way, when the function finishes its execution, the
stack frame can be removed (popped) and the lower stack frame of the calling function
is preserved. This is a very efficient way to keep track of the flow of control in a program.
As each function calls another function, each stack frame is preserved by pushing a new
one on top of the program stack. Each time a function terminates execution and returns,
the removal of the stack frame means that all local variables go out of scope. Thus,
variables that are local to a function are not accessible outside the function.

To illustrate, consider the following snippet of C code. The main() function invokes
the average() function which in turn invokes the sum() function. Each invocation
creates a new stack frame on top of the last in the program stack which is depicted in
Figure 5.2.

    #include <stdio.h>

    /* returns the sum of its two inputs */
    double sum(double a, double b) {
      double x = a + b;
      return x;
    }

    /* returns the average of its two inputs, calling sum() to do so */
    double average(double a, double b) {
      double y = sum(a, b) / 2.0;
      return y;
    }

    int main(int argc, char **argv) {
      double n = 10.0;
      double m = 16.0;
      double ave = average(n, m);
      printf("average = %f\n", ave);
      return 0;
    }

    (low memory)     Top of the stack

                     Unused stack space

    sum()            local variables: x
    stack frame      return address, value (26.0)
                     arguments: a, b

    average()        local variables: y
    stack frame      return address, value (13.0)
                     arguments: a, b

    main()           local variables: n, m, ave
    stack frame      return address, value (0)
                     arguments: argc, argv

    Static Content   Global Variables

                     Program Code

    (high memory)    Bottom of the stack

Figure 5.2.: Program Stack. At the bottom we have the program’s code, followed by
static content such as global variables. Each function call has its own stack
frame along with its own arguments and local variables. In particular, the
variable arguments a and b in two different stack frames are completely
different variables. Upon returning from the sum() function call, the top-
most stack frame would be popped and removed, returning to the code
for the average() function via the return address. The stack is depicted
bottom-up with high memory at the bottom and low memory at the top,
but this may differ depending on the architecture.


5.2.1. Call By Value

When a function is invoked, arguments are passed to it. When you invoke a function you
can pass it variables as arguments. However, the variables themselves are not passed to
the function, but instead the values stored in the variables at the time that you call the
function are passed to the function. This mechanism is known as call by value and the
variable values are passed by value to the function.

Recall that the arguments passed to a function are placed in a new stack frame for that
function. Thus, in reality copies of the values of the variables are passed to the function.
Any changes to the parameters inside the function have no effect on the original variables
that were “passed” to the function when it was invoked.

To illustrate, consider the following C code. We have a function sum that takes two
integer parameters a and b which are passed by value. Inside sum, we create another
variable x which is the sum of the two passed variables. We then change the value
of the first variable, a, to 10. Elsewhere in the code we call sum on two variables,
n and m, with values 5 and 15 respectively. The invocation of the function sum means
that the two values, 5 and 15, stored in the variables are copied into a new stack frame.
Thus, changing the value of the first parameter changes the copy and has no effect on
the variable n. At the end of this code snippet n retains its original value of 5. The
program stack frames are depicted in Figure 5.3.

    int sum(int a, int b) {
      int x = a + b;
      a = 10;   /* changes only the local copy */
      return x;
    }

    ...

    int n = 5;
    int m = 15;
    int k = sum(n, m);   /* n is still 5 afterward */

                                  (a)         (b)         (c)         (d)
    sum()             0x0088    x = 20      x = 20      x = 20      x = 20
    stack frame       0x0084    b = 15      b = 15      b = 15      b = 15
                      0x0080    a = 5       a = 10      a = 10      a = 10
                      ...
    calling function  0x0018    k           k           k           k = 20
    stack frame       0x0014    m = 15      m = 15      m = 15      m = 15
                      0x0010    n = 5       n = 5       n = 5       n = 5

(a) Upon invocation of the sum() function, a new stack frame is created which holds the
parameters and local variable. The parameter variables a and b are distinct from the
original argument variables n and m.
(b) The change to the variable a in the sum() function changes the parameter variable,
but the original variable n is unaffected.
(c) When the sum() function finishes execution, its stack frame is removed and the
variables a, b, and x are no longer valid. The return value 20 is stored in another return
value location.
(d) The returned value is stored in the variable k and the variable n retains its original
value.

Figure 5.3.: Demonstration of Pass By Value. Passing variables by value means that
copies of the values stored in the variables are provided to the function.
Changes to parameter variables do not affect the original variables.

5.2.2. Call By Reference

Some languages allow you to pass a parameter to a function by providing its memory
address instead of the value stored in it. Since the memory address is being provided
to the function, the function is able to access the original variable and manipulate the
contents stored at that memory address. In particular, the function is now able to make
changes to the original variable. This mechanism is known as call by reference and the
variables are passed by reference.

To illustrate, consider the following C code. Here, the variable a is passed by reference
(b is still passed by value; the *a and &n in the following code are dereferencing and
referencing operators respectively. For details, see Section 18.2). Below, when we invoke
the sum() function, we pass not the value stored in n, but the memory address of the
variable n. Thus, when we change the value that a refers to in the function, we are
actually changing the value of n (since we have access to its memory location). At the
conclusion of this snippet of code, the value stored in n has been changed to 10. The
program stack frames are depicted in Figure 5.4.

    int sum(int *a, int b) {
      int x = *a + b;
      *a = 10;   /* changes the variable that a points to */
      return x;
    }

    ...

    int n = 5;
    int m = 15;
    int k = sum(&n, m);   /* n is 10 afterward */

Whether or not a variable is passed by value or by reference depends on the language,
type of variable, and the syntax used.

5.3. Other Issues

5.3.1. Functions as Entities

In programming languages, any entity that can be stored in a variable or passed as
an argument to a function or returned as a value from a function is referred to as a
“first-class citizen.” Numerical values for example are usually first-class citizens as they
can be stored in variables and passed to functions.

                                  (a)           (b)           (c)           (d)
    sum()             0x0088    x = 20        x = 20        x = 20        x = 20
    stack frame       0x0084    b = 15        b = 15        b = 15        b = 15
                      0x0080    a = 0x0010    a = 0x0010    a = 0x0010    a = 0x0010
                      ...
    calling function  0x0018    k             k             k             k = 20
    stack frame       0x0014    m = 15        m = 15        m = 15        m = 15
                      0x0010    n = 5         n = 10        n = 10        n = 10

(a) Upon invocation of the sum() function, a new stack frame is created which holds the
parameters and local variable. The parameter variable a holds the memory location of the
original variable n.
(b) The change to the variable a in the sum() function actually changes what the
variable a references. That is, the original variable n.
(c) When the sum() function finishes execution, its stack frame is removed and the
variables a, b, and x are no longer valid. The return value 20 is stored in another return
value location.
(d) The returned value is stored in the variable k and the value in the variable n has now
changed.

Figure 5.4.: Demonstration of Pass By Reference. Passing variables by reference means
that the memory addresses of the variables are provided to the function. The
function is able to make changes to the original variable because it knows
where it is stored.

Functional Programming is a programming language paradigm in which functions
themselves are first-class citizens. That is, functions can be assigned to variables, functions
can be passed to other functions as arguments, and functions can even return functions
as a result. This is done as a matter of course in functional programming languages
such as Haskell and Clojure, but many programming languages contain some functional
aspects.

For example, some languages support the same concept by using function pointers which
are essentially references to where the function is stored in memory. As a memory
location is essentially a number, it can be passed around in functions and be stored in a
variable. Purists would argue that this is not sufficient to call a function a “first-class
citizen” in such a language. They may argue that a language must be able to create
new functions at runtime for it to be considered a language in which functions are “true”
first-class citizens.
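
As a minimal sketch of these three abilities, the following uses Python, chosen here
only for illustration. The function names square, apply_twice, and make_adder are
hypothetical. The last function creates a new function at runtime, which is the
stricter requirement mentioned above.

```python
def square(x):
    """Return x squared."""
    return x * x

def apply_twice(f, x):
    """Call the function f on x, then call f again on the result."""
    return f(f(x))

def make_adder(n):
    """Create and return a new function that adds n to its argument."""
    def add_n(x):
        return x + n
    return add_n

# A function can be assigned to a variable like any other value.
g = square
print(g(4))                     # 16

# A function can be passed to another function as an argument.
print(apply_twice(square, 3))   # (3 * 3) squared is 81

# A function can be returned from a function; add_five is created at runtime.
add_five = make_adder(5)
print(add_five(10))             # 15
```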

In any case, there are several advantages to being able to pass functions around as
arguments or store them in variables. Passing a function to another function as an
argument gives you the ability to provide a callback. A callback is simply a function
that gets passed to another function as an argument. The idea is that the function that
receives the callback will execute or “call back” the passed function at some point.

Using callbacks enables us to program a “generic” function that provides some generalized
functionality. Then more specific behavior can be implemented in the callback function.
For example, we could create a generic sort function that sorts elements in a collection.
We could make the sort function generic so that it could sort any type of data: numbers,
strings, objects, etc. A callback would provide more specific behavior on how to order
individual elements in the sorted array.
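
The generic sort described above might be sketched as follows in Python. The names
generic_sort, comes_before, ascending, and shorter_first are hypothetical, and
insertion sort is used only because it is short; any sorting algorithm could accept
a callback in the same way.

```python
def generic_sort(items, comes_before):
    """Return a new sorted list using insertion sort.

    comes_before(a, b) is the callback: it returns True when a
    should be placed before b in the result.
    """
    result = []
    for item in items:
        # Find the first position whose element should come after item.
        position = 0
        while position < len(result) and not comes_before(item, result[position]):
            position += 1
        result.insert(position, item)
    return result

def ascending(a, b):
    """Order numbers from smallest to largest."""
    return a < b

def shorter_first(a, b):
    """Order strings from shortest to longest."""
    return len(a) < len(b)

print(generic_sort([5, 2, 9, 1], ascending))                   # [1, 2, 5, 9]
print(generic_sort(["pear", "fig", "banana"], shorter_first))  # ['fig', 'pear', 'banana']
```

The sorting logic is written once; only the callback changes when the element type
or the desired ordering changes.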

As another example, consider GUI Programming in which we want to design a user
interface. In particular, we may be able to create a button element in our interface. We
need to be able to specify what happens when the user clicks the button. This could be
achieved by passing in a function as a callback to “register” it with the click “event.”
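
A rough sketch of registering a callback with a click event is shown below in Python.
The Button class is a stand-in written for this example and is not taken from any
real GUI toolkit; on_click, click, and save_document are likewise hypothetical names.

```python
class Button:
    """A stand-in for a GUI button; not taken from any real toolkit."""

    def __init__(self, label):
        self.label = label
        self.click_handlers = []

    def on_click(self, callback):
        """Register a function to be called when the button is clicked."""
        self.click_handlers.append(callback)

    def click(self):
        """Simulate a user click by calling back every registered function."""
        for handler in self.click_handlers:
            handler(self)

saved = []

def save_document(button):
    """The callback: records which button triggered the save."""
    saved.append("saved via " + button.label)

save_button = Button("Save")
save_button.on_click(save_document)   # register the callback; nothing runs yet
save_button.click()                   # the button "calls back" save_document
print(saved)                          # ['saved via Save']
```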

A related issue is anonymous functions (also known as lambda expressions). Often, we
simply want to create a function so that we can pass it as a callback to another function.
We may have no intention of actually calling this function directly as it may not be of
much use other than passing it as a callback. Some languages allow you to define a
function “inline” without an identifier so that it can be passed to another function. Since
the function has no name and cannot be invoked by other sections of the code (other
than the function we passed it to), it is known as an anonymous function.
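
For instance, Python's built-in sorted function accepts a callback through its key
parameter. The sketch below passes the same callback twice, once as a named function
and once as an anonymous function; the names words and word_length are made up for
the example.

```python
words = ["banana", "fig", "cherry", "kiwi"]

# A named function that exists only to be used as a callback.
def word_length(word):
    return len(word)

by_length_named = sorted(words, key=word_length)

# The same callback written inline as an anonymous function.
by_length_lambda = sorted(words, key=lambda word: len(word))

print(by_length_named)    # ['fig', 'kiwi', 'banana', 'cherry']
print(by_length_lambda)   # ['fig', 'kiwi', 'banana', 'cherry']
```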

5.3.2. Function Overloading

Some languages do not allow you to define more than one function with the same name
in the same scope. This is to prevent ambiguity in the code. When you write code to
invoke a function and there are several functions with that name, which one are you
actually calling?

Some languages do allow you to define multiple functions with the same name as long as
they differ in either the number (also called arity) or type of parameters. For example,
you could define two absolute value functions (each computing |x|) with the same name, but one of them
takes a floating point number while the other takes an integer as its parameter. This
is known as function overloading because you are “overloading” the code by defining
multiple functions with the same name.

The ambiguity problem is solved by requiring that functions with the same name
differ in their parameters. If you invoke the absolute value function and pass it a floating
point number, clearly you meant to call the first version. If you passed it an integer, it is
clear that you intended to invoke the second version. Depending on the type and number
of arguments you pass to a function, the compiler or interpreter is able to determine
which version you intend to call and is able to make the right function call. This process
is known as static dispatch.

In a language without function overloading, we would be forced to use different names
for functions that perform the same operation but on different types.

5.3.3. Variable Argument Functions

Many languages allow you to define special functions that take a variable number of
parameters. Often they are referred to as “vararg” (short for variable argument) functions.
The syntax for doing so varies as does how you can write a function to operate on a
variable number of arguments (usually through some array or collection data structure).

The standard printf (print formatted) function found in many languages is a good
example of a vararg function. The printf function allows you to use one function to
print any number of variables or values. Without a vararg function, you would have to
implement a printf version for no arguments, 1 argument, 2 arguments, etc. Even
then, you would only be able to support up to n arguments for as many functions as
you defined. By using a vararg function, we can write a single function that operates
on all of these possibilities. It is important to note that a vararg function is not an
example of function overloading. There is still only one function defined; it just takes
a variable number of arguments.
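
As one illustration, Python marks a variable-argument parameter with a leading
asterisk and delivers the arguments to the function as a tuple. The function name
total is hypothetical.

```python
def total(*values):
    """Sum any number of arguments; values arrives as a tuple."""
    result = 0
    for value in values:
        result += value
    return result

# One function definition handles every argument count.
print(total())             # 0
print(total(4))            # 4
print(total(1, 2, 3, 4))   # 10
```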

5.3.4. Optional Parameters & Default Values

Suppose that you define a function which has, say, three parameters. Now suppose you
invoke the function but only provide it 2 of the 3 arguments that it expects. Some
languages would not allow this and it would be considered a syntax or runtime error. Yet
other languages may have very complex rules about what happens when an argument is
omitted. Some languages allow you to omit some arguments when calling functions as a
feature of the language. That is, the parameters to a function are optional.

When a language allows parameters to be optional, it usually also allows you to define
default values for the parameters if the calling function does not provide them. If a
user calls the function without specifying an argument, the parameter takes on the default
value. Alternatively, the default could be a non-value like “null” or “undefined.” Inside the
function you could implement logic that determines whether or not an argument was
passed to the function and alters the behavior of the function accordingly.
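
Both styles can be sketched in Python, where default values are written in the
function header and None plays the role of the non-value. The function format_price
and its parameters are hypothetical.

```python
def format_price(amount, currency="USD", decimals=None):
    """Format an amount; currency and decimals are optional parameters."""
    if decimals is None:
        # No argument was passed, so choose a behavior for that case.
        decimals = 2
    return f"{amount:.{decimals}f} {currency}"

print(format_price(3.5))                  # 3.50 USD
print(format_price(3.5, "EUR"))           # 3.50 EUR
print(format_price(3.14159, "EUR", 4))    # 3.1416 EUR
```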

5.4. Exercises

Exercise 5.1. Recall that the greatest common divisor (gcd) of two positive integers, a
and b, is the largest positive integer that divides both a and b. Adapt the solution from
Exercise 4.6 into a function. If the language you use supports it, return the gcd via a
pass by reference variable.
Exercise 5.2. Write a function that scales an input x to its scientific notation scale
so that 1 ≤ x < 10. If your language supports pass by reference, the amount that x is
shifted should be stored in a pass-by-reference parameter. For example, a call to this
function with x = 314.15 should return 3.1415 and the amount it is scaled by is n = −2.
Exercise 5.3. Write a function that returns the most significant digit of a floating point
number. The function should only return an integer in the range 1–9 (it should return
zero only if x = 0).
Exercise 5.4. Write a function that, given an integer x, sums the values of its digits.
That is, for x = 29423 the sum 2 + 9 + 4 + 2 + 3 = 20.
Exercise 5.5. Write a function to convert radians to degrees using the formula,
$$\mathrm{deg} = \frac{180 \cdot \mathrm{rad}}{\pi}$$
Write another function to convert degrees to radians.
Exercise 5.6. Write functions to compute the diameter, circumference and area of a
circle given its radius. If your language supports pass by reference, compute all three of
these with one function.
Exercise 5.7. The arithmetic-geometric mean of two numbers x, y, denoted M(x, y) (or
agm(x, y)) can be computed iteratively as follows. Initially, $a_1 = \frac{1}{2}(x + y)$
and $g_1 = \sqrt{xy}$ (i.e. the normal arithmetic and geometric means). Then, compute
$$a_{n+1} = \frac{1}{2}(a_n + g_n)$$
$$g_{n+1} = \sqrt{a_n g_n}$$
The two sequences will converge to the same number which is the arithmetic-geometric
mean of x, y. Obviously we cannot compute an infinite sequence, so we compute until
$|a_n - g_n| < \epsilon$ for some small number $\epsilon$.
Exercise 5.8. Write a function to compute the annual percentage yield (APY) given an
annual percentage rate (APR) using the formula
$$\mathrm{APY} = e^{\mathrm{APR}} - 1$$
Exercise 5.9. Write a function that will compute the air distance between two locations
given their latitudes and longitudes. Use the formula as in Exercise 2.14.
Exercise 5.10. Write a function to convert a color represented in the RGB (red-green-blue)
color model (used in digital monitors) to the CMYK (cyan-magenta-yellow-key) color
model used in printing. RGB values are integers in the range [0, 255] while CMYK are
fractional numbers in the range [0, 1]. To convert to CMYK, you first need to scale each
integer value to the range [0, 1] by simply computing
$$r' = \frac{r}{255}, \quad g' = \frac{g}{255}, \quad b' = \frac{b}{255}$$
and then using the following formulas:
$$K = 1 - \max\{r', g', b'\}$$
$$C = (1 - r' - K)/(1 - K)$$
$$M = (1 - g' - K)/(1 - K)$$
$$Y = (1 - b' - K)/(1 - K)$$

Exercise 5.11. Write a function to convert from CMYK to RGB using the following
formulas.
r = 255 · (1 − C) · (1 − K)
g = 255 · (1 − M ) · (1 − K)
b = 255 · (1 − Y ) · (1 − K)

Exercise 5.12. Write some functions to convert an RGB color to a gray scale, “removing”
the color values. An RGB color value is grayscale if all three components have the same
value. To transform a color value to grayscale, there are several possible techniques.
The average method simply sets all three values to the average:
$$\frac{r + g + b}{3}$$
The lightness method averages the most prominent and least prominent colors:
$$\frac{\max\{r, g, b\} + \min\{r, g, b\}}{2}$$
The luminosity technique uses a weighted average to account for a human perceptual
preference toward green:
$$0.21r + 0.72g + 0.07b$$

Exercise 5.13. Adapt the methods to compute a square root in Exercise 4.21 into
functions.

Exercise 5.14. Adapt the methods to compute the natural logarithm in Exercise 4.22
into functions.

Exercise 5.15. Weight (mass in the presence of gravity) can be measured in several
scales: kilogram force (kgf), pounds (lbs), ounces (oz), or Newtons (N). To convert
between these scales, you can use the following facts:

• 1 kgf is equal to 2.20462 pounds

• There are 16 ounces in a pound

• 1 kgf is equal to 9.80665 Newtons

Write a collection of functions to convert between these scales.

Exercise 5.16. Length can be measured by several different units. We will concern
ourselves with the following scales: kilometer, mile, nautical mile, and furlong. A measure
in each one of these scales can be converted to another using the following facts.

• One mile is equivalent to 1.609347219 kilometers

• One nautical mile is equivalent to 1.15078 miles

• A furlong is 1/8-th of a mile

Write a collection of functions to convert between these scales.

Exercise 5.17. Temperature can be measured in several scales: Celsius, Kelvin, Fahren-
heit, and Newton. To convert between these scales, you can use the following conversion
table.

From/To      Celsius          Kelvin                Fahrenheit         Newton

Celsius      –                c + 273.15            (9/5)c + 32        (33/100)c
Kelvin       k − 273.15       –                     (9/5)k − 459.67    0.33k − 90.1395
Fahrenheit   (5/9)(f − 32)    (5/9)f + 255.372      –                  (11/60)f − 88/15
Newton       (100/33)n        (100/33)n + 273.15    (60/11)n + 32      –

Table 5.1.: Conversion Chart

Write a collection of functions to convert between these scales.


Exercise 5.18. Energy can be measured in several different scales: calories (c), joules
(J), ergs (erg) and foot-pound force (ft-lbf) among others. To convert between these
scales, you can use the following facts:

• 1 erg equals $1.0 \times 10^{-7}$ J

• 1 ft-lbf equals 1.3558 joules

• 1 calorie is equal to 4.184 joules

Write a collection of functions to convert between these scales.

Exercise 5.19. Pressure is a measure of force applied to the surface of an object per
unit area. There are several units that can be used to measure pressure:

• Pascal (Pa) which is one Newton per square meter

• Pound-force Per Square Inch (psi)

• Atmosphere (atm) or standard atmospheric pressure

• The torr, an absolute scale for pressure

To convert between these units, you can use the following formulas.

• 1 psi is equal to 6,894.75729 Pascals, 1 psi is equal to 0.06804596 atmospheres

• 1 atmosphere is equal to 101,325 Pascals

• 1 torr is equal to 1/760 atmosphere and 101,325/760 Pascals

Write a collection of functions to convert between these scales.

6. Error Handling

Writing perfect code is difficult. The more complex a system or code base, the more
likely it is to have bugs: flaws or mistakes in a program that result in incorrect
behavior or unintended consequences. The term “bug” has been used in engineering
for quite a while. The term was popularized in the context of computer systems by
Grace Hopper who, when working on the Naval Mark II computer in 1946, tracked a
malfunction to a literal bug, a moth, trapped in a relay [2].

Some of the biggest modern engineering failures can be traced to simple software bugs.
For example, on September 26th, 1983 a newly installed Soviet early warning system
gave indication that nuclear missiles had been launched on the Soviet Union by the
United States. Stanislav Petrov, a lieutenant colonel in the Soviet Air Defense Forces and
duty officer at the time, did not trust the new system and did not report the incident to
superiors who may have ordered a counter strike. Petrov was correct as the false alarm
was caused by sunlight reflections off of high altitude clouds as well as other bugs in the
newly deployed system [28].

In September 1999 the Mars Climate Orbiter, a project intended to study the Martian
climate and atmosphere, was lost after it entered into the upper atmosphere of Mars and
disintegrated. The error was due to a subsystem that measured the craft’s momentum
in a non-standard pound-force second unit when all other systems expected the
standard newton second unit [1]. The loss of the project was calculated at over $125
million.

There are numerous other examples, ranging from bugs that have inconvenienced users
(such as the Zune bug mentioned in Section 4.5.2), to bugs in medical devices that have
cost dozens of lives, to bugs that have resulted in the loss of millions of dollars [6].

In some sense, Software Engineering and programming are unique. If you build a bridge
and forget one bolt, it’s likely not going to cause the bridge to collapse. If you draw
up plans for a development and the land survey is a few inches off overall, it’s not a
catastrophic problem. However, if you forget one character or are off by one number in a
program, it can cause a complete system failure.

There are a variety of reasons why bugs make it into systems. Bugs could be the result
of a fundamental misunderstanding of the problem or requirements. Poor management
and the pressure of time constraints to deliver a project may make developers more
careless. A lack of proper testing may mean many more bugs survive the development
process than otherwise should have. Even expert programmers can overlook a simple
mistake when writing thousands of lines of code.

Given the potential for error, it is important to have good software development
methodologies that emphasize testing a system at all levels. Working in teams where regular
code reviews are held, so that colleagues can examine, critique, and catch potential bugs,
is essential for writing robust code.

Modern coding tools and techniques can also help to improve the robustness of code. For
example, debuggers are tools that help a developer debug a program (that is, find and fix
the cause of an error). Debuggers allow you to run a program statement by statement
and view the current state of the program, such as variable values.
You can “step through” the execution line by line to find where an error occurs in order
to localize and identify a bug.

Other tools allow you to perform static analysis on source code to search for potential
problems. That is, problems that are not syntax errors and are not necessarily bugs
that are causing failures, but instead are anti-patterns or code smells. Anti-patterns
are essentially common bad habits that can be found in code. An anti-pattern is an
attempted solution to a commonly encountered problem that either does not actually solve
the problem or introduces new problems. Code smells are “symptoms” in source code that
indicate a possible deeper design or implementation flaw. Failing to properly initialize
variables or failing to check for null values are examples of smells. Static analysis tools
automatically examine the code base for potential issues like these. For example, a lint
(or linter) is a tool that can examine source code for suspicious or non-portable code or
code that does not comply with generally accepted standards or ways of doing things.

Even if code contains no bugs, it is still susceptible to errors. For example, a program
could connect to a remote database to query and process data. However, if the network
connection is temporarily unavailable, the program will not be able to execute properly.
Because of the potential of such errors, it is important to write code that is not only bug
free but is also robust and resilient. We must anticipate possible error conditions and
write code to detect, prevent, or recover from such errors. Generally, this is referred to
as error handling.

Much of what we now consider Software Engineering was pioneered by people like
Margaret Hamilton who was the lead Apollo flight software designer at NASA. During
the Apollo 11 Moon landing (1969), an error in one system caused the lander’s computer
to become overworked with data. However, because the system was designed with a
robust architecture, it could detect and handle such situations by prioritizing more
important tasks (those related to landing) over lower priority tasks. The resilience that
was built into the system is credited with its success [11].


6.1. Error Handling

In general, errors are potential conditions or situations that can reasonably be anticipated
by a developer. For example, if we write code to open and process a file, there are several
things that could go wrong. The file may not exist, or we may not have permission on
the system to read it, or the formatting in the file may be corrupted or not as expected.
Still yet, everything could be fine with the file, but it may contain erroneous or invalid
values.

If an error can be anticipated, we may be able to write code that detects the particular
error condition and handles it by executing code that may be able to recover from the
error condition. In the case of a missing file for example, we may be able to prompt the
user for an alternate file.

We may be able to detect but not necessarily recover from certain errors. For example, if
the file has been corrupted in the example above, there may not be a way to properly “fix”
it. If it contains invalid data, we may not even want the program to fix it as it may
indicate a bug or other issue that needs to be addressed. Still yet, there may be some
error conditions that we cannot recover from at all as they are completely unexpected.
In such instances, we may want the error to result in the termination of the program in
which case the error is considered fatal.

6.2. Error Handling Strategies

There are several general strategies for performing error handling. We’ll look at two
general methods here: defensive programming and exceptions.

6.2.1. Defensive Programming

Defensive programming is a “look before you leap” strategy. Suppose we have a potentially
“dangerous” section of code; that is, a line or block of code whose execution could encounter
or result in an error condition. Before we execute the code, we perform a check to see
if the error condition is present (usually using a conditional statement). If the error
condition does not hold, then we proceed with the code as normal. However, if the error
condition does hold, instead of executing the code, we execute alternative code that
handles the error.

Suppose we are about to divide by a number. To prevent a division by zero error, we
can check if our denominator is zero or not. If it is, then we raise or handle the error
instead of performing the division. What should be done in such a case? We could, as an
alternative, use a predefined value as a result instead. Or we could notify the user and
ask for an alternative. Or we could log the error and proceed as normal. Or we could
decide that the error is so egregious that it should be fatal and terminate the execution
of the program.
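
The first of these options, substituting a predefined value, might look like the
following Python sketch. The function safe_average and its default parameter are
hypothetical; the other options would replace the body of the if statement.

```python
def safe_average(total, count, default=0.0):
    """Return total / count, or default when count is zero."""
    if count == 0:
        # Error condition detected before the "dangerous" division runs.
        return default
    return total / count

print(safe_average(10, 4))   # 2.5
print(safe_average(10, 0))   # 0.0
```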

Which is the right way to handle this error? It depends on your design requirements.
This raises the question, though: “who” is responsible for making these decisions?
Suppose we’re designing a function for a library that is not just for our project but others
as well (as is the case with the standard library functions). Further, the function we’re
designing could have multiple different error conditions that it checks for. In this scenario
there are two entities that could handle the errors: the function itself and the code that
invokes the function.

Suppose that we decide to handle the errors inside the function. As designers of the
function, we’ve made the decision to handle the errors for the user (the code that invokes
our function). Regardless of how we decide to handle the errors, this design decision has
essentially taken any decision making ability away from users. This is not very flexible
for someone using our code. If they have different design considerations or requirements,
they may need or want to handle the errors in a different way than we did.

Now suppose that we decide not to handle the errors inside our function. Defensive
programming may still be used to prevent the execution of code that results in an error.
However, we now need a way to communicate the error condition to the calling function
so that it can know what type of error happened and handle it appropriately.

Error Codes

One common pattern to communicate errors to a calling function is to use the return
type as an error code. Usually this is an integer type. By convention 0 is used to indicate
“no error” and various other non-zero values are used to indicate various types of errors.
Depending on the system and standard used, error codes may have a predefined value or
may be specific to an application or library.

One problem with using the return type to indicate errors is that functions are no longer
able to use the return type to return an actual computed value. If a language supports
pass by reference, then this is not generally a problem. However, even with such languages
there are situations where the return type must be used to return a value. In such cases,
the function can still communicate a general error message by returning some flag value
such as null.
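
A small sketch of the error code pattern in Python follows. Python cannot pass a
number by reference, so this sketch returns the error code together with the computed
value as a pair, which is a substitute for the pass-by-reference approach described
above. The constant names and the function average_speed are hypothetical.

```python
NO_ERROR = 0                  # by convention, zero means "no error"
ERROR_DIVISION_BY_ZERO = 1
ERROR_NEGATIVE_INPUT = 2

def average_speed(distance, time):
    """Return (error_code, speed); speed is None when an error is reported."""
    if time == 0:
        return ERROR_DIVISION_BY_ZERO, None
    if distance < 0 or time < 0:
        return ERROR_NEGATIVE_INPUT, None
    return NO_ERROR, distance / time

# The calling code, not the function, decides how to handle each error.
code, speed = average_speed(100, 0)
if code == NO_ERROR:
    print("speed:", speed)
elif code == ERROR_DIVISION_BY_ZERO:
    print("error: time must not be zero")    # this branch runs
else:
    print("error: inputs must be non-negative")
```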

Alternatively, a language may support error codes by using a shared global variable that
can be set by a function to indicate an error. The calling function can then examine the
variable to see if an error occurred during the invocation of the function.
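
The shared-variable approach could be sketched as follows in Python. The variable
last_error and the function square_root are hypothetical names chosen for this
example.

```python
NO_ERROR = 0
ERROR_NEGATIVE_INPUT = 1

last_error = NO_ERROR   # shared variable, set by functions that detect an error

def square_root(x):
    """Return the square root of x, or None after setting last_error."""
    global last_error
    if x < 0:
        last_error = ERROR_NEGATIVE_INPUT
        return None
    last_error = NO_ERROR
    return x ** 0.5

result = square_root(-4)
# The return value stays free for results; the caller inspects the variable.
if last_error != NO_ERROR:
    print("square_root reported error code", last_error)
```

A drawback of this design is that the variable only describes the most recent call,
so it must be examined before the next call overwrites it.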


Limitations

Defensive programming has its limitations. Let’s return to the example of processing a
file. To check for all four of the error conditions we identified, we would need a series of
checks similar to the following.

    if file does not exist then
        return an error code
    end
    if we do not have permissions then
        return an error code
    end
    if the file is corrupted then
        return an error code
    end
    if the file contains invalid values then
        return an error code
    end
    process file data

A problem arises when an error condition is checked and does not hold. Then, later in
the execution, circumstances change and the error condition does hold. However, since
it was already checked for, the program remains under the assumption that the error
condition does not hold. For example, suppose that another process or program deletes
the file that we wish to process after its existence has been checked but before we start
processing it. Because of the sequential nature of our program, this type of error checking
is susceptible to these issues.
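
The gap between checking and using can be demonstrated with a short Python sketch.
The function read_if_exists and its between_check_and_use parameter are hypothetical;
the parameter exists only to simulate another process acting during the gap. Python
reports the failed open as an exception, the mechanism described in Section 6.2.2.

```python
import os
import tempfile

def read_if_exists(path, between_check_and_use=None):
    """Check for the file, then open it: the gap between the two is the flaw."""
    if not os.path.exists(path):
        return None                    # error condition detected up front
    if between_check_and_use is not None:
        between_check_and_use()        # circumstances may change here
    with open(path) as f:              # assumes the earlier check still holds
        return f.read()

# Create a small temporary file to work with.
handle, path = tempfile.mkstemp(text=True)
with os.fdopen(handle, "w") as f:
    f.write("data")

print(read_if_exists(path))            # data

# Simulate another process deleting the file after the check has passed.
try:
    read_if_exists(path, between_check_and_use=lambda: os.remove(path))
except FileNotFoundError:
    print("the file vanished after the existence check passed")
```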

6.2.2. Exceptions

An exception is an event or occurrence of an anomalous, erroneous or “exceptional”
condition that requires special handling. Exceptions interrupt the normal flow of control
in a program by handing the flow of control over to exception handlers.

Languages usually support exception handling using a try-catch control structure such
as the following.

    try {
      //potentially dangerous code here
    } catch(Exception e) {
      //exception handling code here
    }

The try is used to encapsulate potentially dangerous code, or simply code that would
fail if an error condition occurs. If an error occurs at some point within the try block,
control flow is immediately transferred to the catch block. The catch block is where
you specify how to handle the exception. If the code in the try block does not result in
an exception, then control flow will skip over the catch statement and resume normally
after.

It is important to understand how exceptions interrupt the normal control flow. For
example, consider the following pseudocode.

    try {
      statement1;
      statement2;
      statement3;
    } catch(Exception e) {
      //exception handling code here
    }

Suppose statement1 executes with no error but that when statement2 executes, it
results in an exception. Control flow is then transferred to the catch block, skipping
statement3 entirely. In general, there may not be a mechanism for your catch block
to recover and execute statement3 . Therefore, it may be necessary to make your
try-catch blocks fine-grained, perhaps having only a single statement within the try
statement.
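
This scenario can be traced with a runnable Python sketch, where the catch block is
spelled except. The three functions stand in for the three statements, and the list
executed is a hypothetical device for recording what actually ran.

```python
executed = []

def statement1():
    executed.append("statement1")

def statement2():
    executed.append("statement2")
    return 1 / 0          # raises ZeroDivisionError

def statement3():
    executed.append("statement3")

try:
    statement1()
    statement2()
    statement3()          # never reached
except Exception as e:
    executed.append("handler: " + type(e).__name__)

print(executed)
# ['statement1', 'statement2', 'handler: ZeroDivisionError']
```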

Some languages only support a generic Exception and the type of error may need to be
communicated through other means such as a string error message. Still other languages
may support many different types of exceptions and you may be able to provide multiple
catch statements to handle each one differently. In such languages, the order in which
you place your catch statements may be important. Similar to an if-else-if statement,
the first one that matches the exception will be the one that executes. Thus, it is best
practice to order your catch blocks from the most specific to the most general.
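
In Python, for example, FileNotFoundError and PermissionError are both more specific
kinds of OSError, which in turn is a kind of Exception. The sketch below orders its
handlers from most specific to most general; describe_failure and the three small
functions that trigger errors are hypothetical.

```python
def describe_failure(action):
    """Run action and report which handler caught its exception."""
    try:
        action()
    except FileNotFoundError:
        return "missing file"          # most specific
    except OSError:
        return "other system error"    # more general
    except Exception:
        return "unexpected error"      # most general
    return "no error"

def missing_file():
    raise FileNotFoundError("report.txt")

def no_permission():
    raise PermissionError("denied")

def bad_number():
    int("twelve")                      # raises ValueError

print(describe_failure(missing_file))    # missing file
print(describe_failure(no_permission))   # other system error
print(describe_failure(bad_number))      # unexpected error
```

If the Exception handler were listed first, it would match every one of these errors
and the more specific handlers would never execute.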

Some languages also support a third finally control statement as in the following
example.


    try {
      //potentially dangerous code here
    } catch(Exception e) {
      //exception handling code here
    } finally {
      //unconditionally executed code here
    }

The try-catch block operates as previously described. However, the finally block
will execute regardless of whether or not an exception was raised. If no exception
was raised, then the try block will fully execute and the finally block will execute
immediately after. If an exception was raised, control flow will be transferred to the
catch block. After the catch block has executed, the finally block will execute.

finally blocks are generally used to handle resources that need to be “cleaned up”
whether or not an exception occurs. For example, consider opening a connection to a database
to retrieve and process data. Whether or not an exception occurs during this process,
the connection will need to be properly closed as it represents a substantial amount of
resources (a network connection, memory and processing time on both the server and
client machines, etc.). Failure to properly close the connection may result in wasted
resources. By placing the clean up code inside a finally statement, we can be assured
that it will execute regardless of an error or exception.
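
The database scenario can be imitated with the Python sketch below. The Connection
class is a stand-in written for this example, not a real database driver, and the
function fetch is likewise hypothetical.

```python
class Connection:
    """A stand-in for a database connection; not a real driver."""

    def __init__(self):
        self.is_open = True

    def query(self, sql):
        if "bad" in sql:
            raise ValueError("malformed query")
        return ["row1", "row2"]

    def close(self):
        self.is_open = False

def fetch(connection, sql):
    """Run a query and always close the connection afterward."""
    try:
        return connection.query(sql)
    except ValueError:
        return []
    finally:
        # Runs whether query() succeeded or raised.
        connection.close()

good = Connection()
print(fetch(good, "select all"), good.is_open)   # ['row1', 'row2'] False

bad = Connection()
print(fetch(bad, "bad query"), bad.is_open)      # [] False
```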

In addition to handling exceptions, a language may allow you to “throw” your own
exceptions by using the keyword throw . In this way you can also practice defensive
programming. You could write a conditional statement to check for an error condition
and then throw an appropriate exception.
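
In Python the keyword is raise rather than throw. A brief sketch, using a
hypothetical circle_area function, combines the defensive check with an exception:

```python
import math

def circle_area(radius):
    """Return the area of a circle, rejecting a negative radius."""
    if radius < 0:
        # Defensive check: signal the error instead of computing nonsense.
        raise ValueError("radius must be non-negative")
    return math.pi * radius * radius

try:
    circle_area(-1)
except ValueError as e:
    print("caught:", e)    # caught: radius must be non-negative
```

Here the function only detects the error; the calling code decides how to handle it
in its own catch block.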

6.3. Exercises

Exercise 6.1. Rewrite the function to compute the GCD in Exercise 5.1 to handle
invalid inputs.
Exercise 6.2. Rewrite the function to compute statistics of a circle in Exercise 5.6 to
handle invalid input (negative radius).
Exercise 6.3. Rewrite the function to compute the annual percentage yield in Exercise
5.8 to handle invalid input.
Exercise 6.4. Rewrite the function to compute air distance in Exercise 5.9 to handle
invalid input (latitude values outside the range [−90, 90] or longitude values outside
the range [−180, 180]).


Exercise 6.5. Rewrite the function to convert from RGB to CMYK in Exercise 5.10 to
handle invalid inputs (values outside the range [0, 255]).

Exercise 6.6. Rewrite the function to convert from CMYK to RGB in Exercise 5.11 to
handle invalid inputs.

Exercise 6.7. Rewrite the square root functions from Exercise 5.13 to handle invalid
inputs.

Exercise 6.8. Rewrite the natural logarithm functions from Exercise 5.14 to handle
invalid inputs.

Exercise 6.9. Rewrite the weight conversion functions from Exercise 5.15 to handle
invalid inputs.

Exercise 6.10. Rewrite the length conversion functions from Exercise 5.16 to handle
invalid inputs.

Exercise 6.11. Rewrite the temperature conversion functions from Exercise 5.17 to
handle invalid inputs.

Exercise 6.12. Rewrite the energy conversion functions from Exercise 5.18 to handle
invalid inputs.

Exercise 6.13. Rewrite the pressure conversion functions from Exercise 5.19 to handle
invalid inputs.

7. Arrays, Collections & Dynamic Memory

Rarely do we ever deal with a single piece of data in a program. Instead, most data is
made up of a collection of similar elements. A program to compute grades would be
designed to operate on an entire roster of students. Scientific data represents a collection
of many different samples or measurements. A bank maintains and processes many
different accounts, etc.

Arrays are a way to collect similar pieces of data together in an ordered collection. The
pieces of data stored in an array are referred to as elements and are stored in an ordered
manner. That is, there is a “first” element, “second” element, etc. and a “last” element.
Though the elements are ordered, they are not necessarily sorted in any particular manner.
Instead, the order is usually determined by the order in which you place elements into
the array.

Some languages only allow you to store the same types of elements in an array. For
example, an integer array would only be able to store integers, an array of floating-point
numbers would only be able to store floating-point numbers. Other languages allow you
to create arrays that can hold mixed elements. A mixed array would be able to hold
elements of any type, so it could hold integers, floating-point numbers, strings, objects,
or even other arrays.

Some languages treat arrays as full-fledged object types that not only hold elements, but
have methods that can be called to manipulate or transform the contents of the array.
Other languages treat arrays as a primitive type, simply using arrays as storage data
structures.

Languages can vary greatly in how they implement and represent arrays of elements, but
many have the same basic usage patterns allowing you to create arrays and manipulate
their contents.


index     0   1   2   3   4   5   6   7   8   9
contents  2   3   5   7  11  13  17  19  23  29

Figure 7.1: An integer array of size 10. Using zero-indexing, the first element is at index
0, the last at index 9.

7.1. Basic Usage

Creating Arrays

Though there can be great variation in how a language uses arrays, there are some
commonalities. Languages may allow you to create static arrays or dynamically allocated
arrays (see Section 7.2 below for a detailed discussion). Static arrays are generally created
using the program stack space while dynamically allocated arrays are stored in the heap.
In either case you generally declare an array by specifying its size. In statically typed
languages, you must also declare the array’s type (integer, floating-point, etc.).
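
As a minimal sketch in C (the language used for the demonstration later in this chapter), the two kinds of array might be created as follows. The function names static_array_sum and create_int_array are hypothetical and chosen only for illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: creates a static array of 5 integers on the stack,
 * fills it, and returns the sum of its elements. The array itself
 * disappears when the function returns. */
int static_array_sum(void) {
    int primes[5] = {2, 3, 5, 7, 11}; /* type and size are declared together */
    int sum = 0;
    for (int i = 0; i < 5; i++) {
        sum += primes[i];
    }
    return sum;
}

/* Hypothetical helper: dynamically allocates an array of n integers on the
 * heap, each initialized to zero. Returns NULL if n is not positive or the
 * allocation fails. The caller is responsible for calling free(). */
int *create_int_array(int n) {
    if (n <= 0) {
        return NULL;
    }
    return calloc((size_t)n, sizeof(int));
}
```

In both cases the size is fixed at creation time, and in both cases C requires the element type (here int) to be stated.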

Indexing Arrays

Once an array has been created you can use it by assigning values to it or by retrieving
values from it. Because there is more than one element, you must specify which element
you are assigning or retrieving. This is generally done through indexing. An index is
an integer that specifies an element in the array. The index is used in conjunction with
(usually) square brackets and the array’s identifier. For example, if the array’s identifier
is arr and the index is an integer value stored in the variable i , we would refer to the
i-th element using the syntax arr[i] . An example is presented in Figure 7.1.

For most programming languages, indices start at zero. This is known as zero-indexing.¹
Thus, the first element is at arr[0] , the second at arr[1] etc. When an array is
stored in memory, each element is usually stored one after the other in one contiguous
space in memory. Further, each element is of a specific type which is represented using a
fixed number of bytes in memory. Thus the index actually acts as an offset in memory
from the beginning of the array. For example, if we have an array of integers that each
take 4 bytes, then the 5th element would be stored at index 4, which is an offset
equal to 4 × 4 = 16 bytes away from the beginning of the array. The first element, being
at index 0 is 4 × 0 = 0 bytes from the beginning of the array (that is, the first element is
at the beginning of the array).
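
The offset arithmetic can be checked directly in C. The sketch below uses a hypothetical helper, byte_offset, to measure the distance in bytes between the start of an integer array and one of its elements. It makes no assumption that an int is 4 bytes (that is common but not guaranteed), so the result is compared against index × sizeof(int).

```c
#include <stddef.h>

/* Hypothetical helper: returns how many bytes lie between the start of an
 * int array and its element at the given index. */
size_t byte_offset(const int *arr, size_t index) {
    const char *start = (const char *)arr;
    const char *element = (const char *)&arr[index];
    return (size_t)(element - start);
}
```

For the array in Figure 7.1, byte_offset(arr, 4) equals 4 * sizeof(int), which is 16 on a system with 4-byte integers, and byte_offset(arr, 0) is 0.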

¹ Some languages do use 1-indexing, but there are very strong arguments in favor of zero-indexing [13].

Once an element has been indexed, it can essentially be treated as a regular variable,
assigning and retrieving values as you would regular variables. Care must be taken so
that you do not make a reference to an element that does not exist, for example by using
a negative index or an index i ≥ n in an array of n elements. Depending on the language,
indexing an array element that is out of bounds may result in undefined behavior, an
exception being thrown, or a corruption of memory.
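
C performs no bounds checking of its own, so one defensive option is to wrap element access in a function that validates the index first. The following is only a sketch; the name get_element and its true/false return convention are assumptions made for illustration, not a standard library facility.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: copies arr[i] into *out and returns true when i is a
 * valid index for an array of n elements; returns false and leaves *out
 * untouched when i is negative or i >= n. */
bool get_element(const int *arr, int n, int i, int *out) {
    if (arr == NULL || out == NULL || i < 0 || i >= n) {
        return false;
    }
    *out = arr[i];
    return true;
}
```

The caller checks the returned value before trusting the contents of out, which turns an out-of-bounds access into an ordinary error condition.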

Iteration

Since we can simply index elements in an array, it is natural to iterate over elements
in an array using a for loop. We can create a for loop using an index variable i which
starts at 0, and increments by one on each iteration, accessing the i-th element using the
syntax described above, arr[i] . One issue is that such a loop would need to know how
large the array is in order to define a terminating condition.

n ← size of array arr
for (i ← 0; i < n; i ← (i + 1)) do
    process element arr[i]
end

Some languages build the size of the array into a property that can be accessed. Java,
for example, has an arr.length property. Other languages provide functions that you
can use to determine their size. Still other languages (such as C), place the burden of
“bookkeeping” the size of an array on you, the programmer. Whenever you pass an array
to a function you need to also pass a size parameter that informs the function of how
many elements are in the array. Yet other functions may also require you to pass the
size of each element in the array.
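
In C, this bookkeeping typically looks like the sketch below, in which a hypothetical function sum_array receives both the array and its size because the array alone carries no length information.

```c
#include <stddef.h>

/* Hypothetical helper: sums the first n elements of arr. The size n must be
 * supplied by the caller; the function cannot discover it from arr. */
long sum_array(const int *arr, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += arr[i];
    }
    return total;
}
```

Where a static array is declared, its element count can be computed as sizeof(arr) / sizeof(arr[0]). This does not work inside a function that received the array as a parameter, because there the parameter is only a pointer.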

Some languages also support a basic foreach loop (cf. Section 4.4). A foreach loop is
syntactic sugar that allows you to iterate over the elements in an array (usually in order)
without the need for boilerplate code that creates and increments an index variable.

foreach element a in arr do
    process element a
end

The actual syntax may vary if a language supports such a loop.


Using Arrays in Functions

Most programming languages allow you to use arrays as both function parameters and as
return types. You can pass arrays to functions and functions can be defined that return
arrays. Typically, when arrays are passed to functions, they are passed by reference so
as to avoid making an entirely new copy of the array. As a consequence, if the function
makes changes to the array elements, those changes may be realized in the calling function.
Sometimes you may want this behavior. However, sometimes you may not want the
function to make changes to the passed array. Some languages allow you to use various
keywords to prevent any changes to the passed array.
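
In C, for instance, a function that receives an array actually receives the address of its first element, so it operates on the caller's data. The sketch below contrasts a function that modifies the array with one that uses the const keyword to promise not to. Both function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: doubles every element in place. Because the function
 * receives the address of the caller's array, the caller sees the change. */
void double_all(int *arr, size_t n) {
    for (size_t i = 0; i < n; i++) {
        arr[i] *= 2;
    }
}

/* Hypothetical helper: returns the largest element. The const qualifier
 * makes the compiler reject any assignment to arr[i] inside this function.
 * Assumes n >= 1. */
int max_element(const int *arr, size_t n) {
    int max = arr[0];
    for (size_t i = 1; i < n; i++) {
        if (arr[i] > max) {
            max = arr[i];
        }
    }
    return max;
}
```

After calling double_all on an array holding 1, 5, 3, the caller's array holds 2, 10, 6; calling max_element leaves it unchanged.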

If a function is designed to return an array, care must be taken to ensure that the correct
type of array is returned. Recall that static arrays are allocated on the call stack. It
would be inappropriate to return static arrays from a function as the array is part of
the stack frame that is destroyed when the function returns control back to the calling
function (we discuss this in detail below). Instead, if a function returns an array, it
should be an array that has been dynamically allocated (on the heap).

7.2. Static & Dynamic Memory

Recall that arrays can be declared as static arrays, meaning that when you declare them,
they are allocated and stored on the program’s call stack within the stack frame in which
they are declared. For example, if a function foo() creates a static array of 5 integers,
each requiring 4 bytes, then 20 contiguous bytes are allocated on the stack frame for
foo() to store the static array.

This can cause several potential problems. First, the typical stack space allocated for a
program is relatively small. It can be as large as a couple of Megabytes (MBs) or on
some older systems or those with limited resources, even on the order of a few hundred
Kilobytes (KBs). When dealing with data of any substantial size, you could quickly run
out of stack space, resulting in a stack overflow.

Another problem arises if we want to return a static array from a function. Recall that
when a function returns control to the calling function, its stack frame is popped off the
top and goes out-of-scope (see Section 5.2). Since the array is part of the stack frame of
the function, it too goes out of scope. Depending on how the system works, the contents
of the frame may be completely changed in the “bookkeeping” process of returning from
the function. Even if the contents remain unchanged when the function returns, they
will almost certainly be overwritten when another function call is made and a new stack
frame overwrites the old one. For these reasons, static arrays are of very limited use.
They must be kept small and cannot be returned from a function.


Demonstration in C

To make this concept a bit more clear, we’ll use a concrete example in the C programming
language. Consider the program code in Figure 7.2. Here, we have a function foo()
that creates a static integer array of size 5, int b[5]; . This memory is allocated on
the stack frame and then assigned values. Printing them in the function will give the
expected results,

b[0] = 5
b[1] = 10
b[2] = 15
b[3] = 20
b[4] = 25

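#include <stdlib.h>
#include <stdio.h>

/* Fills a static (stack-allocated) array and returns its address.
 * This is deliberately incorrect: b stops existing when foo() returns. */
int * foo(int n) {
    int i;
    int b[5];
    for(i=0; i<5; i++) {
        b[i] = n*(i+1);
        printf("b[%d] = %d\n", i, b[i]);
    }
    return b; /* pointer into a stack frame that is about to be popped */
}

int main(int argc, char **argv) {
    int i, m = 5;
    int *a = foo(m); /* a now refers to the stale stack frame */
    for(i=0; i<5; i++) {
        printf("a[%d] = %d\n", i, a[i]); /* undefined behavior */
    }
    return 0;
}

Figure 7.2: Example returning a static array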


However, when the function foo() ends execution and returns control back to the
main() function (sometimes called unwinding), the contents of foo()’s stack frame
are altered as part of the process. Some of the contents are the same, but other elements
have been completely altered. Printing the “returned” contents of the array gives us garbage:

a[0] = 1564158624
a[1] = 32767
a[2] = 15
a[3] = 20
a[4] = -626679356

This is not an issue when returning primitive types as the return value is placed in a
special memory location available to the calling function. Even in our example, the
return value is properly communicated to the calling function: it’s just that the returned
value is a pointer to the array’s location (which happens to be a memory address in the
“stale” stack frame). The stack frames are depicted in Figure 7.3.

7.2.1. Dynamic Memory

The solution to the problems presented by static arrays is to use dynamically allocated
arrays. Such arrays are not allocated and stored in the program call stack, but instead
are stored in another area called the heap. In fact, because of the shortcomings of static
arrays, some languages only allow you to use dynamically allocated arrays even if it is
done so implicitly.

Recall that when a program is loaded into memory, its code is placed on the program call
stack in memory. On top of that is any static data (global variables or data for example).
Subsequent function calls place new stack frames on top of that. In addition, at the
“other end” of the program’s memory space, the heap is allocated. The heap is a “pile” of
memory like the stack is, but it is less structured. The stack is one contiguous piece of
memory, but the heap may have “holes” in it as various chunks of it are allocated and
deallocated for the program. In general, the heap is larger but allocation and deallocation
is a more involved and more costly process while the stack is smaller and faster to
allocate/deallocate stack frames from.

Initially, a program is allocated a certain amount of memory for its heap. When a
program attempts to dynamically allocate memory, say for a new array of integers, it
makes a request to the operating system for a certain amount of memory (a certain
number of bytes). The system responds by allocating a chunk of the heap memory which
the program can then use to store elements in the array. Usually, the location of this
memory is stored as a pointer or reference. The reference is stored in a variable in a stack
frame, but the actual contents of the array are stored in the heap space.
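
One way the foo() function of Figure 7.2 might be repaired is to allocate the array on the heap instead. The following is a sketch only: the name foo_dynamic is hypothetical, the printing has been removed, and the values stored (n, 2n, ..., 5n) are chosen to match the output shown for b earlier.

```c
#include <stdlib.h>

/* Hypothetical heap-based variant of foo(): returns a pointer to 5 integers
 * holding n*1, n*2, ..., n*5, or NULL if the allocation fails. The memory
 * remains valid after the function returns; the caller must free() it. */
int *foo_dynamic(int n) {
    int *b = malloc(5 * sizeof(int));
    if (b == NULL) {
        return NULL;
    }
    for (int i = 0; i < 5; i++) {
        b[i] = n * (i + 1);
    }
    return b; /* the pointer lives on the stack, the array on the heap */
}
```

Here only the pointer b is part of the stack frame. The five integers it refers to are in the heap and survive the return, so the calling function can read them safely and later release them with free().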

Depending on the language and system, if a program uses all of its heap space and runs
out, the operating system may terminate the program or it may attempt to reserve even
more memory for the program.

Memory Management

If a program no longer needs a dynamically allocated memory space, it should “clean up”
after itself by deallocating or “freeing” the memory, releasing it back to the heap space
so that it can be reused either by the program or some other program or process on the
system. The process of allocating and deallocating memory is generally referred to as
memory management. If a program does not free up memory, it may eventually run out
and be forced to terminate. Even if it does not necessarily run out of available memory,
its performance may degrade or it may impact system performance.

If a program has poor memory management and fails to deallocate memory when it is
no longer needed, the memory “leaks”: the available memory gradually runs out because
it is not released back to the heap for reallocation. Programs with such poor memory
management are said to have a memory leak. Sometimes this is a consequence of a
lost reference: a program dynamically allocates a chunk of memory but then,
due to carelessness, loses the only reference to the memory chunk, making it impossible to
free up.
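
A common place to lose a reference in C is when resizing an array: writing the result of realloc() straight back into the only pointer discards the original block's address if the call fails. The sketch below shows one way to avoid that. The helper name resize_int_array and its 1/0 return convention are assumptions for illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: resizes a heap-allocated int array to new_n elements.
 * On success the new pointer is stored in *arr and 1 is returned. On failure
 * *arr is left untouched, so the caller still holds a valid reference to the
 * original block and can free it later; 0 is returned. */
int resize_int_array(int **arr, size_t new_n) {
    if (arr == NULL || new_n == 0) {
        return 0;
    }
    int *tmp = realloc(*arr, new_n * sizeof(int));
    if (tmp == NULL) {
        return 0; /* keep the old pointer; overwriting it would leak */
    }
    *arr = tmp;
    return 1;
}
```

Whatever the outcome, exactly one pointer to the live block remains, and the program still owes exactly one call to free().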

Some languages have automatic garbage collection that handles memory management for
us. The language itself is able to monitor the dynamically allocated pieces of memory
and determine if any variable in the program still references it. If the memory is no
longer referenced, it is “garbage” and becomes eligible to be “collected.” The system
itself then frees the memory and makes it available to the program or operating system.
In such “memory managed” languages, we are responsible for allocating memory, but are
not (necessarily) responsible for deallocating it.

Even if a language offers automated memory management, it is still possible to have
memory leaks and other memory allocation issues. Automated memory management
does not solve all of our memory management problems. Moreover, it comes at a cost.
The additional resources and overhead required to monitor memory can have a significant
performance cost. However, with modern garbage collection systems and algorithms,
the performance gap between garbage collected languages and user-managed memory
languages has been shrinking. In any case, all program memory is reclaimed by the
operating system when the program terminates.


7.2.2. Shallow vs. Deep Copies

In most languages, an array variable is actually a reference to the array in memory.
We could create an array referred to by a variable A and then create another reference
variable B and set it “equal” to A. However, this is simply a shallow copy. Both the
reference variables refer to the same data in memory. Consequently, if we change the
value of an element in one, the change is realized in both.

Often, we want a completely different copy, referred to as a deep copy. With a deep copy,
A and B would refer to different memory blocks. Changes to one would not affect the
other. This distinction is illustrated in Figure 7.5.
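
In C the difference might look like the following sketch. Assigning one pointer to another (int *b = a;) produces a shallow copy, while the hypothetical helper deep_copy allocates a separate block and copies the elements into it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a deep copy of the first n elements of src in
 * a newly allocated block, or NULL if src is NULL, n is zero, or the
 * allocation fails. The caller must free() the result. */
int *deep_copy(const int *src, size_t n) {
    if (src == NULL || n == 0) {
        return NULL;
    }
    int *copy = malloc(n * sizeof(int));
    if (copy != NULL) {
        memcpy(copy, src, n * sizeof(int));
    }
    return copy;
}
```

If a holds 1, 2, 3 and b is a shallow copy of a, then setting b[0] = 100 also changes a[0]. A deep copy made beforehand with deep_copy(a, 3) still holds 1 in its first element.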

7.3. Multidimensional Arrays

A normal array is usually one dimensional. One can think of an array as a single “row” in a
table that contains a certain number of entries. Most programming languages allow you
to define multidimensional arrays. For example, two dimensional arrays would model
having multiple rows in a full table. You can also view two dimensional arrays as matrices
in mathematics. A matrix is a rectangular array of numbers that has a certain number
of rows and a certain number of columns.

As an example, consider the following 2 × 3 matrix (it has 2 rows and 3 columns):

\[
\begin{bmatrix}
1 & 9 & -8 \\
2.5 & 3 & 5
\end{bmatrix}
\]
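
In C, such a matrix might be stored as a two dimensional array and indexed with two pairs of square brackets, the row index first and the column index second. The sketch below is illustrative only; the constants ROWS and COLS and the helper row_sum are hypothetical names.

```c
#include <stddef.h>

#define ROWS 2
#define COLS 3

/* Hypothetical helper: sums the entries in row i of a ROWS x COLS matrix. */
double row_sum(double m[ROWS][COLS], size_t i) {
    double sum = 0.0;
    for (size_t j = 0; j < COLS; j++) {
        sum += m[i][j]; /* row index first, column index second */
    }
    return sum;
}
```

Declaring double m[ROWS][COLS] = {{1, 9, -8}, {2.5, 3, 5}}; stores the matrix above, with m[0][2] holding −8 and m[1][0] holding 2.5 under zero-indexing.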

In mathematics, entries in a matrix are indexed via their row and column. For example,
$a_{i,j}$ would refer to the element in the i-th row and j-th column. Referring to the row
first and column second is referred to as row major ordering. If the number of rows and
the number of columns are the same, the matrix is referred to as a square matrix. For
example, the following is a square, 10 × 10 matrix.
 
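\[
\begin{bmatrix}
2 & 68 & 9 & 44 & 80 & 79 & 77 & 59 & 27 & 2 \\
3 & 86 & 22 & 42 & 58 & 24 & 45 & 39 & 7 & 47 \\
5 & 7 & 17 & 12 & 29 & 56 & 68 & 14 & 65 & 3 \\
7 & 35 & 64 & 69 & 79 & 56 & 52 & 77 & 82 & 85 \\
11 & 55 & 36 & 5 & 25 & 6 & 22 & 25 & 72 & 37 \\
13 & 20 & 37 & 74 & 3 & 53 & 87 & 70 & 3 & 78 \\
17 & 72 & 68 & 26 & 11 & 6 & 63 & 70 & 29 & 16 \\
19 & 59 & 6 & 26 & 87 & 18 & 82 & 27 & 75 & 19 \\
23 & 73 & 30 & 80 & 51 & 14 & 34 & 67 & 59 & 58 \\
29 & 48 & 2 & 39 & 18 & 21 & 33 & 28 & 40 & 34
\end{bmatrix}
\]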
