Introduction to R
Engr. (Dr.) Onoriode Avbenake
19th September, 2025
Contents
1. General Introduction
2. Creating Vectors and Matrices
3. Data Frames
GENERAL INTRODUCTION
Theory
R is a scripting language for statistical data manipulation and analysis. It was
created by Ross Ihaka and Robert Gentleman at the University of Auckland and was
inspired by the S language developed at AT&T's Bell Laboratories.
It is free of charge and open source, distributed under the GNU General Public
License.
It is an open-source implementation of the widely regarded S statistical
language, and it is a standard among professional statisticians.
It is available for Windows, Mac, and Linux operating systems.
In addition to providing statistical operations, R is a general-purpose
programming language, so it can be used to automate analyses and create new
functions.
It incorporates features found in object-oriented and functional programming
languages.
R can save data sets between sessions (by saving the workspace), so one does not
need to reload them each time. It saves one's command history too.
Because R is open source software, it is easy to get help from the user
community. Also, a lot of new functions are contributed by users, many of whom
are prominent statisticians.
The functional programming nature of the R language offers many advantages:
• Clearer, more compact code
• Potentially much faster execution speed
• Less debugging, because the code is simpler
• Easier transition to parallel programming
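As a small illustration of the "clearer, more compact code" point, the sketch below computes the squares of a vector three ways. The variable names are illustrative only and are not from the original slides.

```r
x <- c(1, 2, 3, 4, 5)

# Loop style: preallocate the result, then fill it one element at a time
squares_loop <- numeric(length(x))
for (i in seq_along(x)) {
  squares_loop[i] <- x[i]^2
}

# Functional style: apply an anonymous function to every element
squares_fun <- sapply(x, function(v) v^2)

# Vectorized style: arithmetic operators act on whole vectors
squares_vec <- x^2
```

All three give the same result; the last two need no index bookkeeping.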
R operates in two modes: interactive and batch. In interactive mode, users key in
commands, R displays results, they key in more commands, and so on.
On the other hand, batch mode does not require interaction with the user. It is
useful for production jobs, such as when a program must be run periodically, say
once per day, because one can automate the process.
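A minimal sketch of a script suited to batch mode is shown below. The file name daily_summary.R and the data values are hypothetical; a real job would typically read its data from a file.

```r
# daily_summary.R (hypothetical file name)
# Run without user interaction from a terminal with:
#   Rscript daily_summary.R

values <- c(12, 15, 9, 20)   # stand-in for data read from a file
result <- mean(values)       # the analysis step

# cat() writes plain text to the console (or to a log in batch mode)
cat("Mean of today's values:", result, "\n")
```

The same lines can be typed one at a time at the R prompt, which is the interactive mode described above.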
CREATING VECTORS AND MATRICES
Vectors
Let us make a simple data set consisting of the numbers 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
name it w, and play around with it.
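The original slide's code is not available here, so the following is a sketch of one way to create w and explore it, using base R functions only.

```r
w <- 1:10      # the integers 1 to 10; c(1, 2, ..., 10) also works

w              # print the vector
length(w)      # number of elements: 10
sum(w)         # 55
mean(w)        # 5.5
w[3]           # third element: 3
w * 2          # arithmetic is applied to every element
```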
We can do more with these vectors
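A few further vector operations are sketched below. The choice of operations is an assumption about what "more" covers; w2 is an illustrative name.

```r
w <- 1:10

w[w > 5]             # keep only elements greater than 5: 6 7 8 9 10
rev(w)               # reverse the order
w2 <- c(w, 11, 12)   # append two more values to make a longer vector
cumsum(w)            # running total
sd(w)                # standard deviation
summary(w)           # minimum, quartiles, mean, and maximum
```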
This includes plotting a histogram.
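A histogram of w can be drawn with the base graphics function hist(). The title, axis label, and colour below are illustrative choices.

```r
w <- 1:10

# hist() draws the plot and also returns the binning it used
h <- hist(w, main = "Histogram of w", xlab = "w", col = "lightblue")

h$breaks   # bin boundaries chosen by R
h$counts   # number of values falling in each bin
```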
Matrices
An R matrix corresponds to the mathematical concept of the same name: a
rectangular array of numbers. Technically, a matrix is a vector, but with two
additional attributes:
the number of rows, and
the number of columns.
Matrices can be built row by row with rbind() or column by column with cbind().
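The sketch below builds small matrices with matrix(), rbind(), and cbind(). The values are illustrative only.

```r
# matrix() fills column by column by default
m <- matrix(1:6, nrow = 2, ncol = 3)
dim(m)                               # rows and columns: 2 3

# rbind() stacks vectors as rows
r <- rbind(c(1, 2, 3), c(4, 5, 6))   # 2 x 3

# cbind() places vectors side by side as columns
k <- cbind(c(1, 2, 3), c(4, 5, 6))   # 3 x 2
```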
These matrices can also be manipulated.
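Some common manipulations are sketched below on an illustrative 2 x 2 matrix. The names mt and m2 are assumptions for this example.

```r
m <- rbind(c(1, 2), c(3, 4))

m + 10         # add 10 to every element
m * 2          # double every element
mt <- t(m)     # transpose: rows become columns

m2 <- m        # work on a copy so m stays unchanged
m2[1, 2] <- 0  # overwrite the element in row 1, column 2
```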
Elements, rows, and columns can be extracted from the original matrix, and
matrices can even be multiplied.
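Extraction and multiplication can be sketched as follows. Note that R distinguishes the matrix product (%*%) from element-wise multiplication (*); the names p and e are illustrative.

```r
m <- rbind(c(1, 2), c(3, 4))

m[2, 1]       # single element in row 2, column 1: 3
m[1, ]        # entire first row: 1 2
m[, 2]        # entire second column: 2 4

p <- m %*% m  # matrix product
e <- m * m    # element-wise product
```

For this m, the matrix product has 7 and 22 on its diagonal, while the element-wise product has 1 and 16.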
DATA FRAMES
Data Frames
A data frame is like a matrix, with a two-dimensional rows-and-columns structure.
Data frames are the heterogeneous analogs of matrices for two-dimensional data:
each column may hold a different type of data.
For instance, one column may consist of numbers, and another column might
have character strings.
Data frames can be created with the data.frame() function.
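A minimal sketch of creating a data frame with one character column and one numeric column follows. The names kids and ages and their values are illustrative, not taken from the original slide.

```r
kids <- c("Jack", "Jill")   # character column
ages <- c(12, 10)           # numeric column

# stringsAsFactors = FALSE keeps kids as character strings in all R versions
d <- data.frame(kids, ages, stringsAsFactors = FALSE)

d        # print the data frame
str(d)   # show each column's type
```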
Data frames can be accessed/called in one of three (3) ways.
The idea of a data frame is to encapsulate multiple pieces of data, along with
variable names, into one object. The contents can also be inspected directly.
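The slide does not list the three ways in text, so the sketch below assumes they are list-style indexing, the $ operator, and matrix-style indexing. The data frame d is illustrative.

```r
d <- data.frame(kids = c("Jack", "Jill"), ages = c(12, 10))

a <- d[[1]]    # list-style: first component
b <- d$kids    # by name with the $ operator
k <- d[, 1]    # matrix-style: all rows, first column

str(d)         # compact view of the structure
names(d)       # the variable (column) names
head(d)        # the first rows
```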
A data frame can hold more than two columns of data.
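As an illustration, the sketch below builds a data frame with four columns. The name exams, its column names, and its values are hypothetical.

```r
exams <- data.frame(
  student = c("A", "B", "C"),
  exam1   = c(2.0, 3.3, 4.0),
  exam2   = c(3.3, 2.0, 4.0),
  quiz    = c(4.0, 3.7, 4.0)
)

exams        # print all rows and columns
dim(exams)   # 3 rows, 4 columns
```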
Extracting sub-data from frames
We can filter data frames in two (2) ways.
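The slide does not name the two ways in text; the sketch below assumes they are logical indexing and the subset() function. The data frame exams is hypothetical.

```r
exams <- data.frame(
  student = c("A", "B", "C"),
  exam1   = c(2.0, 3.3, 4.0),
  quiz    = c(4.0, 3.7, 4.0)
)

# Extracting sub-data by position or by column name
exams[2:3, ]                    # rows 2 and 3, all columns
exams[, c("student", "exam1")]  # all rows, two named columns

# Way 1: logical indexing on the rows
high1 <- exams[exams$exam1 >= 3.0, ]

# Way 2: subset(), which evaluates the condition inside the data frame
high2 <- subset(exams, exam1 >= 3.0)
```

Both approaches return the rows for students B and C.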
Also, we can filter on more than one property.
rbind() and cbind() can be used to add data to already existing data frames.
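A sketch of both ideas follows: conditions are combined with & (and) or | (or), rbind() appends rows, and cbind() appends columns. All names and values are hypothetical.

```r
exams <- data.frame(
  student = c("A", "B", "C"),
  exam1   = c(2.0, 3.3, 4.0),
  quiz    = c(4.0, 3.7, 4.0),
  stringsAsFactors = FALSE
)

# Filter on two properties at once
both <- exams[exams$exam1 >= 3.0 & exams$quiz >= 3.8, ]

# Add a row: the new row must have the same column names
new_row <- data.frame(student = "D", exam1 = 3.0, quiz = 3.5,
                      stringsAsFactors = FALSE)
exams_more_rows <- rbind(exams, new_row)

# Add a column: the new column must have one value per existing row
exams_more_cols <- cbind(exams, exam2 = c(3.3, 2.0, 4.0))
```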
A new component can be added to an already existing list or data frame at any
time.
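Assigning to a name that does not yet exist creates the new component. The sketch below shows this for a data frame and for a plain list; all names are hypothetical.

```r
exams <- data.frame(
  student = c("A", "B", "C"),
  exam1   = c(2.0, 3.3, 4.0),
  quiz    = c(4.0, 3.7, 4.0)
)

# New column computed from existing columns
exams$total <- exams$exam1 + exams$quiz

# The same idea works for an ordinary list
lst <- list(a = 1)
lst$b <- "two"
```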
Two or more data frames can be combined/merged/joined based on a common
vector in each.
N.B.: If a value of the common vector is absent from one data frame, the corresponding row is not included in the merged result.
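A minimal sketch with merge() is given below, using hypothetical data frames d1 and d2 that share the column kids. Only names present in both data frames survive the merge.

```r
d1 <- data.frame(
  kids   = c("Jack", "Jill", "Jillian", "John"),
  states = c("CA", "MA", "MA", "HI")
)
d2 <- data.frame(
  ages = c(10, 7, 12),
  kids = c("Jill", "Lillian", "Jack")
)

# merge() joins on the column name the two data frames share ("kids")
d <- merge(d1, d2)
d
```

Jillian and John (missing from d2) and Lillian (missing from d1) are dropped. Passing all = TRUE to merge() keeps unmatched rows and fills the gaps with NA.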
There were errors in the two commands. The error messages should be easy to
interpret, and the commands easy to correct.
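The two commands from the slide are not reproduced in this text. As a purely hypothetical illustration of reading and fixing an error, the sketch below misspells a column name in merge(), captures the message with tryCatch(), and then corrects the call.

```r
d1 <- data.frame(kids = c("Jack", "Jill"), states = c("CA", "MA"))
d2 <- data.frame(kids = c("Jill", "Jack"), ages = c(10, 12))

# Deliberate typo: the column is called "kids", not "kid"
msg <- tryCatch(
  merge(d1, d2, by = "kid"),
  error = function(e) conditionMessage(e)   # return the error text
)
msg   # the message points at the 'by' argument

# Corrected command
fixed <- merge(d1, d2, by = "kids")
```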
The merge() function has named arguments by.x and by.y, which handle cases in
which variables/vectors hold the same information but have different names in
the two data frames.
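In the sketch below the names are stored in a column called kids in one data frame and pals in the other; both data frames are hypothetical. by.x names the key column in the first data frame and by.y the key column in the second.

```r
d1 <- data.frame(
  kids   = c("Jack", "Jill", "Jillian", "John"),
  states = c("CA", "MA", "MA", "HI")
)
d3 <- data.frame(
  ages = c(12, 10, 7),
  pals = c("Jack", "Jill", "Lillian")
)

# Match d1$kids against d3$pals
d <- merge(d1, d3, by.x = "kids", by.y = "pals")
d
```

The key column in the result takes its name from the first data frame (kids).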
Next Topic: Frequency Distributions and Graphs
Thank you
For more information, please contact:
Nigerian University of Technology and Management
6, Freetown Road, Apapa, Lagos
Email: info@[Link]