Base R Functions and Data Structures

The document provides an overview of basic R functions and data structures, including vectors, matrices, data frames, and lists, along with their creation and manipulation methods. It covers essential functions for data type coercion, string manipulation, and applying functions across data structures. Additionally, it introduces various functions such as apply(), sapply(), and lapply() for efficient data processing.

Uploaded by

Alyssa

Base R Functions

2024-10-01

Basic Functions & Data Structures

• Type of Classes: numeric, character, logical, factor

#Assignment Operator
a <- 2.7
b <- "hello"
c <- TRUE
a

## [1] 2.7

b

## [1] "hello"

c

## [1] TRUE

#class() Function
class(a)

## [1] "numeric"

class(b)

## [1] "character"

is.logical(c)

## [1] TRUE

• Coercion Functions: to convert data types

as.numeric()
as.logical()
as.character()
as.factor()
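
As a minimal sketch of how the coercion functions above behave (the variable names are illustrative and not part of the original notes): each call converts a value to the target type, and a string that cannot be read as a number becomes NA.

```r
# Illustrative values; all names here are hypothetical
num_from_chr <- as.numeric("3.14")     # character -> numeric
lgl_from_num <- as.logical(0)          # 0 -> FALSE, any non-zero number -> TRUE
chr_from_num <- as.character(2.7)      # numeric -> character
fct_from_chr <- as.factor(c("low", "high", "low"))  # character -> factor

# A string that is not a number is coerced to NA (R also emits a warning,
# which is silenced here so the sketch runs quietly)
bad <- suppressWarnings(as.numeric("abc"))

levels(fct_from_chr)   # factor levels are sorted: "high" "low"
```

Checking the result with is.na() after a coercion is a cheap way to spot values that failed to convert.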

Data Structure   Features & Usage
Vector           combine function: c( )
                 selecting elements: vector[index OR index range]
                 replacing elements with a new value: vector[index] <- new_value
Matrix           matrix(values, nrow = , byrow = TRUE); byrow = FALSE fills by column 1st instead
                 *elements only take in 1 datatype
                 select elements: matrix[row_no, col_no]
Dataframe        columns are named & can be of different class types
List             retrieving names of objects in the list: names(list)
                 selecting components in a list: list[[index]]

#Creating a vector of length 3

d <- c(1,2,3) #c() is a combine function that joins comma-separated data types into a vector
d

## [1] 1 2 3

#Creating a sequence of integers

x <- -2:2
x

## [1] -2 -1 0 1 2

y <- 2^x
y

## [1] 0.25 0.50 1.00 2.00 4.00

#Select elements within a vector *indexing in R is 1-based

#--> bracket syntax looks like Python's, but Python is 0-based and its slices exclude the end index
#1st element:
y[1]

## [1] 0.25

#Range:
y[2:4]

## [1] 0.5 1.0 2.0

#Replacing elements in a vector with a new value

y[1] <- -3
y

## [1] -3.0 0.5 1.0 2.0 4.0

#Class of Vector --> dependent on the property/nature of its elements

firstname <- c("adam", "brian", "cathy")# character, length = 3
avg <- c(1.2) # numeric, length = 1
pass <- c(TRUE, FALSE, TRUE) # logical, length = 3
class(pass)

## [1] "logical"
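
Because the class of a vector depends on its elements, mixing types inside c() forces R to pick one common type. The sketch below illustrates this with made-up values (the names mixed_num and mixed_chr are hypothetical):

```r
# A vector holds a single type, so mixed inputs are coerced to the most
# general one: logical -> numeric -> character
mixed_num <- c(1, TRUE, FALSE)   # logicals become 1 and 0
mixed_chr <- c(1, "a", TRUE)     # everything becomes character

class(mixed_num)   # "numeric"
class(mixed_chr)   # "character"
mixed_chr          # "1" "a" "TRUE"
```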

#--------------
#2) MATRIX
#--------------
#elements only take in one datatype
mat <- matrix(1:12, nrow = 4, byrow = TRUE) #byrow FALSE fills by column first instead, reverses dimensi
mat

##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8    9
## [4,]   10   11   12

dim(mat)

## [1] 4 3

#Select elements within a matrix

#entry at 2nd row & 3rd column
mat[2, 3]

## [1] 6

#select elements in 1st & 2nd rows

mat[c(1,2), ]

##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
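
For comparison, here is a short sketch of the same values filled column by column, plus column selection. The name mat_bycol is illustrative and not from the original notes.

```r
# Same values 1:12, but filled column by column (byrow = FALSE is the default)
mat_bycol <- matrix(1:12, nrow = 4, byrow = FALSE)

dim(mat_bycol)        # still 4 rows and 3 columns
mat_bycol[, 1]        # first column: 1 2 3 4
mat_bycol[2, 3]       # row 2, column 3: 10
mat_bycol[, c(1, 3)]  # columns 1 and 3, returned as a 4 x 2 matrix
```

Leaving the row position empty before the comma selects every row, mirroring how mat[c(1,2), ] selects every column.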

#DATA FRAME: differs from a matrix in that columns are named & can be of different class types

#Creating Data Frames: data.frame() function

budget_cat <- c("Manpower", "Asset", "Other")
amount <- c(519.4, 38.0, 141.4)
op_budget <- data.frame(budget_cat, amount)
op_budget

##   budget_cat amount
## 1   Manpower  519.4
## 2      Asset   38.0
## 3      Other  141.4

#Select elements in a dataframe

#Option 1: Conventional Method
op_budget[, "budget_cat"] #select the budget category

## [1] "Manpower" "Asset" "Other"

#Option 2: ACCESSOR OPERATOR $
op_budget$budget_cat

## [1] "Manpower" "Asset" "Other"
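
Rows can be selected the same way as columns. The sketch below rebuilds the budget data frame so it runs on its own; the threshold of 100 and the name big_items are assumptions made for illustration.

```r
# Rebuild the small budget data frame so the example is self-contained
op_budget <- data.frame(
  budget_cat = c("Manpower", "Asset", "Other"),
  amount     = c(519.4, 38.0, 141.4),
  stringsAsFactors = FALSE   # keep text columns as character on older R versions
)

# Rows go before the comma, columns after it
op_budget[2, "amount"]                             # single cell: 38

# A logical condition in the row position keeps only the matching rows
big_items <- op_budget[op_budget$amount > 100, ]
big_items$budget_cat                               # "Manpower" "Other"
```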

#[ EXAMPLE: CARS DATA ]

data(cars) #load data
str(cars) #structure of data frame

## 'data.frame': 50 obs. of 2 variables:
##  $ speed: num 4 4 7 7 8 9 10 10 10 11 ...
##  $ dist : num 2 10 4 22 16 10 18 26 34 17 ...

dim(cars) #number of rows & columns

## [1] 50 2

class(cars) # checking types of object

## [1] "data.frame"

names(cars) # viewing column names

## [1] "speed" "dist"

head(cars) # shows the first 6 rows by default; pass n to change this, e.g. n = 10

##   speed dist
## 1     4    2
## 2     4   10
## 3     7    4
## 4     7   22
## 5     8   16
## 6     9   10

#SPECIFY NUMBER OF ROWS to be examined using the n argument

head(cars, n = 10)

##    speed dist
## 1      4    2
## 2      4   10
## 3      7    4
## 4      7   22
## 5      8   16
## 6      9   10
## 7     10   18
## 8     10   26
## 9     10   34
## 10    11   17

#LIST: a collection of objects which can be of different classes & lengths
mylist <- list(A = 1,
               B = c(1, 2),
               C = c(TRUE, FALSE, TRUE),
               D = matrix(1:6, nrow = 3)) # a list of 4 objects
class(mylist)

## [1] "list"

names(mylist)

## [1] "A" "B" "C" "D"

#Select components of a list:

#Option 1: using DOUBLE SQUARE BRACKETS [[]] & take in an index value
mylist[[3]] #3rd component in list

## [1] TRUE FALSE TRUE

#Option 2: using ACCESSOR OPERATOR $ & specify component of list

mylist$C

## [1] TRUE FALSE TRUE
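
One point worth illustrating is the difference between single and double square brackets on a list. This is a minimal sketch using a shortened copy of the list (the names sub_list and component are hypothetical):

```r
mylist <- list(A = 1, B = c(1, 2), C = c(TRUE, FALSE, TRUE))

sub_list  <- mylist["C"]     # single brackets keep the list wrapper
component <- mylist[["C"]]   # double brackets return the component itself

class(sub_list)    # "list"
class(component)   # "logical"
length(mylist$B)   # 2
```

Use [[ ]] or $ when the component itself is needed for further computation, and [ ] when a smaller list is wanted.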

table() function: builds a contingency table of counts for each distinct value of a variable

unique() function: returns the distinct values of a vector, or the distinct rows of a data frame
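
A short sketch of both functions on made-up data (the vectors fruit and pairs are hypothetical and not from the original notes):

```r
# Hypothetical sample data
fruit <- c("apple", "pear", "apple", "kiwi", "pear", "apple")

counts <- table(fruit)   # one count per distinct value
counts["apple"]          # 3
unique(fruit)            # "apple" "pear" "kiwi" (order of first appearance)

# On a data frame, unique() drops duplicated rows
pairs <- data.frame(x = c(1, 1, 2), y = c("a", "a", "b"))
nrow(unique(pairs))      # 2
```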

General stringr

*1-based indexing

stringr function                                        How it works
str_length()                                            Find the number of characters in a string
str_count()                                             Count the number of occurrences of a pattern in a string
str_c(vector_of_strings, vector_of_string_to_concat,    Concatenate strings (similar to paste() but more efficient)
      sep = "-")
*paste()
str_sub(string/vector_of_strings, start = index,        Extract a substring by specifying start and end positions
        end = index)
str_extract(string/vector_of_strings, pattern)          Extract the first occurrence of a pattern in a string
str_match(string, regular_expression)                   Extract matched groups from a regular expression
str_match_all()                                         Extract all matched groups from a string
str_split(col_name, pattern = "common character",       Split a string into multiple parts based on a pattern
          simplify = TRUE)
str_detect(string/vector_of_strings, pattern)           Detect the presence of a pattern in a string
str_ends(string, pattern)                               Detect strings ending with the pattern
str_replace(string, pattern, replacement_string)        Replace the first occurrence of a pattern in a string
str_replace_all(string, pattern, replacement_string)    Replace all occurrences of a pattern in a string
str_which(string, pattern)                              Find the index of strings that match a pattern
str_remove()                                            Remove the first occurrence of a pattern in a string
str_trim()                                              Remove leading and trailing whitespace from a string
str_squish()                                            Remove excess whitespace from a string (reduces multiple spaces to one)
str_pad()                                               Pad a string to a specific width
str_to_upper()                                          Change a string to upper case
str_to_lower()                                          Change a string to lower case
str_to_title()                                          The first letter of each word is capitalized
library(tidyverse)
#str_extract()
df <- tibble(sentence = c("The price is $100", "It costs $200"))

# Extract the first number after "$"

df %>% mutate(price = str_extract(sentence, "\\$\\d+"))

## # A tibble: 2 x 2
##   sentence          price
##   <chr>             <chr>
## 1 The price is $100 $100
## 2 It costs $200     $200

#str_replace()
df <- tibble(sentence = c("I have 2 apples", "You have 3 bananas"))

# Replace the first number with "many"

df %>% mutate(sentence_replaced = str_replace(sentence, "\\d+", "many"))

## # A tibble: 2 x 2
##   sentence           sentence_replaced
##   <chr>              <chr>
## 1 I have 2 apples    I have many apples
## 2 You have 3 bananas You have many bananas

#str_match()
df <- tibble(sentence = c("I have 2 apples", "You have 3 bananas"))

# Extract the number and the word after it

df %>% mutate(matches = str_match(sentence, "(\\d+) (\\w+)"))

## # A tibble: 2 x 2
##   sentence           matches[,1] [,2]  [,3]
##   <chr>              <chr>       <chr> <chr>
## 1 I have 2 apples    2 apples    2     apples
## 2 You have 3 bananas 3 bananas   3     bananas

#str_match_all()
df <- tibble(sentence = c("I have 2 apples and 3 bananas", "You have 4 oranges and 2 pears"))

# Extract all numbers

df %>% mutate(matches = str_match_all(sentence, "\\d+"))

## # A tibble: 2 x 2
##   sentence                       matches
##   <chr>                          <list>
## 1 I have 2 apples and 3 bananas  <chr [2 x 1]>
## 2 You have 4 oranges and 2 pears <chr [2 x 1]>

#str_pad()
df <- tibble(name = c("Joe", "Sam"))

# Pad names to 10 characters with dots

df %>% mutate(padded_name = str_pad(name, width = 10, side = "right", pad = "."))

## # A tibble: 2 x 2
##   name  padded_name
##   <chr> <chr>
## 1 Joe   Joe.......
## 2 Sam   Sam.......
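
Several of the stringr functions in the table have no worked example, so here is a minimal sketch on plain character vectors. It assumes the stringr package is installed; the sample strings are made up.

```r
library(stringr)

fruits <- c("apple", "banana", "cherry")   # hypothetical sample strings

str_length(fruits)            # 5 6 6
str_count("banana", "a")      # 3
str_sub("banana", 1, 3)       # "ban"
str_detect(fruits, "an")      # FALSE TRUE FALSE
str_which(fruits, "an")       # 2
str_squish("  too   many   spaces  ")   # "too many spaces"
str_to_title("base r functions")        # "Base R Functions"
```

All of these are vectorised, so they can be used inside mutate() on a tibble column in the same way as the examples above.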

apply functions

function                      How it works
apply(X, margin, function)    X: matrix/array, margin: 1 - rows, 2 - columns, function: function to apply
                              Apply a function to rows or columns of a matrix or an array
                              using anonymous functions: apply(X, 1, function(x) any(is.na(x)))
lapply(X, function)           Apply a function to each element of a list or vector, returning a list
sapply(X, function)           Simplified lapply(); returns a vector or matrix instead of a list
tapply(V, index, function)    V: vector to be split, index: factors to split by
                              Apply a function over subsets of a vector

#---------------------------------------------------
# apply(): across rows/columns of a matrix/dataframe
#---------------------------------------------------
my_mat <- matrix(1:30, nrow = 10, byrow = FALSE) #generate a 10x3 matrix with values 1:30
my_mat

##       [,1] [,2] [,3]
##  [1,]    1   11   21
##  [2,]    2   12   22
##  [3,]    3   13   23
##  [4,]    4   14   24
##  [5,]    5   15   25
##  [6,]    6   16   26
##  [7,]    7   17   27
##  [8,]    8   18   28
##  [9,]    9   19   29
## [10,]   10   20   30

#COLUMN means
apply(my_mat, MARGIN = 2, mean)

## [1] 5.5 15.5 25.5

#ROW means
apply(my_mat, MARGIN = 1, mean)

## [1] 11 12 13 14 15 16 17 18 19 20

#using ANONYMOUS FUNC

#COLUMN SUM INCREMENTED BY 3
apply(my_mat, MARGIN = 2, function(x) sum(x) + 3)

## [1] 58 158 258

#x is a single column of the matrix my_mat

#anonymous function takes in 1 argument
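
The anonymous-function pattern from the table, any(is.na(x)), can be sketched on a small matrix with one missing value. The matrix m and the result names are hypothetical.

```r
# Hypothetical matrix with one missing value
m <- matrix(c(1, NA, 3,
              4,  5, 6), nrow = 2, byrow = TRUE)

# MARGIN = 1: x is one row at a time; TRUE if that row contains an NA
row_has_na <- apply(m, MARGIN = 1, function(x) any(is.na(x)))
row_has_na   # TRUE FALSE

# MARGIN = 2: x is one column at a time
col_has_na <- apply(m, MARGIN = 2, function(x) any(is.na(x)))
col_has_na   # FALSE TRUE FALSE
```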

Example: US Arrests data

data(USArrests)
head(USArrests, 3)

##         Murder Assault UrbanPop Rape
## Alabama   13.2     236       58 21.2
## Alaska    10.0     263       48 44.5
## Arizona    8.1     294       80 31.0
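
The notes load USArrests but do not show a call on it, so the following is only a sketch of how apply() might be used here. The choice of statistics and the names col_means and largest are assumptions.

```r
data(USArrests)

# Column means: MARGIN = 2 applies mean() to each of the four variables
col_means <- apply(USArrests, MARGIN = 2, mean)
names(col_means)   # "Murder" "Assault" "UrbanPop" "Rape"

# Row-wise: name of the variable with the largest value in each state
largest <- apply(USArrests, MARGIN = 1, function(x) names(x)[which.max(x)])
head(largest, 3)
```

apply() converts the data frame to a matrix first, which works here because every column is numeric.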

#-----------------------------------------------------------------
# tapply(): on a subset of data frame broken down by factor levels
#-----------------------------------------------------------------
# Create synthetic data
set.seed(27)
df <- data.frame(price  = rnorm(100, sd = 5, mean = 20),
                 city   = sample(paste0("C", 1:4), size = 100, replace = TRUE),
                 region = sample(paste0("R", 1:4), size = 100, replace = TRUE))
head(df)

##      price city region
## 1 29.53581   C2     R2
## 2 25.72438   C4     R2
## 3 16.17735   C1     R4
## 4 12.71284   C2     R2
## 5 14.53266   C4     R3
## 6 21.47621   C4     R2
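
The synthetic data above is set up for tapply(), but the call itself is not shown. A minimal sketch, assuming the goal is the mean price per group (the result names are hypothetical):

```r
# Recreate the synthetic data so the sketch runs on its own
set.seed(27)
df <- data.frame(price  = rnorm(100, sd = 5, mean = 20),
                 city   = sample(paste0("C", 1:4), size = 100, replace = TRUE),
                 region = sample(paste0("R", 1:4), size = 100, replace = TRUE))

# Mean price within each city: one value per level of df$city
mean_by_city <- tapply(df$price, df$city, mean)

# Two grouping factors give a city x region table of means
mean_by_city_region <- tapply(df$price, list(df$city, df$region), mean)
```

The first argument is the vector to summarise, the second is the grouping factor (or a list of factors), and the third is the function applied to each group.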

#--------------------------------------
#LIST BEING USED for sapply() & lapply()
y <- list(A = 1:5,
B = seq(0, 10, length.out = 5),
C = c(TRUE, TRUE, FALSE))
y

## $A
## [1] 1 2 3 4 5
##
## $B
## [1] 0.0 2.5 5.0 7.5 10.0
##
## $C
## [1] TRUE TRUE FALSE

#---------------------------------------

#--------------------------------------------------------------
# sapply(): across elements of a list & return a VECTOR/MATRIX
#--------------------------------------------------------------
sapply(y, mean)

## A B C
## 3.0000000 5.0000000 0.6666667

#-----------------------------------------------------
# lapply(): across elements of a list & return a LIST
#-----------------------------------------------------
lapply(y, mean)

## $A
## [1] 3
##
## $B
## [1] 5
##
## $C
## [1] 0.6666667
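
To show when sapply() returns a matrix rather than a vector, here is a short sketch on the same list (rebuilt so it runs on its own; the result names are hypothetical):

```r
y <- list(A = 1:5,
          B = seq(0, 10, length.out = 5),
          C = c(TRUE, TRUE, FALSE))

lens     <- sapply(y, length)   # one value per element -> named vector: A = 5, B = 5, C = 3
rng      <- sapply(y, range)    # two values per element -> 2 x 3 matrix, one column per element
rng_list <- lapply(y, range)    # same results, kept as a list

dim(rng)      # 2 3
rng_list$B    # 0 10
```

sapply() simplifies only when every result has the same length; otherwise it falls back to returning a list, just like lapply().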
