Introduction To Data Science With R Programming

This document provides an introduction to programming in R. It begins with an overview of R as a programming language and its history. Some key points covered include R's basics and features, how it compares to other languages like Python and Java, and its main IDE RStudio. The document then discusses variables, operators, and basic data types in R like vectors, matrices, arrays, lists, and data frames. It provides examples of how to create and manipulate objects of each data type.

Uploaded by

Vimal Kumar
Copyright
© All Rights Reserved

Introduction to Data Science with R Programming

Dr. D. Vimal Kumar


Associate Professor
Department of Computer Science
Nehru Arts and Science College
Coimbatore
TABLE OF CONTENTS
Programming Language
History
R Basics & Features
Comparison with Other Programming Languages
RStudio
Merits & Demerits
Variables
Operators
Data Types
Programming Language
• What is a programming language?
A programming language is a set of rules that provides a way of telling a computer what operations to perform.
• What determines a good programming language?
  - Run-time performance
  - Ease of designing, coding, debugging and maintenance
  - Reusability
Comparison with other Languages
| Aspect | R Programming | Python | Java |
|---|---|---|---|
| Release | Version 1.0.0 was released in 2000. | Version 1.0 was released in 1994. | It was first released in 1995. |
| Functions and packages | It has more functions and packages. | It has fewer functions and packages. | It has a large number of built-in functions and packages. |
| Execution | It is an interpreted language. | It is an interpreted language. | It is both a compiled and an interpreted language. |
| Purpose | It is a programming language designed for statistics and graphics. | It is a general-purpose language. | It is a general-purpose programming language designed for web applications. |
| Learning | It is difficult to learn and understand. | It is easy to understand. | It is easy to learn and understand. |
| Typical use | R is mostly used for data analysis. | Generic programming tasks such as the design of software or … | Java is mostly used in the design of Windows applications and web … |
History of R
• A Programming Language
• Graphics Representation
• A Statistical Package
• An interpreted computer language
• Open Source
• Object-Oriented Language
• Reporting
• Used by statisticians and data miners for data
analysis
Cont…..
• R was created by Ross Ihaka and Robert Gentleman
• Well-developed, simple and effective programming language
• Effective data handling and storage facility
• Collection of operators for calculations on arrays, lists, vectors and matrices
• Provides a large, coherent and integrated collection of tools for data analysis
• Provides graphical facilities for data analysis
• R can be interfaced with languages and tools such as Python, C, C++, Matlab and Hadoop
RStudio
• RStudio is designed to make it easy to write scripts.
• RStudio makes it convenient to view and interact
with the objects stored in your environment. ...
• RStudio makes it easy to set your working directory
and access files on your computer. ...
• RStudio makes graphics much more accessible for a
casual user
RStudio IDE
Merits of R
• Open Source. R is an open-source programming
language. ...
• Exemplary Support for Data Wrangling. 
• The Array of Packages.
• Quality Plotting and Graphing. ...
• Highly Compatible. ...
• Platform Independent. ...
• Eye-Catching Reports. ...
• Machine Learning Operations
Disadvantages of R Programming
• Weak Origin. R shares its origin with a much
older programming language “S”.
• Data Handling. In R, objects are stored in physical memory. ...
• Basic Security. R lacks basic security. ...
• Complicated Language. R is not an easy language to
learn. ...
• Lesser Speed. ...
• Spread Across various Packages
So why learn R??
Variable
Operators
Cont....
• Arithmetic Operators: These operators perform basic arithmetic operations such as addition, subtraction and multiplication.
• Relational Operators: These operators compare two values, checking whether one is greater than, less than or equal to the other. The output of a relational operation is always a logical value.
• Logical Operators: These operators work on Boolean (logical) values and combine or negate them with ‘and’, ‘or’ and ‘not’.
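The three operator families described above can be sketched in a few lines of R. The variable names and the values 7 and 2 are illustrative assumptions and do not come from the slides.

```r
# Hypothetical example values, chosen only for illustration.
x <- 7
y <- 2

# Arithmetic operators
sum_xy  <- x + y     # addition
diff_xy <- x - y     # subtraction
prod_xy <- x * y     # multiplication
quot_xy <- x / y     # division
rem_xy  <- x %% y    # remainder (modulo)
int_xy  <- x %/% y   # integer division
pow_xy  <- x ^ y     # exponentiation

# Relational operators always return a logical value (TRUE or FALSE)
is_greater <- x > y
is_equal   <- x == y
is_diff    <- x != y

# Logical operators combine or negate logical values
both   <- is_greater & is_equal   # 'and'
either <- is_greater | is_equal   # 'or'
negate <- !is_greater             # 'not'

print(c(sum_xy, diff_xy, prod_xy, quot_xy, rem_xy, int_xy, pow_xy))
print(c(is_greater, is_equal, is_diff, both, either, negate))
```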
Arithmetic Operators
Relational Operators
Logical Operators
Assignment Operator
Assignment Operators: These operators are used to
assign values to variables in R. The assignment can be
performed by using either the assignment operator (<-)
or equals operator (=). The value of the variable can be
assigned in two ways, left assignment and right
assignment.
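As a small sketch of left and right assignment, the lines below store three values using the different operator forms. The variable names and values are assumptions chosen for illustration.

```r
# Left assignment: the name on the left receives the value on the right.
a <- 10
b = 20

# Right assignment: the value on the left is stored in the name on the right.
30 -> d

total <- a + b + d
print(total)
```

In everyday R code the `<-` form is the most common convention, while `=` is also used for naming function arguments.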
Cont....
Sample Program
# Read the name and age from the console; readline() returns character strings
My.name <- readline(prompt = "Enter name: ")
My.age <- readline(prompt = "Enter age: ")
# Convert character to integer
My.age <- as.integer(My.age)
print(paste("Hi,", My.name, "next year you will be", My.age + 1, "years old."))
Sample Program – Data Visualisation
hist(mtcars$mpg)
hist(mtcars$mpg, breaks=3, col="red")
Data Types
Vectors
• Vectors are the most basic R data objects. A vector contains elements of the same type. The data type can be logical, integer, double, character, complex or raw. A vector's type can be checked with the typeof() function. Another important property of a vector is its length.
• remove() and rm() can be used to remove objects.
• Positive Index – Each member of a vector has an index, starting at 1. A positive index inside square brackets retrieves the member at that position.
• Negative Index – Returns the vector with the member at that position removed.
• Range Index – Produces a vector slice between two indexes by using the colon operator.
• Named Vector – Vector members can be assigned names and retrieved by name. The order of the members can also be reversed with a character string index vector.
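The indexing styles listed above can be tried on a small character vector. This is a minimal sketch; the vectors `s` and `v` and the throwaway object `scratch_value` are hypothetical names introduced only for this example.

```r
# Hypothetical sample vector
s <- c("aa", "bb", "cc", "dd", "ee")

print(typeof(s))   # type of the vector's elements
print(length(s))   # number of members

third  <- s[3]     # positive index: the member at position 3
no_3rd <- s[-3]    # negative index: every member except position 3
slice  <- s[2:4]   # range index: members 2 to 4, using the colon operator

# Named vector: members are retrieved by name
v <- c(First = "Mary", Last = "Sue")
last_name <- v["Last"]
reversed  <- v[c("Last", "First")]   # character index vector in reverse order

print(third)
print(no_3rd)
print(slice)
print(last_name)
print(reversed)

# rm() removes an object from the workspace
scratch_value <- 1
rm(scratch_value)
```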
Cont....
# Create a vector.
a <- c(3, 4, 5, 6, 8)
print(a)
print(length(a))
print(max(a))
print(min(a))
print(head(a, 2))
print(tail(a, 3))
# Name the vector members
v <- c(1, 2, 3)
names(v) <- c("First", "Second", "Third")
v["First"]
print(v["First"])
Types of Vectors
Matrix
• A matrix is an R object in which the elements are arranged in a two-dimensional rectangular layout.
The basic syntax for creating a matrix in R is −
matrix(data, nrow, ncol, byrow, dimnames)
Where:
• data is the input vector which becomes the data elements of the
matrix.
• nrow is the number of rows to be created.
• ncol is the number of columns to be created.
• byrow is a logical flag. If TRUE, the input vector elements are arranged by row; if FALSE (the default), they are arranged by column.
• dimnames is a list of the names assigned to the rows and columns.
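To see how the arguments of matrix() interact, here is a minimal sketch that fills a matrix by row and attaches row and column names. The object names and the labels "r1", "a" and so on are assumptions made for this example.

```r
# Fill a 2 x 3 matrix by row and name its rows and columns
scores <- matrix(
  c(1, 2, 3, 4, 5, 6),
  nrow = 2,
  ncol = 3,
  byrow = TRUE,
  dimnames = list(c("r1", "r2"), c("a", "b", "c"))
)
print(scores)
print(dim(scores))

# Elements can be retrieved by name or by position
by_name <- scores["r2", "b"]
by_pos  <- scores[2, 2]

# With the default byrow = FALSE the same data is filled column by column
by_col <- matrix(1:6, nrow = 2)
print(by_col)
```

Comparing `scores[2, 2]` with `by_col[2, 2]` shows the effect of byrow: the same position holds a different value depending on the fill order.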
Matrix – Example
Mymatrix <- matrix(c(1:25), nrow = 5, ncol = 5, byrow = TRUE)
print(Mymatrix)
• Output:
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    4    5
[2,]    6    7    8    9   10
[3,]   11   12   13   14   15
[4,]   16   17   18   19   20
[5,]   21   22   23   24   25
Example – Matrix Operation
M1 <- matrix(c(2,4,5,6,7,8,7,1), nrow=2, byrow=TRUE)
M2 <- matrix(c(9,8,7,6,5,4,3,2), nrow=2, byrow=FALSE)
# Addition of two matrices
addmatrix <- M1+M2
print(addmatrix)
# Subtraction of two matrices
submatrix <- M1-M2
print(submatrix)
# Element-wise multiplication of two matrices (use %*% for the matrix product)
multiplymatrix <- M1*M2
print(multiplymatrix)
# Transpose of a matrix
M2 <- matrix(c(9,8,7,6,5,4,3,2), nrow=2, byrow=FALSE)
tranmatrix <- t(M2)
print(tranmatrix)
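The `*` operator multiplies matrices element by element, which is different from the matrix product of linear algebra. The sketch below contrasts the two on a pair of small 2 x 2 matrices; the matrices `A` and `B` are hypothetical and are not part of the slides.

```r
# Two hypothetical 2 x 2 matrices
A <- matrix(c(1, 2, 3, 4), nrow = 2, byrow = TRUE)
B <- matrix(c(5, 6, 7, 8), nrow = 2, byrow = TRUE)

# Element-wise multiplication: each element of A times the matching element of B
elementwise <- A * B
print(elementwise)

# Matrix product: rows of A combined with columns of B
matprod <- A %*% B
print(matprod)

# Transposing twice returns the original matrix
back <- t(t(A))
```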
Arrays
While matrices are confined to two dimensions, arrays in R are data objects that can store data in more than two dimensions. The array() function takes vectors as input and uses the values in the dim parameter to create an array.

The basic syntax for creating an array in R is −

array(data, dim, dimnames)
Where:
• data is the input vector which becomes the data elements of the array.
• dim is the dimension vector of the array, giving the number of rows, the number of columns and the number of matrices to be created.
• dimnames is the list of names assigned to the dimensions (rows, columns and matrices).
Example - Array
# Create an array.
a <- array(c('green','yellow'),dim=c(3,3,2))
print(a)
When we execute the above code, it produces the following result:
, , 1

     [,1]     [,2]     [,3]
[1,] "green"  "yellow" "green"
[2,] "yellow" "green"  "yellow"
[3,] "green"  "yellow" "green"

, , 2

     [,1]     [,2]     [,3]
[1,] "yellow" "green"  "yellow"
[2,] "green"  "yellow" "green"
[3,] "yellow" "green"  "yellow"
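Individual elements and whole matrices can be pulled out of an array by index or by name. The following is a minimal sketch; the array `arr` and its dimension names are assumptions introduced for illustration.

```r
# A hypothetical 2 x 3 x 4 array: four matrices of 2 rows and 3 columns
arr <- array(
  1:24,
  dim = c(2, 3, 4),
  dimnames = list(c("r1", "r2"), c("c1", "c2", "c3"), paste0("m", 1:4))
)
print(dim(arr))

# One element: row 2, column 3 of the first matrix
elem <- arr[2, 3, 1]
print(elem)

# One whole matrix, selected by its name along the third dimension
second_matrix <- arr[, , "m2"]
print(second_matrix)
```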
Difference

| S.No | Vectors | List |
|---|---|---|
| 1 | A vector stores elements of the same type, or converts them implicitly. | A list holds different data types such as numeric, character, logical, etc. |
| 2 | A vector is not recursive. | Lists are recursive. |
| 3 | A vector is one-dimensional. | A list is a multidimensional object. |
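The first two rows of the comparison can be checked directly in R. In this sketch the object names `vec`, `lst` and `nested` are hypothetical.

```r
# A vector converts mixed values to one common type (here: character)
vec <- c(1, "a", TRUE)
print(vec)
print(typeof(vec))

# A list keeps the type of each element
lst <- list(1, "a", TRUE)
element_types <- sapply(lst, typeof)
print(element_types)

# Lists are recursive: a list can contain another list
nested <- list(1, list(2, 3))
print(is.recursive(vec))
print(is.recursive(nested))
```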
List
Lists are R objects that can contain elements of different types, such as numbers, strings, vectors and even another list.
• A list can contain elements of different data types
• It can contain numbers, vectors or a list inside itself
• It is created using the list() function
Syntax
Listname <- list(values)

Convert list to vector - unlist()
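As a brief sketch of list() and unlist(), the example below builds a small named list and flattens it into a vector. The list `readings` and its element names are assumptions. The example also shows the difference between single and double square brackets, which the slides do not discuss explicitly.

```r
# A hypothetical named list holding vectors of different lengths
readings <- list(a = 1, b = c(2, 3), c = 4)

# unlist() flattens the list into a single vector
flat <- unlist(readings)
print(flat)

# Single brackets return a sub-list; double brackets return the element itself
sub_list <- readings["b"]
element  <- readings[["b"]]
print(class(sub_list))
print(class(element))
```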


List - Example
listdata <- list("Green", "Red", c(21, 32, 11), TRUE, 24.5, 11)
print(listdata)
# See the values stored at positions 1 and 3
print(listdata[1])
print(listdata[3])
# Assign names to the first three list elements
names(listdata) <- c("1st quarter", "A_matrix", "A Innerlist")
# Remove the fourth element from the list
listdata[4] <- NULL
# Print the 4th element (previously the 5th, because the list has shifted)
print(listdata[4])
# Update the 2nd element
listdata[2] <- "Updated element"
print(listdata[2])
listdata[2] <- 45.5
print(listdata[2])
Data Frames
• A data frame is a table or a two-dimensional array-like structure in which each column contains values of one variable and each row contains one set of values for each column. Below are some characteristics of a data frame that need to be considered every time we work with one:
• The column names should be non-empty.
• Each column should contain the same number of data items.
• The data stored in a data frame can be of numeric, factor or character type.
• The row names should be unique.
Create Dataframe
emp_id = c(100:104)
emp_name = c("John","Henry","Adam","Ron","Gary")
dept = c("Sales","Finance","Marketing","HR","R & D")
emp.data <- data.frame(emp_id, emp_name, dept)
print(emp.data)
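Once a data frame exists, its structure and contents can be inspected and filtered. This sketch rebuilds the employee data so that it runs on its own; the `stringsAsFactors = FALSE` argument and the `finance` object are additions made for this example.

```r
# Rebuild the employee data frame; stringsAsFactors = FALSE keeps text columns
# as character vectors on every R version
emp.data <- data.frame(
  emp_id = 100:104,
  emp_name = c("John", "Henry", "Adam", "Ron", "Gary"),
  dept = c("Sales", "Finance", "Marketing", "HR", "R & D"),
  stringsAsFactors = FALSE
)

# Inspect the structure: column names, types and the first values
str(emp.data)
print(dim(emp.data))     # number of rows and columns
print(names(emp.data))   # column names

# Access one column with $ and filter rows with a logical condition
all_names <- emp.data$emp_name
finance <- emp.data[emp.data$dept == "Finance", ]
print(finance)
```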
Factors
• Factors are data objects that categorise the data and store it as levels.
• Factor variables are used for statistical modelling.
• Factors can store both string and integer values.
Factor - Example
# Create the vectors for the data frame.
height <- c(132,151,162,139,166,147,122)
weight <- c(48,49,66,53,67,52,40)
gender <- c("male","male","female","female","male","female","male")
input_data <- data.frame(height,weight,gender)
print(input_data)
# Test if the gender column is a factor (FALSE by default since R 4.0.0).
print(is.factor(input_data$gender))
# Convert the gender column to a factor and print it to see the levels.
input_data$gender <- as.factor(input_data$gender)
print(input_data$gender)
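To show how a factor stores categories as levels, the sketch below builds a factor with an explicit level order and counts the observations per level. The `sizes` data and the chosen level order are assumptions made for illustration.

```r
# Hypothetical categorical data
sizes <- c("small", "large", "medium", "small", "large", "small")

# Create a factor with an explicit order of levels
size_factor <- factor(sizes, levels = c("small", "medium", "large"))

print(levels(size_factor))    # the distinct categories
print(nlevels(size_factor))   # how many categories there are

# Internally each value is stored as the integer position of its level
codes <- as.integer(size_factor)
print(codes)

# Count the observations in each level
counts <- table(size_factor)
print(counts)
```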
Thank You
Any Queries
