Unit1 R

The document provides an overview of R programming, including its introduction, features, and various data structures such as vectors, lists, matrices, arrays, factors, and data frames. It explains how to create and manipulate these data types, along with the concept of classes and objects in R. Additionally, it covers special values and attributes associated with R objects.
BCA V SEM R PROGRAMMING

Statistical Computing and R Programming


UNIT I

Introduction to the Language


R is a system for statistical analysis and graphics created by Ross Ihaka and
Robert Gentleman at the University of Auckland, New Zealand. The core of R is
an interpreted computer language that supports branching and looping as well as
modular programming using functions. For efficiency, R can integrate with
procedures written in C, C++, .NET, Python or FORTRAN.
R is freely available under the GNU General Public License, and pre-compiled
binary versions are provided for various operating systems such as Linux, Windows
and Mac.
R is free software distributed under a GNU-style copyleft and is an official part of
the GNU project, where it is called GNU S.

Features of R
➢ R is a well-developed, simple and effective programming
language which includes conditionals, loops, user-defined
recursive functions and input and output facilities.
➢ R has an effective data handling and storage facility.
➢ R provides a suite of operators for calculations on arrays, lists,
vectors and matrices.
➢ R provides a large, coherent and integrated collection of tools
for data analysis.
➢ R provides graphical facilities for data analysis and display,
either directly on the screen or printed on paper.
R Command Prompt
Once the R environment is set up and the R interpreter is launched,
you get a prompt > where you can start typing your program.
An object can be created with the "assign" operator, which is written as an
arrow made of a minus sign and an angle bracket; this symbol can be oriented
left-to-right or the reverse:
> myString <- "Hello, World!"
> print(myString)
[1] "Hello, World!"
> n <- 15
> n
[1] 15
> 5 -> n
> n
[1] 5
R Script File
We can write programs in script files and then execute those scripts at the
command prompt with the help of the R interpreter called Rscript.
# My first program in R Programming
myString <- "Hello, World!"
print(myString)
Save the above code in a file test.R and execute it at the Linux command prompt as
given below. Even if you are using Windows or another system, the syntax remains
the same.
$ Rscript test.R
Output:
[1] "Hello, World!"

Comments
Comments are statements ignored by the R interpreter. A single-line comment is
written using # at the beginning of the statement as follows:

# My first program in R Programming


Variables, Datatypes in R: Everything in R is an object. R has five commonly
used atomic vector types. By atomic, we mean that the vector holds data of a single type only.
❖ Character: "a", "hello"
❖ Numeric (real or decimal): 10, 25.5
❖ Integer: 2L (the L tells R to store this as an integer)
❖ Logical: TRUE, FALSE
❖ Complex: 1+4i (complex numbers with real and imaginary parts)
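The types listed above can be checked with class(). The following is a minimal sketch; the variable names values, types and mixed are chosen here for illustration only.

```r
# One value of each atomic type listed above
values <- list("hello", 25.5, 2L, TRUE, 1+4i)

# class() reports the type of each value
types <- sapply(values, class)
print(types)

# An atomic vector holds a single type, so mixing types inside c()
# converts every element to the most general type present
mixed <- c(1L, 2.5, "three")
print(class(mixed))
```

Because mixed contains a string, every element is stored as character.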

R Data Objects
In contrast to programming languages like C and Java, variables in R are not
declared with a data type. Instead, a variable is assigned an R-object,
and the data type of the R-object becomes the data type of the variable. There
are many types of R-objects. The frequently used ones are:
➢ Vectors
➢ Lists
➢ Matrices
➢ Arrays
➢ Factors
➢ Data Frames

OBJECT       MODES/CLASS                               SEVERAL MODES POSSIBLE IN THE SAME OBJECT?
Vector       numeric, character, complex or logical    No
Factor       numeric or character                      No
Array        numeric, character, complex or logical    No
Matrix       numeric, character, complex or logical    No
Data Frame   numeric, character, complex or logical    Yes
List         numeric, character, complex or logical,   Yes
             function, expression

Vectors
In R programming, the most basic data types are the R-objects called vectors.
The other R-objects are built upon the atomic vectors.
When you want to create a vector with more than one element, use the c()
function, which combines the elements into a vector.
# Create a vector.
apple <- c('red','green',"yellow")
print(apple)
# Get the class of the vector.
print(class(apple))
When we execute the above code, it produces the following result:
[1] "red" "green" "yellow"
[1] "character"
Lists
A list is an R-object which can contain many different types of elements,
such as vectors, functions and even other lists.
# Create a list.
list1 <- list(c(2,5,3),21.3,sin)
# Print the list.
print(list1)
When we execute the above code, it produces the following result:
[[1]]
[1] 2 5 3
[[2]]
[1] 21.3
[[3]]
function (x) .Primitive("sin")
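Individual list elements can be pulled out by position or by name. The sketch below assumes a named version of the list above; the component names numbers, value and fun are illustrative and do not appear in the original example.

```r
# A named list; the component names are hypothetical
list1 <- list(numbers = c(2, 5, 3), value = 21.3, fun = sin)

second <- list1[[2]]      # [[ ]] extracts the element itself
nums   <- list1$numbers   # $ extracts a component by name
sub    <- list1[1]        # [ ] returns a shorter list, not the element
result <- list1$fun(0)    # a stored function can be called directly

print(second)
print(class(sub))
```

The difference between [[ ]] and [ ] matters in practice: the first gives the stored object, the second gives a list that wraps it.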

Matrices
A matrix is a two-dimensional rectangular data set. It can be created using a
vector input to the matrix function.
# Create a matrix.
M = matrix(c('a','a','b','c','b','a'), nrow=2, ncol=3, byrow=TRUE)
print(M)
When we execute the above code, it produces the following result:
     [,1] [,2] [,3]
[1,] "a"  "a"  "b"
[2,] "c"  "b"  "a"
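The byrow argument controls the order in which the input vector fills the matrix. A small sketch with illustrative names (v, by_col, by_row) shows the difference and how to read one element back:

```r
v <- 1:6

# Default: the vector fills the matrix column by column
by_col <- matrix(v, nrow = 2, ncol = 3)
# byrow = TRUE fills it row by row instead
by_row <- matrix(v, nrow = 2, ncol = 3, byrow = TRUE)

print(by_col)
print(by_row)

# Elements are addressed as [row, column]
elem_col <- by_col[2, 1]
elem_row <- by_row[2, 1]
dims <- dim(by_row)
```

Here by_col[2, 1] is 2 while by_row[2, 1] is 4, even though both matrices were built from the same vector.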

Arrays
While matrices are confined to two dimensions, arrays can have any number of
dimensions. The array function takes a dim argument that sets the size of each
dimension. In the example below we create an array made up of two
3x3 matrices.
# Create an array.
a <- array(c('green','yellow'), dim=c(3,3,2))
print(a)
When we execute the above code, it produces the following result:
, , 1
     [,1]     [,2]     [,3]
[1,] "green"  "yellow" "green"
[2,] "yellow" "green"  "yellow"
[3,] "green"  "yellow" "green"
, , 2
     [,1]     [,2]     [,3]
[1,] "yellow" "green"  "yellow"
[2,] "green"  "yellow" "green"
[3,] "yellow" "green"  "yellow"
Factors
A factor includes not only the values of the corresponding categorical variable,
but also the different possible levels of that variable (even if they are not present
in the data). The function factor creates a factor with the following options:
factor(x, levels = sort(unique(x), na.last = TRUE), labels = levels, exclude = NA,
ordered = is.ordered(x))
levels specifies the possible levels of the factor (by
default the unique values of the vector x), labels defines the names of the levels,
exclude gives the values of x to exclude from the levels, and ordered is a logical
argument specifying whether the levels of the factor are ordered. Recall that x is
of mode numeric or character.
# Create a vector.
apple_colors <- c('green','green','yellow','red','red','red','green')
# Create a factor object.
factor_apple <- factor(apple_colors)
# Print the factor.
print(factor_apple)
# nlevels gives the number of distinct values.
print(nlevels(factor_apple))
When we execute the above code, it produces the following result:
[1] green  green  yellow red    red    red    green
Levels: green red yellow
[1] 3

> factor(1:5, exclude=4)
[1] 1    2    3    <NA> 5
Levels: 1 2 3 5
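The point that a factor can carry levels that are not present in the data is easiest to see with table(). This is a minimal sketch; the vector sizes and its level names are made up for illustration.

```r
# Hypothetical data: no observation has the value "medium"
sizes <- c("small", "large", "small")

# Declaring the levels explicitly keeps "medium" as a possible category
f <- factor(sizes, levels = c("small", "medium", "large"))
print(levels(f))

# table() counts every level, including those with zero observations
counts <- table(f)
print(counts)
```

Without the levels argument, "medium" would not appear in the table at all.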

Data Frames
Data frames are tabular data objects. Unlike a matrix, each column of a data frame
can contain a different mode of data: the first column can be numeric while the
second column is character and the third is logical. A data frame is a list of
vectors of equal length.
Data frames are created using the data.frame() function, or implicitly
by the function read.table.
# Create the data frame.
BMI <- data.frame(gender = c("Male", "Male", "Female"), height = c(152, 171.5, 165),
                  weight = c(81, 93, 78), Age = c(42, 38, 26))
print(BMI)
When we execute the above code, it produces the following result:
  gender height weight Age
1   Male  152.0     81  42
2   Male  171.5     93  38
3 Female  165.0     78  26
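Since a data frame is a list of equal-length vectors, columns can be extracted like list components and rows can be filtered with a logical condition. The sketch below reuses the BMI data; the names heights and tall and the derived bmi column are additions made here for illustration.

```r
BMI <- data.frame(gender = c("Male", "Male", "Female"),
                  height = c(152, 171.5, 165),
                  weight = c(81, 93, 78),
                  Age    = c(42, 38, 26))

# A column is extracted with $, just like a list component
heights <- BMI$height

# Rows are selected with a logical condition before the comma
tall <- BMI[BMI$height > 160, ]

# A new column can be computed from existing ones (height is in cm)
BMI$bmi <- BMI$weight / (BMI$height / 100)^2

# str() summarizes the mode of every column
str(BMI)
```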
Special Values
The special values can be used to mark abnormal or missing values in vectors,
arrays, or other data structures. Each value is described below, followed by examples.

Infinity
When a number is too large for R to represent, the value is deemed to be
infinite.
R> foo <- Inf
R> foo
[1] Inf
R> bar <- c(3401,Inf,3.1,-555,Inf,43)
R> bar
[1] 3401.0    Inf    3.1 -555.0    Inf   43.0
R> baz <- 90000^100
R> baz
[1] Inf

NaN
Some results cannot be expressed as a number, Inf, or -Inf. These
difficult-to-quantify special values are labeled NaN in R, which stands for
Not a Number.
R> foo <- NaN
R> foo
[1] NaN
R> bar <- c(NaN,54.3,-2,NaN,90094.123,-Inf,55)
R> bar
[1]      NaN    54.30    -2.00      NaN 90094.12     -Inf    55.00
R> -Inf+Inf
[1] NaN
R> Inf/Inf
[1] NaN

NA
In statistical analyses, datasets often contain missing values. R provides a
standard special term to represent missing values, NA, which reads as Not
Available.
R> foo <- c("character","a",NA,"with","string",NA)
R> foo
[1] "character" "a"         NA          "with"      "string"    NA
R> baz <- matrix(c(1:3,NA,5,6,NA,8,NA),nrow=3,ncol=3)
R> baz
     [,1] [,2] [,3]
[1,]    1   NA   NA
[2,]    2    5    8
[3,]    3    6   NA

NULL
This value is often used to explicitly define an "empty" entity, which is quite
different from a "missing" entity specified with NA. An instance of NA
clearly denotes an existing position that can be accessed and/or overwritten if
necessary; this is not so for NULL.
R> foo <- NULL
R> foo
NULL
R> c(2,4,NULL,8)
[1] 2 4 8
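In practice these special values are usually detected with the base R functions is.na, is.nan, is.finite and is.null rather than compared with ==. The sketch below uses a made-up vector x to show how the checks differ.

```r
# Hypothetical vector containing several special values
x <- c(3.1, NA, Inf, NaN, -2)

na_flags     <- is.na(x)      # TRUE for both NA and NaN
nan_flags    <- is.nan(x)     # TRUE only for NaN
finite_flags <- is.finite(x)  # FALSE for NA, NaN, Inf and -Inf

# Many summary functions return NA unless missing values are removed
m_raw   <- mean(c(1, NA, 3))
m_clean <- mean(c(1, NA, 3), na.rm = TRUE)

# NULL occupies no position, so it does not add to the length
z <- c(2, 4, NULL, 8)
len <- length(z)
```

Note that is.na() also flags NaN, so is.nan() is needed when the two must be told apart.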
Types, Classes, Coercion
Classes in R Programming:
Classes and objects are basic concepts of object-oriented programming that
revolve around real-life entities. Everything in R is an object.
➢ An object is simply a data structure that has some methods and attributes.
➢ A class is just a blueprint or a sketch of these objects. It represents the
set of properties or methods that are common to all objects of one type.
➢ Unlike most other programming languages, R has three class systems:
❖ S3 Classes
❖ S4 Classes
❖ Reference Classes
S3 Class: S3 is the simplest yet the most popular OOP system; it lacks a formal
definition and structure.
➢ An object of this type can be created by just adding a class attribute to it.
➢ In the S3 system, methods don't belong to the class. They belong to generic
functions.
Example:
# create a list with the required components
BCA <- list(name="meena", Dept="Computers")
# give a name to your class
class(BCA) <- "Course"
print(BCA)
Output:
$name
[1] "meena"

$Dept
[1] "Computers"

attr(,"class")
[1] "Course"
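The statement that S3 methods belong to generic functions can be illustrated as follows. This is a sketch under assumptions: the method print.Course and the generic describe are hypothetical names introduced here, not part of the original example.

```r
# An S3 object: a list with a class attribute
BCA <- structure(list(name = "meena", Dept = "Computers"), class = "Course")

# print() is a generic function; a function named print.Course becomes
# the method that R picks for objects of class "Course"
print.Course <- function(x, ...) {
  cat("Course record:", x$name, "-", x$Dept, "\n")
  invisible(x)
}
print(BCA)

# A new generic is defined with UseMethod(); methods are matched by name
describe <- function(obj, ...) UseMethod("describe")
describe.Course  <- function(obj, ...) paste(obj$name, "belongs to", obj$Dept)
describe.default <- function(obj, ...) "no description available"

d1 <- describe(BCA)  # dispatches to describe.Course
d2 <- describe(42)   # falls back to describe.default
```

Nothing is attached to the class itself; dispatch works purely through the naming pattern generic.class.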
S4 Class: Programmers coming from languages like C++ or Java might find S3
very different from their usual idea of classes, as it lacks the structure
that classes are supposed to provide.
➢ S4 is a slight improvement over S3: its classes have a formal definition,
which gives a proper structure to its objects.
➢ setClass() is used to define a class and new() is used to create the objects.
Example:
library(methods)
# definition of an S4 class
setClass("Course", slots=list(name="character", Subject="character"))
# create an object using new() by passing the class name and slot values
Dept <- new("Course", name="Meena", Subject="R Programming")
Dept
Output:
An object of class "Course"
Slot "name":
[1] "Meena"

Slot "Subject":
[1] "R Programming"
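The practical effect of the formal S4 definition is that slots are typed and checked when an object is created. A minimal sketch, reusing the Course class from the example above (the names obj and bad are illustrative):

```r
library(methods)

setClass("Course", slots = list(name = "character", Subject = "character"))
obj <- new("Course", name = "Meena", Subject = "R Programming")

# Slots are read with the @ operator
obj_name <- obj@name

# is() checks class membership
is_course <- is(obj, "Course")

# A value of the wrong type for a slot is rejected by new()
bad <- tryCatch(new("Course", name = 123, Subject = "R"),
                error = function(e) "rejected")
print(bad)
```

An S3 object would have accepted the numeric name silently, which is the structural gap the text refers to.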
Reference Class: Reference classes are an improvement over S4 classes. Here the
methods belong to the classes, which makes them much more similar to the
object-oriented classes of other languages.
Defining a reference class is similar to defining an S4 class. We use
setRefClass() instead of setClass() and "fields" instead of "slots".
Example:
library(methods)
# setRefClass returns a generator
Course <- setRefClass("Course", fields=list(name="character",
Subject="character"))
# now we can use the generator to create objects
Dept <- Course(name="Meena", Subject="R")
Dept
Output:
Reference class object of class "Course"
Field "name":
[1] "Meena"
Field "Subject":
[1] "R"
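Because methods belong to the class, a reference class can bundle behaviour with its fields, and its objects are modified in place. The sketch below uses a hypothetical Student class with an add_marks method; neither name appears in the original notes.

```r
library(methods)

# Hypothetical class with one method defined inside the class
Student <- setRefClass("Student",
  fields = list(name = "character", marks = "numeric"),
  methods = list(
    add_marks = function(x) {
      marks <<- marks + x   # <<- modifies the field of this object
      invisible(.self)
    }
  )
)

s1 <- Student$new(name = "Meena", marks = 50)
s2 <- s1            # s2 refers to the same object, not a copy
s2$add_marks(10)
print(s1$marks)     # the change made through s2 is visible through s1

s3 <- s1$copy()     # copy() creates an independent object
s3$add_marks(5)
```

After this runs, s1$marks is 60 and s3$marks is 65, which shows the difference between sharing a reference and copying.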
Attributes
Each R object you create has additional information about the nature of the
object itself. This additional information is referred to as the object’s attributes.
In general, you can think of attributes as either explicit or implicit. Explicit
attributes are immediately visible to the user, while R determines implicit
attributes internally. You can print explicit attributes for a given object with the
attributes function, which takes any object and returns a named list.
R> foo <- matrix(data=1:9,nrow=3,ncol=3)
R> foo
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
R> attributes(foo)
$dim
[1] 3 3
R> attr(x=foo,which="dim")
[1] 3 3
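Attributes can be set as well as read. The sketch below, with illustrative names v and f, shows that a plain vector has no explicit attributes and that assigning a dim attribute is enough to turn it into a matrix.

```r
v <- 1:6
no_attrs <- is.null(attributes(v))  # a plain vector has no explicit attributes

# Setting the dim attribute reshapes the vector into a 2 x 3 matrix
attr(v, "dim") <- c(2, 3)
print(v)

# A factor stores its levels and its class as explicit attributes
f <- factor(c("a", "b", "a"))
print(attributes(f))
```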

OBJECT CLASS
An object's class is one of the most useful attributes for describing an entity in
R. Every object you create is identified, either implicitly or explicitly, with at
least one class. Elementary R objects such as vectors, matrices, and arrays
are implicitly classed, which means the class is not reported by
the attributes function. For higher-level data structures like factors and data
frames, the class is explicit: it is stored as an attribute, and attributes play an
important role in how the object is accessed.
Stand-Alone Vectors
Let's create some simple vectors to use as examples:
R> num.vec1 <- 1:4
R> num.vec1
[1] 1 2 3 4
R> num.vec2 <- seq(from=1,to=4,length=6)
R> num.vec2
[1] 1.0 1.6 2.2 2.8 3.4 4.0
R> char.vec <- c("a","few","strings","here")
R> char.vec
[1] "a"       "few"     "strings" "here"
R> fac.vec <- factor(c("Blue","Blue","Green","Red","Green","Yellow"))
R> fac.vec
[1] Blue   Blue   Green  Red    Green  Yellow
Levels: Blue Green Red Yellow
R> class(num.vec1)
[1] "integer"
R> class(num.vec2)
[1] "numeric"
R> class(char.vec)
[1] "character"
R> class(fac.vec)
[1] "factor"
Other Data Structures
R> num.mat1 <- matrix(data=num.vec1,nrow=2,ncol=2)
R> num.mat1
     [,1] [,2]
[1,]    1    3
[2,]    2    4
R> class(num.mat1)
[1] "matrix" "array"
(In versions of R before 4.0.0, this prints only "matrix".)
Multiple Classes
Certain objects will have multiple classes:
R> ordfac.vec <- factor(x=c("Small","Large","Large","Regular","Small"),
                        levels=c("Small","Regular","Large"),
                        ordered=TRUE)
R> ordfac.vec
[1] Small   Large   Large   Regular Small
Levels: Small < Regular < Large
R> class(ordfac.vec)
[1] "ordered" "factor"
Is-Dot Object-Checking Functions
To check whether an object is of a specific class or data type, you can use the
is-dot functions on the object; each returns a TRUE or FALSE logical value.
Example:
R> num.vec1 <- 1:4
R> num.vec1
[1] 1 2 3 4
R> is.integer(num.vec1)
[1] TRUE
R> is.numeric(num.vec1)
[1] TRUE
R> is.matrix(num.vec1)
[1] FALSE
R> is.data.frame(num.vec1)
[1] FALSE
R> is.vector(num.vec1)
[1] TRUE
R> is.logical(num.vec1)
[1] FALSE
As-Dot Coercion Functions
Converting from one object or data type to another is referred to as coercion.
Like other features of R you've met so far, coercion is performed either
implicitly or explicitly. Implicit coercion occurs automatically when elements
need to be converted to another type in order for an operation to complete.
Implicit coercion of logical values to their numeric counterparts occurs in lines
of code like this:
R> 1:4+c(T,F,F,T)
[1] 2 2 3 5
R> foo <- 34
R> bar <- T
R> paste("Definitely foo: ",foo,"; definitely bar: ",bar,".",sep="")
[1] "Definitely foo: 34; definitely bar: TRUE."
In other situations, coercion won’t happen automatically and must be carried out
by the user. This explicit coercion can be achieved with the as-dot functions.
R> as.numeric(c(T,F,F,T))
[1] 1 0 0 1
R> 1:4+as.numeric(c(T,F,F,T))
[1] 2 2 3 5
R> foo <- 34
R> foo.ch <- as.character(foo)
R> foo.ch
[1] "34"
R> bar <- T
R> bar.ch <- as.character(bar)
R> bar.ch
[1] "TRUE"
R> paste("Definitely foo: ",foo.ch,"; definitely bar: ",bar.ch,".",sep="")
[1] "Definitely foo: 34; definitely bar: TRUE."
R> baz <- factor(x=c("male","male","female","male"))
R> baz
[1] male   male   female male
Levels: female male
R> as.numeric(baz)
[1] 2 2 1 2
Here, you see that R has assigned the numeric representation of the factor in the
stored order of the factor labels (alphabetic by default). Level 1 refers to female,
and level 2 refers to male.
R> foo <- matrix(data=1:4,nrow=2,ncol=2)
R> foo
     [,1] [,2]
[1,]    1    3
[2,]    2    4
R> as.vector(foo)
[1] 1 2 3 4
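Explicit coercion has two common pitfalls that follow from the behaviour described above: as.numeric on a factor returns the level codes rather than the labels, and strings that do not look like numbers become NA. The values below are made up for illustration.

```r
# A factor whose labels happen to look like numbers
f <- factor(c("10", "20", "10"))

codes <- as.numeric(f)                # level codes, not the labels
vals  <- as.numeric(as.character(f))  # convert to character first to recover the labels

# Text that cannot be read as a number is coerced to NA (with a warning,
# suppressed here to keep the output clean)
bad <- suppressWarnings(as.numeric(c("3.5", "abc")))

print(codes)
print(vals)
print(bad)
```

Here codes is 1 2 1 while vals is 10 20 10, so the intermediate as.character step is what preserves the original numbers.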

Basic Plotting
The R function plot takes in two vectors (one vector of x locations and one
vector of y locations) and opens a graphics device where it displays the result.

R> foo <- c(1.1,2,3.5,3.9,4.2)
R> bar <- c(2,2.2,-1.3,0,0.2)
R> plot(foo,bar)

There is a wide range of graphical parameters that can be supplied as
arguments to the plot function. Some of the most commonly used graphical
parameters are listed here:

➢ type: Tells R how to plot the supplied coordinates: "p" for stand-alone
points, "l" for points joined by lines, "b" for both dots and lines, "o" for
overplotting the points with lines (this eliminates the gaps between points and
lines visible for type="b"), and "n" for no points or lines, which creates an
empty plot.
➢ main, xlab, ylab: Options to include the plot title, the horizontal axis label,
and the vertical axis label, respectively.
➢ col: Color (or colors) to use for plotting points and lines.
➢ pch: Stands for point character. This selects which character to use for
plotting individual points.
➢ cex: Stands for character expansion. This controls the size of plotted point
characters.
➢ lty: Stands for line type. This specifies the type of line to use to connect
the points (for example, solid, dotted, or dashed).
➢ lwd: Stands for line width. This controls the thickness of plotted lines.
➢ xlim, ylim: These provide limits for the horizontal range and vertical
range (respectively) of the plotting region.
Eg.,
R> plot(foo,bar,type="b",main="My lovely plot",xlab="",ylab="",
col=4,pch=8,lty=2,cex=2.3,lwd=3.3)

Adding Points, Lines, and Text to an Existing Plot


➢ Abline: The abline function is a simple way to add straight lines spanning
a plot. The line (or lines) can be specified with slope and intercept values.
You can also simply add horizontal or vertical lines. This line of code
adds two separate horizontal lines, one at y = -5 and the other at y = 5,
using h=c(-5,5).

R> abline(h=c(-5,5),col="red",lty=2,lwd=2)
➢ Segments: The segments command takes a "from" coordinate (given as x0
and y0) and a "to" coordinate (as x1 and y1) and draws the corresponding
line.
R> segments(x0=c(5,15),y0=c(-5,-5),x1=c(5,15),y1=c(5,5),col="red",lty=3,lwd=2)

➢ Points: You use points to add specific coordinates from x and y
to the plot, where x and y are the coordinate vectors of the data being plotted.
Just like plot, points takes two vectors of equal length with x
and y values.
R> points(x[y>=5],y[y>=5],pch=4,col="darkmagenta",cex=2)

➢ Lines: To draw lines connecting the coordinates in x and y, you use lines.
R> lines(x,y,lty=4)
➢ Arrows: The arrows function is used just like segments: you provide a
"from" coordinate (x0, y0) and a "to" coordinate (x1, y1). By default, the
head of the arrow is located at the "to" coordinate, though this can be
altered.
R> arrows(x0=8,y0=14,x1=11,y1=2.5)

➢ Text: As per the default behavior of text, the string supplied as labels is
centered on the coordinates provided with the arguments x and y.
R> text(x=8,y=15,labels="sweet spot")
➢ Legend: You can add a legend using the legend function.
R> legend("bottomleft",
          legend=c("overall process","sweet","standard",
                   "too big","too small","sweet y range","sweet x range"),
          pch=c(NA,19,1,4,3,NA,NA),lty=c(4,NA,NA,NA,NA,2,3),
          col=c("black","blue","black","darkmagenta","darkgreen","red","red"),
          lwd=c(1,NA,NA,NA,NA,2,2),pt.cex=c(NA,1,1,2,2,NA,NA))

➢ The first argument sets where the legend should be
placed ("topleft", "topright", "bottomleft", or "bottomright").
➢ Next you supply the labels as a vector of character strings to the
legend argument.
➢ Then you need to supply the remaining argument values in vectors
of the same length so that the right elements match up with each
label.
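The annotation functions above refer to vectors x and y that are not defined in these notes. The following self-contained sketch puts the pieces together on made-up data; the values of x and y, the coordinates of the annotations, and the output file are all assumptions chosen for illustration.

```r
# Hypothetical data for illustration
x <- 1:10
y <- c(-3, 1, 6, 8, 2, -4, 5, 7, -6, 0)

# Draw into a temporary PDF file; omit pdf() and dev.off() to plot on screen
out <- tempfile(fileext = ".pdf")
pdf(out)

# Start with an empty plot, then add each layer
plot(x, y, type = "n", main = "Annotated plot", xlab = "x", ylab = "y")
abline(h = c(-5, 5), col = "red", lty = 2, lwd = 2)           # horizontal lines
segments(x0 = c(3, 8), y0 = c(-5, -5), x1 = c(3, 8), y1 = c(5, 5),
         col = "red", lty = 3, lwd = 2)                       # vertical segments
points(x[y >= 5], y[y >= 5], pch = 4, col = "darkmagenta", cex = 2)
points(x[y < 5], y[y < 5], pch = 1)
lines(x, y, lty = 4)                                          # connect all points
arrows(x0 = 6, y0 = -6, x1 = 5, y1 = 1.5)                     # arrow head near (5, 2)
text(x = 6, y = -6.5, labels = "a point")

# Each legend vector has one entry per label, with NA where not applicable
legend("bottomleft",
       legend = c("overall process", "large values", "other values"),
       pch = c(NA, 4, 1), lty = c(4, NA, NA),
       col = c("black", "darkmagenta", "black"))

dev.off()
n_large <- sum(y >= 5)
```

Starting from type="n" and adding layers one at a time makes it easy to give each group of points its own symbol and color, which the legend vectors then mirror entry by entry.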
