R Lecture 1

The document is an introduction to R, covering essential concepts such as data types, statistical measures (mean, median, quartiles), and data structures (vectors, arrays, matrices, data frames). It also discusses generic functions, graphical user interfaces, and data import/export methods. Key functions and operations in R are highlighted to facilitate data analysis and visualization.

Introduction to R

Dr. Manisha Verma


Resources
Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data, EMC Education Services, John Wiley & Sons, 2015, Chapter 3
• Data
• The annual sales in U.S. dollars for 10,000 retail customers in a CSV file

• head(sales) displays the first six records of sales
• Mean (algebraic measure) (sample vs. population):
Sample mean: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
Population mean: $\mu = \frac{\sum x}{N}$
Note: n is the sample size and N is the population size.
• Median:
• Middle value of the sorted data if the number of values is odd, or the average of the two middle values otherwise
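As a minimal sketch of the two definitions above, the mean can be computed by hand and compared with R's built-in mean(), and the median checked against the "average of the two middle values" rule. The vector `sales` here holds made-up values for illustration, not the 10,000-customer data set.

```r
# Hypothetical sample of six sales totals (made-up values).
sales <- c(120, 250, 310, 95, 400, 180)

# Sample mean by the formula: sum of the values divided by n.
n <- length(sales)
x_bar <- sum(sales) / n

# The built-in function should agree with the hand computation.
stopifnot(isTRUE(all.equal(x_bar, mean(sales))))

# n is even, so the median is the average of the two middle sorted values.
sorted <- sort(sales)
med_by_hand <- (sorted[n / 2] + sorted[n / 2 + 1]) / 2
med <- median(sales)

print(x_bar)
print(med)
```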

• Quartiles, outliers and boxplots
• Quartiles: Q1 (25th percentile), Q3 (75th percentile)
• Inter-quartile range: IQR = Q3 – Q1
• Five number summary: min, Q1, median, Q3, max
• Outlier: usually, a value more than 1.5 x IQR above Q3 or below Q1
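The quantities in this list can be sketched in a few lines of R. The vector `x` holds made-up values chosen so that one point is an obvious outlier. Note that quantile() uses R's default interpolation rule, so other software may report slightly different quartiles on other data.

```r
# Hypothetical data with one unusually large value.
x <- c(2, 4, 4, 5, 6, 7, 8, 9, 30)

# First and third quartiles (25th and 75th percentiles).
q <- quantile(x, probs = c(0.25, 0.75))
q1 <- q[[1]]
q3 <- q[[2]]

# Inter-quartile range: IQR = Q3 - Q1.
iqr <- IQR(x)

# Five number summary: min, Q1 (lower hinge), median, Q3 (upper hinge), max.
five <- fivenum(x)

# Flag values more than 1.5 x IQR below Q1 or above Q3.
lower <- q1 - 1.5 * iqr
upper <- q3 + 1.5 * iqr
outliers <- x[x < lower | x > upper]

print(five)
print(outliers)

# boxplot(x) would draw the same summary, with outliers as separate points.
```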

• Generic Functions in R
• A group of functions sharing the same name but behaving differently
depending on the number and the type of arguments they receive.
print(5)
print("Hello")
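• Both calls use print(), but the output is formatted according to the class of the argument.
• R dispatches a generic call to a class-specific method named generic.class(), and falls back to generic.default() when no such method exists
• plot(): the kind of plot drawn is determined by the variables passed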
• summary()
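A small sketch of generic dispatch, using summary() on two classes and a print method for a made-up class. The class name `money` and the method `print.money` are hypothetical and exist only for this illustration.

```r
# summary() is generic: the result depends on the class of its argument.
x <- c(10, 20, 30, 40)           # numeric vector
f <- factor(c("a", "b", "a"))    # factor

num_summary <- summary(x)   # min, quartiles, mean, max
fac_summary <- summary(f)   # counts per level

print(num_summary)
print(fac_summary)

# A hypothetical S3 class "money" with its own print method.
# print(m) dispatches to print.money() because class(m) is "money".
print.money <- function(x, ...) {
  cat("$", format(unclass(x), nsmall = 2), "\n")
  invisible(x)
}
m <- structure(5, class = "money")
print(m)

# methods(summary) lists the summary methods available in the session.
```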
• Help in R
R Graphical User Interfaces
• R software uses a command-line interface (CLI)
• Popular GUIs include R Commander, Rattle, and RStudio
• RStudio window panes:
• Scripts
• Workspace
• Plots
• Console
Data Import and Export
• Read from CSV
• Set path

• Read from other files such as TXT


• Import function default values
• Writing file
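The import and export steps above might look like the following sketch. The data frame, its column names, and the file names `sales_demo.csv` and `sales_demo.txt` are made up for illustration. A temporary directory stands in for a real data folder so the example can run anywhere.

```r
# Hypothetical data to write out and read back (made-up values).
sales <- data.frame(
  cust_id     = c(100001, 100002, 100003),
  sales_total = c(800.64, 217.53, 74.58),
  region      = c("east", "west", "north")
)

# Set the path: setwd() changes the working directory and returns the old one.
old_wd <- setwd(tempdir())

# Write and read a CSV file.
write.csv(sales, "sales_demo.csv", row.names = FALSE)
sales_csv <- read.csv("sales_demo.csv")

# Write and read a tab-delimited TXT file.
write.table(sales, "sales_demo.txt", sep = "\t",
            row.names = FALSE, quote = FALSE)
sales_txt <- read.table("sales_demo.txt", header = TRUE, sep = "\t")

# Defaults differ between import functions:
#   read.table(): header = FALSE, sep = ""   (any whitespace)
#   read.csv():   header = TRUE,  sep = ","
#   read.delim(): header = TRUE,  sep = "\t"
sales_delim <- read.delim("sales_demo.txt")

print(head(sales_csv))

# Restore the original working directory.
setwd(old_wd)
```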
Attribute and Data Types
• NOIR: nominal, ordinal, interval, and ratio attribute types
• Numeric, Character, and Logical Data Types

• Functions to examine the characteristics of a variable


• class(): What kind of object it is in R (its abstract type or how R will treat
it).
• typeof(): How the object is stored in memory (its internal storage type).
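The difference between class() and typeof() is easiest to see side by side. This is a minimal sketch with made-up variables.

```r
i <- 1L        # integer
x <- 3.14      # numeric stored as double
s <- "R"       # character
b <- TRUE      # logical

# For simple values the two often agree, but not for doubles.
class(i); typeof(i)   # "integer", "integer"
class(x); typeof(x)   # "numeric", "double"
class(s); typeof(s)   # "character", "character"
class(b); typeof(b)   # "logical", "logical"

# For richer objects they differ: class says how R treats the object,
# typeof says how it is stored.
f  <- factor(c("low", "high", "low"))
df <- data.frame(a = 1:3)

class(f);  typeof(f)    # "factor", "integer"
class(df); typeof(df)   # "data.frame", "list"
```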
• Test variables with the is.*() functions and coerce them with the as.*() functions
• length(): finds the number of elements in a vector, list, or other R object.
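A short sketch of testing, coercing, and measuring objects, using base R's is.*(), as.*(), and length() functions on made-up values.

```r
x <- 3.7

# Test the type of a variable.
is.numeric(x)     # TRUE
is.integer(x)     # FALSE: 3.7 is stored as a double
is.character(x)   # FALSE

# Coerce to another type.
xi <- as.integer(x)       # truncates toward zero, giving 3L
xc <- as.character(x)     # "3.7"
xn <- as.numeric("12.5")  # 12.5

# Coercion that cannot succeed yields NA (with a warning).
bad <- suppressWarnings(as.numeric("abc"))

# length() counts elements.
len_vec  <- length(c(10, 20, 30))     # 3
len_list <- length(list(1, "a"))      # 2
len_one  <- length(x)                 # 1: a single value is a vector of length 1
```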
• Vectors
Vectors are a basic building block for data in R. Simple R variables are
actually vectors. A vector can only consist of values in the same class.
• Create vectors using the combine function c() or the colon operator :
• Initialize a vector of a given length with vector()

• A vector has no dimension: dim() returns NULL
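The vector operations listed above can be sketched as follows. All variable names and values are made up for illustration.

```r
# Create vectors with c() and with the colon operator.
v1 <- c(1, 3, 5, 7)
v2 <- 1:5

# Initialize a numeric vector of length 3 (filled with zeros).
v3 <- vector(mode = "numeric", length = 3)

# A vector holds values of one class only, so mixed input is coerced.
mixed <- c(1, "a", TRUE)
class(mixed)    # "character"

# A simple variable is a vector of length 1.
is.vector(5)    # TRUE
length(5)       # 1

# A vector has no dimension attribute.
dim(v1)         # NULL
```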


• Arrays
• array(): Creates or tests for arrays.
• Matrix: matrix() creates a matrix from the given set of values.

• Matrix operations: +, -, * (elementwise multiplication), %*% (matrix multiplication), t() (transpose), solve() (inverse), sum() (sum of all elements)
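A minimal sketch of array() and the matrix operations above, on small made-up matrices. Note that matrix() fills values column by column unless byrow = TRUE is given.

```r
# A 2 x 3 x 4 array and a test for arrays.
arr <- array(1:24, dim = c(2, 3, 4))
dim(arr)        # 2 3 4
is.array(arr)   # TRUE

# Two 2 x 2 matrices, filled column by column.
A <- matrix(c(2, 1, 1, 3), nrow = 2, ncol = 2)   # rows: (2, 1) and (1, 3)
B <- matrix(1:4, nrow = 2, ncol = 2)             # rows: (1, 3) and (2, 4)

add_AB  <- A + B      # elementwise addition
elem_AB <- A * B      # elementwise multiplication
prod_AB <- A %*% B    # matrix multiplication
t_B     <- t(B)       # transpose
inv_A   <- solve(A)   # inverse (A must be square and non-singular)
total_A <- sum(A)     # sum of all elements

print(prod_AB)
print(inv_A)
```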
• Data Frames: provide a structure for storing and accessing several
variables of possibly different data types.
• data.frame() creates data frames
• $ : access the variables in data frame
• str() displays the structure of a data frame
• The subsetting operator [ ] extracts part of a data frame
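The data frame operations above can be sketched on a small made-up table. The column names `cust_id`, `sales_total`, and `region` are hypothetical.

```r
# Create a data frame with columns of different types.
df <- data.frame(
  cust_id     = c(1, 2, 3),
  sales_total = c(800.64, 217.53, 74.58),
  region      = c("east", "west", "north")
)

# $ accesses a single variable (column).
totals <- df$sales_total

# str() displays the structure: dimensions, column names, and types.
str(df)

# Subsetting with [rows, columns].
second_row <- df[2, ]                            # one row, all columns
two_cols   <- df[, c("cust_id", "sales_total")]  # all rows, two columns
big_sales  <- df[df$sales_total > 100, ]         # rows matching a condition

print(big_sales)
```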
