01 IntroSlides
Tyson S. Barrett
Summer 2017
Utah State University
Introduction
Objects
Importing Data
Saving Data
Conclusions
Introduction
This class will use R and RStudio to show how R can make several
aspects of your research simpler, more likely to be reproducible, and
more replicable.
RStudio
Data Types, Objects and More
Objects
Physical Objects
For Example:
Virtual Objects
Likewise, objects in R are useful for some things and not for others.
Objects are how we interact with the data, analyze it, and output it.
We will discuss the most important objects for working with data:
• Vectors
• Data Frames
• Lists
Data Types: Vectors
numeric
[1] "I think this is great." "I would suggest you lea
[3] "You seem quite smart."
factor
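As a minimal sketch of the vector types named on these slides, the following creates a numeric, a character, and a factor vector. The values and the names num_vec, chr_vec, and fct_vec are hypothetical and chosen only for illustration.

```r
# Hypothetical example values; they are not taken from the slides.
num_vec <- c(10.1, 2.1, 4.6)            # numeric vector
chr_vec <- c("low", "high", "medium")   # character vector
fct_vec <- factor(c("a", "b", "a"))     # factor: categories stored as levels

# class() reports what kind of vector each object is.
class(num_vec)
class(chr_vec)
class(fct_vec)

# A factor keeps track of its distinct categories.
levels(fct_vec)
```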
Data Frames
  A   B C
1 1 1.4 0
2 2 2.1 0
3 1 4.6 1
4 4 2.0 1
5 3 8.2 1
3
We can do quite a bit with the data.frame that we called df. (Once again,
we could have called it anything, although I recommend short names.)
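The code that created df is not shown here. One way it could be built, assuming the three columns are plain numeric vectors holding the values printed in the table, is sketched below.

```r
# Assumption: A, B, and C are ordinary numeric columns; the values
# are copied from the printed table.
df <- data.frame(
  A = c(1, 2, 1, 4, 3),
  B = c(1.4, 2.1, 4.6, 2.0, 8.2),
  C = c(0, 0, 1, 1, 1)
)

df        # print the data frame
dim(df)   # number of rows, then number of columns
```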
Data Frames
4
There are actually very small differences, but it’s really not important here.
Data Frames
df[1:3, "A"]
df[1:3, 1]
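A short sketch of why these two lines are interchangeable: the first selects column A by name and the second by position, and both are limited to rows 1 through 3. The data frame is rebuilt here with assumed numeric columns so the example runs on its own.

```r
# Assumed reconstruction of df from the printed table.
df <- data.frame(
  A = c(1, 2, 1, 4, 3),
  B = c(1.4, 2.1, 4.6, 2.0, 8.2),
  C = c(0, 0, 1, 1, 1)
)

by_name     <- df[1:3, "A"]   # rows 1 to 3, column named "A"
by_position <- df[1:3, 1]     # rows 1 to 3, first column

identical(by_name, by_position)
```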
Data Frames
Finally, we can use the c() function to grab several rows and
columns at once. To grab rows 1 and 5 and columns “B” and “C”, you can
do the following:
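A minimal sketch of that selection, again assuming df holds the numeric columns from the printed table (the name sub is hypothetical):

```r
# Assumed reconstruction of df from the printed table.
df <- data.frame(
  A = c(1, 2, 1, 4, 3),
  B = c(1.4, 2.1, 4.6, 2.0, 8.2),
  C = c(0, 0, 1, 1, 1)
)

# c(1, 5) picks the rows; c("B", "C") picks the columns.
sub <- df[c(1, 5), c("B", "C")]
sub
```

Because more than one column is selected, the result is itself a data frame rather than a vector.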
Some Functions for Data Frames
names(df)
class(df$A)
[1] "factor"
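The output above implies that column A is now a factor. A sketch of a data frame consistent with that output follows; the construction is an assumption inferred from the printed results on these slides, not the original code.

```r
# Assumption: A and C are factors, B is numeric. The level order of C
# (Male, then Female) is inferred from the printed summary.
df <- data.frame(
  A = factor(c("level1", "level2", "level1", "level4", "level3")),
  B = c(1.4, 2.1, 4.6, 2.0, 8.2),
  C = factor(c("Male", "Male", "Female", "Female", "Female"),
             levels = c("Male", "Female"))
)

names(df)     # the column names
class(df$A)   # the type of a single column
```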
Some Functions for Data Frames
summary(df)
      A            B              C
 level1:2   Min.   :1.40   Male  :2
 level2:1   1st Qu.:2.00   Female:3
 level3:1   Median :2.10
 level4:1   Mean   :3.66
            3rd Qu.:4.60
            Max.   :8.20
Some Functions for Data Frames
head(df, n=10)
       A   B      C
1 level1 1.4   Male
2 level2 2.1   Male
3 level1 4.6 Female
4 level4 2.0 Female
5 level3 8.2 Female
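Only five rows are printed even though n = 10 was requested, because head() returns at most n rows and df has five. A small sketch, using an assumed reconstruction of df and the hypothetical names top_ten and top_two:

```r
# Assumed reconstruction of df from the printed output.
df <- data.frame(
  A = factor(c("level1", "level2", "level1", "level4", "level3")),
  B = c(1.4, 2.1, 4.6, 2.0, 8.2),
  C = factor(c("Male", "Male", "Female", "Female", "Female"),
             levels = c("Male", "Female"))
)

top_ten <- head(df, n = 10)   # asks for 10 rows, but only 5 exist
top_two <- head(df, n = 2)    # just the first two rows

nrow(top_ten)
top_two
```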
Importing Data
Import, Don’t Input
Most of the time you’ll want to import data into R rather than
manually entering it line by line, variable by variable.
There are some built-in ways to import many delimited data types
(e.g., comma delimited, also called CSV; tab delimited; space
delimited). The delimiter is what separates the pieces of data.
Other packages have been developed to help with this as well.
A package is an extension to R that gives you more functions (abilities) to work
with data. Anyone can write a package, although to get it on the Comprehensive
R Archive Network (CRAN) it needs to be vetted to a large degree. In fact, after
some practice, you could write a package to help you more easily do your work.
Important Note about Importing
When you import data into R, it does not do anything to the data
file (unless you ask it to). So, you can play around with it in R,
change its shape, subset it, and whatever else you’d like without
destroying or even modifying the original data.
Note that the slides that discuss saving data show how you can
overwrite the original file (not recommended) or save additional data files.
Importing Data
load("file.rda")
Note that you don’t assign this to a name such as df. Instead, it
loads whatever R objects were saved to it.
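To illustrate how load() restores objects under the names they were saved with, here is a small round trip. The file path is a temporary file, and the object names scores and groups are hypothetical.

```r
# Hypothetical objects saved to a temporary .rda file.
rda_path <- tempfile(fileext = ".rda")
scores <- c(1.4, 2.1, 4.6)
groups <- c("a", "b", "c")
save(scores, groups, file = rda_path)

# Remove them from the workspace so we can see load() bring them back.
rm(scores, groups)

# load() recreates the objects under their original names and
# returns those names as a character vector.
loaded_names <- load(rda_path)
loaded_names
scores
```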
Delimited Files
## for csv
df <- read.table("file.csv", sep = ",", header=TRUE)
## for tab delimited
df <- read.table("file.txt", sep = "\t", header=TRUE)
## for space delimited
df <- read.table("file.txt", sep = " ", header=TRUE)
The argument sep tells the function what kind of delimiter the data
has and header tells R if the first row contains the variable names.
Note that above each line of code I left a comment using #.
Anything after a # is not read by the computer; it’s just for us humans.
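The effect of sep and header can be seen with a small self-contained sketch. It writes a hypothetical three-row CSV to a temporary file and reads it back twice; the file contents and the names csv_path, with_header, and no_header are made up for illustration.

```r
# Write a tiny, made-up CSV file to a temporary location.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("A,B,C", "1,1.4,0", "2,2.1,0", "1,4.6,1"), csv_path)

# header = TRUE: the first row supplies the variable names.
with_header <- read.table(csv_path, sep = ",", header = TRUE)
names(with_header)
nrow(with_header)

# header = FALSE: the first row is treated as data, so there is
# one extra row and the columns get default names.
no_header <- read.table(csv_path, sep = ",", header = FALSE)
names(no_header)
nrow(no_header)
```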
Other Data Formats
1. haven
2. foreign
install.packages("packagename")
library(packagename)
Other Data Formats
Using these packages, I will show you simple ways to bring your
data in from other formats.
library(haven)
## for Stata data
df <- read_dta("file.dta")
## for SPSS data
df <- read_spss("file.sav")
## for this type of SAS file
df <- read_sas("file.sas7bdat")
library(foreign)
## for SAS export (transport) files
df <- read.xport("file.xpt")
Data and Questions
If you have another type of data file to import, online help found
on sites like www.stackoverflow.com and www.r-bloggers.com often
has the solution.
Saving Data
Saving Data
Finally, there are many ways to save data. Most of the read...
functions have a corresponding write... function.
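As one example of this pairing, write.table() is the counterpart of read.table(). The sketch below saves a small, hypothetical data frame to a temporary CSV file and reads it back; row.names = FALSE is a choice made here so that row numbers are not written as an extra column.

```r
# A small, made-up data frame to save.
df <- data.frame(A = c(1, 2, 1), B = c(1.4, 2.1, 4.6))

# Write to a new temporary file rather than over an existing data file.
out_path <- tempfile(fileext = ".csv")
write.table(df, out_path, sep = ",", row.names = FALSE)

# Read it back with the matching read function.
df_again <- read.table(out_path, sep = ",", header = TRUE)
df_again
```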
Help Menu in R
?functionname
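For instance, using read.table as the function name (any function would do), a sketch of two ways to get help:

```r
# Open the help page for a function.
?read.table

# A quick reminder of a function's arguments and their defaults.
args(read.table)

# The argument names on their own, taken from the function definition.
arg_names <- names(formals(read.table))
head(arg_names)
```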
Conclusions
R is built for you