
Data Basics Variable Types Data Types Functions Inspecting Data Conclusion

Data, Observations, and Variables

Dr. ZHOU Titi


Division of Social Science
HKUST

SOSC 1100: Quantitative Data Analysis for Social Research I


Today’s Question

If X = {89, 74, 90}, then X is:

A. a character variable
B. a binary variable
C. a continuous variable
D. an observation


Plan for Today

! What are data and a dataframe?
  ! To create a dataframe: data.frame()
! What is an observation?
! What is a variable?
! Types of variables
  ! Character vs. numeric
  ! Continuous vs. discrete
! Types of data
  ! Conventional
  ! Unconventional
! Use functions in R: (), sqrt(), round(), #
! How to load and make sense of data in R?
  ! New functions and operators: setwd(), read.csv(), View(), head(), dim()


Observations, Data, Dataset, and Dataframe

! Information gathering is at the heart of all sciences, providing the observations used in data analyses.
! The observations gathered on the characteristics of interest are collectively called data.
! A dataset is a structured collection of data.
! Datasets are typically organized as dataframes, where rows are observations (1, 2, ...) and columns are variables (1, 2, ...).
! Dataframe vs. matrix: a matrix is restricted to containing data all of the same type (e.g., all numeric, all logical, or all character), whereas a dataframe can hold a different type in each column.
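
The dataframe vs. matrix distinction can be seen by storing the same two variables both ways. This is a minimal sketch with made-up values; the object names as_matrix and as_dataframe are illustrative only.

```r
# Hypothetical values, for illustration only
first_name  <- c("ana", "elena", "maria")
test_scores <- c(80, 75, 90)

# A matrix holds one type for all cells, so the numbers are converted to text
as_matrix <- cbind(first_name, test_scores)

# A dataframe keeps one type per column, so the numbers stay numbers
as_dataframe <- data.frame(first_name, test_scores)

class(as_matrix[, "test_scores"])   # "character"
class(as_dataframe$test_scores)     # "numeric"
```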

Example of a Dataframe

! To construct a dataframe: data.frame()
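
The slide's own example is not reproduced in this text, so here is a small sketch of a data.frame() call. The dataframe name students and its values are assumptions; the variable names first_name and test_scores follow the ones used later in the lecture.

```r
# Each argument becomes one column (variable); each position is one row (observation)
students <- data.frame(
  first_name  = c("ana", "elena", "maria"),   # hypothetical values
  test_scores = c(80, 75, 90)
)

students          # prints 3 rows (observations) and 2 columns (variables)
names(students)   # the variable names
```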


What is an observation?

! It is the information collected from a particular entity or individual in the study.
! The unit of observation of the dataset defines the individuals or the entities that each observation in the dataframe represents.
  ! If the unit of observation is students, each row in the dataframe represents a different student.
! We usually refer to an observation by its row number in the dataframe, which we denote as i.
  ! What is the first observation (i = 1) in the dataframe above?
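
In R, one way to pull out observation i is to index the dataframe by row number. The sketch below assumes a small made-up dataframe called students, since the slide's dataframe is not reproduced here.

```r
students <- data.frame(
  first_name  = c("ana", "elena", "maria"),   # hypothetical values
  test_scores = c(80, 75, 90)
)

# Row 1, all columns: the first observation (i = 1)
first_obs <- students[1, ]
first_obs
```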

What is a variable?

! A variable contains the values of a changing characteristic for the various individuals or entities in the study.
  ! Every column of data in a dataframe is a variable.
  ! If the unit of observation is students, each variable captures a specific characteristic of the students, for all the students in the study.
! We usually refer to a variable by its name.
  ! first_name, test_scores
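
Referring to a variable by name can be done in R with the $ operator. A short sketch, again assuming a made-up dataframe named students:

```r
students <- data.frame(
  first_name  = c("ana", "elena", "maria"),   # hypothetical values
  test_scores = c(80, 75, 90)
)

# dataframe$variable returns that column for all observations
scores <- students$test_scores
scores
```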


Notation
When defining a new variable, we represent the variable and its contents in the following format:
X = {10, 5, 8}

! On the left-hand side of the equal sign, we identify the variable name.
  ! What is the name of the variable here?
! On the right-hand side of the equal sign and inside curly brackets, we have the content of the variable: multiple observations, separated by commas.
  ! What are the observations in X?
! To represent each individual observation, we use X_i:
  ! where i stands for the observation number;
  ! the subscript i means that we have a different value of X for each value of i.
  ! What is X_3?
! The total number of observations is denoted as n:
  ! What does n equal here?
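
The same notation maps onto R fairly directly. This sketch uses c() to create X, square brackets to pick out observation i, and length() to count observations; it is one way to mirror the notation, not part of the original slide.

```r
X <- c(10, 5, 8)   # X = {10, 5, 8}

X[3]               # the third observation, X_3: 8
n <- length(X)     # the total number of observations
n                  # 3
```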

Types of Variables Based on Content


Character vs. Numeric

! A character variable contains text.
  ! first_name = {ana, elena, maria, ...}
! A numeric variable contains numbers.
  ! test_scores = {80, 75, 90, ...}
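
R can report how a variable is stored with class(). A brief sketch using the example values from this slide (the trailing "..." entries are left out):

```r
first_name  <- c("ana", "elena", "maria")   # text goes in quotes
test_scores <- c(80, 75, 90)                # numbers do not

class(first_name)    # "character"
class(test_scores)   # "numeric"
```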


Numeric Variable: Continuous vs. Discrete

! Continuous: a variable is continuous if it can take an infinite continuum of possible real-number values.
  ! There is always another possible value between any two possible values.
  ! Examples: income, height, weight, the amount of time it takes to read a book
  ! Age is continuous in the sense that an individual does not age in discrete jumps.
! Discrete: a variable is discrete if its possible values form a set of separate numbers.
  ! Discrete variables have a basic unit of measurement that cannot be subdivided.
  ! Examples: the number of students in a class, the number of siblings, the number of times a person has been married
! The continuous-discrete distinction can be blurry.
  ! It depends on how variables are measured. For example, age recorded in completed years takes only whole-number values.
  ! "Is gender a continuous variable?"

Discrete Variable: Binary vs. Non-binary

Binary variable:
! It can take only two values: 1 and 0.
! The values represent the presence or absence of a trait:
  ! 1 if individual i has the trait
  ! 0 if individual i does NOT have the trait
! Example: voted = {1, 0, 0, 1, 1, 1, 0}, where

$$
\text{voted}_i = \begin{cases} 1 & \text{if individual } i \text{ voted} \\ 0 & \text{if individual } i \text{ didn't vote} \end{cases}
$$

! Can you think of another example?
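
As one possible further example, a yes/no text variable can be recoded into a binary variable with ifelse(), a function that appears later in this course's function list. The names message_status and received are illustrative assumptions.

```r
# Hypothetical text variable: did each person receive a message?
message_status <- c("no", "no", "no", "yes", "no", "no")

# 1 if the trait is present ("yes"), 0 otherwise
received <- ifelse(message_status == "yes", 1, 0)
received

# With 1/0 coding, the mean is the proportion of 1s
voted <- c(1, 0, 0, 1, 1, 1, 0)
mean(voted)   # 4 of 7 individuals voted
```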


Non-binary variable:
! Categorical variable: a variable with a finite set of categories.
  ! Nominal (categorical): variable with unordered categories.
    ! Example: race, hair color, types of food, method of travel to work
  ! Ordinal (ranked): variable with ordered categories.
    ! Example: educational level, level of satisfaction, steak doneness
! Count variable: values are a form of counts (0, 1, 2, 3, and so on).
  ! Example: the number of students enrolled in schools, the number of sunny days per year, the number of cigarettes a person smokes per day.
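
The nominal vs. ordinal difference can be sketched in R: table() counts categories without implying any order, while factor() with ordered = TRUE records a ranking. The data and variable names below are made up for illustration.

```r
# Nominal: categories have no order (hypothetical travel methods)
travel <- c("bus", "mtr", "walk", "mtr", "bus", "mtr")
table(travel)   # frequency of each category

# Ordinal: categories have a meaningful order
satisfaction <- factor(
  c("low", "high", "medium", "high"),
  levels  = c("low", "medium", "high"),
  ordered = TRUE
)
satisfaction[1] < satisfaction[2]   # TRUE: "low" ranks below "high"
```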


Data Types

! Conventional
  1. Cross-sectional data
  2. Time series data
  3. Pooled cross sections data
  4. Panel (or longitudinal) data
! Unconventional
  1. Textual data
  2. Network data
  3. Spatial data


Conventional Data

Conventional 1: Cross-sectional Data
! It consists of a sample of individuals, households, firms, cities, states, countries, or a variety of other units, taken at a given point in time.
  ! Minor timing differences within a year in collecting the data are ignored.
! It can be obtained by random sampling from the underlying population.
  ! Random sampling: a method to randomly select a sample of observations from the target population.
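
Random sampling can be illustrated with R's sample() function. This is a toy sketch: the population of 1,000 ID numbers, the sample size of 10, and the seed are all arbitrary assumptions.

```r
set.seed(1100)         # arbitrary seed, so the same draw can be reproduced

population <- 1:1000   # hypothetical ID numbers for the target population

# Draw 10 IDs at random, without replacement
survey_ids <- sample(population, size = 10)
survey_ids
```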


Conventional Data

Conventional 2: Time Series Data
! It consists of observations on a variable or several variables over time.
  ! Example: stock prices, money supply, consumer price index, GDP, etc.
! Time is an important dimension:
  ! Past events can influence future events.
  ! Lags in behavior are prevalent in the social sciences.
! The chronological ordering of observations conveys important information.
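
Why ordering matters can be seen by building a lagged variable, which only makes sense when observations are in chronological order. The index values below are invented, and this shift-by-one construction is just one simple way to create a lag in base R.

```r
# Hypothetical index values, in chronological order
cpi <- c(100, 102, 105, 104, 108)

# Lag by one period: each entry is the previous period's value
cpi_lag1 <- c(NA, head(cpi, -1))

# Period-to-period change; undefined (NA) for the first period
change <- cpi - cpi_lag1

data.frame(period = 1:5, cpi, cpi_lag1, change)
```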


Conventional Data

Conventional 3: Pooled Cross Sections Data

! It has both cross-sectional and time series features.
  ! Example: suppose that two cross-sectional household surveys are taken in 1985 and in 1990 respectively.
  ! In 1985, a random sample of households is surveyed for a set of variables; in 1990, a new random sample of households is taken for the same variables.
  ! To increase sample size, we combine the two to form a pooled cross section.
! Useful to see how a key relationship has changed over time.
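
Pooling two cross sections amounts to stacking their rows while keeping track of the survey year. A sketch with invented numbers (the income variable and its values are assumptions); rbind() stacks dataframes that share the same columns.

```r
# Two separate random samples, surveyed on the same variables
survey_1985 <- data.frame(year = 1985, income = c(21, 35, 28))       # hypothetical
survey_1990 <- data.frame(year = 1990, income = c(30, 26, 41, 33))   # hypothetical

# Stack the rows to form a pooled cross section
pooled <- rbind(survey_1985, survey_1990)

nrow(pooled)         # 7 households in total
table(pooled$year)   # how many come from each survey year
```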


Conventional Data

Conventional 4: Panel (Longitudinal) Data

! It consists of a time series for each cross-sectional member in the dataset.
  ! Example: suppose we have wage, education, and employment history for a set of individuals followed over a 10-year period.
! The same cross-sectional units are followed over a given time period.
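
In a dataframe, panel data are often laid out with one row per unit per period, so the same id appears repeatedly. The sketch below uses two made-up individuals over three years; the column names are assumptions.

```r
# One row per individual per year (hypothetical values)
panel <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2),
  year = c(2001, 2002, 2003, 2001, 2002, 2003),
  wage = c(15, 16, 18, 22, 22, 25)
)

panel[panel$id == 1, ]   # the time series for individual 1
table(panel$id)          # each individual is observed 3 times
```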


Unconventional Data

Unconventional 1: Textual Data

! Digitized text from emails, websites, social media messages, news reports, government documents, health records, digitized published articles and books, etc.
! Example: The disputed authorship of The Federalist Papers
  ! The Federalist consists of 85 essays attributed to Alexander Hamilton, John Jay, and James Madison, written from 1787 to 1788.
  ! Because both Hamilton and Madison helped draft the Constitution, scholars regard The Federalist as a primary document reflecting the intentions of the authors of the Constitution.
  ! Among all the essays, 73 are uncontested; for 12 essays, the authorship is under debate.


Unconventional Data

Example: The disputed authorship of The Federalist Papers

! The text of the 85 essays is scraped from the Library of Congress website
and stored as fpXX.txt , where XX represents the essay number ranging
from 01 to 85.
! Scraping: an automated method of data collection from websites using a
computer program.
The Federalist Papers data

! Method: distinguish the authors on the basis of their writing style.
  ! Filler words: the authors use upon, by, and to at different rates.
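
A rough sketch of the filler-word idea: split a text into words and count how often upon, by, and to occur. The sentence below is invented (it is not from The Federalist), and this base R approach is only one simple way to do the counting, not necessarily the method used in the study.

```r
# Made-up sentence standing in for the text of an essay
essay <- "Upon the whole, the power to tax depends upon the consent given by the people."

# Lower-case the text and split it wherever a non-letter appears
words <- strsplit(tolower(essay), "[^a-z]+")[[1]]
words <- words[words != ""]

# Count each filler word, then express the counts per 1,000 words
filler <- c("upon", "by", "to")
counts <- sapply(filler, function(w) sum(words == w))
rate_per_1000 <- 1000 * counts / length(words)

counts
rate_per_1000
```

Comparing such rates across essays with known authors is what would let the disputed essays be matched to a writing style.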

Unconventional Data

Unconventional 2: Network Data

! It describes relationships among units rather than units in isolation.
  ! Example: friendship networks among people, citation networks among academic articles, and trade networks among countries.
  ! The unit of analysis is a relationship.
! Some key concepts:
  ! Adjacency matrix: one way to represent network data; its entries indicate the existence or absence of a relationship between two units.
  ! A directed network contains directionality, with senders and receivers, whereas an undirected network does not.
  ! An undirected network yields a symmetric adjacency matrix, whereas a directed network generally does not.
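
The symmetry point can be checked on two tiny adjacency matrices. Both networks are invented, and treating rows as senders and columns as receivers is an assumed convention; isSymmetric() is a base R function.

```r
units <- c("A", "B", "C")   # hypothetical units

# Undirected friendship network: A-B and B-C are friends
friends <- matrix(c(0, 1, 0,
                    1, 0, 1,
                    0, 1, 0),
                  nrow = 3, byrow = TRUE, dimnames = list(units, units))

# Directed citation network: row = sender (citing), column = receiver (cited)
cites <- matrix(c(0, 1, 1,
                  0, 0, 1,
                  0, 0, 0),
                nrow = 3, byrow = TRUE, dimnames = list(units, units))

isSymmetric(friends)   # TRUE
isSymmetric(cites)     # FALSE
```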


Unconventional Data

Example: the marriage network in Renaissance Florence

Florence Marriage Network Data

! An adjacency matrix:
  ! The entries represent the existence of relationships between two units (one represented by the row and the other by the column).
  ! 1 indicates that a relationship exists; 0 indicates no relationship.

Unconventional Data

Unconventional 3: Spatial Data

! Spatial data contain information about patterns over space and can be
visualized through maps.
! Main types of spatial data:
! Spatial point data represent the locations of events as points on a map.
! Spatial polygon data represent geographical areas by connecting points on a
map.
! Spatial-temporal data: a set of spatial point or polygon data recorded over
time, revealing changes in spatial patterns over time.

! Example: the geographical distribution of power elites in China


! Data: all the Central Committee (CC) members in China during 1945-2012.


Unconventional Data

Example: the geographical distribution of power elites in China

! Uneven geographical distribution but the pattern has changed over time.
! A decreasing effect of revolutionary base and an increasing effect of
education

[Maps: 8th CC in 1956; 18th CC in 2012]


Functions

Use functions in R

! Think of a function as an action that you request R to perform on a particular object or piece of data.
! Example: Use functions to do more advanced calculations.
  ! To take the square root of 25:
## Use the power operator ^
25 ^ 0.5
## [1] 5
## Use the square root function, sqrt()
sqrt(25)
## [1] 5


Functions

A function

! takes input(s): takes the number 25
! performs an action with the input(s): computes the square root of 25
  ! When we use a function to do something, we generally refer to this as calling the function.
! produces an output: produces the number 5


Functions

! We will learn how to use these functions:
  ! sqrt(), setwd(), read.csv(), View(), head(), dim(), mean(), ifelse(), table(), prop.table(), hist(), median(), sd(), var(), plot(), abline(), cor(), lm(), c(), sample(), rnorm(), pnorm(), print(), abs(), and summary(), etc.
! In time, we will learn:
  ! their names
  ! the actions they perform
  ! the inputs they require
  ! the outputs they produce


Function Formats

! The name of a function is always followed by parentheses (): function_name().
! Inside the parentheses, we specify the inputs, which we refer to as arguments: function_name(arguments).
! Most functions require that we specify at least one argument but can take many optional arguments.
  ! Some arguments are required, others are optional.
! When multiple arguments are specified inside the parentheses, they are separated by commas: function_name(argument1, argument2)


Function Formats

! To specify the arguments, we enter them in a particular order or include the name of the argument in our specification:
  ! function_name(argument1, argument2) or
  ! function_name(argument1_name = argument1,
                  argument2_name = argument2)
! We always specify required arguments first.
  ! If there is more than one required argument, we enter them in the order expected by R.
! We specify any optional arguments we want next and include their names:
  ! function_name(required_argument,
                  optional_argument_name = optional_argument)


Function Formats

USING R FUNCTIONS:

We typically write code in one of these two formats:

function_name(required_argument)
or
function_name(required_argument,
optional_argument_name = optional_argument)
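
As a concrete instance of the two formats, mean() takes the data as its required argument and accepts na.rm as an optional named argument. The scores below are made up for illustration.

```r
scores <- c(80, 75, NA, 90)   # hypothetical test scores, one of them missing

# Format 1: required argument only
mean_with_na <- mean(scores)                   # NA, because one value is missing

# Format 2: required argument, then a named optional argument
mean_without_na <- mean(scores, na.rm = TRUE)  # drops the missing value first

mean_with_na
mean_without_na
```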


Examples

Fictitious example:
! Suppose R were capable of baking and that it had a function named bake() that, by default, bakes the specified ingredient for 60 minutes at 400°F.
  ! Required argument: the ingredient
    ! Example: cake_mix
  ! Optional arguments: named degrees and minutes, to change the default temperature and duration of the bake, respectively.
    ! degrees = 350 changes the temperature to 350°F
    ! minutes = 30 changes the duration of the bake to 30 minutes
! The following code would ask R to bake a cake mix for 30 minutes at 350°F, so that we can have cake as the output:
bake(cake_mix, degrees = 350, minutes = 30)
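
R has no bake() function, so the fictitious call cannot be run as is. To make the idea of defaults and optional arguments tangible, the sketch below defines a toy bake() that only returns a sentence; the definition, and passing the ingredient as a quoted string, are assumptions made for illustration.

```r
# Toy stand-in: bake() is NOT a real R function
bake <- function(ingredient, degrees = 400, minutes = 60) {
  paste0("Baked ", ingredient, " for ", minutes,
         " minutes at ", degrees, " degrees F")
}

# Required argument only: the defaults (60 minutes, 400 degrees) are used
default_bake <- bake("cake_mix")

# Optional arguments named explicitly to override the defaults
custom_bake <- bake("cake_mix", degrees = 350, minutes = 30)

default_bake
custom_bake
```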


Examples

Example: sqrt() computes the square root of the argument.

! To compute the square root of 25, run:
sqrt(25)
## [1] 5
  ! sqrt is the name of the function, which, like all function names, is followed by parentheses ().
  ! 25 is the required argument.
  ! 5 is the output.
! Alternatively, we create an object that contains the number 25 and then run the function:
twentyfive <- 25
sqrt(twentyfive)
## [1] 5


Examples

Example: round() rounds a value, by default to the nearest whole number.

! To round the number 3.14165, run:
round(3.14165)
## [1] 3
round(3.14165, 2)
## [1] 3.14
  ! In the second function call, the second argument is the number of decimal places to round to (i.e., 2).
! When calling a complicated function, it's not easy to remember which argument comes first. ⇒ Make use of argument names.
round(x = 3.14165, digits = 2)
## [1] 3.14


Some Notes

! Code is sequential! One must run code in order.
  ! Whenever returning to work on an R script, run all the code from the beginning.
! It is good practice to comment code.
  ! Comments are short notes to yourself or your collaborators explaining what the code does.
  ! Use # to comment code.
  ! R ignores everything that follows # until the end of the line.
! Examples:
sqrt(25) # calculates square root of 25
## [1] 5
# sqrt(25) calculates square root of 25


How to load and make sense of data in R?

! Does social pressure affect turnout?
  ! Alan S. Gerber, Donald P. Green, and Christopher W. Larimer. 2008. "Social Pressure and Voter Turnout: Evidence from a Large-Scale Field Experiment." American Political Science Review, 102(1), 33-48.
  ! A study of social pressure within neighborhoods and voter turnout
  ! Data collected through a large-scale field experiment
  ! "voting.csv"


Some Notes
! Folder and files:
  ! Create a folder titled SOSC1100 on your Desktop.
  ! Download voting.csv and HKBarometerL2.csv from Canvas and save them in the folder SOSC1100.
! To follow the R demonstration in this lecture, you can choose:
  ! to create a new R script in RStudio; or,
  ! to download Lecture02_Exercise.R from Canvas, save it in the SOSC1100 folder, and open it in RStudio.
! Remember: to execute code, highlight it and either
  1. manually click "Run", or
  2. use the shortcut Command+Enter on Mac or Ctrl+Enter on Windows.
! Remember:
  ! R ignores anything that follows the symbol #.
    ! This is what we use to comment our code.
  ! R code is sequential (it has an order!).
    ! This means that you need to run lines 1 through 3 before running line 4.
  ! R code is case sensitive.

Does social pressure affect turnout?

! Let’s answer this question by analyzing the voting dataset.

Does social pressure affect turnout?

The voting Dataset

! Unit of observation: registered voters
! Description of variables:

Variable   Description
birth      Year of birth of registered voter
message    Whether registered voter received message: "yes", "no"
voted      Whether registered voter voted: 1=voted, 0=didn't vote

Does social pressure affect turnout?

Overview

## STEP 1. Set working directory to SOSC1100 folder using setwd()


## STEP 2. Load the dataset using read.csv()
## STEP 3. Understand the data using View() or head()
## STEP 4. Identify the types of variables included
## STEP 5. Identify the number of observations using dim()

Does social pressure affect turnout?

STEP 1. Set working directory to SOSC1100 folder

! The setwd() function:
  ! Sets the working directory, that is, directs R to the folder on your computer where the dataset is saved.
  ! The only required argument is the path to the folder in quotes.
  ! Example: setwd("~/Desktop/folder")

setwd("~/Desktop/SOSC1100") # if Mac
setwd("C:/user/Desktop/SOSC1100") # if Windows
  ! Note: In the Windows code, user is your own username.
! When in doubt, you can always set it manually in RStudio:
Session > Set Working Directory > To Source File Location
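
To confirm that setwd() worked, getwd() reports the current working directory. The sketch below uses a temporary folder as a stand-in for the Desktop folder so that it runs on any machine; on your own computer you would use the SOSC1100 path instead.

```r
old_dir <- getwd()   # remember where R was pointing before

# Temporary stand-in for the SOSC1100 folder (an assumption for this demo)
demo_dir <- file.path(tempdir(), "SOSC1100")
dir.create(demo_dir, showWarnings = FALSE)

setwd(demo_dir)                 # point R to the folder
current <- basename(getwd())    # check the folder R is now using
current                         # "SOSC1100"

setwd(old_dir)                  # restore the original working directory
```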

Does social pressure affect turnout?

STEP 2: Load the dataset

! The read.csv() function
  ! Reads comma-separated values (CSV) files.
  ! The only required argument is the name of the CSV file in quotes.
  ! Example: read.csv("file.csv")

voting <- read.csv("voting.csv") # reads and stores data

! Remember: To store the dataset as an object, we use the assignment operator <-.
  ! Could we have named the object something other than voting?
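
If voting.csv is not at hand, the read-and-store step can still be tried on a small stand-in file. The sketch writes the six rows shown in the head(voting) output of this lecture to a temporary CSV and reads them back; the file name voting_small.csv and the object names are assumptions.

```r
# Six rows copied from the head(voting) output in these slides
small <- data.frame(
  birth   = c(1981, 1959, 1956, 1939, 1968, 1967),
  message = c("no", "no", "no", "yes", "no", "no"),
  voted   = c(0, 1, 1, 1, 0, 0)
)

# Write a small CSV file to a temporary location
csv_path <- file.path(tempdir(), "voting_small.csv")
write.csv(small, csv_path, row.names = FALSE)

# Read the file and store it as an object; any valid name would do
voting_small <- read.csv(csv_path)
voting_small
```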

Does social pressure affect turnout?

STEP 3: Understand the data


! The View() function
  ! Opens a new tab in the source pane of RStudio with the contents of a dataset.
  ! The only required argument is the name of the object where the dataset is stored (without quotes).
  ! Example: View(data)
View(voting) # opens new tab with entire dataset

! The head() function
  ! Shows the first six rows or observations in a dataset.
  ! The only required argument is the name of the object where the dataset is stored (without quotes).
  ! In the output, the first column identifies the position of the observations, and the first row identifies the names of the variables.
  ! Example: head(data)
head(voting) # shows first observations of dataset
Does social pressure affect turnout?

View(voting) # opens new tab with entire dataset
head(voting) # shows first several observations of dataset
##   birth message voted
## 1  1981      no     0
## 2  1959      no     1
## 3  1956      no     1
## 4  1939     yes     1
## 5  1968      no     0
## 6  1967      no     0
## (Read about description of variables and unit of observation)
## unit of observation? what does each observation represent?
## substantively interpret the first observation.
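
head() also accepts an optional argument, n, for the number of rows to show. The sketch below rebuilds the six rows printed above as a stand-in dataframe (voting_small is an assumed name) and asks for only the first three.

```r
# Stand-in for the voting data: the six rows shown in the output above
voting_small <- data.frame(
  birth   = c(1981, 1959, 1956, 1939, 1968, 1967),
  message = c("no", "no", "no", "yes", "no", "no"),
  voted   = c(0, 1, 1, 1, 0, 0),
  stringsAsFactors = FALSE
)

# Optional argument n changes how many rows are shown (default is 6)
first_three <- head(voting_small, n = 3)
first_three
```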

Does social pressure affect turnout?

STEP 4: Identify the types of variables included

head(voting) # shows first observations of dataset
##   birth message voted
## 1  1981      no     0
## 2  1959      no     1
## 3  1956      no     1
## 4  1939     yes     1
## 5  1968      no     0
## 6  1967      no     0
## (character vs. numeric; continuous vs. discrete; binary, categorical (nominal vs. ordinal), or count)
## What type of variable is birth?
## What type of variable is message?
## What type of variable is voted?
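
R can help with the character vs. numeric part of this step by reporting how each column is stored. The sketch uses a stand-in dataframe built from the six rows above (voting_small is an assumed name). Note that the storage class does not settle the finer distinctions: whether a numeric variable is binary, count, or continuous still has to be judged from its description and values.

```r
# Stand-in for the voting data: the six rows shown in the output above
voting_small <- data.frame(
  birth   = c(1981, 1959, 1956, 1939, 1968, 1967),
  message = c("no", "no", "no", "yes", "no", "no"),
  voted   = c(0, 1, 1, 1, 0, 0),
  stringsAsFactors = FALSE
)

# Storage class of every variable (column)
var_classes <- sapply(voting_small, class)
var_classes

# Distinct values help spot a binary variable
sort(unique(voting_small$voted))   # 0 1
```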
Does social pressure affect turnout?

STEP 5: Identify the number of observations

! The dim() function
  ! Provides the dimensions of a dataframe.
  ! The only required argument is the name of the object where the dataframe is stored (without quotes).
  ! The output is two values:
    ! The first indicates the number of observations in the dataframe;
    ! The second indicates the number of variables.
  ! Example: dim(data)

dim(voting) # provides dimensions: rows, columns
## [1] 229444 3
## How many observations are in the dataset?
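
The two values returned by dim() can also be obtained one at a time with nrow() and ncol(). The sketch uses a six-row stand-in dataframe (voting_small is an assumed name), so its counts differ from the full voting dataset.

```r
# Stand-in for the voting data: six rows, three variables
voting_small <- data.frame(
  birth   = c(1981, 1959, 1956, 1939, 1968, 1967),
  message = c("no", "no", "no", "yes", "no", "no"),
  voted   = c(0, 1, 1, 1, 0, 0)
)

dims <- dim(voting_small)   # rows first, then columns
dims                        # 6 3

nrow(voting_small)          # number of observations: 6
ncol(voting_small)          # number of variables: 3
```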

Does social pressure affect turnout?

Quit R

! Before closing your computer, remember to save the R script; otherwise you risk losing unsaved changes.
  ! either use shortcuts (Command+S or Ctrl+S) or
  ! click on File > Save
! If you quit RStudio, R will ask whether you want to save the workspace image, which contains all the objects you have created during the R session.
  ! I recommend that you do NOT save it.
  ! You can always re-create the objects by re-running the code in your R script.


Today’s Lecture
! Data/dataset/dataframe
! Observations and variables
! Variable types: character vs. numeric; continuous vs. discrete
! Data types: conventional and unconventional
! Using functions: (), sqrt(), round(), #
! Loading and viewing data: setwd(), read.csv(), View(), head(), dim()

Class Meeting Next Tuesday


! Review what we’ve learned from this lecture
! Discuss your questions and comments
! Load and make sense of some new real data
! Bring your laptops!
