STAT 359 Study Guide

Nils Dosaj Mikkelsen


December 16, 2020

1 Slide 1 - Review
1. Here’s a list of some basic R commands:

(a) read.table(file): reads tabular data into R (txt, csv, prn, etc.)


(b) names(dataframe): lists all column names for the data frame
(c) attach(dataframe): allows data frame columns to be called directly
(d) detach(dataframe): detaches an attached data frame
(e) summary(dataframe): provides summary statistics of the data frame
(quantiles, mean, attribute frequency, etc.)
(f) mean(data): computes the mean value
(g) sum(data): returns the sum of all values in data
(h) median(data): returns middle value of the data
(i) var(data): returns the variance of the data
(j) sd(data): returns the standard deviation of the data
(k) data[1]: returns 1st element of data (R is 1-indexed, not 0-indexed)
(l) data[data < 5]: returns all entries that meet this condition
(m) length(data): returns number of entries in data
(n) data[,3]: returns 3rd column (all rows)
(o) data[5,]: returns 5th row (all columns)
(p) data[2:4,5] returns rows 2-4 (inclusive) from 5th column
(q) data[data > 4 & data < 8]: logical connectors (and)
(r) data[data > 3 | data < 6]: logical connectors (or)

(s) order(data): returns the indices that sort data in ascending order
(use data[order(data)] to get the sorted values)
(t) rev(order(data)): indices that sort data in descending order
(u) seq(s,e,step): creates an inclusive sequential vector of values from
s to e incremented by step
(v) rep(val,num): creates a vector containing val, num times.
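The subsetting and sequence commands above can be exercised with a small made-up vector (the values here are illustrative only):

```r
# Illustrative vector for the subsetting commands above
data <- c(7, 2, 9, 4, 6)

data[1]                     # first element: 7 (R is 1-indexed)
data[data < 5]              # values meeting a condition: 2 4
length(data)                # number of entries: 5
data[data > 4 & data < 8]   # logical "and": 7 6
data[order(data)]           # ascending sort (same as sort(data))
data[rev(order(data))]      # descending sort
seq(1, 9, 2)                # 1 3 5 7 9 (inclusive, step 2)
rep(3, 4)                   # 3 3 3 3
```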

2. Distribution commands:

(a) rnorm(n): generates n random normally distributed values


(b) pnorm(0) = 0.5: enter a number of standard deviations (a z-score),
returns the cumulative probability below it
(c) qnorm(0.5) = 0: enter a probability, returns the number of standard
deviations from the mean (used for confidence intervals)
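pnorm and qnorm are inverses of one another; a quick sketch of the three commands (the values shown are standard normal facts):

```r
# pnorm: standard deviations -> cumulative probability
pnorm(0)       # 0.5: half the mass lies below the mean
pnorm(1.96)    # ~0.975

# qnorm: cumulative probability -> standard deviations
qnorm(0.5)     # 0
qnorm(0.975)   # ~1.96, the usual 95% CI multiplier

# rnorm: n random draws from N(0, 1)
x <- rnorm(100)
```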

3. Remove missing values from a data vector with:

(a) assign the column to a variable name: var = df$column


(b) remove missing values: var = var[!is.na(var)]

4. We can obtain a 100(1 − α)% confidence interval as:

(a) set α, e.g. α = 0.05 gives a 95% CI


(b) Calculate upper and lower bounds:

x̄1 − x̄2 ± qnorm(1 − α/2) ∗ sqrt(var(X1)/length(X1) + var(X2)/length(X2))

(c) If the interval contains the value 0, we cannot reject the null
hypothesis H0 : µ1 = µ2
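The bound in (b) can be computed directly in R; the samples x1 and x2 below are simulated placeholders, not course data:

```r
# Two simulated samples (for illustration only)
set.seed(1)
x1 <- rnorm(30, mean = 5, sd = 2)
x2 <- rnorm(40, mean = 5, sd = 2)

alpha <- 0.05                        # alpha = 0.05 gives a 95% CI
d  <- mean(x1) - mean(x2)            # difference in sample means
se <- sqrt(var(x1)/length(x1) + var(x2)/length(x2))
ci <- d + c(-1, 1) * qnorm(1 - alpha/2) * se
ci                                   # lower and upper bounds

# If 0 lies inside ci, we cannot reject H0: mu1 = mu2
(ci[1] < 0) & (0 < ci[2])
```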

5. We can also determine if the means are different with the t-test

6. Tobs is the observed value of the test statistic.

7. The p-value is the probability, under the assumption that H0 is true,


that the test statistic is at least as extreme as that observed.

8. To compute the two-sided p-value for H0 : µ1 = µ2: 2 ∗ (1 − pnorm(abs(Tobs)))
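Item 8 as a runnable line, with a hypothetical observed statistic Tobs:

```r
# Two-sided p-value from a large-sample z statistic (Tobs is hypothetical)
Tobs <- 1.8
pval <- 2 * (1 - pnorm(abs(Tobs)))
pval   # ~0.072: fail to reject H0 at alpha = 0.05
```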

2 Slide 2 - Two Sample T-Test and Bootstrapping
1. Bootstrapping: A resampling method that uses random sampling
with replacement. Bootstrapping is handy with small sample sizes.

2. Monte Carlo Method: Relying on repeated random sampling to


obtain numerical results.
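A minimal bootstrap sketch (the sample, seed, and number of resamples are all illustrative choices):

```r
set.seed(42)
x <- rnorm(15, mean = 10, sd = 3)   # a small simulated sample

B <- 2000
# Resample x with replacement B times and record each mean
boot_means <- replicate(B, mean(sample(x, replace = TRUE)))

sd(boot_means)                         # bootstrap estimate of the SE of the mean
quantile(boot_means, c(0.025, 0.975))  # percentile 95% confidence interval
```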

3 Slide 3 - Review of Probability Distributions and Q-Q Plots
1. Chi-square Distribution: If Z ∼ N(0, 1) we say that the random
variable defined by X = Z^2 is χ^2(1).
More generally, if Z_1, . . . , Z_k are independent N(0, 1), then
X = Z_1^2 + · · · + Z_k^2 is said to be χ^2(k).
Ex.
To compute P(X ≥ 4) if X ∼ χ^2(3): 1 − pchisq(q = 4, df = 3)
To compute the quantile q_0.7 with P(X ≤ q_0.7) = 0.7: qchisq(p = 0.7, df = 4)

2. T-Distribution: If Z ∼ N(0, 1) and W ∼ χ^2(n), and Z and W are
independent, then X = Z/sqrt(W/n) has a t_n distribution.

The t-distribution is shaped like the normal distribution but it has
heavier tails (meaning they approach zero more slowly, which can seem
counter-intuitive).

3. A bow shape in a normal Q-Q plot indicates a skewed distribution, such as a chi-square distribution.
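This can be seen by plotting simulated data (both samples below are made up via simulation):

```r
# A normal Q-Q plot of skewed (chi-square) data bows away from the
# reference line, while normal data falls near it (simulated data)
set.seed(1)
z <- rnorm(200)
w <- rchisq(200, df = 3)

par(mfrow = c(1, 2))
qqnorm(z, main = "Normal data"); qqline(z)
qqnorm(w, main = "Chi-square data (bowed)"); qqline(w)
```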

4 Slide 4 - Skewness and Kurtosis


1. Skewness: A measure of asymmetry within the distribution of the
data. Skewness greater than zero indicates that the right tail is longer
than the left tail (the left tail falls off earlier). Skewness
equal to zero corresponds to a symmetric distribution.

skew = function(x){
  m3 = sum((x - mean(x))^3)/length(x)   # third central moment
  s3 = sqrt(var(x))^3                   # standard deviation cubed
  m3/s3                                 # standardized skewness
}

2. Kurtosis: The fourth moment of a distribution. Kurtosis measures


the peakedness along with the heaviness of the tails of a distribution.
Heavy tails mean that there is a larger probability of getting very large
values.
The normal distribution is taken as a reference and therefore the kur-
tosis of a normal distribution is 0.
Distributions with positive kurtosis have heavier than normal tails. An
example would be the t-distribution.
Distributions with negative kurtosis have a flatter shape in the middle.
An example would be a uniform distribution.

kurtosis = function(x){
  m4 = sum((x - mean(x))^4)/length(x)   # fourth central moment
  s4 = var(x)^2                         # variance squared
  m4/s4 - 3                             # excess kurtosis (normal = 0)
}
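Applying both functions above to simulated samples illustrates the sign conventions (the distributions chosen are examples, not course data):

```r
skew <- function(x){
  m3 <- sum((x - mean(x))^3)/length(x)
  s3 <- sqrt(var(x))^3
  m3/s3
}
kurtosis <- function(x){
  m4 <- sum((x - mean(x))^4)/length(x)
  s4 <- var(x)^2
  m4/s4 - 3
}

set.seed(1)
skew(rchisq(1e5, df = 3))   # positive: right tail longer
skew(rnorm(1e5))            # near 0: symmetric
kurtosis(rt(1e5, df = 5))   # positive: heavier-than-normal tails
kurtosis(runif(1e5))        # negative: flat middle (~ -1.2)
```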

5 Slide 5 - More on Two Sample Testing: t-tests
1. Two sample t-test: An alternative approach for testing the equality of
population means in the small sample setting. t-tests assume that the
data arises from a normal distribution. Other assumptions are made
depending on the type of t-test being used:

(a) pooled t-test: assumption of equal variance


(b) Welch t-test: no assumption of equal variance
(c) paired t-test: two samples that are dependent (pairs of data)
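The three variants map onto arguments of R's t.test; x and y below are simulated placeholders:

```r
set.seed(1)
x <- rnorm(20, mean = 5)
y <- rnorm(20, mean = 6)

t.test(x, y, var.equal = TRUE)   # pooled t-test (equal variances assumed)
t.test(x, y)                     # Welch t-test (default; no equal-variance assumption)
t.test(x, y, paired = TRUE)      # paired t-test (equal-length, dependent samples)
```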

6 Slide 6 - Paired Data: Parametric and Non-
parametric Methods
1. Paired t-test: Used for testing the difference in means between paired
data (such as before and after measurements). The paired t-test
assumes normality.

2. The Signed Rank Test: An alternative to the paired t-test which
does not assume normality. Test statistics are derived from the ranks
of the data values. The signed rank test is robust to outliers.
Ranks are generated by computing the differences between the data
pairs. The smallest absolute difference is ranked 1. Ties are split (e.g.
a tie for ranks 3 and 4 results in a rank of 3.5 for both pairs).
To compute the test statistic, compute (Before − After) for every data
pair and sum the ranks of the positive differences (i.e. where After is
less than Before).
To use: wilcox.test(before, after, paired = TRUE)
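A sketch of the signed rank test with made-up before/after measurements:

```r
# Hypothetical before/after measurements on the same eight subjects
before <- c(12.1, 9.8, 11.4, 10.2, 13.0, 9.5, 11.9, 10.8)
after  <- c(11.0, 9.9, 10.1,  9.7, 12.2, 9.0, 11.2, 10.0)

# Signed rank test: no normality assumption, uses ranks of differences
wilcox.test(before, after, paired = TRUE)
```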

7 Slide 7 - The Mann-Whitney Test: A Nonparametric Two-Sample Procedure
1. The Mann-Whitney Test: Similar to the signed rank test in that
it does not assume normality. The Mann-Whitney test assumes that
the data are independent and not paired. The Mann-Whitney test is
based on ranks, where we pool the m + n observations and rank them
from 1, . . . , m + n. The test statistic is then computed as the sum of
the ranks of the first sample.
To use: wilcox.test(x, y, paired = FALSE)
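A sketch with two made-up independent samples of unequal size (m = 5, n = 6):

```r
# Hypothetical independent samples; no pairing required
g1 <- c(3.2, 4.1, 2.8, 5.0, 3.9)
g2 <- c(5.5, 6.1, 4.8, 5.9, 6.4, 5.2)

wilcox.test(g1, g2)   # Mann-Whitney; paired = FALSE is the default
```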

8 Slide 8 - Analysis of Variance


1. ANOVA generalizes the t-test for J ≥ 2 populations assuming a sample
size of n is drawn from each.
Very often the J populations will correspond to the J different levels
of an experimental factor that is manipulated in an experiment with n
observations taken at each of its J levels.

First, we want to detect if the means are all equal. In the case where
the means are not all equal, a secondary objective is to investigate how
the means differ from each other.

2. When the error bars are based on the standard error of the mean,
overlap indicates that there is insufficient evidence of a population
difference. In this same case, non-overlap does not necessarily indicate
sufficient evidence of a difference in population means.

3. When the error bars are based on confidence intervals, overlapping


confidence intervals do not imply that there is insufficient evidence of a
population difference, as in our example. When the confidence intervals
are non-overlapping, this does indicate evidence that the population
means are different.
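A one-way ANOVA sketch in R with three simulated groups (J = 3, n = 10; the data are illustrative):

```r
set.seed(1)
# J = 3 populations, n = 10 observations each (simulated)
y     <- c(rnorm(10, mean = 5), rnorm(10, mean = 5), rnorm(10, mean = 7))
group <- factor(rep(c("A", "B", "C"), each = 10))

fit <- aov(y ~ group)
summary(fit)    # F-test of H0: all means equal
TukeyHSD(fit)   # if H0 is rejected, investigate which pairs of means differ
```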

9 Slide 9 - ANOVA for Factorial Experiments


1.

10 Slide 10 -
1.

11 Slide 11 -
1.

12 Slide 12 -
1.

13 Slide 13 -
1.

14 Slide 14 -
1.

15 Slide 15 -
1.

16 Slide 16 -
1.

17 Slide 17 -
1.

18 Slide 18 -
1.

19 Fundamentals
1.

20 Dataframes
1.

21 Central Tendency
1.

22 Variance
1.
