Advanced Quantitative Methods

The document outlines advanced quantitative methods, focusing on descriptive statistics, hypothesis testing, and regression analysis. It discusses various statistical concepts such as confidence intervals, types of data, and the importance of sample size, as well as methods like ANOVA and Structural Equation Modeling (SEM). Key points include the significance of p-values, the Central Limit Theorem, and the relationship between variables in regression models.

Uploaded by

saina


Descriptive statistics
• Understand the theory first from the literature that you are reading.
• Price and quantity demanded are inversely related – law of demand.
• One person’s idea becomes a statement, and the further it spreads, the more it becomes a conjecture.
• Concepts – abstract stage. Also called factors, constructs or latent variables.
• Ability – construct; IQ – item / measured variable.
• 99% interval – while sampling a population, you are not able to sample everyone; out of 1000 you may reach at most 500.
• Construct – something that you don’t know how to measure directly, or that can be measured in multiple ways.
• A moderating variable can have a different effect on different groups or categories.

Categorical:
Nominal – the label is changed into a number.
Ordinal – the values are in an order, but arithmetic operations (such as adding or averaging) are not meaningful on ordinal or nominal data.
They are discrete.

Scale:
Interval – they do not have a true zero, so ratios are not meaningful. 20 degrees doesn’t mean it’s twice as hot as 10 degrees.
Ratio – they have a true zero; all mathematical operations can be done. Here 10% is half of 20%.

YOU CAN’T TRIM MORE THAN 20% OF DATA EVEN IF IT’S A VERY BIG
DATA SET.
Percentile – gives you the location

Standard deviation – typical deviation/dispersion from the mean

The mean absolute deviation (MAD) is a measure of variability
that indicates the average distance between each observation and
the mean.
Knowing whether your data is normal or not helps with inference –
confidence interval and hypotheses
Skewness is a measure of the asymmetry of a distribution.
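Kurtosis – how heavy the tails of the distribution are (how prone the data is to extreme values)
Normal data is mesokurtic; low kurtosis – platykurtic, high kurtosis – leptokurtic
Standard error of the mean – dispersion of the sample mean from sample to sample. I would want this to be as low as possible.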
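
The measures above can be computed by hand. The following is a minimal sketch in plain Python, assuming simple moment-based formulas for skewness and kurtosis; JMP applies its own sample corrections, so its output may differ slightly. The function name `describe` and the sample data are made up for illustration.

```python
import math
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    # Mean absolute deviation: average distance of each observation from the mean.
    mad = sum(abs(x - mean) for x in values) / n
    # Central moments (dividing by n) used for skewness and kurtosis.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5            # 0 for a symmetric distribution
    excess_kurtosis = m4 / m2 ** 2 - 3   # 0 for a normal (mesokurtic) distribution
    sem = sd / math.sqrt(n)              # standard error of the mean
    return {
        "mean": mean,
        "sd": sd,
        "mad": mad,
        "skewness": skewness,
        "excess_kurtosis": excess_kurtosis,
        "sem": sem,
    }


# Hypothetical sample, for illustration only.
ages = [23, 25, 31, 35, 38, 41, 44, 52, 58, 63]
for name, value in describe(ages).items():
    print(f"{name}: {value:.3f}")
```

Negative excess kurtosis points to a platykurtic shape and positive excess kurtosis to a leptokurtic one.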

The Weight, Freq and By roles are very rarely used in JMP.

Bar graph – qualitative (categorical) data; histogram – continuous (interval or ratio) data

After running JMP, first we quote the data.

If the standard deviation is too close to the range, then it’s too high.

Upper and lower 95% mean – the bounds of the 95% confidence interval; the population mean will most likely lie between these two values.
As my confidence level increases, the interval gets wider, so my precision decreases.
95% CI means that if I had taken 100 samples, about 95 of the resulting intervals would contain the population mean.
Confidence level and precision are inversely related. The choice of confidence level can be subjective.
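
The trade-off between confidence level and precision can be seen in a small calculation. This sketch assumes a normal (z) approximation rather than the t-based interval that JMP reports, and the sample figures (mean 50, standard deviation 10, n = 100) are invented for illustration.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sample_sd, n, level=0.95):
    """Approximate confidence interval for a mean using the normal distribution."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # e.g. about 1.96 for 95%
    margin = z * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Hypothetical sample summary: mean 50, SD 10, n = 100.
for level in (0.90, 0.95, 0.99):
    lower, upper = mean_confidence_interval(50.0, 10.0, 100, level)
    print(f"{level:.0%} CI: ({lower:.2f}, {upper:.2f}), width {upper - lower:.2f}")
```

The interval widens as the confidence level goes from 90% to 99%, which is the loss of precision described above.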

Central Limit theorem – If your sample size is large (n > 30 as a rule of thumb), the sampling distribution of the mean becomes more and more normally distributed, even if the population itself is not normal.
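
A quick simulation can make the Central Limit Theorem concrete. This is only a sketch: the skewed exponential population, the sample size of 40 and the 5000 repetitions are assumptions chosen for illustration.

```python
import math
import random
import statistics


def sample_means(n, repetitions, seed=0):
    """Draw `repetitions` samples of size n from a skewed population
    (exponential with mean 1 and SD 1) and return the mean of each sample."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(repetitions)
    ]


means = sample_means(n=40, repetitions=5000)
print("mean of sample means:", round(statistics.fmean(means), 3))
print("SD of sample means:  ", round(statistics.stdev(means), 3))
print("sigma / sqrt(n):     ", round(1 / math.sqrt(40), 3))
```

The sample means centre on the population mean, and their spread is close to sigma / sqrt(n), which is the standard error of the mean.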

HYPOTHESIS
If the null uses an equal-to sign (alternative is “not equal to”), it’s a two-tailed test; otherwise it’s a one-tailed test.
Univariate analysis – just one characteristic.
Parametric test – assumes that the population is normal – use a z or t test.
Non-parametric test – you click on the Wilcoxon Signed Rank box in JMP

P value tells you the observed level of significance; if the p value is less than alpha (5% by default), then the null is rejected.

When we do a two-tailed test, the one-tail area is doubled (2p) and then compared with alpha.

Don’t write that the null is accepted; always write rejected or not rejected, because while there is a high chance our calculation is correct, our confidence level is never 100%.
If the null is rejected, then the p value is colored in JMP.
Only if the sample mean is greater than your hypothesized mean is it a right-tail rejection (null – age ≤ 50); otherwise it’s a left-tail rejection (null – age ≥ 50).
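
The relationship between the tails and the p value can be sketched with a one-sample z test. This assumes the population standard deviation is known (JMP would normally use a t test), and the numbers (sample mean 52, hypothesized mean 50, sigma 8, n = 64) are hypothetical.

```python
import math
from statistics import NormalDist


def z_test(sample_mean, hypothesized_mean, sigma, n):
    """One-sample z test; returns the z statistic and the three possible p values."""
    z = (sample_mean - hypothesized_mean) / (sigma / math.sqrt(n))
    cdf = NormalDist().cdf(z)
    return {
        "z": z,
        "p_left": cdf,                           # alternative: mean < hypothesized
        "p_right": 1 - cdf,                      # alternative: mean > hypothesized
        "p_two_tailed": 2 * min(cdf, 1 - cdf),   # alternative: mean != hypothesized
    }


alpha = 0.05
result = z_test(sample_mean=52.0, hypothesized_mean=50.0, sigma=8.0, n=64)
for name in ("p_left", "p_right", "p_two_tailed"):
    decision = "reject null" if result[name] < alpha else "do not reject null"
    print(f"{name}: {result[name]:.4f} -> {decision}")
```

Because the sample mean is above the hypothesized mean, only the right-tail and two-tailed alternatives can lead to a rejection here.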

In social science fields we want more inclusivity, so we use 99% confidence (a wider interval), but in the medical field we need more precision, so we can use 95% or 90% confidence (a narrower interval).

If we want the null not to be rejected, we will choose a lower alpha, and if we want the null to be rejected, we will choose a higher alpha.

Chi square test – nominal data; the chi-square distribution is positively skewed.

1. T-test: 2 categories
2. One-way ANOVA: more than 2 categories
3. Two-way ANOVA: two categorical factors (x variables)

Fit Y by X is used when I have a single dependent variable y (BMI) and a single independent variable (gender).

If the p value is greater than 0.05, then the null is not rejected.
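
For comparing more than two categories, the F statistic behind a one-way ANOVA can be computed by hand. This is a minimal sketch with made-up groups; it stops at the F statistic because the p value needs the F distribution, which JMP supplies in its output.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for a list of groups."""
    all_values = [x for group in groups for x in group]
    grand_mean = fmean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical measurements for three categories.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F means the group means differ by much more than the spread within the groups, which is evidence against the null that all means are equal.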

For two-way ANOVA – add Cross in JMP because you want the interaction between the two variables.

In the Effect Tests table, gender and drug taken separately are the main effects, and the last one is called the interaction effect.

When my correlation is 0, this indicates there is no linear relationship.

Linear regression - Least square method tries to minimize the sum of the
squared errors.

Correlation: Analyze – Multivariate

If the correlation is low, there is still a chance that the relationship is non-linear.

Steep fit lines show stronger positive or negative relationships; a flat line means it’s a weak relationship.
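
The points on correlation and least squares can be sketched in a few lines. The helper names and the data are illustrative assumptions; the second data set is deliberately a perfect curve (y = x squared) to show that a correlation of 0 only rules out a linear relationship.

```python
from statistics import fmean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def least_squares_line(x, y):
    """Slope and intercept that minimise the sum of squared errors."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Linear data: strong correlation and a clear slope.
print(pearson_r([1, 2, 3, 4], [2, 4, 5, 8]))
print(least_squares_line([1, 2, 3, 4], [2, 4, 5, 8]))

# Perfectly related but non-linear data: the correlation is 0.
x = [-2, -1, 0, 1, 2]
y = [v ** 2 for v in x]
print(pearson_r(x, y))
```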

Fit model

You use cross in fit model if one of your x variables is categorical.

Used for linear regression

R square – how well the model fits; how much of the variation in BP is explained by cholesterol and age. Here, age and cholesterol explain only 13.8% of the variation in BP, therefore my regression wasn’t very successful; it’s not a well-fitting model.

RMSE – an important measure of how well the model fits; it checks the accuracy of the model. It should be as close to 0 as possible.

If age and cholesterol are equal to 0, then we can expect the BP to be 67.10.
If my age changes by 1 unit, on average I can expect my BP to go up by 0.308, keeping other factors constant.

If the total cholesterol changes by 1 unit, then the BP will go up by 0.066, keeping all other factors constant.
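
The interpretation above corresponds to the fitted equation BP = 67.10 + 0.308 × age + 0.066 × cholesterol. The sketch below only plugs values into that equation; the function name and the example inputs (age 40, cholesterol 200) are made up for illustration.

```python
# Coefficients as quoted in the notes above.
INTERCEPT = 67.10
B_AGE = 0.308
B_CHOLESTEROL = 0.066


def predict_bp(age, cholesterol):
    """Predicted BP from the fitted regression equation."""
    return INTERCEPT + B_AGE * age + B_CHOLESTEROL * cholesterol


print(predict_bp(0, 0))                           # the intercept
print(predict_bp(41, 200) - predict_bp(40, 200))  # effect of one more unit of age
print(predict_bp(40, 201) - predict_bp(40, 200))  # effect of one more unit of cholesterol
```

Holding the other variable fixed while changing one input by 1 unit reproduces each coefficient, which is what "keeping other factors constant" means.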

Hypothesis here: the null is that the coefficient (for age, and likewise for the rest) is equal to 0; this is what is tested in the probability column. Here, the alternative is supported since all the p values are below 0.05.

Here, the intercept is significant, but there is no theoretical justification for it. So, in my next model I will remove the intercept and run the model.

1. Another solution here is to increase your sample size but it’s not a
plausible solution.

2. Another solution would be to increase my independent variables.

3. Another would be to include a categorical variable.

Adjusted R square tells you if you should be adding your explanatory variable to a model – your adjusted R square should go up from the previous one.

Adj R square is very rigid hence even a 0.1 variation is good. It tells us if
we have underfit or overfit a model.
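
Adjusted R square penalises R square for the number of explanatory variables. The sketch below uses the standard textbook formula; pairing R square = 0.138 with n = 442 observations and 2 predictors is an assumption pieced together from figures mentioned in these notes, not a reported JMP result.

```python
def adjusted_r_squared(r_squared, n, p):
    """Adjusted R square for n observations and p explanatory variables."""
    return 1 - (1 - r_squared) * (n - 1) / (n - p - 1)


# Assumed figures: R square 0.138, 442 observations, 2 predictors.
print(round(adjusted_r_squared(0.138, 442, 2), 4))

# Adding a third variable that barely raises R square lowers adjusted R square.
print(round(adjusted_r_squared(0.1381, 442, 3), 4))
```

If adjusted R square does not go up after adding a variable, that variable is not earning its place in the model.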

Degrees of freedom: the more you estimate from the data, the more freedom the data loses.

Here 3 parameters were estimated from 442 observations, so 3 of my data points have become rigid (model degrees of freedom: 3 - 1 = 2).

If you don’t include the intercept, then you can’t compute a comparable R square or adjusted R square.
Since my RMSE has gone up from my previous model, this is not a better model.

Typically, when you add a variable, it could make another variable insignificant, and multicollinearity could happen.

A colored Prob value means the null is rejected, hence the variable is significant.

Categorical variable – dummy variable

Given that age and cholesterol are 0, for gender = 0, BP = 49.928

Given that age and cholesterol are 0, for gender = 1, BP = 49.928 - 2.371 = 47.557

Structural Equation Modelling

Latent variable – hidden; it cannot be measured directly but can be inferred from other, measured variables (decision making, financial literacy)

Difference from linear regression – linear regression does not have latent variables

SEM resolves multicollinearity

Measured variables, called items, are also in it.

Latent variables follow either a reflective model (used in JMP) or a formative model.

Reflective model – the latent variable is reflected in other variables; these variables will have high correlation.

Formative model – the latent variable is formed by other variables. They need not be correlated.

You can have multiple independent and dependent variables. Latent
variables can be measured by latent variables.

Linear regression – 1 dependent variable

Structural model – relationship between two latent variables.

Confirmatory Factor Analysis – CFA will confirm whether all latent variables
are properly constructed. Analyze – multivariate – SEM

First you have to form the diagram. Then run it

1. Model comparison:

• Unrestricted model – the best-fitting model for the data. All variables can interact with each other.
• Independence model – the worst-fitting model
• Model 1 (our model: we name it) – lies in between the best and worst fitting models

-2 log likelihood tells you how badly the model fits. The higher the number, the worse it is. This gives you an indication of how good your model is, based on its closeness to the unrestricted model.

AICc and BIC tell you how well your model fits. A lower value implies a better-fitting model.
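
The link between -2 log likelihood and the information criteria can be sketched with the standard textbook definitions. JMP may compute these slightly differently for SEM, and the input numbers below (a -2 log likelihood of 100, 5 parameters, 50 observations) are hypothetical.

```python
import math


def information_criteria(neg2_log_likelihood, k, n):
    """AIC, AICc and BIC from -2 log likelihood, k parameters and n observations."""
    aic = neg2_log_likelihood + 2 * k
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)   # small-sample correction
    bic = neg2_log_likelihood + k * math.log(n)
    return {"AIC": aic, "AICc": aicc, "BIC": bic}


# Hypothetical model: -2 log likelihood = 100, 5 parameters, 50 observations.
for name, value in information_criteria(100.0, 5, 50).items():
    print(f"{name}: {value:.2f}")
```

Both criteria start from -2 log likelihood and add a penalty for every extra parameter, so a lower value signals a better balance of fit and simplicity.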

Chi square in the model tells you if it’s a good-fitting model. You can’t tell anything from the value itself, so you have to check whether the null is rejected or not. Here the null is that it’s a good-fitting model. A p value of 0.7624 means the null is not rejected, hence it’s a good-fitting model.
Comparative Fit Index (CFI) – closer to 1 (0.9-1 preferably)

Root Mean Square Error of Approximation (RMSEA) – should be less than 0.1. Here, it’s 0, hence it’s a good-fitting model.

Heat map – if there are too many dark red or blue boxes, then it’s heated. We want a cooled map.

Assess measurement model – only visible for CFA. It tells you about the reliability and validity of the model. Cronbach’s alpha can also be used for this.

If all the items are surpassing the threshold level, then all the items are
individually reliable. We will drop the question if it’s too far from the
data. Reliability indicates consistency.
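
Cronbach’s alpha, mentioned above as a reliability check, can be computed directly from item scores. This is a minimal sketch of the standard formula; the three survey items and five respondents are invented for illustration.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items, each a list of scores
    in the same respondent order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # total score per respondent
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses: 3 items answered by 5 respondents.
items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 5],
    [4, 2, 5, 3, 4],
]
print(round(cronbach_alpha(items), 3))
```

Values closer to 1 indicate that the items move together, which is the consistency that reliability refers to.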

Check whether the latent variable crosses the threshold line. Here we will accept it, as it’s almost meeting the red line.
If my values on the diagonal are greater than the values above the diagonal, it indicates high validity. There will be nothing below the diagonal.

Bad model:

- Change your latent variable: here we could drop one of the latent variables, like leadership. We will look at the correlations to judge. For example, among the variables that come under leadership, one might be negative, or its correlation might be weak.

2. Parameter estimates

The loading on the first item is fixed at 1 by default. Factor loading – my factor is represented by, or loaded against, goal, work and interact.
You can’t interpret the unstandardized factor loadings, so you change them to correlations by going to Show – Show Estimates – and making them standardized (they are unstandardized by default).

We want positive and high correlations. If I am seeing a very high correlation between latent variables, then it might be a problem later. It might indicate multicollinearity.
SEM

For SEM you must interconnect your independent variables with one another. If there are many dependent variables, you must interconnect them also.

Chi square – the null is not rejected, hence it’s a good model. All other criteria are then unnecessary. RMSEA is almost equal to 0, so it’s a good model.
A 1 unit increase in leadership will lead to a 0.634 unit average increase in satisfaction, keeping conflict constant.

A 1 unit increase in conflict will lead to a 0.357 unit fall in satisfaction, keeping leadership constant.

Solid line – significant, dashed line – insignificant.

Here the null is that beta is equal to 0.

Mid-term:

Comparing mean – use Anova

t-test: pair wise comparison

two-way ANOVA – single dependent variable and 2 x factors

- Here, the null is that the means across all six categories are equal to one another
