Stat Lecture 1
ANALYSIS WITH SOFTWARE APPLICATION
Course Description
■ This course provides a fundamental understanding of
the concepts of statistical inference necessary to
effectively employ statistical methods in
contemporary business situations. It is designed to
use the appropriate statistical techniques and any
available software application that will facilitate a
data-driven decision-making process in the field of
accounting and other related areas. In addition to the
more complex software for data analysis, students are
required to be highly proficient in the use of
MS Excel/Jamovi/SPSS for statistical analysis.
Course Outline
1. Basic statistical concepts and methods of statistical
inference
2. Data collection, organization, and presentation of data
3. Measures of central tendency and variation,
elementary probability
4. Sampling and sampling distributions
5. Confidence intervals
6. Hypothesis testing
7. Correlation and regression analysis
8. Statistical analysis using software
■ Statistics refers to the
science dealing with the
collection, organization,
analysis, and interpretation
of numerical data.
• In the plural sense, “statistics” means a
set of data or a mass of
observations, such as public health statistics.
Methods of Data Collection
1. Direct or Interview method
2. Indirect or Questionnaire
3. Registration
4. Experimental
1. Direct or Interview method
- an oral type of data collection with face-to-face contact
between the researcher and the respondents.
• Types of Interviews:
o Formal – requires an appointment w/ the respondent
o Informal – by chance interview
o Clinical – involves a patient & his health provider
o In-depth – wider & deeper coverage as in investigative or
detective cases.
o Focus – solicits views and opinions from a group of people
o Non-Directed – the interviewed person is given the task of
providing pieces of advice.
▪ e.g., counselling given by a guidance counsellor
2. Indirect or Questionnaire
-set of written & planned questions related
to a particular topic intended to answer the
problem of the study.
• Types of Questionnaires:
o Close ended – answerable through options or
choices.
o Open ended – questions that require further
explanation in phrases or paragraphs.
▪ e.g., narrative responses
3. Registration – data obtained
from records of births, deaths, marriages,
licenses, and the census.
4. Experimental – used in
scientific research.
What is a “Variable”?
o Math – A value that may change within the
scope of a problem or situation (vs. a “constant”)
▪x
o Research – A logical set of attributes (gender,
age, etc.)
▪ Age
o Computers – A symbolic name given to an
unknown quantity.
▪ A$
Relationships Between
Variables
▪ In Math, the relationship between 2
variables is written as a “function.”
❖ F(x) = 210 − x
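The function on this slide can be written directly in Python. This is only a small sketch: the name `f` and the sample inputs are chosen for illustration and are not part of the lecture.

```python
def f(x):
    """The slide's example function: F(x) = 210 - x."""
    return 210 - x

# x is the input variable; f(x) is the variable that depends on it.
for x in (0, 10, 210):
    print(x, f(x))
```

Changing the input changes the output in a fixed way, which is what "relationship between 2 variables" means here.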
1. Descriptive Statistics
2. Inferential Statistics
1. Descriptive Statistics
• Uses different methods of
statistics to summarize and present
data in narrative form.
• Methods of tabulation
o Graphical presentation
o Computation of averages
o Measures of variability
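A minimal Python sketch of the descriptive methods listed above (averages and measures of variability), using only the standard library. The birthweight values are made up for illustration.

```python
import statistics

# Hypothetical birthweights in kilograms (illustrative data only).
birthweights = [2.9, 3.1, 3.4, 3.0, 3.6, 3.2, 2.8]

# Averages (measures of central tendency)
mean = statistics.mean(birthweights)
median = statistics.median(birthweights)

# Measures of variability
value_range = max(birthweights) - min(birthweights)
stdev = statistics.stdev(birthweights)  # sample standard deviation

print(f"mean={mean:.2f} median={median} range={value_range:.1f} sd={stdev:.2f}")
```

These numbers only describe the seven values at hand; nothing is inferred about babies in general.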
2. Inferential Statistics
• Makes generalizations and draws conclusions
about a target population based
on results from a sample.
o Experimental method
• Phenomena of Variation: tendency
of measurable characteristics to change
from one individual or setting to another, or within the
same individual or setting.
o Person’s blood pressure
2 Types of Variables:
1. Constant
▪ values remain the same from time to time.
❖ minutes in an hour
❖ number of days in a week
2. Independent/Dependent Variable
▪ measured according to quantity or values and are
expressed numerically.
❖ Birthweight
❖ hospital bed capacity
❖ arm circumference
Types Of Independent/Dependent Variable:
1. Discrete Variable
➢ A variable which can assume only integral
values or whole numbers.
✓ number of books
2. Continuous Variables
➢ A variable which can attain values in terms of
fractions or decimals.
✓ Birthweight
✓ Arm circumference
Levels of Measurement
1. Nominal
• Numbers or symbols used to classify an object, person or
characteristics into categories.
o Collection of yes, no, undecided responses to a
medical survey question.
2. Ordinal
• Data are arranged in some order but differences
between data values cannot be determined.
o Size of t-shirt
o Socio economic status
o In 10 urine samples 6 were rated normal, 4
pathological
3. Interval
• Characterized by a common unit of
measurement.
• Distances between any two points on the scale are of
known size.
o Temperature readings of 15 ºC and 35 ºC
4. Ratio
• Has a true zero point wherein the number zero
indicates the absence of the characteristic under
consideration.
o Height in meters
o Weight in kilograms
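A short Python sketch of why the interval/ratio distinction matters, using the 15 ºC and 35 ºC readings from the interval example. The conversion to Kelvin (a scale with a true zero) is brought in only for illustration and is not part of the lecture.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius reading to Kelvin, a scale with a true zero."""
    return celsius + 273.15

low_c, high_c = 15, 35
low_k, high_k = celsius_to_kelvin(low_c), celsius_to_kelvin(high_c)

# Differences are meaningful on an interval scale: the gap is the same.
diff_c = high_c - low_c
diff_k = high_k - low_k

# Ratios are not: "35 is 2.33 times 15" depends on where zero was placed.
ratio_c = high_c / low_c
ratio_k = high_k / low_k

print(diff_c, round(diff_k, 2), round(ratio_c, 2), round(ratio_k, 2))
```

The difference survives the change of scale but the ratio does not, which is why "twice as hot" is not a valid statement for Celsius readings, while "twice as heavy" is valid for weight in kilograms.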
There are two broad flavours of
variables (Types of Variables;
Independent/Dependent Variable)
Dichotomous Variables
■The most common type of categorical
variable is “dichotomous”, meaning that it
has two levels or two possible values.
■“Dichotomize” means to convert a
non-dichotomous variable to a dichotomous
one.
■Categorizing Continuous Variables:
categorical variables with more levels
can also be created.
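The two ideas above can be sketched in Python as follows. The ages, the cut-off of 40, and the category labels are all hypothetical choices made for illustration.

```python
# Hypothetical ages in years (a continuous variable recorded as whole years).
ages = [17, 22, 35, 41, 58, 63, 70]

def dichotomize(age, cutoff=40):
    """Convert age to a two-level variable using an assumed cut-off."""
    return "older" if age >= cutoff else "younger"

def categorize(age):
    """Convert age to a three-level categorical variable (assumed bands)."""
    if age < 18:
        return "minor"
    elif age < 60:
        return "adult"
    return "senior"

two_levels = [dichotomize(a) for a in ages]
three_levels = [categorize(a) for a in ages]

print(two_levels)
print(three_levels)
```

Note that categorizing throws away information: once 41 and 58 are both labelled "adult", the original difference between them cannot be recovered.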
Sampling
▪ Descriptive Statistics
❖ Just describing the people in front of you.
▪ Inferential Statistics
❖ Using the information to learn
something about the larger population at
hand
■ The sample comes from the larger
population, also called “reference
population.”
■The sample is extracted from the larger
population, then manipulated to learn
something about the larger population.
■Statistics are performed on this
representative sample in order to infer
properties about the reference population.
■It’s important that the sample be
representative.
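A minimal Python sketch of drawing a simple random sample from a reference population and comparing the sample mean with the population mean. The population here is simulated (ages with mean 21 and standard deviation 3 are assumed values), and the fixed seed is only there to make the run repeatable.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the example is repeatable

# Simulated reference population: 10,000 hypothetical student ages.
population = [rng.gauss(21, 3) for _ in range(10_000)]

# Simple random sample of 100, drawn without replacement.
sample = rng.sample(population, 100)

pop_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population mean = {pop_mean:.2f}")
print(f"sample mean     = {sample_mean:.2f}")
```

Because every member of the population had the same chance of being picked, the sample mean tends to land close to the population mean. A sample picked in a biased way (for example, only first-year students) would not give that assurance.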
The Null Hypothesis
o What is it?
▪ It is a statement that there is no
relationship between the variables we are
testing.
o Why do we care?
▪ Statistical tests allow us to either “reject” or
“fail to reject” the null hypothesis.
o H₀: the average number of subjects getting
better in the test group is no different from the
average number of subjects getting better in the placebo group.
The p-Value
o A “p-value” is
computed from a
statistical test. It
tells us whether
we should reject
the null
hypothesis.
• Whether or not we reject the null is
determined by whether the p-value
is below a certain cut-off, which we
call the alpha value. Traditionally,
alpha is set to 0.05 or 0.01.
■A convenient, though inaccurate
interpretation is that the p-value is the
probability that your result was due to
chance → More accurately, the p-value
is the probability of obtaining a result
at least as extreme as the one observed,
assuming the null hypothesis is true.
■A useful memory aid: “If the p is
low, the null (hypothesis) must
go.”
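A minimal Python sketch of computing a p-value and comparing it with alpha. It uses a simpler setting than the test-versus-placebo example: a hypothetical coin flipped 20 times that lands heads 16 times, with the null hypothesis that the coin is fair. The function name and the data are assumptions made for illustration.

```python
from math import comb

def binomial_p_value(successes, trials):
    """Two-sided exact p-value under H0: success probability = 0.5.

    Counts every outcome at least as far from trials/2 as the observed one.
    """
    observed_distance = abs(successes - trials / 2)
    extreme = sum(
        comb(trials, k)
        for k in range(trials + 1)
        if abs(k - trials / 2) >= observed_distance
    )
    return extreme / 2 ** trials

p = binomial_p_value(16, 20)  # hypothetical data: 16 heads in 20 flips
alpha = 0.05
reject = p < alpha

print(f"p-value = {p:.4f}, reject H0 at alpha={alpha}: {reject}")
```

Here the p-value is about 0.012, so the null is rejected at alpha = 0.05 but not at alpha = 0.01. The conclusion depends on the cut-off chosen before looking at the data.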
Confidence Intervals
■ A confidence interval is another way to express
a statistical result along with its significance
level, without having to use a p-value.
o Gives us a range of values where the answer
probably sits.
Example:
– The mean age of university students is 21
years old (18, 21.5).
– Here (18, 21.5) is the “confidence interval” of the parameter
estimate.
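A minimal Python sketch of a 95% confidence interval for a mean, using the normal approximation (mean ± z × standard error). The ages are made-up data, not the figures behind the example above. With a sample this small, a t-based interval would be somewhat wider; the normal approximation is used here only to keep the example within the standard library.

```python
import math
import statistics

# Hypothetical ages of 12 university students (illustrative data only).
ages = [19, 20, 21, 22, 20, 23, 21, 19, 22, 24, 20, 21]

n = len(ages)
mean = statistics.mean(ages)
sem = statistics.stdev(ages) / math.sqrt(n)  # standard error of the mean

# Critical value for 95% confidence under the normal approximation.
z = statistics.NormalDist().inv_cdf(0.975)

lower, upper = mean - z * sem, mean + z * sem
print(f"mean = {mean}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

The result is reported in the same form as the slide's example: a point estimate followed by its interval in parentheses.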
Commonly Used Statistical
Tests
Descriptive vs. Inferential Statistical Methods