EDA - Midterms - Reviewer
Evangelista, MSCE

INTRODUCTION TO DATA

Population
Is a collection of persons, things, or objects under study
Commonly referred to by the symbol N

Sample
Is a portion taken from the population
Commonly referred to by the symbol n

Sampling

Data
The actual values of the variable
They may be numbers, or they may be words
Datum is a single value

Variable
Is a characteristic of interest for each person or thing in a population. Represented by capital letters such as X and Y
It is a characteristic that changes or varies over time
There are two types:

Numerical Variable
➢ Take on values with equal units such as weight in pounds and time in hours

Categorical Variable
➢ Place the person or thing into a category

Examples of Variable and Data
Variable
➢ Name, age, course, height
Data
➢ Marielle, 19 years old, civil engineering, 4’11

Inferential Statistics
Uses probability to determine how confident we can be that our conclusions are correct.

Probability
Mathematical tool used to study randomness
Deals with the chance (the likelihood) of an event happening
Quantitative description of chances associated with various outcomes.
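As a small illustration of probability as a quantitative description of chance, the sketch below computes the probability of an event as favorable outcomes over total outcomes. It assumes every outcome is equally likely; the fair six-sided die and the function name `probability` are hypothetical and are not part of the reviewer.

```python
from fractions import Fraction


def probability(event, sample_space):
    """Return P(event) = favorable outcomes / total outcomes.

    Assumes all outcomes in sample_space are equally likely.
    """
    favorable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favorable), len(sample_space))


# Hypothetical example: rolling a fair six-sided die
die = [1, 2, 3, 4, 5, 6]
even = {2, 4, 6}

print(probability(even, die))  # 1/2
```

Using `Fraction` keeps the result exact, so the chance reads as a ratio rather than a rounded decimal.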
Probability Calculations
Used in statistics to analyze and interpret data
It provides a bridge between descriptive and inferential statistics

2.2. Bar Graph
❖ The length of the bar for each category is proportional to the number or percent of individuals in each category.
❖ Bars may be vertical or horizontal.
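The proportionality rule for bar lengths can be sketched in Python as follows. The category names, the counts, the function name `bar_lengths`, and the 40-character scale are all hypothetical, chosen only to show that each bar grows with its category's share of the total.

```python
def bar_lengths(counts, max_width=40):
    """Map each category to (percent, bar length).

    The bar length is proportional to the category's percent of the total;
    a category holding 100 percent would fill max_width characters.
    """
    total = sum(counts.values())
    result = {}
    for category, count in counts.items():
        percent = 100 * count / total
        result[category] = (percent, round(percent / 100 * max_width))
    return result


# Hypothetical data: number of students per course
counts = {"Civil": 20, "Mechanical": 10, "Electrical": 10}

# Horizontal text bars, one per category
for category, (percent, width) in bar_lengths(counts).items():
    print(f"{category:<12}{'#' * width} {percent:.0f}%")
```

The same lengths could be drawn vertically; only the orientation changes, not the proportions.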
Types of Data
1. Qualitative (Categorical)
➢ Results of categorizing or describing attributes of a population.
➢ It measures the quality or characteristics of each experimental unit.
➢ They are generally described by words or letters.
2. Quantitative (Numerical)
➢ Are always numbers
➢ They are the result of counting or measuring attributes of a population.
2.1. Quantitative Discrete
❖ Are the results of counting
❖ Ex. Number of laborers, Number of cars
2.2. Quantitative Continuous
❖ Are the results of measuring
❖ Ex. Time, Amount of Gas

Organizing and Displaying Data
1. Statistical Table
➢ Use a data distribution to describe:
❖ What values have been measured

POPULATION VS SAMPLE
                     Population   Sample
Mean                 µ            x̄
Standard deviation   σ            s

Census
Information/data gathered from every member of the population

Data
Information from the sample of the population

Sampling plan/method
Selecting the group where the researchers will collect data from.

Sampling Methods
1. Non-Probability
➢ Individuals of the population are not given an equal opportunity to become a part of the sample.
1.1. Convenience Sampling
❖ Choosing samples based on easy or convenient access
Stratified Sampling
❖ Choosing members of the sample when there are clearly defined subgroups in the population
$$\text{\# of sample members from each stratum} = \frac{\text{\# of members in the stratum}}{\text{total \# of the population}} \times n$$

3 Features to keep in mind while constructing a sample
1. Consistency
2. Diversity
3. Transparency

Conducting a Survey
1. Face-To-Face interviews
Advantages
❖ Fewer misunderstood questions
❖ High response rates
❖ Additional information can be collected from the respondents

4. Decide what questions you will ask
5. Conduct the interview

LEVELS OF MEASUREMENT
1. Nominal Scale Level
➢ Qualitative/Categorical
➢ Names, colors, labels
➢ Order does not matter
➢ Cannot perform mathematical operations; frequencies and proportions can be applied
➢ Can be represented through a bar or pie chart
2. Ordinal Scale Level
➢ Ranking
➢ The order matters
➢ Differences cannot be measured (not equal intervals)
➢ Can perform mathematical operations
3. Interval Scale Level
➢ The order matters
➢ Differences can be measured
➢ No true “zero” or starting point
➢ Can perform math operations: mean, median, and SD.
➢ Line graph, bar chart, and histogram
4. Ratio Scale Level
➢ Order matters
➢ Differences are measurable (including ratios)
➢ Contains a “0” starting point
➢ Can perform math operations: mean, median, and SD.
➢ Line graphs, box plots, bar charts, and histogram

Data      Nominal   Ordinal   Interval   Ratio
Labeled   Yes       Yes       Yes        Yes
Ordered   No        Yes       Yes        Yes

GROUPED DATA AND UNGROUPED DATA
Collection of Data
First step in the field of research
Presentation of Data
To look for ways to condense and arrange the data and to study their characteristics.
Ungrouped Data
Data in its original form
Raw data
List of numbers that do not convey anything
No summarization or aggregation
Grouped Data
Data that is bundled together in different classes or categories

In ungrouped data
Given the data:
3, 2, 4, 5, 6, 8, 2, 5, 8, 7, 9, 8, 8, 8, 11, 10, 12, 11, 9
Determine the following:
a. Frequency table
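One way to build the frequency table for the data given above is sketched below in Python. The function name `frequency_table` is hypothetical, and the relative-frequency column is an extra that the exercise does not ask for.

```python
from collections import Counter

# The ungrouped data from the exercise
data = [3, 2, 4, 5, 6, 8, 2, 5, 8, 7, 9, 8, 8, 8, 11, 10, 12, 11, 9]


def frequency_table(values):
    """Return (value, frequency, relative frequency) rows sorted by value."""
    counts = Counter(values)
    n = len(values)
    return [(value, counts[value], counts[value] / n) for value in sorted(counts)]


print(f"{'Value':>5} {'Frequency':>9} {'Relative':>9}")
for value, freq, rel in frequency_table(data):
    print(f"{value:>5} {freq:>9} {rel:>9.3f}")
```

The frequencies should add up to the number of observations (19 here), which is a quick check that no value was missed.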
$$S = \sqrt{\frac{\sum (X_i - \bar{x})^2}{n}}$$

X     X − x̄    |X − x̄|    (X − x̄)²    √(X − x̄)²
27    -3       3          9           3
29    -1       1          1           1
30    0        0          0           0
31    1        1          1           1
33    3        3          9           3
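The table and the formula for S can be checked with a short Python sketch. It divides by n, as the formula above is written; the function names `mean` and `std_dev_n` are hypothetical. Note that many textbooks divide by n − 1 for a sample, which would give a different result.

```python
import math

# The X values from the table
values = [27, 29, 30, 31, 33]


def mean(xs):
    """Arithmetic mean: sum of the values divided by their count."""
    return sum(xs) / len(xs)


def std_dev_n(xs):
    """S = sqrt(sum((x - mean)^2) / n), dividing by n as in the formula above."""
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


m = mean(values)
# Reproduce the table columns: X, X - mean, |X - mean|, (X - mean)^2
for x in values:
    deviation = x - m
    print(x, deviation, abs(deviation), deviation ** 2)

print("mean =", m)              # 30.0
print("S =", std_dev_n(values))  # 2.0
```

The squared deviations sum to 20, so S = √(20 / 5) = 2. Python's built-in `statistics.pstdev` uses the same divide-by-n definition.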