Computation Variation and Quartile

Uploaded by

Leavic Maghanoy

VARIATION

• In survey data analysis, variation refers to how spread out the responses are for a particular
question. It essentially tells you how much the individual answers deviate from the average
(mean) response. There are two main ways to measure variation:

1. Variance. This is a statistical measure that represents the average squared difference of
each response from the mean. A higher variance indicates that the data points are further
away from the mean on average, signifying a wider spread of responses.

2. Standard Deviation. This is the square root of the variance and is expressed in the same
units as the original data. It's generally considered easier to interpret than variance because
it's in the same scale as the data itself. A high standard deviation indicates a larger spread of
responses around the mean.
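
In symbols, the two measures above can be written as follows. This is only a sketch in standard notation that the slides themselves do not use: x_i is the i-th response, \bar{x} is the mean, n is the number of responses, s^2 is the variance, and s is the standard deviation. The n - 1 divisor assumes the responses are a sample; for an entire population the divisor would be n instead.

```latex
\[
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2, \qquad
s = \sqrt{s^2}
\]
```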
• Here are some additional points to consider about variation in survey data:
- Variation is most useful for analyzing questions with numerical responses (e.g., rating scales, income levels).
- For categorical data (e.g., yes/no, multiple choice), other measures like frequency tables or chi-square tests might be more appropriate to understand response distribution.
Participant   Rating
1             4
2             3
3             5
4             2
5             4

• Step 1: Find the Mean (Average Rating)
• Add up all the ratings and divide by the number of participants (n = 5).
• Mean = (4 + 3 + 5 + 2 + 4) / 5 = 3.6
• Step 2: Find the Deviation from the Mean for Each Participant
• For each participant, subtract the mean from their rating.
- Participant 1: Deviation = 4 - 3.6 = 0.4
- Participant 2: Deviation = 3 - 3.6 = -0.6
- Participant 3: Deviation = 5 - 3.6 = 1.4
- Participant 4: Deviation = 2 - 3.6 = -1.6
- Participant 5: Deviation = 4 - 3.6 = 0.4
• Step 3: Square Each Deviation
• Square each of the deviation values you calculated in step 2.
- Participant 1: (0.4)^2 = 0.16
- Participant 2: (-0.6)^2 = 0.36
- Participant 3: (1.4)^2 = 1.96
- Participant 4: (-1.6)^2 = 2.56
- Participant 5: (0.4)^2 = 0.16
• Step 4: Find the Sum of Squares
• Add up all the squared deviations you calculated in step 3.
• Sum of Squares = 0.16 + 0.36 + 1.96 + 2.56 + 0.16 = 5.2
• Step 5: Calculate the Variance
• The divisor depends on whether you're analyzing a sample or the entire population: divide by n - 1 for a sample and by n for a population. Here, we'll assume we're analyzing a sample (n = 5).
• Variance = Sum of Squares / (n - 1) = 5.2 / (5 - 1) = 5.2 / 4 = 1.3
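
The five steps above can be sketched in a few lines of Python. The variable names are illustrative choices, not part of the slides, and the final square root is an added step that turns the variance into the standard deviation defined earlier. The n - 1 divisor follows the sample assumption made in step 5.

```python
import math
import statistics

ratings = [4, 3, 5, 2, 4]  # ratings from the five participants in the table

# Step 1: mean of the ratings
mean = sum(ratings) / len(ratings)

# Step 2: deviation of each rating from the mean
deviations = [r - mean for r in ratings]

# Step 3: square each deviation
squared_deviations = [d ** 2 for d in deviations]

# Step 4: sum of squares
sum_of_squares = sum(squared_deviations)

# Step 5: sample variance (divide by n - 1, not n)
variance = sum_of_squares / (len(ratings) - 1)

# Extra step: standard deviation is the square root of the variance
std_dev = math.sqrt(variance)

print(round(mean, 2), round(sum_of_squares, 2), round(variance, 2), round(std_dev, 2))
# prints: 3.6 5.2 1.3 1.14

# Cross-check against the standard library's sample variance
assert math.isclose(variance, statistics.variance(ratings))
```

The standard deviation of about 1.14 is in the same units as the ratings, which is why it is usually the easier of the two figures to report.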
Quartile and Percentile
• Quartiles and percentiles are measures of position rather than measures of central tendency, but they play a vital role in understanding where the "center" of your data lies and how the data is distributed around it.

• Here's how they fit in:

- Central tendency: This refers to the middle or typical value within a data set. Common measures of central tendency include mean, median, and mode.

- Percentiles: These divide your ordered data into 100 equal parts. The nth percentile indicates that n% of your data falls below that value. For instance, the 50th percentile is the median, where 50% of the data lies on either side.

- Quartiles: A specific type of percentile, quartiles divide your data into fourths. The first quartile (Q1) is the 25th percentile, the second quartile (Q2) is the median (50th percentile), and the third quartile (Q3) is the 75th percentile.
• While percentiles and quartiles don't represent the exact center, they reveal how your data is spread around a central point. By looking at these values, you can understand:

- Skewness: If the median is closer to Q1, the upper half of the data is more stretched out, which suggests a right-skewed distribution. If the median is closer to Q3, the lower half is more stretched out, which suggests a left-skewed distribution.

- Spread: The distance between Q1 and Q3 (interquartile range or IQR) is the width of the range that contains the middle 50% of your data. A large IQR suggests a wider spread, while a small IQR indicates the data is more clustered around the center.
• So, while quartiles and percentiles provide valuable insights into the center of your data's
distribution, they work alongside measures like mean or median to give a more complete picture.
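
As a rough illustration of reading skewness and spread from quartiles, the sketch below compares the gap between Q1 and the median with the gap between the median and Q3. The function name `describe_spread` and the input values are hypothetical, chosen only for this example. A median nearer Q1 means the upper part of the middle 50% is stretched out, which is the usual sign of a right skew; this is a heuristic, not a formal test.

```python
def describe_spread(q1, median, q3):
    """Return the IQR and a rough skew direction from three quartile values."""
    iqr = q3 - q1
    lower_gap = median - q1  # width of the second quarter of the data
    upper_gap = q3 - median  # width of the third quarter of the data
    if upper_gap > lower_gap:
        skew = "right-skewed"  # median sits closer to Q1
    elif upper_gap < lower_gap:
        skew = "left-skewed"  # median sits closer to Q3
    else:
        skew = "roughly symmetric"
    return iqr, skew


# Hypothetical quartile values, used only to show both directions
print(describe_spread(q1=20, median=24, q3=40))  # (20, 'right-skewed')
print(describe_spread(q1=20, median=36, q3=40))  # (20, 'left-skewed')
```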
25th Percentile

• Example:

• Imagine you have a dataset containing salaries (in thousands of dollars) for 20 employees:

• Data: {18, 22, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
38, 40, 42, 45, 50}

• 1. Order the data:

• First, arrange the salaries in ascending order:

• Data: {18, 22, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
38, 40, 42, 45, 50}
• 2. Calculate Median (50th percentile):

• Since we have 20 data points, the median will be the average of the 10th and 11th values:
• Median = (31 + 32) / 2 = $31.5 thousand

• 3. Calculate Quartiles (25th percentile):

- Lower Quartile (Q1 or 25th percentile):
o Find the position for Q1: (Number of data points * 0.25) + 0.5 = (20 * 0.25) + 0.5 = 5.5
o Since 5.5 isn't a whole number, we take the average of the 5th and 6th values:
o Q1 = (26 + 27) / 2 = $26.5 thousand
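
The position rule used here, (number of data points * fraction) + 0.5, can be sketched as a small Python function. The name `percentile_by_position` is hypothetical, and the linear interpolation between neighbouring values is an assumption that generalizes the "average the two values" step: when the position ends in .5 it gives exactly that average. Other textbooks and software packages use different position rules, so results can differ slightly between tools.

```python
def percentile_by_position(sorted_values, fraction):
    """Value at 1-based position n * fraction + 0.5, interpolating linearly."""
    n = len(sorted_values)
    position = n * fraction + 0.5
    position = min(max(position, 1), n)  # keep the position inside the data
    lower_index = int(position)          # 1-based index at or below the position
    weight = position - lower_index      # how far to move toward the next value
    lower = sorted_values[lower_index - 1]
    upper = sorted_values[min(lower_index, n - 1)]
    return lower + weight * (upper - lower)


# Salaries (in thousands of dollars) from the example, already in ascending order
salaries = [18, 22, 24, 25, 26, 27, 28, 29, 30, 31,
            32, 33, 34, 35, 36, 38, 40, 42, 45, 50]

q1 = percentile_by_position(salaries, 0.25)
print(q1)  # 26.5
```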


75th Percentile

• 3. Calculate Quartiles (75th percentile):

- Upper Quartile (Q3 or 75th percentile):
o Similar to Q1, calculate the position: (Number of data points * 0.75) + 0.5 = (20 * 0.75) + 0.5 = 15.5
o Since 15.5 isn't a whole number, average the 15th and 16th values:
o Q3 = (36 + 38) / 2 = $37 thousand

• 4. Analyze the results:
- Central tendency: The median ($31.5 thousand) indicates the typical salary in this dataset.
- Spread: The interquartile range (IQR) is Q3 - Q1 = $37 thousand - $26.5 thousand = $10.5 thousand.
o This means the middle 50% of the salaries fall within this range ($26.5 thousand to $37 thousand).
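
A quick way to check a summary like this is to index the sorted list directly. The sketch below is illustrative only: it hard-codes the 5th/6th and 15th/16th values that the position rule points to for 20 data points, so it is not a general quartile routine. It then counts how many salaries actually fall between Q1 and Q3.

```python
import statistics

# Salaries (in thousands of dollars) from the example, already in ascending order
salaries = [18, 22, 24, 25, 26, 27, 28, 29, 30, 31,
            32, 33, 34, 35, 36, 38, 40, 42, 45, 50]

median = statistics.median(salaries)     # average of the 10th and 11th values
q1 = (salaries[4] + salaries[5]) / 2     # position 5.5: 5th and 6th values
q3 = (salaries[14] + salaries[15]) / 2   # position 15.5: 15th and 16th values
iqr = q3 - q1

# Share of salaries that lie between Q1 and Q3
inside = [s for s in salaries if q1 <= s <= q3]
share = len(inside) / len(salaries)

print(median, q1, q3, iqr, share)  # 31.5 26.5 37.0 10.5 0.5
```

Python lists are 0-based, so the 5th value is `salaries[4]`; mixing up 0-based indices and 1-based positions is an easy way to pick the wrong pair of values.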
