
Assignment 3 FBA

This report analyzes data collected from a survey of 250 college students in Da Nang about their exercise habits and health. The analysis includes: 1) Females accounted for 65% of respondents, showing greater interest in fitness than males. 2) Most respondents were aged 21-22, indicating higher exercise habits among this age group. 3) Height and weight were found to have a weak positive correlation, with most students in healthy ranges. 4) Interval estimates for height and weight found average heights of 163.9-165.6 cm and weights of 55.3-57.6 kg with 90% confidence. Hypothesis testing rejected the proposed mean height of 155 cm.

Uploaded by

Min Min

THE UNIVERSITY OF DANANG

VN-UK INSTITUTE FOR RESEARCH AND EXECUTIVE EDUCATION

Data analysis report

VNUK Institute for Research and Executive Education

FBA 3rd Group Assignment

Lecturer: Cuong Chi Nguyen, MSc. & MBA

Tran Bao Minh - 20010015


Nguyen Thi Nhu Quynh - 20010121
Nguyen Thi Bich Tram - 20050037
Nguyen Hoang Nha Uyen - 21010035
Nguyen Thanh Huyen - 21010034

Table of contents

I. Abstract

II. Introduction

1. Research objectives
2. Object and scope of research
● Purposes
● Respondents
● Geographic scope

III. Data analysis

IV. Storytelling

V. Peer evaluation

I. Abstract:
This report records the information collected and the data obtained from the survey. The data
sets are analyzed with calculations in Microsoft Excel, and the results are interpreted in
terms of gym customers' needs when registering for a gym.

II. Introduction:
1. Research objectives:
This survey tries to identify subjective and objective issues related to consumer behavior,
especially how much current Da Nang college students care about fitness and health, using
the results of student reviews and comments on the Internet.

2. Object and scope of research


➢ The purpose of this survey is to understand how much students care about fitness and health
and how influencers affect their decision to go to the gym. The survey questions are designed
to bring out the factors that affect the health of Da Nang students and their desire to go to a
gym to improve it, whether the content they see is certified as real or creatively adapted.

➢ Audience: College Students in Da Nang.

➢ Geographical range: Da Nang

III. Data analysis


1) Analyze at least 1 categorical (qualitative) variable from the data (frequency, relative
frequency table, charts), and point out some discussion about the analysis results.

Figure 1: Relative frequency of gender

According to the statistics above, after screening 250 valid samples for this research topic,
with 163 people completing the survey, caring about fitness and health matters most to female
respondents, who account for the majority at 65%. Men account for only 35%, a clear
disparity.
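
The frequency and relative frequency table behind a chart like this can be sketched as follows. This is a minimal illustration, not the report's Excel workbook: the function name `frequency_table` and the toy responses are assumptions made for the example, not the survey data.

```python
from collections import Counter


def frequency_table(values):
    """Return {category: (count, relative_frequency)} for a list of labels."""
    counts = Counter(values)
    total = sum(counts.values())
    return {label: (n, n / total) for label, n in counts.items()}


# Hypothetical toy responses, NOT the survey data.
responses = ["Female", "Male", "Female", "Female", "Male",
             "Female", "Female", "Male", "Female", "Female"]

table = frequency_table(responses)
for label, (count, share) in table.items():
    print(f"{label}: {count} responses, relative frequency {share:.0%}")
```

The relative frequency is simply each category's count divided by the total number of responses, which is the quantity the chart displays.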

2) Analyze at least 1 quantitative variable from the data (frequency, relative frequency
table, charts- histogram), and point out some discussion about the analysis results.

Figure 1. Histogram of Age

According to the chart above, the 21 - 22 age group has the highest exercise habits, with a
relative frequency of 55%. The 19 - 20 age group is the second largest, with a relative
frequency of 36%. The remaining age categories together account for only 9% of respondents,
a very small proportion. Thus, the 138 young people aged 21 - 22 who took part in the study
appear to have greater needs for and habits of physical activity and sports than the other age
groups.

3) Generate a Scatter plot and measure the covariance, correlation coefficients to address the
relationship between 2 quantitative variables, and point out some discussion about the
analysis results.

The figure below illustrates the relationship between the Age and Weight of the survey participants:

Figure 1. Correlation between Age and Weight

The covariance of the survey data is 0.4667 and the correlation coefficient is 0.0401. Because
the coefficient is positive but very close to 0, the two variables have only a very weak
positive association; age explains almost none of the variation in weight. At the same time,
the analysis shows that most people aged 19-21 weigh between 50 and 65 kg, while very few
weigh over 80 kg (only 2 people). This finding suggests that most students in Da Nang today,
both male and female, have a fairly suitable weight (neither too thin nor too overweight).
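
As a rough sketch of how covariance and the correlation coefficient are computed and read, the snippet below uses only the Python standard library. The helper names, the toy data, and the strength thresholds in `describe_strength` are assumptions (a common rule of thumb, not something stated in the report). The covariance here divides by n - 1 (sample covariance); a spreadsheet function that divides by n would give a slightly different covariance but the same correlation.

```python
from math import sqrt


def sample_covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return sample_covariance(xs, ys) / sqrt(
        sample_covariance(xs, xs) * sample_covariance(ys, ys)
    )


def describe_strength(r):
    """Label |r| using an assumed rule of thumb (thresholds are illustrative)."""
    size = abs(r)
    if size < 0.1:
        return "negligible"
    if size < 0.3:
        return "weak"
    if size < 0.5:
        return "moderate"
    return "strong"


# Hypothetical toy values, NOT the survey data.
ages = [19, 20, 21, 22, 21]
weights = [52, 60, 55, 58, 63]
r = pearson_r(ages, weights)
print(f"cov = {sample_covariance(ages, weights):.3f}, r = {r:.3f}, {describe_strength(r)}")

# The coefficients quoted in this report, read with the same rule of thumb.
print(describe_strength(0.0401), describe_strength(-0.0126))
```

The sign of the coefficient gives the direction of the relationship, while its absolute value gives the strength, so a coefficient near 0 indicates almost no linear relationship whatever its sign.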

The figure below illustrates the relationship between the Age and Height of the survey participants:

Figure 2. Correlation between Age and Height

The covariance of the survey data is -0.1168 and the correlation coefficient is -0.0126. The
negative sign indicates that the two variables tend to move in opposite directions, but the
coefficient is so close to 0 that there is practically no linear relationship between age and
height. At the same time, the analysis shows that most students are aged 19-21 with heights
ranging from 150 to 180 cm. This finding suggests that most students in Da Nang today, both
male and female, are of average-to-good height.

4) Analyze the mean, median, quartiles, range, and standard deviation of at least 1
quantitative variable from the data, and point out some discussion about the analysis
results

After the analysis, we can draw some conclusions as follows:
- Median: 500.000 VND. This is the middle value of the amounts that survey participants pay:
50% of survey participants pay more than 500.000 VND for exercise and 50% of survey
participants pay less than 500.000 VND for exercise.
- Standard deviation: the data set has a standard deviation of 823197.9156 VND.
- Range: the difference between the highest cost and the lowest cost is 5.000.000 VND.
- Quartiles:
● Q1 (25%): 25% of survey respondents do not pay for exercise and 75% of survey
respondents pay for exercise.
● Q2 (50%): 50% of survey participants pay less than 500.000 VND for exercise and the
remaining 50% of survey participants pay more than 500.000 VND for exercise.
● Q3 (75%): 75% of survey participants pay less than 772.500 VND for exercise and 25%
of survey participants pay more than 772.500 VND for exercise.
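
The summary measures above can be sketched with Python's `statistics` module. The spending values below are hypothetical, chosen only to show that the median and the mean can differ a lot when a few respondents pay much more than the rest; the choice of the "inclusive" quartile method is also an assumption, since the report does not say which quartile convention was used.

```python
import statistics


def summarize(values):
    """Return mean, median, quartiles, range and sample standard deviation."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "q1": q1,
        "q3": q3,
        "range": max(values) - min(values),
        "stdev": statistics.stdev(values),
    }


# Hypothetical monthly spending on exercise in VND, NOT the survey data.
spending = [0, 0, 300_000, 500_000, 500_000, 700_000, 1_000_000, 5_000_000]

summary = summarize(spending)
for name, value in summary.items():
    print(f"{name}: {value:,.0f} VND")
```

In this toy data the median is 500,000 VND while the mean is 1,000,000 VND, which is why the median is described as the middle value rather than the average.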

5) Interval Estimation:

1. Height
The figures below illustrate the interval estimation of the customers' height at confidence
levels of 90%, 95% and 99%, respectively:

Confidence level of 90%
Based on the analysis, we are 90% confident that the average Height of all students in
Da Nang is between 163.933 cm and 165.621 cm.
● Median: the median height of the survey participants is 165 cm.
● Range: the difference between the tallest and the shortest is 50 cm.
● Standard Deviation: the data set has a standard deviation of 8.1 cm.

Confidence level of 95%


Based on the analysis, we are 95% confident that the average Height of all students in
Da Nang is between 163.770 cm and 165.784 cm.
● Median: the median height of the survey participants is 165 cm.
● Range: the difference between the tallest and the shortest is 50 cm.
● Standard Deviation: the data set has a standard deviation of 8.1 cm.

Confidence level of 99%

Based on the analysis, we are 99% confident that the average Height of all students in
Da Nang is between 163.450 cm and 166.104 cm.
● Median: the median height of the survey participants is 165 cm.
● Range: the difference between the tallest and the shortest is 50 cm.
● Standard Deviation: the data set has a standard deviation of 8.1 cm.
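
A minimal sketch of how such intervals for a mean can be computed is shown below. It is not the report's Excel calculation: the sample size (250) and standard deviation (8.1 cm) are quoted from this report, but the sample mean is an assumption, taken as the midpoint of the reported 90% interval, and the critical value comes from the normal distribution. A calculation based on the t distribution gives slightly wider intervals, so small differences from the figures above are expected.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(mean, stdev, n, confidence):
    """Interval mean +/- z * stdev / sqrt(n), using a normal critical value."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * stdev / sqrt(n)
    return mean - margin, mean + margin


# Assumption: sample mean taken as the midpoint of the reported 90% interval.
SAMPLE_MEAN_CM = 164.777
STDEV_CM = 8.1   # quoted from the report
SAMPLE_SIZE = 250  # quoted from the report

for level in (0.90, 0.95, 0.99):
    low, high = mean_confidence_interval(SAMPLE_MEAN_CM, STDEV_CM, SAMPLE_SIZE, level)
    print(f"{level:.0%}: {low:.2f} cm to {high:.2f} cm")
```

The interval widens as the confidence level rises, because a larger critical value is needed to cover the true mean more often.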

2. Weight:
The figures below illustrate the interval estimation of the customers' weight at confidence
levels of 90%, 95% and 99%, respectively:

Confidence level of 90%


The Confidence Interval Estimate of Average Weight at the 90% confidence level is
(55.33 kg - 57.64 kg).
● Median: the median weight of the survey participants is 55 kg.
● Range: the difference between the heaviest and the lightest is 57 kg.
● Standard Deviation: the data set has a standard deviation of 11.07 kg.

Confidence level of 95%


The Confidence Interval Estimate of Average Weight at the 95% confidence level is
(55.11 kg - 57.86 kg).
● Median: the median weight of the survey participants is 55 kg.
● Range: the difference between the heaviest and the lightest is 57 kg.
● Standard Deviation: the data set has a standard deviation of 11.07 kg.

Confidence level of 99%


The Confidence Interval Estimate of Average Weight at the 99% confidence level is
(54.67 kg - 58.30 kg).
● Median: the median weight of the survey participants is 55 kg.
● Range: the difference between the heaviest and the lightest is 57 kg.
● Standard Deviation: the data set has a standard deviation of 11.07 kg.

6) The height hypothesis.

Step 1: H0: mean of height = 155 cm
H1: mean of height is different from 155 cm
Step 2: The level of significance: alpha = 0.05
Step 3: z-statistic value = 50.80315
Step 4: z = 50.80315 => p-value = 0.00000 < alpha = 0.05
Step 5: Because p-value = 0.00 < alpha = 0.05
=> reject H0, accept H1: the mean of height is different from 155 cm
First, we define two hypotheses: the null hypothesis, which asserts that the mean height
equals 155 cm, and the alternative hypothesis, which asserts that the mean height does not
equal 155 cm. The level of significance is set to the default of 0.05. Because the
population's standard deviation is taken as known (7.790), we opted to use the Z-test, and the
z statistic was calculated as 50.80315, resulting in a p-value of nearly 0. The p-value is
clearly below the level of significance (0.05). As a consequence, we can reject the null
hypothesis and accept the alternative hypothesis, meaning that the mean height differs from
155 cm.
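
The five steps can be sketched as a small function for a two-sided one-sample Z-test. This is an illustration of the general procedure only: the function name and the inputs in the example call are hypothetical placeholders, not the survey data, so the output is not meant to reproduce the statistic reported above.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided Z-test of H0: mean = mu0 when the population sigma is known.

    Returns (z statistic, p-value, True if H0 is rejected at level alpha).
    """
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha


# Hypothetical inputs chosen for easy arithmetic, NOT the survey data.
z, p_value, reject = one_sample_z_test(sample_mean=156.0, mu0=155.0, sigma=5.0, n=100)
print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

The decision rule is the one used in Step 5: the null hypothesis is rejected whenever the p-value falls below alpha.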

7) The weight hypothesis.

Step 1: H0: mean of weight = 61 kg
H1: mean of weight is different from 61 kg
Step 2: The level of significance: alpha = 0.05
Step 3: Test statistic value: -9.470
Step 4: T = -9.470 => p-value = 0.00 < alpha = 0.05
Step 5: Because p-value = 0.00 < alpha = 0.05
=> reject H0, accept H1: the mean of weight is different from 61 kg

In theory, there are five standard steps in hypothesis testing. To begin, we establish two
hypotheses: the null hypothesis (the mean weight equals 61 kg) and the alternative hypothesis
(the mean weight differs from 61 kg). The level of significance is then set to the default of
0.05. Because the population's standard deviation was unknown, we had to use the T-test, and
the test statistic was computed as -9.470, resulting in a p-value of about 0. The p-value is
clearly less than the level of significance (0.05). As a result, we can reject the null
hypothesis and accept the alternative hypothesis, implying that the mean weight differs from
61 kg.
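
For reference, the standard form of the one-sample T-test statistic is written out below. The symbols are the usual textbook ones and are not defined in the report itself: x-bar is the sample mean weight, s the sample standard deviation, n the sample size, and mu_0 = 61 kg the hypothesized mean.

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad \text{degrees of freedom} = n - 1, \qquad \mu_0 = 61 \text{ kg}
```

A negative value of t, as in Step 3, means the sample mean lies below the hypothesized 61 kg.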

8) Extra credit.
Create a Null vs. Alternative Hypothesis:

After implementing hypothesis testing, we collect some figures that show whether the
population mean height differs by gender. There are 88 men and 163 women, with a mean height
of about 172 centimeters for the male group and approximately 161 centimeters for the female
group. Theoretically, the Independent Samples Test has two important steps. First, in
Levene's Test for Equality of Variances, the significance value is larger than 0.05, which
means the variances of the two samples can be treated as equal. Second, in the T-Test for
Equality of Means, the 2-tailed significance value is 0.000 in both cases, smaller than 0.05.
This shows that the mean of the male group differs from the mean of the female group.
Moreover, the mean difference is about 10.07 centimeters, which means the males are taller
than the females on average. Therefore, we can conclude that the mean heights of males and
females differ in the population.
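
When the variances are treated as equal, as Levene's test suggests here, the test statistic for comparing two means uses a pooled variance. The sketch below shows that calculation from summary statistics. The function name and the example inputs are hypothetical, since the report does not give the standard deviation of each group, and the p-value step (which needs the t distribution) is left out.

```python
from math import sqrt


def pooled_t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df


# Hypothetical summary statistics chosen for easy arithmetic, NOT the survey data.
t_stat, df = pooled_t_statistic(mean1=10.0, sd1=2.0, n1=8, mean2=8.0, sd2=2.0, n2=8)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

A large absolute value of t relative to the t distribution with n1 + n2 - 2 degrees of freedom corresponds to the small 2-tailed significance value described above.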

IV. Storytelling:
- This exercise requires three main tasks: the report, Excel, and SPSS, using the data provided
from lesson 2. In fact, getting SPSS was really difficult, because the official license costs
a lot of money and the cracked version does not have enough functions for us to analyze the
data. But we finally managed to use SPSS for this part of the exercise.
- As for Excel, the data were taken from the Google Form survey results of exercise 2.
Analyzing the pie chart, column chart, covariance and correlation... was done quite quickly
and smoothly because we had practiced quite a lot in previous assignments.
- Finally, for the report analysis, our team has 5 people, each of whom has a fairly
appropriate data analysis task. Practically speaking, we only had 10 days to complete
everything, so it was quite urgent because there was a lot of homework for all subjects. But
we managed to complete the assignment properly and submit it on time.

V. Peer evaluation:

Team member | What did he/she do? | Contribution (%)
Tran Bao Minh | Edit excel; Edit report; Do the covariance, correlation; Do the storytelling | 20%
Nguyen Thi Nhu Quynh | Do the descriptive statistics; Do the Interval Estimation | 20%
Nguyen Thi Bich Tram | Do the pie chart; Do the histogram | 20%
Nguyen Hoang Nha Uyen | Do the abstract; Do the introduction | 20%
Nguyen Thanh Huyen | Do the height hypothesis; Do the weight hypothesis; Do the Extra credit | 20%
