Assignment

The document outlines a data management and analysis assignment conducted by Madhu Ritu Bhadra Medha. It includes statistical tests using STATA to analyze systolic blood pressure, BMI by sex, and height by race, with results indicating no significant differences in BMI between genders and a significant difference in height across races. The document provides detailed commands and calculations for t-tests and ANOVA, confirming the statistical findings.

Uploaded by

madhuritu.medha

Assignment 1

Data management & analysis (Spring 24)

Name:- Madhu Ritu Bhadra Medha

ID:- 2325385680

1. Do you think the average systolic blood pressure of the patients is 130 mmHg? Use the
appropriate STATA command to answer it.

Answer: Command:

ttest bpsystol==130

One-sample t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
---------+--------------------------------------------------------------------
bpsystol |  10,351    130.8817    .2293364    23.33265    130.4321    131.3312
------------------------------------------------------------------------------
    mean = mean(bpsystol)                                         t =   3.8444
H0: mean = 130                                   Degrees of freedom =    10350

    Ha: mean < 130               Ha: mean != 130               Ha: mean > 130
 Pr(T < t) = 0.9999         Pr(|T| > |t|) = 0.0001          Pr(T > t) = 0.0001

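The two-sided p-value, Pr(|T| > |t|) = 0.0001, is less than 0.05, so we reject the null
hypothesis that the mean systolic blood pressure is 130 mmHg. The sample mean is
130.8817 mmHg (95% CI 130.4321 to 131.3312), which is statistically significantly
different from, and slightly higher than, 130 mmHg.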

2. Check your answer of the t-test output from the previous problem, calculate and check
manually using the formula where you obtain same result for the t-statistic.
Answer: The formula for the one-sample t-statistic is

\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \]

Here,
\(\bar{x}\) is the sample mean = 130.8817
\(\mu\) is the hypothesized population mean = 130
\(s\) is the sample standard deviation = 23.33265
\(n\) is the sample size = 10351

\[ t = \frac{130.8817 - 130}{23.33265 / \sqrt{10351}} = \frac{0.8817}{23.33265 / 101.74} = \frac{0.8817}{0.2293364} \approx 3.8446 \]

The difference between the manually calculated t-statistic (3.8446) and the one in the Stata
output (3.8444) is minimal (around 0.0002). It is most likely due to rounding, because the
manual calculation uses the sample mean rounded to four decimal places.
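
The hand calculation can also be cross-checked outside Stata. The following is a minimal Python sketch, not part of the original assignment, that recomputes the standard error and the t-statistic from the summary statistics printed in the output. The variable names are illustrative, and the p-value uses a normal approximation, which is an assumption that is reasonable only because the degrees of freedom are very large.

```python
import math

# Summary statistics copied from the Stata one-sample t-test output.
n = 10351          # number of observations
mean = 130.8817    # sample mean of bpsystol (as displayed, rounded)
sd = 23.33265      # sample standard deviation
mu0 = 130.0        # hypothesized mean under H0

se = sd / math.sqrt(n)        # standard error of the mean
t_stat = (mean - mu0) / se    # one-sample t-statistic

# Assumption: with 10,350 degrees of freedom the t distribution is very close
# to the standard normal, so the two-sided p-value is approximated with erfc.
p_two_sided = math.erfc(abs(t_stat) / math.sqrt(2))

print(f"SE = {se:.7f}")
print(f"t  = {t_stat:.4f}")
print(f"approximate two-sided p = {p_two_sided:.5f}")
```

The standard error should agree with the "Std. err." column of the output, and the t-statistic should agree with Stata's value up to the rounding of the displayed mean.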

3. Compare whether the mean BMI varies by sex? You may use “ttest” command to answer
this. Consider both equal and unequal variance assumptions.
Answer: For unequal variance, Command:
ttest bmi, by(sex) unequal

Two-sample t test with unequal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
---------+--------------------------------------------------------------------
    Male |   4,915    25.50999    .0573945    4.023758    25.39748    25.62251
  Female |   5,436    25.56256    .0759569    5.600241    25.41365    25.71146
---------+--------------------------------------------------------------------
Combined |  10,351     25.5376    .0483092    4.914969     25.4429    25.63229
---------+--------------------------------------------------------------------
    diff |           -.0525633    .0952028               -.2391803    .1340536
------------------------------------------------------------------------------
    diff = mean(Male) - mean(Female)                              t =  -0.5521
H0: diff = 0                     Satterthwaite's degrees of freedom =  9858.54

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.2904         Pr(|T| > |t|) = 0.5809          Pr(T > t) = 0.7096

Here, the two-sided p-value (0.5809) is greater than 0.05, so we fail to reject the null
hypothesis. There is no statistically significant difference in mean BMI between males
and females.

For equal variance, Command:


ttest bmi, by(sex)

Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
---------+--------------------------------------------------------------------
    Male |   4,915    25.50999    .0573945    4.023758    25.39748    25.62251
  Female |   5,436    25.56256    .0759569    5.600241    25.41365    25.71146
---------+--------------------------------------------------------------------
Combined |  10,351     25.5376    .0483092    4.914969     25.4429    25.63229
---------+--------------------------------------------------------------------
    diff |           -.0525633    .0967443               -.2422008    .1370741
------------------------------------------------------------------------------
    diff = mean(Male) - mean(Female)                              t =  -0.5433
H0: diff = 0                                     Degrees of freedom =    10349

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.2935         Pr(|T| > |t|) = 0.5869          Pr(T > t) = 0.7065

H0 = Null Hypothesis = Mean BMI does not differ between males and females.

Ha = Alternative Hypothesis = Mean BMI differs between males and females.
Here, the t-statistic is -0.5433, and the associated p-value for the two-sided test is
0.5869. Since this p-value is greater than the significance level (0.05), we fail to
reject the null hypothesis.
Therefore, under both the equal and the unequal variance assumptions, we do not have
enough evidence to conclude that mean BMI differs between males and females in the
population.

4. Check your answer of the two-sample t-test output from the previous problem, calculate
and check manually using the formula whether you obtain same result for the t-statistic!
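Answer:

Here, \(\bar{x}_1\) = 25.50999 (Mean for Male)
\(\bar{x}_2\) = 25.56256 (Mean for Female)
\(s_1\) = 4.023758 (SD for Male)
\(s_2\) = 5.600241 (SD for Female)
\(n_1\) = 4915 (Sample size for Male)
\(n_2\) = 5436 (Sample size for Female)

The formula below is the unequal-variance form of the two-sample t-statistic, so the
result is compared with the unequal-variances output (t = -0.5521).

\[ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \]

\[ t = \frac{25.50999 - 25.56256}{\sqrt{\dfrac{(4.023758)^2}{4915} + \dfrac{(5.600241)^2}{5436}}} \approx \frac{-0.05257}{\sqrt{0.0090636}} \approx \frac{-0.05257}{0.0952} \approx -0.5522 \]

The manually calculated t-statistic (-0.5522) closely matches the one obtained from the
Stata output (-0.5521). The difference is likely due to rounding during calculation. Thus,
the manual calculation confirms the result obtained from the Stata output.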
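
As a further cross-check, the sketch below recomputes both versions of the two-sample t-statistic in Python from the group summary statistics in the Stata output. It is an illustration rather than part of the original assignment; the variable names are hypothetical, and the inputs are the rounded values displayed by Stata, so small differences in the last digit are expected.

```python
import math

# Group summary statistics copied from the Stata output (rounded as displayed).
n1, mean1, sd1 = 4915, 25.50999, 4.023758   # Male
n2, mean2, sd2 = 5436, 25.56256, 5.600241   # Female

diff = mean1 - mean2

# Unequal-variance (Welch) version: each group contributes its own s^2 / n.
v1, v2 = sd1**2 / n1, sd2**2 / n2
se_welch = math.sqrt(v1 + v2)
t_welch = diff / se_welch
# Satterthwaite's approximation to the degrees of freedom.
df_welch = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Equal-variance version: pool the two sample variances first.
sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
se_pooled = math.sqrt(sp2 * (1 / n1 + 1 / n2))
t_pooled = diff / se_pooled

print(f"Welch:  SE = {se_welch:.7f}, t = {t_welch:.4f}, df = {df_welch:.2f}")
print(f"Pooled: SE = {se_pooled:.7f}, t = {t_pooled:.4f}, df = {n1 + n2 - 2}")
```

The two standard errors should line up with the "Std. err." entries on the diff row of the unequal-variances and equal-variances outputs respectively.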

5. Compare whether the mean height of the participants is similar irrespective of their race.
Answer: Command:
oneway height race

                        Analysis of variance
    Source              SS         df      MS            F     Prob > F
------------------------------------------------------------------------
Between groups      6882.21987      2   3441.10993     37.17     0.0000
 Within groups      958117.822  10348   92.5896619
------------------------------------------------------------------------
    Total           965000.042  10350    93.236719

Bartlett's equal-variances test: chi2(2) =   3.0051    Prob>chi2 = 0.223


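Based on the p-value (0.0000, i.e. p < 0.0001) in the ANOVA table, we reject the null
hypothesis of equal means and conclude that there is a statistically significant difference
in mean height between race groups; mean height is not the same irrespective of race.
Bartlett's test (p = 0.223) gives no evidence against the equal-variances assumption that
the ANOVA relies on.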
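
The F-statistic in the ANOVA table can be checked in the same way as the t-statistics. The short Python sketch below, added for illustration only, rebuilds the mean squares and the F ratio from the sums of squares and degrees of freedom reported by Stata; the variable names are hypothetical.

```python
# Sums of squares and degrees of freedom copied from the Stata ANOVA table.
ss_between, df_between = 6882.21987, 2
ss_within, df_within = 958117.822, 10348

# A mean square is a sum of squares divided by its degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# The F-statistic is the ratio of between-group to within-group mean squares.
f_stat = ms_between / ms_within

# The total sum of squares and total df should add up from the two rows.
ss_total = ss_between + ss_within
df_total = df_between + df_within

print(f"MS between = {ms_between:.5f}")
print(f"MS within  = {ms_within:.7f}")
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
print(f"SS total = {ss_total:.3f}, df total = {df_total}")
```

A between-groups df of 2 corresponds to three race categories, and the recomputed F should round to the value shown in the table.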
