T-Test Approach in R Programming
The T-Test is a statistical method used to determine whether there is a significant difference between the means of two groups or between a sample and a known value.
For example: consider a businessman who owns two sweet shops in a town. He wants to know whether there is a significant difference in the average number of sweets sold per day at each shop. He collects data from 15 random customers at each shop and wonders whether the observed difference in sales is due to random chance or is statistically significant.
This is where the t-test comes into play: it helps us understand whether the difference between the two means is real or simply due to chance.
Mathematically, the t-test takes a sample from each set and frames the problem under the null hypothesis that the two means are equal. There are three main types of t-tests:
- One-Sample T-Test
- Two-Sample T-Test
- Paired Sample T-Test
One-Sample T-Test Approach
The One-Sample T-Test is used to test the statistical difference between a sample mean and a known or hypothesized population mean.
To test whether the average number of sweets sold equals a hypothesized value, we use the syntax t.test(y, mu = 0), where y is the variable of interest and mu is set to the mean specified by the null hypothesis.
R
set.seed(0)
# simulate 50 days of sales with true mean 140 and sd 5
sweetSold <- rnorm(50, mean = 140, sd = 5)
# mu is the population mean hypothesized under the null hypothesis
t.test(sweetSold, mu = 150)
Output:
One Sample t-test
data: sweetSold
t = -15.249, df = 49, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 150
95 percent confidence interval:
138.8176 141.4217
sample estimates:
mean of x
140.1197
- t-value: -15.249 (statistic showing the degree of difference between the sample mean and the hypothesized mean)
- p-value: < 2.2e-16 (indicating strong evidence against the null hypothesis)
- Confidence Interval: [138.82, 141.42] (95% confidence range for the population mean)
- Sample Estimate: The sample mean is 140.12.
The p-value is extremely small, so we reject the null hypothesis and conclude that the true mean is not 150.
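The t.test() function returns an object whose components can be accessed directly, which is useful for verifying the arithmetic or reusing the results downstream. A minimal sketch, reusing the sweetSold sample from above:
R
# store the result instead of just printing it
res <- t.test(sweetSold, mu = 150)
# the one-sample t-statistic: (sample mean - hypothesized mean) / standard error
t_manual <- (mean(sweetSold) - 150) / (sd(sweetSold) / sqrt(length(sweetSold)))
t_manual      # matches res$statistic
res$p.value   # p-value
res$conf.int  # 95 percent confidence interval
res$estimate  # sample mean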
Two-Sample T-Test Approach
The Two-Sample T-Test compares the means of two independent groups and helps us understand whether the difference between the two means is real or simply due to chance. Let's test whether there is a significant difference between the number of sweets sold in two shops.
The general form of the test is t.test(y1, y2, paired = FALSE). By default, R assumes that the variances of y1 and y2 are unequal and therefore runs Welch's test; to assume equal variances instead, set var.equal = TRUE.
R
set.seed(0)
# simulate 50 days of sales for each shop
shopOne <- rnorm(50, mean = 140, sd = 4.5)
shopTwo <- rnorm(50, mean = 150, sd = 4)
# var.equal = TRUE pools the variances (Student's two-sample t-test)
t.test(shopOne, shopTwo, var.equal = TRUE)
Output:
Two Sample t-test
data: shopOne and shopTwo
t = -13.158, df = 98, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-11.482807 -8.473061
sample estimates:
mean of x mean of y
140.1077 150.0856
- t-value: -13.158 (shows how much the means differ)
- p-value: < 2.2e-16 (strong evidence against the null hypothesis)
- Confidence Interval: [-11.48, -8.47] (95% confidence range for the true difference in means)
- Sample Estimates: The means of Shop One and Shop Two are 140.11 and 150.09, respectively.
Since the p-value is very small, we reject the null hypothesis, concluding that there is a significant difference between the two shops' average sales.
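For comparison, a quick sketch of the default behaviour: calling t.test() without var.equal = TRUE runs Welch's test, which does not assume equal variances and typically reports fractional degrees of freedom.
R
# Welch's two-sample t-test (the default when var.equal is not set)
t.test(shopOne, shopTwo)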
Paired Sample T-Test Approach
The Paired Sample T-Test is a statistical procedure used to determine whether the mean difference between two sets of observations is zero. Each subject is measured twice, resulting in pairs of observations.
Let's test whether there is a significant difference in the average sweetness level of sweets before and after a change in recipe. The test is run using the syntax t.test(y1, y2, paired = TRUE).
R
set.seed(2820)
# sweetness of 100 sweets measured before and after the recipe change
sweetOne <- rnorm(100, mean = 14, sd = 0.3)
sweetTwo <- rnorm(100, mean = 13, sd = 0.2)
# paired = TRUE tests the mean of the pairwise differences against zero
t.test(sweetOne, sweetTwo, paired = TRUE)
Output:
Paired t-test
data: sweetOne and sweetTwo
t = 29.31, df = 99, p-value < 2.2e-16
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
0.9892738 1.1329434
sample estimates:
mean difference
1.061109
- t-value: 29.31 (indicating a significant difference between the two means)
- p-value: < 2.2e-16 (strong evidence against the null hypothesis)
- Confidence Interval: [0.99, 1.13] (95% confidence range for the mean difference)
- Mean Difference: 1.061 (indicating a mean difference of 1.061 between the two samples)
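A paired t-test is mathematically equivalent to a one-sample t-test on the pairwise differences, which offers a quick sanity check. A minimal sketch, reusing the samples from above:
R
# equivalent formulation: test whether the mean pairwise difference is zero
t.test(sweetOne - sweetTwo, mu = 0)
This call reproduces the same t-statistic, degrees of freedom, and p-value as the paired call above.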
Differences between one-sample, two-sample, and paired-sample t-tests:
| | One-sample t-test | Two-sample t-test | Paired sample t-test |
|---|---|---|---|
| Purpose | Determines whether a single sample's mean deviates significantly from a given population mean. | Determines whether there is a significant difference between the means of two independent groups. | Determines whether the means of two related (paired) samples differ significantly from one another. |
| Data | Analyses a single set of measurements or observations. | Compares the means of two distinct groups or samples. | Analyses the same group or set of observations measured under two different conditions or at two different times. |
| Hypotheses | Tests whether the population mean differs significantly from a hypothesized value. | Tests whether there is a significant difference between the two groups' means. | Tests whether the mean difference between the paired samples differs significantly from zero. |
| Assumptions | Assumes that observations are independent and that the data is normally distributed. | Assumes that observations are independent, that the data in each group is normally distributed, and that the variances of the two groups may or may not be equal depending on the test variant. | Assumes that the paired observations are matched (dependent) and that the differences between pairs are normally distributed. |
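The assumptions in this table can be checked in R before picking a test. A minimal sketch, reusing the shop samples from above: shapiro.test() tests for normality and var.test() compares the variances of two groups (both are in the base stats package).
R
# normality check for each group (null hypothesis: data is normally distributed)
shapiro.test(shopOne)
shapiro.test(shopTwo)
# equality-of-variances check (null hypothesis: the ratio of variances is 1)
var.test(shopOne, shopTwo)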