
Regression Analysis of Impact of Nutrients On Quality of Cereals

This document summarizes a regression analysis of the impact of various nutrients on cereal quality ratings. It analyzed data from 77 cereal samples testing 8 nutrients: protein, fat, sodium, fiber, carbohydrates, sugars, potassium, and vitamins. The regression found that higher protein, fiber, and carbohydrate levels correlated with higher ratings, while higher fat, sodium, sugar, potassium, and vitamin levels correlated with lower ratings. Fiber had the strongest positive impact on ratings. The regression explained over 98% of the variation in ratings. This analysis can help define healthier cereal formulations and ratings based on nutrient contents.

Uploaded by

GouthamAmudhala
Copyright
© All Rights Reserved

REGRESSION ANALYSIS OF IMPACT OF NUTRIENTS ON QUALITY OF CEREALS

Problem Statement
The nation's growing obesity crisis and the prevalence of chronic disease have prompted multiple efforts
to promote healthy eating, including changes in the formulation, packaging, labelling and marketing of
food and beverage products that form an important part of a healthy lifestyle. How can information about
the nutrients present in a product help us define its rating, and how does a substantial change in one
nutrient affect the product's overall rating?

Introduction to the Analysis


To understand the relationship between the proportions of various nutrients and the ratings of cereals,
we took a data set of 77 cereal samples, recorded 8 nutrients for each sample, and noted the ratings.
The nutrients tested for quality determination are:

• Protein
• Fat
• Sodium
• Fiber
• Carbohydrates
• Sugars
• Potassium
• Vitamins
F-STATISTIC TEST

PARTICULARS    RESULTS
F-statistic    533.7 on 8 and 68 DF
p-value        < 2.2e-16

Inference: The F statistic is a test of significance for the entire regression. At α = 0.05, this
regression is statistically significant because the p-value < 0.05.
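
The reported F statistic can be cross-checked against the Multiple R Squared of 0.9843 reported for the same regression. The following is a minimal sketch in Python, which is not necessarily the tool used for the original analysis. The function name is illustrative, and the sketch relies on the standard identity F = (R^2 / p) / ((1 - R^2) / (n - p - 1)) for a regression with an intercept.

```python
def f_from_r_squared(r_squared, num_predictors, residual_df):
    """Overall F statistic implied by R^2 for a linear regression with an intercept."""
    explained = r_squared / num_predictors
    unexplained = (1.0 - r_squared) / residual_df
    return explained / unexplained


# Values reported in this analysis: R^2 = 0.9843, 8 predictors, 68 residual DF.
f_value = f_from_r_squared(0.9843, 8, 68)

# The result lands close to the reported 533.7; a small gap is expected
# because the R^2 used here is rounded to four decimals.
print(f"F implied by rounded R^2: {f_value:.1f}")
```

The residual degrees of freedom also match the data description: 77 samples minus 8 predictors minus 1 intercept gives 68.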

Contribution of Individual Independent Variables

Variable     Estimate     Std. Error   t-Value    Pr(>|t|)
Intercept    50.002428    1.448527      34.520    < 2e-16
Protein       1.993312    0.267797       7.443    2.24e-10
Fat          -3.569612    0.253931     -14.057    < 2e-16
Sodium       -0.055040    0.002942     -18.710    < 2e-16
Fiber         3.652642    0.254811      14.335    < 2e-16
Carbs         0.465675    0.070422       6.613    7.02e-09
Sugar        -1.444068    0.067120     -21.515    < 2e-16
Potassium    -0.032513    0.008636      -3.765    0.00035
Vitamins     -0.054673    0.010626      -5.145    2.46e-06

Inference: At α = 0.05, all 8 independent variables are statistically significant because their
corresponding p-values are less than 0.05. Therefore, each independent variable is individually
useful in predicting the rating.
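
Each t-value in the table is the estimate divided by its standard error. As a hedged illustration, the Python sketch below recomputes the t-values from the reported estimates and standard errors. The dictionary and function names are illustrative, and small differences in the last digit are expected because the inputs are rounded.

```python
# (estimate, standard error, reported t-value), copied from the coefficient table
coefficients = {
    "Intercept": (50.002428, 1.448527, 34.520),
    "Protein": (1.993312, 0.267797, 7.443),
    "Fat": (-3.569612, 0.253931, -14.057),
    "Sodium": (-0.055040, 0.002942, -18.710),
    "Fiber": (3.652642, 0.254811, 14.335),
    "Carbs": (0.465675, 0.070422, 6.613),
    "Sugar": (-1.444068, 0.067120, -21.515),
    "Potassium": (-0.032513, 0.008636, -3.765),
    "Vitamins": (-0.054673, 0.010626, -5.145),
}


def t_value(estimate, std_error):
    """t statistic for the null hypothesis that the coefficient is zero."""
    return estimate / std_error


for name, (estimate, std_error, reported) in coefficients.items():
    computed = t_value(estimate, std_error)
    print(f"{name:10s} computed t = {computed:8.3f}   reported t = {reported:8.3f}")
```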
Regression Line

Rating = 50.002428 + 1.993312 * Protein - 3.569612 * Fat - 0.055040 * Sodium + 3.652642 * Fiber
+ 0.465675 * Carbs - 1.444068 * Sugar - 0.032513 * Potassium - 0.054673 * Vitamins
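
The regression line can be applied directly to predict a rating from nutrient amounts. The Python sketch below is one possible way to do so; the function name, the argument names and the nutrient values in the usage example are hypothetical and are not taken from the data set. Units are assumed to match those used when the model was fitted.

```python
INTERCEPT = 50.002428

# Fitted coefficients from the regression line
COEFFICIENTS = {
    "protein": 1.993312,
    "fat": -3.569612,
    "sodium": -0.055040,
    "fiber": 3.652642,
    "carbs": 0.465675,
    "sugar": -1.444068,
    "potassium": -0.032513,
    "vitamins": -0.054673,
}


def predict_rating(**nutrients):
    """Predicted rating; any nutrient not supplied is treated as 0."""
    unknown = set(nutrients) - set(COEFFICIENTS)
    if unknown:
        raise ValueError(f"unknown nutrients: {sorted(unknown)}")
    return INTERCEPT + sum(COEFFICIENTS[name] * amount for name, amount in nutrients.items())


# Hypothetical cereal, for illustration only
base = predict_rating(protein=3, fat=1, sodium=150, fiber=2,
                      carbs=14, sugar=7, potassium=90, vitamins=25)
more_fiber = predict_rating(protein=3, fat=1, sodium=150, fiber=3,
                            carbs=14, sugar=7, potassium=90, vitamins=25)
print(f"base rating: {base:.2f}")
print(f"gain from one extra unit of fiber: {more_fiber - base:.6f}")
```

Because the model is linear, changing one nutrient by one unit while holding the others fixed changes the predicted rating by exactly that nutrient's coefficient.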

Confidence Level

Variable     2.5%        97.5%
Intercept    47.11194    52.89292
Protein       1.458932    2.527692
Fat          -4.07632    -3.0629
Sodium       -0.06091    -0.04917
Fiber         3.144174    4.16111
Carbs         0.325151    0.6062
Sugar        -1.578      -1.31013
Potassium    -0.04975    -0.01528
Vitamins     -0.07588    -0.03347

Inference: Each interval gives the range, with 95% confidence, within which the rating changes when
the corresponding independent variable increases by one unit and the others are held fixed. Clearly
Fiber leads here: increasing fiber by 1 unit raises the rating by between 3.144174 and 4.16111.
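
The 95% intervals follow from the estimates and standard errors in the coefficient table as estimate ± t * (standard error). The sketch below reconstructs two of them in Python. The critical value 1.9955 is an assumption: it is the approximate two-sided 95% Student-t quantile for 68 degrees of freedom, hard-coded here because the Python standard library has no t-distribution quantile function. The names are illustrative.

```python
# Assumed value: approximate 97.5th percentile of Student's t with 68 DF
T_CRIT_68_DF = 1.9955


def confidence_interval(estimate, std_error, t_crit=T_CRIT_68_DF):
    """Two-sided confidence interval for a regression coefficient."""
    margin = t_crit * std_error
    return estimate - margin, estimate + margin


# Estimates and standard errors from the coefficient table
protein_low, protein_high = confidence_interval(1.993312, 0.267797)
fiber_low, fiber_high = confidence_interval(3.652642, 0.254811)

print(f"Protein: {protein_low:.4f} to {protein_high:.4f}")
print(f"Fiber:   {fiber_low:.4f} to {fiber_high:.4f}")
```

The reconstructed limits should agree with the reported ones to about three decimal places; any remaining difference comes from rounding in the inputs and in the assumed critical value.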

R Squared

Multiple R Squared    0.9843

Inference: It shows that about 98.43% of the total variation in Rating is explained by the
regression.

Adjusted R Squared

Adjusted R Squared    0.9825

Inference: It shows that about 98.25% of the total variation in Rating is explained by the
regression after adjusting for the number of independent variables. It is a more reliable measure
because it penalizes variables that do not contribute significantly, whereas R squared never
decreases when a variable is added.
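
The adjusted value can be reproduced from the Multiple R Squared, the sample size and the number of independent variables. The Python sketch below uses the standard formula 1 - (1 - R^2)(n - 1)/(n - p - 1); the function name is illustrative.

```python
def adjusted_r_squared(r_squared, num_samples, num_predictors):
    """Adjusted R^2 for a linear regression with an intercept."""
    residual_df = num_samples - num_predictors - 1
    return 1.0 - (1.0 - r_squared) * (num_samples - 1) / residual_df


# Reported values: R^2 = 0.9843, 77 cereal samples, 8 nutrients
adj = adjusted_r_squared(0.9843, 77, 8)
print(f"Adjusted R^2: {adj:.4f}")
```

With these inputs the result rounds to the reported 0.9825.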
Standard Error of Estimate

Std. Error    1.859

Inference: A measure of unexplained variation, it shows the typical distance by which the
observations differ from the regression line. A value of 1.859 means the fitted ratings typically
deviate from the observed ratings by about 1.86 points. Normality of the residuals is checked
separately in the tests that follow.

Normality Test

1. Density

[Figure: Normal Q-Q Plot]

Inference: The behavior is close to normal. Including more data points would make the curve
smoother, but the distribution can be roughly concluded to be normal.

2. Anderson-Darling Normality Test

A          0.79406
p-value    0.036

Inference: Here the p-value is less than 0.05, hence the null hypothesis of a normal distribution
is rejected. However, the miss is by a small margin, so it can be concluded that the distribution
is almost normal.
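
For readers who want to see how the A statistic is formed, here is a minimal Python sketch of the Anderson-Darling statistic for normality with the mean and standard deviation estimated from the sample. It does not compute the p-value, and the two samples are synthetic illustrations, not the residuals of this regression. The function name is an assumption.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(sample):
    """Anderson-Darling A^2 statistic against a normal distribution
    whose mean and standard deviation are estimated from the sample."""
    n = len(sample)
    mu, sigma = mean(sample), stdev(sample)
    std_normal = NormalDist()
    # Normal CDF evaluated at the standardized, sorted observations
    cdf = [std_normal.cdf((x - mu) / sigma) for x in sorted(sample)]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
        for i in range(n)
    )
    return -n - total / n


# Synthetic samples for illustration only
near_normal = [NormalDist().inv_cdf((i + 0.5) / 30) for i in range(30)]
skewed = [float(k ** 3) for k in range(1, 31)]

a_normal = anderson_darling_normal(near_normal)
a_skewed = anderson_darling_normal(skewed)
print(f"near-normal sample: A^2 = {a_normal:.4f}")
print(f"skewed sample:      A^2 = {a_skewed:.4f}")
```

A larger statistic indicates a larger departure from normality, which is why the skewed sample scores higher than the near-normal one.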
3. Multicollinearity

(Correlation matrix) R                -0.1137
(Tolerance) T = (1 - R^2)              0.987
(Variance Inflation Factor) 1/T        1.013

Inference: Since the Variance Inflation Factor is less than 5, we can say there is no
multicollinearity.
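
The tolerance and Variance Inflation Factor follow from the correlation by simple arithmetic, sketched below in Python with illustrative function names. Note that this reproduces the calculation as reported, from a single correlation value; a full VIF check would usually regress each independent variable on all the others and use that R^2 instead.

```python
def tolerance(r):
    """Tolerance T = 1 - R^2 for a given correlation R."""
    return 1.0 - r ** 2


def variance_inflation_factor(r):
    """VIF = 1 / T."""
    return 1.0 / tolerance(r)


R = -0.1137  # correlation reported in the table
t = tolerance(R)
vif = variance_inflation_factor(R)
print(f"Tolerance = {t:.3f}, VIF = {vif:.3f}")
print("No multicollinearity flagged" if vif < 5 else "Possible multicollinearity")
```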

4. Residual Plot

Inference: We can observe that the residuals are evenly distributed above and below the zero
line, which suggests equal variance. Hence we can conclude that the residuals are approximately
homoscedastic.
Conclusion
A multiple linear regression was applied to the data set, which gave us a relation between the quantities of
various nutrients and the cereal rating.

Rating = 50.002428 + 1.993312 * Protein - 3.569612 * Fat - 0.055040 * Sodium + 3.652642 * Fiber + 0.465675
* Carbs - 1.444068 * Sugar - 0.032513 * Potassium - 0.054673 * Vitamins

Notable inferences:
1. Cereals with a high proportion of Protein, Fiber and Carbs were the ones with higher ratings and hence can
be considered healthier. Future products can therefore be made with a high proportion of these nutrients.
2. Fiber contributed the most among all nutrients towards the rating.
3. The Fat component should be kept as low as possible.
THANK YOU
