Regression Analysis of Impact of Nutrients On Quality of Cereals
Problem Statement
The nation’s rising obesity rate and the prevalence of chronic disease have led to multiple efforts to promote
healthy eating, including changes in the formulation, packaging, labelling and marketing of food and beverage
products that form an important part of a healthy lifestyle. How can information about the nutrients present in a
product help us define its rating, and how does a substantial change in one nutrient affect the product's
overall rating? The nutrients considered are:
• Protein
• Fat
• Sodium
• Fiber
• Carbohydrates
• Sugars
• Potassium
• Vitamins
Introduction to the Analysis
F-STATISTIC TEST
PARTICULARS    RESULTS
F-statistic    533.7 on 8 and 68 DF
p-value        < 2.2e-16
Inference: The F-statistic is a test of significance for the entire regression. At α = 0.05, this regression is
statistically significant because the p-value is less than 0.05.
Contribution of Individual Independent Variables
Variable     Estimate     Std. Error   t-Value    Pr(>|t|)
Intercept    50.002428    1.448527     34.520     < 2e-16
Protein      1.993312     0.267797     7.443      2.24e-10
Fat          -3.569612    0.253931     -14.057    < 2e-16
Sodium       -0.055040    0.002942     -18.710    < 2e-16
Fiber        3.652642     0.254811     14.335     < 2e-16
Carbs        0.465675     0.070422     6.613      7.02e-09
Sugar        -1.444068    0.067120     -21.515    < 2e-16
Potassium    -0.032513    0.008636     -3.765     0.00035
Vitamins     -0.054673    0.010626     -5.145     2.46e-06
Inference: At α = 0.05, all 8 independent variables are statistically significant because their corresponding
p-values are less than 0.05. Therefore, each independent variable is individually useful in predicting the rating.
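Each t-value in the coefficient table is the estimate divided by its standard error. The following is a minimal sketch that recomputes the t-values from the printed figures; the dictionary layout and the helper name `t_value` are illustrative choices, not part of the original analysis.

```python
# (estimate, standard error, reported t-value) copied from the coefficient table.
coefficients = {
    "Intercept": (50.002428, 1.448527, 34.520),
    "Protein": (1.993312, 0.267797, 7.443),
    "Fat": (-3.569612, 0.253931, -14.057),
    "Sodium": (-0.055040, 0.002942, -18.710),
    "Fiber": (3.652642, 0.254811, 14.335),
    "Carbs": (0.465675, 0.070422, 6.613),
    "Sugar": (-1.444068, 0.067120, -21.515),
    "Potassium": (-0.032513, 0.008636, -3.765),
    "Vitamins": (-0.054673, 0.010626, -5.145),
}


def t_value(estimate, std_error):
    """t-statistic for the null hypothesis that the coefficient is zero."""
    return estimate / std_error


# Recompute every t-value from the rounded table entries.
recomputed = {
    name: t_value(est, se) for name, (est, se, _) in coefficients.items()
}

for name, (_, _, reported) in coefficients.items():
    print(f"{name:<10} t = {recomputed[name]:9.3f}   (reported {reported:9.3f})")
```

Small differences in the last digit are expected, because the table shows rounded estimates and standard errors.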
Regression Line
Rating = 50.002428 + 1.993312 * Protein - 3.569612 * Fat - 0.055040 * Sodium + 3.652642 * Fiber + 0.465675 * Carbs - 1.444068 * Sugar - 0.032513 * Potassium - 0.054673 * Vitamins
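The regression line can be used to see how a change in one nutrient moves the predicted rating while the others are held fixed. The sketch below assumes nutrient amounts are given in the same units as the data used to fit the model; the function name `predict_rating` and the example cereal are hypothetical and not taken from the dataset.

```python
INTERCEPT = 50.002428
COEFFICIENTS = {
    "Protein": 1.993312,
    "Fat": -3.569612,
    "Sodium": -0.055040,
    "Fiber": 3.652642,
    "Carbs": 0.465675,
    "Sugar": -1.444068,
    "Potassium": -0.032513,
    "Vitamins": -0.054673,
}


def predict_rating(nutrients):
    """Apply the fitted regression line to a dict of nutrient amounts."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * amount for name, amount in nutrients.items()
    )


# Hypothetical cereal, for illustration only.
cereal = {
    "Protein": 3, "Fat": 1, "Sodium": 140, "Fiber": 2,
    "Carbs": 15, "Sugar": 6, "Potassium": 90, "Vitamins": 25,
}

baseline = predict_rating(cereal)
# Same cereal with 5 more units of sugar, everything else unchanged.
sweeter = predict_rating({**cereal, "Sugar": cereal["Sugar"] + 5})

print(f"baseline rating: {baseline:.2f}")
print(f"change from +5 sugar: {sweeter - baseline:.2f}")
```

Because the model is linear, the change in predicted rating is always the coefficient times the change in the nutrient, regardless of the starting values.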
Adjusted R Squared
Adjusted R Squared: 0.9825
Inference: About 98.25% of the total variation in Rating is explained by the regression. Adjusted R squared is a
more conservative measure than R squared because it penalizes each additional independent variable, so it
increases only when a variable improves the fit by more than would be expected by chance.
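As a consistency check, the adjusted R squared can be recovered from the reported F-statistic (533.7 on 8 and 68 DF). The sketch below assumes an ordinary least squares fit with an intercept, for which F = (R² / k) / ((1 - R²) / residual DF); the variable names are illustrative.

```python
f_statistic = 533.7
k = 8            # number of independent variables (numerator DF)
df_resid = 68    # residual (denominator) degrees of freedom
n = df_resid + k + 1  # implied number of observations

# Invert F = (R^2 / k) / ((1 - R^2) / df_resid) to obtain R^2.
r_squared = (f_statistic * k) / (f_statistic * k + df_resid)

# Adjusted R^2 penalizes the number of independent variables.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / df_resid

print(f"n = {n}")
print(f"R squared          = {r_squared:.4f}")
print(f"Adjusted R squared = {adj_r_squared:.4f}")
```

The recovered adjusted value rounds to the reported 0.9825, so the F-statistic and the adjusted R squared agree with each other.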
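Standard Error of Estimate
Std. Error: 1.859
Inference: A measure of unexplained variation, it shows the typical distance between the observations and the
regression line. A value of 1.859 means the observed ratings differ from the fitted ratings by roughly 1.86
rating points, which is small relative to the intercept of about 50. This value alone does not establish that
the residuals are normally distributed; that requires a separate check of the residuals.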
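The standard error of estimate is the square root of the residual sum of squares divided by the residual degrees of freedom. The sketch below shows that formula and backs out the residual sum of squares implied by the reported 1.859 on 68 DF; the helper `residual_standard_error` and the toy residuals are hypothetical, since the underlying data are not reproduced here.

```python
import math


def residual_standard_error(residuals, n_predictors):
    """sqrt(SSE / (n - k - 1)) for a model with an intercept."""
    df = len(residuals) - n_predictors - 1
    sse = sum(r * r for r in residuals)
    return math.sqrt(sse / df)


# Reported values from the regression output.
reported_rse = 1.859
df_resid = 68

# Residual sum of squares implied by the reported standard error.
implied_sse = reported_rse ** 2 * df_resid
print(f"implied SSE: {implied_sse:.1f}")

# Toy residuals (made up) to show the function in use: 4 points, 1 predictor.
toy = residual_standard_error([1.0, -1.0, 2.0, -2.0], n_predictors=1)
print(f"toy standard error: {toy:.3f}")
```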
Notable inferences:
1. Protein, Fiber and Carbs have positive coefficients, so cereals with a high proportion of these nutrients had
higher ratings and can be considered healthier. Future products can therefore be made with a high proportion of
these nutrients.
2. Fiber has the largest coefficient of all the nutrients (about +3.65 rating points per unit), so it contributed
the most towards the rating.
3. Fat has the largest negative coefficient (about -3.57 rating points per unit), so the fat component should be
kept as low as possible.
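The inferences above can be read directly off the fitted coefficients. This is a minimal sketch that ranks the nutrients by the absolute size of their per-unit effect and separates positive from negative contributors. It compares raw coefficients only, so the ranking depends on the units in which each nutrient is measured.

```python
COEFFICIENTS = {
    "Protein": 1.993312,
    "Fat": -3.569612,
    "Sodium": -0.055040,
    "Fiber": 3.652642,
    "Carbs": 0.465675,
    "Sugar": -1.444068,
    "Potassium": -0.032513,
    "Vitamins": -0.054673,
}

# Sort nutrients from largest to smallest per-unit effect on the rating.
ranked = sorted(COEFFICIENTS.items(), key=lambda item: abs(item[1]), reverse=True)

raises_rating = [name for name, coef in ranked if coef > 0]
lowers_rating = [name for name, coef in ranked if coef < 0]

for name, coef in ranked:
    print(f"{name:<10} {coef:+.6f}")
print("raise the rating:", raises_rating)
print("lower the rating:", lowers_rating)
```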
THANK YOU