Decision Science
[Figure: Branches of probability for a coin toss: Head and Tail]
Conditional probability: It is the probability of an event or outcome occurring, given that a
previous event or outcome has occurred. The joint probability of both events is calculated by
multiplying the probability of the preceding event by the conditional probability of the
succeeding event; equivalently, P(B|A) = P(A and B) / P(A).
Conditional probabilities are contingent on a previous result or event occurring. A
conditional probability would look at such events in relationship with one another.
Conditional probability is thus the likelihood of an event or outcome occurring based on the
occurrence of some other event or prior outcome.
Two events are said to be independent if one event occurring does not affect the probability
that the other event will occur. However, if one event occurring or not does, in fact, affect
the probability that the other event will occur, the two events are said to be dependent. If
events are independent, then the probability of some event B is not contingent on what
happens with event A. A conditional probability, therefore, relates to those events that are
dependent on one another.
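As a small illustration of these definitions, the sketch below enumerates two fair coin tosses and computes conditional probabilities as P(B|A) = P(A and B) / P(A). The events and the helper names `prob` and `conditional` are chosen here purely for illustration and are not part of the case study.

```python
from fractions import Fraction
from itertools import product

# Sample space for two fair coin tosses: HH, HT, TH, TT (all equally likely).
outcomes = list(product("HT", repeat=2))

def prob(event):
    """Probability of an event (a predicate over outcomes) under equal likelihood."""
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))

def conditional(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    joint = prob(lambda o: event(o) and given(o))
    return joint / prob(given)

def first_is_head(o):    # event A
    return o[0] == "H"

def second_is_head(o):   # event B
    return o[1] == "H"

def both_heads(o):       # event C
    return o == ("H", "H")

p_b = prob(second_is_head)
p_b_given_a = conditional(second_is_head, first_is_head)
p_c = prob(both_heads)
p_c_given_a = conditional(both_heads, first_is_head)

# A and B are independent: knowing A does not change the probability of B.
print("P(B) =", p_b, " P(B|A) =", p_b_given_a)
# A and C are dependent: knowing A changes the probability of C.
print("P(C) =", p_c, " P(C|A) =", p_c_given_a)
```

Exact fractions are used instead of floats so that the equality check for independence is not affected by rounding.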
According to the case study, the records indicate that when teams win the championship, they
win the first game of the series 70% of the time. When they lose the series, they win the first
game 25% of the time. Based on the given information, let us draw a probability tree diagram:
Series
    Win the series (60%)
        Win the 1st match (70%)
        Lose the 1st match (30%)
    Lose the series (40%)
        Win the 1st match (25%)
        Lose the 1st match (75%)
Summary: The scenario is the final series of the Indian Premier League, which features the
favourite team, Garuda. According to an expert, Raj Kaul, Garuda's probability of winning the
series is 60%, based on his analysis of the current situation. Records from previous years show
that teams that win the championship win the series opener 70% of the time, while teams that
lose the series win the opening game 25% of the time. The opening game has now finished, and
Garuda lost it. The quantity required is therefore the probability that Garuda wins the
series given that it lost the first game.
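The figures from the case study can be combined with Bayes' theorem. The sketch below assumes the question is P(win the series | lost the first game); the variable names are illustrative, and only the 60%, 70% and 25% figures come from the text.

```python
# Figures taken from the case study.
p_win_series = 0.60            # expert's estimate that Garuda wins the series
p_win_first_given_win = 0.70   # series winners win the first game 70% of the time
p_win_first_given_lose = 0.25  # series losers win the first game 25% of the time

# Complements along each branch of the probability tree.
p_lose_series = 1 - p_win_series
p_lose_first_given_win = 1 - p_win_first_given_win
p_lose_first_given_lose = 1 - p_win_first_given_lose

# Joint probabilities of the two paths that end in "lost the first game".
joint_win_series = p_win_series * p_lose_first_given_win
joint_lose_series = p_lose_series * p_lose_first_given_lose

# Total probability of losing the first game, then Bayes' theorem.
p_lose_first = joint_win_series + joint_lose_series
posterior = joint_win_series / p_lose_first

print(f"P(lost first game)              = {p_lose_first:.3f}")
print(f"P(win series | lost first game) = {posterior:.3f}")
```

Under these assumptions the two paths have joint probabilities 0.18 and 0.30, so the posterior is 0.18 / 0.48 = 0.375.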
Refer to page no:101(4.2), 105(4.3), 106(Figure-4.2), 112(4.4).
Answer [2].
Content: -
➢ Introduction to Simple Linear Regression
➢ Explain Equation of Linear Regression
➢ Solve the numerical mentioned in the question
Simple linear regression is a statistical and machine learning model that estimates the linear
relationship between a quantitative explanatory variable x and a quantitative response
variable y. Another type of linear regression, multiple linear regression, uses several
explanatory variables to predict a response variable, but we focus on the former here. Simple
linear regression is therefore a method for understanding the relationship between two
variables, x and y.
One variable, x, is known as the predictor variable.
The other variable, y, is known as the response variable.
A simple linear regression line has the form:
y= b0 + b1*x
• b1 = slope of line
• b0 = intercept of line
• x = explanatory variable
• y = response variable
Linear regression is the most basic and most commonly used form of predictive analysis. One
variable is treated as the explanatory (independent) variable, and the other is treated as the
dependent variable. For example, a modeller might want to relate the weights of individuals to
their heights using linear regression.
It makes the dependency between two variables easy to analyse. Because the fitted relationship
between the independent and dependent variables is linear, it is easy to interpret. Linear
regression models have long been used by statisticians, computer scientists and others who
tackle quantitative problems.
y = Customer Satisfaction score, x = Quality Ratings given by Customer

           y      x      xy     x²
           5      5      25     25
           5      5      25     25
           5      4      20     16
           5      4      20     16
           5      5      25     25
           5      5      25     25
           5      3      15      9
           5      5      25     25
           4      4      16     16
           4      4      16     16
           4      4      16     16
           4      4      16     16
           4      3      12      9
           4      3      12      9
           4      4      16     16
           4      2       8      4
           3      3       9      9
           3      3       9      9
           3      3       9      9
           3      2       6      4
           2      2       4      4
           2      2       4      4
           1      1       1      1
           1      1       1      1
           1      1       1      1
           1      3       3      9
           1      1       1      1
Total     93     86     340    320
We have n = 27 observations, with
∑x = 86
∑y = 93
∑x² = 320
∑xy = 340
Now, SSxy = ∑xy - (∑x*∑y)/n
= 340 - (86*93)/27
= 340 - 296.222
SSxy = 43.778
SSxx = ∑x² - (∑x)²/n
= 320 - (86)²/27
= 320 - 7396/27
= 320 - 273.926
SSxx = 46.074
Now, b1 = SSxy/SSxx
b1 = 43.778/46.074
Hence, b1 = 0.950
b0 = ∑y/n - b1*∑x/n
b0 = 93/27 - 0.950*86/27
= 3.444 - 3.026
Hence, b0 = 0.418
y = b0 + b1*x
Hence, the least squares equation of the regression line is
y = 0.418 + 0.950x
The slope of the line, b1 = 0.950, means that for every unit increase in x (Quality Ratings
given by Customer), y (Customer Satisfaction score) is predicted to increase by about 0.95.
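As an independent check, the sketch below recomputes the sums, SSxy, SSxx and the two coefficients directly from the 27 tabulated (y, x) pairs, using the same formulas b1 = SSxy/SSxx and b0 = ∑y/n - b1*∑x/n. The variable names are illustrative.

```python
# (y, x) pairs from the table: y = satisfaction score, x = quality rating.
pairs = [
    (5, 5), (5, 5), (5, 4), (5, 4), (5, 5), (5, 5), (5, 3), (5, 5),
    (4, 4), (4, 4), (4, 4), (4, 4), (4, 3), (4, 3), (4, 4), (4, 2),
    (3, 3), (3, 3), (3, 3), (3, 2),
    (2, 2), (2, 2),
    (1, 1), (1, 1), (1, 1), (1, 3), (1, 1),
]

n = len(pairs)
sum_y = sum(y for y, _ in pairs)
sum_x = sum(x for _, x in pairs)
sum_xy = sum(x * y for y, x in pairs)
sum_x2 = sum(x * x for _, x in pairs)

# Sums of squares used by the least squares formulas.
ss_xy = sum_xy - sum_x * sum_y / n
ss_xx = sum_x2 - sum_x ** 2 / n

b1 = ss_xy / ss_xx                 # slope
b0 = sum_y / n - b1 * sum_x / n    # intercept

print(f"n = {n}, sum x = {sum_x}, sum y = {sum_y}, sum xy = {sum_xy}, sum x^2 = {sum_x2}")
print(f"SSxy = {ss_xy:.3f}, SSxx = {ss_xx:.3f}")
print(f"b1 = {b1:.3f}, b0 = {b0:.3f}")
```

Running it gives SSxy = 43.778, SSxx = 46.074, b1 = 0.950 and b0 = 0.418 for this data.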
Summary: Linear regression is a common term in the field of statistics. Statistics is the
science of collecting, organizing and interpreting data, and it is most frequently associated
with surveys, studies, research and experiments. Simple linear regression is a statistical
approach that models the relationship between two variables, denoted by x and y. This approach
focuses on the conditional probability distribution of y given x. As the first type of
regression analysis to be thoroughly studied, simple linear regression has proved extensively
useful in various practical applications and methodologies. It works by assuming that the
variables x and y have a linear relationship within the given set of data.
Refer to page no: 275(7.3), 277(7.4).
Answer [3] A.
Content:
• The relative frequency is equal to the frequency of an event divided by the size of the
population. It thus requires first finding the frequency and the population size.
• The frequency and the population can be based on two different things: either a sample or
the known possible outcomes.
• For example, asking 100 people who enter a store whether they plan to buy milk or
not is a sample. The size of this sample is therefore 100.
• Alternatively, the frequency can be based on theoretical outcomes. An example of this
is Punnett squares in genetics.
• A relative frequency always lies between 0 and 1 and can be read as a probability. Recall
that a probability of 0 means an event is impossible and 1 means the event is certain.
Sr. No    State Name    District Name    Total MSMEs    Relative Frequency
1         TRIPURA       WEST TRIPURA     2915           0.438279958
2         TRIPURA       SOUTH TRIPURA    586            0.088107052
Total (all districts)                    6651           1
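The relative frequencies in the table follow from dividing each district's count by the overall total. The sketch below reproduces this for the two districts listed, taking the total of 6651 as given in the table; the dictionary name is illustrative.

```python
# MSME counts for the two districts listed in the table.
msme_counts = {"WEST TRIPURA": 2915, "SOUTH TRIPURA": 586}

# Overall total stated in the table; it covers all districts, not only these two.
total_msmes = 6651

# Relative frequency = frequency / total.
relative_frequency = {
    district: count / total_msmes for district, count in msme_counts.items()
}

for district, rf in relative_frequency.items():
    print(f"{district}: {rf:.6f}")
```

Each value lies between 0 and 1, and the relative frequencies of all districts together would sum to 1.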
[Bar graph: relative frequency of MSMEs by district; x-axis: districts 1 to 8; y-axis: relative
frequency, from 0 to 0.45]
❖ The x-axis displays the name of the district and the y-axis displays the relative frequency
of MSMEs in that district.
A bar graph is a graphical representation of information. It uses bars that extend to different
heights to depict value.
Bar graphs can be created with vertical bars, horizontal bars, grouped bars (multiple bars that
compare values in a category), or stacked bars (bars containing multiple types of
information).
❖ The purpose of a bar graph is to convey relational information quickly in a visual manner.
The bars display the value for a particular category of data.
❖ The vertical axis on the left or right side of the bar graph is called the y-axis. The
horizontal axis at the bottom of a bar graph is called the x-axis.
❖ The height or length of the bars represents the value of the data. The value corresponds to
levels on the y-axis.
❖ The top 2 districts shown in the bar graph are as follows:
[Bar graph: relative frequency (y-axis, from 0 to 0.45) of the top 2 districts, West Tripura
and North Tripura]
❖ Median: The median is defined as the middle value in a given set of numbers or data. In
Mathematics, three different measures are used to describe the typical value of a given set
of numbers: the mean, the median and the mode. These three measures are called the
measures of central tendency. The mean is the average value of the given data. The median
is the middle value of the given data. The mode is the most frequently occurring value in
the given data.
❖ The median is often compared with other descriptive statistics such as the mean
(average), mode, and standard deviation.
❖ The median is the middle number in a sorted list of numbers and can be more descriptive
of that data set than the average.
❖ The median is sometimes used as opposed to the mean when there are outliers in the
sequence that might skew the average of the values.
❖ If there is an odd number of values, the median is the value in the middle of the sorted
list, with the same number of values below and above it. If there is an even number of
values, the median is the average of the two middle values.
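The points above can be tried out with Python's standard `statistics` module. The numbers below are hypothetical and chosen only to show the odd case, the even case and the effect of an outlier on the mean versus the median.

```python
from statistics import mean, median

odd_values = [3, 1, 7, 5, 9]        # hypothetical data, 5 values
even_values = [3, 1, 7, 5, 9, 11]   # hypothetical data, 6 values
with_outlier = [3, 1, 7, 5, 900]    # odd_values with one extreme value

# Odd count: the middle value of the sorted list [1, 3, 5, 7, 9].
median_odd = median(odd_values)

# Even count: the average of the two middle values, 5 and 7.
median_even = median(even_values)

# An outlier pulls the mean far away but leaves the median unchanged.
mean_clean, mean_outlier = mean(odd_values), mean(with_outlier)
median_outlier = median(with_outlier)

print("median (odd count): ", median_odd)
print("median (even count):", median_even)
print("mean without / with outlier:  ", mean_clean, "/", mean_outlier)
print("median without / with outlier:", median_odd, "/", median_outlier)
```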
The median of the data set below, which is sorted in ascending order, is calculated as follows.
State Name        District Name     Number of Micro, Small and Medium Enterprises
ANDHRA PRADESH    SRIKAKULAM        10895
ANDHRA PRADESH    KURNOOL           15362
ANDHRA PRADESH    ANANTHAPUR        21193
ANDHRA PRADESH    KRISHNA           23231
ANDHRA PRADESH    GUNTUR            25479
ANDHRA PRADESH    EAST GODAVARI     26546
ANDHRA PRADESH    CHITOOR           27670 (Median)
ANDHRA PRADESH    VISAKHAPATNAM     29070
ANDHRA PRADESH    VIZIANAGARAM      30186
ANDHRA PRADESH    WEST GODAVARI     33541
ANDHRA PRADESH    Y.S.R             37500
ANDHRA PRADESH    PRAKASAM          45171
ANDHRA PRADESH    SPSR NELLORE      54059
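Median = value of the ((n+1)/2)th term, where n = 13
= value of the 7th term
= 27670 (CHITOOR)
The districts above the median value are as follows: VISAKHAPATNAM, VIZIANAGARAM,
WEST GODAVARI, Y.S.R, PRAKASAM and SPSR NELLORE.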
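The same calculation can be sketched in Python using the 13 district counts from the table. The dictionary and variable names are illustrative; the position formula (n+1)/2 is the one used above and applies because n is odd.

```python
from statistics import median

# District counts from the table (Andhra Pradesh).
msmes = {
    "SRIKAKULAM": 10895,
    "KURNOOL": 15362,
    "ANANTHAPUR": 21193,
    "KRISHNA": 23231,
    "GUNTUR": 25479,
    "EAST GODAVARI": 26546,
    "CHITOOR": 27670,
    "VISAKHAPATNAM": 29070,
    "VIZIANAGARAM": 30186,
    "WEST GODAVARI": 33541,
    "Y.S.R": 37500,
    "PRAKASAM": 45171,
    "SPSR NELLORE": 54059,
}

values = sorted(msmes.values())
n = len(values)

# 1-based position of the middle term for an odd number of values.
position = (n + 1) // 2
median_value = values[position - 1]

# Cross-check against the standard library.
library_median = median(values)

# Districts whose count is strictly greater than the median.
above_median = [district for district, count in msmes.items() if count > median_value]

print(f"n = {n}, middle position = {position}, median = {median_value}")
print("Above the median:", ", ".join(above_median))
```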
❖ Summary: A frequency distribution graph shows the frequency of the outcomes in a
particular sample. It is built from a table of values with the outcomes in one column and
the number of times they appear (i.e., the frequency) in the other column. This table is
known as the frequency distribution table, from which the cumulative frequency graph, or
ogive, can also be plotted.
❖ The median represents the middle value in a dataset. The median is important because it
gives us an idea of where the centre value is located in a dataset. The median tends to be
more useful to calculate than the mean when a distribution is skewed and/or has outliers.
THANK YOU!