Homework 2: QRA in Safety Engineering
1. Consumption of jet fuel for maneuvering a satellite during one year, T = 8,760 hr, is
expected to follow the Normal distribution with a mean, based on previous data, of
μ = 10,000 hr and a std. deviation σ = 1,000 hr, which is the epistemic uncertainty in
the data for the amount of fuel needed. The Normal cov (unitless dispersion relative to
the mean) = std dev/mean = 0.1, so the epistemic uncertainty in the data is 0.1 of the
mean of the random variable T.
Calculate the probability of successfully maneuvering the satellite for the duration of a
one-year mission (8,760 hr), given that you are limited to the amount of fuel needed to
maneuver the satellite under average conditions for similar missions for 10,000 hr.
Keep in mind that 8,760 hr corresponds to the fuel that is required for mission success
under average conditions but with aleatory (randomness) uncertainty for a given flight
due to uncertain (random) conditions during a flight, and μ = 10,000 hr corresponds to
the fuel that is available for the mission. So this problem is an example of Uncertainty
Modeling and Uncertainty Management as part of Risk Management. The primary
aleatory uncertainty (due to randomness) is the random variation of minimum fuel
requirement, due to random variation of conditions, during a one-year mission
(8,760 hr).
a) Write the expression for and calculate the mission time T = 1 yr (365 days) in hours.
b) Write the expression for and transform the time variable T to Z for the standard
Normal distribution tables of cumulative probability under the pdf (probability density
function) curve of the Normal distribution and calculate the value of Z using T =
hours, here mission time of one year, and values for the Normal model parameters,
mean of the data, and std. dev, uncertainty of the data, as stated above.
c) Use a table for the cumulative standard Normal distribution to obtain the cumulative
probability corresponding to the calculated Z value that fuel will not be sufficient and
also the probability that fuel will be sufficient for a 1-year mission. (See Appendix
Table A.1 in IRME, pp. 514–519)
d) To have sufficient fuel for the mission, you need enough fuel for at least
T = 1 year during which the uncertain conditions will range similarly to previous
missions and data that resulted in the estimates of mean and std. dev. parameter
values of the Normal Distribution model. Now calculate the mission reliability, which
is the probability that you will have sufficient fuel for 1 year but you will run out of
fuel for T > 1 year. Note that this is the way we calculate failure F(t), the cumulative
probability of failure at time t, and reliability R(t), the cumulative probability of
working at time t, which is also the cumulative probability of failing at a time > t,
because the Pr of failure + the Pr of success must sum to 1 by Pr Axiom 2 (failure
and success are mutually exclusive and exhaustive in the sample space).
Begin by writing a logic expression for the probability that T exceeds one year,
P(T > 8,760 hr), in terms of the complement of the probability of failure for
T < one year, to guide your analysis. Note that because the mission time T is less
than the distribution mean of μ = 10,000 hr, the mission is more likely to succeed
than to fail under the conditions that resulted in the model parameter values. The
smaller the T, the smaller the probability of failure given the mean μ = 10,000 hr.
Recognize that the condition ranges for this particular mission are unknown, so
there always will be a significant uncertainty that could be lowered by at least three
actions as indicated below.
e) Sketch a curve to represent the pdf, probability density function, for the normal
distribution, mark the Z value, and highlight the specific area under the curve
corresponding to the probability of fuel sufficiency at T = 1 year.
f) Sketch a curve to represent the cdf, cumulative distribution function, of the normal
distribution, and draw a point on the curve corresponding to the relevant area under
the pdf curve.
g) Evaluate your result with regard to reliability of the mission and therefore risk within
an acceptable or unacceptable range. Consider the consequences of a failed
mission. Also state at least one action that could lower the mission risk to a more
acceptable level.
critical mission of maneuvering a satellite, but the consequences of a failed
mission must be analyzed to judge whether the risk is within an acceptable
range. The main criterion for acceptability of the calculated reliability is the
calculated risk to include the expected consequences, such as monetary
losses and fatalities, in addition to the probability of a failed mission.
h) Your mission manager requires that the probability of mission success must be
sufficiently high that the expected monetary loss due to failure is < 10% of the
mission cost of $50 M (million) or < $5 M. Therefore your team decides to limit the
mission time T so that the probability of failure is 0.10 or less, which means that the
reliability must be 0.90 or greater. So if you calculate a reliability that is less than
0.90, estimate a revised mission time T corresponding to a Pr of success = 0.90.
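The calculation chain in parts a) to d) and h) can be checked numerically. The following is a minimal sketch, assuming the Normal model with μ = 10,000 hr and σ = 1,000 hr stated in the problem; the variable names are illustrative, and Python's standard-library NormalDist stands in for the cumulative standard Normal table.

```python
from statistics import NormalDist

MU = 10_000.0         # mean from the problem statement, hr
SIGMA = 1_000.0       # std. dev. from the problem statement, hr
T_MISSION = 365 * 24  # one-year mission time, hr

fuel = NormalDist(mu=MU, sigma=SIGMA)

# Transform T to the standard Normal variable Z.
z = (T_MISSION - MU) / SIGMA

# F(T): cumulative probability up to T (the table value for Z).
p_fail = fuel.cdf(T_MISSION)

# R(T) = 1 - F(T), since failure and success are complementary.
reliability = 1.0 - p_fail

# Revised mission time for a target reliability of 0.90, i.e. F(T) = 0.10.
t_revised = fuel.inv_cdf(0.10)

print(f"T = {T_MISSION} hr, Z = {z:.2f}")
print(f"F(T) = {p_fail:.4f}, R(T) = {reliability:.4f}")
print(f"T for R = 0.90: {t_revised:.0f} hr")
```

A table lookup should agree with these values to the precision of the table; small differences come from rounding Z to two decimals.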
2. a) Identify the overall purpose of an F-N curve, draw a rough sketch, and state the two
main objectives of this type of risk profile for upset events, which consists of a figure
showing cumulative event frequency or probability along the Y axis and event
consequence level along the X axis.
b) Sketch the basic form of a generic risk profile of this type. Include upper and
lower confidence limits.
The idea here is to show with a sketched curve, as shown in class, that the
cumulative frequency of loss events is expected to drop monotonically as the
consequence level increases.
c) Define the cumulative frequency F (or cumulative probability P) along the Y axis,
show how to calculate it, and state how it differs from an individual frequency f.
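The difference between the cumulative frequency F and an individual frequency f can be illustrated with a short sketch. The event list below is hypothetical and is not course data; the function name is also illustrative. F at a consequence level N is taken here as the sum of the individual frequencies f of all events with consequence N or greater.

```python
# Hypothetical (consequence level N, individual frequency f per year) pairs.
events = [(1, 1e-2), (10, 1e-3), (100, 1e-4), (1000, 1e-5)]

def exceedance_frequencies(events):
    """Return (N, F) pairs, where F sums f over all events with consequence >= N."""
    levels = sorted({n for n, _ in events})
    return [(n, sum(f for m, f in events if m >= n)) for n in levels]

fn_points = exceedance_frequencies(events)
for n, F in fn_points:
    print(f"N >= {n:>4}: F = {F:.5e}")
```

Because each F includes every event at or above its consequence level, the resulting curve can only stay level or drop as N increases, which matches the monotonic shape described in part b).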
3. Consider the rainfall data in the figure below. When we gather such a set of
empirical data based on observations, useful descriptive statistics include the
average or mean of the data and the variability or dispersion of the data as
measured by the variance.
a) Assuming all of the rainfall data are equally likely, write the expression to
calculate the data mean in which all data are equally weighted. Recall that all
mean calculations are weighted, but if each datum is equally weighted the
weight factors are the same = 1/n for n data. The data mean for the Rainfall
Data in Table 1.1 is 50.70 in.
b) Write the expression to calculate the variance of the n data. State the type of
uncertainty associated with the inherent variability of the rainfall data as
measured by the data variance. Also state what this type of uncertainty is
due to or what is the source of this type of uncertainty.
c) Assuming the data follow the Normal distribution, write the expression for and
calculate the std dev of the sample mean, σm, for the n = 29 data, for which
the individual sample std deviation is considered roughly the same for all of
the data, σ = 7.57 in. State the type of uncertainty that is associated with
the limited number and limited quality of the data. Also, distinguish the
uncertainty of the mean value of the data from the uncertainty of each of the
data points, assumed equally uncertain.
for comparing widths or breadths of distributions relative to the distribution
mean values.
e) Given that you are using the data in an engineering model to predict future
rainfall in the area, state the source of additional uncertainty, aside from the
uncertainty of the rainfall data, introduced by predicting rainfall intensity by
use of a model. In your answer include the expression “propagation of
uncertainty.”
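The expressions asked for in parts a) to c) of this problem can be sketched as follows. This is an illustration under assumptions: the variance uses the n − 1 (sample) divisor, which may differ from the convention used in class, and the short data list is hypothetical because the rainfall table is not reproduced here. Only σ = 7.57 in. and n = 29 come from the problem statement.

```python
import math

def mean(data):
    """Equally weighted mean: each datum has weight 1/n."""
    return sum(data) / len(data)

def sample_variance(data):
    """Variance with the n - 1 divisor (an assumed convention)."""
    m = mean(data)
    return sum((x - m) ** 2 for x in data) / (len(data) - 1)

def std_dev_of_mean(sigma, n):
    """Std dev of the sample mean: sigma_m = sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# Hypothetical values, only to exercise the functions.
demo = [1.0, 2.0, 3.0, 4.0]
print(mean(demo), sample_variance(demo))

# Values stated in part c).
sigma_m = std_dev_of_mean(7.57, 29)
print(f"sigma_m = {sigma_m:.2f} in.")
```

The result shows why the mean is known more precisely than any single datum: σm shrinks with the square root of the number of data.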
4. A school population consists of 25% juniors (J) and 20% seniors (S).
a. State the general expression for the probability of J OR S, P(J∪S), which involves
the addition of probability of the 2 events, J, S. Recall the Probability Axiom 3 with
regard to addition of probability and decide if probability addition without correction for
overlap applies in this case.
P(J∪S) =
b. Calculate each term in the general expression and calculate the probability (or
fraction as an estimation of probability) that a member of the student population is
a junior or a senior.
c. State whether and why this is an example of intersecting sets or mutually exclusive
sets.
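The general addition rule from part a can be written as a one-line function. This is a minimal sketch; it assumes that no student is counted as both a junior and a senior, so the overlap term is set to zero. The function name is illustrative.

```python
def prob_union(p_a, p_b, p_a_and_b=0.0):
    """General addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b

# Fractions from the problem statement; overlap assumed to be zero.
p_j_or_s = prob_union(0.25, 0.20)
print(f"P(J or S) = {p_j_or_s:.2f}")
```

If the two sets did intersect, the third argument would carry the overlap so that it is not counted twice.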
5. In the figure below, Ω is the set of all objects. Black is the set of all black objects, and
White is the set of all white objects. Square is the set of all square objects, and A is the
set of all objects containing A. Use the relative frequency of occurrence and the POI
(principle of indifference) to answer the following questions:
a) Calculate P(A) and P(A|Square). State whether or not A and Square are
independent. State the reason for your answer.
c) Calculate P(A|White) and P(A|Square∩White). State whether or not A and Square are
conditionally independent given White. State the reason for your answer.
6. Medical Study Problem: Make sure that all members of your team understand this
important application of traditional statistics.
A medical study compared the success rates of two treatments for kidney stones. Each
treatment was applied to two groups of people – one group in which each subject had a
small stone and one group in which each subject had a large stone. Based on various
numbers of patients in each case, the ‘average’ success rates were:
a) Recall the example of course grade averages based on year or based on course
modules in the section in RDBN on “The Danger of Averages” (RDBN Chap 1, p.
26), where the hidden variable or cause was the number of modules completed
each year by each student. In this example, the hidden variable and influence or
cause is similarly the number of patients in each case.
In the table below that shows the patient data, the upper number of each fraction is
the number of successes, and the lower number of each fraction is the total number
of cases (patients) for the data of this medical study. This is an example of the
relative frequency of occurrence, n/N, which we discussed in class. Using a System
approach to calculate all data together, fill in the table below by converting the
fractions that estimate point probabilities to success percentages and compare with
the values in the table above.
For the 4th column, “Both Small, Large”, fill in the n/N fraction and the corresponding
percentage as with the other columns.
b) Explain or show how the averages in the first table (above at the beginning of this
problem) 4th column were calculated. Was the POI (principle of indifference)
assumed in this calculation?
c) Explain and show using n/N how you calculated the averages in the second table
4th column.
Using n/N:
d) Compare the percentages in the 4th column of each table and explain why they are
different based on how they were calculated. Based on this data, is Treatment A
superior to Treatment B? Why or why not? State your reasons.
e) State briefly what is needed or what should be done to resolve with more
confidence the question of effectiveness of Treatments A and B for small stones
and for large stones.
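The two ways of averaging compared in parts b) to d) can be contrasted in a short sketch. The patient counts below are an assumption: they are the figures from the widely cited kidney-stone data set, used here because the tables are not reproduced in this text, and they should be replaced with the values from the assignment tables if those differ. The function names are illustrative.

```python
# (successes, total patients) per treatment and stone size.
# ASSUMED counts from the widely cited kidney-stone data set.
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def unweighted_average(groups):
    """Average of the group success rates, each group weighted equally (POI)."""
    rates = [s / n for s, n in groups.values()]
    return sum(rates) / len(rates)

def pooled_rate(groups):
    """n/N over all patients together, so each patient is weighted equally."""
    successes = sum(s for s, _ in groups.values())
    total = sum(n for _, n in groups.values())
    return successes / total

for name, groups in data.items():
    print(f"Treatment {name}: unweighted = {unweighted_average(groups):.1%}, "
          f"pooled n/N = {pooled_rate(groups):.1%}")
```

With these assumed counts the ranking of the two treatments reverses between the two averages, because the group sizes differ strongly between treatments.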
7. Risk Matrix
a) Explain briefly the purpose of a risk matrix and compare its purpose and objectives
with the purpose and objectives of a risk profile, both of which are useful to display
outcomes of a risk assessment.
c) Consider two types of events: Type 1 events are not highly dependent and occur
relatively independently of each other. Type 2 events are often significantly dependent
so they are more likely to co-occur under certain conditions with greatly amplified
consequences. State a recent (within the last few years) event of high consequences
that is an example of the dependent events type, and state why this is an example.
Also, state where a highly dependent event of 2 co-occurring events should be placed
and designated on a risk matrix compared to where, relatively, each of the 2 events
appearing individually and independently would be placed on the matrix.
8. From Example 2.3, Modarres RAE, Chap 2, pp. 25–30, CNG Bus System (discussed in
class, Unit 3, Slides 40–61), prepare a semi-quantitative risk matrix similar to the one
shown on Slide 49 of Unit 3, Elements of Risk and Reliability Assessment.
Instead of the 4x4 matrix shown in this slide, use a larger matrix, 5x5 or 6x6, as
shown in Unit 3, Slides 33–37, with 5 or 6 categories for probability and 5 or 6 for
consequence severity. A larger risk matrix provides more cells of Conditional Risk
acceptability to separate this region from the Acceptable Risk region and the
Unacceptable Risk region of the matrix.
For consequence severity and frequency descriptions, be guided by RAE, Chap 2, Tab
2.3, p. 25 (and in-class slides, Unit 3, Slide 48) for the relative frequency and outcome
severity categories for the fire exposure risk matrix. Based on your team’s assessment,
the values on your team risk matrix can be different from those shown in class and
equally valid based on the information and judgment of your team. We are focusing on
the assessment method that can be applied to any data, so we can use any data for this
exercise.
a) Make a list of your team scenarios (at least 10 upset scenarios) together with team
estimates of consequence level (or severity) and frequency for each. Include 1 or more
scenarios that result in potentially dependent events with greatly increased
consequences, such as co-occurring component failures or common cause failures
(CCF). Example scenarios involving hazard barriers are listed in RAE, Tab 2.4, p. 27.
Also, assume the assessed risk for each scenario outcome will be a distribution,
wide or narrow, symmetric or skewed. Represent the risk for each scenario outcome
on the matrix either by a bar symbol showing the confidence interval of the risk with
upper and lower confidence limits, or by a distribution symbol (which can be hand
drawn), as shown below and as discussed in class. The bar symbol shows the possible
range of the risk, and the distribution symbol also shows the probability overlap: a
confidence interval can reside in one region or can overlap more than one region of
the matrix to varying extents. Here are some useful distribution shapes.
For each estimated risk point value, adopt a confidence interval based on ~1.5 to 2
standard deviations, σ, which corresponds to a confidence interval of ~87% to 95% for
the Normal (Gaussian) distribution. Let σ be sized in relation to the mean value μ
and calculated according to the coefficient of variation, cov = σ/μ, which is useful for
comparing the width or spread of distributions relative to their mean values. So if the
cov = 0.5, a Normal distribution with a mean of 10 and a std. dev. of 5 has the same
relative width or spread as a Normal distribution with a mean of 20 and a std. dev. of 10.
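The relation between the number of standard deviations and the confidence level, and the sizing of σ from the cov, can be checked with a short sketch. It assumes a symmetric two-sided interval on a Normal distribution; the function names are illustrative.

```python
from statistics import NormalDist

def coverage(k):
    """Probability that a Normal variable lies within +/- k std devs of its mean."""
    std = NormalDist()
    return std.cdf(k) - std.cdf(-k)

def interval(mu, cov, k):
    """Symmetric interval mu +/- k*sigma, with sigma sized from cov = sigma/mu."""
    sigma = cov * mu
    return mu - k * sigma, mu + k * sigma

print(f"+/-1.5 sigma: {coverage(1.5):.1%}")
print(f"+/-2.0 sigma: {coverage(2.0):.1%}")

# The two distributions from the text share cov = 0.5.
print(interval(10, 0.5, 2))
print(interval(20, 0.5, 2))
```

The two intervals differ in absolute width but are identical when divided by their means, which is the sense in which the cov compares spreads.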
For the agrarian fertilizer decision, Unit 2, Slide 34, the cov values were 20/200 = 0.10
for the old fertilizer and 49/220 = 0.22 for the new fertilizer. These values
demonstrated the greater spread of the new fertilizer's outcomes relative to its mean
value, resulting in a conditionally much higher probability of high profit and a
conditionally much higher probability of failure compared to the old fertilizer. Using
cov, the relative widths of the two distributions with different mean values are
quantified and judged more realistically and with less bias than by visual comparison.
Both variance and the std. dev. (σ = square root of variance) are important for
measuring distribution spread: the std. dev. has the same units as the mean value,
but variances, not std. dev. values, can be added to calculate a total variance, such
as for a combination of two independent distributions with a variance contribution
from each distribution.
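The point that variances add while standard deviations do not can be shown in a few lines. This sketch assumes the contributions are independent; the values 3 and 4 are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def combined_std(sigmas):
    """Std dev of a sum of independent contributions: sqrt of the summed variances."""
    return math.sqrt(sum(s ** 2 for s in sigmas))

# Hypothetical std devs of two independent contributions.
total = combined_std([3.0, 4.0])
print(total)  # variances 9 + 16 = 25, so the combined std dev is 5, not 3 + 4 = 7
```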
For this problem you use the problem point value data, so your estimations of
confidence intervals for each of the point values will be hypothetical estimates of what
they could be. The goal here is to practice thinking and working with intervals and
distributions that are part of the real world of engineering rather than only point values
that do not include all of the information that we require for a Socio-Technical System,
STS, approach, such as the ranges of outcomes for real systems that we must be
prepared to evaluate for risk reduction and to manage them within acceptable risk
ranges!
A. The team solution should include a list of scenarios with each event sequence
resulting in an adverse event. The probability (1 to 4 or greater) and outcome severity
(1 to 4 or greater) or larger ranges for larger risk matrices should be estimated semi-
quantitatively so the risk levels of all scenarios can be semi-quantitatively assessed and
prioritized. Hypothetical risk ranges or confidence limits are specified and represented
on the risk matrix by bar symbols or distribution symbols. Included are one or more
scenarios that result in potentially dependent events with greatly increased
consequences and broader confidence intervals, such as co-occurring component
failures or common cause failures (CCF).
B. The risk level estimates for the outcome scenarios should appear in a planar semi-
quantitative risk matrix with probability (1 to 5 or 6) on the one axis and consequence
severity (1 to 5 or 6) on the other axis, where the quantitative ranges of the probabilities
and consequence severity are specified.
C. The team then decides the location of the scenario outcome risk levels within one or
more of the three primary regions of a risk matrix: tolerable (or acceptable), conditionally
tolerable (or acceptable with waiver), and intolerable. Each bar symbol or distribution
symbol representing a risk range can be in one region or can overlap to varying extents
in more regions of the risk matrix. The broader the range of conditions the broader the
risk distributions with increasing overlap of one or more risk matrix regions.
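Steps A to C can be sketched in code to show how a risk bar may overlap more than one region. Everything numeric here is hypothetical: the region boundaries (a score of probability category times severity category, with thresholds 6 and 15 on a 5x5 matrix) and the example scenario are placeholders, and each team would set its own.

```python
# Region names from step C, in order of increasing risk.
ORDER = ["tolerable", "conditionally tolerable", "intolerable"]

def region(probability, severity, low=6, high=15):
    """Classify one matrix cell by the product of its category indices.

    The thresholds low and high are HYPOTHETICAL region boundaries.
    """
    score = probability * severity
    if score <= low:
        return ORDER[0]
    if score < high:
        return ORDER[1]
    return ORDER[2]

def regions_spanned(prob_low, prob_high, severity):
    """Regions touched by a bar from prob_low to prob_high at one severity level."""
    found = {region(p, severity) for p in range(prob_low, prob_high + 1)}
    return [name for name in ORDER if name in found]

# Hypothetical scenario: point estimate at probability 3, severity 4,
# with a confidence bar from probability 2 to probability 4.
print(region(3, 4))
print(regions_spanned(2, 4, 4))
```

In this example the point value sits in the conditionally tolerable region while the upper confidence limit reaches the intolerable region, which is the kind of overlap the bar symbol is meant to display.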
a) Calculate P(A|B).
P(A|B) =
b) Calculate P(B|A)
P(B|A) =
10. Recall the discussion (Unit 2, Slides 49-50) of two types of urns, θ1, θ2, each with 10
balls. Urn θ1 has an average of 4 red and 6 pink balls, and θ2 has an average of 9 red
balls and 1 pink ball. There are 800 urns of type θ1 and 200 urns of type θ2 for a total of
1000 urns. Therefore, the prior probability = 0.8 (= 800/1000) that an urn selected at
random is θ1 and 0.2 (= 200/1000) that an urn selected at random is θ2. Recall also that
these probability estimates are based on the n/N relative frequency of occurrence.
Recall that we discussed the value of additional information to make a decision based on
lower uncertainty. Suppose we sample a ball from the particular urn drawn at random
and see that the drawn ball is red. Based on this one observed sample, we can
reassess the probability that the urn drawn at random is θ1 or θ2. We can calculate
the probability of an urn type based on the number of red balls in each urn type. So in
this exercise, think of n/N and the POI based on # of balls instead of based on # of urns
to estimate the frequency.
a) Based only on the sample of 1 red ball drawn at random from the randomly selected
urn, calculate the probability of each type of urn by calculating n/N following POI and
using information about number of red balls in each urn type.
i) Calculate the total number of red balls in the 800 θ1 urns and the total number of red
balls in the 200 θ2 urns.
In the θ1 urns there are 4(800) = 3,200 red balls, and in the θ2 urns there are 9(200) =
1,800 red balls for a total number of 5,000 red balls = N.
ii) Calculate the total number of red balls = N in all 1000 urns of both types.
iii) With the numbers of red balls in each type of the 1000 urns, use n/N and POI to
calculate the probability that a single ball observed to be red was removed at random
from a θ1 urn and the probability that it was removed at random from a θ2 urn.
b) Incorporate the contract and probability information based only on the single
observation of a randomly removed red ball into the decision tree shown below, and
calculate the expected monetary value for each contract alternative. Place the expected
monetary values in the boxes above the uncertainty nodes and place the maximum
expected monetary value in the box above the decision node.
c) Determine the optimum decision based on the maximum expected monetary value.
d) Compare the optimum alternative you have identified based on ball color statistics to the
optimum alternative identified given the prior probability values based on the number of
urns of each type discussed in class. In each case, the contract award and payment
information is the same.
e) Compare and prioritize decision-making based on the two cases: 1. Information on only
the number of each type of urn, as used in class. 2. Information on ball color statistics
and the observation of a red ball drawn from the specific urn selected at random.
Which is better for this application, using urn statistics as in (1.) or using ball color
statistics based on sampling from the drawn urn as in (2.)? State why the one you have
selected is better for this application.
This distinction about the type of information is important for determining, from a value
of information (VI or VOI) analysis, the amount and quality of information and its cost
needed for optimum risk management of a socio-technical engineering system.
f) Does the single ball sample in (2.) lower the uncertainty for better decision making
compared to (1.)? Why or why not? What can be done to lower the uncertainty in (2.)?
Note that this concern is an example of the value of information to lower uncertainty to
support optimized decision-making.
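The ball-count reasoning in part a) can be sketched as follows, using only the urn counts and red-ball counts given in the problem statement. The dictionary layout and names are illustrative. The last lines check the same result with Bayes' theorem, starting from the priors 0.8 and 0.2.

```python
# Urn counts and red balls per urn, from the problem statement.
urns = {
    "theta1": {"count": 800, "red_per_urn": 4},
    "theta2": {"count": 200, "red_per_urn": 9},
}

# n for each urn type: total red balls held by urns of that type.
red_balls = {k: v["count"] * v["red_per_urn"] for k, v in urns.items()}

# N: total red balls in all 1000 urns.
total_red = sum(red_balls.values())

# n/N with POI over red balls: probability of each urn type given a red ball.
posterior = {k: n / total_red for k, n in red_balls.items()}
print(red_balls, total_red, posterior)

# Cross-check with Bayes' theorem: prior times likelihood, normalized.
prior = {"theta1": 0.8, "theta2": 0.2}
likelihood = {"theta1": 4 / 10, "theta2": 9 / 10}
evidence = sum(prior[k] * likelihood[k] for k in prior)
bayes = {k: prior[k] * likelihood[k] / evidence for k in prior}
print(bayes)
```

Counting red balls and applying Bayes' theorem give the same probabilities, since both weight each urn type by how many red balls it contributes.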
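Decision tree (values recovered from the figure):
Decision node: maximum expected monetary value = $33
α1 (reject): E(α1) = $18
    θ1, probability 0.64: $40
    θ2, probability 0.36: –$20
    Expected $ value of α1: < V | α1 > = 0.64(40) + 0.36(–20) ≈ $18
α2 (optimum): E(α2) = $33
    θ1, probability 0.64: –$5
    θ2, probability 0.36: $100 (high payoff but low probability)
    Expected $ value of α2: < V | α2 > = 0.64(–5) + 0.36(100) ≈ $33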
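The expected monetary values in the decision tree can be reproduced with a short sketch. The probabilities and payoffs are the ones shown in the tree; the dictionary layout and names are illustrative.

```python
# Probabilities of each urn type after observing one red ball (from the tree).
posterior = {"theta1": 0.64, "theta2": 0.36}

# Payoffs in dollars for each contract alternative and urn type (from the tree).
payoffs = {
    "alpha1": {"theta1": 40, "theta2": -20},
    "alpha2": {"theta1": -5, "theta2": 100},
}

# Expected monetary value of each alternative: sum of probability times payoff.
emv = {
    alt: sum(posterior[state] * value for state, value in outcomes.items())
    for alt, outcomes in payoffs.items()
}

# The optimum decision is the alternative with the maximum expected value.
best = max(emv, key=emv.get)
print(emv, best)
```

The unrounded values are 18.4 and 32.8, which the tree shows rounded to whole dollars.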