Fault Tree Analysis 2
Clifton A. Ericson II
[email protected]
1 © C. Ericson 1999
Fault Tree Analysis
Clifton A. Ericson II
Sept. 2000
[email protected] or [email protected]
Fault Tree Analysis
Outline
• Overview
• History
• Basic Process
• Definitions
• Construction
• Mathematics
• Evaluation
• Pitfalls
• Rules
• Examples
Fault Tree Analysis
FTA Overview
Introduction
FTA - Description
• Tool
  - evaluate complex systems
  - identify events that can cause an Undesired Event
  - safety, reliability, unavailability, accident investigation
• Analysis
  - identifies root causes
  - deductive (general to the specific)
  - provides risk assessment
    - cut sets (qualitative)
    - probability (quantitative)
FTA - Description
• Methodology
  - defined, structured and rigorous
  - easy to learn, perform and follow
  - utilizes Boolean Algebra, probability theory, reliability theory, logic
  - follows the laws of physics, chemistry and engineering
Example FT
[Figure: system schematic (Battery, Light, elements A and B) and its FT model for the top event "Light Fails Off", with events A, B, C, D, E]
Cut Sets
Event combinations that can cause Top Undesired Event to occur
CS   Probability
A    PA = 1.0 x 10^-6
B    PB = 1.0 x 10^-7
C    PC = 1.0 x 10^-7
D    PD = 1.0 x 10^-6
E    PE = 1.0 x 10^-9
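As a rough illustration of how these five single-event cut sets combine into a top event probability, the sketch below computes both the simple sum and the union of independent events. Independence of A through E is an assumption made here, and the function names are hypothetical.

```python
import math

# Cut set probabilities copied from the slide; each cut set is a single event.
cut_set_probs = {"A": 1.0e-6, "B": 1.0e-7, "C": 1.0e-7, "D": 1.0e-6, "E": 1.0e-9}

def rare_event_sum(probs):
    """Sum of cut set probabilities (first-order approximation)."""
    return sum(probs)

def independent_union(probs):
    """1 - prod(1 - p): union of single-event cut sets, assuming independence."""
    return 1.0 - math.prod(1.0 - p for p in probs)

p_sum = rare_event_sum(cut_set_probs.values())
p_union = independent_union(cut_set_probs.values())
print(f"sum   = {p_sum:.6e}")
print(f"union = {p_union:.6e}")
```

For probabilities this small the two figures differ only far past the leading digits, which is why the simple sum is often good enough as a first estimate.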
FTA Application – Why
• Root Cause Analysis
  - Identify all relevant events and conditions leading to Undesired Event
  - Determine parallel and sequential event combinations
  - Model diverse/complex event interrelationships involved
• Risk Assessment
  - Calculate the probability of an Undesired Event (level of risk)
  - Identify safety critical components/functions/phases
  - Measure effect of design changes
• Design Safety Assessment
  - Demonstrate compliance with requirements
  - Show where safety requirements are needed
  - Identify and evaluate potential design defects/weak links
  - Determine Common Mode failures
FTA -- Coverage
• Failures
• Fault Events
• Normal Events
• Environmental Effects
• Systems, subsystems, and components
• System Elements
  - hardware, software, human, instructions
• Time
  - mission time, single phase, multi phase
• Repair
FT Strengths
• Visual model -- cause/effect relationships
• Easy to learn, do and follow
• Models complex system relationships in an understandable manner
  - Follows paths across system boundaries
  - Combines hardware, software, environment and human interaction
• Probability model
• Scientifically sound
  - Boolean Algebra, Logic, Probability, Reliability
  - Physics, Chemistry and Engineering
• Commercial software is available
• FT’s can provide value despite incomplete information
• Proven Technique
FTA Misconceptions
• Not a Hazard Analysis
  - root cause analysis vs. hazard analysis
  - deductive vs. inductive
• Not an FMEA
  - FMEA is bottom up single thread analysis
FTA Application -- When
• Required by customer
• Required for certification
• Necessitated by the risk involved with the product (risk is high)
• Accident/incident/anomaly investigation
• To make a detailed safety case for a safety critical system
• To evaluate corrective action or design options
• Need to evaluate criticality, importance, probability and risk
• Need to know root cause chain of events
• To evaluate the effect of safety barriers
• To determine the best location for safety devices (weak links)
FTA Is Not For Every Hazard
Hazard    Risk Index    FTA
Haz1      3C
Haz2      2D
Haz3      1B            FTA - Inadvertent Weapon Arm
Haz4      2C
Haz5      3B
...
Haz77     1C            FTA - Inadvertent Weapon Launch
...
Haz100    2C
Only do FTA on Safety Critical hazards.
Example Applications
• Evaluate inadvertent arming and release of a weapon
• Calculate the probability of a nuclear power plant accident
• Evaluate an industrial robot going astray
• Calculate the probability of a nuclear power plant safety device being unavailable when needed
• Evaluate inadvertent deployment of jet engine thrust reverser
• Evaluate the accidental operation and crash of a railroad car
• Evaluate spacecraft failure
• Calculate the probability of a torpedo striking target vessel
• Evaluate a chemical process and determine where to monitor the process and establish safety controls
FTA Timeline
• Design Phase
  - FTA should start early in the program
  - The goal is to influence design early, before changes are too costly
  - Update the analysis as the design progresses
  - Each FT update adds more detail to match design detail
  - Even an early, high level FT provides useful information
• Operations Phase
  - FTA during operations for root cause analysis
  - Find and solve problems (anomalies) in real time
FTA – Summary
[Figure: a System (elements A, B, C) and its Undesired Event, mapped to a Fault Tree whose top node is the UE]
FTA History
FTA Historical Stages
The Beginning Years (1961 – 1970)
• H. Watson of Bell Labs, along with A. Mearns, developed the technique for the Air Force for evaluation of the Minuteman Launch Control System, circa 1961
• Recognized by Dave Haasl of Boeing as a significant system safety analysis tool (1963)
• First major use when applied by Boeing on the entire Minuteman system for safety evaluation (1964 – 1967, 1968 – 1999)
• The first technical papers on FTA were presented at the first System Safety Conference, held in Seattle, June 1965
• Boeing began using FTA on the design and evaluation of commercial aircraft, circa 1966
• Boeing developed a 12-phase fault tree simulation program, and a fault tree plotting program on a Calcomp roll plotter
• Adopted by the Aerospace industry (aircraft and weapons)
FTA Definitions
FT Building Blocks
Node Types:
• Basic Events (BE)
• Gate Events (GE)
• Condition Events (CE)
• Transfer Events (TE)
FT Node Types
Basic Event (BE)
• Failure Event
  - Primary Failure - basic component failure (circle symbol)
  - Secondary Failure - failure caused by external force (diamond symbol)
• Normal Event
  - An event that describes a normally expected system state
  - An operation or function that occurs as intended or designed, such as “Power Applied At Time T1”
  - The Normal event is usually either On or Off, having a probability of either 1 or 0
  - House symbol
• The BE’s are where the failure rates and probabilities enter the FT
FT Node Types
Transfer Event (TE)
• A pointer to a tree branch
• Indicates a subtree branch that is used elsewhere in the tree (transfer in/out)
• A Transfer always involves a Gate Event node on the tree, and is symbolically represented by a Triangle
• The Transfer serves several different purposes:
  - Starts a new page (for plots)
  - Indicates where a branch is used in numerous places in the same tree, but is not repeatedly drawn (Internal Transfer) (MOB)
  - Indicates an input module from a separate analysis (External Transfer)
OR Gate
[Figure: OR gate. Output event "Valve Is Closed"; input events "Valve Is Closed Due To H/W Failure" and "Valve Is Closed Due To S/W Failure"]
AND Gate
[Figure: AND gate. Output event "All Site Power Is Failed"]
Inhibit Gate
[Figure: Inhibit gate; labels Y1, A, B]
Transfer Symbols
[Figure: transfer symbols. Transfer In: event "Switch Is Open" with triangle A. Transfer Out: branch "Switch Is Open" marked with triangle A, with input events "Switch Fails Open" and "SW Inadv Opened"]
Failure / Fault
• Failure
  - The occurrence of a basic component failure.
  - The result of an internal inherent failure mechanism, thereby requiring no further breakdown.
  - Example - Resistor R77 Fails in the Open Circuit Mode.
• Fault
  - The occurrence or existence of an undesired state for a component, subsystem or system.
  - The result of a failure or chain of faults/failures; can be further broken down.
  - The component operates correctly, except at the wrong time, because it was commanded to do so.
  - Example - The light is failed off because the switch failed open, thereby removing power.
Failure / Fault Example
[Figure: circuit with Battery, Light, Sw A and Computer. Top event "Light Fails Off" (Fault) with inputs "Light Bulb Fails" (Primary Failure) and "Light Cmd’d Off – No Pwr" (Command Fault)]
Primary, Secondary, Command Fault
System Complexities Terms
• MOE
  - A Multiple Occurring Event or failure mode that occurs in more than one place in the FT
  - Also known as a redundant or repeated event
• MOB
  - A multiple occurring branch
  - A tree branch that is used in more than one place in the FT
  - All of the Basic Events within the branch would actually be MOE’s
• Branch
  - A subsection of the tree (subtree), similar to a limb on a real tree
• Module
  - A subtree or branch
  - An independent subtree that contains no outside MOE’s or MOB’s, and is not a MOB
Cut Set Terms
• Cut Set
  - A set of events that together cause the tree Top UE to occur
• Min CS (MCS)
  - A CS with the minimum number of events that can still cause the top event
• Super Set
  - A CS that contains a MCS plus additional events to cause the top UE
• Critical Path
  - The highest probability CS that drives the top UE probability
[Figure: example fault tree with events A, B, C, D, E, F, G, H]
Cut Sets
Order 1: E; F
Order 2: A, D; B, D; C, D; G, H
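The relationship between cut sets, super sets and minimal cut sets can be sketched in a few lines. The example below uses the cut sets listed on the slide plus one hypothetical super set, {A, D, E}, added only to show it being discarded; the helper names are illustrative.

```python
def minimize(cut_sets):
    """Drop every cut set that strictly contains another one (a + ab = a)."""
    unique = {frozenset(cs) for cs in cut_sets}
    return [cs for cs in unique if not any(other < cs for other in unique)]

def by_order(cut_sets):
    """Group cut sets by order, i.e. the number of events in each set."""
    groups = {}
    for cs in cut_sets:
        groups.setdefault(len(cs), []).append(sorted(cs))
    return {order: sorted(sets) for order, sets in sorted(groups.items())}

# Cut sets from the slide, plus the hypothetical super set {A, D, E}.
candidates = [
    {"A", "D"}, {"B", "D"}, {"C", "D"},
    {"E"}, {"F"}, {"G", "H"},
    {"A", "D", "E"},
]
mcs = minimize(candidates)
print(by_order(mcs))
```

The super set disappears because it already contains the minimal cut sets {E} and {A, D}.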
FTA Construction
Construction Process − Overview
Top Structure
Middle Structure
Bottom Structure
FT Construction
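[Figure: construction pattern. Each event is developed through I,N,S (Immediate, Necessary, Sufficient) causes of type P,S,C (Primary, Secondary, Command); Command causes are developed further until only Primary causes remain]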
FT Construction -- Iterative Process
[Figure: iterative analysis. The Top UE (Step 1, Level 1) is an EFFECT traced to its CAUSEs, Event A and Event B (Step 2, Level 2); each CAUSE becomes the EFFECT at the next level, down to Step 4, Level 4]
Node Construction -- Three Step Process
Step 1
Step 2
Step 2 − Primary, Secondary and Command (PSC)?
• Read the IG event wording
• Ask “what is Immediate, Necessary and Sufficient” to cause event (Step 1)
• Word Gate events in terms of Input or Output
• Consider the type of fault path for each Enabling Event
  - identify each causing event as one of the following path types
    - Primary Fault
    - Secondary Fault
    - Command Fault (Induced Fault, Sequential Fault)
  - structure the sub events and gate logic from the path type
  - any event that is not a BE (component) event is another Enabling Event (Command Path)
Step 3
Example
[Figure: P-S-C development of the top event "ARM Command Occurs" through units D, C, A and B. At each level the unit has a Primary cause ("Fails In ARM Output Mode"), a Command cause ("Receives Inadv Cmd" from the upstream unit) and a Secondary cause ("EMI Causes Cmd"); unit C receives commands from both A and B, and some Primary and Secondary entries are marked [none]]
Construction Example
[Figure: circuit with Battery, Light, and elements A and B. Top event "Light Fails Off" developed into a Primary Failure and a Command Failure]
Construction Example (continued)
[Figure: same circuit (Battery, Light, A, B) with top event "Light Fails Off", showing the State of System]
Node Wording Is Important
FT Data Requirements
• Node name √
• Node text √
• Node type √
• Node failure rate & exposure time √
[Figure: FT node symbol with the node text and node name labeled]
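One way to picture these data requirements is as a small record type. The sketch below is only an illustration: the class and field names are hypothetical, the node types follow the BE/GE/CE/TE abbreviations used in this deck, and the probability uses the exponential model Q = 1 - e^(-λT).

```python
import math
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class NodeType(Enum):
    BASIC = "BE"
    GATE = "GE"
    CONDITION = "CE"
    TRANSFER = "TE"

@dataclass
class Node:
    name: str                               # short identifier used on the tree
    text: str                               # descriptive wording of the event
    node_type: NodeType
    failure_rate: Optional[float] = None    # failures per hour (basic events)
    exposure_time: Optional[float] = None   # hours

    def probability(self) -> float:
        """Q = 1 - exp(-rate * time); only defined when both values are set."""
        if self.failure_rate is None or self.exposure_time is None:
            raise ValueError(f"{self.name}: failure rate and exposure time required")
        return 1.0 - math.exp(-self.failure_rate * self.exposure_time)

# Hypothetical basic event, for illustration only.
bulb = Node("A", "Light bulb fails off", NodeType.BASIC, 1.0e-6, 10.0)
print(bulb.name, bulb.node_type.value, f"{bulb.probability():.3e}")
```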
FTA Mathematics
Basic Reliability Equations
• $R = e^{-\lambda T}$
• $R + Q = 1$
• $Q = 1 - R = 1 - e^{-\lambda T}$
• Approximation
  - When $\lambda T < 0.001$ then $Q \approx \lambda T$
• Where:
  - R = Reliability or probability of success
  - Q = Unreliability or probability of failure
  - λ = component failure rate = 1 / MTBF
  - T = time interval (mission time or exposure time)
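The equations above can be checked numerically. The following is a minimal sketch, with hypothetical function names, that evaluates R and Q for a failure rate of 1.0e-6 per hour and shows how far the approximation Q ≈ λT drifts as λT grows.

```python
import math

def reliability(failure_rate, hours):
    """R = exp(-lambda * T)."""
    return math.exp(-failure_rate * hours)

def unreliability(failure_rate, hours):
    """Q = 1 - exp(-lambda * T); expm1 keeps precision when lambda * T is tiny."""
    return -math.expm1(-failure_rate * hours)

lam = 1.0e-6  # failures per hour, chosen for illustration
for t in (1, 10, 100, 1000, 10000, 100000):
    q = unreliability(lam, t)
    approx = lam * t
    rel_err = (approx - q) / q
    print(f"T={t:>6} h  Q={q:.6e}  lambda*T={approx:.6e}  rel. error={rel_err:.2e}")
```

At λT = 0.001 the approximation overstates Q by roughly 0.05 percent, and the gap widens quickly beyond that point.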
Effects of Failure Rate & Time
• The longer the mission (or exposure time) the higher the probability of failure
Example
The Effect of Exposure Time On Probability
$P_A = 1 - e^{-\lambda_A T_A}$
λA (FPH)       TA (HRS)     PA
1.0 x 10^-6    1            9.99 x 10^-7
1.0 x 10^-6    10           9.99 x 10^-6
1.0 x 10^-6    100          9.99 x 10^-5
1.0 x 10^-6    1000         9.99 x 10^-4
1.0 x 10^-6    10000        9.95 x 10^-3
1.0 x 10^-6    100000       0.095
1.0 x 10^-6    1000000      0.6321
1.0 x 10^-6    10000000     0.99995
[Figure: event A with inputs λA and TA; plot of Probability versus Time (hours), marking T = 1 Hr and T = 1,000 Hrs]
Axioms of Boolean Algebra
[A1] $ab = ba$                                  Commutative Law
[A2] $a + b = b + a$                            Commutative Law
[A3] $(a + b) + c = a + (b + c) = a + b + c$    Associative Law
[A4] $(ab)c = a(bc) = abc$                      Associative Law
[A5] $a(b + c) = ab + ac$                       Distributive Law
Theorems of Boolean Algebra
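[T1]  $a + 0 = a$
[T2]  $a + 1 = 1$
[T3]  $a \cdot 0 = 0$
[T4]  $a \cdot 1 = a$
[T5]  $a \cdot a = a$                Idempotent Law
[T6]  $a + a = a$                    Idempotent Law
[T7]  $a \cdot \bar{a} = 0$
[T8]  $a + \bar{a} = 1$
[T9]  $a + ab = a$                   Law of Absorption
[T10] $a(a + b) = a$                 Law of Absorption
[T11] $a + \bar{a}b = a + b$
where $\bar{a}$ = not a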
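Because each variable takes only the values 0 and 1, the theorems can be confirmed by brute force. The sketch below checks T1 through T11 over every combination of a and b; the helper names OR, AND and NOT are introduced here for illustration.

```python
from itertools import product

def OR(x, y):
    return x | y

def AND(x, y):
    return x & y

def NOT(x):
    return 1 - x

# Each entry returns True when the theorem holds for the given a, b.
theorems = {
    "T1": lambda a, b: OR(a, 0) == a,
    "T2": lambda a, b: OR(a, 1) == 1,
    "T3": lambda a, b: AND(a, 0) == 0,
    "T4": lambda a, b: AND(a, 1) == a,
    "T5": lambda a, b: AND(a, a) == a,
    "T6": lambda a, b: OR(a, a) == a,
    "T7": lambda a, b: AND(a, NOT(a)) == 0,
    "T8": lambda a, b: OR(a, NOT(a)) == 1,
    "T9": lambda a, b: OR(a, AND(a, b)) == a,
    "T10": lambda a, b: AND(a, OR(a, b)) == a,
    "T11": lambda a, b: OR(a, AND(NOT(a), b)) == OR(a, b),
}

results = {
    name: all(check(a, b) for a, b in product((0, 1), repeat=2))
    for name, check in theorems.items()
}
for name, ok in results.items():
    print(name, "holds" if ok else "FAILS")
```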
Probability
Union
For two events A and B, the union is the event {A or B} that
contains all the outcomes in A, in B, or in both A and B.
Probability
Intersection
For two events A and B, the intersection is the event {A and B} that
contains the occurrence of both A and B.
CS Expansion Formula
CS {A; B; C; D}
$$
\begin{aligned}
P ={}& (P_A + P_B + P_C + P_D) \\
     & - (P_{AB} + P_{AC} + P_{AD} + P_{BC} + P_{BD} + P_{CD}) \\
     & + (P_{ABC} + P_{ABD} + P_{ACD} + P_{BCD}) \\
     & - (P_{ABCD})
\end{aligned}
$$
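The alternating sum above is the inclusion-exclusion expansion, and it generalizes to any list of cut sets. The sketch below assumes independent basic events, so the joint probability of several cut sets is the product over the union of their events (a shared event is counted once). The function name and the probabilities are hypothetical.

```python
import math
from itertools import combinations

def top_probability(cut_sets, p):
    """Inclusion-exclusion over cut sets, assuming independent basic events."""
    total = 0.0
    for k in range(1, len(cut_sets) + 1):
        sign = 1.0 if k % 2 else -1.0
        for combo in combinations(cut_sets, k):
            events = frozenset().union(*combo)  # repeated events count once
            total += sign * math.prod(p[e] for e in events)
    return total

# Four single-event cut sets with made-up probabilities.
p = {"A": 0.1, "B": 0.2, "C": 0.3, "D": 0.4}
single = [{"A"}, {"B"}, {"C"}, {"D"}]
print(top_probability(single, p))

# Two cut sets sharing event A: P = P(AB) + P(AC) - P(ABC).
shared = top_probability([{"A", "B"}, {"A", "C"}], {"A": 0.5, "B": 0.5, "C": 0.5})
print(shared)
```

The number of terms doubles with each added cut set, so a full expansion is only practical for short cut set lists.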
FTA Evaluation
FT Evaluation − Purpose
Evaluation Process
• Process
  - generate Cut Sets √
  - apply failure data
  - compute probabilities
  - compute criticality measures
Types
• Qualitative
  - Cut Sets only
• Quantitative
  - Cut Sets and Probability
  - Importance Measures
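A quantitative evaluation can be pictured as a short pipeline: take the minimal cut sets, apply failure data, compute cut set and top probabilities, then rank events. The sketch below is an assumption-laden illustration: the cut sets and probabilities are invented, the top probability uses the rare-event sum, and the importance measure shown is a Fussell-Vesely-style ratio (share of the top probability coming from cut sets that contain the event), which is one common choice rather than the measure this deck prescribes.

```python
import math

# Illustrative inputs only; not data from the slides.
p = {"A": 2.0e-6, "B": 2.0e-6, "C": 2.0e-6}
min_cut_sets = [frozenset({"A"}), frozenset({"B", "C"})]

def cut_set_probability(cs, p):
    """Product of event probabilities, assuming independent events."""
    return math.prod(p[e] for e in cs)

def evaluate(min_cut_sets, p):
    cs_probs = {cs: cut_set_probability(cs, p) for cs in min_cut_sets}
    top = sum(cs_probs.values())  # rare-event approximation
    importance = {
        e: sum(q for cs, q in cs_probs.items() if e in cs) / top
        for e in p
    }
    return cs_probs, top, importance

cs_probs, top, importance = evaluate(min_cut_sets, p)
print(f"top = {top:.6e}")
for event, share in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{event}: {share:.6e}")
```

Here event A dominates because it forms an order 1 cut set, while B and C only matter together.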
Requirements
Cut Set
The Value of Cut Sets
Note: Always check all CS’s against the system design to make sure they are valid and correct.
Qualitative Evaluation
• Non-numerical
• Process
  - Obtain the entire list of Min Cut Sets from the FT
Note: Slightly more subjective than quantitative evaluation.
Value of Qualitative Evaluation
Quantitative Evaluation
Value of Quantitative Evaluation
Basic Evaluation Methods
• Manual
  - possible for small/medium noncomplex trees
• Computer
  - required for large complex trees
  - two approaches
    - analytical
    - simulation
Methods For Finding Min CS
• Boolean reduction
• Bottom up reduction algorithms
  - MICSUP algorithm
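To make the idea of bottom-up reduction concrete, here is a generic sketch that resolves the cut sets of the lowest gates first and combines them upward, applying absorption at each gate. It is not the MICSUP algorithm itself, only an illustration of the general approach; the tree encoding and the gate names TOP, G1 and G2 are hypothetical.

```python
from itertools import product

def minimize(cut_sets):
    """Remove duplicates and any cut set that strictly contains another."""
    unique = {frozenset(cs) for cs in cut_sets}
    return [cs for cs in unique if not any(other < cs for other in unique)]

def cut_sets_of(node, tree):
    """Return minimal cut sets of `node`; names missing from `tree` are basic events."""
    if node not in tree:
        return [frozenset([node])]
    kind, inputs = tree[node]
    child_sets = [cut_sets_of(child, tree) for child in inputs]  # children first
    if kind == "OR":
        combined = [cs for sets in child_sets for cs in sets]
    elif kind == "AND":
        combined = [frozenset().union(*combo) for combo in product(*child_sets)]
    else:
        raise ValueError(f"unsupported gate type: {kind}")
    return minimize(combined)

# Hypothetical tree: TOP = (A or B) and (A or C), where A is a repeated event.
tree = {
    "TOP": ("AND", ["G1", "G2"]),
    "G1": ("OR", ["A", "B"]),
    "G2": ("OR", ["A", "C"]),
}
result = cut_sets_of("TOP", tree)
print(sorted(sorted(cs) for cs in result))
```

The AND gate first yields {A}, {A, C}, {A, B} and {B, C}; absorption then leaves only {A} and {B, C}.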
Evaluation Trouble Makers
• Tree size
• Tree Complexity
  - from redundancy (MOE’s)
  - from large AND/OR combinations
• Exotic gates
• Computer limitations
  - speed
  - memory size
  - software language
Min CS
FTA Pitfalls
Pitfall #1 – FT Design
Poorly Planned FT
Pitfall #3 – AND Gate Overconfidence
Avoid the temptation to truncate the tree at a high level because it appears safe.
Example: "No TFR Fly Up Cmd", 15 FT levels and 5 subsystems in depth.
[Figure: gate for "No TFR Fly Up Cmd" with inputs A, B, C]
Pitfall #10 – Gate Calculation Errors
MOE Error Example 1
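$P_A = P_B = P_C = 2 \times 10^{-6}$
Incorrect: [Figure: top gate $P = 8 \times 10^{-6}$ over two gates, (A, B) and (A, C), each $P = 4 \times 10^{-6}$]
Correct: [Figure: single gate $P = 6 \times 10^{-6}$ with inputs A, B, C]
Cut Sets = A ; B ; C
$P = P_A + P_B + P_C$
$\;\; = (2 \times 10^{-6}) + (2 \times 10^{-6}) + (2 \times 10^{-6})$
$\;\; = 6 \times 10^{-6}$
[Figure: a second tree with two gates, (A, B) and (A, C), each $P = 4 \times 10^{-6}$, contrasted with the Correct tree: top gate $P = 2 \times 10^{-6}$ with input A and a gate $P = 4 \times 10^{-12}$ over B and C]
Cut Sets = A ; B,C
$P = P_A + P_B P_C$
$\;\; = (2 \times 10^{-6}) + (2 \times 10^{-6})(2 \times 10^{-6})$
$\;\; = 2 \times 10^{-6} + 4 \times 10^{-12}$
$\;\; = 2 \times 10^{-6}$ [upper bound]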
MOE Error Example 3
$P_A = P_B = P_C = 2 \times 10^{-6}$
Incorrect but Correct: [Figure: top gate $P = 8 \times 10^{-12}$ over two gates, (A, B) and (A, C), each $P = 4 \times 10^{-12}$]
Correct: $P = 8 \times 10^{-12}$
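The MOE examples can be reproduced by comparing a naive gate-to-gate calculation, which treats the two occurrences of A as independent, against an exact evaluation that enumerates every state of A, B and C. The gate structures below are a reading of the figures, inferred from the listed cut sets, and should be treated as assumptions; the naive OR uses 1 - (1 - p)(1 - q) rather than the simple addition shown on the slides, so its values differ slightly in the trailing digits.

```python
import math
from itertools import product

P = {"A": 2e-6, "B": 2e-6, "C": 2e-6}

def p_or(*ps):
    return 1.0 - math.prod(1.0 - p for p in ps)

def p_and(*ps):
    return math.prod(ps)

def exact(top, P):
    """Sum the probability of every basic event state in which top(state) is true."""
    names = sorted(P)
    total = 0.0
    for bits in product((False, True), repeat=len(names)):
        state = dict(zip(names, bits))
        if top(state):
            total += math.prod(P[n] if state[n] else 1.0 - P[n] for n in names)
    return total

# (naive gate-to-gate formula, Boolean top function) per assumed structure.
examples = {
    1: (lambda P: p_or(p_or(P["A"], P["B"]), p_or(P["A"], P["C"])),
        lambda s: (s["A"] or s["B"]) or (s["A"] or s["C"])),
    2: (lambda P: p_and(p_or(P["A"], P["B"]), p_or(P["A"], P["C"])),
        lambda s: (s["A"] or s["B"]) and (s["A"] or s["C"])),
    3: (lambda P: p_or(p_and(P["A"], P["B"]), p_and(P["A"], P["C"])),
        lambda s: (s["A"] and s["B"]) or (s["A"] and s["C"])),
}

results = {k: (naive(P), exact(top, P)) for k, (naive, top) in examples.items()}
for k, (naive_p, exact_p) in results.items():
    print(f"Example {k}: gate-to-gate = {naive_p:.3e}, exact = {exact_p:.3e}")
```

Under these assumptions the gate-to-gate result overstates the first case, understates the second by several orders of magnitude, and happens to agree in the third.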
Rule #1
Rule #2
Rule #3
Rule #3 -- Establish Your FTA Ground Rules
• Define and document assumptions
• Scope the problem
  - size, level of analysis, level of detail
Rule #4
Rule #5
Rule #5 -- Know Your System
• Know the system design and operation
• Know the interfaces between subsystems
• Utilize all sources of design information
  - drawings, procedures, block diagrams, flow diagrams, FMEA’s
  - stress analyses, failure reports, maintenance procedures
Rule #7
Rule #7 (continued)
[Figure: Method 1 – Structured. System schematic (Battery, Computer A, ARM 1 Signal, ARM 2 Signal, WH) and FT with top event "Warhead Inadv Armed", input events "WH Fails Armed" and "WH Receives Arm Cmd", and lower event "ARM 2 Closed"]
• www.system-safety.org
• www.FaultTree.net or www.fault-tree.net
• www.aot.com