Reliability Analysis For Repairable System
Life Data Analysis
Recurring Data Analysis
System Reliability Simulation
Course Objectives
• To provide a solid foundation in the methods and analyses
related to repairable systems for the asset management
professional.
Introduction
Part 1. Life Data Analysis
Part 2. Recurring Data Analysis
Part 3. Simulation approach for Repairable System
Introduction
Qualitative Quantitative
Qualitative Analysis
Introduction
Statistical Background
Distribution Models
Distribution Parameter Estimation
Censored Data
Confidence Bounds
Life Distributions
Summary
Life Data Analysis (What)
• Manufacturer
  • Customer expectations and satisfaction
  • Competition (market share, pricing)
  • Warranty cost
  • Demonstration of product performance
• Maintenance Organization (Asset Owner)
  • Spare management
  • Optimum replacement interval
  • Quantify failure rate behaviors and downtime
    distributions for RAM analysis
Life Data Analysis (Why)
• Data
  • Time-to-failure data
  • Understand the data type
  • The most time-consuming task of the analysis process
• Software tools
  • Weibull Toolbox (AssetStudio)
  • Facilitate data classification and data entry
  • Visualize the results
  • Focus on your engineering problems
  • Other commercial software is also available
  • Free statistical tools: R, Python (coding required)
Life Data Analysis (How)
• Know-how
  • Data classification and validation
  • Understand the underlying statistical concepts
  • Understand when it is not appropriate to use LDA and
    what the alternatives are
  • Interpret results
Statistical Background
Random Variable
PDF, CDF
Reliability, Unreliability
Failure Rate
Conditional Reliability
Mean and Median
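The quantities listed above are linked by a few standard definitions. The block below is a reference sketch in conventional textbook notation; the symbols f, F, R and λ(t), and the use of T for the current age in the conditional reliability, are assumptions rather than the course's own notation.

```latex
\begin{align}
F(t) &= P(\text{TTF} \le t) = \int_0^{t} f(s)\,ds
      && \text{(cdf, i.e. unreliability)} \\
R(t) &= 1 - F(t)
      && \text{(reliability)} \\
\lambda(t) &= \frac{f(t)}{R(t)}
      && \text{(failure rate)} \\
R(t \mid T) &= \frac{R(T + t)}{R(T)}
      && \text{(conditional reliability after surviving to age } T\text{)} \\
\text{Mean} &= \int_0^{\infty} t\, f(t)\,dt = \int_0^{\infty} R(t)\,dt \\
F(t_{\text{median}}) &= 0.5
\end{align}
```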
Random Variables
• For life data analysis, we focus on the Time-to-Failure (TTF)
distribution. The unit of the data set is time, and the random
variable is time-to-failure.
• Suppose you want to analyse the weight distribution of boys in
your school and collect a sample of weights. In that case, weight
is the random variable.
• Our random variable (TTF) takes positive real values.
• TTF is therefore classified as a Continuous Random Variable.
Discrete Random Variable
902 234 489 511 748 443 567 353 494 1170
1130 252 175 241 591 366 484 262 521 644
843 632 184 494 322 774 587 896 310 683
642 291 871 574 233 543 809 425 265 949
177 717 699 372 742 484 618 715 576 1020
577 490 360 394 745 341 649 922 453 1002
539 436 456 183 635 500 379 207 551 757
715 913 592 620 336 348 247 422 872 837
245 595 656 987 549 594 534 280 727 395
212 401 965 359 316 356 499 638 726 429
… … … … … … … … … …
… … … … … … … … … …
Probability Density Function (PDF)
[Figure: histogram of time-to-failure counts in 200-unit bins (0-200 up to 1400-1600); bar labels on the slide: 258, 222, 193, 67, 25, 5, 1; horizontal axis: time t]
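To make the link between raw data and a pdf concrete, the sketch below bins the ten rows of sample TTF values listed above into 200-unit intervals and scales the counts so that the bar areas sum to 1. It uses only the visible rows (the slide's table continues beyond them), and the function name `empirical_pdf` is a hypothetical helper.

```python
# Sample TTF values copied from the ten visible rows of the table.
ttf = [
    902, 234, 489, 511, 748, 443, 567, 353, 494, 1170,
    1130, 252, 175, 241, 591, 366, 484, 262, 521, 644,
    843, 632, 184, 494, 322, 774, 587, 896, 310, 683,
    642, 291, 871, 574, 233, 543, 809, 425, 265, 949,
    177, 717, 699, 372, 742, 484, 618, 715, 576, 1020,
    577, 490, 360, 394, 745, 341, 649, 922, 453, 1002,
    539, 436, 456, 183, 635, 500, 379, 207, 551, 757,
    715, 913, 592, 620, 336, 348, 247, 422, 872, 837,
    245, 595, 656, 987, 549, 594, 534, 280, 727, 395,
    212, 401, 965, 359, 316, 356, 499, 638, 726, 429,
]
BIN_WIDTH = 200  # same bin width as the histogram axis


def empirical_pdf(data, bin_width):
    """Return (counts, density) per bin; density integrates to 1."""
    n_bins = int(max(data) // bin_width) + 1
    counts = [0] * n_bins
    for t in data:
        counts[int(t // bin_width)] += 1
    density = [c / (len(data) * bin_width) for c in counts]
    return counts, density


counts, density = empirical_pdf(ttf, BIN_WIDTH)
for i, (c, d) in enumerate(zip(counts, density)):
    lo, hi = i * BIN_WIDTH, (i + 1) * BIN_WIDTH
    print(f"{lo:>4}-{hi:<4}  count={c:<3}  density={d:.5f}")
```

Dividing each count by (sample size × bin width) is what turns a frequency histogram into an estimate of f(t).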
Mean Life
Distribution models
• Lognormal (μ’,σ’)
• Exponential (λ)
• Weibull (β,η)
Distribution Parameter Estimation
• Objective:
• Solve for the values that best describe the distribution of the
observed data.
Rank Regression
• Associate each observed data point with a probability value
(probability of failure), z.
j   tj   zj
1   15   z1
2   20   z2
3   30   z3
4   40   z4
where
j is the rank position,
tj is the Time-to-Failure (TTF), and
zj is the rank value (probability of failure)
j tj zj
1 15 0.159
2 20 0.386
3 30 0.614
4 40 0.841
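[Figure: zj plotted against tj on linear axes (tj from 0 to 50, zj from 0 to 1)]
Probability-Weibull plot
• On Weibull probability paper the cdf becomes a straight line, $Y = m x + c$, with
$Y = \ln\ln\frac{1}{1 - F(t)}$, $x = \ln t$, $m = \beta$, $c = -\beta \ln \eta$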
Probability-Weibull plot
[Figure: the four (tj, zj) points on Weibull probability paper; time axis from 10 to 100 (log scale) with the data at 15, 20, 30 and 40; unreliability axis marked at 0.10, 0.50 and 0.90]
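Probability-Weibull plot, β value
• β is the slope of the fitted line. Reading two points off the line, F = 0.32 at t = 20 and F = 0.86 at t = 40:
$\Delta Y = \ln\ln\frac{1}{1 - 0.86} - \ln\ln\frac{1}{1 - 0.32} = 1.629$
$\Delta X = \ln 40 - \ln 20 = 0.693$
$\beta = \Delta Y / \Delta X \approx 2.35$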
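Probability-Weibull plot, η value
• One could determine the Y-intercept and work out η.
• An easier way is to evaluate Q(η) from the cdf:
$Q(\eta) = 1 - e^{-(\eta/\eta)^{\beta}} = 1 - e^{-1} = 0.632$
so η is the time at which the fitted line crosses 63.2% unreliability.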
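The graphical procedure above can also be done numerically. The following is a minimal sketch, assuming Benard's approximation for the median ranks and ordinary least squares on Y (an "RRY" fit); the function names are hypothetical, and a graphical reading of the slope will differ slightly from a least-squares fit.

```python
import math


def benard_median_ranks(n):
    """Approximate median ranks z_j for rank positions j = 1..n."""
    return [(j - 0.3) / (n + 0.4) for j in range(1, n + 1)]


def weibull_rank_regression(times):
    """Fit Y = beta*x - beta*ln(eta) by least squares on Y.

    x = ln(t), Y = ln(-ln(1 - z)); returns (beta, eta).
    """
    t_sorted = sorted(times)
    z = benard_median_ranks(len(t_sorted))
    x = [math.log(t) for t in t_sorted]
    y = [math.log(-math.log(1.0 - zj)) for zj in z]
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    beta = sxy / sxx                      # slope of the fitted line
    intercept = y_bar - beta * x_bar      # equals -beta * ln(eta)
    eta = math.exp(-intercept / beta)
    return beta, eta


ranks = benard_median_ranks(4)
beta, eta = weibull_rank_regression([15, 20, 30, 40])
print("median ranks:", [round(z, 3) for z in ranks])
print(f"beta = {beta:.2f}, eta = {eta:.1f}")
```

For N = 4, Benard's approximation reproduces the tabulated ranks 0.159, 0.386, 0.614 and 0.841 to three decimals.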
Parameter Estimation
• Similarly, you could choose to fit other distributions with the
same data set.
[Figure: the same data set plotted on Weibull, Exponential, Normal and Lognormal probability paper]
Least Squares Regression
[Figure: two Weibull probability plots of Unreliability, F(t) against Time, (t); left panel RRX with ρ = 0.991, right panel RRY with ρ = 0.954]
Parameter Estimation: MLE
• Objective:
• Solve for the values that best describe the distribution of the
observed data.
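MLE Concept
• The likelihood of the observed data $t_1, \dots, t_N$ for a pdf f with parameters θ is
$L(\theta) = \prod_{i=1}^{N} f(t_i; \theta)$
• Solve $\partial L / \partial \theta = 0$, or
MLE solution
• solve $\partial \ln L / \partial \theta = 0$, where $\ln L = \sum_{i=1}^{N} \ln f(t_i; \theta)$.
• The derivatives of ln L are much easier to obtain than those of L.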
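As an illustration of the MLE idea, the sketch below fits a 2-parameter Weibull to complete (uncensored) data. Setting the derivatives of ln L to zero gives one equation in β alone, solved here by bisection, after which η follows in closed form. The function name, the bracketing interval and the reuse of the four rank-regression data points are assumptions made for the example.

```python
import math


def weibull_mle(times, lo=0.05, hi=50.0, tol=1e-10):
    """Maximum likelihood estimates (beta, eta) for complete data."""
    n = len(times)
    mean_log = sum(math.log(t) for t in times) / n

    def score(beta):
        # d(ln L)/d(beta) = 0 after eliminating eta, rearranged so that
        # the expression is increasing in beta.
        s0 = sum(t ** beta for t in times)
        s1 = sum(t ** beta * math.log(t) for t in times)
        return s1 / s0 - 1.0 / beta - mean_log

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    beta = 0.5 * (lo + hi)
    eta = (sum(t ** beta for t in times) / n) ** (1.0 / beta)
    return beta, eta


beta_mle, eta_mle = weibull_mle([15, 20, 30, 40])
print(f"beta = {beta_mle:.2f}, eta = {eta_mle:.1f}")
```

With only four points, the MLE shape parameter comes out noticeably higher than the rank regression estimate, which is a known small-sample tendency of MLE.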
Data Type for Distribution analysis
• Exact Time-to-Failure
TTF
• Interval Censored
Failed interval
• Left Censored
Failed interval
Data Type
• A Complete Data Set refers to a data set that contains only
exact time-to-failure data.
Censored Data
• If you consider the 90 suspensions data…
Number  State  Time
1       F      35
1       F      43
1       F      59
1       F      60
1       F      103
1       F      117
1       F      125
1       F      126
1       F      135
1       F      148
90      S      150
(F = failure, S = suspension)
Parameter Estimation with Censored Data
• Using rank regression analysis, the rank position of each
failure has to be adjusted to accommodate suspension data
(Leonard Johnson’s approach).
• In the case of MLE, the complete likelihood function also
accounts for the censoring times: each failure contributes f(t)
and each suspension contributes R(t) at its censoring time.
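The rank adjustment can be sketched as follows, assuming the usual form of Johnson's formula: adjusted rank = (reverse rank × previous adjusted rank + N + 1) / (reverse rank + 1), followed by Benard's approximation for the median rank. The small mixed data set is hypothetical and chosen only to show suspensions falling between failures; the second call uses the 10 failures and 90 suspensions from the censored data table.

```python
def johnson_adjusted_ranks(events):
    """events: list of (time, state), state 'F' (failure) or 'S' (suspension).

    Returns a list of (time, adjusted_rank, median_rank) for the failures.
    """
    n = len(events)
    ordered = sorted(events, key=lambda e: e[0])
    previous_rank = 0.0
    result = []
    for position, (time, state) in enumerate(ordered):
        if state != "F":
            continue  # suspensions only influence later failures
        reverse_rank = n - position  # units still at risk, this one included
        adjusted = (reverse_rank * previous_rank + n + 1) / (reverse_rank + 1)
        median_rank = (adjusted - 0.3) / (n + 0.4)  # Benard's approximation
        result.append((time, adjusted, median_rank))
        previous_rank = adjusted
    return result


# Hypothetical example with suspensions mixed in between failures.
mixed = [(10, "S"), (30, "F"), (45, "S"), (49, "F"),
         (82, "F"), (90, "F"), (96, "F"), (100, "S")]
mixed_ranks = johnson_adjusted_ranks(mixed)

# Data from the slide: 10 failures, then 90 units suspended at 150.
slide = [(t, "F") for t in (35, 43, 59, 60, 103, 117, 125, 126, 135, 148)]
slide += [(150, "S")] * 90
slide_ranks = johnson_adjusted_ranks(slide)

for time, adjusted, z in mixed_ranks:
    print(f"t={time:>3}  adjusted rank={adjusted:.4f}  median rank={z:.3f}")
```

Because all 90 suspensions on the slide occur after the last failure, the adjusted ranks there stay at 1 to 10; what changes is the sample size N = 100, which pushes every median rank far down the probability axis.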
Confidence Bounds
• Assuming you received the following data from supplier A and B respectively:
Supplier A: 25, 35, 40, 53, 60
Supplier B: 18, 19, 26, 26, 31, 33, 33, 34, 39, 40, 42, 45, 47, 48, 55, 58, 58, 59, 67, 74
• Both suppliers tell you that their products have a similar B(5) life (B(5) = 17.9 Hrs and B(5) = 17.7 Hrs on the two Weibull probability plots). Which would you prefer?
[Figure: two Probability - Weibull plots of Unreliability, F(t) against Time, (t), annotated with B(5) = 17.9 Hrs and B(5) = 17.7 Hrs]
Confidence Bounds
B(5) life @ 90% Confidence level
[Figure: probability density function of the B(5) estimate with a 10% tail marked, and Probability - Weibull plots of Unreliability, F(t) against Time, (t) with confidence bounds; the values 10.6 and 13.5 are marked on the time axis]
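One way to see why sample size matters is to put a lower confidence bound on each B(5) estimate. The sketch below uses a parametric bootstrap around a rank regression fit, which is a swapped-in technique and not necessarily how the bounds on the slides were computed (Fisher matrix or likelihood ratio bounds are common alternatives). The assignment of values to suppliers follows the two data columns as read from the slide, and all function names are hypothetical.

```python
import math
import random


def fit_weibull(times):
    """Compact rank regression on Y with Benard median ranks."""
    t = sorted(times)
    n = len(t)
    x = [math.log(v) for v in t]
    y = [math.log(-math.log(1.0 - (j - 0.3) / (n + 0.4)))
         for j in range(1, n + 1)]
    xb, yb = sum(x) / n, sum(y) / n
    beta = (sum((a - xb) * (b - yb) for a, b in zip(x, y))
            / sum((a - xb) ** 2 for a in x))
    eta = math.exp(-(yb - beta * xb) / beta)
    return beta, eta


def b_life(beta, eta, p=0.05):
    """Time by which a fraction p of the population has failed."""
    return eta * (-math.log(1.0 - p)) ** (1.0 / beta)


def b5_with_lower_bound(times, confidence=0.90, n_boot=2000, seed=1):
    """Return (B5 point estimate, one-sided lower bound)."""
    rng = random.Random(seed)
    beta, eta = fit_weibull(times)
    estimates = []
    for _ in range(n_boot):
        # Simulate a sample of the same size from the fitted model.
        sample = [rng.weibullvariate(eta, beta) for _ in times]
        estimates.append(b_life(*fit_weibull(sample)))
    estimates.sort()
    return b_life(beta, eta), estimates[int((1.0 - confidence) * n_boot)]


supplier_a = [25, 35, 40, 53, 60]
supplier_b = [18, 19, 26, 26, 31, 33, 33, 34, 39, 40,
              42, 45, 47, 48, 55, 58, 58, 59, 67, 74]
b5_a, lower_a = b5_with_lower_bound(supplier_a)
b5_b, lower_b = b5_with_lower_bound(supplier_b)
print(f"Supplier A: B5 = {b5_a:.1f}, 90% lower bound = {lower_a:.1f}")
print(f"Supplier B: B5 = {b5_b:.1f}, 90% lower bound = {lower_b:.1f}")
```

Two similar point estimates can carry very different uncertainty; the supplier with more data would normally be expected to show the tighter bound.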
Life Distributions
Exponential Distribution
Weibull Distribution
Normal Distribution
Lognormal Distribution
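Exponential Distribution
• The exponential pdf is given by:
$f(t) = \lambda e^{-\lambda t}$
where:
• $\lambda$ = failure rate
• $m$ = mean time to failure ($m = 1/\lambda$)
• Unreliability function
$Q(t) = 1 - e^{-\lambda t}$
• Reliability function
$R(t) = e^{-\lambda t}$
Exponential Distribution
• Recall failure rate function,
$\lambda(t) = \frac{f(t)}{R(t)}$
• Substitute
$\lambda(t) = \frac{\lambda e^{-\lambda t}}{e^{-\lambda t}} = \lambda$
i.e. the failure rate of the exponential distribution is constant.
Weibull Distribution
• The Weibull pdf is given by:
$f(t) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta - 1} e^{-(t/\eta)^{\beta}}$
where:
• $\beta$ is the slope (shape) parameter
• $\eta$ is the scale parameter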
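Weibull Distribution
• Reliability Function
$R(t) = e^{-(t/\eta)^{\beta}}$
• Unreliability Function
$Q(t) = 1 - e^{-(t/\eta)^{\beta}}$
• Failure Rate
$\lambda(t) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta - 1}$
• Mean
$\bar{T} = \eta\,\Gamma\!\left(1 + \frac{1}{\beta}\right)$
• Median
$t_{median} = \eta\,(\ln 2)^{1/\beta}$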
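The Weibull quantities listed above are straightforward to evaluate. The sketch below uses the standard 2-parameter Weibull expressions; the parameter values β = 2.0 and η = 30 are hypothetical and serve only as an illustration.

```python
import math


def weibull_reliability(t, beta, eta):
    return math.exp(-((t / eta) ** beta))


def weibull_unreliability(t, beta, eta):
    return 1.0 - weibull_reliability(t, beta, eta)


def weibull_failure_rate(t, beta, eta):
    return (beta / eta) * (t / eta) ** (beta - 1.0)


def weibull_mean(beta, eta):
    return eta * math.gamma(1.0 + 1.0 / beta)


def weibull_median(beta, eta):
    return eta * math.log(2.0) ** (1.0 / beta)


beta, eta = 2.0, 30.0  # hypothetical shape and scale
for t in (10.0, 30.0, 60.0):
    print(f"t={t:>5}: R={weibull_reliability(t, beta, eta):.3f}  "
          f"failure rate={weibull_failure_rate(t, beta, eta):.4f}")
print(f"mean = {weibull_mean(beta, eta):.2f}, "
      f"median = {weibull_median(beta, eta):.2f}")
```

With β > 1 the failure rate increases with age (wear-out), with β = 1 it is constant (the exponential case), and with β < 1 it decreases (early-life failures).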
Weibull Distribution
[Figures: example Weibull probability plots and Weibull function curves plotted against time for different parameter values]
Normal Distribution
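• The normal pdf is given by:
$f(t) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{t - \mu}{\sigma}\right)^{2}}$
where
• $\mu$ is the mean
• $\sigma$ is the standard deviation
Normal Distribution
• Reliability Function
$R(t) = 1 - \Phi\!\left(\frac{t - \mu}{\sigma}\right)$, where $\Phi$ is the standard normal cdf
• Unreliability Function
$Q(t) = \Phi\!\left(\frac{t - \mu}{\sigma}\right)$
• Failure Rate
$\lambda(t) = \frac{f(t)}{R(t)}$
• Mean
$\bar{T} = \mu$
• Median
$t_{median} = \mu$
Normal Distribution
• The normal distribution is useful in statistics because of the
central limit theorem:
• The averages of samples of observations of random variables
become normally distributed when the number of observations is
sufficiently large.
[Figure: normal pdf, f(t) against Time, (t)]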
Lognormal Distribution
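• The lognormal pdf is given by:
$f(t) = \frac{1}{t\,\sigma'\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\ln t - \mu'}{\sigma'}\right)^{2}}$
where
• $\mu'$ is the mean of the natural logarithms of t
• $\sigma'$ is the standard deviation of the natural logarithms of t
Lognormal Distribution
• Reliability Function
$R(t) = 1 - \Phi\!\left(\frac{\ln t - \mu'}{\sigma'}\right)$
• Failure Rate
$\lambda(t) = \frac{f(t)}{R(t)}$
• Mean
$\bar{T} = e^{\mu' + \sigma'^{2}/2}$
• Median
$t_{median} = e^{\mu'}$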
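A short sketch of the lognormal quantities, using the standard library's `statistics.NormalDist` for the standard normal cdf. The log-mean of 1 and log-std of 0.5 are hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist


def lognormal_reliability(t, mu_log, sigma_log):
    """R(t) = 1 - Phi((ln t - mu') / sigma')."""
    return 1.0 - NormalDist(mu_log, sigma_log).cdf(math.log(t))


def lognormal_mean(mu_log, sigma_log):
    return math.exp(mu_log + sigma_log ** 2 / 2.0)


def lognormal_median(mu_log, sigma_log):
    return math.exp(mu_log)


mu_log, sigma_log = 1.0, 0.5  # hypothetical log-mean and log-std
print(f"median = {lognormal_median(mu_log, sigma_log):.3f}")
print(f"mean   = {lognormal_mean(mu_log, sigma_log):.3f}")
print(f"R(2.0) = {lognormal_reliability(2.0, mu_log, sigma_log):.3f}")
```

The mean always exceeds the median for a lognormal distribution, and the gap widens as the log-std grows, which is why it suits right-skewed data such as repair times.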
Lognormal Distribution
[Figure: lognormal pdf, f(t) against Time, (t)]
Lognormal Distribution
• Pdf with varying Log-std, σ'.
• Log-mean, μ' = 1
[Figure: Probability Density Function, f(t) against Time, (t), one curve per value of σ']
Summary
Introduction
NHPP with Power Law
Optimum overhaul
Summary
Introduction
• Life Data Analysis deals with (a population of) units that
each experience only one failure. Each sample has one
observed value, either its age at "failure" or its current age
while "non-failed".
Pump 1
Pump 2
Pump 3
Common Mistake
Time-To-Failure Comment
281 Pump1
190 Pump2
252 Pump3
Correct analysis
• Beta = 4.84, and Eta = 261 days
Recurring Data Analysis
• When the time to first failure follows the Weibull distribution, each
succeeding failure is governed by the Power Law model in the
case of minimal repair.
NHPP/Power Law
$P[N(t) = n] = \frac{[\Lambda(t)]^{n}\, e^{-\Lambda(t)}}{n!}$
$\Lambda(t) = \lambda t^{\beta}$, $u(t) = \lambda \beta t^{\beta - 1}$
where:
• P[N(t)=n] is the probability that n failures will be observed by time t.
• Λ(t) is the cumulative number of failures (Mean Value Function).
• u(t) is the Failure Intensity Function (Rate of Occurrence of Failures).
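NHPP/Power Law
$\hat{\lambda} = \frac{\sum_{q=1}^{k} N_q}{\sum_{q=1}^{k}\left(T_q^{\hat{\beta}} - S_q^{\hat{\beta}}\right)}$
$\hat{\beta} = \frac{\sum_{q=1}^{k} N_q}{\hat{\lambda}\sum_{q=1}^{k}\left(T_q^{\hat{\beta}}\ln T_q - S_q^{\hat{\beta}}\ln S_q\right) - \sum_{q=1}^{k}\sum_{i=1}^{N_q}\ln X_{iq}}$
where k is the number of systems, system q is observed from time $S_q$ to $T_q$ with $N_q$ failures, and $X_{iq}$ is the time of its i-th failure (the term $S_q^{\hat{\beta}}\ln S_q$ is taken as 0 when $S_q = 0$).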
NHPP/Power Law parameters
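A minimal sketch of the Power Law maximum likelihood estimates for several systems that all start at time 0. The input uses the three pump histories converted to cumulative time (Pump 1 failures at 281 and 800 days, observed to 1095 days; Pump 2 at 450 days, observed to 730; Pump 3 at 350 days, observed to 730). That conversion is an assumption based on the interval data shown in the deck, and the function name is hypothetical.

```python
import math


def power_law_mle(systems, lo=0.05, hi=20.0, tol=1e-10):
    """systems: list of (end_time, [cumulative failure times]), start time 0.

    Returns (lam, beta) for the mean value function lam * t**beta.
    """
    n_total = sum(len(failures) for _, failures in systems)
    sum_log_x = sum(math.log(x) for _, failures in systems for x in failures)

    def score(beta):
        # Likelihood equation for beta after substituting the estimate of
        # lam; the expression is decreasing in beta.
        s0 = sum(end ** beta for end, _ in systems)
        s1 = sum(end ** beta * math.log(end) for end, _ in systems)
        return n_total / beta + sum_log_x - n_total * s1 / s0

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    lam = n_total / sum(end ** beta for end, _ in systems)
    return lam, beta


pumps = [
    (1095.0, [281.0, 800.0]),  # Pump 1 (assumed cumulative times, days)
    (730.0, [450.0]),          # Pump 2
    (730.0, [350.0]),          # Pump 3
]
lam, beta = power_law_mle(pumps)
expected_total = sum(lam * end ** beta for end, _ in pumps)
print(f"lambda = {lam:.3e}, beta = {beta:.2f}")
print(f"expected failures over the observed periods = {expected_total:.2f}")
```

A β above 1 indicates a deteriorating system (failures arriving faster with age), which is the situation in which an overhaul can pay off.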
Optimum Overhaul (Economical Life)
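• Let overhaul time = TOH, the system cost per unit time
(CPUT) is
$CPUT(T_{OH}) = \frac{C_r\,\lambda\,T_{OH}^{\beta} + C_{OH}}{T_{OH}}$
Optimum Overhaul (Economical Life)
where $C_r$ is the average repair cost, $C_{OH}$ is the overhaul cost and $\lambda T_{OH}^{\beta}$ is the expected number of failures up to $T_{OH}$.
• Solving $\frac{\partial\, CPUT}{\partial T_{OH}} = 0$ gives, for $\beta > 1$,
$T_{OH}^{*} = \left[\frac{C_{OH}}{\lambda\,(\beta - 1)\,C_r}\right]^{1/\beta}$
Optimum Overhaul (Economical Life)
• Assuming the average repair cost and overhaul cost for the
pump are $10,000 and $50,000 respectively,
$1.48 \times 10^{-5}$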
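The trade-off can be sketched numerically, assuming the cost model CPUT(T) = (repair cost × λT^β + overhaul cost) / T, whose minimum for β > 1 is at T = [overhaul cost / (λ(β − 1) × repair cost)]^(1/β). The repair and overhaul costs are the ones quoted above; the λ and β values are hypothetical placeholders, not the course's fitted parameters.

```python
def cost_per_unit_time(t_oh, lam, beta, repair_cost, overhaul_cost):
    """Expected repair cost up to t_oh plus one overhaul, per unit time."""
    return (repair_cost * lam * t_oh ** beta + overhaul_cost) / t_oh


def optimum_overhaul_time(lam, beta, repair_cost, overhaul_cost):
    """Closed-form minimiser of the cost per unit time (needs beta > 1)."""
    if beta <= 1.0:
        raise ValueError("no finite optimum: the system is not deteriorating")
    return (overhaul_cost / (lam * (beta - 1.0) * repair_cost)) ** (1.0 / beta)


LAM, BETA = 1.0e-4, 1.4                      # hypothetical Power Law parameters
REPAIR_COST, OVERHAUL_COST = 10_000.0, 50_000.0

t_star = optimum_overhaul_time(LAM, BETA, REPAIR_COST, OVERHAUL_COST)
c_star = cost_per_unit_time(t_star, LAM, BETA, REPAIR_COST, OVERHAUL_COST)
print(f"optimum overhaul time = {t_star:.0f} time units")
print(f"cost per unit time at the optimum = {c_star:.2f}")
```

When β ≤ 1 the failure intensity does not grow with age, so overhauling never lowers the cost rate; the function raises an error in that case instead of returning a meaningless number.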
Summary (RDA)
[Diagram labels: Failure Behaviors, System Modeling, Maintenance Policies, Availability, Cost of operating the system, Optimum Overhaul Time]
Introduction: Why RAM?
• Identify the gaps for improvement.
  • Quantify the performance of assets
  • Quantify the production impact due to asset unreliability
• Improvement program
  • What-if analysis
  • Tracking of improvement program (because you can quantify it)
• Anticipate events
  • Resource planning (e.g. spare ordering)
Introduction: How to perform RAM?
• Historical data
  • Equipment status log
  • Work Order data
System PDF
System Level Reliability, RS
• Assumptions:
• The computer is a non-repairable item.
• System equation:
Parallel construct
• Combination of series
and parallel construct.
• System equation:
$R_A = R_1 \cdot R_2$
$R_B = R_3 \cdot R_4$
$R_S = 1 - (1 - R_A) \cdot (1 - R_B)$
K-out-of-N construct
• System equation:
Assuming R1 = R2 = … = RN = R,
$R_S = \sum_{i=K}^{N} \binom{N}{i} R^{i} (1 - R)^{N - i}$
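The series, parallel and K-out-of-N constructs can be sketched as small functions, assuming independent, non-repairable blocks. The function names and the reliability value 0.9 are hypothetical.

```python
from math import comb, prod


def series(reliabilities):
    """All blocks must work."""
    return prod(reliabilities)


def parallel(reliabilities):
    """At least one block must work."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


def k_out_of_n(k, n, r):
    """At least k of n identical blocks, each with reliability r, must work."""
    return sum(comb(n, i) * r ** i * (1.0 - r) ** (n - i)
               for i in range(k, n + 1))


r = 0.9  # hypothetical block reliability
branch = series([r, r])                   # two blocks in series
system = parallel([branch, branch])       # two such branches in parallel
print(f"series of two       : {branch:.4f}")
print(f"series-parallel     : {system:.4f}")
print(f"2-out-of-3 construct: {k_out_of_n(2, 3, r):.4f}")
```

Series and parallel are the two extremes of the K-out-of-N construct: K = N gives a series system and K = 1 gives a parallel one.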
Complex Configuration
Availability Efficiency
• System Max Flowrate, FRmax = 1 unit/hour
• Av = uptime / sim_time
• Max Productions, Pmax = sim_time x FRmax
• Efficiency, η = actual production / Pmax
Simple Parallel example 1
Repairable Asset
Production Network
Asset Reliability Performance
Consider the following scenarios:
• Scenario 1
A repairable asset fails due to its components. How can we describe
the asset performance in a way that is meaningful for maintenance
planning?
• Scenario 2
A production system is made of equipment (assets) whose MTBFs
are known. What is the impact on production?
Repairable Asset
[Figure: LSB failure timelines for Pump 1, Pump 2 and Pump 3; Pump 1's second interval is 800 - 281 = 519 days, Pump 2's LSB fails at 450 days and Pump 3's LSB at 350 days]
Line Shaft-Bearing (LSB) Failure Distribution
Observed LSB events
TTE/day  Status  ID
281      F       Pump 1
519      F       Pump 1
295      S       Pump 1
450      F       Pump 2
280      S       Pump 2
350      F       Pump 3
380      S       Pump 3
Life Data Analysis on LSB
Converting Recurring Data to IID data
Distributions (day)
s/n  Component            Abbr.  Model        Param 1   Param 2
1    Line Shaft Bearing   LSB    Weibull      5.83      452
2    Arm & Seal Assembly  ASA    Weibull      4.10      778
3    Rotor                RTR    Weibull      4.44      880
4    Impeller             IPL    Weibull      6.47      887
5    Suction Bell Vanes   SBV    Weibull      2.25      810
6    Shaft Seal           SSL    Weibull      1.71      912
7    Switch Assembly      SWA    Exponential  2.56E+03
Reliability Performance of Repairable
Asset
• The reliability information is assigned to the
corresponding components.
• Assuming every failure takes 10 hours to fix (to
replace the faulty component with a new one).
Reliability Performance of Repairable
Asset
• Run a 730-day simulation with 1000 executions
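The kind of simulation described here can be approximated with a few lines of Monte Carlo. This is a simplified sketch and not the course software's algorithm. It assumes that, for the Weibull rows, Param 1 is β and Param 2 is η in days, that the exponential parameter is a mean life in days, that the components are in series, and that each component is renewed independently with a 10-hour replacement.

```python
import random

# (abbr., model, param 1, param 2) taken from the component table.
COMPONENTS = [
    ("LSB", "weibull", 5.83, 452.0),
    ("ASA", "weibull", 4.10, 778.0),
    ("RTR", "weibull", 4.44, 880.0),
    ("IPL", "weibull", 6.47, 887.0),
    ("SBV", "weibull", 2.25, 810.0),
    ("SSL", "weibull", 1.71, 912.0),
    ("SWA", "exponential", 2560.0, None),
]
HORIZON_DAYS = 730.0
REPAIR_DAYS = 10.0 / 24.0  # every failure takes 10 hours to fix
N_RUNS = 1000


def draw_life(rng, model, p1, p2):
    """Sample one time-to-failure in days."""
    if model == "weibull":
        return rng.weibullvariate(p2, p1)  # arguments: scale eta, shape beta
    return rng.expovariate(1.0 / p1)       # p1 assumed to be the mean life


def simulate_once(rng):
    """Count asset failures in one run; a failed component is replaced new."""
    failures = 0
    for _, model, p1, p2 in COMPONENTS:
        clock = draw_life(rng, model, p1, p2)
        while clock < HORIZON_DAYS:
            failures += 1
            clock += REPAIR_DAYS + draw_life(rng, model, p1, p2)
    return failures


rng = random.Random(2024)
results = [simulate_once(rng) for _ in range(N_RUNS)]
mean_failures = sum(results) / N_RUNS
availability = 1.0 - mean_failures * REPAIR_DAYS / HORIZON_DAYS
print(f"mean failures per {HORIZON_DAYS:.0f} days: {mean_failures:.2f}")
print(f"approximate availability: {availability:.4f}")
```

Because the sketch lets the other components keep ageing while one is being replaced, its downtime accounting is only approximate; a full RBD simulator tracks the system state event by event.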
Reliability Performance of Repairable
Asset
• Top level results
• Using RDA, the projected
number of failures was 3.8
(lower: 2.5, upper: 6.0 @
90% confidence bound)
No Yes
RDA RBD
Scenario 2, Example
s/n  Node Name  Flowrate (Barrels/day)  MTBF (day)  Downtime (hour)
production platform.
4 BNDPA-WHCP13L 250 2372 LGN (2.77, 1.95)
5 49L 250 34.5 LGN (3.08, 1.79)
6 BNDPI-WHCP49L 250 2372 LGN (2.77, 1.95)
7 49S 400 Cannot Fail
8 BNDPI-WHCP49S 400 2372 LGN (2.77, 1.95)
9 50S 100 224 LGN (3.08, 1.79)
10 BNDPI-WHCP50S 100 2372 LGN (2.77, 1.95)
11 50L 400 363 LGN (2.43, 1.96)
12 BNDPI-WHCP50L 400 2372 LGN (2.77, 1.95)
13 53 400 224 LGN (2.61, 1.78)
14 BNDPI-WHCP53 400 2372 LGN (2.77, 1.95)
15 65 1500 548 EX1 (48.1)
16 BNDPI-WHCP65 1500 2372 LGN (2.77, 1.95)
17 21S 30 Cannot Fail
18 BNJTC-WHCP21S 30 2372 LGN (2.77, 1.95)
19 23S 1000 Cannot Fail
20 BNJTC-WHCP23S 1000 2372 LGN (2.77, 1.95)
21 BNPA-V200 4333 100 LGN (0.523, 1.4)
22 BNDPA-IGScrubber 280 1095 EX1 (6.5)
23 BNDPA-Others 280 274 EX1 (210)
24 BNDPI-Autocon 3050 274 LGN (0.368, 0.427)
25 BNDPI-IGScrubber 3050 1095 EX1 (25)
26 BNDPI-Others 3050 548 LGN (3.82, 1.68)
27 BNJTC-Autocon 1030 1095 EX1 (5.5)
28 BNJTC-IGScrubber 1030 100 LGN (2.43, 1.17)
29 BNJTC-CompIAPAC 1030 548 EX1 (183)
Reliability Performance of Repairable
Asset
• Run a 365-day simulation with 1000 executions
Reliability Performance of Repairable
Asset
• Take well 49L for example. The projected downtime over a
year is 866 hours.
• Improvement program
  • What-if analysis
  • Tracking of improvement program (because you can quantify it)
• Anticipate events
  • Resource planning (e.g. spare ordering)
Simulation approach for
Repairable System
• Life-Stress-Relationship
• Inverse Power Law (IPL): item cannot fail at 0 flowrate
• Modified IPL: item can fail at 0 flowrate
• Optimum: item TTF is optimum at design flowrate
Regular Node
• It represents a maintainable
component/asset and is used to
define the characteristics of
production network components.
CourseExamples\StorageExample.aro
Storage Node
• PM Group
• PM-Group manager defines the inventory of items for a
maintenance group, and the PM triggering mechanism.
• If a given pump fails, the standby pump becomes active. The failed
pump becomes the standby.
• If 2 pumps fail, the system still continues to operate with the only
working pump, at a reduced production rate of 1 unit/hour.
Examples\Redundancy.aro
Standby Example
• Define which nodes, upon failure, can trigger the group, and nodes
to be triggered for PM.
Examples\UserGuide\PMGroup.aro
PM Group example
• Consider a series network. When any of the items (A, B or C)
fails, perform PM on the remaining items.
PM Group Manager
• PM Group Manager is used to define Regular Nodes participating in the
maintenance group, and the PM triggering mechanism.
PM Group Manager
[Figure: PM Group Manager settings with the trigger source highlighted]
Replenishment policies
• LevelBased Restock
• Restock occurs when the spare part inventory level is equal to or less than the
user-defined trigger level (Trigger Restock Level).
Simulation approach for
Repairable System
• Modified IPL: Node can age (i.e. can fail) at 0 stress (for
example during standby).
• The life-stress-relationship is
• The life-stress-relationship is
Advantages
• Quantify the impact of asset reliability on
production efficiency.
• Provide a model for engineers to perform “what-if”
analysis.
• Identify gaps and the improvement programs needed to
close them.
Reference Textbooks
Reliability Engineering Handbook
Vols 1 & 2, by Dimitri Kececioglu
Reliability and Life Testing Handbook
Vols 1 & 2, by Dimitri Kececioglu
Statistical Methods for Reliability Data
by William K. Meeker and Luis A. Escobar
Applied Life Data Analysis
by Wayne Nelson
Appendix
Median Rank
Appendix: Median Rank calculation
• Rank value z can be calculated using the Cumulative Binomial
equation:
$P = \sum_{k=j}^{N} \binom{N}{k} z^{k} (1 - z)^{N - k} = 0.5$
where
N is the sample size,
P is the probability that at least j failures are observed.
for N = 4 and j = 1
$1 - (1 - z)^{4} = 0.5$ => z = 0.159
Interpretation:
• If z = 0.159 (the probability of failure of this population)
then, there is a 50% chance of observing at least 1 failure
among the 4 units.
Side note: Solving for Median Rank, z
• Excel function
• BETA.INV(0.5, j, N - j + 1)
• Benard's approximation
• $z \approx \frac{j - 0.3}{N + 0.4}$
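The median rank can also be found without Excel by solving the cumulative binomial equation numerically. The sketch below uses bisection on z and compares the result with Benard's approximation, z ≈ (j − 0.3)/(N + 0.4); the function names are hypothetical.

```python
from math import comb


def median_rank(j, n, tol=1e-12):
    """Solve P(at least j failures out of n | z) = 0.5 for z by bisection."""
    def p_at_least_j(z):
        return sum(comb(n, k) * z ** k * (1.0 - z) ** (n - k)
                   for k in range(j, n + 1))

    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if p_at_least_j(mid) < 0.5:
            lo = mid   # probability too small, z must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


def benard(j, n):
    """Benard's closed-form approximation of the median rank."""
    return (j - 0.3) / (n + 0.4)


N = 4
for j in range(1, N + 1):
    print(f"j={j}: exact={median_rank(j, N):.4f}  Benard={benard(j, N):.4f}")
```

For N = 4 both columns agree with the rank table (0.159, 0.386, 0.614, 0.841) to three decimals, which is why Benard's approximation is usually good enough for probability plotting.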