RCM Implementation

A Constructive Critique of Reliability-Centered Maintenance
David J. Sherwin, Lund University Institute of Technology, Lund.
Key Words:

Maintenance, RCM, Life-cycle costs, Terotechnology.

0-7803-5143-6/99/$10.00 © 1999 IEEE
1999 PROCEEDINGS Annual RELIABILITY and MAINTAINABILITY Symposium

SUMMARY & CONCLUSIONS
Maintenance should be based on the intrinsic RAM properties of
the machinery to be maintained, and cost-optimized. Because
maintenance acts on parts, data collection and analysis must also
be at that level. Data costs are falling and optimization methods
are improving, yet the maintenance industry still resists change.
This is at least partly because the books on RCM contain some
wrong ideas which spoil it as a basis for terotechnological
investigation and amelioration. The paper first demolishes some of
the tenets of RCM, then shows how these myths have delayed
progress and finally makes suggestions for a system of maintenance
based more truly upon reliability. The points are illustrated by
examples.

1. INTRODUCTION
The Concise Oxford Dictionary defines a fad as “A pet notion or
rule of action, a craze, a piece of fancied enlightenment”.
Quality, Reliability and Terotechnology have a long history of
fads. In Quality there were Quality Costs, Quality Circles
(1970’s), Taguchi Methods (1980’s), ISO 9000 (1990’s). In
Reliability there were the Bathtub Curve (1960’s), FMECA
(1970’s), Bayesian Methods, (1980’s). In Terotechnology we have
RCM. All have some good features, but none is a panacea.
Exclusive reliance on them is dangerous. All can be misused or
used out of context. The paper shows that RCM is a fad by
examining its weaknesses and errors, and then suggests some more
effective methods.

2. GLOSSARY
[C]CM     [Continuous] Condition Monitoring
AR        Age Renewal at cost-optimized intervals
CF, CM    Total cost of a Failure, PM
FMECA     Failure Modes Effects & Criticality Analysis
LCC/P     Life-cycle Costs & Profits
MSG3      Latest airline version of RCM
OEM       Original Equipment Manufacturer
PM        Preventive Maintenance - any action to prevent failure.
RCM       Reliability-centered Maintenance as described in [1-3]
ROCOF     Rate of Occurrence of Failures as per [4]
β         Weibull distribution shape parameter
Terotechnology [11] is defined as “A combination of management,
financial, engineering, building and other practices applied to
physical assets in pursuit of economic life-cycle costs”. This
could now with advantage be amended to “...life-cycle profits”.

3. BRIEF DESCRIPTION OF RCM
RCM is described fully by Nowlan & Heap, MIL-STD’s, and Moubray,
[1,2,3]. RCM is a good idea spoiled rather than a wholly bad
scheme, though the good parts of it are not all original whereas
the faulty ones, apart from the Bathtub Curve confusion, see
below, generally are specific to RCM.
RCM purports to be a procedure for discovering what maintenance
is required by an asset in its operating context, in particular
what must be done to ensure that it continues to provide its
intended functions to its owner. But this should not be the sole
aim; maintenance is an economic rather than just a reliability
problem. In outline the RCM procedure is:
a) Define the system’s functions
b) Define failure modes relative to these functions
c) Carry out FMECA
d) How can failure modes be prevented?
e) If prevention is not possible, what should be done?
Generally, this procedure is carried out by groups of engineers,
technicians, and operators familiar with the plant to be
maintained, with the expectation that they will advise less
maintenance that requires stopping and dismantling, but more
focused on function. They use what data there are, but if there
is none, then they rely on experienced estimates; if a failure
mode has never been recorded, they tend to assume that no
maintenance is needed to prevent it, despite the possible
operation of such a PM routine during the period of no failure.
The criterion is reliability of function, not economics. The
investigation groups contain both senior and junior personnel,
contrary to the accepted theory of small-group dynamics, e.g.
Quality Circles. The author has noted in two by-the-book RCM
exercises that, however hard they try not to, senior people tend
to bully juniors into agreeing to cuts in the PM which may save
money for a while, but would eventually prove detrimental. Also,
their understanding of the design may be inadequate to eliminate
or open out schedules safely.
In a Decision Diagram, failures are classified as Function Loss,
Safety/Environmental, Hidden Faults and Others (which do not
directly affect functionality). Why this is considered so
important is unclear, because the Decision Charts then advise
almost the same procedure for each class, i.e. to examine, in
order, the feasibility (rather than cost) of, CM (running), PM
(to restore, or renew), Inspection at Intervals (stopped, Hidden
Faults only), and as a last resort, re-design. The first feasible
solution is to be accepted, except in MSG3, which also puts
Inspection before PM in the order of consideration, and
extends Inspection to modes other than Hidden Faults. That
these alterations were made indicates the inadequacy of the
original charts, which nevertheless remain in widespread use.

4. HISTORY & BATHTUB MISCONCEPTIONS
RCM is a child of the aircraft, and more particularly the
airline, industries. Airliners have redundant machinery and
control systems, and structures designed to tolerate minor
damage without danger. Airlines found in the 1950’s that
increasing the intensity of overhauls to engines, other
machines and avionics did not increase reliability. The U.S.
Federal Aviation Authority’s 1960 investigation was to
analyze the factors affecting reliability and the efficacy of PM.
It “confirmed” that “scheduled overhaul had little effect on
overall reliability of complex items unless there was a
dominant mode, and there are many items for which there is
no effective form of scheduled maintenance ...”.
This of course is nonsense. It arises from misconceptions
about the Bathtub Curve, see Ascher & Feingold, [4]. Figure 1
shows how repair of only the failed part leads to randomized
part ages and a pseudo-Poisson process for the system as a
whole. The raised ends are due to initial quality and training
problems and wear-out of longer-lasting parts respectively.
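
The randomization of part ages described above can be illustrated with a small simulation. This is only a sketch under assumed values, not taken from the paper: a hypothetical system of 50 identical wearing parts with Weibull lives (shape 3, scale 1000 hours), in which only the failed part is renewed each time. The function name and all parameters are illustrative. It reproduces the flat, pseudo-Poisson middle of the system curve; it does not model the raised ends caused by quality, training or longer-lasting parts.

```python
import math
import random

def failures_per_window(n_parts=50, shape=3.0, scale=1000.0,
                        horizon=20000.0, window=250.0, seed=1):
    """Count system failures in each time window when only the failed part is renewed."""
    rng = random.Random(seed)
    counts = [0] * int(horizon / window)
    for _ in range(n_parts):
        # Each part position is a renewal process: a failure is followed
        # immediately by a new part with an independent Weibull life.
        t = rng.weibullvariate(scale, shape)
        while t < horizon:
            counts[int(t / window)] += 1
            t += rng.weibullvariate(scale, shape)
    return counts

counts = failures_per_window()

# Long-run system ROCOF expected from renewal theory: parts / mean part life.
mean_life = 1000.0 * math.gamma(1.0 + 1.0 / 3.0)
steady_rate = 50 * 250.0 / mean_life          # expected failures per window
late_mean = sum(counts[40:]) / len(counts[40:])

print("first windows:", counts[:8])
print("late mean per window:", late_mean, "renewal-theory value:", steady_rate)
```

Although every part has a strongly rising hazard rate, the system failure count per window starts near zero while all parts are new and then settles to a roughly constant level once the part ages have become mixed.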

Figure 1. Bathtub curve for Repairable System (vertical axis: System ROCOF; horizontal axis: System Lifetime)
We can be sure that they never tried operating the aircraft
entirely without such overhauls! If they had, they would have
found that the reliability fell. What actually occurred was a
switch from age to condition-based maintenance, which
depended heavily upon better data collection and analysis, see
for example Cole, [13]. They apparently did not investigate the
quality of the workmanship, the introduction of new faults at
PM, as a reliability factor either in 1960 or in the later
investigations leading to the formulation of RCM, see Sherwin
& Lees, [5]. Moubray’s book, [3], for example, implies
several times that the system or machine Bathtub Curve is
inherent, when it is obviously shaped and scaled by the
maintenance policy, and should not therefore be used to set the
policy, [4,5]. More generally, the investigations were empirical
and made no attempt to reconcile wear-out and AR theory at
the part level with apparently Poisson failure patterns to
machines. This was excusable in 1960, given the state of the
art of Reliability Theory, but not since [4]. It is true now,
though possibly not in 1960, that some electronic systems are
best left alone, because either they are inherently reliable due
to the low, smooth, loading, or have very low constant or
falling hazard rates because failures are due to residual
manufacturing quality faults and random voltage peaks.
However, given the need to save weight, progressive effects in
mechanical parts, such as metal fatigue, corrosion, wear and
creep are inevitable and usually amenable to CM, Inspection
whilst stopped, or AR. But it is parts not systems which, given
enough data, are amenable to PM optimization. The reversing
returns recorded under increasing overhaul frequency probably
arose as follows. Some overhaul routines call for inspection
and renewal of worn parts according to the judgment of the
technician, others for renewal regardless of condition, still
others for an exchange with a whole machine withdrawn from
another identical system and overhauled at leisure, but none
for the complete renewal of the machine which would justify
treating it as statistically equivalent to a part. E.g., a Weibull
analysis of bus gearboxes by Kelly, [6] had shape parameter β
= 2.5 for first failures from new, but β = 1.1 for subsequent
failures. From new until first failure, all the part hazard rates
are additive, but after overhaul, pseudo-Poisson failures would
be expected because the wearing parts are then of different
ages, different parts being renewed in each gearbox.
In CM we assess whether the part will endure to the next
renewal opportunity, so there is a tendency to renew at about
the same intervals regardless of the frequency of checks,
provided that this prevents most failures. In both this and the
other cases, some bad quality parts are fitted, which fail early,
and also some badly fitted renewals occur (poor
workmanship). These are reducible by quality control and
training, but are often not seen for what they are because of the
confusion between part and system Bathtub Curves, and so
appear to give negative reliability returns from more frequent
overhauls. However, with good work and good spares,
reliability would increase with overhaul frequency. There
would, of course, be a turning point with respect to availability.
The other important “finding” of the 1960 investigation
was that “There are many items for which there is no effective
form of scheduled maintenance”, [1,2,3]. This is a direct
indication that the bathtub confusion dominates RCM theory,
confirmed by the reference to the need for “a dominant
mode”. All frequently-failing parts either give detectable
signs that they are about to fail, or else have rising hazard rate
functions. The next question is whether the costs justify
pre-emptive or on-condition renewal. Few such parts fail this test,
because the costs and their absolute difference can be quite
small; it is the ratios which determine, with the distribution
form, whether such work is worthwhile.
The books on RCM all show six variations on the Bathtub
Curve, see Figure 2. The axes of these graphs are marked, if at
all, as “(Conditional) probability of failure versus Time”. They
do not try to distinguish between system and part time-scales,
and policy recommendations are developed from the curves
without regard to any economic factors, as if the prevention of
“most of the potential non-random failures” were the only
criterion of success. By random failures RCM means constant
conditional probability, but whether that in turn implies part
hazard rate or system ROCOF is never clear.

The “evidence” upon which these failure patterns are based is
the same as that upon which it was concluded that overhauls did
not improve reliability unless there was a dominant mode of
failure. This suggests that they remain confused between
systems and parts, ROCOF and hazard rate.
Discussing Pattern A, Moubray, [3] states that two or more
modes are operating and that each must be dealt with
separately, but he fails to acknowledge that the central
portion may be the result of part renewals in a system.
In Pattern B, the initial flat portion is attributed to “random
factors which cause ‘faster wear than usual’” in a part with a
three-parameter Weibull distribution, once more indicating
bathtub curve confusion.

Figure 2. RCM’s Failure Patterns

The 1960 data, in which these patterns were all identified
are mainly for systems which were overhauled with
pseudo-Poisson failures between overhauls. The conclusion that
renewal should occur just before the onset of relatively rapid
wear-out is of course sub-optimal even for parts; if data suffice
to identify this pattern then they suffice also for distribution
analysis which separates the Poisson and wear-out modes and
permits optimization of the cost rate.
RCM texts say that Pattern C may apply to parts failing by
metal fatigue. Here again, it is unclear in [1,2,3] whether the
time scale is part life or system-time-since-overhaul, with the
fatigue failure as the dominant mode triggering overhaul. It is
easy to show that a straight-line hazard rate implies a Weibull
shape parameter β = 2. Fatigue failures are often Lognormal.
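
The claim about the straight-line hazard rate follows from the standard two-parameter Weibull hazard, h(t) = (β/η)(t/η)^(β-1), which reduces to 2t/η² when β = 2. A minimal numerical check is sketched below; the characteristic life of 1000 hours and the sample ages are arbitrary illustrative values, not data from the paper.

```python
import math

def weibull_hazard(t, beta, eta):
    """Hazard rate h(t) = (beta/eta) * (t/eta)**(beta - 1) of a two-parameter Weibull."""
    return (beta / eta) * (t / eta) ** (beta - 1.0)

eta = 1000.0                          # hypothetical characteristic life, hours
ages = [100.0, 200.0, 400.0, 800.0]   # arbitrary ages at which to evaluate h(t)

for beta in (1.0, 2.0, 3.0):
    print(beta, [weibull_hazard(t, beta, eta) for t in ages])

# For beta = 2 the hazard is a straight line through the origin with slope 2/eta**2.
slope = 2.0 / eta ** 2
is_linear = all(math.isclose(weibull_hazard(t, 2.0, eta), slope * t) for t in ages)
print("beta = 2 gives a linear hazard:", is_linear)
```

With β = 1 the hazard is constant, and with β = 3 it grows as the square of age, so only β = 2 matches a straight-line pattern.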
Pattern D is said to correspond to 1 < β < 2 in the Weibull
form. The author’s own data, gathered at chemical plants in
Britain in the 1970’s, [5] found centrifugal pumps with system
ROCOF bathtubs of this shape, presumably because there was
good manufacturing quality control in an established design. It
also seems plausible that avionics systems of the 1950’s would
exhibit this pattern of ROCOF. They were “burned in“,
repaired by exchange, sent to a workshop to find and replace
the failed part(s), then placed on the shelf for the next time.
The description of Pattern E, together with reference back
to the data of the 1960 study, confirms that RCM specialists
are definitely confused between hazard rate and ROCOF. They
are also confused between true Poisson failures, which are by
nature completely unpredictable individually, and failures
which give detectable warning and are amenable to CM but
not to AR because of their highly variable times to the start of
detectable deterioration. Moubray [3] cites the example of
rolling contact bearings. These are miniature systems which
are renewed as parts; that is they have several potential failure
modes which compete and combine to cause failure. Weibull
analysis of such bearings shows multiple modes which can be
separated graphically. It is wrong to draw the Weibull plot and
declare that the failures are random because the initial slope is
about unity. Ball bearings do wear out, and elsewhere in his
book Moubray describes roughly how that happens in respect
of fatigue failures to the outer race. He states that the interval
from detectable warning becoming available and actual failure
(his P-F interval) is reasonably constant and that it “should not
be necessary to take additional readings after the first sign of
deviation is discovered ...should only be tracked if the process
of deterioration is poorly understood”. Actually, it can be
shown in practical cases such as the roller bearings in paper
mills, [7], that the P-F interval is variable and that it pays to
increase the frequency of readings when deterioration is first
detected. In some cases continuous monitoring, either of the
last phase or as the only policy, is economically best, [8].
Moubray is right to claim that better understanding of the
failure process could lead to more accurate assessment of the
time remaining, but so far, more precise analysis of the
vibration frequencies relative to the rotational speed has led
researchers to identify more (and more complex) failure
modes. Money is wasted if more readings are not taken; the
vibration level is quite likely to fall again, and the rate of
deterioration varies between and within failure modes.
Finally, Pattern F is presented as the most common shape
in the 1960 airline study data, (68%). They call it “infant
mortality”, a phrase normally associated with parts, but then
describe correctly the usual system effects, including poor
maintenance workmanship, once again indicating basic
confusion between the two types of bathtub curve. However,
some RCM texts go on to perpetuate the myth of “too much
preventive maintenance” as a cause of “decreasing failure
rate”. This is a pity because one would have expected more
logical development of the better workmanship theme, [5]. It is
easy to show fast (but temporary!) savings, by advising less
PM rather than more training and better supervision. Bad work
in a new system is inexperience rather than carelessness or
over-maintenance. Talk of “unnecessary or unnecessarily
invasive” routines is unhelpful; OEM’s call for such early
routines against the special problems of newness, such as the
first change of the oil in a new engine. One car-hire company
buys its cars new, ignores running in and all the early
maintenance routines, and sells them at 50,000 miles without
even changing the oil or taking the first free service. They get
little trouble, but the second owners do! This is not to say that
the frequency and basic need for routines should not be
challenged; unfortunately many OEM’s make up money on
spare parts that they lost on competitive pricing. But it should
be done on the basis of data analysis and engineering
investigation, including asking the OEM to justify his
schedules. Hyper-exponentially distributed failures at the part
level are due to bad workmanship or poor quality spares, [5].
At the machine or system level, the overall ROCOF settles to a
constant value that is higher than it needs to be, and a
reduction in PM frequency may sometimes lead to a temporary
improvement, but permanent improvement is achieved by
training maintainers in fitting techniques and spare parts
inspection. We showed in three diverse situations, (fertilizers,
petro-chemicals and hospital autoclaves) that this was so, [8].
In each case, an increase in effective PM frequency saved
money. Standards have improved since 1980 and we do not
claim that this would always be so, now or then. But it is
worrying that the RCM analysis is being applied to aircraft,
nuclear plants, ships etc., to reduce the cost of maintenance.

5. THE DATA PROBLEM AS SEEN BY RCM.
RCM denies the need for data on both rare and very
common events, and is wrong in both cases. RCM fudges the
data issue for rare but very serious events as follows, “The
acquisition of the information thought to be most needed by
maintenance policy designers - information about critical
failures - is in principle unacceptable and is evidence of the
failure of the maintenance program. This is because critical
failures entail potential (in some cases, certain) loss of life, but
there is no rate of loss of life which is acceptable to an
organization as the price of failure information to be used for
designing a maintenance policy. Thus the designer is faced
with the problem of creating a maintenance system for which
the expected loss of life will be less than one over the planned
operational lifetime of the asset. This means that, both in
practice and in principle, the policy must be designed without
using experiential data which will arise from the failures which
the policy is meant to avoid.”, [9]. “Resnikoff’s Conundrum”,
above, is treated as a profundity in RCM circles, but in fact
(fatal) accidents are coincidences. The calculated probability
of the accident is made very small by design; it is the product
of much larger constituent event probabilities, which can be
estimated from data collected from previous similar systems. If
the problem is significant, there will be adequate data, and if it
is not, then the censored data are reassuring. But if data are not
collected then there can be no statistically sound assurance.
Censorings are also data; operation without failure is relevant.
Example: those risking bungy-jumping are assured that an
inert dummy is the first to jump each day, and that even he
does not jump before the rope has been visually inspected. The
attachments and anchor point are double-checked. Moreover,
the rope is changed anyway after a fixed number of jumps, and
is manufactured to a strict standard. Estimates of the
probabilities of failure of equipment and procedure can be
made, and multiplied together to form an estimate of the
probability of an accident, which is more accurate than
“known deaths / known jumps” in similar but non-identical
situations. But even if there have been no fatalities, it is still
possible to estimate the upper limit of probability from the
number of successful jumps. Suppose there have been 1000
jumps and no fatalities. On a Poisson assumption, the best we
can make in the circumstances, the 95% upper limit of the
failure probability p is 0.003/jump.
$$ \Pr(0) = e^{-1000\,p_{0.95}} = 0.05 \quad\Rightarrow\quad p_{0.95} \approx 0.003 \tag{1} $$
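
The zero-failure bound above can be checked numerically. The sketch below solves exp(-n p) = 1 - confidence for p, as in the text's Poisson assumption; the function name is illustrative and not from the paper.

```python
import math

def zero_failure_upper_limit(n_trials, confidence=0.95):
    """Upper limit on the per-trial failure probability after n failure-free trials.

    Solves exp(-n * p) = 1 - confidence for p (Poisson assumption).
    """
    return -math.log(1.0 - confidence) / n_trials

p95 = zero_failure_upper_limit(1000)
print(round(p95, 4))                 # 0.003, as quoted for 1000 safe jumps
print(math.exp(-1000 * p95))         # recovers the 5% probability of seeing no failures
```

The bound shrinks in proportion to the number of failure-free trials, which is why censored (failure-free) operating experience still carries information.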

Moubray, [3], also argues as follows against collecting data
for common failures, reversing Resnikoff’s argument:

“This contradiction applies in reverse at the other end of the
scale of consequences. Failures with minor consequences tend
to be allowed to occur precisely because they do not matter
very much. As a result, large quantities of historical data will
be available concerning these failures, which means that there
will be ample material for accurate actuarial analyses. These
may even reveal some age limits. However, because the
failures do not matter very much, it is highly unlikely that the
resulting scheduled restoration or scheduled discard tasks will
be cost-effective. So while the actuarial analysis of this
information may be precise, it is also likely to be a waste of
time. The chief use of actuarial analysis in maintenance is to
study reliability problems on the middle ground where there is
an uncertain relationship between age and failures which have
significant economic consequences ... two categories ... (qualify),
large numbers of identical items ... and age-related failures)
where preventive and failure costs are both very high.” The
basic error here is to suppose that maintenance is a question of
reliability; it is really an economic problem, in which reliability
is a factor. Figure 3 shows how PM, operator and maintainer
training, quality control by the OEM and the number of parts
included in the PM schedules affect the shape of the machine
or system bathtub curve, and as a consequence, the shape of
the corresponding total cost from new curve.

Figure 3. Malleability of System Bathtub under PM and Relation to Costs and Durability (vertical axis: ROCOF)
The tangent at the origin of this cost curve represents the
minimum cost rate if renewal (or overhaul) takes place at the
tangential age. This theme is further developed in Figure 4, to
include the value added as well as the expenditure, [10]. This is
the Life-cycle Profit (LCP) principle.
Figure 4. Optimization of Nett Benefit (Net Benefit = Sales - Costs; marked on the plot: Total Profit, Max Angle, Renewal, Break-Even)

The LCP concept permits maintenance to be seen as an
investment rather than an expense.
Every item in a working system has an economic function
or it should not be there. Properly calculated AR limits depend
upon the ratio rather than the values of CF, CM, and include
costs which are not paid from the maintenance budget.
Properly optimized renewal schedules are cost-effective by
definition, but RCM applies the method only to items with a
failure-free or very low “failure rate” period at the start of life,
and advises the operator to repair or change out the item as
soon as this period is over, making no distinction between
these policies or between first and subsequent failures or
simple and complex items. Using their own argument with
respect to large numbers of identical items, all the little savings
would add up to a big sum also in the case of many different
items. If there are enough data, doing the calculations, even
properly, is quite cheap. The justification for collecting the
data is that we do not know which items will fail sufficiently
often to measure the distributions and calculate schedules until
we have operated the system for a while. Intrinsically reliable
items produce no data and so present no data collection or
storage problems. The principal cost in data collection and
storage is in the collection itself.
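
The dependence of the AR limit on the cost ratio can be illustrated with the standard age-renewal cost-rate model, in which the expected cost per cycle, CM R(T) + CF F(T), is divided by the expected cycle length, the integral of R(t) from 0 to T. This is a sketch under assumptions: the paper does not state which model it uses, and the Weibull parameters (β = 2.5, η = 1000 hours), cost values and function names below are illustrative only.

```python
import math

def cost_rate(T, c_f, c_m, beta, eta, steps=500):
    """Expected cost per unit time when a part is renewed at age T or on failure."""
    def survival(t):
        return math.exp(-((t / eta) ** beta))   # Weibull survival function R(t)
    h = T / steps
    # Trapezoidal estimate of the mean cycle length, the integral of R from 0 to T.
    mean_cycle = h * (0.5 * (survival(0.0) + survival(T))
                      + sum(survival(i * h) for i in range(1, steps)))
    return (c_m * survival(T) + c_f * (1.0 - survival(T))) / mean_cycle

def best_age(c_f, c_m, beta=2.5, eta=1000.0):
    """Grid search (10-hour steps up to 3000 hours) for the cheapest renewal age."""
    ages = [10.0 * k for k in range(1, 301)]
    return min(ages, key=lambda T: cost_rate(T, c_f, c_m, beta, eta))

age_ratio_10 = best_age(c_f=1000.0, c_m=100.0)
age_ratio_10_scaled = best_age(c_f=10000.0, c_m=1000.0)   # same ratio, ten times the costs
age_ratio_3 = best_age(c_f=300.0, c_m=100.0)

print(age_ratio_10, age_ratio_10_scaled, age_ratio_3)
```

Scaling both costs by the same factor leaves the optimum age unchanged, while lowering the ratio CF/CM moves the optimum to a later age, which is the sense in which the limit depends on the ratio rather than on the values.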
We have found that maintenance staff are willing to collect
data provided that the managers put it to good use. Early
schedules are based upon experience and data of uncertain
relevance and may have to be changed later. It is actually the
critical, frequent and expensive failures which are most likely
to warrant the expense of re-design, and the ones of moderate
to low cost and moderate frequency which justify preventive
maintenance. Safety is best incorporated as very high failure
cost, CF.
The advocates of RCM seem not to understand how to
analyze data properly, particularly censored data. No failures
of a wearing part with only moderate cost ratio over many
cycles usually indicates that the PM interval is too short. When
intervals are shorter than optimum, more money is wasted than
by being too long by the same amount. RCM’s simplistic
methods of setting intervals for renewal or inspection generally
make the intervals too short. For every maintenance
optimization there is an expected residual probability of
failure, which can be used to check whether the policy is
operating as expected, or needs adjustment. The distribution
and other estimates do not have to be super-accurate to
produce worthwhile savings. It is usually not the case where a
physical deterioration situation exists, that operation to failure
is the best policy. The cost ratio will usually be known from
the FMECA, and if it is high, any reasonable schedule will be
better than none until analysis of operational data gives a better
estimate of the distribution.
The value of data collection and analysis is not confined to
maintenance schedule adjustment. It is much more important
that plant manufacturers hear about all the failures, so that they
can consider re-design and avoid making the same design
errors again. Nowlan & Heap [1] say that manufacturers often
refuse to accept responsibility for failures which they consider
due to operation beyond design limits, and that collecting such
data is therefore not worthwhile. Actually such data are very
useful to OEM’s because they indicate the real relationship
between duty and reliability and how much margin there is in
the design. Designers do tend to reject data critical of their
designs unless it is so well documented as to be
unimpeachable. From another viewpoint, designers do well to
note and accommodate the ways that their products are
actually used, and design new products that can do what is
needed, rather than insist that they should be used in ways
which would have lost the sale to a competitor. In RCM’s own
beloved airline industry, the engine manufacturers pay their
customers to report part failures, the condition of parts
renewed on age, and condition monitoring readings, because
this helps them to improve the product and its maintenance in a
very competitive market. Data analysis indicates whether PM
is justified as well as how often it should be done, and we
cannot be sure about either without them. It is ironic that RCM
claims the outstandingly data-conscious aircraft industry as a
major success, while sustaining this silly attitude to data
collection. RCM’s faulty ideas arguably are delaying progress
which would be possible with better, integrated IT systems.
6. THE VALUE OF HUMAN LIFE.
The existence of regulations and inspectors is grim witness
to the fact that some organizations are quite willing to risk
human lives in pursuit of profit. There is a price for human life
in safety economics and it is sentimental nonsense to deny it
[9]. It is a high one of course in a civilized society, but it is
nowhere near infinite. It embraces both actual and estimated
sums, including compensation, fines, loss of production, loss
of reputation and community goodwill, increase in insurance
premiums, and internal morale factors e.g. risk of strike.
However an interval for inspection of safety-sensitive
equipment is determined, the cost of the accident is implicit; it
can be found by inverting the appropriate maintenance model.
Example: Safety valves on boilers are tested and reset 4
times a year at a cost of 120 dollars. The rate of developing
faults which would prevent the valves blowing when required
is estimated at 0.01 per year, and the demand rate (incidence
of over-pressure) is 0.1 per year. The false alarm rate is
assumed negligible, and the 120 dollars covers any work done
to pass the tests. The cost attributed to a failure, (boiler
explosion) can be estimated by assuming that the test rate is
economically optimal, although it would usually be preferable
to calculate the optimum test rate after assigning a cost of
failure. The situation can be modeled as in the Markov
diagram, Figure 5.

Figure 5. Example of Cost of Human Life (Markov diagram with states “All OK” and “S.V. Failed”)

The mean cycle time is

$$ T = \frac{1}{\lambda} + \frac{1}{q + f} = \frac{1}{0.01} + \frac{1}{4.1} \approx 100.2439\ \text{yrs} \tag{2} $$

where λ is the rate of developing faults, q the test rate and f the demand rate. The cost per year is given by

$$ C(q) = C_M q + \frac{C_F f + C_M q}{T\,(f + q)} \tag{3} $$

Differentiating (3) with respect to q, equating to zero and
substituting gives CF = 2,027,435. To find the cost of a human
life subtract the material cost of the failure, i.e. to rebuild the
boiler and the loss of revenue plus the costs of fines etc., say
500,000 in all and divide the remainder by the expected
number of fatalities, say 2, giving 763,727 dollars/life. Not all
critical failures are fatal or potentially so. Some, like Piper
Alpha, are tragic and expensive; others are just expensive.
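
The inversion described above can be sketched numerically. The code assumes one particular reading of the yearly cost, C(q) = CM q + (CF f + CM q) / (T (f + q)) with T = 1/λ + 1/(q + f); this reading, the closed form obtained from setting dC/dq = 0, and the function names are assumptions of this sketch rather than the author's stated derivation. Under that reading the implied failure cost comes out at about 2.03 million dollars, close to but not exactly the figure quoted in the text, so the sketch should be taken as an illustration of the method only.

```python
def yearly_cost(q, c_f, c_m, lam, f):
    """Assumed yearly cost: C(q) = C_M*q + (C_F*f + C_M*q) / (T*(f + q))."""
    T = 1.0 / lam + 1.0 / (q + f)       # mean cycle time in years
    return c_m * q + (c_f * f + c_m * q) / (T * (f + q))

def implied_failure_cost(q, c_m, lam, f):
    """Failure cost C_F for which test rate q satisfies dC/dq = 0 in the assumed model."""
    return c_m * ((q + f + lam) ** 2 / lam + (f + lam)) / f

# Values from the text: 4 tests per year at 120 dollars, fault rate 0.01/yr, demand rate 0.1/yr.
c_m, lam, f, q = 120.0, 0.01, 0.1, 4.0
c_f = implied_failure_cost(q, c_m, lam, f)
print("implied cost of failure:", round(c_f))

# Check that q = 4 really is the cheapest test rate for that failure cost.
at_optimum = yearly_cost(q, c_f, c_m, lam, f)
print(at_optimum < yearly_cost(3.9, c_f, c_m, lam, f),
      at_optimum < yearly_cost(4.1, c_f, c_m, lam, f))

# Value attributed to one life, using the text's 500,000 material cost and 2 fatalities.
value_per_life = (c_f - 500000.0) / 2.0
print("implied value per life:", round(value_per_life))
```

Running the model forwards instead, with an assigned cost of failure, would give the optimum test rate, which is the direction the text says is usually preferable.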
7. RCM’S DECISION CHARTS
Failure Mode:
   If CM is possible & Economic then do it
   If scheduled repair is possible & Economic then do it
   If Age or Block Renewal is possible & Economic then do it
   Consider in that order. Else No Maintenance but maybe Redesign
   (No Loss of F’n) Insp. to find failures. If not then must
   redesign if Safety or Environment, otherwise no Maintenance
   (Safety & Env’m’t) Use a combination of Tasks. If this is not
   feasible then Redesign is Compulsory

Figure 6. Simplified Version of RCM Decision Charts
The Decision Charts are the Commandments of RCM; just
as the Failure Patterns are the Credo. The order of
consideration of policies in RCM is fixed, and not necessarily
the best that could be done in any individual case, even
according to RCM’s criterion of reliability rather than
economics. Note how although the failure modes are
classified, the decision process is almost the same for all. The
basic classification of failures into No Loss of Function (incl.
Hidden Faults), Safety & Environmental, Function Loss and
Economic due to Quality or Output Loss is reasonable except
for the failure to understand that there is always an economic
loss if the failed item has any purpose, typified by the
treatment of redundancy. If there is a standby for a machine in
the system, then the failures are classified as No Loss of
Function. This begs the question of the need for and
prioritizing of redundancy. RCM strikes no economic balance
in consideration of redundant items, it simply assumes that no
loss of production occurs if there is a full-size standby, and
that therefore PM is less likely to be worthwhile. In fact, of
course, there is always the small probability that the standby
will fail before the repair is completed, and the machine’s
performance or output quality suffers if PM tasks are
abandoned. Under RCM, it is implicit that machines are either
operational or not; they may be showing signs of impending
failure, but until failure they are fully operational. Yet the
(unsimplified) Decision Charts repeatedly ask whether tasks, i.e.
maintenance policies such as “scheduled restoration”, are
“technically feasible and worthwhile”. In fact technical
feasibility and cost inevitably are connected. For example, for
a very high cost relative to the expected period of failure-free
operation that might follow, it is technically possible to take a
ball bearing out of the machine, take it apart and renew just the
fatigued outer race. The high cost relative to the expected
benefit is the reason why such an operation is not generally
considered feasible. When we then consider the more sensible
option of renewing the whole bearing, a “scheduled discard
task” in RCM parlance, whether it is “worthwhile” cannot
really be decided without the data necessary to find an
optimum renewal interval. In contrast, it is worthwhile to
repair exchanged computer mother-boards. Which policies will
work certainly cannot be discovered without such information;
in RCM terms, if you do not know the “Failure Pattern”
(Figure 2) you cannot decide the policy. We defy any RCM
practitioner to determine the failure pattern and so the policy
without the necessary data to optimize the interval. Without the
costs you cannot decide if it is worthwhile. In other words one
might as well do it properly as badly, and there is no ducking
the need for detailed data and data analysis.
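The dependence of “worthwhile” on both the failure pattern and the costs can be illustrated with the standard age-renewal cost-rate model. The sketch below is only an illustration under stated assumptions: a two-parameter Weibull life distribution, a fixed cost per planned renewal and a higher fixed cost per failure. The function names and the numbers in the demonstration are hypothetical and are not taken from the paper.

```python
import math


def weibull_reliability(t, shape, scale):
    """Survival probability R(t) for a two-parameter Weibull life distribution."""
    return math.exp(-((t / scale) ** shape))


def age_renewal_cost_rate(interval, shape, scale, cost_pm, cost_failure, steps=2000):
    """Long-run cost per unit time if the part is renewed at failure or at age `interval`."""
    # Expected cycle length is the integral of R(t) from 0 to `interval`,
    # evaluated here with the trapezoidal rule (R(0) = 1).
    h = interval / steps
    r_end = weibull_reliability(interval, shape, scale)
    area = 0.5 * (1.0 + r_end)
    for i in range(1, steps):
        area += weibull_reliability(i * h, shape, scale)
    expected_cycle_length = area * h
    # A cycle ends with a planned renewal (probability R) or a failure (1 - R).
    expected_cycle_cost = cost_pm * r_end + cost_failure * (1.0 - r_end)
    return expected_cycle_cost / expected_cycle_length


def run_to_failure_cost_rate(shape, scale, cost_failure):
    """Cost per unit time with no preventive renewal: failure cost / mean life."""
    mean_life = scale * math.gamma(1.0 + 1.0 / shape)
    return cost_failure / mean_life


def best_interval(shape, scale, cost_pm, cost_failure, candidates):
    """Return (interval, cost_rate) with the lowest cost rate among the candidates."""
    rated = (
        (t, age_renewal_cost_rate(t, shape, scale, cost_pm, cost_failure))
        for t in candidates
    )
    return min(rated, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical data: characteristic life 1000 h, failure ten times dearer than PM.
    candidates = [50 * k for k in range(1, 61)]
    for shape in (1.0, 3.0):
        interval, rate = best_interval(shape, 1000.0, 1.0, 10.0, candidates)
        baseline = run_to_failure_cost_rate(shape, 1000.0, 10.0)
        print(shape, interval, rate, baseline)
```

With a wear-out shape parameter (greater than 1) and a failure cost well above the renewal cost, the cheapest candidate interval is a finite one and beats running to failure. With a constant hazard (shape equal to 1) the cost rate only falls as the interval lengthens and never drops below the run-to-failure figure, so scheduled renewal is not worthwhile. Neither conclusion can be reached without the shape parameter and the two costs.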
8. HOW CAN RCM BE REPAIRED?
The question is really, “What can be salvaged?”
a) Initial schedules should be based upon an FMECA agreed
by OEM’s and users. This should remain focused upon
functional reliability, but consider also quality of product
and system thermodynamic efficiency.
b) Prompt feedback of failure and repair data direct to
designers is vital to improve plant and products quickly
enough to be useful in modern industrial conditions.
c) The maintenance schedules should be regularly reviewed by
Maintenance/Quality Circles as to work content and an
optimization group as to frequency.
d) The principle of LCCP [10] should inform all decisions,
including policy reviews.
e) The decision charts should be modified to require data
analysis and economic optimization, including the fusion
of routines into blocks and overhauls.
f) The need for detailed data to be collected to inform the
FMECA’s, policy choices and optimizations must be faced.
Savings are available if maintenance is treated as one
aspect of an integrated company-wide IT system.
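As one illustration of the economic balance that point e) asks of the decision charts, the standby case criticized in the previous section can be costed rather than classified as “No Loss of Function”. This is a minimal sketch under simplifying assumptions that are not in the paper: the standby has a constant failure rate while running, the duty machine's repair takes a fixed time, and a double failure incurs a lump-sum outage cost. All names and figures are hypothetical.

```python
import math


def prob_standby_fails_during_repair(standby_failure_rate, repair_hours):
    """Chance that the running standby fails before the duty machine is repaired.

    Assumes a constant standby failure rate (per hour) and a fixed repair time.
    """
    return 1.0 - math.exp(-standby_failure_rate * repair_hours)


def expected_annual_failure_cost(duty_failures_per_year, standby_failure_rate,
                                 repair_hours, cost_per_repair, cost_per_outage):
    """Expected yearly cost of duty-machine failures, including rare double failures."""
    p_double = prob_standby_fails_during_repair(standby_failure_rate, repair_hours)
    # Every duty failure costs a repair; a fraction p_double also costs an outage.
    return duty_failures_per_year * (cost_per_repair + p_double * cost_per_outage)


def pm_is_worthwhile(annual_pm_cost, failures_without_pm, failures_with_pm,
                     standby_failure_rate, repair_hours,
                     cost_per_repair, cost_per_outage):
    """True if the expected saving from fewer duty failures exceeds the PM cost."""
    without_pm = expected_annual_failure_cost(
        failures_without_pm, standby_failure_rate, repair_hours,
        cost_per_repair, cost_per_outage)
    with_pm = expected_annual_failure_cost(
        failures_with_pm, standby_failure_rate, repair_hours,
        cost_per_repair, cost_per_outage)
    return (without_pm - with_pm) > annual_pm_cost


if __name__ == "__main__":
    # Hypothetical plant: 4 duty failures a year without PM, 1 with PM.
    print(pm_is_worthwhile(30000.0, 4.0, 1.0, 0.001, 48.0, 2000.0, 500000.0))
```

The point of the sketch is that the answer turns on the outage cost and the standby's own reliability, both of which must come from data. Setting the outage cost to zero reproduces the assumption that a full-size standby removes all loss of production.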
9. SOME FINAL THOUGHTS
a) The entire structure of RCM rests upon the shaky
foundation of the faulty analyses of the 1960 data. If they
are wrong then so is much else. System ROCOF curves are
the result of the maintenance policy and cannot be used
directly to set that policy.
b) The proposition that RCM is a fad has been substantiated.
There is much over-simplification. The decision charts are
“pet rules of action”, it has become a “craze”, and the
failure patterns are “a piece of fancied enlightenment”.
c) The modifications required are so extensive that it would
not be fair still to call the result RCM. What is needed is
better described as Terotechnology [11]. Even its backers
are upgrading RCM by re-defining it, e.g. [12], and this is
causing more confusion, but it is not really salvageable.
d) Theoretical flaws always produce bad results sooner or
later; for RCM it is likely to be later. The delay whilst
policy changes work through ensures RCM is not blamed.
e) The omission of reliability and maintenance from otherwise
integrated IT systems in manufacturing companies possibly
is connected to the prevalence of unreformed RCM and its
careless view of the need for data. Lack of will to collect
data is certainly making it difficult to prove the efficacy of
modern OR models for maintenance.
REFERENCES
1. Nowlan, F.S. & Heap, H., “Reliability-Centered Maintenance”, U.S. Dept.
of Commerce (NTIS), Springfield Va., 1978.
2. MIL-STD-2173(AS), “Reliability-centered Maintenance - Requirements
for Naval Aircraft, Weapon Systems and Support Equipment”, U.S. Dept.
of Defense, Washington D.C., 1986.
3. Moubray, J., “Reliability-Centred Maintenance”, Butterworth
Heinemann, 1991.
4. Ascher, H. & Feingold, H., “Repairable Systems Reliability: Modeling,
Inference, Misconceptions and their Causes”, Basel, Marcel Dekker,
1984.
5. Sherwin, D.J. & Lees, F.P., “An investigation of the application of failure
data analysis to decision-making in maintenance of process plants”,
Proceedings of the Institution of Mechanical Engineers, vol 194, #29,
pp301-319 (in two parts), London, 1980.
6. Kelly, A. in Davidson, J. (ed), “The Reliability of Mechanical Systems”,
I.Mech.E. Guides for the Process Industries, MEP, London, 1988, 2nd
Edition 1994.
7. Al-Najjar, B., “Improvement in effectiveness of vibration-based
condition monitoring system in paper mills”, Journal of Engineering
Tribology of the I.Mech.E., MEP, 1998 (in press).
8. Sherwin, D.J. & Al-Najjar, B., “Practical models for condition
monitoring inspection intervals”, Proc. 3rd Int’l Conf. of Maintenance
Societies, Adelaide, I.E.Aust, May 1998.
9. Resnikoff, H.L., “Mathematical Aspects of Reliability-Centered
Maintenance”, Dolby Access Press, Los Altos, California, 1978.
10. Ahlmann, H., “Maintenance effectiveness and economic models in the
terotechnology concept”, Maintenance Management International, vol
4, pp131-139, 1984.
11. British Standard BS:3811, “Maintenance Management Terms in
Terotechnology”, BSI, London, 1984.
12. Creecy, M.E. & Agarwal, R., “Maximize reliability through an
optimized maintenance program: streamlined reliability-centered
maintenance”, Proc. 3rd Int’l Conf. of Maintenance Societies, Adelaide,
I.E.Aust, May 1998.
13. Cole, G.K., “Practical issues relating to statistical failure analysis of aero
gas turbines”, Proc. I.Mech.E. Conf. on Mech. Rel’y, MEP, London,
1996.
BIOGRAPHY
David J. Sherwin, MSc, PhD, CEng, MIMechE, MIPlantE.
Dept. of Industrial Engineering
Lund University Institute of Technology
PO Box 118, S-221 00, Lund, SWEDEN.
E-mail : [email protected]

David Sherwin was trained in marine engineering by the Royal Navy in
which he served for 19 years. He then took an MSc in Q&R at the University
of Birmingham and a PhD in Reliability Applied to Maintenance at
Loughborough University of Technology. After two years with Y-ARD Ltd.,
a marine and off-shore consultancy, as Senior Consultant in Reliability he
returned to Birmingham University where he taught and researched in Q&R
and Maintenance Optimization for 10 years. He was then appointed
Professor of Maintenance Engineering at Queensland University of
Technology, Brisbane, Australia, and took up his present appointment as
Professor of Terotechnology at Lund and Växjö Universities in Sweden in
1993. Dr. Sherwin is a Chartered Engineer (UK), and a member of the
Institutions of Mechanical and of Plant Engineers.


1999 PROCEEDINGS Annual RELIABILITY and MAINTAINABILITY Symposium
