
Rcm


Anonim

Reliability-Centered Maintenance (RCM), abbreviated MCC in Spanish, was developed for the civil aviation industry more than 30 years ago.

The process allows you to determine the appropriate maintenance tasks for any physical asset.

RCM has been used in thousands of companies around the world. Organizations ranging from large petrochemical companies to the world's main armed forces use it to determine the maintenance tasks for their equipment, as do companies in large-scale mining, electricity generation, oil and derivatives, metalworking, and other sectors. The SAE JA1011 standard specifies the requirements that a process must meet in order to be called an RCM process. It can be downloaded through the SAE portal (www.sae.org).

According to this standard, the 7 basic questions of the RCM process are:

1. What are the desired functions of the equipment being analyzed?

2. What are the failure states (functional failures) associated with these functions?

3. What are the possible causes of each of these failure states?

4. What are the effects of each of these failures?

5. What is the consequence of each failure?

6. What can be done to predict or prevent failure?

7. What should be done if a predictive or preventive task cannot be found?

2 RCM concepts

RCM shows that many maintenance concepts once considered correct are actually wrong. In many cases, these concepts can even be dangerous. For example, the idea that most failures occur as equipment ages has been proven false for the vast majority of industrial equipment. Several concepts derived from Reliability-Centered Maintenance are explained below, many of which are not yet fully understood by industrial maintenance professionals.

2.1 The operational context

Before starting to write the desired functions of the asset being analyzed (the first RCM question), you must have a clear understanding of the context in which the equipment operates. Two identical assets operating in different plants can end up with totally different maintenance plans if their operating contexts differ. A typical case is a standby system, which usually requires very different maintenance tasks from a main system, even though both are physically identical. Therefore, before starting the analysis, the operational context must be written down: a brief description (2 or 3 pages) stating the operating regime of the equipment, the availability of labor and spare parts, the consequences of equipment unavailability (lost or reduced production, recovery of production through overtime, outsourcing), the quality, safety and environmental objectives, etc.

2.2 Functions

RCM analysis begins with writing the desired functions. For example, the function of a pump can be defined as "Pump not less than 500 liters / minute of water". However, the pump may have other associated functions, such as "Contain water (avoid losses)". In an RCM analysis, all the desired functions must be listed.

2.3 Functional failures or failure states

Functional failures, or failure states, identify all the undesirable states of the system. For example, for a pump, three failure states could be "Unable to pump water", "Pumps less than 500 liters / minute", and "Unable to contain water". Note that the failure states are directly related to the desired functions. Once all the desired functions of an asset have been identified, identifying the functional failures is a trivial problem.

2.4 Failure modes

A failure mode is a possible cause by which equipment can reach a failure state.

For example, "worn impeller" is a failure mode that causes a pump to reach the failure state identified by the functional failure "pumps less than required". Each functional failure usually has more than one failure mode. All failure modes associated with each functional failure must be identified during the RCM analysis.

When identifying the failure modes of a piece of equipment or system, it is important to list the "root cause" of the failure. For example, if you are analyzing the failure modes of the bearings of a pump, it is incorrect* to list the failure mode "bearing failure".

The reason is that this failure mode does not give an accurate idea of why the failure occurs. Is it due to "lack of lubrication"? Is it "normal wear and tear"? Is it because of "improper installation"? Breaking the failure down into its underlying causes does give an accurate idea of why it occurs, and therefore of what could be done to handle it properly (lubrication, vibration analysis, etc.).

(* In some cases it may be appropriate to list the failure mode as "bearing failure", depending on the context in which the asset works; this is why it is important to know the operational context well.)
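The hierarchy described so far (function, functional failure, failure mode) can be captured in a small data structure. The following is a minimal Python sketch using the pump example; the class and field names are hypothetical and are not taken from SAE JA1011 or any RCM tool.

```python
from dataclasses import dataclass, field


@dataclass
class FailureMode:
    """A root cause that can lead to a functional failure."""
    cause: str


@dataclass
class FunctionalFailure:
    """An undesired state in which a function is not fulfilled."""
    description: str
    failure_modes: list[FailureMode] = field(default_factory=list)


@dataclass
class Function:
    """A desired function of the asset, with its failure states."""
    description: str
    functional_failures: list[FunctionalFailure] = field(default_factory=list)


# Pump example from the text (hypothetical variable names).
pump_function = Function(
    description="Pump not less than 500 liters/minute of water",
    functional_failures=[
        FunctionalFailure("Unable to pump water"),
        FunctionalFailure(
            "Pumps less than 500 liters/minute",
            failure_modes=[FailureMode("Worn impeller")],
        ),
    ],
)


def count_failure_modes(function: Function) -> int:
    """Count all failure modes recorded under one function."""
    return sum(len(ff.failure_modes) for ff in function.functional_failures)


def failures_without_modes(function: Function) -> list[str]:
    """List functional failures whose causes have not been identified yet."""
    return [
        ff.description
        for ff in function.functional_failures
        if not ff.failure_modes
    ]
```

A helper such as `failures_without_modes` is one way to check that every functional failure has at least one failure mode before the analysis moves on.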

2.5 The effects of failure

For each failure mode the associated failure effects must be indicated. The "failure effect" is a brief description of "what happens when failure occurs". For example, the failure effect associated with the "worn impeller" failure mode could be as follows:

"As the impeller wears, the level in the tank drops until the low-level alarm sounds in the control room. The time required to detect and repair the fault (change the impeller) is usually 6 hours. Since the tank empties after 4 hours, the downstream process must be stopped for two hours. It is not possible to recover lost production, so these two hours of downtime represent a loss of sales." Failure effects must clearly indicate how significant the failure would be if it occurred.
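The arithmetic behind this failure effect can be written out explicitly. The sketch below assumes the tank acts as a simple buffer, so the downstream process stops only once the tank is empty. The function names and the hourly margin used in the test are hypothetical, not figures from the example.

```python
def process_downtime_hours(repair_hours: float, buffer_hours: float) -> float:
    """Hours the downstream process is stopped.

    The process keeps running while the tank still holds water,
    so the buffer time is subtracted from the time needed to
    detect and repair the fault. Downtime cannot be negative.
    """
    return max(0.0, repair_hours - buffer_hours)


def lost_sales(downtime_hours: float, margin_per_hour: float) -> float:
    """Lost sales when the lost production cannot be recovered later."""
    return downtime_hours * margin_per_hour


# Worn impeller example: 6 h to detect and repair, tank empties after 4 h.
downtime = process_downtime_hours(repair_hours=6.0, buffer_hours=4.0)
print(downtime)  # 2.0
```

If the repair took less time than the tank needs to empty, the same calculation would give zero downtime, and the failure would have no operational consequence.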

2.6 Category of consequences

The failure of a piece of equipment can affect its users in different ways:

_ Putting people's safety at risk ("safety consequences")

_ Affecting the environment ("environmental consequences")

_ Increasing costs or reducing the economic benefit of the company ("operational consequences")

_ None of the above ("non-operational consequences")

In addition, there is a fifth category of consequences, for failures that have no impact when they occur unless some other failure occurs later. For example, the failure of a spare tire has no adverse consequence unless a subsequent failure occurs (a puncture in one of the tires in service) that makes it necessary to change the tire. These failures belong to the category of hidden failures.

Each failure mode identified in the RCM analysis must be classified into one of these categories. Hidden failures are first separated from evident ones, and the consequences are then evaluated in the following order: safety, environmental, operational, and non-operational. The RCM analysis branches at this stage: the treatment given to each failure mode depends on the category of consequences in which it has been classified. This is quite reasonable, since it would not be logical to treat failures that can affect safety in the same way as failures that only have economic consequences. The criteria for evaluating maintenance tasks differ when the consequences of failure differ.
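The evaluation order described above can be sketched as a short decision function. This is only an illustration of the ordering, under the assumption that each question has a simple yes/no answer. The enum, function, and parameter names are hypothetical.

```python
from enum import Enum


class Consequence(Enum):
    HIDDEN = "hidden"
    SAFETY = "safety"
    ENVIRONMENTAL = "environmental"
    OPERATIONAL = "operational"
    NON_OPERATIONAL = "non-operational"


def classify_consequence(
    evident_on_its_own: bool,
    endangers_people: bool,
    harms_environment: bool,
    affects_operations: bool,
) -> Consequence:
    """Classify a failure mode, checking the categories in order."""
    # Hidden failures are separated from evident ones first.
    if not evident_on_its_own:
        return Consequence.HIDDEN
    # Evident failures: safety, then environment, then operations.
    if endangers_people:
        return Consequence.SAFETY
    if harms_environment:
        return Consequence.ENVIRONMENTAL
    if affects_operations:
        return Consequence.OPERATIONAL
    return Consequence.NON_OPERATIONAL


# Spare tire: its failure is not evident until a tire in service is punctured.
spare_tire = classify_consequence(False, False, False, False)

# Worn impeller: the alarm makes it evident, and production is lost.
worn_impeller = classify_consequence(True, False, False, True)
```

Because the checks run in a fixed order, a failure that affects both safety and operations is classified as a safety failure, which matches the priority given in the text.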

2.7 Difference between effects and consequences of failure

The failure effect is a description of what happens when the failure occurs, while the failure consequence classifies this effect into one of 5 categories, according to the impact these failures have.

2.8 Difference between functional failure and failure modes

Functional failure identifies a failure state: unable to pump, unable to cut the part, unable to support the weight of the structure… It says nothing about the causes for which the equipment reaches that state. That is exactly what you are looking for with failure modes: identifying the causes of these failure states (shaft cut by fatigue, filter plugged by dirt, etc.).

2.9 Hidden faults

Equipment usually has protective devices, that is, devices whose main function is to reduce the consequences of other failures (fuses, smoke detectors, over-speed / temperature / pressure stop devices, etc.).

Many of these devices have the peculiarity that they can be in a failed state for a long time without anyone or anything revealing that the failure has occurred. For example, a fire extinguisher may be unable to put out a fire today, and this may go completely unnoticed as long as no fire occurs. Likewise, a pressure relief valve on a boiler can fail in such a way that it is unable to relieve pressure when the maximum pressure is exceeded, and this can go completely unnoticed as long as the failure that causes the overpressure does not occur.

If no maintenance task is done to anticipate the failure or to check whether these devices are capable of providing the required protection, the failure may only become evident when the other failure occurs, the one whose consequences the protective device is there to mitigate. For example, we may discover that the extinguisher does not work only when a fire occurs, but by then it is too late: the fire is already out of control. We may discover that the safety valve does not work only when the pressure rises and the valve does not act, but by then it is also too late: the boiler has exploded.

These types of failures are called hidden failures, since they require another failure to become apparent.

2.10 Different types of maintenance

Traditionally, it was considered that there were three different types of maintenance: predictive, preventive, and corrective. However, RCM distinguishes four different types of maintenance:

_ Predictive maintenance, also called on-condition maintenance.

_ Preventive maintenance, which can be of two types: cyclical replacement or cyclical reconditioning.

_ Corrective maintenance, also called run-to-failure.

_ Detective maintenance or "fault finding".

2.11 Predictive or on-condition maintenance

Predictive maintenance, or on-condition maintenance, consists of looking for signs or symptoms that allow a failure to be identified before it occurs. For example, visual inspection of the degree of wear on a tire is a predictive maintenance task, since it allows the failure process to be identified before the functional failure occurs. These tasks include inspections (e.g. visual inspection of the degree of wear), monitoring (e.g. vibration, ultrasound), and checks (e.g. oil level). What they have in common is that the decision to take corrective action depends on the measured condition. For example, the vibration measurement of a piece of equipment can be used to decide whether or not to replace it. For such a task to be worth evaluating, a clear potential failure condition must exist. That is to say, there must be clear symptoms that the failure is in the process of occurring.

2.12 Preventive maintenance (replacement or cyclical reconditioning)

Preventive maintenance refers to replacement or rework tasks performed at fixed intervals regardless of the condition of the element or component.

These tasks are only valid if there is a wear pattern, that is, if the probability of failure increases rapidly once the element's useful life has been exceeded. When selecting a preventive task (or any other maintenance task, in fact), great care must be taken not to confuse a task that can be done with a task that should be done. For example, when evaluating the maintenance plan for the impeller of a turbine, we could decide on a preventive task (cyclical replacement of the impeller), which in general can be done because the failure generally follows a wear pattern (pattern B of the 6 RCM failure patterns). However, in certain cases it might be more convenient to carry out a predictive task (on-condition task), which in many cases is less invasive and less expensive.

2.13 Corrective maintenance or run-to-failure

If it is decided that no proactive task (predictive or preventive) will be done to handle a failure, and that the item will instead be repaired once the failure occurs, then the maintenance chosen is corrective maintenance. When is this type of maintenance appropriate? When the cost of the failure (direct and indirect) is less than the cost of prevention, or when no proactive task can be done and a redesign of the equipment is not justified. This option is only valid if the failure has no consequences for safety or the environment. Otherwise, it is mandatory to do something to reduce or eliminate the consequences of the failure.
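The conditions in this paragraph can be sketched as a simple check. The function and parameter names are hypothetical, and the sketch assumes that both costs are expressed over the same period (for example, per year) so that they are comparable.

```python
def run_to_failure_is_acceptable(
    affects_safety_or_environment: bool,
    cost_of_failure: float,
    cost_of_prevention: float,
    proactive_task_exists: bool = True,
    redesign_justified: bool = False,
) -> bool:
    """Decide whether corrective maintenance is a valid policy.

    cost_of_failure should include direct and indirect costs, and
    both costs must refer to the same period (an assumption here).
    """
    # Never valid when safety or the environment is at stake.
    if affects_safety_or_environment:
        return False
    # With no feasible proactive task, run to failure unless a
    # redesign of the equipment is justified.
    if not proactive_task_exists:
        return not redesign_justified
    # Otherwise it is a plain cost comparison.
    return cost_of_failure < cost_of_prevention
```

For instance, a failure costing 500 per year against a prevention cost of 2,000 per year would be left to run to failure, provided it has no safety or environmental consequences.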

2.14 Detective maintenance or fault finding

Detective maintenance, or fault finding, consists of testing protective devices under controlled conditions to ensure that they will be able to provide the required protection when needed. Detective maintenance is not repairing an element that has failed (corrective maintenance), it is not replacing or reconditioning an element before the end of its useful life (preventive maintenance), and it is not looking for symptoms that a failure is in the process of occurring (predictive maintenance). Therefore, detective maintenance is a fourth type of maintenance. It is also called fault finding or bump testing, and the interval at which the task is performed is called the Failure-Finding Interval, or FFI. For example, blowing smoke at a fire detector is a detective maintenance task.

2.15 How to select the right type of maintenance?

In RCM, the selection of maintenance policies is governed by the category of consequences to which the failure belongs.

_ For faults with hidden consequences, the optimal task is one that achieves the required availability of the protection device.

_ For failures with safety or environmental consequences, the optimal task is one that reduces the probability of failure to a tolerable level.

_ For failures with economic consequences (operational and non-operational), the optimal task is one that minimizes the total costs for the organization.

Even today, many people think of preventive maintenance as the main alternative to corrective maintenance. However, RCM shows that, as an industry average, preventive maintenance is the appropriate strategy for less than 5% of failures! What to do with the other 95%? On average, an RCM analysis finds that maintenance policies are distributed as follows: 30% of failures handled by predictive (on-condition) maintenance, another 30% by detective maintenance, around 5% by preventive maintenance, 5% by redesign, and approximately 30% by corrective maintenance. This shows that one of the maxims of TPM (Total Productive Maintenance), that "all failures are bad and all must be prevented", is in fact wrong: only those failures that a careful cost-benefit analysis shows are worth preventing should be prevented.

2.16 Frequency of on-condition tasks (predictive maintenance)

For an on-condition task to be possible, there must be some identifiable physical condition that anticipates the occurrence of the failure. For example, a visual inspection of an item only makes sense if there is a symptom of failure that can be detected visually. Besides a clear symptom, the time from symptom to functional failure must be long enough to be useful. The frequency of an on-condition task is then determined from the time that elapses between the symptom and the failure. For example, if you are evaluating whether to check motor bearings for noise, the frequency will be determined by the time between the noise becoming detectable and the bearing failing. If this time is, for example, two weeks, then the task should be done at intervals shorter than two weeks, to ensure that the failure cannot develop unnoticed between successive checks. The same reasoning applies to any predictive task.
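The time between a detectable symptom and the functional failure is often called the P-F interval in RCM literature. The sketch below derives an inspection interval from it. Dividing by a safety factor of 2 is a commonly cited rule of thumb and is an assumption here, not something stated in this article. The function name is hypothetical.

```python
def max_inspection_interval(
    symptom_to_failure_days: float,
    safety_factor: float = 2.0,
) -> float:
    """Inspection interval that stays inside the symptom-to-failure time.

    symptom_to_failure_days: time between the symptom becoming
        detectable and the functional failure (the P-F interval).
    safety_factor: must be greater than 1 so that at least one
        inspection falls between symptom and failure. The default
        of 2 is an assumed rule of thumb.
    """
    if symptom_to_failure_days <= 0:
        raise ValueError("symptom-to-failure time must be positive")
    if safety_factor <= 1:
        raise ValueError("safety factor must be greater than 1")
    return symptom_to_failure_days / safety_factor


# Bearing noise example: the noise is detectable two weeks before failure.
interval = max_inspection_interval(14.0)
print(interval)  # 7.0
```

With a weekly check, the noise would be heard at least once, with roughly a week left to plan the bearing replacement.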

2.17 Frequency of cyclical replacement tasks (preventive maintenance)

A cyclical replacement task is only valid if there is a wear pattern, that is, if there is "an age at which the conditional probability of failure increases rapidly". The frequency of the replacement task depends on this age, called the useful life. For example, if the useful life of a tire is 40,000 km, then the cyclical replacement task (preventive tire change) should be performed at intervals of less than 40,000 km, in order to avoid entering the zone of high probability of failure.

2.18 Frequency of detective tasks (fault finding)

The interval at which the fault-finding task (detective maintenance) is carried out is called the FFI (Failure-Finding Interval). There is a relationship between this interval and the availability of the protective device. Mathematical tools can be used to calculate this relationship and to set the FFI that achieves the target availability.
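As an illustration of such a relationship, the sketch below covers the simplest case only: a single protective device with a constant failure rate, a test that always reveals a failed device, and a repair time that is negligible. Under those assumptions the average unavailability is approximately FFI / (2 × MTBF) when the FFI is much shorter than the MTBF. Function names are hypothetical, and real installations (redundant devices, imperfect tests) need a more detailed model.

```python
import math


def average_unavailability(ffi: float, mtbf: float) -> float:
    """Mean fraction of time the device is failed between two tests.

    Assumes exponentially distributed failures with rate 1/mtbf and
    a device that is working immediately after each test.
    ffi and mtbf must use the same time unit.
    """
    x = ffi / mtbf
    return 1.0 - (1.0 - math.exp(-x)) / x


def ffi_for_target_availability(target_availability: float, mtbf: float) -> float:
    """Approximate FFI from the linear relation U = FFI / (2 * MTBF)."""
    if not 0.0 < target_availability < 1.0:
        raise ValueError("target availability must be between 0 and 1")
    return 2.0 * (1.0 - target_availability) * mtbf


# Hypothetical device: MTBF of 10 years, target availability of 99%.
ffi_years = ffi_for_target_availability(0.99, mtbf=10.0)
check = average_unavailability(ffi_years, mtbf=10.0)
```

Here `ffi_years` comes out at about 0.2 years, and `check` confirms that the exact unavailability for that interval is just under 1%, so the linear approximation is adequate in this range.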

2.19 The place of redesign in maintenance

One bearing company had the following policy: if a failure occurred more than once, the equipment was redesigned to eliminate the cause of the failure. As a consequence, the plant operated more and more reliably, but the engineering department's costs grew rapidly. As this example illustrates, in most companies suggestions for design changes often exceed the company's ability to carry them out. Therefore, there must be a filter that distinguishes the cases where redesign is justified and recommended from those where it is not. This is why, for design changes whose objective is to avoid failures, it is usually more convenient to first evaluate whether there is some other way to handle the failures without resorting to a design change. For example, a few years later the bearing company realized that only 20% of the redesigns carried out had actually been worth it, and that for the rest there were other, more cost-effective ways of handling the failures. It should also be kept in mind that design changes are often time-consuming and expensive, and that it is not always clear whether they will be effective in alleviating the consequences of failures. In turn, in many cases redesigns introduce other failures whose consequences must also be evaluated. For all these reasons, redesign should generally be selected as the last option.

2.20 Failure patterns as a function of time

What is the relationship between the probability of failure and time? Traditionally, the relationship was thought to be very simple: the older the equipment, the more likely it is to fail. However, studies carried out in different industries show that the relationship between the probability of failure and the time or hours of operation is much more complex. There are not just one or two failure patterns but 6 different ones, as shown in the original Nowlan & Heap report (Figure 1).

Figure 1: The 6 failure patterns

The figure shows the 6 failure patterns. Each pattern represents the conditional probability of failure as a function of time.

_ Pattern A, where the failure has a high probability of occurring shortly after commissioning (infant mortality) and again after an identifiable useful life has been exceeded.

_ Pattern B, or "wear curve".

_ Pattern C, where there is a continuous increase in the conditional probability of failure.

_ Pattern D, where, after an initial stage in which the probability of failure increases, the element enters a zone of constant conditional probability of failure.

_ Pattern E, or random failure pattern.

_ Pattern F, with a high probability of failure when the equipment is new, followed by a constant and random conditional probability of failure.
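One common way to illustrate decreasing, constant, and increasing conditional probabilities of failure is the Weibull hazard function. This model is an assumption brought in for illustration and is not part of the Nowlan & Heap description above. A shape parameter below 1 gives a falling hazard (similar to the early part of pattern F), a shape of exactly 1 gives a constant hazard (pattern E), and a shape above 1 gives a rising hazard (similar to pattern C). The function names are hypothetical.

```python
def weibull_hazard(t: float, shape: float, scale: float) -> float:
    """Conditional failure rate h(t) = (shape/scale) * (t/scale)**(shape - 1)."""
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape and scale must all be positive")
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def hazard_trend(shape: float, scale: float, t_early: float, t_late: float) -> str:
    """Compare the hazard at two ages and describe its trend."""
    early = weibull_hazard(t_early, shape, scale)
    late = weibull_hazard(t_late, shape, scale)
    if late > early:
        return "increasing"
    if late < early:
        return "decreasing"
    return "constant"


# Hypothetical scale of 1000 operating hours, compared at 100 h and 1000 h.
for shape in (0.5, 1.0, 3.0):
    print(shape, hazard_trend(shape, 1000.0, 100.0, 1000.0))
```

Only an increasing hazard justifies a cyclical replacement task. With a constant or decreasing hazard, replacing a component at a fixed age does not reduce its probability of failure.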

3 Benefits of RCM

RCM implementation should lead to safe and reliable equipment, cost reductions (direct and indirect), improved product quality, and greater compliance with safety and environmental regulations. RCM is also associated with human benefits, such as better relations between different areas of the company and, above all, a better understanding between maintenance and operations.
