We all have them. They cause us to worry incessantly, lose sleep and frequently miss precious time with our families. They are often the bane of processes that require liquids to be transferred from one location to another. These mechanical monsters are pumps that fail repeatedly and are widely and unflatteringly known as bad actors.
By definition, bad actors are pumps that fail so frequently that they stand apart from the rest of the pump population. There are bad actors that have failed as many as 16 times in one year, some even more often. These troublesome machines sap precious resources from our maintenance departments and prevent us from achieving world-class reliability performance.
Chances are these troublemakers were all carefully selected by well-intentioned vendors and project engineers, and installed dutifully by construction companies. But the devil is in the details. Fatal flaws, ranging from slender shafts (poor L/D ratios) to poor operating practices, crept into these pumping systems. They crippled performance and forced pumps to lead notorious lives.
The inordinate number of failures experienced by bad actors tends to dramatically skew downward the mean time between repairs (MTBR) for a plant average. For this reason, a key strategy for improving plant MTBR starts by identifying and improving the reliability of one's most troublesome pumps. This article presents a straightforward methodology for addressing the most problematic pumps at an operating plant.
Addressing bad actors
To address bad actors, one must first define what constitutes such pumps. Usually, definitions contain a combination of failure rate and repair cost criteria. For example, one may define a bad actor as any pump that fails two or more times and has caused more than $10,000 in repair costs over the previous 12-month period. Of course, these criteria can be modified to satisfy management preferences.
Some plants also include lost opportunity costs during the same reporting period. It is possible to simplify reporting by combining repair cost and production losses into a single figure called "losses." These multiple criteria tend to cull nuisance pumps that fail many times each year but do not have a large annual repair total. By using the multiple criteria of failure rate and repair costs, one can quickly identify the pumps having the greatest impact on reliability.
Go after the top. After creating a list similar to Table 1, one simply sorts it in descending order, from the most to the least costly pump. The top 10 pumps on this list are the bad actors. The list should be compiled at regular intervals: quarterly, semi-annually or annually. It is customary to start by attacking the worst of the bad actors.
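The screening and ranking steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pump tags and dollar figures are hypothetical, the `PumpRecord` and `rank_bad_actors` names are invented for this example, and the thresholds are the example criteria quoted earlier (two or more failures and more than $10,000 in repair costs over 12 months).

```python
from dataclasses import dataclass


@dataclass
class PumpRecord:
    tag: str                            # hypothetical pump identifier
    failures_12mo: int                  # failures in the previous 12 months
    repair_cost_12mo: float             # repair cost over the same period, USD
    production_loss_12mo: float = 0.0   # optional lost opportunity cost, USD

    @property
    def losses(self) -> float:
        # Single "losses" figure: repair cost plus production losses.
        return self.repair_cost_12mo + self.production_loss_12mo


def rank_bad_actors(pumps, min_failures=2, min_repair_cost=10_000.0, top_n=10):
    """Keep pumps meeting both criteria, then sort from most to least costly."""
    flagged = [
        p for p in pumps
        if p.failures_12mo >= min_failures and p.repair_cost_12mo > min_repair_cost
    ]
    flagged.sort(key=lambda p: p.losses, reverse=True)
    return flagged[:top_n]


# Hypothetical plant data, for illustration only.
pumps = [
    PumpRecord("P-101", failures_12mo=5, repair_cost_12mo=42_000.0,
               production_loss_12mo=15_000.0),
    PumpRecord("P-102", failures_12mo=6, repair_cost_12mo=4_500.0),   # nuisance pump
    PumpRecord("P-103", failures_12mo=1, repair_cost_12mo=30_000.0),  # single costly failure
    PumpRecord("P-104", failures_12mo=3, repair_cost_12mo=18_000.0),
]

ranked = rank_bad_actors(pumps)
for p in ranked:
    print(f"{p.tag}: {p.failures_12mo} failures, ${p.losses:,.0f} in losses")
```

With this made-up data, the nuisance pump (many failures, small repair total) and the pump with a single costly failure both drop out, which is the culling effect of using two criteria at once.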
Examine the equipment history. The next steps describe more closely how to attack each bad actor. Let's examine a hypothetical data set for a bad actor. To construct a data format similar to Table 2, one needs to know the date of each failure and the repair cost for every past failure in the time frame of interest. A starting point must be defined, as well.
In the following example, the first failure occurred 15 months after the defined starting time and the repair cost was $5,000. The next failure occurred 12 months after the first failure and resulted in a repair cost of $5,500. This means that the cumulative time (third column) for the second failure was 27 months and the cumulative repair cost (fourth column) for the second was $10,500. For each subsequent failure, you keep accumulating the failure numbers, time and repair costs, as seen in the cumulative failure, time and cost columns in Table 2.
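The accumulation described above is simply a running sum. The short Python sketch below uses only the two failures quoted in the text (15 months and $5,000, then 12 months and $5,500); the function name and the tuple layout are assumptions made for this illustration and are not meant to reproduce the exact column arrangement of Table 2.

```python
from itertools import accumulate

# Each record: (months since the previous failure or since the starting point,
#               repair cost in USD). These two records come from the text.
failures = [(15, 5_000), (12, 5_500)]


def cumulative_history(records):
    """Return (cumulative failure number, cumulative months, cumulative cost) rows."""
    cum_months = accumulate(months for months, _ in records)
    cum_cost = accumulate(cost for _, cost in records)
    return [
        (number, months, cost)
        for number, (months, cost) in enumerate(zip(cum_months, cum_cost), start=1)
    ]


table = cumulative_history(failures)
for number, months, cost in table:
    print(f"failure {number}: {months} months, ${cost:,}")
```

The second row comes out at 27 months and $10,500, matching the worked figures in the paragraph. Appending later failures to `failures` extends the table in the same way.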
Plotting the cumulative failure number and cumulative repair cost value vs. the cumulative time will yield a plot similar to the one shown in Fig. 2. One might call these reliability growth plots because they clearly illustrate if the failure rate is constant or changing over time and if the rate of cost to perform maintenance is changing over time. A constant slope means the failure rate is constant, while a curving plot means the failure rate is changing. The reliability growth plot in Fig. 2 shows a constant failure rate up until months 160 to 170. After that time, the failure rate and expenditure rate begin to increase and eventually settle into a new higher failure rate for some undefined reason.
Fig. 1. A troublesome centrifugal pump.
Fig. 2. Reliability growth plot for a hypothetical
These reliability growth plots offer a wealth of information. First, the cumulative failure plot shows if the failure rate is constant or changing with time. If the failure rate did change, it tells the analyst when the change occurred. One can discover if the failure rate was always bad or if it changed at some time in the past. Similarly, examining cumulative repair cost data allows the analyst to determine if something changed in the past or if failure costs have been constant from the beginning.
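Because the slope of the cumulative failure plot is the failure rate, a change in slope can also be checked numerically by comparing the average rate before and after a suspected change point. The sketch below is one simple way to do that, not a prescribed method; the failure times are hypothetical and were chosen only to mimic a plot whose slope steepens late in life.

```python
def failure_rate_per_year(cum_months, start, end):
    """Average slope of the cumulative failure plot between two times.

    cum_months: cumulative time of each failure, in months.
    Returns failures per year in the window (start, end].
    """
    if end <= start:
        raise ValueError("end must be later than start")
    count = sum(1 for t in cum_months if start < t <= end)
    return 12.0 * count / (end - start)


# Hypothetical cumulative failure times (months), for illustration only.
cum_months = [24, 48, 72, 96, 120, 144, 168, 174, 180, 186, 192]

rate_before = failure_rate_per_year(cum_months, 0, 168)    # 7 failures in 168 months
rate_after = failure_rate_per_year(cum_months, 168, 192)   # 4 failures in 24 months

print(f"before month 168: {rate_before:.1f} failures/year")
print(f"after month 168:  {rate_after:.1f} failures/year")
```

In this made-up case the rate rises from 0.5 to 2.0 failures per year. The same comparison can be run on cumulative repair cost to see whether the spending rate changed at the same time.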
If there is a defining moment when reliability decreased, the analyst might ask what changed. Interviews with operators and mechanics allow us to find reasons for the observed change in reliability. Field personnel very often provide key insights that assist in complex root cause failure analyses (RCFAs). Among the clues, we may find mechanical and procedural changes, such as:
The nature of the process has changed
The control scheme was modified in the past
The seal flush source was modified due to process contamination concerns.
Interviewing personnel close to the equipment is a great way to uncover subtle issues that may be affecting reliability performance. Here, then, is a telling example involving pumps that were failing every few months. It was discovered that a production engineer decided to eliminate the use of an external seal flush because he felt it was contaminating the process. After convincing him to reinstate the flush at a lower, friendlier rate, seal life returned to the anticipated norm.
Suppose the general trends observed on reliability growth plots are fairly constant over the operational lives of the pumps in question. It would then be fair to assume there is something wrong with the basic design of the pumping system. Possible causes may include:
Poor L/D ratio
Poor pump selection
Excessive piping strain.
The reliability growth plots also tell reviewers how much the pumps are costing. In this particular example one can quickly conclude that $126,700 was spent over a period of 219 months. This equates to an annual rate of $6,942. The annual rate of expenditure conveys the value of solving the problem. If one were to assume that annual repair costs can be reduced to 25% of the starting value, one might expect to save about $5,200 per year. For a two-year payback needed to justify capital expenditures, spending about $10,000 on a solution would be justified.
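The arithmetic behind these figures is easy to reproduce. The sketch below uses the numbers quoted in the paragraph ($126,700 over 219 months, repair costs reduced to 25% of the starting value, two-year payback); the function names are assumptions made for this illustration.

```python
def annual_rate(total_cost, months):
    """Convert a total spent over a number of months into a yearly rate."""
    return total_cost / months * 12


def justified_spend(annual_cost, remaining_fraction, payback_years):
    """Return (annual savings, spend justified by a simple payback period).

    remaining_fraction: share of the annual cost expected to remain
    after the fix (0.25 means costs drop to 25% of the starting value).
    """
    annual_savings = annual_cost * (1 - remaining_fraction)
    return annual_savings, annual_savings * payback_years


rate = annual_rate(126_700, 219)
savings, spend = justified_spend(rate, remaining_fraction=0.25, payback_years=2)

print(f"annual repair cost: ${rate:,.0f}")
print(f"annual savings:     ${savings:,.0f}")
print(f"justified spend:    ${spend:,.0f}")
```

The unrounded results are roughly $6,942 per year in repair costs, $5,207 per year in savings and $10,414 of justified spending, consistent with the rounded figures of about $5,200 and about $10,000 used in the text.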
To ensure an acceptable return on investment, the author tries to avoid working on pumps that have annual repair and process losses below $10,000. Although it is often assumed that finding economic justification of reliability projects for pumps with annual losses less than $10,000 is next to impossible, this rule will not hold whenever simple seal improvements, bearing upgrades or procedural changes are involved.
Conducting detailed design audits and RCFAs. The next step in dealing with troublesome pumps requires conducting a design, installation and performance audit. Such an audit involves:
Reviewing the pump selection, driver selection, seal design, piping design and control system design
Conducting a detailed vibration analysis of the pump, motor and piping system
Reviewing the base plate and foundation design
Assessing current hydraulic performance vs. what was expected or ascertained on earlier occasions.
It can be said that this phase of an audit includes ascertaining that the correct pump and system design are used for the service in question. The cold eye review will often be appropriate. It refers to the fresh assessment of a system or process by an experienced, unbiased third party. This party could be another pump engineer or technician accompanying the audit engineer on his or her field visit and inspecting the pump in question.
The intent of the cold eye review is to look for anything that might be considered unacceptable. Excessive vibration, lack of piping supports, inattention to thermal growth and absence of pressure gauges are among the many things noted and requiring remedial action. After living with a problem pump for a long period, we can become oblivious to issues right in front of us. The cold eye review can help uncover potentially important issues that were overlooked by those living and working close to chronic bad actors.
Once the analyst has reviewed the failure history and conducted a design audit, some seemingly elusive contributing factors begin to stand out. The next analysis step requires us to determine the root causes of failures. It is important not to stop at a physical root cause, such as "the pump failed due to a bearing failure or shaft failure." A good investigative team will uncover any latent root causes, ones that often lurk beneath the figurative surface. The key point here is that an investigative team must be open-minded during data collection and evaluation. Parts fail for a reason, and the decisions of people led to whatever issues we now experience. Your goal is to seek the truth and back it up with science.
Determine a path, then track progress. Once the root cause and contributing factors are established by the team, it is time to formulate a plan of attack. It has been said that less is more. In other words, it is easier to sell two recommendations to management than 20 recommendations. It is also easier to implement two recommendations than 20. This doesn't mean that no more than two recommendations can be made. It simply means that, by only presenting the highest priority recommendations to management, one's chances of securing approval dramatically improve.
Don't be afraid to fail; we all fail occasionally. The best approach involves gathering lots of data, analyzing the data in exhaustive detail, and using a repeatable and structured RCFA approach. The RCFA process is a process of continuous improvement. Some problems are so complex that they may take several tries to solve.
After obtaining management approvals, it is time to implement remedial recommendations in a timely fashion and to track the benefits of the improvements. Proof of success will be seen in an updated reliability growth plot, where, hopefully, reliability improvements are manifested and sustained.
Whenever clear improvement is seen, the news deserves to be published. Management, operating personnel, and contributors will be motivated to continue working toward reducing, and even eliminating, bad actors until plant-wide MTBR targets are reached.
Critically important steps
There are seemingly insignificant buying decisions and other events that can occur during the early life of a pump that eventually lead to below-average reliability performance. However, reliability improvements and systematic upgrading of weak links can turn things around. Successful reliability improvement programs require that latent root causes be identified and corrected. Starting with one's most troublesome pumps, failure lists must be systematically reduced until world-class reliability is achieved. Remember these critical steps in bad actor reviews:
Define, list and compare
Go after the top bad actors
Examine the equipment history
Conduct a detailed audit
Perform an RCFA
Determine a path forward
Track your progress.
There will always be another pump failure from which to analyze and learn. Every failure should be considered an opportunity to learn more about equipment, processes and systems, and improve them. HP
Robert Perez is the author of Operator's Guide to Centrifugal Pumps and co-creator and editor of the PumpCalcs.com website. He has more than 30 years of rotating equipment experience in the petrochemical industry and has numerous machinery reliability articles to his credit. Mr. Perez holds a BS degree in mechanical engineering from Texas A&M University at College Station and an MS degree in mechanical engineering from the University of Texas at Austin. He holds a Texas PE license.