May 2018

Special Focus: Maintenance and Reliability

Instrumentation reliability: A systematic approach

Many challenges exist in maintaining instrumentation reliability. These obstacles include the migration from mechanical to electronic instruments, the deployment of more smart devices in harsh field environments, the openness of control systems and the difficulty of developing the right competencies. Market demand, “cutthroat” competition and the declining financial contribution of products add to the challenges faced by the processing industries. Ensuring 100% plant availability is the prime goal of a leading manufacturing unit.

Instrumentation reliability starts at the engineering stage of a plant. Enhanced plant reliability can be achieved by adopting good engineering practices. Subsequently, the desired reliability can be sustained by adhering to good maintenance systems.

In general, 100% reliability of a manufacturing unit cannot be achieved by design alone. It depends largely on the effort put into building a disciplined maintenance system, along with the seamless adoption of the best prevailing maintenance practices. In a manufacturing unit, a sustained, reliable maintenance system can be ensured by following these practices:

  • Building an equipment master covering all field instruments and instrumented systems, defining the maintenance strategy against each piece of equipment and the frequency of maintenance
  • Preparing a foolproof standard maintenance procedure (SMP) and check sheets
  • Ensuring the availability of as-built documentation
  • Conducting a root cause analysis (RCA) on all failures
  • Identifying critical/insurance spares and ensuring their availability and preservation
  • Preparing a standard job list for turnaround (TAR) and opportunity-based shutdowns
  • Safeguarding from environmental harshness
  • Preparing a safety integrity level (SIL) assessment and implementing voting logic wherever required (valid for relatively old plants)
  • Preparing plans for obsolescence mitigation and asset renewal
  • Assuring good quality of utilities, such as power and air
  • Developing competency
  • Conducting periodical audits.

Detailed insights on these practices are provided here.

Equipment master, maintenance strategy and frequency. The equipment master should contain data fields, such as the tag number, functional location, make/model/serial numbers, service description, etc. Once the equipment master is ready, it is important to define the maintenance strategy against each piece of equipment. The strategy describes the nature of periodic jobs that need to be carried out on equipment, such as predictive maintenance (PDM), preventive maintenance (PM)-major, PM-minor, etc.
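
As an illustration of the data fields and strategy assignment described above, the following is a minimal Python sketch of an equipment master record. The class names, field names, tag numbers and the `missing_strategy` helper are assumptions made for this example only; an actual equipment master would normally live in the plant's maintenance database.

```python
from dataclasses import dataclass, field
from enum import Enum


class Strategy(Enum):
    """Maintenance strategies named in the text."""
    PDM = "predictive maintenance"
    PM_MAJOR = "PM-major"
    PM_MINOR = "PM-minor"


@dataclass
class EquipmentRecord:
    """One row of a hypothetical equipment master."""
    tag_number: str
    functional_location: str
    make: str
    model: str
    serial_number: str
    service_description: str
    strategies: set[Strategy] = field(default_factory=set)


def missing_strategy(master):
    """Return tag numbers of records that have no maintenance strategy yet."""
    return [rec.tag_number for rec in master if not rec.strategies]


# Illustrative data only; tags and makes are invented.
master = [
    EquipmentRecord("FV-1001", "UNIT-10/FEED", "ExampleCo", "X1", "SN001",
                    "Feed flow control valve",
                    {Strategy.PM_MAJOR, Strategy.PM_MINOR}),
    EquipmentRecord("PT-2003", "UNIT-20/COMPRESSOR", "ExampleCo", "P9", "SN002",
                    "Suction pressure transmitter"),
]

print(missing_strategy(master))  # ['PT-2003']
```

A check of this kind makes it easy to confirm that no tag is left without a defined strategy once the master is populated.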

Generally, PM jobs scheduled for execution while the plant is running are non-invasive in nature. This type of PM job can be classified as PM-minor. PM-major jobs are treated as invasive in nature and are carried out during a plant or equipment/unit shutdown. PM-major work can also be done while the plant is running, if all precautionary measures are taken. However, this type of work should only be done in select cases.

Instrumentation PM should be managed through an enterprise resource planning (ERP) package-based database that includes an auto-scheduling feature. Typical task lists for PM-major and PM-minor of control valves are shown in TABLE 1. These task lists should be linked with the PM plan.

The PM frequency must be determined based on service/application. As a guideline, equipment that is involved in a plant’s safety system, fiscal measurement or environmental monitoring should have a PM interval that does not exceed 52 wk. Equipment in closed control loops may have an interval of between 52 wk and 204 wk. The remaining installed equipment can be scheduled on a demand basis.
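
The frequency guideline above can be expressed as a simple lookup. The sketch below is one possible encoding, not a prescribed method; the service category names (`"safety"`, `"fiscal"`, `"environmental"`, `"closed_loop"`) and the function names are assumptions introduced for illustration.

```python
from typing import Optional, Tuple

# Assumed category labels for services that the guideline caps at 52 wk.
CRITICAL_SERVICES = {"safety", "fiscal", "environmental"}


def pm_band(service: str) -> Optional[Tuple[Optional[int], int]]:
    """Return (min_weeks, max_weeks) for the PM interval.

    min_weeks is None when the guideline gives no lower bound.
    The function returns None for equipment maintained on demand.
    """
    if service in CRITICAL_SERVICES:
        return (None, 52)
    if service == "closed_loop":
        return (52, 204)
    return None


def interval_ok(service: str, weeks: int) -> bool:
    """Check a proposed PM interval against the guideline band."""
    band = pm_band(service)
    if band is None:
        return True  # on-demand equipment has no fixed interval
    low, high = band
    return (low is None or weeks >= low) and weeks <= high


print(interval_ok("safety", 26))        # True
print(interval_ok("safety", 60))        # False
print(interval_ok("closed_loop", 104))  # True
```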

PDM in instrumentation systems is synonymous with look, listen and feel (LLF). The entire plant should be divided into several functional areas. The LLF round for all installed instruments in one functional area should be scheduled within a 1-wk window and completed as a single activity. Generally, LLF of one functional location should be completed every 3 mos. LLF check sheets should be as specific as possible and may cover observations such as instruments exposed to ambient heat and harshness due to process line leakage, line vibrations impacting instrument small-bore piping, and gradual degradation of instruments that face continuous movement, among others.
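
A rotating LLF plan of this kind can be sketched in a few lines of Python. The example assumes a 13-wk cycle as an approximation of 3 mos and assigns each functional area one 1-wk window per cycle; the area names, the start date and the `llf_schedule` function are hypothetical.

```python
from datetime import date, timedelta

CYCLE_WEEKS = 13  # assumption: 13 wk approximates the 3-mo LLF interval


def llf_schedule(areas, cycle_start, cycles=1):
    """Assign each functional area a 1-wk LLF window in every cycle.

    Returns a list of (area, window_start, window_end) tuples. If there
    are more areas than weeks in a cycle, several areas share a week.
    """
    rows = []
    for cycle in range(cycles):
        for i, area in enumerate(areas):
            week_index = cycle * CYCLE_WEEKS + (i % CYCLE_WEEKS)
            start = cycle_start + timedelta(weeks=week_index)
            rows.append((area, start, start + timedelta(days=6)))
    return rows


areas = ["Utilities", "Reaction", "Storage"]  # illustrative area names
for area, start, end in llf_schedule(areas, date(2018, 1, 1), cycles=2):
    print(area, start, end)
```

Because each area keeps the same week offset in every cycle, successive LLF rounds for an area are always 13 wk apart.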

Preparing robust SMPs and check sheets. SMPs must be prepared for all routine and nonroutine jobs. Each SMP is to be translated into a check sheet, which should be filled out at the time the respective job is executed. To avoid unexpected safety incidents, risk assessment (RA) sheets must list all associated hazards and mitigation methods before work begins. SMPs and check sheets should be reviewed once every 3 yr.

Ensuring availability of as-built documentation. A company-wide system must exist to authorize and control any major or minor modification to the existing system. Sufficient control and review processes should be in place before moving ahead with a proposed modification. Once the modification is carried out, the relevant documentation must be updated before the proposal is closed. Documents to be updated include the piping and instrumentation diagram (P&ID), loop diagram, equipment specification and cause-and-effect diagram, among others.

RCA. All failures that contribute to a production loss, a safety incident or product quality loss should be analyzed to determine root causes. RCA should be done by a multidisciplinary team of experienced engineers. A suitable “why-tree analysis” should be made to derive the exact root cause of failure before recommendations are issued with an assigned responsibility and target date. A time-bound plan must be in place to implement recommendations.

Identifying and ensuring availability of critical/insurance spares. A system should be established to ensure the availability of critical and insurance spares. Critical spares are those that are used for single-line equipment. A complete instrument can also be held as a critical spare if its failure would be detrimental to plant safety, regulatory compliance or product quality/quantity. Insurance spares generally include those items that have a long lead time, are capital-intensive and are crucial to operations. Most importantly, the life of the spare should be equal to that of the parent equipment.

Preparing a standard job list for TAR and opportunity-based shutdowns. Certain jobs should be listed so they can be planned without fail during a TAR or opportunity shutdown. Those standard jobs include:

  • Control valve regulator back flushing
  • Overhauling valves that do not have a bypass/handwheel, and are installed in exotic service and/or experience frequent operations
  • Full load testing of power supplies
  • Testing of power supply network components (e.g., MCBs, contactors)
  • Checking grounding systems
  • RCA recommendations that call for a shutdown for implementation
  • Carrying out the draining of air headers at the low point
  • Taking system backups and carrying out redundancy and trip testing.

Preserving spares. Good preservation techniques extend a spare’s shelf life. Electronic cards/modules should be kept in a dust-free room at a temperature of 25°C ± 3°C and a moisture content of 65% ± 5%. These modules should be kept in antistatic bags. It is advisable to conduct a performance test every 6 mos on all modules. Generally, parts made of rubber/neoprene have a shelf life of less than 5 yr. It is recommended to scrap these parts after 5 yr.
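
The storage limits above lend themselves to an automated check. The following Python sketch reads the limits as 25°C ± 3°C and 65% ± 5%, and approximates 6 mos as 182 days and 5 yr as 1,825 days; those day counts and all function names are assumptions made for the example.

```python
from datetime import date

TEMP_RANGE_C = (22.0, 28.0)        # 25 degC +/- 3 degC
HUMIDITY_RANGE_PCT = (60.0, 70.0)  # 65% +/- 5%
TEST_INTERVAL_DAYS = 182           # assumption: about 6 mos
RUBBER_SHELF_LIFE_DAYS = 5 * 365   # assumption: 5 yr, ignoring leap days


def storage_ok(temp_c: float, humidity_pct: float) -> bool:
    """True if the store room reading is inside both limits."""
    t_low, t_high = TEMP_RANGE_C
    h_low, h_high = HUMIDITY_RANGE_PCT
    return t_low <= temp_c <= t_high and h_low <= humidity_pct <= h_high


def performance_test_overdue(last_test: date, today: date) -> bool:
    """True if a module has gone longer than the test interval untested."""
    return (today - last_test).days > TEST_INTERVAL_DAYS


def rubber_part_expired(received: date, today: date) -> bool:
    """True if a rubber/neoprene part has reached the 5-yr scrap age."""
    return (today - received).days >= RUBBER_SHELF_LIFE_DAYS


print(storage_ok(25.0, 65.0))  # True
print(storage_ok(29.0, 65.0))  # False
print(performance_test_overdue(date(2017, 10, 1), date(2018, 5, 1)))  # True
```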

Safeguarding from environmental harshness. Equipment and electrical enclosures, valve accessories and impulse tubing must be protected from adverse surroundings. A detailed plan should be in place for rain protection, winterizing, insulation or suitable supporting when required.

SIL assessment and voting logic implementation. Approximately 20 yr ago, the concept of an SIL study was not in practice. Many engineering companies did not opt for a fault-tolerant system. System-level redundancy was a far-fetched idea. Instead, engineering companies installed many excess trip initiators/loops as a fallback arrangement for safety assurance. These additional loops can be identified and removed from the trip logic, thus reducing the chances of spurious trips. Similarly, any critical loops that have a high impact can be associated with voting logic if a newly assessed SIL demands it.
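
To show how voting logic reduces spurious trips, the sketch below implements generic M-out-of-N (MooN) voting in Python. It is an illustration of the principle only; the function name is hypothetical, and real trip logic would be configured in the plant's safety logic solver according to the SIL assessment.

```python
def voted_trip(demands, m):
    """Return True if at least m of the n trip initiators demand a trip.

    demands: sequence of booleans, one per initiator (True = trip demand).
    m: number of demands required to trip, e.g., m=2 with three
       initiators gives 2oo3 voting.
    """
    n = len(demands)
    if not 1 <= m <= n:
        raise ValueError("m must be between 1 and the number of initiators")
    return sum(bool(d) for d in demands) >= m


# 2oo3: a single faulty initiator does not trip the plant.
print(voted_trip([True, False, False], 2))  # False
# 2oo3: two initiators agreeing causes the trip.
print(voted_trip([True, True, False], 2))   # True
# 1oo1: a single-initiator loop trips on any demand, real or spurious.
print(voted_trip([True], 1))                # True
```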

Obsolescence mitigation plan and asset renewal. It is important to be prepared to tackle obsolescence. A lifecycle plan for every major piece of equipment should be in place. While making a lifecycle document, the original equipment manufacturer’s (OEM’s) notice on obsolescence and assured periods for supply of spares are to be considered. The replacement/upgrade plan is to be prepared based on the OEM’s document, available spares in the warehouse, installation base and consumption pattern.

Ensuring good quality of utilities. Power and instrument air are the lifelines of instrumentation systems, and it is important to maintain the quality of both. DC power is mostly used in instrumentation systems. It is important to monitor earth leakage current on a continuous basis. Biannual thermography is prescribed for power network junctions. Each bulk DC power supply must be tested at full load during each TAR. The integrity of the grounding system needs to be checked during each TAR, as well.

To maintain instrument air quality, it is important to monitor the dewpoint at the supply and consumer ends on a regular basis. If the instrument air line is made of carbon steel, it is essential to purge the line at every TAR, as well as to carry out a flame test to detect any contamination of the instrument air with iron particles. This test should be conducted once per year.

Competency development. In nearly every industry, one-third of all failure incidents are due to poor workmanship. These incidents are primarily due to a gap in competency and/or an inadequate skill set. An exhaustive program needs to be created to address job-specific competency and skill set requirements. First, a competency catalogue should be developed to indicate the required competencies for a particular maintenance position. This catalogue will help identify the training modules that each individual needs to complete. Training needs are to be identified based on the gap assessment between the competency demanded by a particular job position and the knowledge and skills of the individual occupying that position. A robust validation system is essential to ensure that the program is effective.
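
The gap assessment described above amounts to comparing a required level with an assessed level for each competency. The following Python sketch assumes competencies are scored on a simple numeric scale; the scale, the competency names and the `training_gaps` function are hypothetical.

```python
def training_gaps(required, assessed):
    """Return {competency: shortfall} where the assessed level is below
    the level the position requires.

    required: {competency: level} from the competency catalogue.
    assessed: {competency: level} for the individual; a missing entry
              is treated as level 0.
    """
    gaps = {}
    for competency, level in required.items():
        shortfall = level - assessed.get(competency, 0)
        if shortfall > 0:
            gaps[competency] = shortfall
    return gaps


# Illustrative catalogue entry for one maintenance position (0-4 scale).
required = {"control valve overhaul": 3, "loop checking": 3, "SIS proof testing": 2}
assessed = {"control valve overhaul": 3, "loop checking": 1}

print(training_gaps(required, assessed))
# {'loop checking': 2, 'SIS proof testing': 2}
```

Each nonzero shortfall points to a training module the individual should complete before the next validation.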

Periodical audits. A well-established audit system should be built to ensure zero deviation from the established system. The audit process is expected to cover the following:

  • General housekeeping
  • Documentation
  • Bypass status of trip initiators
  • Maintenance KPI status
  • Competency validation status for both company staff and contract workers
  • RCA quality and recommendation status.

Takeaways. These practices are guidelines only. The reliability of installed instrumented systems can be achieved through a disciplined approach by the maintenance crew. It should be kept in mind that equipment reliability primarily depends upon the extent of abuse it faces during operation. All pieces of equipment are designed for an intended function within a specific boundary of operational parameters. A system to monitor any violation of this boundary needs to be developed and followed. Any out-of-window operation must have a sound engineering rationale before it is allowed to continue.