## June 2018

## Special Focus: Process Optimization

# Use advanced predictive analytics for early detection and warning of column flooding events

In this study, a methodology was implemented to predict crude distillation tower flooding events based on key process variables, including product yields, column pumparound (PA) flowrates, column temperatures and overhead reflux flowrate. A logistic regression model was selected as the predictive tool due to its ability to differentiate flooding events from non-events, as well as the ease of implementation.

As column flooding events occurred very infrequently—with an incidence rate of about 0.25% for the distillation tower considered—a data mining oversampling technique was used to improve the model sensitivity to flooding events. Several logistic regression models were constructed using different time lags of the predictor variables. The performance results of the different models were compared using receiver operating characteristic (ROC) curves, which provide the tradeoff between model event detection rate, also known as the true positive rate (TPR), and the model false alarm rate, also known as the false positive rate (FPR). The final model selected was validated against a stationary monitoring study and radioisotope scans of the column. The model was then implemented in a data historian system to provide early warning to engineering and operations of potential flooding events for preventive action.

## Focusing the study

A crude distillation unit (CDU) takes a crude stream and separates it into boiling point fractions, which include naphtha, kerosine, diesel and tower bottoms residue.1 A schematic of the atmospheric crude column utilized in this study is shown in **FIG. 1.** This particular crude tower is a typical, mild atmospheric operation [327°C (620°F), 38 psig flash zone] with 21 rectifying trays, a packed gasoil wash section and a shrouded stripping section. Overhead light naphtha is routed to a gas plant, and atmospheric tower bottoms are routed to a fluidized catalytic cracking unit (FCCU). Two side-draw products (kerosine and diesel) from the crude tower are further processed within the refinery. The tower has two PAs (kerosine and diesel) utilized to control product cutpoints according to downstream specifications. In a tower with this number of trays at this specific tray spacing, plus a packed bed, an overall column pressure drop of 3.2 psid is normal. Higher pressure drop, particularly when abruptly increased, typically signals accumulation of liquid and flooding.

This study focuses on column flooding events that lead to unstable operations, internal liquid accumulation (flooding) and, ultimately, off-spec products. These events can be costly, often requiring a reduction in crude rate to stabilize unit operations.

The refinery reported two types of operational upsets, both characterized by high column pressure drop. In the first, heavy material contaminated the diesel product (“black diesel”), and bottoms product level was lost; and in the second, the diesel pumparound draw Tray 21 “dried up,” resulting in lost flow to the PA and product pumps. In both cases, the tower pressure drop would abruptly increase from a typical 3.2 psid to more than 5 psid, signifying liquid accumulation, or “flooding.”

An upset creating black diesel suggests entrainment of heavy components from the flash zone. The normal diesel draw temperature is 293°C (560°F). With a 321°C (610°F) flash zone temperature, bottoms residue is below its boiling point—50% point by ASTM simulated distillation of 449°C (840°F)—and thereby in the liquid phase. The presence of very heavy hydrocarbon in the diesel product can be attributed only to liquid droplets originating from the flash zone, or a leak in a cross exchanger with crude. This type of entrainment is typically caused by one of several factors:

- Very high superficial velocities in the flash zone, which is more common in vacuum towers where the vapor density of the two-phase feed is very low.
- Mechanical failure of the feed inlet device, creating shear of droplets or other hydraulic restriction. This tower has a vapor horn, which in other units has resulted in mechanical damage from a major process upset. This is again much more common in a vacuum tower, where process upsets often create much higher uplift force due to rapid expansion of vapor under vacuum pressure.
- Accumulation of liquid from the stripping section that subsequently cannot pass down the tower and, if it reaches the two-phase feed inlet line, will create severe entrainment. The tray downcomers in the stripping section of this crude tower operate at near-liquid-hydraulic capacity during normal bottoms residue production rates.

The role of the diesel PA is to utilize cross-exchange with the cold crude feed to condense the distillate-range material coming up the tower, to be either pulled as diesel product or overflowed from the draw tray to the trough-style distributor below the draw that feeds the packing bed. Internal reflux from the kerosine fractionation section also contributes liquid to these trays. To “dry up” both the diesel draw and the diesel PA, one of several possible operating scenarios must exist:

- The heat balance in the tower is impacted to where distillate-range material is no longer condensed by sufficient amounts to overflow the tray and be pulled at the product draw. This would be characterized first by an increase in vapor temperature below the diesel draw tray when overflow is lost, and can be caused by loss of diesel PA duty or by loss of internal reflux from the kerosine section.
- Physical restriction of a tray within the diesel PA section to where the returning diesel PA stream is backed up above the draw tray, robbing the tray of inventory.
- Equipment damage of the diesel draw tray to where liquid bypasses the tray and dumps directly onto the packed bed below. Bypassing of subcooled liquid directly onto a packed bed below has very low residence time and creates a large quench of the flash zone.2 This will be evidenced by loss of temperature in the flash zone, and either higher column bottoms level or accumulation of liquid above the stripping section (if hydraulically limited).
- Distillate-range material can be held up or otherwise prevented from reaching the diesel section from below as a result of loss of “lift” from the flash zone, as caused by significant loss of heat input to the tower (e.g., heater trip).

The predictive model considered the process variables that represent the conditions influencing these two operational upset scenarios. These variables include product yields, column temperatures, column PA flows and overhead reflux. Several time lags of the predictor variables were considered to predict the column flooding events. Classification and regression trees (CARTs)3 and logistic regression4 modeling methods were considered. Operating data for 9 mos at 1-min time intervals were used to build the models. A column flooding event was defined as operations with an overall tower pressure drop (tower dP) exceeding 5 psi.

The methodology used in the study comprised the following steps:

- Collect actual column operating data at 1-min time intervals
- Classify observations as a flooding event if the column pressure drop exceeds a specified threshold (5 psi in this study)
- Create oversampled data sets using synthetic minority oversampling technique (SMOTE) method5
- Build logistic regression models using oversampled data sets
- Evaluate the model’s ability to predict actual flooding events using ROC curves4
- Examine behavior of key process variables prior to the event to validate the model.
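The first two steps, together with the time-lagging of predictors, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and it assumes gap-free 1-min rows so that a lag in minutes equals a lag in rows.

```python
from typing import List, Sequence, Tuple

FLOOD_DP_PSI = 5.0  # event threshold on tower dP used in the study


def label_events(tower_dp: Sequence[float],
                 threshold: float = FLOOD_DP_PSI) -> List[int]:
    """Return 1 where tower dP exceeds the threshold, otherwise 0."""
    return [1 if dp > threshold else 0 for dp in tower_dp]


def make_lagged_pairs(predictors: Sequence[Sequence[float]],
                      labels: Sequence[int],
                      lag_min: int) -> List[Tuple[Sequence[float], int]]:
    """Pair the predictor row at minute t - lag_min with the label at minute t.

    Assumes one row per minute with no gaps (an assumption of this sketch).
    """
    if lag_min < 0:
        raise ValueError("lag_min must be non-negative")
    if len(predictors) != len(labels):
        raise ValueError("predictors and labels must have the same length")
    return [(predictors[t - lag_min], labels[t])
            for t in range(lag_min, len(labels))]


# Toy data: five minutes of tower dP and a single hypothetical predictor.
dp = [3.2, 3.3, 3.1, 5.4, 5.6]
x = [[0.14], [0.13], [0.12], [0.11], [0.10]]
labels = label_events(dp)
pairs = make_lagged_pairs(x, labels, lag_min=2)
```

With a 2-min lag, the predictor row from minute 1 is paired with the event flag at minute 3, so the model is trained to recognize conditions that precede the event rather than conditions during it.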

This article is organized according to the following sections: model screening; model tuning; model validation using the results of a radio isotope scan of the column, along with a stationary monitoring study; and study conclusions.

## Model screening

Operations where the overall tower dP exceeded 5 psi were considered to be a column flooding event. **FIG. 2** shows the tower dP as a function of time. Several events were observed from November 2015–January 2016, as evident from the significant rise in tower dP. A binary variable was created to designate when a flooding event occurred. Observations when the tower dP exceeded 5 psi were assigned a value of 1; otherwise, the value assigned was 0. All flowrates were expressed as ratios relative to the crude rate because ratios predicted flooding events better than absolute flowrates, reducing the number of false positive signals at a given event detection rate.
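Expressing flows as ratios to crude rate is a simple normalization. The sketch below is illustrative only; the stream names and numbers are hypothetical, not taken from the unit data.

```python
from typing import Dict


def to_crude_ratios(flows: Dict[str, float], crude_rate: float) -> Dict[str, float]:
    """Divide each flowrate by the crude charge rate (same units for both)."""
    if crude_rate <= 0:
        raise ValueError("crude_rate must be positive")
    return {name: value / crude_rate for name, value in flows.items()}


# Hypothetical one-minute snapshot, all flows in the same units as crude rate.
ratios = to_crude_ratios({"kerosine": 14.0, "reflux": 28.0}, crude_rate=100.0)
```

Normalizing this way keeps a predictor such as kerosine yield comparable across periods of different throughput.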

As mentioned, the CART classification tree method and logistic regression were considered as predictive modeling tools. The CART method was implemented with the R rpart package.6 The logistic regression method was implemented using the SAS procedure PROC LOGISTIC.4 A CART model was built using 30-min time-lagged data of the predictors. **FIG. 3** shows the CART classification tree results.

The model predicts a column flooding event when the terminal node value is 1, and a non-flooding event when the value is 0. The tree clearly shows that low kerosine yield, along with high reflux ratio, was associated with a significant portion of the flooding events: approximately 23% of the events (216 of 947) were observed when the kerosine yield was less than 14% and the reflux ratio exceeded 28%. However, close examination of the results shows that this method failed to differentiate 409 of the 947 events, so 43% of the events went undetected. This is equivalent to a 57% true positive rate (TPR), also known as sensitivity, which represents the percentage of true events correctly predicted by the model.
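As a check on the arithmetic, the TPR follows directly from the event counts reported for the tree. The symbols TP (detected events) and FN (missed events) are notation introduced here for illustration:

$$
\mathrm{TPR} = \frac{TP}{TP + FN} = \frac{947 - 409}{947} = \frac{538}{947} \approx 0.57
$$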

Logistic regression models were constructed using 10-min, 20-min and 30-min lagged data. The logistic regression model predicts the probability of an event occurring based on a set of predictors. The ROC curves for the three logistic regression models are shown in **FIG. 4.** The ROC curve provides the tradeoff between the TPR and the false positive rate (FPR), also known as 1 − specificity, for a model. The TPR represents the percentage of the flooding events correctly predicted by the model. The FPR represents the percentage of non-events incorrectly predicted as events by the model.

Each point on the ROC curve represents a different probability cutoff to predict an event. Note that, as expected, the model built using the 10-min lagged data outperformed the other two. Also note that the TPR for the 30-min lagged logistic regression model is more than 80% at a 3% FPR, which clearly outperformed the CART classification tree that yielded a 57% TPR. Based on the improved results, as well as the ease of implementation, the logistic regression modeling method was chosen.
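To make the relationship between probability cutoff, TPR and FPR concrete, the sketch below computes ROC points from predicted probabilities and observed labels. It is a plain-Python illustration with toy data; the function name is hypothetical, and the study itself used SAS for this analysis.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float],
               labels: Sequence[int]) -> List[Tuple[float, float, float]]:
    """Return (cutoff, FPR, TPR) for each distinct score used as a cutoff.

    An observation is predicted as an event when its score >= cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one event and one non-event")
    points = []
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((cutoff, fp / n_neg, tp / n_pos))
    return points


# Toy example: four observations, two of which are true events.
points = roc_points(scores=[0.9, 0.8, 0.4, 0.2], labels=[1, 0, 1, 0])
```

Lowering the cutoff moves along the curve toward higher TPR at the cost of higher FPR, which is the tradeoff used to compare the lagged models.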

## Model tuning

The low incidence rate of column flooding events (0.25%) for the column considered made it challenging to predict these events with traditional methods. To improve the model sensitivity to flooding events, the SMOTE oversampling method was used. This technique creates synthetic instances of events by interpolating between randomly sampled event observations that lie in close proximity, which differs from other oversampling methods where oversampling is limited to repeating existing observations. The SMOTE method was implemented using the R DMwR package.5 An oversampled data set with 10% of the data representing events was created using the SMOTE method. A significant improvement in the model event detection rate (TPR at a given FPR) was observed when using 30-min time-lagged data. The significant improvement observed using the SMOTE method allowed the use of predictor data 90 min before the event for earlier detection.
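The interpolation idea behind SMOTE can be sketched in a few lines. This is a simplified illustration, not the DMwR implementation used in the study: the function name, the default of five neighbors and the fixed seed are assumptions, and it handles only the minority (event) class.

```python
import math
import random
from typing import List, Sequence


def smote_sketch(minority: Sequence[Sequence[float]],
                 n_synthetic: int,
                 k: int = 5,
                 seed: int = 0) -> List[List[float]]:
    """Create synthetic minority rows by interpolating toward near neighbors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority observations")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbor = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Two toy event rows; every synthetic row lies on the segment between them.
new_rows = smote_sketch([[0.0, 0.0], [1.0, 1.0]], n_synthetic=10)
```

Because each synthetic row lies between two real event observations, the oversampled set adds variety near the observed events instead of duplicating them.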

**FIG. 5** shows the ROC curves for the four scenarios:

- 30-min lagged data with original data set (ORIGI30_D)
- 30-min lagged data with 10%/90% oversampling scheme (SM30_1090)
- 90-min lagged data with original data set (ORIGI90_D)
- 90-min lagged data with 10%/90% oversampling scheme (SM90_1090).

Note that the 30-min and 90-min models using the oversampled data sets significantly outperformed the models using the original data sets. The 90-min time-lagged model using the oversampled data set was selected over the 30-min time-lagged model to provide earlier detection of flooding events and adequate lead time for preventive action.

Odds ratio charts were also generated to identify key variables associated with flooding events. The odds are defined as the probability of an event occurring divided by the probability of an event not occurring. The odds ratio represents the factor by which the odds change when the predictor value is increased by one unit. **FIGS. 6** and **7** provide the odds ratio charts for the 30-min time-lagged data model and the 90-min time-lagged data model built with the oversampled data sets, respectively. Note that for the 30-min time-lagged data model, reflux ratio and kerosine yield appear to be the most significant factors impacting the odds ratio.

An increase in the reflux ratio by one unit increases the odds by close to 70%, while an increase in kerosine yield decreases the odds by more than 50%. Conversely, a decrease in kerosine yield by one unit increases the odds by more than 100%. For the 90-min time-lagged data model, a one-unit increase in tower bottoms yield increases the odds by about 50%, while a one-unit decrease in kerosine yield increases the odds by more than 100%. The 90-min time-lagged data model identified an increase in bottoms yield and a decrease in kerosine yield as early indicators of a flooding event. The 30-min time-lagged data model identified increasing reflux ratio and decreasing kerosine yields as late indicators of a flooding event.
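The percentages quoted above follow from the standard form of a logistic regression model. The notation below ($p$ for event probability, $x_i$ for predictors, $\beta_i$ for fitted coefficients) is introduced here for illustration and does not appear in the study:

$$
\ln\frac{p}{1-p} = \beta_0 + \sum_i \beta_i x_i,
\qquad
\mathrm{OR}_i = \frac{\text{odds}(x_i + 1)}{\text{odds}(x_i)} = e^{\beta_i}
$$

The percentage change in the odds for a one-unit increase is $(\mathrm{OR}_i - 1) \times 100\%$, so an increase of close to 70% corresponds to $\mathrm{OR} \approx 1.7$. A one-unit decrease multiplies the odds by $1/\mathrm{OR}_i$. This is why a predictor whose one-unit increase cuts the odds by more than 50% ($\mathrm{OR} < 0.5$) raises the odds by more than 100% when it decreases by one unit ($1/\mathrm{OR} > 2$).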

**FIGS. 8a** and **8b** illustrate the tower dP and the predicted probability from the 90-min time-lagged data model, along with trends of several of the key process variables for one of the events. The analysis assumed a 0.3 probability cutoff to predict the events, and the model predicted the event before the tower dP crossed the 5-psi threshold. Note that the tower bottoms yield had been rising prior to the event and the kerosine yield had been declining, which is consistent with the behavior observed in the odds ratio chart of the 90-min time-lagged data model. Also note that, prior to the event reaching the maximum tower dP, the reflux ratio rose and the kerosine yield declined, which is consistent with the odds ratio chart of the 30-min time-lagged data model. **FIGS. 9a** and **9b** illustrate similar behavior for another event.

## Model validation using independent study with isotope scanning and stationary monitoring

Radio isotope scans and stationary source monitoring are commonly used to identify the specific location of a hydraulic restriction(s) within a tower. For this column, both troubleshooting methods were utilized, as the proper identification of the area of the tower inducing the flood events was necessary to develop a scope for repair or other maintenance.

The standard radio isotope scan indicated several areas of concern (**FIG. 10**). The tray above the diesel draw collector tray (Tray 21) showed abnormally low liquid height. A normal liquid loading would be consistent with that shown on Tray 20 above it. It is possible that some type of damage to this tray contributed to a hydraulic restriction of liquid down the tower.

Also noteworthy is the height of clear liquid on the collector tray located below the feed inlet vapor horn (above bottoms stripping Tray 22). The liquid height exceeds the internal riser height, and may cause entrainment of liquid up past the flash zone.

Because flooding is a dynamic event, it can be difficult for a full elevation isotope scan to pinpoint a flooding event. Unless the event was captured during its incipient phase, the specific initiating location cannot be identified. For such scenarios, a stationary monitoring study utilizing a single isotope source with multiple detectors is employed. **FIG. 11** highlights the design and the results of the stationary monitoring study. For this study, sources were placed at locations at each of the key collector trays within the vapor space. By beginning the monitoring study with all detectors reading clear vapor, the sequence by which the liquid accumulates in the tower can be determined.

During the baseline condition, the tower was unflooded and the stationary detectors show absorption consistent with clear vapor. The diesel product draw setpoint was stepped down on flow control. By pulling less diesel out of the tower for a fixed PA duty, more liquid overflows the diesel draw collector tray and will ultimately end up as a higher bottoms residue rate. The induced liquid traffic load to the bottom section of the tower proceeded normally at first, with the temperature indication within the packed bed decreasing, as additional vapor is quenched by the cooler distillate refluxing down the tower below the diesel draw tray.

However, as the bottoms product rate reached 26 Mbpd, the pressure drop in the column began to increase, and Detector 3 began to read higher absorption consistent with liquid level reaching into the vapor space above the collector. It is hypothesized that the tower is sensitive to higher bottoms liquid rates, which is a function of not only the crude charge rate, but also the internal reflux from the kerosine and distillate sections. Therefore, reduced product draw rates as a percentage of charge rate can induce liquid accumulation in the tower.

## Takeaway

A predictive model has been developed to provide early detection and warning of potential column flooding events to engineering and operations. A logistic regression model was chosen over other methods due to its higher event detection rate and ease of implementation. The model performance was improved by using a data mining oversampling technique. The analysis highlights the importance of determining the appropriate time lag of the predictor variables for early detection. The model predicted 83% of the actual events with a 2.5% false positive rate using the 90-min time-lagged data of the predictors.

The variables identified as the best predictors of a flooding event were confirmed using a stationary monitoring study and radio isotope scanning methods. The model and the associated process monitoring using predicted probabilities were implemented in a data historian system. Flooding event conditions are automatically detected when the predicted probability exceeds a 0.3 threshold. The operating conditions are recorded when this threshold is exceeded and automatic notifications are generated. To reduce the number of false positives generated by the model, the predicted probability is required to exceed the 0.3 threshold for 5 consecutive min before a signal is generated. The model has successfully predicted several flooding events and operational upsets over the last 9 mos of operations, with minimal false positives. **HP**
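The persistence rule described above can be sketched as a simple run counter. This is an illustration only: the function name is hypothetical, it assumes one predicted probability per minute, and it does not represent the actual data historian configuration.

```python
from typing import Iterable, List


def persistent_alarm(probabilities: Iterable[float],
                     cutoff: float = 0.3,
                     hold_min: int = 5) -> List[bool]:
    """Flag each minute at which the probability has exceeded the cutoff
    for at least hold_min consecutive 1-min samples."""
    run = 0
    alarms = []
    for p in probabilities:
        run = run + 1 if p > cutoff else 0  # reset on any sample at or below cutoff
        alarms.append(run >= hold_min)
    return alarms


# Toy trace: a 2-min excursion is ignored; a 5-min excursion raises the signal.
trace = [0.1, 0.4, 0.5, 0.2, 0.35, 0.4, 0.5, 0.6, 0.7, 0.1]
alarms = persistent_alarm(trace)
```

The hold time trades a few minutes of warning for fewer nuisance alarms, since short spikes in predicted probability never reach the required run length.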

**Note**

The views expressed in this article are those of the authors and do not necessarily reflect the views of Valero Energy Corp.

**Literature Cited**

1. Gary, J. H., G. E. Handwerk and M. J. Kaiser, Petroleum Refining: Technology and Economics, 5th Ed., CRC Press, 2007.
2. Hanson, D. W., J. Brown Burns and M. Teders, “How does a collector tray leak impact column operation? Comparing the impact of two leaking FCC light cycle oil collector trays,” AIChE Spring Meeting, Distillation Conference, Austin, Texas, 2014.
3. Hastie, T., R. Tibshirani and J. Friedman, The Elements of Statistical Learning, 2nd Ed., Springer-Verlag, New York, 2009.
4. Allison, P. D., Logistic Regression Using SAS: Theory and Application, 2nd Ed., SAS Publishing, 2012.
5. Chawla, N. V., K. W. Bowyer, L. O. Hall and W. P. Kegelmeyer, “SMOTE: Synthetic minority over-sampling technique,” Journal of Artificial Intelligence Research, Vol. 16, pp. 321–357, 2002.
6. R package rpart: https://cran.r-project.org/web/packages/rpart/rpart.pdf
7. R package DMwR: https://cran.r-project.org/web/packages/DMwR/DMwR.pdf
8. Isotope scanning by Tracerco, part of the Process Technologies Division of Johnson Matthey.
9. Brown Burns, J., K. Becht and B. Mueller, “Troubleshooting crude column constraints,” PTQ, 3Q 2017.
