Most of us will have come across the saying ‘data is the new oil’. Although the phrase is becoming something of a loosely used buzzword in industrial circles, it is gaining traction for good reason.

I have been working in the applied and industrial artificial intelligence (AI) field for the best part of 20 years. To me, turning data into insights is the bridge between theory and real-life practice, and an opportunity to unlock the potential in data that has already been collected. It also helps us to measure and reduce uncertainty and to test assumptions against reality. It provides an unbiased, repeatable and factual grounding from which we can change for the better.

In the offshore industry, we are only at the tip of the iceberg when it comes to unlocking the potential held inside unstructured data. If leveraged effectively, these data-driven insights can inform decisions across an enormous scope of critical business functions, from improved operations and informed planning to focused training and personnel development. The potential to advance safety performance is also significant.

Part of my recent work with ABS has looked into the role data science can play in underpinning offshore asset maintenance operations and driving condition-based maintenance decisions. 

Unplanned downtime, as we know, is costly for all offshore operators. A study by Baker Hughes found that 1% of unplanned downtime (i.e. 3.65 days a year) costs offshore oil and gas organizations on average $5.037 million annually. The industry averages a little over 27 days of downtime every 12 months, which translates into costs of about $38 million—for the worst performers, figures are upwards of $88 million. 
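
The arithmetic behind these figures can be checked with a short sketch. It assumes, purely for illustration, that cost scales linearly with the number of downtime days; the function names are hypothetical and do not come from the study.

```python
# Figures quoted in the text.
DAYS_PER_YEAR = 365
COST_PER_PERCENT_MUSD = 5.037  # cost of 1% unplanned downtime, $ million per year


def days_per_percent() -> float:
    """Number of days that correspond to 1% of a year."""
    return DAYS_PER_YEAR / 100


def annual_cost_musd(downtime_days: float) -> float:
    """Scale the per-percent cost to a number of downtime days.

    Linear scaling is an assumption made for this illustration.
    """
    return downtime_days / days_per_percent() * COST_PER_PERCENT_MUSD


print(round(days_per_percent(), 2))    # 3.65
print(round(annual_cost_musd(27), 1))  # 37.3
```

Exactly 27 days gives roughly $37.3 million under this assumption, which is consistent with the quoted figure of about $38 million for "a little over 27 days".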

Why (quality) data matters. 

Predictive maintenance, which anticipates problems so that they can be addressed before they lead to a failure, is a critical part of the solution.

Similarly, condition-based maintenance (CBM) is a maintenance strategy in which decisions about what work needs to be carried out are based on the actual condition of an asset. Under CBM, maintenance should only be performed when certain indicators are triggered and when it is economically optimal. In other words, maintenance is performed when there are signs of decreasing performance and/or upcoming failure, and at a time and location that are economically optimal.
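
As a rough sketch of that two-part rule, the snippet below combines an indicator check with a simple cost comparison. The field names, the single threshold and the cost comparison are all assumptions made for illustration; a real CBM program would use richer indicators and economics.

```python
from dataclasses import dataclass


@dataclass
class AssetStatus:
    indicator: float         # monitored value, e.g. a vibration level (hypothetical)
    threshold: float         # level at or above which the indicator is triggered
    cost_now: float          # estimated cost of maintaining at this opportunity
    cost_if_deferred: float  # estimated cost of waiting for the next opportunity


def indicator_triggered(status: AssetStatus) -> bool:
    """True when the monitored indicator has reached its threshold."""
    return status.indicator >= status.threshold


def should_maintain(status: AssetStatus) -> bool:
    """Maintain only if the indicator is triggered AND acting now is not more costly."""
    return indicator_triggered(status) and status.cost_now <= status.cost_if_deferred


pump = AssetStatus(indicator=7.2, threshold=6.0, cost_now=40.0, cost_if_deferred=95.0)
print(should_maintain(pump))  # True
```

Both conditions must hold: a triggered indicator alone does not justify the work if deferring it to a better opportunity is cheaper.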

The need for CBM arises in part from the challenges of time-based maintenance practices, and in part from the need to reduce uncertainty during maintenance events, safely extend the service life of equipment and achieve maximum availability.

Indeed, maintenance strategy has evolved from corrective maintenance to preventive maintenance to predictive and condition-based maintenance, with each step helping to remove more uncertainty for decision makers while maintaining safety standards.

It is important to stress, however, that condition-based maintenance is not a replacement for subject matter experts (in fact, CBM relies on their input for training and on their experiential knowledge to guide improvements). Rather, it is a methodology, not a tool, designed to inform maintenance strategies.

Recent developments in this field are enabling new opportunities for marine and offshore operators to adopt a more effective asset management strategy - the crux of this strategy being to combine data analytics with historical data and operational experience to reduce unplanned downtime and achieve higher operational availability. 

This involves fusing the data generated from operations and historical maintenance, combining diverse information from sources such as equipment design, sensor time series, inspection records, performance reports and class-survey reports. From this, an understanding of observed failure trends and emergent risks can be gained, which in turn provides the data-driven insights needed to underpin CBM. 

One of the largest obstacles to obtaining value from this exercise is that maintenance history and operator observational data is typically unstructured. As a result, CBM insights tend to be based mostly on structured sensor parameter data and on in-situ or offline tests such as vibration and oil quality analysis.

In response to this problem, part of my recent work has been looking at ways to better unlock the value of unstructured data, usually stored in an operator’s computerized maintenance management system (CMMS), repair and spare logs, and other repositories where unstructured data gets generated in an operation.

A typical offshore CMMS allows users to input free-text status reports, and many of the drop-down fields have missing or incomplete entries. The problem with free-text fields is that they are written in natural language (the way a person would speak, and/or using community-specific terminology), which makes them a major contributor to the poor quality of data generally found in a CMMS.

Specific examples of data quality issues include: non-standard abbreviations used by different operators and crew; inconsistent equipment taxonomy; leaving critical CMMS fields blank due to lack of time or knowledge; common spelling and grammar mistakes; variation in sentence structures used to describe the same situation, etc.  
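
To make a few of these issues concrete, here is a minimal sketch of how free-text entries might be cleaned before analysis. The abbreviation table, vocabulary and field names are invented for illustration and do not reflect any real CMMS; spelling correction uses Python's standard `difflib` as a simple stand-in for more capable methods.

```python
import difflib
import re

# Hypothetical lookup tables; a real program would build these with domain experts.
ABBREVIATIONS = {"pmp": "pump", "brg": "bearing", "repl": "replaced", "vlv": "valve"}
VOCABULARY = ["pump", "bearing", "replaced", "valve", "leaking", "seal"]


def normalize(text: str) -> str:
    """Lowercase, expand known abbreviations and fix near-miss spellings."""
    cleaned = []
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        token = ABBREVIATIONS.get(token, token)
        if token not in VOCABULARY:
            # Replace the token only if a vocabulary word is a close match.
            close = difflib.get_close_matches(token, VOCABULARY, n=1, cutoff=0.8)
            if close:
                token = close[0]
        cleaned.append(token)
    return " ".join(cleaned)


def missing_fields(record: dict, required: list) -> list:
    """Return the required fields that are absent, None or blank."""
    return [name for name in required if not str(record.get(name) or "").strip()]


print(normalize("PMP brg leeking"))  # pump bearing leaking
record = {"equipment_id": "P-101", "failure_code": "  ", "description": None}
print(missing_fields(record, ["equipment_id", "failure_code", "description"]))
# ['failure_code', 'description']
```

Lookup tables like these only cover the cases someone has already anticipated, which is one reason the variation in sentence structure calls for the learned models discussed later in the article.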

All of this means that datasets must be analyzed to extract useful information locked in an unstructured form before further analyses can be performed—a process which can be the difference between uncovering a systemic problem and letting it slip through the net due to the nature of the data. 

Historical maintenance records have been used to train AI algorithms capable of working with unstructured free-form data to automate the task of identifying maintenance action types, differentiating maintenance scope and also isolating maintenance triggers. This results in faster, more repeatable and more accurate analysis, which in turn helps to identify emergent issues and model the reliability risks in an asset fleet.

This formed a key part of our recent studies, which involved developing advanced methods to perform natural language processing customized to the unique problem domain of marine maintenance. 

We concentrated efforts on measuring, identifying and improving data quality and extracting relevant information from unstructured textual data in a form that can then be used for further model building. Several models with diverse hypothesis spaces and complexity were tested for building model ensembles. These included randomization-based methods, kernel-based techniques, probabilistic models and instance-based learning ideas. 
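
The idea of combining diverse models into an ensemble can be illustrated with a toy sketch. This is not the method used in the studies: the training records are invented, the three models are deliberately simple standard-library stand-ins (a probabilistic naive Bayes, an instance-based nearest neighbour and a nearest-centroid model added here for the sake of a third vote), and the randomization-based and kernel-based families are left out for brevity.

```python
import math
from collections import Counter, defaultdict

# Invented records for illustration only: (free text, maintenance action type).
TRAIN = [
    ("replaced worn bearing on pump", "replace"),
    ("replaced cracked valve seal", "replace"),
    ("renewed filter element on compressor", "replace"),
    ("inspected pump found no defects", "inspect"),
    ("visual inspection of valve ok", "inspect"),
    ("checked compressor oil level ok", "inspect"),
    ("repaired leaking pipe flange", "repair"),
    ("welded crack on bracket repaired", "repair"),
    ("repaired damaged cable on motor", "repair"),
]


def tokens(text):
    return text.lower().split()


class NaiveBayes:
    """Probabilistic model: multinomial naive Bayes with add-one smoothing."""

    def fit(self, data):
        self.word_counts = defaultdict(Counter)
        self.class_counts = Counter()
        for text, label in data:
            self.class_counts[label] += 1
            self.word_counts[label].update(tokens(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())

        def score(label):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            log_p = math.log(self.class_counts[label] / total)
            return log_p + sum(math.log((counts[w] + 1) / denom) for w in tokens(text))

        return max(sorted(self.class_counts), key=score)


class NearestNeighbour:
    """Instance-based model: label of the most similar record (Jaccard overlap)."""

    def fit(self, data):
        self.data = [(set(tokens(text)), label) for text, label in data]
        return self

    def predict(self, text):
        query = set(tokens(text))
        return max(self.data, key=lambda d: len(query & d[0]) / len(query | d[0]))[1]


class NearestCentroid:
    """Centroid model: cosine similarity to each class's summed word counts."""

    def fit(self, data):
        self.centroids = defaultdict(Counter)
        for text, label in data:
            self.centroids[label].update(tokens(text))
        return self

    def predict(self, text):
        query = Counter(tokens(text))

        def cosine(label):
            centroid = self.centroids[label]
            dot = sum(query[w] * centroid[w] for w in query)
            norm = math.sqrt(sum(v * v for v in query.values())) * math.sqrt(
                sum(v * v for v in centroid.values()))
            return dot / norm if norm else 0.0

        return max(sorted(self.centroids), key=cosine)


def ensemble_predict(models, text):
    """Majority vote; on a tie, the label voted for earliest in model order wins."""
    votes = Counter(model.predict(text) for model in models)
    return votes.most_common(1)[0][0]


MODELS = [NaiveBayes().fit(TRAIN), NearestNeighbour().fit(TRAIN), NearestCentroid().fit(TRAIN)]
print(ensemble_predict(MODELS, "replaced bearing on motor"))  # replace
```

The point of the vote is that models with different assumptions tend to make different mistakes, so agreement between them is more trustworthy than any single prediction.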

Following this work, we now have a generalizable, accurate and automated modeling process to extract insights from otherwise unstructured free-text information across diverse marine and offshore assets. In so doing, we have also developed a set of AI methods to address the data quality challenge posed by unstructured, operations-generated data sources.

This presents many advantages. First, it facilitates faster and more reliable data processing, and provides robustness against variations in expression of semantics which are commonplace in marine and offshore working environments. 

Furthermore, it makes data fusion possible: multiple sources of data can be used seamlessly as one, rather than in silos.

Trust the data—adopting a data-driven but domain-guided decision-making paradigm.

By adopting some or all of these measures to enhance data quality and derive maximum value from condition-based maintenance, offshore operators could reverse much of the disruption and financial cost caused by unplanned downtime.

Furthermore, the insights extracted can inform other fundamental business operations such as human development and training. For original equipment manufacturers (OEMs), they can highlight common faults that can be fixed at source on the production line before the equipment even reaches an end-user, or identify equipment issues before they become widespread across the deployed fleet, prompting timely design, controls or configuration enhancements.

At the same time, however, the added value brought to offshore maintenance strategies by data science must also be met with a willingness to make risk-based decisions and embrace changes. 

As well as deepening collaboration between offshore operators, class societies such as ABS and OEMs towards a standardized equipment hierarchy, the industry needs to adopt a more condition-based paradigm to enable optimized decision making. Buy-in from business leaders is fundamental if value from data is to be extracted. 

But this is not just a one-way conversation. We as data scientists must also accept that we must do a better job of applying our findings to live situations. It is therefore incumbent upon us as data scientists to be cognizant of problems such as overfitting, confusing correlation with causation, and over-estimating the ability of our models to generalize. The onus should also be on us to communicate this clearly to offshore decision makers who operate in the real world.

Indeed, it will need to be a collaborative effort between various stakeholders. The industry needs to start using insights from data-driven programs, with subject matter experts and operators playing their part in the algorithm-building process alongside the data scientists. This can only improve trust and adoption.

Arguably the most important input, however, will come from the top. Utilizing and, critically, acting upon data-driven condition-based recommendations requires deliberate top-down support from executive leadership. Data and AI must become part of the C-suite realm if a real, lasting impact is to be made.

Subrat Nanda is the chief data scientist at ABS and leads data science and advanced analytics efforts for marine, offshore and government clients.