As the oil and gas industry became more digital, it became more important to establish open standards for its scientific, engineering and operations aspects. Energistics was formed to tackle this challenge.

The company facilitates and maintains three flagship data exchange standards for the upstream oil and gas industry: WITSML (drilling and well data), PRODML (production operations data) and RESQML (earth model data).

This year will see a near-synchronous new release of all three standards plus an update to the Energistics Transfer Protocol (ETP). Each will be a completely revised version (known as v2.x). WITSML v2.0 features a revised set of data objects and a new application programming interface (API) that will provide true real-time data transfer. PRODML v2.0 includes three new components: distributed acoustic sensing data; pressure, volume and temperature fluid sampling and analysis; and a simplified production volume report. RESQML v2.1 is a bug fix update that also incorporates previous “add-on” data objects into the main design packages for properties, streamlines and activity models.

The most important enhancement in this next generation of standards is the new Common Technical Architecture (CTA). The CTA is the foundation across the standards that effectively enables a user to use any combination of standards to address 21st century integrated operations requirements. The CTA makes it possible for the individual domains to be treated as one standard, allowing data to be transferred among all domains in a seamless manner. An example of this would be the sending of a PRODML production report that refers to a WITSML well and completion. The CTA is built on other open standards to avoid licensing or other potential intellectual property issues and to maximize the likelihood that member companies can find experienced development resources familiar with them.

XML

eXtensible Markup Language (XML) is the familiar technology used in the existing Energistics standards suite. The Energistics markup languages will continue to use XML but in a simplified style more comfortable to developers.

UML

Unified Modeling Language (UML) is now used internally by the membership and Energistics staff to collaborate around a graphical depiction of a process or exchange data model. Pictures with boxes and arrows are much easier for subject matter experts to understand and comment on during the standards development process. The XML schemas (and potentially other transfer technologies) are generated directly from the UML diagrams, boosting productivity and quality by avoiding transcription mistakes.

HDF5

Hierarchical Data Format v5 (HDF5) is a standard widely used in the scientific community for the transfer of large volumes of floating-point data. A downside of XML is that an XML document can be many times larger than simpler formats because of the extra bytes (tags) associated with the self-documenting features of XML. HDF5 is close to as small as a dataset can be at the expense of human readability. But for moving seismic volumes or grids or other large datasets, this trade-off is deemed to be worthwhile.
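The size trade-off described above can be illustrated with a minimal sketch that uses only the Python standard library. It does not use HDF5 itself: plain packed 64-bit floats stand in for binary storage, and the element names `values` and `v` are hypothetical, chosen only for illustration.

```python
import random
import struct
import xml.etree.ElementTree as ET


def encode_as_xml(values):
    """Wrap each value in its own <v> element under a <values> root."""
    root = ET.Element("values")
    for value in values:
        ET.SubElement(root, "v").text = repr(value)
    return ET.tostring(root)


def encode_as_binary(values):
    """Pack the values as consecutive little-endian 64-bit floats."""
    return struct.pack(f"<{len(values)}d", *values)


random.seed(0)
samples = [random.random() for _ in range(1000)]

xml_bytes = encode_as_xml(samples)
bin_bytes = encode_as_binary(samples)

# The binary form needs exactly 8 bytes per value; the XML form adds
# tags and a decimal text rendering of every number.
print(f"XML: {len(xml_bytes)} bytes, binary: {len(bin_bytes)} bytes")
```

The binary bytes cannot be read by eye, which is the readability cost mentioned above, but unpacking them recovers the original values exactly.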

OPC

Open Packaging Conventions (OPC) organize the XML files involved in a transfer and reduce their size. In the case of an earth model, a data transfer could involve hundreds of thousands of small XML documents. It helps if these documents can be organized in a folder structure and zipped to reduce the file sizes. There is an existing standard that does just that: the OPC used by Microsoft Office suite files (the familiar pptx, docx, xlsx, etc. files). This file format has been standardized as part of ECMA-376 and ISO/IEC 29500 and has broad support in Visual Studio and other development toolsets.
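The folder-and-zip idea can be sketched with Python's standard `zipfile` module. This is only an illustration under stated assumptions: a real OPC package also carries a `[Content_Types].xml` part and relationship parts, which are omitted here, and the part names and XML content below are hypothetical.

```python
import io
import zipfile


def make_document(index):
    """Build one hypothetical XML document with repetitive content."""
    points = "".join(
        f"<point><x>{j}</x><y>{j * 2}</y></point>" for j in range(200)
    )
    return f'<horizon id="{index}">{points}</horizon>'


def package_documents(documents):
    """Zip a mapping of {path: xml_text} into one in-memory archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path, xml_text in documents.items():
            archive.writestr(path, xml_text)
    return buffer.getvalue()


# Organize the parts in a folder structure inside the package.
documents = {f"horizons/horizon_{i:04d}.xml": make_document(i) for i in range(50)}

package = package_documents(documents)
raw_size = sum(len(text.encode("utf-8")) for text in documents.values())

with zipfile.ZipFile(io.BytesIO(package)) as archive:
    names = archive.namelist()
    first_part = archive.read(names[0]).decode("utf-8")

print(f"{len(names)} parts, {raw_size} bytes raw, {len(package)} bytes packaged")
```

Compression pays off here because each document is repetitive. For very small documents the per-entry overhead of the zip container can outweigh the savings, so the main benefit in that case is organization rather than size.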

EPC

Energistics Packaging Convention (EPC) is a formal customization of OPC, called a “profile,” developed by Energistics. The EPC standard specializes OPC by making some items that are optional in OPC mandatory in EPC. OPC (and thus EPC) has wide support among developers, and EPC is mandatory for RESQML v2.

WebSocket

WebSocket is a standard related to HTML5 that is used to stream data over the Internet. In a normal Internet browser session the user types a URL into the browser, which identifies a web server the user wants to connect to (say, http://www.amazon.com/) and the resource the user would like the web server to send (e.g., the page called index.html). The web server listening on the other end then sends the requested resource. This process of call and response is repeated over and over as a user chooses further links on a webpage.

With WebSocket, the user connects to a web server and asks it to upgrade the connection to a persistent, two-way channel. The server can then stream data over that open connection as fast as it can, without waiting for further requests. Data are streamed until either party closes the connection or the connection is broken.

The WebSocket protocol is being used in the next generation of Energistics standards to support true real-time streaming of data from a server to a client, whether the server is a drilling rig, a producing well or an application serving data to a series of consumers.
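A WebSocket session begins as an ordinary HTTP request that asks the server to switch protocols, after which the connection stays open for streaming. The sketch below shows only that opening handshake as defined in RFC 6455, using the Python standard library. It opens no network connection, and the host name and path are hypothetical.

```python
import base64
import hashlib
import os

# Fixed identifier defined by RFC 6455 for computing the accept key.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def build_upgrade_request(host, path, client_key):
    """Return the HTTP request a client sends to open a WebSocket."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {client_key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",
    ]
    return "\r\n".join(lines)


def accept_key(client_key):
    """Compute the Sec-WebSocket-Accept value the server must reply with."""
    digest = hashlib.sha1((client_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# The client key is 16 random bytes, base64-encoded.
client_key = base64.b64encode(os.urandom(16)).decode("ascii")
request = build_upgrade_request("rig.example.com", "/stream", client_key)
print(request)

# Sample key and expected answer taken from RFC 6455.
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

Once the server answers with the matching accept key, both sides may send frames at any time, which is what makes continuous streaming possible.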

Avro

When communicating in a spoken language, sentences are recited one word at a time because humans are unable to say multiple words simultaneously. Serial data protocols like WebSocket have the same limitation. For data to be transmitted over a wire, they have to be broken up into bytes and each byte broken into bits. One process breaks up the data into a known sequence to put on the wire, and a mirror-image process reassembles the data into meaningful structures at the receiving end. This pair of processes is broadly called serialization.
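A minimal sketch of this round trip, using Python's standard `struct` module, is shown here. The record layout (a timestamp followed by two measurements) and the field names are assumptions made for illustration and are not taken from any Energistics standard.

```python
import struct

# Layout both ends agree on in advance: an unsigned 64-bit timestamp in
# milliseconds followed by two 64-bit floats, all in big-endian byte order.
RECORD_LAYOUT = struct.Struct(">Qdd")


def serialize(timestamp_ms, depth_m, hookload_kn):
    """Flatten one record into a known sequence of bytes for the wire."""
    return RECORD_LAYOUT.pack(timestamp_ms, depth_m, hookload_kn)


def deserialize(payload):
    """Reassemble the bytes into the original (timestamp, depth, hookload)."""
    return RECORD_LAYOUT.unpack(payload)


wire_bytes = serialize(1_700_000_000_000, 2543.5, 812.25)
restored = deserialize(wire_bytes)

print(len(wire_bytes), restored)
```

The receiver can only interpret the 24 bytes correctly because it knows the same layout as the sender. A fixed layout like this one has to be agreed on before any data is sent.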

Avro is a data serializer developed and maintained by the Apache open-source community. One feature is that the structure of the data doesn’t have to be fixed in advance: the sender simply informs the listening client of the structure of the data it is about to receive. This gives upstream oil and gas the flexibility it needs to make changes at a rig site or producing well dynamically. JavaScript Object Notation (JSON) is used by Avro to describe the structure of the incoming data stream.
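The sketch below illustrates the idea of a JSON schema driving the byte layout. It is a hand-written toy encoder, not the Apache Avro library, and it handles only three primitive types (`long`, `double`, `string`) following the binary encoding in the Avro specification. The record name `ChannelSample` and its fields are hypothetical.

```python
import json
import struct

# The schema is plain JSON and can be sent ahead of the data it describes.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "ChannelSample",
  "fields": [
    {"name": "channelId", "type": "long"},
    {"name": "mnemonic", "type": "string"},
    {"name": "value", "type": "double"}
  ]
}
"""


def encode_long(number):
    """Zigzag-encode a signed 64-bit integer as a variable-length integer."""
    zigzag = (number << 1) ^ (number >> 63)
    out = bytearray()
    while zigzag & ~0x7F:
        out.append((zigzag & 0x7F) | 0x80)
        zigzag >>= 7
    out.append(zigzag)
    return bytes(out)


def encode_string(text):
    """Length as a long, followed by the UTF-8 bytes."""
    data = text.encode("utf-8")
    return encode_long(len(data)) + data


def encode_double(number):
    """Eight bytes, little-endian IEEE 754."""
    return struct.pack("<d", number)


ENCODERS = {"long": encode_long, "string": encode_string, "double": encode_double}


def encode_record(schema, record):
    """Concatenate the encoded fields in the order the schema lists them."""
    return b"".join(
        ENCODERS[field["type"]](record[field["name"]]) for field in schema["fields"]
    )


schema = json.loads(SCHEMA_JSON)
payload = encode_record(schema, {"channelId": 1, "mnemonic": "ROP", "value": 1.0})
print(payload.hex())
```

The payload carries no field names or tags. The receiver reads the schema first and uses it to interpret the bytes, so changing the schema changes what can be sent without changing the encoder.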

ETP

Energistics Transfer Protocol (ETP) is the name given to a new standard that uses WebSocket, Avro and JSON to transfer true real-time and static data from a sender to a receiver. No server is required. The protocol is a simple API consisting of messages passed between a sender and a receiver to initiate and close sessions, identify available data, initiate transfer of some subset of that data and perform other functions.
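The general shape of such a message-based API can be sketched as follows. This is a toy illustration only: the message types (`open_session`, `list_data`, `request_data`, `close_session`), the dictionary format and the channel names are hypothetical and are not the message definitions in the ETP specification.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    """Toy endpoint that holds data and answers session messages."""

    available: dict
    session_open: bool = False

    def handle(self, message):
        kind = message["type"]
        if kind == "open_session":
            self.session_open = True
            return {"type": "session_opened"}
        if not self.session_open:
            return {"type": "error", "reason": "no open session"}
        if kind == "list_data":
            # Identify the data that is available.
            return {"type": "data_list", "names": sorted(self.available)}
        if kind == "request_data":
            # Transfer only the requested subset.
            values = {
                name: self.available[name]
                for name in message["names"]
                if name in self.available
            }
            return {"type": "data", "values": values}
        if kind == "close_session":
            self.session_open = False
            return {"type": "session_closed"}
        return {"type": "error", "reason": f"unknown message type {kind!r}"}


endpoint = Endpoint(available={"depth": 2543.5, "hookload": 812.25, "rop": 18.0})

replies = [
    endpoint.handle({"type": "open_session"}),
    endpoint.handle({"type": "list_data"}),
    endpoint.handle({"type": "request_data", "names": ["depth", "rop"]}),
    endpoint.handle({"type": "close_session"}),
]
for reply in replies:
    print(reply)
```

In this sketch each message gets one reply. In a streaming protocol, a single request could instead be followed by a continuing series of data messages until the session is closed.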

ETP was initially developed for WITSML v2.0; WITSML v1.4.1.1 does not provide true real-time transfer, and streaming data was a fundamental requirement of the new version. PRODML and RESQML will eventually also use it. ETP can be used for any kind of data transfer from the field to the office or between applications in the office. It also can be used for any sensor-based machine-to-machine application.

EIP

Energy Industry Profile (EIP) is an oil industry metadata standard. EIP is a “profile” of ISO 19115-3, which is a geospatial metadata standard. EIP applies beyond geospatial data and is used by all of Energistics’ other standards.

Standard naming, other conventions

The final component of the CTA is standard naming and other conventions plus units of measure and schema components shared across all Energistics standards. Much of this work is derived from prior work in Energistics and from widely respected development shops willing to publicize their internal coding standards, like Microsoft and Google.

The next generation of standards, version 2, brings the Energistics standards into the future with the latest technology available. This version of the standards is much more robust because the components are integrated, the standards are more programmer-oriented and there is new workflow support built on the enhanced underlying technology.