Data Quality Considerations for Petrophysical Machine-Learning Models
Decades of subsurface exploration and characterization have led to the collation and storage of large volumes of well-related data, and the amount gathered daily continues to grow rapidly as technology and recording methods improve. With the increasing adoption of machine-learning techniques in the subsurface domain, the quality of the input data must be carefully considered when working with these tools. Poor-quality input data can significantly degrade the precision and accuracy of a prediction, which in turn can affect key decisions about the future of a well or a field.

This study focuses on well-log data, which can be highly multidimensional, diverse, and stored in a variety of file formats. Well-log data exhibit the key characteristics of big data: volume, variety, velocity, veracity, and value. Well data can include numeric values, text values, waveform data, image arrays, maps, and volumes, all of which can be indexed by time or depth in a regular or irregular way. A significant portion of time can be spent gathering and quality checking data before carrying out petrophysical interpretations and applying machine-learning models.

Well-log data can be affected by numerous issues that degrade data quality. These include missing data ranging from single data points to entire curves, noisy data from tool-related issues, borehole washout, processing issues, incorrect environmental corrections, and mislabeled data. Having vast quantities of data does not mean that all of it can be passed into a machine-learning algorithm with the expectation that the resultant prediction is fit for purpose. It is essential that the most important and relevant data are passed into the model through appropriate feature selection techniques. Not only does this improve the quality of the prediction, but it also reduces computational time and can provide a better understanding of how the models reach their conclusions.
This paper reviews data quality issues typically faced by petrophysicists when working with well-log data and deploying machine-learning models. This is achieved by first providing an overview of machine learning and big data within the petrophysical domain, followed by a review of the common well-log data issues, their impact on machine-learning algorithms, and methods for mitigating their influence.
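
As a rough illustration of the missing-data issue described in the abstract (gaps ranging from single data points to entire curves), the following Python sketch summarises how much of each curve is missing and how long the longest gap is. It is not taken from the paper: the function names, the example curve values, and the use of -999.25 as the null placeholder (a value commonly seen in LAS files) are all assumptions made for the example.

```python
import math

# Assumed null placeholder; -999.25 is common in LAS files but not universal.
NULL_VALUE = -999.25


def is_missing(value, null_value=NULL_VALUE):
    """Return True if a sample is None, NaN, or equal to the null placeholder."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return value == null_value


def missing_report(curves, null_value=NULL_VALUE):
    """Per curve, report the fraction of missing samples and the longest gap.

    `curves` maps a curve name to a list of samples on a shared depth index.
    The longest gap is measured in consecutive samples, not depth units.
    """
    report = {}
    for name, samples in curves.items():
        flags = [is_missing(v, null_value) for v in samples]
        longest = current = 0
        for flag in flags:
            current = current + 1 if flag else 0
            longest = max(longest, current)
        # An empty curve is treated as entirely missing.
        fraction = sum(flags) / len(samples) if samples else 1.0
        report[name] = {"fraction_missing": fraction, "longest_gap": longest}
    return report


# Hypothetical example data: three curves sampled at six depths.
logs = {
    "GR": [45.0, 50.2, -999.25, -999.25, 61.3, 58.9],
    "RHOB": [2.45, 2.47, 2.50, float("nan"), 2.48, 2.46],
    "NPHI": [-999.25] * 6,  # an entirely missing curve
}

report = missing_report(logs)
for name, stats in report.items():
    print(name, stats)
```

A summary like this could help decide, before any model is trained, which curves are complete enough to be candidate input features and which gaps are short enough to consider filling. The thresholds for those decisions are left to the interpreter and are not implied by the sketch.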
Society of Petrophysicists and Well Log Analysts (SPWLA)

