Conditional Subsampling of Legacy Boreholes for Subsurface Model Validation
Practical constraints often force modellers to rely on legacy data rather than on new data collected under a sampling design tailored to subsurface modelling. While these pre-existing datasets enable model development and gap identification, their spatial density and distribution may not meet the desired resolution or precision. Consequently, strategic subsampling for calibration and validation is essential to ensure a robust and accurate performance assessment of the resulting models. Cross-validation techniques are commonly applied to maximize data utility, but in spatial modelling they yield overoptimistic performance estimates with high variance, particularly when data are clustered. Probability-based sampling is known to reduce bias, but its effectiveness remains poorly understood for spatially sparse and clustered legacy data.
This research evaluates the impact of subsampling methods on the validation of spatial interpolation techniques. Conditional and random subsampling are compared across different subsample sizes in terms of actual model performance, with particular attention to geostatistical methods that account for spatial autocorrelation in subsurface data. Legacy boreholes spanning more than a century, with a sparse and clustered spatial distribution, were queried to model peat content in 3D. Conditioning relied on 2D legacy attributes such as age, spatial coordinates, and target feature statistics. We also investigated how the complexity of spatial variation (represented by models with varying anisotropic autocorrelation) influenced performance by populating the existing borehole configuration with three 3D target features: two synthetic datasets that are more spatially continuous and one heterogeneous real field dataset.
First results suggest that the variance of the validation results decreased only in the heterogeneous case, and only when the validation subset was large enough (35%) and the cumulative peat content within a borehole was included as a 2D conditioning attribute. These results underscore the robustness of conditioned probabilistic subsampling relative to alternative validation methods for legacy-based modelling.
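The contrast between random and conditional subsampling can be illustrated with a minimal sketch. The abstract does not specify the conditioning algorithm, so the code below uses stratified random sampling on quantile bins of two borehole attributes as a simple stand-in. The attribute names (`years`, `peat`), the synthetic values, the bin count, and the helper functions are all assumptions made for illustration; only the 35% validation fraction comes from the text.

```python
import random
from collections import defaultdict

def quantile_bin(values, n_bins):
    """Assign each value a rank-based quantile-bin index in [0, n_bins - 1]."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = min(rank * n_bins // len(values), n_bins - 1)
    return bins

def random_subsample(n, fraction, rng):
    """Simple random validation subset: indices drawn without replacement."""
    return sorted(rng.sample(range(n), round(n * fraction)))

def conditional_subsample(attributes, fraction, n_bins, rng):
    """Stratified stand-in for conditioning (hypothetical helper).

    attributes: one list of values per 2D borehole attribute.
    Boreholes are grouped by their combination of quantile bins, and the
    same fraction is drawn at random from every group, so the subset
    follows the joint distribution of the attributes.
    """
    n = len(attributes[0])
    binned = [quantile_bin(col, n_bins) for col in attributes]
    strata = defaultdict(list)
    for i in range(n):
        strata[tuple(b[i] for b in binned)].append(i)
    chosen = []
    for members in strata.values():
        k = round(len(members) * fraction)  # per-stratum rounding
        chosen.extend(rng.sample(members, k))
    return sorted(chosen)

# Synthetic, purely illustrative borehole attributes (not real data).
rng = random.Random(0)
n = 200
years = [rng.randint(1900, 2020) for _ in range(n)]   # drilling year
peat = [rng.uniform(0.0, 5.0) for _ in range(n)]      # cumulative peat per borehole

val_random = random_subsample(n, 0.35, rng)
val_cond = conditional_subsample([years, peat], 0.35, n_bins=3, rng=rng)
print(len(val_random), len(val_cond))
```

Because each stratum is rounded separately, the conditional subset size is only approximately 35% of the boreholes. Repeating both draws many times and comparing the spread of a validation metric would mirror the variance comparison described above.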