Practical notes on lossy compression of scientific data
<p>Lossy compression methods are extremely efficient in terms of space and performance: they reduce the network bandwidth and disk space needed to store data arrays without sacrificing the number of stored values. Lossy compression involves an irreversible transformation of data that reduces the information content of the data. The transformation introduces a distortion that is normally measured in terms of absolute or relative error. The error is higher for higher compression ratios. A good choice of lossy compression parameters maximizes the compression ratio while keeping the introduced error within acceptable margins. Negligence or failure to choose the right compression method or its parameters leads to a poor compression ratio or to loss of the data.</p><p>A good strategy for lossy compression involves specifying the acceptable error margin and choosing the compression parameters and storage format. We will discuss specific techniques of lossy compression and illustrate pitfalls in the choice of error margins and of tools for lossy/lossless compression. The following specific topics will be covered:</p><p>1. Packing of floating-point data to integers in NetCDF is sub-optimal in most cases, and for some quantities leads to severe errors.<br>2. Keeping relative vs absolute precision: a false alternative.<br>3. The acceptable error margin depends on both the origin and the intended application of the data.<br>4. Smart algorithms to decide on compression parameters have a limited area of applicability, which has to be considered in each individual case.<br>5. Choice of a format for compressed data (NetCDF, GRIB2, Zarr): a tradeoff between size, speed and precision.<br>6. What "number_of_significant_digits" and "least_significant_digit" mean in terms of relative/absolute error.<br>7. Bit-Shuffle is not always beneficial.</p>
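Topics 1 and 2 can be illustrated with a small NumPy sketch. It is not taken from the talk: the synthetic field, the function names and the choice of 10 kept mantissa bits are assumptions made for illustration. The sketch compares linear packing to 16-bit integers (in the style of the NetCDF `scale_factor`/`add_offset` convention, which bounds the absolute error) with rounding of float32 mantissa bits (which bounds the relative error) on a field spanning eight orders of magnitude.

```python
import numpy as np

def pack_linear16(x):
    """Pack to int16 with a scale and offset derived from the data range.
    A sketch of the scale_factor/add_offset idea, not library code."""
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / (2**16 - 2)      # one code left free for a fill value
    offset = 0.5 * (hi + lo)
    packed = np.round((x - offset) / scale).astype(np.int16)
    return packed, scale, offset

def unpack_linear16(packed, scale, offset):
    """Invert pack_linear16 up to the quantization error."""
    return packed.astype(np.float64) * scale + offset

def round_mantissa(x, keepbits):
    """Round float32 values to `keepbits` explicit mantissa bits (1..22).
    Assumes finite values; the dropped bits end up as zeros."""
    bits = np.asarray(x, dtype=np.float32).view(np.uint32)
    drop = 23 - keepbits
    half = np.uint32(1 << (drop - 1))                     # half of the kept step
    mask = np.uint32((0xFFFFFFFF << drop) & 0xFFFFFFFF)   # zero the dropped bits
    return ((bits + half) & mask).view(np.float32)

# Hypothetical field with a wide dynamic range (e.g. a concentration).
rng = np.random.default_rng(0)
x = 10.0 ** rng.uniform(-6.0, 2.0, size=100_000)

packed, scale, offset = pack_linear16(x)
x_lin = unpack_linear16(packed, scale, offset)
abs_lin = np.abs(x_lin - x)
rel_lin = abs_lin / x

keepbits = 10
x_bits = round_mantissa(x, keepbits).astype(np.float64)
rel_bits = np.abs(x_bits - x) / x

print("linear packing: max abs error", abs_lin.max(), "max rel error", rel_lin.max())
print("mantissa rounding: max rel error", rel_bits.max())
```

In this setup the linear packing keeps the absolute error below half a quantization step, but small values collapse onto the lowest few integer codes and lose most of their relative precision. Mantissa rounding keeps the relative error near `2**-(keepbits + 1)` across the whole range. The rounded array still occupies 32 bits per value; the size reduction only appears once a lossless compressor is applied to the trailing zero bits.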

