Optical character recognition based document image quality assessment

Optical Character Recognition (OCR) systems play a crucial role in digitizing documents. However, their performance deteriorates significantly on low-quality images. Even advanced OCR systems struggle if the input is visually or structurally poor. Achieving high OCR accuracy therefore requires assessing document image quality in terms of how well characters can be recognized, not just visual clarity. In this work, we propose a Document Image Quality Assessment (DIQA) model that predicts OCR accuracy without running an OCR engine. To assess document image quality for OCR performance, we extract twelve distinct features that capture various aspects of sharpness, focus, edge clarity, and structural distortion. Instead of relying on subjective human opinion scores, we generate labels by measuring the actual OCR accuracy of modern engines such as PaddleOCR and Keras OCR. These accuracy scores, calculated using the Levenshtein distance, serve as ground-truth labels for training. Using the extracted features and the corresponding OCR-based labels, we train machine learning models to learn the relationship between image characteristics and OCR performance. The proposed models are evaluated using statistical metrics such as RMSE, PLCC, and SROCC to determine the most effective predictor. Our experiments demonstrate the importance of using OCR scores as labels, and the results show that our approach outperforms existing baseline methods.
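The abstract states that the training labels are OCR accuracy scores computed with the Levenshtein distance, but it does not give the exact formula. The following is a minimal sketch of one common way to turn an edit distance into an accuracy score, not the authors' implementation. The function names `levenshtein` and `ocr_accuracy` are hypothetical, and the normalization by the longer of the two strings is an assumption.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance (insert, delete, substitute) between a and b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def ocr_accuracy(ground_truth: str, ocr_output: str) -> float:
    """Map an OCR transcript to a score in [0, 1].

    Assumption: accuracy = 1 - distance / max(len(ground_truth), len(ocr_output)).
    The source does not specify the normalization it uses.
    """
    longest = max(len(ground_truth), len(ocr_output))
    if longest == 0:
        return 1.0  # two empty strings are treated as a perfect match
    return 1.0 - levenshtein(ground_truth, ocr_output) / longest
```

Under this assumed normalization, one wrong character in a four-character string gives a score of 0.75. A score of this kind, computed per image against a reference transcript, could serve as the regression target that a feature-based model is trained to predict.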

Frontiers Media SA

R. Krithika, J. Joshan Athanesious, S. Kiruthika

Frontiers in Signal Processing

2026
