Visual Information Facilitation Scene Text Retrieval
Abstract
Given a query text and a database of images, a scene text retrieval system returns the addresses of the images that contain the query word, along with the location of the query word in each image. The current state-of-the-art method matches the visual features of the query embedding against those of the text instance embeddings. This approach effectively mitigates the heterogeneity between the two modalities, but it disrupts the visual structure of the query images. In this paper, we optimize this method. First, we directly convert the entire query text into an image, thereby maximally preserving the visual information of the original text image. Then, we devise a Coarse-Grained Feature Rectification (CGFR) module, which facilitates visual alignment. Finally, we propose Adaptive Edit Distance (AED) and improve the main loss function. With these changes, our method reaches state-of-the-art performance on multiple benchmark datasets. In particular, the method is well suited to multilingual retrieval tasks, with a 16.98% improvement in mAP score over the current state-of-the-art method.
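The abstract refers to two quantities without defining them: an edit distance between words and the mAP score used for evaluation. The sketch below illustrates the standard versions of both, as an aid to reading only. It implements the classic Levenshtein distance, not the paper's Adaptive Edit Distance (whose definition is not given here), and the function names (`edit_distance`, `edit_similarity`, `average_precision`, `mean_average_precision`) are hypothetical. It also assumes that each ranked list covers the whole image database, so that every relevant image appears somewhere in the ranking.

```python
from typing import List, Sequence


def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance: insert, delete, substitute each cost 1.

    This is the textbook definition, NOT the paper's Adaptive Edit Distance.
    """
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Normalize the edit distance to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def average_precision(ranked_relevance: Sequence[bool]) -> float:
    """Average precision for one query.

    ranked_relevance[k] is True if the image at rank k+1 contains the query.
    Assumes the ranking covers the whole database (all relevant images appear).
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(per_query: List[Sequence[bool]]) -> float:
    """Mean of the per-query average precisions."""
    if not per_query:
        return 0.0
    return sum(average_precision(r) for r in per_query) / len(per_query)
```

For example, `edit_distance("kitten", "sitting")` evaluates to 3, and a ranking in which the relevant images sit at ranks 1 and 3 of three gives an average precision of (1/1 + 2/3) / 2.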
Related Results
Field facilitation in open and distance learning in resource-constrained environments: a case of Mzuzu University, Malawi
As part of the drive to enhance students’ learning experiences and success for students pursuing the B.Ed Science programme through distance education at Mzuzu University (Mzuni), ...
E-Press and Oppress
From elephants to ABBA fans, silicon to hormone, the following discussion uses a new research method to look at printed text, motion pictures and a te...
On Flores Island, do "ape-men" still exist? https://www.sapiens.org/biology/flores-island-ape-men/
Is investment facilitation a substitute or supplement? A comparative analysis of China and Brazil practices
Investment facilitation, which tackles ground-level obstacles to FDI and has no substantial challenges to regulatory space, is emerging as a new trend of global governance. Meanwhi...
Improving Sentence Retrieval Using Sequence Similarity
Sentence retrieval is an information retrieval technique that aims to find sentences corresponding to an information need. It is used for tasks like question answering (QA) or nove...
New Research Progress in Image Retrieval
Image retrieval is generally divided into two categories: one is text-based Image Retrieval; another is content-based Image Retrieval. Early image retrieval technology is mainly ba...
Trade Facilitation in ASEAN and ASEAN+1 FTAs: An Analysis of Provisions and Progress
Regional trade agreements concluded in recent years have increasingly included provisions on trade facilitation. Although ASEAN+1 FTAs vary in their scope, specificity and depth of...
Residual Bound Ca2+ Can Account for the Effects of Ca2+ Buffers on Synaptic Facilitation
Facilitation is a transient stimulation-induced increase in synaptic response, a ubiquitous form of short-term synaptic plasticity that can regulate synaptic transmission on fast t...

