Exploring the Effectiveness of Binary-Valued and Real-Valued Representations for Cross-Modal Retrieval
Abstract: Cross-modal retrieval (CMR) is the task of retrieving semantically related items across different modalities. For example, given an image query, the task is to retrieve relevant text descriptions or audio clips. One of the major challenges in CMR is the modality gap: the differences between the features and representations used to encode information in different modalities. To address the modality gap, researchers have developed techniques such as joint embedding, in which features from different modalities are mapped to a common embedding space where they can be compared directly. Binary-valued and real-valued representations are two different ways to represent data. A binary-valued representation is a discrete representation in which each item is encoded using only the values 0 and 1. A real-valued representation, on the other hand, encodes each item as a vector of real numbers. Both types of representation have advantages and disadvantages, and researchers continue to explore new techniques for generating representations that improve the performance of CMR systems. For the first time, the work presented here generates both representations and compares them through experiments on standard benchmark datasets, using mean average precision (MAP) as the evaluation metric. The results suggest that real-valued representations outperform binary-valued representations in terms of MAP, especially when the data is complex and high-dimensional. On the other hand, binary codes are more memory-efficient than real-valued embeddings, and they can be computed much faster. Moreover, binary codes can be easily stored and transmitted, making them more suitable for large-scale retrieval tasks.
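The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the toy embeddings, the labels, the sign-threshold `binarize` rule, and all function names are assumptions made for illustration. Real-valued vectors are ranked by cosine similarity, binary codes by Hamming distance, and both rankings are scored with mean average precision.

```python
import math

def binarize(vec):
    """Sign-threshold a real-valued vector into 0/1 bits (an assumed, simple rule)."""
    return tuple(1 if x > 0 else 0 for x in vec)

def hamming(a, b):
    """Number of positions where two binary codes differ (lower is more similar)."""
    return sum(x != y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity between two real-valued vectors (higher is more similar)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def average_precision(ranked_relevance):
    """Average of precision@k taken at each rank k that holds a relevant item."""
    hits, total = 0, 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def mean_average_precision(queries, q_labels, database, db_labels, score, higher_is_better):
    """Rank the database for every query and average the per-query AP values."""
    aps = []
    for query, label in zip(queries, q_labels):
        order = sorted(range(len(database)),
                       key=lambda i: score(query, database[i]),
                       reverse=higher_is_better)
        aps.append(average_precision([db_labels[i] == label for i in order]))
    return sum(aps) / len(aps)

# Hypothetical image queries and text items, already mapped to a common 4-d space.
image_queries = [(0.9, 0.1, -0.2, 0.4), (-0.5, 0.8, 0.3, -0.1)]
query_labels = ["cat", "dog"]
text_db = [(0.8, 0.2, -0.1, 0.5), (-0.4, 0.9, 0.2, -0.2),
           (0.7, -0.3, -0.4, 0.1), (-0.6, 0.5, 0.6, 0.2)]
db_labels = ["cat", "dog", "cat", "dog"]

# Real-valued retrieval: rank by cosine similarity.
map_real = mean_average_precision(image_queries, query_labels,
                                  text_db, db_labels, cosine, True)

# Binary-valued retrieval: binarize both sides, rank by Hamming distance.
map_binary = mean_average_precision([binarize(q) for q in image_queries], query_labels,
                                    [binarize(t) for t in text_db], db_labels,
                                    hamming, False)

print(f"MAP (real-valued): {map_real:.3f}")
print(f"MAP (binary):      {map_binary:.3f}")
```

On data this small and cleanly separated, both representations rank every relevant item first, so the sketch does not reproduce the accuracy gap reported above. It does show the trade-off in kind: each binary code here needs 4 bits and is compared with a simple bit-difference count, while each real-valued vector stores 4 floating-point numbers and is compared with multiplications and square roots.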
Related Results
Cross-modal Retrieval based on Shared Proxies
Abstract
Inconsistency of distribution and representation across different data modalities makes measuring cross-modal similarities a very difficult problem. Learning a com...
Adversarial Learning Based Semantic Correlation Representation for Cross-Modal Retrieval
With the rapid development of the Internet and the wide usage of smart devices, massive multimedia data are generated, collected, stored and shared on the Internet. This trend makes ...
Modal Sosial Masyarakat Dusun Melayang dalam Pemanfaatan Buah Tengkawang di Hutan Adat Pikul
Abstract: Social capital is the ability of a community to work together to achieve a shared goal within a group. Hutan Adat Pikul has tengkawang potential that is very...
Meta-Representations as Representations of Processes
In this study, we explore how the notion of meta-representations in Higher-Order Theories (HOT) of consciousness can be implemented in computational models. HOT suggests that consc...
Peran Pemerintah Dalam Mitigasi Kejahatan Pasar Modal
Abstract: Economic development is currently proceeding very rapidly; however, amid this rapid economic growth there is also economic instability, which in turn creates opportunities for...
MODAL SOSIAL KANDIDAT DALAM KONSTETASI PEMILIHAN KEPALA DESA LOHIA KECAMATAN LOHIA KABUPATEN MUNA
The aim of this study is to determine the social capital of candidates in the village head election contest in Lohia Village, Lohia District, Muna Regency. The research method...
Improving Sentence Retrieval Using Sequence Similarity
Sentence retrieval is an information retrieval technique that aims to find sentences corresponding to an information need. It is used for tasks like question answering (QA) or nove...
Multimodal Information Integration and Retrieval Framework Based on Graph Neural Networks
In the context of the rapid proliferation of multimodal data (e.g. text, image, audio), the effective integration and retrieval of information across different modalities has emerg...

