
Cross-modal Retrieval based on Shared Proxies

Abstract: Inconsistent distributions and representations across data modalities make cross-modal similarity very difficult to measure. The main challenge in cross-modal retrieval is learning a common space that is both semantically discriminative and modality invariant. Existing solutions usually learn the common space from pairwise or triplet data relationships, which capture data similarity only locally and cannot effectively characterize the global geometry of the common embedding space, and this limits retrieval performance. In this paper, we introduce a shared proxy solution to cross-modal retrieval. We combine the principles of shared proxies with neighbourhood component analysis to learn a common space for different modalities in which the distance between a sample’s representation and its corresponding proxy is minimized, while the distances between that representation and the proxies not belonging to the sample are maximized. We propose the Cross-mOdal proXy learnIng (COXI) framework, which integrates a cross-modal shared proxy loss, a discriminative loss, and a modality-invariant loss for supervised cross-modal retrieval. Extensive experiments on benchmark datasets show that COXI outperforms state-of-the-art cross-modal retrieval techniques. Code is available at https://github.com/LigangZheng/COXI.
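The proxy objective described in the abstract (pull each sample toward its own proxy, push it away from all other proxies) can be illustrated with a minimal sketch. This is not the COXI implementation from the repository above. It is a generic proxy-NCA-style loss written in NumPy under stated assumptions: one proxy per class shared by all modalities, L2-normalized vectors, squared Euclidean distance, and a softmax over all proxies. The function names `l2_normalize` and `shared_proxy_nca_loss` are hypothetical, and the discriminative and modality-invariant losses are not shown.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to (approximately) unit length.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def shared_proxy_nca_loss(embeddings, labels, proxies):
    """Proxy-NCA-style loss (illustrative assumption, not the paper's exact loss).

    embeddings: (n, d) sample representations from any one modality
    labels:     (n,)   integer class index of each sample
    proxies:    (c, d) one proxy per class, shared across modalities
    """
    x = l2_normalize(embeddings)
    p = l2_normalize(proxies)
    # Squared Euclidean distance from every sample to every proxy: shape (n, c).
    dist = np.sum((x[:, None, :] - p[None, :, :]) ** 2, axis=2)
    # Numerically stable log-softmax over negative distances.
    logits = -dist
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Negative log-probability of each sample's own proxy, averaged over samples.
    return -log_prob[np.arange(len(labels)), labels].mean()

# Toy data: three classes, image and text embeddings that sit on their proxies.
proxies = np.eye(3)
labels = np.array([0, 1, 2])
image_emb = np.eye(3)
text_emb = np.eye(3)

# Because the proxies are shared, the same loss is applied to both modalities.
aligned = (shared_proxy_nca_loss(image_emb, labels, proxies)
           + shared_proxy_nca_loss(text_emb, labels, proxies))
# Assigning every sample to the wrong proxy should give a larger loss.
wrong = np.roll(labels, 1)
misaligned = (shared_proxy_nca_loss(image_emb, wrong, proxies)
              + shared_proxy_nca_loss(text_emb, wrong, proxies))
```

Because each sample is compared with every class proxy rather than with a few sampled pairs or triplets, the loss sees the whole arrangement of classes in the common space at every step, which is the motivation the abstract gives for using proxies.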

Related Results

Sum things are not what they seem: Problems with the interpretation and analysis of radiocarbon-date proxies
Radiocarbon-date proxies are widely used in studies exploring long-term variation in human and environmental phenomena. Examined phenomena include, for example, variation in past h...
Adversarial Learning Based Semantic Correlation Representation for Cross-Modal Retrieval
With the rapid development of the Internet and the wide use of smart devices, massive multimedia data are generated, collected, stored and shared on the Internet. This trend makes ...
The proxies conundrum
Purpose: No systematic models used in empirical research provide assurance for the choice of proxies. The purpose of this paper is to examine the ...
Modal Sosial Masyarakat Dusun Melayang dalam Pemanfaatan Buah Tengkawang di Hutan Adat Pikul
Abstract: Social capital is a community's ability to work together within a group to achieve a common goal. Hutan Adat Pikul (the Pikul customary forest) has tengkawang potential that is very...
Improving Sentence Retrieval Using Sequence Similarity
Sentence retrieval is an information retrieval technique that aims to find sentences corresponding to an information need. It is used for tasks like question answering (QA) or nove...
Peran Pemerintah Dalam Mitigasi Kejahatan Pasar Modal
Abstract: Economic development is currently proceeding very rapidly; however, amid this rapid economic growth there is also economic instability, which in turn gives opportunities to...
MODAL SOSIAL KANDIDAT DALAM KONSTETASI PEMILIHAN KEPALA DESA LOHIA KECAMATAN LOHIA KABUPATEN MUNA
The purpose of this study is to determine the social capital of candidates in the contest for the election of the village head of Lohia, Lohia District, Muna Regency. The research metho...
New Research Progress in Image Retrieval
Image retrieval is generally divided into two categories: text-based image retrieval and content-based image retrieval. Early image retrieval technology is mainly ba...
