Multimodal Representation and Cross Modal Enhancement for Short Video Recommendation

Abstract: The surge in short video production across platforms has established short videos as a new and popular form of media. However, the sheer abundance and complexity of short video data make effective video recommendation challenging. Short videos carry rich multimodal information across both temporal and spatial dimensions, so users can engage with them in different ways: focusing on the content of a particular shot, following the storyline, or enjoying the accompanying music. Conventional video recommendation systems typically focus on a single type of recommended content, recommending entire videos, which may not fully satisfy users' nuanced preferences. In multimodal data, each modality contributes specific information to the others, establishing correlations between them. Because video data combines image, speech, and text data, understanding the relationships among these three media types is crucial for effective multimodal content-based video recommendation. In this paper, we leverage the consistency of multimodal features for understanding multimedia content, aiming to derive a robust representation from the inherent characteristics of short videos. Unlike previous studies that concentrate primarily on a single modality in short video recommendation, our approach capitalizes on the multimodality of short video content and adopts a multimodal recommendation strategy. By extracting and fusing information from multiple modalities, we achieve a more comprehensive analysis of short video content, which forms the basis of our recommendation method.
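The abstract does not say how the modality features are extracted or fused, so the following is only a generic illustration of the idea of "extracting and fusing information from multiple modalities", not the authors' method. It assumes each video already has one embedding per modality (image, audio, text), fuses them by weighted concatenation of unit-length vectors (a simple late-fusion scheme chosen here for illustration), and ranks videos by cosine similarity to a user profile built the same way. All function names, video ids, and numbers are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(image_vec, audio_vec, text_vec, weights=(1.0, 1.0, 1.0)):
    """Late fusion (an assumption, not the paper's method): normalize each
    modality, scale it by its weight, concatenate, and normalize the result."""
    fused = []
    for vec, w in zip((image_vec, audio_vec, text_vec), weights):
        fused.extend(w * x for x in l2_normalize(vec))
    return l2_normalize(fused)


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


def recommend(user_profile, catalog, k=2):
    """Return the ids of the k catalog videos most similar to the profile."""
    ranked = sorted(
        catalog,
        key=lambda video_id: cosine(user_profile, catalog[video_id]),
        reverse=True,
    )
    return ranked[:k]


# Toy, made-up 2-dimensional embeddings per modality (image, audio, text).
catalog = {
    "cooking_clip": fuse([1.0, 0.0], [0.2, 0.8], [0.9, 0.1]),
    "dance_clip": fuse([0.0, 1.0], [0.9, 0.1], [0.1, 0.9]),
    "recipe_vlog": fuse([0.9, 0.1], [0.3, 0.7], [0.8, 0.2]),
}

# Hypothetical user profile expressed in the same fused space.
user_profile = fuse([1.0, 0.0], [0.1, 0.9], [1.0, 0.0])

top = recommend(user_profile, catalog, k=2)
print(top)
```

Normalizing each modality before concatenation keeps one modality from dominating the similarity score simply because its embedding has a larger scale; the `weights` argument is where a per-modality emphasis could be introduced.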
