
Exploring and Exploiting Data Heterogeneity in Recommendation

Massive amounts of data are the foundation of data-driven recommendation models. Data heterogeneity, an inherent property of big data, is widespread in real-world recommendation systems and reflects differences in properties among sub-populations. Ignoring heterogeneity in recommendation data can mislead models, hurt robustness across sub-populations, and ultimately limit recommendation performance. Yet data heterogeneity has received little attention in the recommendation community, which motivates us to explore and exploit it to address these challenges and improve data analysis. In this study, we focus on two representative categories of heterogeneity in recommendation data: heterogeneity of the prediction mechanism and heterogeneity of the covariate distribution. To explore data heterogeneity, we propose an algorithm based on bilevel clustering. We then demonstrate how the explored heterogeneity can be exploited for prediction and debiasing in recommendation scenarios, specifically by building models from multiple sub-models and by augmenting propensity score estimation. Extensive experiments on real-world data substantiate the existence of heterogeneity in recommendation data and validate that exploring and exploiting it improves recommendation performance.
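To make the idea of augmenting propensity score estimation with sub-population structure more concrete, here is a minimal sketch in Python. It is not the authors' algorithm: the abstract does not specify the bilevel clustering procedure, so the sketch assumes that a group label per user is already available (the `group_of` mapping is hypothetical), estimates one observation propensity per group as a simple observed fraction, and plugs it into a standard inverse-propensity-scored (IPS) error estimate. All function names, the toy data, and the `min_propensity` clipping value are illustrative assumptions.

```python
from collections import defaultdict


def groupwise_propensities(records, group_of, pairs_per_group, min_propensity=1e-3):
    """Estimate P(observed | group) as observed pairs / candidate pairs per group.

    `group_of` is an assumed, precomputed user-to-group mapping (for example
    the output of some clustering step); it is not derived here.
    """
    counts = defaultdict(int)
    for user, _item, _rating, _pred in records:
        counts[group_of[user]] += 1
    # Clip from below so that sparse groups do not produce exploding weights.
    return {
        group: max(counts[group] / total_pairs, min_propensity)
        for group, total_pairs in pairs_per_group.items()
    }


def ips_mse(records, group_of, propensities, num_pairs):
    """IPS estimate of the mean squared error over all user-item pairs."""
    total = 0.0
    for user, _item, rating, pred in records:
        weight = 1.0 / propensities[group_of[user]]
        total += weight * (rating - pred) ** 2
    return total / num_pairs


# Toy data: (user, item, observed rating, model prediction).
records = [
    ("u1", "i1", 4.0, 3.0),
    ("u1", "i2", 2.0, 3.0),
    ("u2", "i1", 5.0, 4.0),
    ("u2", "i3", 3.0, 4.0),
    ("u3", "i4", 1.0, 3.0),
]
group_of = {"u1": "A", "u2": "A", "u3": "B"}  # hypothetical sub-populations
pairs_per_group = {"A": 8, "B": 4}            # 2 users x 4 items, 1 user x 4 items

props = groupwise_propensities(records, group_of, pairs_per_group)
estimate = ips_mse(records, group_of, props, num_pairs=12)
naive = sum((r - p) ** 2 for _, _, r, p in records) / len(records)
```

In this toy setting group B is observed less often than group A, so its single large error is up-weighted: the IPS estimate is larger than the naive average over observed records. A single global propensity would weight both groups equally and lose that distinction, which is the intuition behind using group-specific propensities.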

Related Results

Online Diagnosis-Treatment Department Recommendation based on Machine Learning in China (Preprint)
BACKGROUND As a supplement to the traditional medical service mode, online medical mode provides services of online appointment, online consultation, online...
FM-based Recommendation Model for Short-video with Topic Distribution
Abstract With the popularization of mobile Internet terminals, the speed of network and the r...
Personalized Recommendation Algorithm of Tourist Attractions Based on Transfer Learning
With the development of information technology and the popularity of the Internet, the data on the network is growing exponentially. Information overload has become a significant i...
RESEARCH ON PERSONALIZED RECOMMENDATION ALGORITHM FUSING TIME AND LOCATION
With development of recommendation systems, they are faced with more and more challenges. In order to relieve problems existing in commodity selection by users of different prefere...
xLightGCN: A Simplified GCN-based Model for Multimedia Recommender System
Abstract With the gradual development of Internet technology, information resources are growing at a high speed and the problem of information overload has emerged. It is d...
Effect of surface heterogeneity on hyper-resolution simulation of soil moisture
Due to the land surface complexity, soil moisture immensely varies both spatially and temporally. However, the combined effects of land surface complexity and key ...
