Exploring and Exploiting Data Heterogeneity in Recommendation

Massive amounts of data are the foundation of data-driven recommendation models. Data heterogeneity, an inherent property of big data, is widespread in real-world recommendation systems and reflects differences in properties among sub-populations. Ignoring the heterogeneity in recommendation data can mislead models, hurt sub-population robustness, and ultimately limit recommendation performance. However, data heterogeneity has received little attention within the recommendation community, which motivates us to explore and exploit it to address these challenges and improve data analysis. In this study, we focus on two representative categories of heterogeneity in recommendation data: heterogeneity of the prediction mechanism and heterogeneity of the covariate distribution. To explore data heterogeneity, we propose an algorithm based on bilevel clustering. We then demonstrate how the explored heterogeneity can be exploited for prediction and debiasing in recommendation scenarios, specifically by building models from multiple sub-models and by augmenting propensity score estimation. Extensive experiments on real-world data substantiate the existence of heterogeneity in recommendation data and validate the effectiveness of exploring and exploiting it to improve recommendation performance.
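The abstract does not spell out how propensity score estimation is augmented, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes sub-population labels are already available (a stand-in for the output of the bilevel clustering step), estimates one propensity per sub-population as its observation rate, and uses inverse propensity weighting to correct a mean-rating estimate. All names and data here (`group_propensities`, `ips_mean`, the toy groups "a" and "b") are hypothetical.

```python
from collections import defaultdict


def group_propensities(groups, observed):
    """Estimate one propensity per sub-population as its observation rate."""
    total = defaultdict(int)
    seen = defaultdict(int)
    for g, o in zip(groups, observed):
        total[g] += 1
        seen[g] += int(o)
    return {g: seen[g] / total[g] for g in total}


def ips_mean(groups, observed, ratings, propensities):
    """Inverse-propensity-weighted mean rating over all user-item pairs."""
    weighted = 0.0
    for g, o, r in zip(groups, observed, ratings):
        if o:  # only observed ratings contribute, each scaled by 1 / propensity
            weighted += r / propensities[g]
    return weighted / len(groups)


# Toy data (assumption): group "a" is fully observed and rates high,
# group "b" is mostly unobserved and rates low.
groups = ["a"] * 4 + ["b"] * 4
observed = [1, 1, 1, 1, 1, 0, 0, 0]
ratings = [5.0, 5.0, 5.0, 5.0, 1.0, None, None, None]

propensities = group_propensities(groups, observed)
observed_ratings = [r for o, r in zip(observed, ratings) if o]
naive = sum(observed_ratings) / len(observed_ratings)
corrected = ips_mean(groups, observed, ratings, propensities)
print(propensities, naive, corrected)
```

In this toy setup the naive average over observed ratings is pulled toward the heavily observed group, while the group-wise weighting restores the contribution of the under-observed group. A single global propensity would not do so, which is the motivation for estimating propensities per sub-population.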

Association for Computing Machinery (ACM)

Zimu Wang, Jiashuo Liu, Hao Zou, Xingxuan Zhang, Yue He, Dongxu Liang, Peng Cui

ACM Transactions on Knowledge Discovery from Data

2025
