Unlocking latent features of users and items: empowering multi-modal recommendation systems
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-07-01 |
| Series: | Scientific Reports |
| Online Access: | https://doi.org/10.1038/s41598-025-95872-4 |
| Summary: | Multimedia recommendation has emerged as a pivotal research area, propelled by the exponential growth of digital media consumption. In recent years, the proliferation of multimedia content across diverse platforms has necessitated sophisticated recommendation systems to help users navigate this vast landscape. Existing research predominantly integrates multimodal features as auxiliary information within user–item interaction models. However, this approach is inadequate for effective multimedia recommendation. Primarily, it captures collaborative item–item connections only implicitly, via high-order item–user–item associations. Given that items encompass diverse content modalities, we suggest that leveraging latent semantic item–item structures within these multimodal contents could significantly enhance item representations and consequently improve recommendation performance. Existing works also fail to capture user–user affinity effectively in multimedia recommendation, as they focus only on improving item representations. To this end, we propose a novel framework that captures the latent features of different modalities and also considers user–user affinity to address the recommender system (RecSys) problem. We also include a cold-start study in our experiments. Extensive experiments on three publicly available datasets demonstrate the efficacy of our framework over the state-of-the-art model. |
|---|---|
| ISSN: | 2045-2322 |