1. Introduction
In today’s fast-growing information era, people enjoy diverse services offered by electronic platforms. However, as the number of users continues to rise, the issue of information overload has become increasingly severe, leaving users feeling overwhelmed. Search engines can assist users in sifting through vast amounts of information by matching keywords, but for users without a specific goal, they are less effective and cannot efficiently target the desired content. As a solution to information overload, recommendation systems have gained wide acceptance [1]. By learning from user interaction history, these systems automatically recommend items that are likely to interest users [2]. With the progress of society and the improvement in material living standards, people now have more non-survival needs, which are closely linked to individual personalities and preferences. This has led to the continuous refinement of personalized recommendation systems, which are now widely deployed across various fields, including e-commerce and news platforms [3,4,5].
Group activities like office gatherings and family movie nights are common forms of daily entertainment. In recent years, with the rise of online social networking, it has become increasingly convenient for users with similar interests to form online groups [6,7]. In such scenarios, group recommendations are necessary. Traditional recommendation algorithms designed for individuals cannot serve groups effectively, thus giving rise to group recommendation algorithms. These algorithms consider group preferences observed in group interactions and aggregate the diverse preferences of group members, enabling the group to filter information quickly and receive satisfactory item recommendations. Unlike personalized recommendation algorithms, group recommendations must also model the affiliation between the group and its members to aid in efficient decision making.
Both traditional user-based recommendations and group recommendation systems face the same challenge: historical records of users and groups are often sparse compared with the vast number of items, making it difficult for the system to provide users and groups with accurate content recommendations, significantly impairing the user experience. Collaborative filtering, a widely used approach to alleviate data sparsity, has been applied extensively to recommendation systems [8,9,10,11,12,13,14,15,16]. However, collaborative filtering often struggles with cold-start problems and fails to fully capture complex relationships between users, items, and groups. Inspired by the success of graph convolution and self-supervised learning in other fields, recent group recommendation algorithms have integrated these techniques to address data sparsity effectively. Graph convolution networks (GCNs) are particularly well suited for recommendation tasks as they can model higher-order connectivity and capture rich contextual relationships in sparse data environments. Similarly, self-supervised learning leverages unlabeled data through contrastive learning or auxiliary tasks, providing an additional layer of optimization that enhances the quality of embeddings for sparse datasets. These techniques offer significant advantages over traditional collaborative filtering by exploiting structural patterns in the data and learning robust representations even with limited interactions.
Early research on group recommendations typically focused on aggregating the preferences of group members or scoring items across users. The three most common aggregation methods include the average, least misery, and maximum satisfaction strategies [17,18,19]. However, these methods are often overly simplistic, overlooking interactions between users within the group. In recent years, with the rapid development of deep learning, more group recommendation models have adopted attention mechanisms to model member interactions within groups. For example, Cao et al. [20] integrated attention networks with neural collaborative filtering to solve preference aggregation issues by learning aggregation strategies from data, significantly improving group recommendation performance, particularly for groups with no interaction history. Subsequently, He et al. [21] used heterogeneous information networks and attention mechanisms to learn multi-view embeddings and member weights. Vinh et al. [22] employed self-attention networks to understand individual user preferences and model member interactions. Yin et al. [23] introduced a sophisticated group recommendation model that incorporates a latent variable and attention mechanism to capture both the local and global social influence of users and to model interactions within groups, using bipartite graph embedding to mitigate data sparsity. Jia et al. [24] proposed a dual-channel hypergraph convolutional network for group recommendations in which member-level and group-level preference networks independently learn both personal and general group preferences.
Traditional group recommendation algorithms mostly focus on generating a recommendation list for the group and tend to overlook individual recommendations. Although users are not the primary target in group recommendation scenarios, individual recommendation performance influences overall group satisfaction to some extent, as a group interaction is likely only when most members are satisfied with the recommended items. Focusing solely on group recommendations may also waste valuable user interaction data, since graph convolution tends to underperform when trained on group interaction data alone.
To this end, this paper proposes a multi-view co-training and self-supervised learning model (MCSS) for group recommendations, addressing both group and user recommendation tasks. By utilizing user–item, group–item, and group–user interactions, three bipartite graphs are generated, with graph convolution and attention mechanisms [9] used to obtain three sets of embeddings. A self-supervised auxiliary task is designed to further leverage the data, generating recommendation lists for each group and user through multi-task joint training. The model comprises three steps: embedding propagation, embedding fusion, and multi-task joint training. Embedding propagation leverages three bipartite graphs (user–item, group–user, and group–item) to capture the complex relationships among users, groups, and items, producing initial embeddings that form the foundation for further refinement. Embedding fusion then integrates these embeddings, employing an attention mechanism to balance individual member preferences with group-level interactions, ensuring the final embeddings reflect the group’s collective interests rather than simple aggregations. Finally, multi-task joint training combines group recommendation, user recommendation, and contrastive learning tasks, which enhances model robustness by maximizing features shared among group members, thereby addressing data sparsity. Together, these steps enable the model to deliver accurate and personalized recommendations that better meet the needs of both groups and individuals.
Firstly, embedding propagation is applied to generate initial embeddings for users, groups, and items based on three bipartite graphs: user–item, group–user, and group–item. LightGCN is chosen for its ability to model high-order connectivity efficiently while maintaining computational simplicity, making it well suited to sparse data scenarios. Unlike conventional convolutional models, LightGCN removes self-loops and combines embeddings across propagation layers with a weighted sum, enabling a flexible representation. This propagation process ultimately provides multiple sets of embeddings for users, groups, and items that are used in subsequent steps.
Following this, embedding fusion integrates these embeddings to capture both individual and group preferences more comprehensively. An attention mechanism is employed to fuse the embeddings, thereby selecting the most relevant features across different views. This approach ensures that group preferences are not solely reliant on individual preferences but are shaped by broader group-level interactions, which is crucial for recommendations that meet collective satisfaction.
To tackle data sparsity, a multi-task joint training process combines group and user recommendation tasks with a contrastive learning task derived from self-supervised learning. In the contrastive learning task, positive samples are defined as pairs of group and member embeddings from the same group, such as a group embedding generated from Group A and a member embedding from a user in Group A. Hard negative samples are defined as embeddings of users with similar interaction histories but belonging to different groups, such as a user from Group B who has interacted with similar items as Group A members. For the recommendation tasks, a non-sampling strategy is employed that uses all non-positive samples as negatives, improving robustness in sparse conditions. The contrastive learning task, leveraging the InfoNCE loss, encourages the model to maximize the mutual information between group embeddings and member embeddings, helping the model learn shared latent preferences among group members and reinforcing the learning of nuanced relationships within the group.
The contributions of the paper are as follows.
We introduce a multi-view embedding propagation framework. This framework leverages three bipartite graphs—user–item, group–user, and group–item—to capture complex relationships and generate initial embeddings for users, groups, and items, improving representation in sparse data scenarios.
We propose a multi-task joint training strategy with contrastive learning. By combining group recommendation, user recommendation, and contrastive learning tasks, the model enhances robustness and generalization. The self-supervised contrastive learning task maximizes shared features among group members, addressing data sparsity and improving recommendation accuracy for both groups and individuals.
We extensively evaluate the proposed model on public datasets, demonstrating significant improvements in recommendation accuracy and robustness in sparse data scenarios. The experimental results validate the effectiveness of the multi-view embedding propagation, attention-based fusion, and multi-task joint training approach in enhancing group recommendation performance.
The remainder of this paper is organized as follows:
Section 2 reviews related works in recommendation systems, particularly focusing on group recommendation techniques and self-supervised learning.
Section 3 details our proposed methodology, including multi-view embedding propagation, attention-based embedding fusion, and multi-task joint training.
Section 4 presents the experimental results and analysis on public datasets, demonstrating the model’s effectiveness and comparing it with state-of-the-art approaches. Finally,
Section 5 concludes the paper.
3. Methodology
In group recommendation scenarios, in addition to group–item interaction data, there are often interaction data between individual users within the group and items as well as information about each group’s membership. The core challenge lies in effectively leveraging these data to enhance recommendation performance.
Graph convolution can use interaction data to construct a user–item bipartite graph, enabling the propagation of learned node embeddings. Multi-view co-training allows the model to learn embeddings from different views, each reflecting various pieces of semantic information. This approach captures rich, multi-angle information, yielding more accurate embeddings for users, groups, and items. Self-supervised learning, by designing auxiliary tasks, can further harness data to boost recommendation performance, alleviating data sparsity issues.
To this end, we propose a multi-view co-training and self-supervised learning model (MCSS) that simultaneously performs group and user recommendations. Based on interaction and group membership data from the dataset, three bipartite graphs—user–item, group–user, and group–item—are constructed. These graphs enable the model to effectively capture the complex relationships between users, items, and groups, which is crucial for handling data sparsity and individual preferences within the group. After applying graph convolution, an attention mechanism is used to derive three sets of embeddings for users, groups, and items. The attention mechanism allows the model to focus on the most relevant interactions, optimizing the embeddings by considering both individual user preferences and group dynamics. Additionally, a contrastive learning task is designed to maximize the mutual information between groups and their members, fully utilizing interaction data to address data sparsity. This contrastive learning task further enhances the model’s robustness by ensuring that the learned representations are consistent with both individual- and group-level preferences, which is key to improving recommendation accuracy. Group recommendation, user recommendation, and contrastive learning tasks are trained jointly to generate the final recommendation list. This joint training ensures that the model not only learns the preferences of individual users but also aligns these preferences with the broader group context, which is essential for effective group recommendations.
The overall architecture of the proposed group recommendation algorithm, which focuses on both group and user recommendations through multi-view co-training and self-supervised learning, is illustrated in
Figure 1.
As shown in the figure, the entire algorithm is divided into three parts: an embedding propagation layer, which captures interactions across the user–item, group–user, and group–item views; an embedding fusion layer, which integrates these embeddings with an attention mechanism; and a multi-task joint training layer, which jointly optimizes the user and group recommendation tasks along with contrastive learning.
Table 1 summarizes the key symbols and their definitions used throughout the proposed model. These symbols represent various components and parameters, including embeddings, loss functions, and hyperparameters, which are essential for understanding the theoretical framework and implementation details.
3.1. Embedding Propagation
This section corresponds to the embedding propagation layer in
Figure 1, which primarily obtains group embeddings, user embeddings, and item embeddings to accomplish the tasks of the algorithm. In traditional recommendation algorithms that incorporate graphs, user–item interactions are typically modeled using bipartite graphs. Simply put, a bipartite graph is structured so that its vertices are divided into two disjoint subsets, with each edge connecting vertices from different subsets. This structure aligns well with user–item interactions.
In this paper, we adopt a bipartite graph structure to construct three types of bipartite graphs: user–item, group–user, and group–item. The user–item graph captures interaction information between users and items, the group–user graph displays group membership and the relationships of users across different groups, while the group–item graph models all interactions between groups and items.
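For concreteness, the following sketch shows one plausible way to build the symmetrically normalized adjacency matrix of each bipartite graph from raw interaction pairs; the helper name build_norm_adj and the pair lists are illustrative assumptions, not identifiers from our released code.

```python
import numpy as np
import scipy.sparse as sp

def build_norm_adj(pairs, num_rows, num_cols):
    """Build the symmetrically normalized adjacency D^{-1/2} A D^{-1/2}
    of one bipartite graph from (row, col) interaction pairs."""
    rows, cols = zip(*pairs)
    r = sp.coo_matrix((np.ones(len(pairs)), (rows, cols)),
                      shape=(num_rows, num_cols))
    # Both node types share a single index space in the full adjacency.
    n = num_rows + num_cols
    adj = sp.lil_matrix((n, n))
    adj[:num_rows, num_rows:] = r          # edges row-type -> col-type
    adj[num_rows:, :num_rows] = r.T        # and the symmetric direction
    adj = adj.tocsr()
    deg = np.asarray(adj.sum(axis=1)).flatten()
    with np.errstate(divide="ignore"):
        d_inv_sqrt = np.power(deg, -0.5)
    d_inv_sqrt[np.isinf(d_inv_sqrt)] = 0.0  # isolated nodes keep zero degree
    return sp.diags(d_inv_sqrt) @ adj @ sp.diags(d_inv_sqrt)

# Three views, one normalized adjacency per bipartite graph:
# ui_adj = build_norm_adj(user_item_pairs, num_users, num_items)
# gu_adj = build_norm_adj(group_user_pairs, num_groups, num_users)
# gi_adj = build_norm_adj(group_item_pairs, num_groups, num_items)
```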
After constructing these three bipartite graphs, we randomly initialize three sets of embeddings for graph convolution. We apply a parallel convolution strategy on the three bipartite graphs, using the LightGCN algorithm for graph convolution. At each layer, LightGCN propagates a node’s embedding by aggregating the normalized embeddings of its neighbors. Unlike traditional approaches, LightGCN removes self-loops in the convolution process and instead combines the embeddings from all propagation layers using a weighted sum, which captures the effect of self-loops. For instance, the final user embedding is represented by Equation (1):

$$e_u = \sum_{k=0}^{K} \alpha_k \, e_u^{(k)} \qquad (1)$$

where $e_u^{(k)}$ represents the user embedding at propagation layer $k$, $K$ is the number of propagation layers, and $\alpha_k$ is a manually adjustable weight for each layer. To simplify the process, our model omits the weighted combination of all layers’ embeddings to obtain the final embedding and instead adds self-loops to LightGCN’s structure to maintain model performance.
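To make the simplified propagation concrete, the following PyTorch sketch contrasts the standard LightGCN layer combination of Equation (1) with the self-loop variant described above; the function and argument names are assumptions for exposition, not identifiers from our implementation.

```python
import torch

def propagate(embeddings, norm_adj, num_layers, alphas=None, add_self_loops=False):
    """embeddings: (N, d) initial node embeddings of one bipartite graph;
    norm_adj: (N, N) sparse normalized adjacency (torch sparse COO tensor)."""
    if add_self_loops:
        # Keeping each node's own embedding in every layer removes the need
        # for the weighted sum over layers (the simplification used here).
        n = norm_adj.shape[0]
        eye = torch.sparse_coo_tensor(
            torch.arange(n).repeat(2, 1), torch.ones(n), (n, n))
        norm_adj = norm_adj + eye
    layers = [embeddings]
    for _ in range(num_layers):
        layers.append(torch.sparse.mm(norm_adj, layers[-1]))
    if add_self_loops:
        return layers[-1]                  # simplified variant of this paper
    # Standard LightGCN: weighted sum over all layers, Equation (1).
    alphas = alphas or [1.0 / (num_layers + 1)] * (num_layers + 1)
    return sum(a * e for a, e in zip(alphas, layers))
```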
The graph convolution operation generates two sets of embeddings for users, groups, and items, capturing their interactions across different contexts. For users, UE1 (User Embedding 1) captures individual preferences derived from user–item interactions, while UE2 (User Embedding 2) reflects their roles and contributions within group settings from user–group relationships. For groups, GE1 (Group Embedding 1) models collective preferences based on group–item interactions, and GE2 (Group Embedding 2) captures internal group dynamics from user–group relationships. Similarly, for items, IE1 (Item Embedding 1) highlights relevance derived from user–item interactions, while IE2 (Item Embedding 2) focuses on item popularity within group–item interactions. These complementary embeddings collectively capture both individual- and group-level preferences, enabling the model to effectively learn nuanced relationships and improve recommendation accuracy across diverse scenarios.
It is worth mentioning that although this paper employs the LightGCN encoder, the proposed framework’s encoder is modular, allowing LightGCN to be easily replaced by other graph convolution encoders in the code. Thus, our framework is model agnostic, providing a relatively general approach for group recommendation. Future research can use this framework for further performance studies.
3.2. Embedding Fusion
In some group recommendation models, group embeddings are aggregated from the embeddings of their members, as in GroupIM [27]. This approach effectively encodes common preferences among members, which positively impacts group recommendation tasks. However, in specific situations, this approach may not provide accurate recommendations for the group. For example, in a family of three, where the parents prefer thriller and romance movies, respectively, and the child likes animated films, the family may choose a comedy or educational film suitable for all ages when watching together. In such scenarios, aggregating user embeddings to form group embeddings may overlook the unique interaction purpose of the group.
Therefore, this paper obtains group embeddings through graph convolutions based on group interactions and fuses these embeddings with those derived from group–user relationships to create the final group embedding.
In the embedding acquisition stage mentioned above, we obtain three sets of embeddings: (UE1, UE2), (GE1, GE2), and (IE1, IE2). We employ an attention mechanism to fuse these embeddings into the final embeddings for users, groups, and items. The attention mechanism is designed to automatically identify and prioritize the most relevant features from the different embeddings, ensuring that the fused embeddings effectively represent the underlying preferences and interactions.
For group recommendations, the attention mechanism addresses unique challenges by dynamically balancing the contributions of individual member preferences (captured by UE2 and GE2) with collective group-level preferences (captured by GE1 and IE2). This ensures that the final group embeddings reflect both the individual member contributions and the overall group consensus, which is critical for generating recommendations that satisfy the group as a whole. Similarly, for user and item embeddings, the attention mechanism selects features that capture nuanced relationships within user–item and group–item interactions, enhancing the model’s ability to provide accurate and personalized recommendations.
By leveraging the attention mechanism in this way, the model effectively mitigates the risk of over-relying on either individual- or group-level interactions, addressing the inherent complexity of group recommendations. This approach enables the system to adaptively weigh different sources of information, improving the robustness and accuracy of the recommendations in diverse scenarios.
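As an illustration, a lightweight attention-based fusion over two embedding views might look as follows; the module name and the two-layer scoring network are our assumptions for exposition, one plausible form among several.

```python
import torch
import torch.nn as nn

class ViewAttentionFusion(nn.Module):
    """Fuse two embedding views (e.g., GE1 and GE2 for a group) with a
    learned attention weight per view."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(),
                                   nn.Linear(dim, 1, bias=False))

    def forward(self, view1, view2):
        # views: (batch, d). Stack to (batch, 2, d), score each view,
        # normalize with softmax, and take the weighted sum.
        views = torch.stack([view1, view2], dim=1)
        weights = torch.softmax(self.score(views), dim=1)  # (batch, 2, 1)
        return (weights * views).sum(dim=1)

# fuse_g = ViewAttentionFusion(64)
# group_emb = fuse_g(GE1, GE2)   # final group embedding
```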
3.3. Multi-Task Joint Training
This study aims to improve the performance of both group and user recommendations while using contrastive learning from self-supervised learning to address the issue of data sparsity. As shown in
Figure 2, the model includes three tasks: a group recommendation task, a user recommendation task, and an auxiliary contrastive learning task.
3.3.1. Group Recommendation Task
The group recommendation algorithm proposed in this paper prioritizes target items using implicit feedback data. Two common training strategies for implicit feedback are the negative sampling strategy and the non-sampling strategy [33]. Briefly, the main difference lies in the selection of negative samples: the former selects a subset of all samples, excluding positive samples, as negatives, while the latter uses all non-positive samples as negatives. One of the algorithms compared in this paper, BPR, adopts the negative sampling strategy, which uses fewer training samples and is faster but less robust, making it difficult for the model to achieve optimal performance.
Due to the inclusion of all samples, traditional non-sampling strategies generally yield better training results but are inefficient, with a model training complexity of $O(|B||V|d)$, where $|B|$ is the batch size of users, $|V|$ is the total number of items, and $d$ is the embedding dimension. This complexity is generally unacceptable for recommendation models. In recent years, the information retrieval group at Tsinghua University has explored non-sampling strategies for recommendation systems, designing and implementing efficient non-sampling learning algorithms successfully applied in various recommendation system scenarios [33].
The present study employs a non-sampling strategy, with the loss function for the group recommendation task given in Equation (2). This formulation is designed to balance computational efficiency and representation accuracy, leveraging the relationships between groups, items, and users to enhance model performance. Here, $c_i$ denotes the weight of the sample, and $\hat{y}_{gi}$ represents the predicted interaction score between group $g$ and item $i$, computed as shown in Equation (3):

$$\hat{y}_{gi} = h^{\top}(e_g \odot e_i) \qquad (3)$$

In Equation (3), $e_g$ and $e_i$ denote the embedding vectors of group $g$ and item $i$, respectively, ⊙ represents the element-wise (Hadamard) product of these vectors, and $h$ is a trainable parameter vector that projects the interaction features into a scalar value.
The computational complexity of this loss function is $O((|B|+|V|)d^2 + |\mathcal{Y}^+|d)$, where $|\mathcal{Y}^+|$ denotes the number of positive samples. Given that the number of positive samples in practical data satisfies $|\mathcal{Y}^+| \ll |B||V|$, the complexity of this loss function is an order of magnitude lower than that of traditional non-sampling losses, allowing for efficient application in neural recommendation systems [33]. This loss function integrates seamlessly into the broader model framework by jointly optimizing the group recommendation task (Equation (2)) alongside the user recommendation task and the contrastive learning task, as described in the following sections. This joint optimization ensures that the theoretical goals of balancing group-level and individual-level interactions translate effectively into practical implementation, enhancing recommendation accuracy across all tasks.
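To make the efficiency argument concrete, the sketch below implements the prediction function of Equation (3) together with an efficient whole-data loss in the style of [33]; the uniform negative weight c0 is an illustrative simplification of the per-sample weights, and the names are ours for exposition.

```python
import torch

def predict(e_g, e_i, h):
    """Equation (3): y_hat = h^T (e_g ⊙ e_i), batched over (group, item) pairs."""
    return (e_g * e_i * h).sum(-1)

def efficient_group_loss(E_g, E_i, h, pos_g, pos_i, c0=0.1):
    """E_g: (|B|, d) batch of group embeddings; E_i: (|V|, d) all item
    embeddings; (pos_g, pos_i): index tensors of observed interactions.
    The whole-data term sum_{g,i} y_hat^2 factorizes as
    sum_{d,d'} h_d h_d' (sum_g e_gd e_gd') (sum_i e_id e_id'),
    costing O((|B|+|V|) d^2) instead of O(|B||V| d); the positive
    correction then costs O(|Y+| d). The constant from the squared
    positive targets is dropped, as it does not affect the gradients."""
    hh = torch.outer(h, h)                        # (d, d)
    gg = E_g.t() @ E_g                            # (d, d)
    ii = E_i.t() @ E_i                            # (d, d)
    whole = c0 * (hh * gg * ii).sum()             # weighted sum over all pairs
    y_pos = predict(E_g[pos_g], E_i[pos_i], h)    # scores of positives only
    return whole + ((1.0 - c0) * y_pos.pow(2) - 2.0 * y_pos).sum()
```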
3.3.2. User Recommendation Task
The group recommendation task aims to provide a recommendation list for each group, whereas the user recommendation task seeks to generate a recommendation list for each user. To maintain consistency and reduce model complexity, the user recommendation task adopts the same efficient non-sampling strategy mentioned earlier. The loss function for the user recommendation task is expressed in Equation (4).
The definitions of the various parameters in this equation are similar to those in the group recommendation loss, with the only difference being the shift from groups to users.
3.3.3. Contrastive Learning Task
To address the issue of data sparsity, this study introduces a contrastive learning approach derived from self-supervised learning techniques. By constructing auxiliary tasks, it enhances the performance of the main tasks and the model’s generalization ability. In practice, the model typically struggles to complete recommendation tasks due to insufficient group interaction data. Group preferences are partially dependent on the preferences of group members, and members who frequently interact within a group often exhibit similar preferences. Consequently, group activities reveal both intra-group connections and inter-group distinctions.
By maximizing the mutual information between group members and group embeddings—contrasting the preference representations of group members with those of non-group members who have similar item interaction histories—it effectively regularizes the feature spaces of user and group representations. This process promotes the encoding of shared features among group members, which may not be discernible from their limited interaction histories in the group–item graph. Positive sample pairs are defined as (group embedding, group member embedding), while negative sample pairs are (group embedding, non-group member embedding with similar interaction history). The contrastive loss utilizes InfoNCE loss, defined as in Equation (5).
$$\mathcal{L}_{CL} = -\sum_{g\in\mathcal{G}}\sum_{u\in K_g}\log\frac{\exp\!\big(s(e_g, e_u)/\tau\big)}{\exp\!\big(s(e_g, e_u)/\tau\big)+\sum_{u'\in N_g}\exp\!\big(s(e_g, e_{u'})/\tau\big)} \qquad (5)$$

In this equation, $\tau$ is the temperature coefficient for the InfoNCE loss, a manually adjustable hyperparameter; $s(\cdot,\cdot)$ denotes a similarity function (e.g., the inner product); $\mathcal{G}$ denotes the set of all groups; $K_g$ represents the set of members within group $g$; $e_g$ indicates the embedding of group $g$; $e_u$ refers to the embedding of a group member; and $e_{u'}$ signifies the embedding of a user-specific negative sample for group $g$, drawn from the sampled set $N_g$.
This study employs a preference-based negative sampling distribution $p(u' \mid g)$, which allocates a higher probability to non-group-member users who have purchased items interacted with by the group. These hard negative sample pairs encourage the model to learn shared latent information among group members by contrasting against other users with similar individual item histories. The sampling distribution $p(u' \mid g)$ is defined as in Equation (6). Here, $\mathbb{1}(\cdot)$ is the indicator function, and $\eta$ is the hyperparameter used to control the sampling bias. Variables $X_g$ and $X_{u'}$ represent the interaction histories of group $g$ and negative sample $u'$, respectively. This sampling method is more effective than random negative sampling in achieving the model’s objectives.
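A minimal sketch of the InfoNCE objective of Equation (5) for one group is given below; the dot-product similarity and the tensor layout are assumptions for exposition, and the hard negatives are assumed to have been drawn beforehand from the distribution described above.

```python
import torch
import torch.nn.functional as F

def group_infonce(e_g, e_members, e_negs, tau=0.07):
    """Equation (5) for one group. e_g: (d,) group embedding; e_members:
    (|K|, d) member embeddings; e_negs: (M, d) hard-negative user embeddings."""
    pos = e_members @ e_g / tau                      # (|K|,) positive logits
    neg = e_negs @ e_g / tau                         # (M,)  negative logits
    # Each member is contrasted against all M sampled negatives; the
    # positive sits at index 0 of every row.
    logits = torch.cat([pos.unsqueeze(1),
                        neg.unsqueeze(0).expand(len(pos), -1)], dim=1)
    labels = torch.zeros(len(pos), dtype=torch.long)
    return F.cross_entropy(logits, labels)
```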
This section corresponds to the contrastive learning task illustrated in Figure 2, where $|K|$ indicates the number of members in group $g$, and $M$ denotes the number of negative samples drawn for group $g$, which is also an adjustable hyperparameter.
3.3.4. Multi-Task Joint Training Loss
The model optimization employs a joint training approach for the group recommendation task, user recommendation task, and contrastive learning task. The overall objective of the model comprises the group loss (Equation (2)), user loss (Equation (4)), and contrastive loss (Equation (5)). The composite objective is expressed in Equation (7):

$$\mathcal{L} = \mathcal{L}_G + \lambda_1 \mathcal{L}_U + \lambda_2 \mathcal{L}_{CL} \qquad (7)$$

In this equation, $\lambda_1$ and $\lambda_2$ are hyperparameters used to regulate the importance of the different tasks: $\lambda_1$ balances the group recommendation task and the user recommendation task, while $\lambda_2$ adjusts the weight of contrastive learning.
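In code, the composite objective reduces to a weighted sum of the three task losses; the symbol names below are assumed for illustration.

```python
def joint_loss(l_group, l_user, l_cl, lam1, lam2):
    # Equation (7): group loss + lambda_1 * user loss + lambda_2 * contrastive loss
    return l_group + lam1 * l_user + lam2 * l_cl
```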
4. Experiments and Analysis
To verify the performance of the proposed algorithm, this section implements the MCSS group recommendation algorithm based on multi-view co-training and self-supervised learning. The algorithm is then compared with three classical recommendation algorithms, one cold-start-oriented algorithm, and five group recommendation algorithms on two public datasets. Additionally, we analyze the impact of key hyperparameters on the model’s effectiveness. Further ablation studies are conducted on essential components of the model, examining performance differences between single-task and multi-task models, exploring possible reasons, and analyzing the impact of sample interaction quantities on model performance.
The experiments are divided into three main parts:
Performance Evaluation of MCSS: This section implements the proposed algorithm and compares its performance with three classical recommendation algorithms, one cold-start-oriented algorithm, and five group recommendation algorithms on two public datasets.
Hyperparameter Sensitivity Analysis: The MCSS algorithm contains several hyperparameters, with three particularly important ones: the depth of the neural network (i.e., the number of graph convolution layers, $l$); the number of negative samples, $M$, in contrastive learning; and the temperature coefficient, $\tau$, in the contrastive loss. This section analyzes the sensitivity of these three hyperparameters on the CAMRa2011 dataset.
Ablation Study: This experiment analyzes the role of key components in the model by testing three variants: a standalone group recommendation task, a standalone user recommendation task, and a dual-task model compared with the overall model with contrastive learning. The performance of these variants is compared and analyzed.
4.1. Datasets, Baselines, and Setup
This study selects two public datasets, CAMRa2011 [20] and Mafengwo [20], commonly used for group recommendation. The CAMRa2011 dataset is a public dataset provided for a movie recommendation competition, containing interaction records between individual users, families, and movies. It includes 602 users, 290 groups, 7710 items, 116,344 user–item interactions, and 145,068 group–item interactions, with an average group size of 2.08. The Mafengwo dataset comes from the travel website Mafengwo, where users can record travel destinations and create or join group trips. This dataset contains 5275 users, 995 groups, 1513 items, 39,765 user–item interactions, and 3595 group–item interactions, with an average group size of 7.19 users. Details of these two datasets are presented in Table 2.
In this section, we first implement the group recommendation algorithm based on multi-view co-training and self-supervised learning on the CAMRa2011 and Mafengwo datasets. Additionally, three classical recommendation algorithms (BPR, NGCF, and LightGCN), one recommendation algorithm for mitigating the cold start problem (DUAL), and five group recommendation algorithms (AGREE, HCR, HHGR, LARGE, and CDRec) are implemented on both datasets for comparative evaluation.
Since this study investigates group recommendations in sparse scenarios, we use only 40% of the original training set for training, 20% as a validation set, and the remaining 40% as a test set. In our approach, the embedding dimension is set to 64, and the batch size is 512. The baseline methods retain their optimal parameter settings. The variance proportion threshold β is set at 0.075 for CAMRa2011 and 0.3 for Mafengwo. Notably, the three classical recommendation algorithms are general-purpose algorithms in the recommendation domain, not specifically designed for group recommendations. As a result, they do not capture group–user affiliation relationships, so they rely only on group–item and user–item interactions to separately generate group and user recommendation lists. For fairness in algorithm comparison, the group recommendation algorithms are adjusted to ensure consistency in evaluation metrics with the proposed approach. This adjustment is made based on the original source code provided by the authors of these algorithms.
4.2. Overall Performance Evaluation
The overall comparisons for group and user recommendations on the CAMRa2011 and Mafengwo datasets are shown in
Table 3 and
Table 4, respectively. Precision, Recall, and NDCG are used as evaluation metrics to assess model performance at the Top-20 level.
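For reference, a minimal sketch of how these Top-K metrics can be computed per group or user is given below; the function name and tie-breaking details are our assumptions and may differ from the exact evaluation protocol.

```python
import numpy as np

def metrics_at_k(ranked_items, ground_truth, k=20):
    """ranked_items: item ids sorted by predicted score (descending);
    ground_truth: set of held-out positive items for one group or user."""
    top_k = ranked_items[:k]
    hits = [1.0 if item in ground_truth else 0.0 for item in top_k]
    precision = sum(hits) / k
    recall = sum(hits) / max(len(ground_truth), 1)
    dcg = sum(h / np.log2(i + 2) for i, h in enumerate(hits))
    ideal = sum(1.0 / np.log2(i + 2)
                for i in range(min(len(ground_truth), k)))
    ndcg = dcg / ideal if ideal > 0 else 0.0
    return precision, recall, ndcg
```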
We highlight the best results across all models in bold and underline the best-performing comparison algorithm, with “improve” in the tables representing the increase relative to the best comparison algorithm. Based on
Table 3, it can be observed that on the CAMRa2011 dataset, our proposed model achieves the highest scores across all metrics at Top-20 for both group recommendation and user recommendation tasks compared with the other algorithms. Among the three classical recommendation algorithms, NGCF and LightGCN perform comparably and outperform BPR, as both NGCF and LightGCN utilize graph convolution to effectively model the interaction process. While BPR relies on a traditional matrix factorization approach, NGCF and LightGCN exploit graph-based representations, capturing more complex interaction patterns within the data. The performance boost in NGCF and LightGCN highlights the importance of utilizing graph convolutional networks to model high-order interactions, which proves particularly advantageous in scenarios with dense interaction data.
The DUAL method, which incorporates two auxiliary tasks to alleviate the cold-start problem in general recommendation settings, demonstrates competitive performance, achieving the second-best precision in user recommendation while also showing strong results in group recommendation. The dual-task approach allows the model to better generalize across different scenarios by addressing data sparsity and providing more robust embeddings for both users and items, particularly in cold-start situations.
Among the five group recommendation algorithms compared, HHGR, CDRec, and LARGE significantly outperform AGREE and HCR in terms of precision for the group recommendation task. This is because AGREE and HCR primarily generate group embeddings from group–item interactions, which limits their ability to capture the diverse preferences of individual group members. In contrast, HHGR, CDRec, and LARGE better model the interactions between group members and items, either through hierarchical structures (HHGR), contrastive learning (CDRec), or leadership dynamics (LARGE). These models focus on aggregating embeddings from individual members, which allows for a more comprehensive and representative group preference model.
CDRec, in particular, stands out by addressing the cold-start problem in group recommendations, resulting in better precision compared with the other baseline methods. This highlights the critical importance of effectively addressing cold-start issues, particularly in group settings where data for new or less-active groups is often limited.
In summary, our model consistently outperforms the baseline methods across multiple metrics, particularly by better handling both user and group recommendation tasks. The comparative analysis demonstrates the advantages of incorporating advanced techniques such as graph convolutions, auxiliary tasks, and contrastive learning to address the specific challenges in recommendation systems, such as cold-start problems and the accurate aggregation of group member preferences.
Referring to
Table 4, we observe that on the Mafengwo dataset, our proposed model again outperforms all other algorithms across all metrics for both group and user recommendation tasks. The performance of the other algorithms on this dataset is similar to their performance on the CAMRa2011 dataset. Notably, some group recommendation algorithms exhibit slightly worse performance compared with NGCF or LightGCN on both datasets. This can be attributed to the larger base of items used when generating recommendation lists in the group recommendation algorithms. For example, suppose that in a dataset, item set A contains 1000 items that have been interacted with by users, while item set B contains 500 items that have been interacted with by groups. Since sets A and B overlap only partially, the total item set C that has been interacted with by either users or groups may comprise 1200 items. The group recommendation baselines use item set C to generate recommendations for both the group and user tasks. In contrast, NGCF and LightGCN, which do not consider group–user affiliation relationships, use item set B for the group recommendation task and item set A for the user recommendation task, resulting in slightly better performance than some group recommendation baselines.
4.3. Hyperparameter Sensitivity Analysis
We conduct a hyperparameter sensitivity analysis on the CAMRa2011 dataset, using NDCG@20 and Recall@20 as performance evaluation metrics. This analysis examines the effects of three key hyperparameters: the number of graph convolution layers, the number of negative samples M, and the temperature coefficient τ in contrastive loss. The impact of the graph convolution layers l on each performance metric is shown in
Figure 3.
From the figure, it can be observed that for both group and user recommendation tasks, as the number of graph convolution layers (i.e., the neural network depth) increases, the metrics NDCG@20 and Recall@20 initially rise and then decline, peaking at a graph convolution layer count of 3. Therefore, setting the graph convolution layer number to 3 yields the best model performance. The performance decrease beyond this peak may be due to the over-smoothing problem in deep graph convolutional networks.
The impact of the number of negative samples M per group on model performance is shown in
Figure 4.
From the figure above, it can be observed that as the number of negative samples increases from 2 to 10, both metrics for both tasks initially improve and then decline, reaching their highest values at a negative sample count of 6. This suggests that a small number of negative samples can enhance the recommendation task, while a larger number of negative samples may mislead the recommendation system, leading to decreased performance.
The impact of the temperature coefficient $\tau$ in the contrastive loss on the model is shown in Figure 5. Figure 5a illustrates the effect of $\tau$ on group recommendation, and Figure 5b shows its effect on user recommendation. As the temperature coefficient decreases, the contrastive loss degrades into a loss function that focuses only on the hardest negative samples. Conversely, as the temperature coefficient increases, the contrastive loss applies nearly equal weighting to all negative samples, thus losing the focus on hard samples.
Figure 5 indicates that the method presented in this paper is not highly sensitive to the temperature coefficient. Consequently, we adopt a value of τ = 0.07, which comparatively achieves higher performance, as the final temperature coefficient.
4.4. Ablation Study
In this section, an ablation study is conducted to validate the effectiveness and necessity of each major component of the model, analyzing the possible reasons behind the results obtained. The model consists of three tasks: group recommendation, user recommendation, and contrastive learning tasks.
To evaluate the importance of each component, we design three variants: a standalone group recommendation model (MCSS-G), a standalone user recommendation model (MCSS-U), and a dual-task model without self-supervised learning (MCSS-Dual). By comparing the performance of these three variants with the complete model (MCSS), we can assess whether the two main tasks are mutually beneficial and the true impact of the contrastive learning task.
This experiment is conducted on the CAMRa2011 and Mafengwo datasets, with NDCG@20, Recall@20, NDCG@50, and Recall@50 as the evaluation metrics, representing the Top-20 and Top-50 performance for each metric.
Table 5 shows the specific results for the CAMRa2011 dataset.
Firstly, examining the data in
Table 5, where the left side shows Top-20 and the right side shows Top-50 results, we observe that in the Top-50 setting, the final model (MCSS) achieves the best performance for both group and user recommendation tasks. This confirms the effectiveness and high performance of the proposed multi-view co-training and self-supervised learning-based group recommendation algorithm.
Moreover, the dual-task model (MCSS-Dual) slightly outperforms the two single-task models, indicating that on the CAMRa2011 dataset, jointly training the two tasks yields better performance. We attribute this to the following: for the group recommendation task, a group’s preferences are largely dependent on the common preferences of its member users. Therefore, training the user task together with this task helps improve the learning of group embeddings. For the user recommendation task, the interactions within groups often relate to the preferences of individual users, which may be influenced by group interactions. For instance, Alice might never have tried Sichuan cuisine, but after dining with roommates who enjoy it, she tries it and grows to like it, increasing her likelihood of choosing Sichuan cuisine when dining alone. Thus, combining group and user recommendations is mutually beneficial.
Further examining
Table 5, we observe that the performance improvement of the dual-task model (MCSS-Dual) over the single-task model (MCSS-G) is not very large. Similarly, while adding the contrastive learning task to the dual-task model (MCSS) enhances performance, the improvement is not substantial. In theory, adding the user task to the group task should provide considerable benefits, as it compensates for sparse group data with a wealth of user interactions. However, analyzing the CAMRa2011 dataset reveals that the average group size is only 2.08, and the group–item interactions exceed user–item interactions by 28,724 records, which deviates from the typical assumption. Under these conditions, user interactions do not significantly aid the group task, making the modest performance gain reasonable. Additionally, statistics show that, on average, 40% of items interacted with by users are also interacted with by groups the users belong to, further reducing the contribution of user interactions to groups, supporting the validity of our algorithm’s results. The conclusions drawn from the Top-20 results on the CAMRa2011 dataset align with this analysis.
4.5. Discussion
The proposed model demonstrates significant advancements over existing group recommendation approaches. It successfully integrates the strengths of multiple learning techniques, including attention mechanisms, contrastive learning, and multi-task training. In comparison with traditional models, such as AGREE, HCR, and LightGCN, the proposed model offers several key advantages.
One of the primary strengths of the model is its ability to address the cold-start problem more effectively through contrastive learning and preference-based negative sampling. This enables the model to learn from limited data and make more accurate predictions, especially in scenarios wherein group interactions are sparse or limited. The use of attention mechanisms in the model further enhances its ability to prioritize important user and group interactions, thereby improving the accuracy of both group and user recommendation tasks.
Moreover, the joint optimization of multiple tasks—group recommendation, user recommendation, and contrastive learning—allows the model to learn shared features among group members and refine user and group representations simultaneously. This holistic approach is particularly effective in modeling complex group dynamics, where group preferences are not simply an aggregation of individual preferences.
However, there are also limitations. Despite its promising performance, the model still faces challenges related to computational efficiency and scalability, particularly when dealing with very large datasets. The increased complexity introduced by multiple attention mechanisms and the contrastive learning component may also result in longer training times. Additionally, the model’s reliance on group embeddings and individual user interactions could limit its ability to generalize in more heterogeneous group settings, where member preferences may vary significantly.
In comparison with other models, the proposed approach shows clear advantages in terms of recommendation accuracy and the ability to handle sparse data, but it may require further optimization to balance model complexity with computational efficiency.
5. Conclusions
In conclusion, this paper presents a novel approach to group recommendations designed to address challenges associated with data sparsity and varying group member preferences. By introducing a multi-view embedding propagation framework, an attention-based embedding fusion process, and a multi-task joint training strategy, the model effectively captures complex user–group–item interactions, balancing both individual and collective preferences within groups. Extensive experiments on public datasets demonstrate the model’s robustness and superior performance over existing methods, particularly in sparse data scenarios. This approach not only improves recommendation accuracy but also provides a flexible, modular framework that can be adapted to various recommendation contexts.
While the proposed multi-view co-training and self-supervised learning model (MCSS) provides a solid foundation for group recommendations, there are several promising directions for future research. First, the embedding fusion layer in the model relies on an attention mechanism to integrate user, group, and item embeddings. Future work could explore advanced attention techniques, such as multi-head attention or hierarchical attention networks, to better capture the complex relationships between users and groups, especially in diverse groups where individual interactions vary significantly. This would enhance the model’s ability to focus on the most relevant interactions, improving overall recommendation accuracy.
Second, our approach leverages contrastive learning to mitigate data sparsity, but there is potential for further refinement. Future research could focus on optimizing the contrastive learning task to better handle situations where user–item interactions are particularly sparse. For instance, semi-supervised learning approaches could be incorporated to make better use of limited labeled data, improving the model’s robustness and its ability to generalize from fewer interactions.
Lastly, the multi-task joint training layer simultaneously optimizes group and user recommendation tasks. Future work could investigate how to fine-tune this multi-task learning process to improve the balance between individual user satisfaction and group-level recommendations. Techniques such as dynamic task weighting or adaptive task prioritization could help ensure that both tasks are optimized effectively, leading to more accurate and personalized recommendations for both individual users and groups.