1. Introduction
Social networks have recently emerged as some of the most significant platforms for information exchange on the Internet. Coupled with this is the growing trend of users consuming video content through various social web applications. Consequently, the development of efficient social network video recommendation algorithms is attracting considerable research interest.
Recommendation systems are firmly established on the Internet and greatly influence our daily lives. These systems typically build models of user interests from users' online behavior. Social networks serve as open platforms where users freely exchange information. However, as both users and information grow exponentially, challenges inevitably arise. For instance, a new video cannot be recommended on predicted scores alone, because no user feedback records exist for it yet. Similarly, recommendations for new users are often imprecise because these users lack behavioral records.
The evolution of social networks also poses challenges for time-aware modeling. The rising volume of videos on social networks drastically increases the cost of calculating similarities between users and between videos and target users. As a result, the efficacy of traditional recommendation systems has diminished. This discussion introduces three distinct video classification and recommendation algorithms designed to address these challenges and improve user experience within the growing sphere of social networks.
2. Datasets
UCF101: UCF101 is a widely used public dataset for video classification, comprising 13,320 YouTube video clips in 101 action categories. The categories cover diverse sports and non-sports actions, including basketball, biking, diving, and golf swings. Each category has approximately 100 to 800 video clips, each lasting from 5 to 10 seconds. Clips are in AVI format, with a 320x240 resolution and a 25fps frame rate. Annotations include the category and the start and end frames of each clip. The dataset is a common benchmark for video classification algorithms and is widely used in video classification, action recognition, and action detection tasks.
HMDB51: The HMDB51 dataset is another multi-video classification public dataset, with 7,000 video clips spanning 51 action categories, such as brushing teeth, gripping, and drinking. Each category includes approximately 100 to 300 video clips, each ranging from 1 to 30 seconds in length. Clips are provided in AVI format with a resolution of 320x240 and a frame rate of 25fps. Annotation information includes the category, start frame, and end frame of each clip. The HMDB51 dataset is frequently used in video classification, action recognition, and action detection tasks.
Kinetics: Kinetics is a large public dataset for video classification, featuring approximately 650,000 video clips across 600 categories. These clips represent various human activities, such as making phone calls, dancing, and drinking water. Each category contains about 400 to 700 clips, each approximately 10 seconds long. Provided in MP4 format, clips have a resolution of either 240p or 360p and a 25fps frame rate. Annotations include the category, start time, and end time of each clip. Given its size and diversity, Kinetics serves as an important benchmark dataset for training and evaluating deep learning models.
ActivityNet: ActivityNet is a prominent multi-video classification public dataset with 10,024 video clips from 200 categories. These clips depict a variety of human behaviors, such as brushing teeth, clamping, and drinking. Each category has approximately 25 to 50 clips, each lasting between 5 and 10 seconds. Clips are in MP4 format, with a resolution of 720p or 1080p and a frame rate of 25fps. Annotations include the category, start time, and end time of each clip. Thanks to its diverse categories and wide range of video lengths, ActivityNet is instrumental in evaluating the robustness and generalization capabilities of algorithms.
Sports-1M: Sports-1M is a large-scale public dataset for video classification, containing 1,133,997 video clips from 487 categories. These clips depict various sports activities, such as basketball, biking, diving, and golf swings. Each category contains approximately 1,000 to 3,000 clips, each about 10 seconds long. Clips are in AVI format, with a resolution of either 240p or 360p and a frame rate of 25fps. Annotation information includes the category, start frame, and end frame of each clip. Given its scale and diversity, Sports-1M is a vital benchmark for training and evaluating deep learning models. Table 1 summarizes the five datasets.
Table 1. Dataset Details.
Name | Number of categories | Number of video clips |
UCF101 | 101 | 13,320 |
HMDB51 | 51 | 7,000 |
Kinetics | 600 | 650,000 |
ActivityNet | 200 | 10,024 |
Sports-1M | 487 | 1,133,997 |
3. Background of traditional recommendation algorithms
This article examines two fundamental types of recommendation algorithms: Collaborative Filtering and Content Filtering. Collaborative Filtering algorithms primarily leverage user behavior data, such as viewing records, ratings, or preferred videos, and recommend to each user the videos favored by other users with similar interests [1]. Content Filtering algorithms, by contrast, base their recommendations on video content characteristics and user preferences. They first extract features from videos, such as labels, classifications, and actors, and then calculate the similarity between videos using these features. Finally, they recommend videos that align with the user's preferences and historical viewing records [2]. However, contemporary recommendation algorithms still face severe unresolved challenges, notably data sparsity and overfitting [3]. The evolution of recommendation systems is closely tied to the problems they encounter, so the search for recommendation systems that can address these issues remains an active area of intelligent information processing [4].
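The content filtering steps described above (extract features, compute video similarity, recommend against the viewing history) can be sketched as follows. This is a minimal illustration, not a method taken from the cited works: the catalog, the video ids, the use of Jaccard similarity over label sets, and the function names `jaccard` and `recommend_by_content` are all assumptions made for the example.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two feature sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_by_content(watched: set, catalog: dict, top_n: int = 2) -> list:
    """Rank unwatched videos by their best feature overlap with any watched video."""
    scores = {}
    for vid, feats in catalog.items():
        if vid in watched:
            continue
        # Score a candidate by its most similar watched video.
        scores[vid] = max((jaccard(feats, catalog[w]) for w in watched), default=0.0)
    # Highest score first; ties are broken by video id for a stable order.
    return sorted(scores, key=lambda v: (-scores[v], v))[:top_n]


# Hypothetical catalog: video id -> set of labels, classifications, or actors.
catalog = {
    "v1": {"sports", "basketball", "highlights"},
    "v2": {"sports", "basketball", "tutorial"},
    "v3": {"cooking", "tutorial"},
    "v4": {"sports", "golf"},
}
print(recommend_by_content({"v1"}, catalog))
```

Because only the target user's own history and the video features are used, this kind of scoring still works for a video that has no ratings yet.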
Content-Based recommendation (CB) is the earliest recommendation algorithm. Despite its age, it remains widely used in industry in the era of deep learning, which underlines its lasting advantages and time-tested efficacy [5]. Content-based algorithms were first applied in information retrieval systems, so many filtering methods from that field carry over to content-based recommendation. Content recommendation in the personalized recommendation field is essentially an information retrieval system recast as a recommendation system [6]. A user's past actions toward a target object can include commenting, bookmarking, liking, watching, browsing, clicking, adding to a shopping cart, and purchasing. Content-based recommendation algorithms typically rely exclusively on the target user's own behavior, without involving the behavior of other users [7].
Collaborative Filtering (CF) recommendations can be further divided into several subcategories, including user-based, item-based, commodity-based, and model-based Collaborative Filtering [8]. User-based CF predicts a user's preference for unevaluated items based on the similarity between users. Item-based CF predicts a user's preference for unevaluated items based on the similarity between items. Model-based CF builds a model, such as a matrix factorization (matrix decomposition) model or a deep learning model, to predict the user's preference for unevaluated items. Commodity-based CF calculates the similarity between commodities for a specific user, estimates the user's preference for similar products, and makes recommendations based on the user's ratings of those similar products.
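As an illustration of the model-based variant, the sketch below fits a small matrix factorization model with stochastic gradient descent. It is one common formulation under stated assumptions, not the specific model of any cited work: the toy ratings, the hyperparameters (`k`, `lr`, `reg`, `epochs`), and the function names are all hypothetical.

```python
import random


def factorize(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02, epochs=1000, seed=0):
    """Learn factors P, Q so that dot(P[u], Q[i]) approximates each observed rating."""
    rng = random.Random(seed)
    P = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict_mf(P, Q, u, i):
    """Predicted rating of user u for item i."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


def rmse(ratings, P, Q):
    """Root mean squared error over the observed (user, item, rating) triples."""
    sq = [(r - predict_mf(P, Q, u, i)) ** 2 for u, i, r in ratings]
    return (sum(sq) / len(sq)) ** 0.5


# Hypothetical observed ratings as (user index, item index, rating).
ratings = [
    (0, 0, 5), (0, 1, 5), (0, 2, 1), (1, 0, 5), (1, 3, 1),
    (2, 2, 5), (2, 3, 5), (2, 0, 1), (3, 1, 1), (3, 2, 5),
]
P0, Q0 = factorize(ratings, 4, 4, epochs=0)  # untrained baseline
P, Q = factorize(ratings, 4, 4)
print("RMSE before:", rmse(ratings, P0, Q0), "after:", rmse(ratings, P, Q))
print("Prediction for unrated pair (user 1, item 1):", predict_mf(P, Q, 1, 1))
```

The learned factors give a score for every user-item pair, including pairs with no observed rating, which is how such a model copes with sparse rating data.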
Despite the variety of algorithms, the core idea of Collaborative Filtering methods is to compute similarity between users or between items and to use these similarities when recommending items. A well-chosen similarity measure weights each neighbor's ratings appropriately in the prediction, thereby enhancing accuracy [9].
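Two similarity measures that are commonly used for this purpose are cosine similarity and the Pearson correlation coefficient. The sketch below computes both over the items two users have rated in common. The user profiles and function names are hypothetical, and restricting the computation to co-rated items is one convention among several.

```python
from math import sqrt


def co_rated(a: dict, b: dict):
    """Return the two users' ratings restricted to items both have rated."""
    common = sorted(set(a) & set(b))
    return [a[i] for i in common], [b[i] for i in common]


def cosine(a: dict, b: dict) -> float:
    """Cosine similarity of the raw ratings on co-rated items."""
    x, y = co_rated(a, b)
    den = sqrt(sum(p * p for p in x)) * sqrt(sum(q * q for q in y))
    return sum(p * q for p, q in zip(x, y)) / den if den else 0.0


def pearson(a: dict, b: dict) -> float:
    """Pearson correlation on co-rated items (0.0 if undefined)."""
    x, y = co_rated(a, b)
    if len(x) < 2:
        return 0.0
    mx, my = sum(x) / len(x), sum(y) / len(y)
    dx, dy = [p - mx for p in x], [q - my for q in y]
    den = sqrt(sum(p * p for p in dx)) * sqrt(sum(q * q for q in dy))
    return sum(p * q for p, q in zip(dx, dy)) / den if den else 0.0


# Hypothetical rating profiles: video id -> rating on a 1-5 scale.
alice = {"v1": 5, "v2": 3, "v3": 4}
bob = {"v1": 4, "v2": 2, "v3": 3}    # same ordering as alice, one point lower
carol = {"v1": 1, "v2": 5, "v3": 2}  # roughly the opposite ordering

print("pearson(alice, bob)  =", pearson(alice, bob))
print("pearson(alice, carol) =", pearson(alice, carol))
print("cosine(alice, carol)  =", cosine(alice, carol))
```

In this toy data, cosine similarity on raw ratings still reports alice and carol as fairly similar because all ratings are positive, whereas Pearson correlation, which subtracts each user's mean, reports them as negatively correlated.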
Another method is the item-based K-nearest neighbor (KNN) algorithm. To determine which movies are 'similar,' a similarity function (similarity measure) must be defined. However, the ratings users provide may not reflect their true feelings: lenient users tend to give higher ratings than others, while stricter users give lower ones. This bias can distort the predicted values and diminish the quality of recommendations [10].
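One common way to reduce this rating bias is to subtract each user's mean rating before computing item similarities (often called adjusted cosine similarity). The sketch below applies that idea to an item-based KNN prediction. It is an illustrative variant, not necessarily the method of [10]; the user names, movie ids, and function names are hypothetical.

```python
from math import sqrt


def center(ratings: dict) -> dict:
    """Subtract each user's mean rating to remove lenient/strict bias."""
    centered = {}
    for user, items in ratings.items():
        mean = sum(items.values()) / len(items)
        centered[user] = {i: r - mean for i, r in items.items()}
    return centered


def item_sim(centered: dict, i: str, j: str) -> float:
    """Adjusted cosine similarity between items i and j over users who rated both."""
    users = [u for u, items in centered.items() if i in items and j in items]
    num = sum(centered[u][i] * centered[u][j] for u in users)
    den = sqrt(sum(centered[u][i] ** 2 for u in users)) * \
        sqrt(sum(centered[u][j] ** 2 for u in users))
    return num / den if den else 0.0


def predict_item_knn(ratings: dict, user: str, item: str, k: int = 2) -> float:
    """Predict a rating from the k most similar items the user has already rated."""
    centered = center(ratings)
    mean = sum(ratings[user].values()) / len(ratings[user])
    sims = [(item_sim(centered, item, j), j) for j in ratings[user] if j != item]
    # Keep only positively correlated neighbors, most similar first.
    neighbors = sorted((s for s in sims if s[0] > 0), reverse=True)[:k]
    norm = sum(s for s, _ in neighbors)
    if not norm:
        return mean  # fall back to the user's average rating
    return mean + sum(s * centered[user][j] for s, j in neighbors) / norm


# Hypothetical data: a lenient user, a strict user, and the target user.
ratings = {
    "lenient": {"m1": 5, "m2": 4, "m3": 5},
    "strict": {"m1": 3, "m2": 1, "m3": 3},
    "target": {"m1": 4, "m2": 2},
}
print(predict_item_knn(ratings, "target", "m3"))
```

After centering, the lenient and strict users agree that m1 and m3 sit above their personal averages, so the target user's above-average rating of m1 carries over to m3.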
A third type is the hybrid recommendation system, a fusion of two or more recommendation algorithms designed to overcome the limitations of single algorithms and enhance recommendation effectiveness. Studies have shown that the introduction of hybrid recommendation systems diversifies the types of recommendation results, with a nearly 50% increase in the recall rate of recommendation results and a relatively stable accuracy rate [11].
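For reference, recall and precision over a top-k recommendation list can be computed as in the sketch below. It assumes that the "accuracy rate" mentioned above corresponds to precision over the recommended list, which the source does not state explicitly; the video ids and the function name are hypothetical.

```python
def precision_recall_at_k(recommended: list, relevant: set, k: int):
    """Precision@k and recall@k for one user's ranked recommendation list."""
    top_k = recommended[:k]
    hits = sum(1 for v in top_k if v in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ranked list and ground-truth set of videos the user liked.
recommended = ["v3", "v1", "v7", "v2", "v9"]
relevant = {"v1", "v2", "v5"}
print(precision_recall_at_k(recommended, relevant, k=4))
```

Recall rises when more of the user's relevant videos appear in the list, which is why a more diverse hybrid list can improve recall while precision stays roughly stable.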
The hybrid video recommendation algorithm uses both Collaborative Filtering and Content Filtering to offer users personalized video recommendation services. Traditional video recommendation algorithms primarily use Collaborative Filtering to recommend videos based on the user's behavior history and similarity with other users. However, because this method relies heavily on user behavior data, it suffers from the cold start problem and data sparsity. Content filtering methods, on the other hand, recommend videos based on their content features, such as video tags and descriptions. Because they neglect user preferences and behaviors, however, their personalized recommendations might not be accurate [11].
To tackle these challenges, researchers have proposed a hybrid video recommendation algorithm that provides more precise and personalized recommendations by considering both user behavior history and video content characteristics in the recommendation process. The algorithm combines the recommendation results generated by Collaborative Filtering and Content Filtering and delivers the final recommendation list to users through weighted fusion or hierarchical combination.
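Weighted fusion can be sketched as a linear combination of the two normalized score sets. The example below is a minimal illustration under assumptions that are not in the source: min-max normalization, the weight `alpha`, the sample scores, and the function names are all hypothetical.

```python
def min_max(scores: dict) -> dict:
    """Rescale scores to [0, 1] so the two recommenders are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {v: 0.0 for v in scores}
    return {v: (s - lo) / (hi - lo) for v, s in scores.items()}


def weighted_fusion(cf_scores: dict, cb_scores: dict, alpha: float = 0.6, top_n: int = 3) -> list:
    """Fuse collaborative (weight alpha) and content (weight 1 - alpha) scores."""
    cf, cb = min_max(cf_scores), min_max(cb_scores)
    fused = {
        v: alpha * cf.get(v, 0.0) + (1 - alpha) * cb.get(v, 0.0)
        for v in set(cf) | set(cb)
    }
    # Highest fused score first; ties are broken by video id.
    return sorted(fused, key=lambda v: (-fused[v], v))[:top_n]


# Hypothetical scores: v4 is a new video with no collaborative signal yet.
cf_scores = {"v1": 4.5, "v2": 3.0, "v3": 1.5}
cb_scores = {"v2": 0.9, "v3": 0.2, "v4": 0.8}
print(weighted_fusion(cf_scores, cb_scores))
```

In this toy example the new video v4 reaches the final list on its content score alone, which illustrates how fusion can soften the cold start problem. A hierarchical combination would instead apply one recommender to filter or re-rank the output of the other.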
4. Application analysis
The predominant technology employed by e-commerce platforms in their recommendation systems is the collaborative filtering algorithm. It serves customers' shopping needs by building an information recommendation model from extensive product data. The model covers a suite of functions, including collecting commodity information, processing data, analyzing information, and producing commodity recommendations. Collaborative filtering is a decision-making technique that analyzes linear patterns based on similar purchase characteristics or product attributes, forms affinity groups with similar preferences, and then issues recommendations to customers. It is widely regarded as the most successful recommendation technology to date. It can filter items that are hard to describe explicitly without requiring rigorous user and item modeling, and it avoids content analysis that is complex and difficult to automate.
Short videos have rapidly become a favored form of entertainment. However, a key challenge for short video platforms is that recommended content often does not match users' preferences. To improve user experience, the precision of short video recommendations must be improved. For example, Feng Yong devised a recommendation model that combines video content features with bullet-screen (barrage) comment text, enhancing both accuracy and efficiency [12]. Similarly, Zhang Shengzhe improved recommendation precision by analyzing emotional similarity to infer user preferences [13]. Continued refinement of collaborative filtering algorithms will bring prediction outcomes closer to actual user behavior, thereby increasing the efficacy of video recommendations.
With the growing wealth of user data and online educational resources, finding effective ways to strengthen learners' grasp of key concepts and to propose learning materials tailored to their individual needs has become urgent. Recommendation systems have recently gained substantial traction in education and have become a focal point of discussion. Numerous researchers have improved recommendation accuracy by strengthening the collaborative filtering algorithm. For example, Tarus J K proposed an e-learning recommendation system that combines ontology knowledge with sequential pattern mining. This approach addresses the issues of limited data availability and initial user engagement, ultimately boosting the recommendation system's performance [14]. Similarly, Zhao Qin used ant colony optimization algorithms to refine learning path recommendations into smaller units, thereby improving recommendation precision [15]. This method enables real-time tracking of learners' progress and the timely optimization and adjustment of micro-learning paths.
5. Conclusion
This paper reviews the relevant recommendation algorithms and analyzes the pros and cons of each method. It investigates the similarity computation techniques and score estimation methods commonly used in recommendation systems. Similarity calculation techniques are critical because they determine the degree of resemblance or correlation between data elements; they are the backbone of most recommendation algorithms and determine the relevance and precision of the recommendations. Score estimation methods play an equally crucial role in predicting user preferences: they use historical data to forecast how a user will react to items the user has not yet interacted with. The paper also evaluates and summarizes the assessment techniques and criteria frequently used for recommendation algorithms, which are pivotal in gauging the effectiveness and accuracy of recommendation systems in practice. Altogether, this analysis offers insight into the inner workings of recommendation algorithms. It aims not only to clarify the mechanisms of these systems but also to encourage further work on optimizing recommendation algorithms.