References
[1] Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Machine Learning 47(2–3), 235–256 (2002).
[2] Li, L., Chu, W., Langford, J., Schapire, R.E.: A contextual-bandit approach to personalized news article recommendation. In: Proceedings of the 19th International Conference on World Wide Web, pp. 661–670 (2010).
[3] Agrawal, S., Goyal, N.: Thompson sampling for contextual bandits with linear payoffs. In: International Conference on Machine Learning, pp. 127–135. PMLR (2013).
[4] Wei, L., Srivastava, V.: Nonstationary stochastic multiarmed bandits: UCB policies and minimax regret. arXiv preprint arXiv:2101.08980 (2021).
[5] Zhu, J., Liu, J.: Distributed multi-armed bandit over arbitrary undirected graphs. In: 2021 60th IEEE Conference on Decision and Control (CDC), pp. 6976–6981. IEEE (2021).
[6] Qiu, S., Wang, L., Bai, C., Yang, Z., Wang, Z.: Contrastive UCB: Provably efficient contrastive self-supervised learning in online reinforcement learning. In: International Conference on Machine Learning, pp. 18168–18210. PMLR (2022).
[7] Zhu, R.J., Qiu, Y.: UCB exploration for fixed-budget Bayesian best arm identification. arXiv preprint arXiv:2408.04869 (2024).
[8] Elumar, E.C., Tekin, C., Yağan, O.: Multi-armed bandits with costly probes. IEEE Transactions on Information Theory (2024).
[9] Wu, H., Xu, Y., Cao, S., Liu, J., Takakura, H., Shiratori, N.: Sleeping multi-armed bandit-based path selection in space-ground semantic communication networks. In: 2025 IEEE Wireless Communications and Networking Conference (WCNC), pp. 1–6. IEEE (2025).
[10] Saday, A., Demirel, İ., Yıldırım, Y., Tekin, C.: Federated multi-armed bandits under Byzantine attacks. IEEE Transactions on Artificial Intelligence (2025).