Artificial Intelligence in Music Generation: Models, Evaluation, Applications, and Future Prospects

Tiffany Chiu

doi:10.54254/2753-8818/2026.CH30045

Theoretical and Natural Science (open access)

Research Article

Tiffany Chiu¹*

¹ Jericho Senior High School

*Corresponding author: tiffanychiu3000@gmail.com

Published on 26 November 2025

TNS Vol. 151

ISSN (Print): 2753-8826

ISSN (Online): 2753-8818

ISBN (Print): 978-1-80590-559-2

ISBN (Online): 978-1-80590-560-8

Abstract

Music creation with artificial intelligence (AI) is a rapidly developing field. This paper reviews the current state of AI music generation. It traces the historical development of computer-assisted and AI-assisted music production from early analog and digital tools to modern neural network architectures, highlighting key developments such as MIDI, digital audio workstations (DAWs), plugins, and early algorithmic composition systems. It examines symbolic and audio-based music representations, including MIDI, sheet music, waveforms, and spectrograms, and evaluates generative models such as GANs, LSTMs, Transformers, VAEs, and diffusion models, analyzing their capabilities and limitations. Applications in content creation, gaming, healthcare, and marketing demonstrate AI’s growing global impact. The review then compares subjective, objective, and combined evaluation strategies used to assess new AI music models and addresses challenges and potential problem areas in current research. Finally, it discusses future research directions, including improved generative techniques, interdisciplinary integration, and real-time interactive systems, suggesting pathways for researchers to enhance creativity, expressiveness, and practical application in AI-assisted music production.

Keywords: Artificial Intelligence, Music Generation, Music Evaluation, Generative Models, Music Representation
