Overview of Speech Recognition Algorithms and Their Applications

Xuyang Chen

doi:10.54254/2755-2721/2025.BJ26545

Applied and Computational Engineering, Open access

Research Article

Xuyang Chen ¹*

¹ School of Mechanical Engineering, Nantong University, Nantong, Jiangsu, China, 226019

* Corresponding author: 3368136462@qq.com

Published on 3 September 2025

ACE Vol.183

ISSN (Print): 2755-273X

ISSN (Online): 2755-2721

ISBN (Print): 978-1-80590-341-3

ISBN (Online): 978-1-80590-342-0

Abstract

Speech recognition technology, a pivotal element of human-computer interaction, has advanced substantially in recent years, driven by the combination of deep learning and big data. This paper systematically reviews the evolution of speech recognition algorithms, describing the principal characteristics and application contexts of three families: traditional algorithms, such as Hidden Markov Models (HMM); deep learning-based algorithms, including Recurrent Neural Networks (RNN) and Convolutional Neural Networks (CNN); and end-to-end speech recognition algorithms. It then examines the applications of these algorithms in domains such as voice assistants (e.g., Siri and Alexa), machine translation, and meeting transcription, and the impact they have had there. Finally, the paper summarizes the prevailing speech recognition technologies and the challenges they face, with particular emphasis on the limitations of commonly used speech recognition algorithms, such as susceptibility to noise, accent variability, and data dependency. Through this analysis, the paper aims to clarify the current state and future directions of speech recognition technology.

Keywords:

Speech Recognition, Deep Learning, End-to-End Model, Voice Assistant, Machine Translation
