Analysis of the Effectiveness of Deep Learning Spam Email Classifiers against Text-Based Attacks
Research Article
Open Access
CC BY


Jianchao Li 1*
1 Guangzhou Nanfang College
*Corresponding author: outlook_165D168DEE309A11@outlook.com
Published on 3 December 2025
ACE Vol.211
ISSN (Print): 2755-2721
ISSN (Online): 2755-273X
ISBN (Print): 978-1-80590-579-0
ISBN (Online): 978-1-80590-580-6

Abstract

With the widespread application of machine learning, and deep learning models in particular, in cybersecurity, spam filtering systems have become increasingly intelligent. Deep learning classifiers, with advantages such as character-level feature learning and semantic invariance, have become the preferred choice for deployment. However, because these models rely on surface text features, they are significantly vulnerable to carefully constructed textual adversarial attacks. Through covert modifications such as synonym substitution and character perturbation, such attacks can mislead a model into misclassifying malicious emails, allowing risky spam such as phishing and fraud messages to pass through the defense system. This study first elaborates on three attack methods: character-level, word-level, and sentence-level attacks. It then discusses the limitations of existing spam email attacks, and comprehensively reviews the key findings of prior research, namely that deep learning models generally suffer a high attack success rate (ASR). The aim is to provide a theoretical basis for building a more robust next-generation spam email filtering system.
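The character-perturbation evasion described above can be illustrated with a minimal, hypothetical sketch (not code from the study): a toy keyword-based "classifier" stands in for a learned model, and Cyrillic homoglyphs replace visually identical Latin letters so the text looks the same to a human but no longer matches the surface features the filter relies on.

```python
# Hypothetical illustration of a character-level adversarial perturbation.
# The filter and trigger phrase are invented stand-ins, not the paper's models.
HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e", "i": "\u0456"}  # Cyrillic look-alikes

def naive_spam_filter(text: str) -> bool:
    """Stand-in classifier: flags mail containing an exact trigger phrase."""
    return "free money" in text.lower()

def homoglyph_perturb(text: str) -> str:
    """Swap every eligible Latin letter for its visually identical twin."""
    return "".join(HOMOGLYPHS.get(c, c) for c in text)

original = "Claim your free money now"
adversarial = homoglyph_perturb(original)

print(naive_spam_filter(original))     # True:  the clean text is caught
print(naive_spam_filter(adversarial))  # False: the same-looking text slips through
```

Real attacks against deep classifiers use the same principle at scale, searching for the minimal perturbation that flips the model's prediction while preserving the message's appearance to the recipient.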

Keywords:

Deep Learning, Spam Emails, Character-level Attack, Word-level Attack



Cite this article

Li, J. (2025). Analysis of the Effectiveness of Deep Learning Spam Email Classifiers against Text-Based Attacks. Applied and Computational Engineering, 211, 16-20.

Data availability

The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.

About volume

Volume title: Proceedings of CONF-SPML 2026 Symposium: The 2nd Neural Computing and Applications Workshop 2025

ISBN: 978-1-80590-579-0(Print) / 978-1-80590-580-6(Online)
Editors: Marwan Omar, Guozheng Rao
Conference date: 21 December 2025
Series: Applied and Computational Engineering
Volume number: Vol.211
ISSN: 2755-2721(Print) / 2755-273X(Online)