References
[1]. Chen, H., Liu, Q., Sun, Z. (2023) The rise and potential of large language model-based agents: a survey. Science China Information Sciences, 66: 1–20.
[2]. Wang, Y., Zhao, M., Li, K. (2023) A survey on large language model based autonomous agents. Frontiers of Computer Science, 17: 1–18.
[3]. Zhang, H., Lin, J., Wang, X. (2024) LLM With Tools: A Survey. arXiv preprint.
[4]. Sun, Y., He, Z., Li, J. (2024) LLMs Working in Harmony: A Survey on the Technological Aspects of Building Effective LLM-Based Multi Agent Systems. arXiv preprint.
[5]. Zhang, X., Li, Y., Wang, J. (2024) A study on classification based concurrent API calls and optimal model combination for tool augmented LLMs for AI agent. Scientific Reports, 14: 1–14.
[6]. Li, J., Zhao, K., Hu, F. (2024) Achieving Tool Calling Functionality in LLMs Using Only Prompt Engineering Without Fine-Tuning. arXiv preprint.
[7]. Gao, L., Chen, H., Xu, W. (2024) Granite-Function Calling Model: Introducing Function Calling Abilities via Multi-task Learning of Granular Tasks. arXiv preprint.
[8]. Wang, Y., Zhang, Z., Xu, J. (2025) Small Models, Big Tasks: An Exploratory Empirical Study on Small Language Models for Function Calling. arXiv preprint.
[9]. Johnson, M., Lee, D., Parker, S. (2025) eidos: A modular approach to external function integration in LLMs. SoftwareX, 24: 101–110.
[10]. Li, D., Chen, J., Yang, Q. (2024) Asynchronous LLM Function Calling. arXiv preprint.
[11]. Kumar, S., Patel, R., Singh, A. (2024) Simple Action Model: Enabling LLM to Sequential Function Calling Tool Chain. Proceedings of the International Conference on Advancement in Renewable Energy and Intelligent Systems (AREIS): 1–8.
[12]. Huang, P., Zhou, J., Wang, T. (2025) ODIA: Oriented Distillation for Inline Acceleration of LLM-based Function Calling. arXiv preprint.
[13]. Zhao, R., Chen, Q., Fang, L. (2024) Enhancing Function-Calling Capabilities in LLMs: Strategies for Prompt Formats, Data Integration, and Multilingual Translation. arXiv preprint.
[14]. Lee, C., Park, J., Smith, A. (2024) Benchmarking Floworks against OpenAI & Anthropic: A Novel Framework for Enhanced LLM Function Calling. arXiv preprint.
[15]. Tan, R., Luo, M., Wei, H. (2024) ToolACE: Winning the Points of LLM Function Calling. arXiv preprint.
[16]. Wang, F., Zhang, H., Zhao, K. (2024) Improving Small-Scale Large Language Models Function Calling for Reasoning Tasks. arXiv preprint.
[17]. Zhou, T., Wang, Y., Liu, Q. (2023) MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use. arXiv preprint.
[18]. Fang, Z., Liu, H., Chen, J. (2025) Meta-Tool: Unleash Open-World Function Calling Capabilities of General-Purpose Large Language Models. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL, Long Papers): 1481–1495.
[19]. Schick, T., Dwivedi-Yu, J., Zellers, R. (2023) Large Language Models as Tool Makers. arXiv preprint.
[20]. Li, X., Zhou, P., Tang, J. (2023) CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models. arXiv preprint.
[21]. Yang, F., Chen, R., Xu, H. (2023) CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets. arXiv preprint.
[22]. Anonymous (2024) From RAG to Multi-Agent Systems: A Survey of Modern Approaches in LLM Development. arXiv preprint.
[23]. Gupta, R., Feng, Y., Wang, L. (2024) TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic Tasks. arXiv preprint.
[24]. Xu, M., Zhang, Y., Chen, H. (2024) ClashEval: Quantifying the tug-of-war between an LLM's internal prior and external evidence. arXiv preprint.
[25]. Zhao, L., Lin, H., Wu, P. (2024) Blinded by Generated Contexts: How Language Models Merge Generated and Retrieved Contexts When Knowledge Conflicts? arXiv preprint.
[26]. Sun, Z., Gao, Y., Li, P. (2025) ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario. arXiv preprint.
[27]. Wu, T., Lin, X., Huang, Y. (2025) LongFuncEval: Measuring the effectiveness of long context models for function calling. arXiv preprint.
[28]. Nguyen, H., Tran, P., Li, X. (2024) Are More LLM Calls All You Need? Towards the Scaling Properties of Compound AI Systems. Advances in Neural Information Processing Systems, 37: 51173–51180.
[29]. Patel, A., Singh, R., Kumar, V. (2024) The Dark Side of Function Calling: Pathways to Jailbreaking Large Language Models. arXiv preprint.
[30]. He, Y., Ma, C., Zhang, L. (2025) Querying Databases with Function Calling. arXiv preprint.
[31]. Chen, M., Wang, L., Huang, S. (2025) Graph-Grounded LLMs: Leveraging Graphical Function Calling to Minimize LLM Hallucinations. arXiv preprint.
[32]. Ma, R., Zhou, Y., Xu, L. (2024) Large Language Models as Zero-shot Dialogue State Tracker through Function Calling. arXiv preprint.