1. Introduction
The collection of electromagnetic shielding material databases and their preprocessing using machine learning techniques have become increasingly essential in contemporary materials science and engineering [1-3]. Electromagnetic shielding materials play a critical role in various applications, including electronics, telecommunications, aerospace, and automotive industries, where the attenuation of electromagnetic interference is paramount [4,5]. The compilation of comprehensive databases facilitates the systematic organization and analysis of diverse material properties, enabling researchers to identify novel materials with enhanced shielding effectiveness. Moreover, machine learning algorithms have emerged as powerful tools for predictive modeling and data analysis in materials research. By harnessing the capabilities of machine learning, researchers can expedite the screening process of candidate materials, predict material properties with high accuracy, and uncover hidden correlations within complex datasets. However, the successful application of machine learning techniques relies heavily on the quality of input data and the preprocessing steps undertaken to refine and standardize the datasets [6-8].
Laser additive manufacturing (LAM) of FeCo alloy has gained significant attention in recent years due to its promising applications in various industries, including aerospace, automotive, and biomedical. However, the complex interaction between process parameters and material properties poses challenges in achieving optimal manufacturing conditions and desired material characteristics. In this regard, machine learning (ML) techniques have emerged as valuable tools for process optimization and prediction of material properties in LAM.
2. Methods
In this study, data preprocessing techniques were employed, including data cleaning and correlation analysis, to assess the interrelationship among electromagnetic interference (EMI) data and identify significant features. The first step involved data cleaning, where missing or erroneous data points were addressed through imputation, deletion of incomplete records, or interpolation, ensuring the integrity of the dataset [9,10]. Subsequently, correlation analysis was conducted to evaluate the associations between different EMI data variables. This analysis helped uncover patterns of dependency or independence among the variables, providing insights into the underlying relationships within the dataset. Based on the results of the correlation analysis, important features were selected to further refine the dataset for subsequent analysis [11,12].
Figure 1. Flowchart of data processing for electromagnetic shielding materials in machine learning
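The three cleaning options named above (deletion of incomplete records, imputation, and interpolation) can be sketched as follows. This is a minimal, dependency-free illustration and not the pipeline used in this study: the record layout, the field names, and the choice of mean imputation are assumptions made for the example, and the numerical values are invented.

```python
from statistics import mean

def drop_incomplete(records, required):
    """Delete records that lack a value for any of the required fields."""
    return [r for r in records if all(r.get(k) is not None for k in required)]

def impute_mean(records, key):
    """Replace missing values of `key` with the mean of the observed values."""
    observed = [r[key] for r in records if r.get(key) is not None]
    fill = mean(observed)
    return [{**r, key: fill if r.get(key) is None else r[key]} for r in records]

def interpolate_linear(values):
    """Fill interior gaps (None) in an ordered series by linear interpolation."""
    out = list(values)
    known = [i for i, v in enumerate(out) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (out[right] - out[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = out[left] + step * (i - left)
    return out

# Hypothetical records: P in wt.%, T in mm, EMI as the target value.
records = [
    {"P": 10.0, "T": 1.0, "EMI": 25.0},
    {"P": None, "T": 2.0, "EMI": 30.0},
    {"P": 30.0, "T": 2.0, "EMI": 41.0},
    {"P": 20.0, "T": 1.5, "EMI": None},  # no target value: dropped
]
cleaned = impute_mean(drop_incomplete(records, ["EMI"]), "P")
```

Records without a target value are deleted, because imputing the quantity to be predicted would bias the model, whereas a missing input feature is filled in.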
3. Results and discussion
As electromagnetic shielding materials generally consist of two or more elements in combination, we categorize them into matrix materials and additive materials used for compounding. The attributes of electromagnetic shielding materials obtained from the literature are summarized as follows: P (weight percentage of additive materials: wt.%), T (sample thickness: mm), S (sample structure properties), M (number of elements), F (structural forms of additive elements: powder, tubular, filamentous, polymorphic mixture, pure solid block material), Cf (theoretical electrical conductivity of additive elements), Cb (theoretical electrical conductivity of the matrix). As shown in Fig. 2, the histograms illustrate the distribution of electromagnetic interference (EMI) rates across several properties of shielding materials, highlighting the influence of these properties on shielding effectiveness.
Figure 2. Histogram showing the electromagnetic shielding interference (EMI) performance distribution for the different features
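To make the attribute list concrete, the sketch below shows one possible way to hold a single literature record and turn it into a numeric feature vector. It is an illustration under stated assumptions: the class name, the integer coding of S, and the one-hot encoding of F are hypothetical choices, and the text does not specify how the categorical attributes were actually encoded.

```python
from dataclasses import dataclass

# Structural forms of the additive, as listed in the text.
FORMS = ("powder", "tubular", "filamentous", "polymorphic mixture",
         "pure solid block material")

@dataclass
class ShieldingSample:
    P: float   # weight percentage of additive materials (wt.%)
    T: float   # sample thickness (mm)
    S: int     # sample structure property (hypothetical integer code)
    M: int     # number of elements
    F: str     # structural form of the additive, one of FORMS
    Cf: float  # theoretical electrical conductivity of the additive
    Cb: float  # theoretical electrical conductivity of the matrix

    def __post_init__(self):
        if self.F not in FORMS:
            raise ValueError(f"unknown structural form: {self.F!r}")

    def to_vector(self):
        """Numeric feature vector with F one-hot encoded (an assumed encoding)."""
        one_hot = [1.0 if self.F == form else 0.0 for form in FORMS]
        return [self.P, self.T, float(self.S), float(self.M),
                *one_hot, self.Cf, self.Cb]

# Invented values, for illustration only.
sample = ShieldingSample(P=5.0, T=2.0, S=1, M=2, F="tubular", Cf=1.0e6, Cb=1.0e-12)
features = sample.to_vector()
```

One-hot encoding avoids imposing an artificial order on the structural forms, although tree-based models can also work with integer codes.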
The distribution of EMI rates with respect to the overall actual electrical conductivity (C) and the theoretical electrical conductivities of the matrix (Cb) and additive elements (Cf) shows a broad range with high variability, suggesting diverse effectiveness across conductivity measurements. Notably, the histogram for the weight percentage of additive materials (P) indicates a decreasing trend in EMI rates as the percentage increases, which may imply that higher proportions of additives do not necessarily enhance shielding effectiveness. Additionally, variations in EMI rates across different sample structures (S) and thicknesses (T) underscore the significant impact of physical configurations on shielding properties. The data also reveal that the structural forms of additives (F) and the number of different elements used (M) contribute differently to EMI rates, with certain forms and higher diversity potentially offering better shielding. This detailed examination across multiple parameters illustrates the complex interplay between material composition and structural characteristics in determining the effectiveness of electromagnetic shielding materials, emphasizing the need for meticulous material design and optimization in shielding applications.
The correlation matrices offer a statistical analysis of the relationships between various properties of electromagnetic interference (EMI) shielding materials using multiple correlation measures. The matrices displayed include Pearson correlation coefficients, Spearman rank correlation coefficients, Kendall tau rank correlation coefficients, and a distance correlation matrix. Each measure assesses a different aspect of the association between EMI rates and variables such as the weight percentage of additive materials (P), sample thickness (T), structural properties of the sample (S), number of elements (M), forms of additive elements (F), overall actual electrical conductivity (C), theoretical electrical conductivity of additive elements (Cf), and that of the matrix (Cb).
Figure 3. Analysis of correlations between features: (a) Pearson; (b) Spearman rank; (c) Kendall tau rank; (d) Distance correlation
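The four measures differ in what they detect: the Pearson coefficient captures linear association, the Spearman and Kendall coefficients are rank-based and capture monotonic association, and distance correlation also responds to non-monotonic dependence. The following dependency-free sketch computes each of them for one pair of variables. The function names are illustrative, the tau-b variant of Kendall's coefficient is an assumption (the text does not state which variant was used), and the data are invented.

```python
from math import sqrt

def pearson(x, y):
    """Pearson coefficient: strength of linear association."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    """Spearman coefficient: Pearson coefficient of the ranks."""
    return pearson(ranks(x), ranks(y))

def kendall_tau(x, y):
    """Kendall tau-b: concordant minus discordant pairs, corrected for ties."""
    n = len(x)
    concordant = discordant = ties_x = ties_y = 0
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            ties_x += dx == 0
            ties_y += dy == 0
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
    n0 = n * (n - 1) / 2
    return (concordant - discordant) / sqrt((n0 - ties_x) * (n0 - ties_y))

def _centered_distances(v):
    """Pairwise distance matrix with row, column, and grand means removed."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row = [sum(r) / n for r in d]
    grand = sum(row) / n
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)] for i in range(n)]

def distance_correlation(x, y):
    """Sample distance correlation, a value between 0 and 1."""
    a, b = _centered_distances(x), _centered_distances(y)
    n2 = len(x) ** 2
    dcov2 = sum(p * q for ra, rb in zip(a, b) for p, q in zip(ra, rb)) / n2
    dvar_x = sum(p * p for ra in a for p in ra) / n2
    dvar_y = sum(q * q for rb in b for q in rb) / n2
    return sqrt(max(dcov2, 0.0) / sqrt(dvar_x * dvar_y))

# Invented example: y grows monotonically but not linearly with x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 4.0, 9.0, 16.0, 25.0]
summary = {
    "pearson": pearson(x, y),
    "spearman": spearman(x, y),
    "kendall": kendall_tau(x, y),
    "distance": distance_correlation(x, y),
}
```

For this monotonic but nonlinear example, the two rank-based coefficients equal 1 while the Pearson coefficient falls slightly below 1, which is why comparing several measures on the same feature pair can be informative.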
Gradient Boosting Trees (GBT) is a powerful machine learning technique that combines multiple weak prediction models to form a robust predictive model [13,14]. As a form of ensemble learning, GBT sequentially integrates multiple base learners, typically decision trees, to construct a highly accurate prediction model. Each new tree is fitted to the residuals of the ensemble built so far, that is, to the discrepancies between the current model's predictions and the actual values. Another significant feature of this algorithm is the flexibility of its parameters. GBT allows the loss function to be adjusted to suit specific problems, and it controls model complexity through parameters such as tree depth and learning rate, thus balancing bias and variance.
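The residual-correcting procedure described above can be illustrated with a deliberately small sketch. It is a simplification and not the implementation used in this work: it assumes a single input feature, squared-error loss, and depth-1 trees (stumps) as base learners, and all names and data are hypothetical.

```python
def fit_stump(x, residuals):
    """Depth-1 regression tree: pick the split that minimizes squared error."""
    best = None
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        thr = (lo + hi) / 2
        left = [r for xi, r in zip(x, residuals) if xi <= thr]
        right = [r for xi, r in zip(x, residuals) if xi > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    _, thr, lm, rm = best
    return lambda xi: lm if xi <= thr else rm

def gradient_boost(x, y, n_trees=100, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        # For squared-error loss the negative gradient is the residual.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        # Shrink each correction by the learning rate before adding it.
        pred = [pi + learning_rate * stump(xi) for xi, pi in zip(x, pred)]

    def predict(xi):
        return base + learning_rate * sum(s(xi) for s in stumps)

    return predict

def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Invented data: one feature and one target.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [5.0, 7.0, 12.0, 18.0, 21.0, 30.0, 33.0, 41.0]
model = gradient_boost(x, y)
baseline_mse = mse(y, [sum(y) / len(y)] * len(y))
boosted_mse = mse(y, [model(xi) for xi in x])
```

A smaller learning rate makes each tree contribute less, so more trees are needed, but the fit is usually less prone to overfitting. This is the bias-variance trade-off that the learning rate and the tree depth control.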
As shown in Fig. 4, we utilized the XGBoost (eXtreme Gradient Boosting) regression model, XGBRegressor, with the objective function set to reg:squarederror, and configured parameters including 100 trees, a learning rate of 0.1, and a maximum depth of 3. The model was trained by minimizing the squared error loss function. The results indicated a root mean squared error (RMSE) of 10.71, a coefficient of determination (R²) of 0.84, and a mean absolute error (MAE) of 7.80 dB. The XGBoost model thus predicted EMI with good accuracy while handling complex relationships in the data effectively.
Figure 4. Actual vs. predicted EMI value of XGBoost
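For reference, the three error metrics reported above follow standard definitions, sketched here without external dependencies. The measured and predicted values are invented and are unrelated to the results in Fig. 4. The comment on the model call is an assumption about how the stated settings map onto xgboost keyword arguments (in particular, "100 trees" as n_estimators=100).

```python
from math import sqrt

# With xgboost installed, predictions could come from a model configured as
# described in the text, for example (assumed mapping of the stated settings):
#   XGBRegressor(objective="reg:squarederror", n_estimators=100,
#                learning_rate=0.1, max_depth=3)

def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more strongly."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 minus residual over total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Invented measured and predicted values.
y_true = [20.0, 35.0, 50.0, 65.0]
y_pred = [22.0, 33.0, 53.0, 61.0]
scores = {"rmse": rmse(y_true, y_pred), "mae": mae(y_true, y_pred),
          "r2": r2(y_true, y_pred)}
```

Because the RMSE squares each error before averaging, it is never smaller than the MAE, and a large gap between the two indicates a few large errors rather than uniformly moderate ones.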
Finally, the selection of self-developed FeCo-based alloy powder as the foundational material for laser cladding was based on several factors. Firstly, FeCo alloys exhibit excellent electromagnetic interference (EMI) shielding properties due to their high magnetic permeability and electrical conductivity. Secondly, the use of FeCo alloy powder allows for precise control over composition and microstructure, enabling tailored material properties to meet specific EMI shielding requirements. Additionally, the integration of rare earth elements in the processing optimization enhances the performance of the FeCo alloy by further refining the microstructure and improving the homogeneity of the material, thereby enhancing its EMI shielding effectiveness.
4. Conclusion
In summary, the study confirms that the use of machine learning techniques in the preprocessing of electromagnetic shielding material databases substantially enhances the capability to analyze and predict material properties. Through methodologies like data cleaning and correlation analysis, significant interrelationships among EMI data were identified, which facilitated the selection of crucial features for further FeCo-based alloy analysis. Additionally, our findings demonstrate that material properties such as the percentage of additives and their conductivity significantly influence EMI shielding effectiveness. The integration of diverse correlation measures and advanced machine learning models like Gradient Boosting Trees further strengthens the predictive modeling, providing a robust framework for the design and optimization of electromagnetic shielding materials. This integrated approach not only streamlines the process of material discovery but also contributes to the development of innovative solutions for electromagnetic interference mitigation across various technological applications.
Acknowledgments
Funding: This work was financially supported by the National Key R&D Program of China (2016YFB1100201) and the open competition mechanism to select the best candidates to lead key research projects in Shenyang city (22-101-0-16). Special thanks are due to the Analytical and Testing Center (Northeastern University) for instrumental data analysis.
Competing interests: The authors declare that they have no competing interests.
Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.
Data availability statement
The raw/processed data required to reproduce these findings cannot be shared at this time as the data also forms part of an ongoing study.