Abstract
Physics-informed deep learning (PIDL) is one of the emerging topics in additive manufacturing (AM). However, the success of previous PIDL approaches generally depends heavily on the availability of massive datasets. As data collection in AM is usually challenging, the current study proposes a novel architecture-driven PIDL structure, named APIDL, based on the deep unfolding approach for limited data scenarios, to predict thermal history in the laser powder bed fusion process. The connections in this machine learning architecture are inspired by iterative thermal model equations; in other words, each iteration of the thermal model is mapped to a layer of the neural network. The hyper-parameters of the APIDL model are tuned, and its performance is analyzed. The APIDL trained on 1000 points with an 80:20 split ratio achieves a testing mean absolute percentage error (MAPE) of 2.8% and an R² value of 0.936. APIDL is compared with artificial neural network, extra trees regressor (ETR), support vector regressor, and long short-term memory algorithms. The proposed APIDL model outperforms all of them: its MAPE is 55.7% lower and its R² is 15.6% higher than those of ETR, the best-performing pure machine learning model.
1 Introduction
Metal additive manufacturing (AM) techniques have the potential to change the face of the manufacturing industry and become the manufacturing technology of the future [1]. Leveraging recent advances in artificial intelligence and machine learning (ML), metal AM is making significant progress toward smarter design, process, and quality control [2]. Despite wide applications, metal AM technologies have not been adopted to their expected extent in industry because of undesired porosity [3], residual stresses [4], and distortions [5]. The root causes of these problems are thermal in nature [6]. Therefore, in the past decade, significant research efforts have been aimed at predicting thermal quantities such as the temperature distribution [7] and the peak melt-pool temperature [8] in metal AM.
Pure physics-based simulations have been adopted in the literature to predict temperature fields in laser-based metal AM. Analytical and numerical methods are the two approaches widely used in pure physics-based simulations, both relying on differential equations based on heat transfer models [9]. However, the major drawback of these methods is their assumptions and simplifications, which fail to capture the variations and uncertainty in the laser powder bed fusion (LPBF) AM process, leading to low prediction accuracies [10]. Moreover, numerical methods are computationally expensive [11], making them less practical for real-time decision-making and process control. On the other hand, pure machine learning and data-driven approaches have become extremely popular in the design and production phases of AM [12]. These approaches usually require massive datasets (e.g., from 17,000 [13] to 4.34 million [14] samples), which are generally difficult and costly to collect in AM. Although some approaches increase the dataset size through advanced data augmentation techniques, such as generative adversarial networks [15], they require high computational power.
To overcome the limitations of the pure physics-based models and pure data-driven methods mentioned earlier, physics-informed machine learning (PIML) models have been trending in recent years. Various terminologies have been adopted in the literature to refer to these models, including physics-informed deep learning (PIDL), hybrid, physics-based machine learning, scientific machine learning, model-based deep learning, physics-constrained machine learning, physics-aware machine learning, and physics-informed neural network (PINN) [16]. According to a recent review paper by Wang et al. [17], PIDL is expected to have a significant impact on the future data-driven modeling of AM. Engineering physics-based input features for common ML models [18] and modifying machine learning components (such as loss functions) by integrating physical models (constraints and boundary conditions) [8] are among the most common approaches, as shown in Fig. 1. On similar lines, Guo et al. [19] reviewed the state-of-the-art PIDL studies in metal AM. It was concluded that PIDL offers numerous benefits for manufacturers due to its low computational time, real-time responsiveness to dynamical systems, data-driven decision logic, and ability to quantify uncertainty compared to pure physics-based models, as well as its inherent physical consistency compared to pure machine learning models. Mozzafar et al. [20] further highlighted the advantages and huge potential of mechanistic artificial intelligence (including physics-informed learning) in the field of manufacturing.
A review of the literature shows that while existing physics-based machine learning models outperform most pure physics-based and pure data-driven models, they generally require datasets on the order of thousands of samples and training times on the order of tens of minutes. For instance, Xie et al. [21] used PIDL to predict the temperature field in metal AM. The proposed PIDL used a fully connected deep neural network with a loss function modified by a set of partial differential equations. The PIDL used 6000 data points; the inputs were spatial coordinates, temporal data, scanning speed, and laser power, and the outputs were melt-pool temperature and thermal history. It was shown that the PIDL results are consistent with experimental and finite element simulations. Moreover, the PIDL outperformed artificial neural network (ANN), long short-term memory (LSTM), and XGBoost pure ML models for temperature prediction in metal AM from an accuracy standpoint. However, the need for large datasets remains an unsolved issue in PIDL models. Obtaining data in the context of AM is often costly and time-consuming due to various factors such as the distinct production scales involved, design variability, slow process speed, and complex data processing [22,23]. Therefore, we suggest adopting the concept of algorithm unrolling, or deep unfolding, as a novel solution in the field of AM to benefit from PIDL models while addressing the data scarcity of current pure ML and PIDL approaches [24].
The idea of linking the iterative physics model to deep neural networks in additive manufacturing was first introduced in our previous work [24], where a proof of concept outlined the distinct advantages of this novel method in different applications, including thermal modeling, topology optimization, and tolerance design. The recent development of the deep unfolding framework has created an opportunity for a revolutionary integration of established data-driven architectures and physical models. In this approach, each iteration of an iterative physical model is unfolded onto one layer of a deep neural network [25–27]. These unfolded iterations leverage the predictive power of deep neural networks while utilizing physical knowledge. Moreover, they create sparser networks that are less data-hungry than fully connected networks, i.e., suitable for limited data scenarios, and are generally quicker to generate results. Furthermore, the structure of well-established models/algorithms for AM can be maintained and reflected correctly in the structure of the network, resulting in a higher degree of explainability [28]. Figure 1 illustrates the architecture-driven physics-informed machine learning approach pursued in this study alongside the literature's two most common physics-informed machine learning approaches in the context of additive manufacturing. The elements of the network that are manipulated based on the process physics in each approach are highlighted in red. In previous PIDL studies, physics-based information is mainly brought into the model by input feature engineering or loss function augmentation. In contrast, the focus of the current study (APIDL) is to bring physical equations into the layers/architecture of the network.
To the best of the authors' knowledge, the proposed approach was the first attempt at the time to shape the architecture of deep neural networks by incorporating physics-based models within the additive manufacturing domain. The proposed architecture-driven PIDL (APIDL) structure is inspired by the deep unfolding methodology, in which the iterations of the physical model are mapped onto the layers of deep neural networks. Our preliminary comparative study of temperature predictions in LPBF by APIDL and LSTM showed that APIDL had a 49% lower mean squared error (MSE) than LSTM and was 5–6 times faster under limited data scenarios (i.e., on the order of hundreds of samples) [29]. However, further study is required to fully evaluate the capabilities of APIDL under limited data conditions and its performance compared to other existing state-of-the-art approaches.
In the current study, an APIDL method is developed to unfold the iterations of a transient thermal model for the LPBF process into the layers of a neural network. To obtain the data for APIDL, a single-track single-layer LPBF process is considered. The goal is to predict the thermal history of three points on the track: the point of laser turn-on (start point), the point of laser turn-off (end point), and the middle point between them. To do so, laser power, scan speed, laser spot size, time-step, and coordinates on the track are used as inputs. After tuning the hyper-parameters, the model's performance is evaluated. To further demonstrate the capability of the proposed APIDL method in scenarios with limited data, the accuracy and computation time of APIDL are compared with those of different pure ML algorithms (ANN, extra trees regressor (ETR), LSTM, and support vector regression (SVR)) for temperature prediction in LPBF. The question of the minimum dataset size required for APIDL and the other ML methods to achieve similar accuracy is discussed. Finally, the performance of the proposed APIDL approach is compared with other PIDL models reported in the literature. The remainder of this paper is arranged as follows: Sec. 2 describes the APIDL methodology for temperature prediction. Section 3 presents the experimental setup and the preparation of the dataset needed for training the APIDL model. Section 4 first outlines the hyper-parameter tuning of the APIDL model, then analyzes the model's performance in detail and provides a comparative analysis highlighting the superiority of the APIDL method in limited data scenarios. Finally, Sec. 5 discusses the conclusions and future work.
2 Proposed Architecture-Driven Physics-Informed Deep Learning Model
In this section, the proposed APIDL model’s mathematical formulation and its algorithmic procedure are discussed in detail. Figure 2 depicts a flowchart of the proposed APIDL model. It can be seen that the process starts with building the APIDL architecture, followed by the training and testing process of the APIDL model. Finally, different pure ML algorithms for comparing the performance of APIDL are discussed.
2.1 Mathematical Formulation.
This concludes the formulation of one layer in APIDL. This structure is repeated to form a deep network with N number of layers. In the next section, these mathematical equations will be used in building the architecture of the APIDL model.
2.2 APIDL Implementation.
Python (TensorFlow library) is used for building the APIDL architecture and performing the training/testing process, as shown in the pseudocode in Table 1. The algorithm starts by initializing the material properties according to Table 2, followed by defining the matrices and vectors stated in Sec. 2.1. After setting the hyper-parameters, the architecture of the APIDL model is built, and the model is trained and tested. Figure 3 shows the details of the APIDL architecture and its training process. Each layer of the APIDL model is associated with one iteration of the transient thermal model. As mentioned before, the equations within the layers of the neural network are directly derived from the transient thermal model discussed in Sec. 2.1. In each layer, both the temperature from the previous iteration and the inputs are processed to calculate the temperature of the current iteration (layer). For the first layer (n = 1), the input temperature is a constant experimental value (given in Fig. 2). The iterations of the transient model continue layer by layer through the neural network until the last layer, n = N. The temperature computed at the last layer, i.e., the last iteration of the transient model (T(N)), is the output of the neural network and is taken as the predicted temperature. The goal is to find the optimal deep neural network parameters (W, b) that minimize the loss function. As depicted in Fig. 3, a least-squares loss function between the experimental values (Texperiment) and predicted values (Tpredicted) is minimized by an optimizer during the training process. The loss is calculated as the average of the squared errors over the i points in the temperature vector T, as shown in Table 1. The learning rate and optimizer type are the hyper-parameters that affect the accuracy of the model. After calculating the loss, the weights and biases need to be updated. According to the equations in each layer (Eqs. (12)–(15)), the weights (W1n, W2n, W3n, W4n, W5n) and biases (b1n, b2n, b3n, b4n) are linked to the thermal model. Unlike conventional neural networks, in which the numbers of weights and biases are determined by the number of neurons, in APIDL they are linked to the equations of the transient thermal model. Weights and biases were initialized with random values drawn from a normal distribution (with a standard deviation of 0.01 in the current study). The weights are defined as matrices of dimension (i, j), and the biases are defined as vectors of dimension (i). The process of calculating/updating the weights and biases in APIDL is, however, the same as in conventional neural networks; i.e., it is done automatically during model training in Python (TensorFlow). It is also worth mentioning that the material properties are updated during the solution process, as they are attached to the weights and biases. The stopping criterion for the APIDL model is the number of epochs, i.e., the number of times the entire training dataset is passed through the model. After the training process, the model's performance was investigated with unseen test data.
Program start in Python
    Set values of all variables as shown in Table 2
    Set hyper-parameter values for APIDL
    # Building the APIDL architecture
    Define the placeholders for inputs and outputs
    Compute matrices and vectors shown in Eqs. (2)–(8)
    Initialize temperature
    FOR each layer
        Initialize weights (W1n, W2n, W3n, W4n, W5n) and biases (b1n, b2n, b3n, b4n)
        Apply Eqs. (12)–(15) to calculate the temperature in each iteration based on the temperature of the previous iteration
    END FOR
    Assign calculated temperature at last layer (T(N)) as predicted temperature (Tpredicted)
    Define loss = (1/i) Σ (Tpredicted − Texperiment)², where i is the number of points in temperature vector T
    Define optimizer
    # Train (regular training and testing procedure)
    Split the data into train and test datasets
    FOR each epoch
        FOR each batch
            Perform forward pass with train data and predict temperature
            Perform backward pass and update weights and biases using optimizer to minimize loss
        END FOR
    END FOR
    # Test
    Perform predictions for test data and get metrics
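As a minimal illustration of the unfolding idea in Table 1, the sketch below implements one generic unfolded layer per thermal-model iteration in TensorFlow 2. Because Eqs. (12)–(15) are not reproduced here, a single linear update stands in for the actual thermal-model equations; the names (UnfoldedLayer, n_points, n_inputs, n_layers) and the input dimension of seven (laser power, scan speed, spot size, time-step, and three track coordinates) are illustrative assumptions, not the authors' implementation:

```python
import tensorflow as tf

class UnfoldedLayer(tf.keras.layers.Layer):
    """One layer = one iteration of the transient thermal model."""
    def __init__(self, n_points=3, n_inputs=7, **kwargs):
        super().__init__(**kwargs)
        init = tf.keras.initializers.RandomNormal(stddev=0.01)  # as in Sec. 2.2
        # These trainable variables play the role of W1n..W5n and b1n..b4n in
        # Eqs. (12)-(15); here they are collapsed into one linear update.
        self.W_T = self.add_weight(shape=(n_points, n_points), initializer=init, name="W_T")
        self.W_u = self.add_weight(shape=(n_points, n_inputs), initializer=init, name="W_u")
        self.b = self.add_weight(shape=(n_points,), initializer=init, name="b")

    def call(self, T_prev, u):
        # T(n) depends only on T(n-1) and the process inputs, so the network
        # stays sparse compared to a fully connected architecture.
        return tf.linalg.matvec(self.W_T, T_prev) + tf.linalg.matvec(self.W_u, u) + self.b

class APIDL(tf.keras.Model):
    """N thermal-model iterations unfolded onto N network layers."""
    def __init__(self, n_layers=10, **kwargs):
        super().__init__(**kwargs)
        self.unfolded = [UnfoldedLayer() for _ in range(n_layers)]

    def call(self, inputs):
        T, u = inputs          # T: initial experimental temperature; u: process inputs
        for layer in self.unfolded:
            T = layer(T, u)
        return T               # T(N), taken as the predicted temperature

model = APIDL()
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.005), loss="mse")
```

Training then proceeds as in Table 1 by passing the initial temperatures and process inputs to model.fit as a tuple.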
Parameter/property | Symbol | Starting value | Unit
---|---|---|---
Heat capacity | Cp | 0.6 | J/(g·K)
Density | ρ | 0.00799 | g/mm^3
Thermal conductivity | k | 0.02 | W/(mm·K)
Convection heat transfer coefficient | h | 0.00005 | W/(mm^2·K)
Stefan–Boltzmann constant | σ | 5.6704 × 10^(−14) | W/(mm^2·K^4)
Surface emissivity | ε | 0.2 | –
2.3 Model Comparison.
To evaluate the effectiveness of the proposed model, it is beneficial to compare the performance of APIDL with that of pure machine learning models, in order to highlight the significance of imposing physical models inside the machine learning architecture. To this end, different machine learning algorithms, including LSTM, ANN, ETR, and SVR, are compared with the APIDL model. Each model was chosen because it has been proven in at least one of the contexts relevant to the current study: limited data, time-series data, or temperature prediction in AM. The motivation for choosing LSTM [13] is its superior performance in time-series prediction problems, which resemble the temperature prediction in the LPBF process addressed here. The ANN [32], ETR [14], and LSTM [13] have been proven effective for temperature prediction in AM. The SVR model is considered because it is one of the machine learning models suitable for limited data scenarios [33]. The size of the dataset in this study (on the order of hundreds) is smaller than in the majority of existing similar studies (on the order of thousands).
3 Experimental Setup and Data Preparation
A single-track single-layer LPBF experimental setup was considered in the current study to collect the data for training the APIDL model. A schematic view of the experimental setup is shown in Fig. 4. Miniature powder bed specimens of Ti64 material with a length of x = 2 mm were created to mimic the commercial AM process. The synchronized high-speed infrared imaging experiments were conducted at beamline 32-ID of the Advanced Photon Source at Argonne National Laboratory, using an IPG YLR-500-AC laser and a Telops Fast M3K high-speed IR camera. Additional information regarding the experimental setup can be found in Ref. [34]. The in-situ LPBF system used is experimentally customizable and allows for investigating numerous factors, including laser power, laser scan speed, and laser spot size. By altering these three factors, ten experiments were conducted, as detailed in Table 3. In each experiment, the laser power, laser scan speed, and spot size are first selected. Next, the laser is positioned over the start point (x = 0) and moved along the x-direction to pass through the middle and end points, as shown in Fig. 4(b); the laser turns off at x = 2 mm. In the meantime, near on-axis IR images are taken with a resolution of 192 × 100 pixels and a time-step of τ = 0.1 ms. The same time-step is used in the transient thermal model in the APIDL architecture. After conducting the experiments, temperature data from the IR images were collected as .csv files through the Fiji-ImageJ software. Temperature data for the start, middle, and end points (as depicted in Fig. 4(b)) along the deposited tracks were extracted for each experiment, and a calibration factor of 2.07 was used to convert pixel values to temperature values. Consequently, the temperature data (T) at each time-step is a 3 × 1 vector of the temperatures at the start, middle, and end points. The start point is taken as the point where the temperature is maximal right after the laser is turned on, and the end point as the point where the temperature is maximal right after the laser is turned off. The middle point is the center point on the line joining the start and end points. To reduce bias while extracting the temperature values, the temperature at a pixel is taken as the average of that pixel and its eight neighboring pixels. To cover the entire temperature cycle during the LPBF process (i.e., melting, solidifying, and cooling), image collection starts before the laser turns on and continues until the laser turns off and the specimen cools down to a semi-steady-state temperature. While 2000 images (samples) were recorded per experiment, most of them had no significance, as they belonged either to the time before the laser start or to the semi-steady state after laser turn-off and specimen cool-down. Therefore, only the 100 useful images that include the entire temperature cycle (starting 5 steps before the laser start) were used for data extraction. This highlights the fact that gathering the data is a time-consuming task, and it is much more beneficial to have models that can run with a limited amount of data.
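A hedged sketch of this extraction step is given below: each pixel is averaged with its eight neighbors via a 3 × 3 mean filter and then scaled by the calibration factor of 2.07. The array shapes, point coordinates, and the use of scipy are illustrative assumptions, not the authors' exact Fiji-ImageJ workflow:

```python
import numpy as np
from scipy.ndimage import uniform_filter

CALIBRATION = 2.07  # pixel value -> temperature (Sec. 3)

def extract_point_temperatures(frames, points):
    """frames: (n_frames, 100, 192) array of IR pixel values;
    points: [(row, col), ...] for the start, middle, and end points."""
    temps = np.empty((len(frames), len(points)))
    for t, frame in enumerate(frames):
        # 3x3 mean filter = average of a pixel and its eight neighbors
        smoothed = uniform_filter(frame.astype(float), size=3)
        temps[t] = [CALIBRATION * smoothed[r, c] for r, c in points]
    return temps  # each row is the 3x1 temperature vector T at one time-step

# Hypothetical pixel coordinates for the three points:
# T = extract_point_temperatures(frames, [(50, 10), (50, 96), (50, 182)])
```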
Experiment number | Laser power (W) | Scan speed (mm/s) | Laser spot size (mm)
---|---|---|---
Training dataset | | |
1 | 311 | 600 | 0.04
2 | 311 | 800 | 0.04
3 | 311 | 400 | 0.04
4 | 311 | 600 | 0.04
5 | 197 | 600 | 0.08
6 | 254 | 600 | 0.08
7 | 426 | 600 | 0.08
8 | 311 | 600 | 0.08
Testing dataset | | |
9 | 254 | 600 | 0.06
10 | 311 | 400 | 0.08
To prepare the dataset required for training and testing APIDL, the laser power, scan speed, laser spot size, time-step, and track coordinates of the start, middle, and end points were used as inputs, while the temperatures at the start, middle, and end points were the outputs. The goal is to predict the temperature values at the three aforementioned points on the track at different times. For training and testing the APIDL and pure ML models, 1000 input–output samples were used as the dataset, of which 80% (experiments 1–8) was selected for training and 20% (experiments 9 and 10) was used for testing.
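A minimal sketch of assembling this dataset is shown below, assuming per-experiment records of 100 time-steps each; the dictionary keys, column order, and the helper name build_dataset are illustrative assumptions:

```python
import numpy as np

TAU = 1e-4  # time-step of 0.1 ms (Sec. 3)

def build_dataset(experiments):
    """experiments: list of dicts with keys 'power', 'speed', 'spot',
    'coords' (three track coordinates), and 'T' of shape (100, 3)."""
    X, y = [], []
    for exp in experiments:
        for step, T in enumerate(exp["T"]):
            X.append([exp["power"], exp["speed"], exp["spot"], step * TAU, *exp["coords"]])
            y.append(T)  # temperatures at the start, middle, and end points
    return np.asarray(X), np.asarray(y)

# 80:20 split by experiment, matching Table 3:
# X_train, y_train = build_dataset(experiments[:8])  # experiments 1-8
# X_test,  y_test  = build_dataset(experiments[8:])  # experiments 9 and 10
```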
4 Results and Discussion
This section first explains the effect of the hyper-parameters on the proposed APIDL model's performance. Then, a detailed analysis of the APIDL results for temperature prediction in the LPBF process is discussed. Furthermore, a comprehensive comparison is conducted to evaluate different pure ML models and the proposed APIDL model in terms of accuracy and dataset size. Finally, the APIDL model's performance is compared with those of previously published PIDL models for the prediction of thermal fields in different AM processes. All results in the current study were generated using a standard PC with an Intel Core i7 2.90 GHz processor and 16 GB of memory.
4.1 Hyper-Parameter Tuning of APIDL Model.
Hyper-parameter tuning is significant in ML models as it plays an important role in achieving higher model accuracy. Trial and error methods and optimization algorithms have been exploited for choosing the optimal hyper-parameters by various researchers. In the current study, hyper-parameters are tuned using a stepwise search approach.
The learning rate, number of epochs, and optimizer type directly impact the performance of machine learning models. The effect of the learning rate on the MAPE of both the training and testing datasets is illustrated in Fig. 5(a). An optimal value was found by starting with a learning rate of 0.05 and gradually decreasing it to 0.001. The learning rate of 0.05 generalizes poorly, performing well on the training set (MAPE of 2.9%) but noticeably worse on the testing set (MAPE of 4.2%). As the learning rate is decreased to 0.005, the errors drop to minima of 2.77% and 2.8% for the training and testing datasets, respectively. Further decreasing the learning rate is not desirable because of the risk of getting stuck at a high MAPE and of slower training. Therefore, a learning rate of 0.005 was selected based on the lower training and testing MAPE values.
The impact of the number of epochs on the training and testing losses for a learning rate of 0.005 is depicted in Fig. 5(b). As the number of epochs increases, both the training and testing losses decrease and flatten out, which is a good indicator; there is no sign of overfitting or underfitting. The testing loss being lower than the training loss suggests that the testing set is easier to predict than the training set.
Choosing an appropriate optimizer is highly important for minimizing the training and testing errors. Adam and Adagrad are known as powerful optimizers for neural networks [35]. Different learning rates were tested for both optimizers, and the training and testing MAPE at the optimal learning rate of each optimizer are shown in Table 4. The Adam optimizer performs better than the Adagrad optimizer and was therefore used throughout the current study.
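The stepwise search described above could be scripted as in the following sketch, which reuses the APIDL sketch from Sec. 2.2; the data tensors, epoch count, and grid values beyond those reported in the paper are assumptions:

```python
import numpy as np
import tensorflow as tf

def mape(y_true, y_pred):
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

results = {}
for opt_cls in (tf.keras.optimizers.Adam, tf.keras.optimizers.Adagrad):
    for lr in (0.05, 0.01, 0.005, 0.001):  # stepped down from 0.05 to 0.001
        model = APIDL()  # from the Sec. 2.2 sketch
        model.compile(optimizer=opt_cls(learning_rate=lr), loss="mse")
        model.fit((T0_train, u_train), y_train, epochs=200, verbose=0)
        results[(opt_cls.__name__, lr)] = mape(y_test, model.predict((T0_test, u_test)))

best = min(results, key=results.get)  # e.g., ('Adam', 0.005), consistent with Table 4
```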
4.2 Analysis of the APIDL Model’s Performance.
Table 5 compares the performance of the APIDL model in predicting the start, middle, and end point temperatures for both the training and testing datasets. For all of the points, the MAPE is less than 3.16%, and the R² values are greater than 0.91. On average over these points, APIDL achieves a MAPE of 2.8% and an R² value of 0.936. This high accuracy with limited data (800 samples for training and 200 for testing) is possible because the thermal model is integrated into the neural network structure: the links between features across the layers of APIDL are predefined based on the thermal model, which assists APIDL during the learning phase.
Performance criteria | Training: start point | Training: middle point | Training: end point | Testing: start point | Testing: middle point | Testing: end point
---|---|---|---|---|---|---
MAPE (%) | 2.91 | 2.78 | 2.62 | 2.58 | 2.81 | 3.16
R² | 0.92 | 0.95 | 0.94 | 0.91 | 0.96 | 0.94
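The per-point MAPE and R² values reported in Table 5 follow the standard definitions and could be reproduced with scikit-learn, as in this minimal sketch (the (n, 3) array shapes are an assumption):

```python
import numpy as np
from sklearn.metrics import mean_absolute_percentage_error, r2_score

def report(y_true, y_pred, labels=("start", "middle", "end")):
    """y_true, y_pred: arrays of shape (n_samples, 3), one column per point."""
    for j, name in enumerate(labels):
        mape = 100 * mean_absolute_percentage_error(y_true[:, j], y_pred[:, j])
        r2 = r2_score(y_true[:, j], y_pred[:, j])
        print(f"{name:>6}: MAPE = {mape:.2f}%  R2 = {r2:.3f}")
```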
Relative deviations of the APIDL-predicted temperatures from the experimental values in the testing dataset are shown in Fig. 6 for the start, middle, and end points. Only 5.7% of the data lie outside relative deviations of ±10%, signifying that the APIDL predictions are highly accurate given the limited training data. The maximum deviation, with a value of 33%, occurs at the start point. This may be attributed to the laser turn-on instant, at which the model is unable to capture the sudden temperature surge. Additionally, at the very peak temperatures at the beginning of the experiment, a vapor plume is formed. The plume is not part of the thermal equations formulated earlier in Eq. (1), and thus its impact (in a gaseous state) is not captured by the physical thermal model (which considers the liquid and solid states).
To further examine the data from a physical viewpoint, one of the experiments from the testing dataset (experiment 10 in Table 3) is examined thoroughly as a representative sample. The variations of the experimental and APIDL-predicted temperatures at the start, middle, and end points versus time and collected data samples are shown in Fig. 7. For all points, the temperature remains roughly constant at the beginning and increases as the laser approaches the point. After the laser reaches each point, the temperature rises to maximum values of 2326 K, 2615 K, and 2717 K at the start, middle, and end points, respectively. The increase in maximum temperature along the track from start to end point may be attributed to heat conduction inside the specimen. Moreover, the model slightly underestimates the temperatures when the laser reaches the start, middle, and end points. The points with maximum temperatures and their adjacent points show the highest deviations between experimental and predicted values. As the melting point of Ti64 is 1877 K [31], it may be concluded that the APIDL model predicts lower values for the material in the liquid phase and has higher errors at these points. Future research could study the use of relevant liquid and molten-state physical relationships to enhance the predictions at these points. It is also apparent that the temperature drops after the laser passes each point, and the temperatures tend toward steady-state values.
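As a small illustrative check of this observation, one could split the test-set deviations by the Ti64 melting point of 1877 K [31]; the array names below are assumptions carried over from the earlier hedged sketches:

```python
import numpy as np

T_MELT = 1877.0  # K, Ti64 melting point [31]

# y_test, y_pred: experimental and APIDL-predicted temperatures, shape (n, 3)
rel_dev = 100 * (y_pred - y_test) / y_test  # relative deviation (%)
liquid = y_test > T_MELT                    # samples in the molten state
print("mean |deviation| above melting point:", np.mean(np.abs(rel_dev[liquid])))
print("mean |deviation| below melting point:", np.mean(np.abs(rel_dev[~liquid])))
```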
4.3 Comparative Analysis.
To assess the APIDL model's ability to handle scarce data, it is compared with other pure machine learning models, including ANN, ETR, LSTM, and SVR. The hyper-parameters of these models were found by trial and error and are shown in Table 6. It is worth mentioning that identical experimental data were split and used for training and testing the APIDL and pure ML models to enable a fair performance comparison.
ML model | Hyper-parameters |
---|---|
ANN | Number of hidden layers = 5, Number of neurons in a layer = 256, Optimizer = “Adam” |
ETR | Maximum depth of a tree = 20, Minimum number of samples for splitting = 5, Number of trees = 1000 |
LSTM | Number of layers = 5, Number of neurons = 25, Optimizer = “Adam”, Dropout = 0.2 |
SVR | Kernel = “poly”, Kernel coefficient = 3, Penalty parameter = 0.5, Epsilon = 0.02 |
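The baselines could be instantiated from the Table 6 hyper-parameters as sketched below. The mapping of "kernel coefficient" to scikit-learn's gamma, the activation functions, and the multi-output wrapper for SVR are assumptions not specified in the table:

```python
import tensorflow as tf
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

etr = ExtraTreesRegressor(n_estimators=1000, max_depth=20, min_samples_split=5)

# SVR is single-output, so one regressor is fitted per point temperature
svr = MultiOutputRegressor(SVR(kernel="poly", gamma=3, C=0.5, epsilon=0.02))

ann = tf.keras.Sequential(
    [tf.keras.layers.Dense(256, activation="relu") for _ in range(5)]
    + [tf.keras.layers.Dense(3)]  # start, middle, end temperatures
)
ann.compile(optimizer="adam", loss="mse")

lstm = tf.keras.Sequential(
    [tf.keras.layers.LSTM(25, dropout=0.2, return_sequences=True) for _ in range(4)]
    + [tf.keras.layers.LSTM(25, dropout=0.2), tf.keras.layers.Dense(3)]
)
lstm.compile(optimizer="adam", loss="mse")
```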
Figure 8 compares the MAPE and training time of the different pure ML models with those of the APIDL model. The APIDL model outperforms all the studied pure ML models with the lowest MAPE of 2.8%. Among the pure ML models, ETR has the best performance with a MAPE of 6.43%, which is still less accurate than APIDL for the current number of training points. The superiority of ETR over the other pure ML models in this case may be attributed to the fact that ETR is a tree-based algorithm, which trains and predicts efficiently under limited data. While the LSTM achieves a MAPE of 7.1%, the ANN and SVR fail in this case due to the limited dataset size. Regarding the training time, the ANN and LSTM take considerably longer than APIDL, ETR, and SVR.
Although APIDL belongs to the same category of deep learning models as ANN and LSTM, it takes less time because the iterative thermal model is mapped onto the layers of the neural network. This mapping limits the number of connections between layers, making the network sparse, with significantly fewer connections than fully connected deep learning models such as ANN. Therefore, APIDL is both more accurate and faster at predicting results under limited data scenarios than pure deep learning models. Considering both the MAPE and the training time, APIDL is compared in more detail with the top two pure ML models, ETR and LSTM, in Table 7. The training time of ETR is lower than those of APIDL and LSTM, which might be attributed to the inherent efficiency of tree-based models in the learning phase. Moreover, ETR fits the training set better than LSTM and APIDL. Nevertheless, APIDL outperforms both ETR and LSTM in terms of both MAPE and R² on the test dataset: the MAPE and R² of APIDL are 55.7% lower and 15.6% higher than those of ETR, and 59.9% lower and 39.7% higher than those of LSTM, respectively.
Model | Train MAPE (%) | Train R² | Train time (s) | Test MAPE (%) | Test R² | Test time (s)
---|---|---|---|---|---|---
APIDL | 2.77 | 0.943 | 87.93 | 2.85 | 0.936 | 0.46
LSTM | 6.8 | 0.68 | 147 | 7.1 | 0.67 | 4.74
ETR | 0.4 | 0.996 | 3.66 | 6.43 | 0.81 | 0.3
While these results show that APIDL promises higher accuracy with a moderate training time, it is of interest to study how its performance compares with the other approaches as the training dataset shrinks, and to explore the minimum dataset size the proposed APIDL approach requires to achieve a reasonable accuracy of at least 0.80 R² or at most 10% MAPE. To answer this question and evaluate the sensitivity of the model to the training size, APIDL was repeatedly trained on fewer data points (i.e., fewer experiments) while keeping the test set fixed to experiments 9 and 10.
Figure 9 shows the variations of R², training time, MAPE, and mean prediction time for various training dataset sizes; every 100 data points correspond to one experiment, as listed in Table 3. For APIDL, 400 data points (four experiments) are enough to achieve at least 0.80 R², and 300 data points (three experiments) are enough to get below 10% MAPE. This means that to achieve the desired level of accuracy, APIDL needs labeled data on the order of hundreds, making it useful for scenarios where obtaining labeled data is expensive. For LSTM, over 800 data points would be required to reach 0.80 R², whereas 800 data points suffice for ETR. To reach the 10% MAPE level, LSTM needs at least 500 data points (five experiments) and ETR at least 400 (four experiments). Therefore, APIDL can achieve the same prediction accuracy (10% or lower MAPE) with 40% and 25% less training data than LSTM and ETR, respectively. This makes APIDL especially useful when the cost of collecting labeled data through experiments or numerical simulations is high, as in additive manufacturing. According to Fig. 9, the training time of APIDL lies between those of ETR and LSTM for all dataset sizes, while the mean prediction time at the desired accuracy level is almost identical to that of ETR. LSTM is inferior to both APIDL and ETR in terms of the data required and the mean prediction time. Overall, the APIDL model achieves better prediction accuracy while using fewer data points and taking only about one to two minutes of computation time, which is especially beneficial in LPBF additive manufacturing, where data collection and preprocessing can be expensive. Compared to models such as ETR, ANN, and LSTM, the APIDL model's performance is favorable in this regard.
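The dataset-size study could be scripted as in the hedged sketch below, which retrains a model on progressively fewer experiments while keeping experiments 9 and 10 as the fixed test set; build_dataset comes from the Sec. 3 sketch, and make_model stands for any of the compared regressors:

```python
import time
import numpy as np
from sklearn.metrics import r2_score

def train_and_score(make_model, X_tr, y_tr, X_te, y_te):
    """Returns (test MAPE, test R2, training time, mean prediction time)."""
    model = make_model()
    t0 = time.perf_counter()
    model.fit(X_tr, y_tr)
    train_time = time.perf_counter() - t0
    t0 = time.perf_counter()
    pred = model.predict(X_te)
    mean_pred_time = (time.perf_counter() - t0) / len(X_te)
    mape = 100 * np.mean(np.abs((y_te - pred) / y_te))
    return mape, r2_score(y_te, pred), train_time, mean_pred_time

# history = {}
# for k in range(8, 0, -1):  # 800 training points down to 100
#     X_tr, y_tr = build_dataset(experiments[:k])
#     history[100 * k] = train_and_score(make_model, X_tr, y_tr, X_test, y_test)
```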
Table 8 compares the PIDL models used in the literature for predicting thermal fields in different AM processes with the APIDL proposed in the current study. Although a direct comparison is not feasible because these models were run on different datasets and computation systems, it is noticeable that the APIDL model needs fewer data points than most of the reported PIDL models. Most of the previous PIDL studies modified the components of neural networks, e.g., the loss function, whereas this study is among the first to modify the actual structure of the neural network by integrating AM-specific physical models. In the future, a comparative study between the two approaches, i.e., modifying the loss function of a neural network versus unfolding a physical model onto a neural network, will be conducted.
Ref | Year | AM process | Inputs | Outputs | Training dataset size | Metrics | Training time (s) | Approach
---|---|---|---|---|---|---|---|---
[36] | 2021 | Stereolithography | Time, spatial coordinates | Temperature field | 200,000 | Maximum error: <3.5 °C | 1800 | Physics-informed loss function augmentation
[37] | 2021 | Directed energy deposition | Spatial coordinates and time | Temperature | 80 (1000 collocation points generated for each iteration of the batch) | MSE: 0.43; root mean square error (RMSE): 0.6 °C | 2975 | Physics-informed loss function augmentation
[21] | 2022 | Directed energy deposition | Laser power, scanning speed, time, and spatial coordinates | Melt-pool temperature and thermal history | 6,000 per layer | Mean relative error: 1.28%; MAE: 1.62 °C; RMSE: 4.32 °C; accuracy: 90% | 622.4 | Physics-informed loss function augmentation
[8] | 2021 | LPBF | Material properties, laser power, and scanning speed | Temperature field and cooling rate | 788,651 | Relative discrepancy: 37.7%, 8.1%, and 7.8% | – | Physics-informed loss function augmentation
[29] | 2022 | LPBF | Spatial coordinates, laser power, scan speed, laser spot size, and time | Temperature | 400 | – | 106.3 | Physics-informed architecture shaping (APIDL) (authors' previous work)
Current study | 2022 | LPBF | Spatial coordinates, laser power, scan speed, laser spot size, and time | Temperature | 800 | MAPE: 2.85%; R²: 0.936 | 87.93 | Physics-informed architecture shaping (APIDL)
5 Conclusion
In this work, an architecture-driven physics-informed deep learning methodology named APIDL was established to predict temperature with limited data obtained from the LPBF additive manufacturing process. In this approach, the iterations of the transient thermal model for LPBF are unfolded onto the layers of a deep neural network. Experiments were performed to create a dataset of 1000 samples. The APIDL model's hyper-parameter tuning and performance were studied in detail. The proposed APIDL model achieves a testing MAPE of 2.8% and an R² value of 0.936. Additionally, the APIDL model was compared with existing machine learning approaches, including ETR, SVR, ANN, and LSTM networks. Despite the limited training data, APIDL outperformed all the other models in prediction accuracy with a relatively moderate computation time. The MAPE and R² of APIDL are 55.7% lower and 15.6% higher, respectively, than those of the best pure ML model, ETR. The impact of the dataset size on the performance of the APIDL, ETR, and LSTM networks was studied; the results show that APIDL needs 40% and 25% less training data than LSTM and ETR, respectively, to achieve 10% or lower MAPE. Based on the training data sizes, accuracies, and training times reported for existing PIDL models in the literature, the proposed APIDL approach is an attractive choice for limited data scenarios and has a lower training time than previously reported PIDL models.
Future research will extend the application of the APIDL model to multiple-track and multiple-layer AM using relevant experimental data and appropriate thermal models. Moreover, the combination of the APIDL model with previous physics-informed deep learning approaches (e.g., input feature engineering and loss function augmentation) will be studied. The application of the APIDL technique, i.e., mapping iterations onto the layers of deep neural networks, to different aspects of AM, ranging from design to process control and quality assurance, will also be explored.
Acknowledgment
This research used resources of the Advanced Photon Source, a U.S. Department of Energy (DOE) Office of Science User Facility operated for the DOE Office of Science by Argonne National Laboratory (ANL) under Contract No. DE-AC02-06CH11357. It is also supported using Laboratory Directed Research and Development funds via Argonne National Laboratory under Contract No. DE-AC02-06CH11357.
Conflict of Interest
There are no conflicts of interest.
Data Availability Statement
The datasets generated and supporting the findings of this article are obtainable from the corresponding author upon reasonable request.