Development of artificial intelligence-based model for prediction
of the compressive strength of self-compacting concrete
Hai-Bang Ly1,*, Binh Thai Pham1, Thuy-Anh Nguyen1, May Huu Nguyen 1,2,*
1
Civil Engineering Department, University of Transport Technology, 54 Trieu Khuc,
Thanh Xuan, Hanoi 100000, Vietnam
2
Civil and Environmental Engineering Program, Graduate School of Advanced Science
and Engineering, Hiroshima University, 1-4-1, Kagamiyama, Higashi-Hiroshima,
Hiroshima 739-8527, Japan
* Corresponding authors
Email addresses: (H.-B. Ly), (B. T. Pham),
(T.-A. Nguyen), and (M. H. Nguyen)
Abstract:
This study investigated the usability of an artificial neural network (ANN) model and the
Grey Wolf Optimizer (GWO) method for predicting the compressive strength of self-compacting concrete (SCC). The ANN-GWO model was developed using an experimental
database of 115 samples obtained from various sources considering nine key factors of
SCC. The validation of the proposed model was evaluated via six indices including
correlation coefficient, root mean square error, mean absolute error, IA, Slope, and mean
absolute percentage error. In addition, the importance of each parameter affecting the
compressive strength of SCC was investigated utilizing partial dependence plots. The
findings demonstrated that the proposed ANN-GWO model is a reliable predictor of SCC
compressive strength. Following that, an examination of the parameters impacting the
compressive strength of SCC was provided.
Keywords: Artificial Neural Network (ANN); Grey Wolf Optimizer (GWO) algorithm;
compressive strength; self-compacting concrete
1. Introduction
In the sphere of construction and building, concrete is the most commonly used material due to
its ease of production, low cost, and valuable structural characteristics [1,2]. It may be used
in a broad variety of structures such as buildings, bridges, roads, and dams. In line with
scientific progress, the need for high-performance concrete is continuously growing.
As a result, several special concrete types have been proposed, with
notable physicochemical and fresh-state properties [3–5].
The construction industry in Japan has quickly adopted the use of self-compacting concrete
(SCC), a concrete type that can approach and fill the corners of formwork without the
requirement for a compaction phase [6,7]. Since then, various studies have focused on
developing the applications of this kind of concrete [8,9]. On the one hand, SCC is a kind
of high-performance concrete, exhibiting flexible deformability, good segregation resistance,
and less blocking around the reinforcement. The exclusion of the compaction step
brings several advantages to SCC, including economic efficiency (e.g., accelerated casting
speed; savings in labor, energy, and cost), an enhanced working environment, and a
novel approach to automating concrete construction [6,10–13].
On the other hand, to achieve its desired flowable behaviors and proper mechanical
properties, SCC requires a complex manipulation of several mixture variables [10,11]. For
instance, the water-to-binder (w/b) ratio of SCC is lower than conventional concrete, which
is usually supported by special additives and superplasticizers to obtain the desired
workability [14–17]. Also, the grading of the aggregates, together with aggregate shape,
texture, mineralogy, and strength, is always carefully considered to ensure workability and
concrete strength [18,19]. These features pose a significant challenge to establishing a
universal correlation between the SCC properties and its constituent parameters [8,9,20]. In
other words, they bring out the need for predicting the properties of SCC in both the fresh
and hardened stages. The traditional approach applies analytical models to represent the
influence of each of these parameters on the properties of SCC, and then optimizes these
models using regression analysis. However, no explicit equations have been established so
far, as these methods are less productive for nonlinear and complicated data [21,22].
In this regard, over the past few decades, various modeling methods utilizing artificial
intelligence (AI) techniques have been adopted, such as artificial neural networks (ANNs),
genetic algorithm (GA), and expert system (ES) for modeling a variety of current problems
in the field of civil engineering [23–25]. Among these, ANN is a more prevalent and
efficient approach owing to its ability to capture the interrelationships among input-output
data pairs. Numerous researchers have proposed their own ANN models for
predicting the concrete strength [26–28]. Regarding SCC, several models have also been
presented for predicting the compressive strength [29–31]. Yeh demonstrated early on the
potential of adapting ANNs to predict high-performance concrete's compressive
strength [29]. The viability of utilizing ANNs to forecast the characteristics of SCC that
uses fly ash as a cementitious substitute was examined by Douma et al. [30]. In these
models, numerous experimental results were collected from the previous studies and
employed for training and evaluating the proposed model. Siddique et al. presented the
usability of neural networks for predicting the compressive strength of SCC based on some
input properties [31]. Their proposed model could be easily extended to different input
parameters of the experimental results, containing bottom ash as a replacement of sand.
Despite this, there has not been a detailed investigation into an improved ANN model for
predicting the compressive strength of SCC. The need for a novel, appropriate artificial
neural network model to forecast the strength of SCC continues to grow, in step with the
advancement of scientific knowledge.
Therefore, in the present research, the artificial neural network (ANN) approach coupled
with the Grey Wolf Optimizer (GWO) algorithm for forecasting the compressive strength
of SCC is examined. For this target, a variety of databases from different independent
sources was gathered and employed to train and assess the proposed model. The ANN
model is established on the basis of two groups of input parameters, including concrete
mixture components (i.e., the contents of binder, fine and coarse aggregates,
superplasticizer and water-to-binder ratio), and the fresh properties of SCC such as slump
flow, V-funnel and L-box tests. The output predicted parameter is the compressive strength
of SCC. The influence of the used parameters on the compressive strength of SCC was then
discussed.
2. Materials and methods
2.1. Machine learning methods
2.1.1. Artificial Neural Network (ANN)
Artificial Neural Network (ANN) is being widely used to solve prediction problems by
drawing on biology's understanding of how the nervous system functions [32–35]. ANN
contains many simple processing elements, the so-called neurons. An ANN is made up of
nodes and connections that are divided into three layers: the input layer, the hidden layer, and
the output layer. Through a training process, the neural network produces a model that can
predict a target parameter from a provided input value [36].
In general, an ANN includes the minimum number of neurons that can reproduce the training
process. Each link between nodes carries a weight representing an earlier learning stage. By
adjusting these weights, the input-output correlation can be established. The system has to be
trained to reproduce the input-output correlation, yielding the so-called optimal weights
[37,38]. In an ANN model, the correlation between the input and output variables is
determined by the collected data points. Because the neurons operate largely independently
of one another, a large number of operations can be executed simultaneously.
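As an illustration of the structure just described, the forward pass of a one-hidden-layer network can be sketched in a few lines (a minimal sketch for illustration only, not the authors' implementation; the layer sizes, tanh activation, and random weights are assumptions):

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """Forward pass of a one-hidden-layer ANN: input -> hidden -> output."""
    h = np.tanh(W1 @ x + b1)   # hidden layer with tanh activation
    return W2 @ h + b2         # linear output layer

# Tiny example: 9 inputs (as in this study), 4 hidden neurons, 1 output
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 9)), np.zeros(4)
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)
y = forward(rng.normal(size=9), W1, b1, W2, b2)
```

Training then amounts to searching for the weight and bias values that minimize the prediction error, which is the role the GWO algorithm plays in this study.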
In order to take advantage of these benefits, most proposed models determine the number
of hidden layers and the number of nodes using a rule of thumb or by searching among
candidate designs that meet specific criteria. Furthermore, appropriate values of
parameters such as the learning rate and momentum are needed for the chosen hidden layers
and nodes [29–31,39]. Finally, all of the cited studies stated that ANN is a reliable
method for estimating the compressive strength of concrete.
2.1.2. Grey Wolf Optimizer (GWO) algorithm
Over the past two decades, metaheuristic optimization algorithms have commonly been applied
in most engineering fields. For example, the Grey Wolf Optimizer (GWO) algorithm,
developed by Mirjalili et al., was inspired by the leadership hierarchy and
hunting behavior of grey wolf packs [40]. To simulate the management
hierarchy, each wolf pack comprises four main types of grey wolves: alpha
(α), beta (β), delta (δ), and omega (ω). In this structure, grey wolves follow strict rules
which clearly divide their responsibilities. Accordingly, α wolves work as the most
responsible wolves, whereas ω wolves have the least responsibility (Fig. 1). The following
orders in the pack are β and δ wolves, respectively.
Each location of a grey wolf in the GWO algorithm represents a candidate solution to the
optimization problem. From a mathematical standpoint, the optimal option is the α wolf,
i.e., the wolf closest to the prey, while the β and δ wolves hold the second- and third-best
solutions at every iteration. The positions of all other wolves (i.e., the ω
ones) are determined by the positions of the α, β, and δ wolves. On the basis of this
technique, several works [41–43] focused on the reliability of the GWO model for
estimating compressive strength.
Fig. 1. The categorized leadership structure of grey wolves.
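The position-update scheme described above can be sketched as follows (a minimal illustration of the canonical GWO update equations, not the authors' code; the sphere objective and all parameter values are assumptions for demonstration):

```python
import numpy as np

def gwo(objective, dim, n_wolves=20, n_iter=100, lb=-1.0, ub=1.0, seed=0):
    """Minimal Grey Wolf Optimizer: every wolf moves toward the mean of
    positions suggested by the three best solutions (alpha, beta, delta)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, size=(n_wolves, dim))
    for t in range(n_iter):
        fitness = np.apply_along_axis(objective, 1, X)
        alpha, beta, delta = X[np.argsort(fitness)[:3]]
        a = 2 - 2 * t / n_iter                 # linearly decreases 2 -> 0
        for i in range(n_wolves):
            moved = []
            for leader in (alpha, beta, delta):
                r1, r2 = rng.random(dim), rng.random(dim)
                A, C = 2 * a * r1 - a, 2 * r2
                D = np.abs(C * leader - X[i])  # distance to the leader
                moved.append(leader - A * D)
            X[i] = np.clip(np.mean(moved, axis=0), lb, ub)
    fitness = np.apply_along_axis(objective, 1, X)
    return X[np.argmin(fitness)]

# Demonstration on the sphere function, whose minimum is at the origin
best = gwo(lambda x: np.sum(x**2), dim=3, lb=-5, ub=5)
```

In the present study, each wolf's position encodes the weights and biases of the ANN, and the objective is the network's prediction error.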
2.2. Database construction
To realize the objective of the current study, the dataset containing 115 SCC compressive
strength data is collected from 12 published experimental works [44–55]. The ANN model
is designed with nine inputs, such as the water/powder ratio (W/B), coarse aggregate (C),
fly ash percentage (P), fine aggregate (F), slump flow (D), binder content (B), V-funnel
test, superplasticizer dosage (SP), and L-box test. In detail, the values of W/B, C, P, F, D,
and B range between 0.26 and 0.45, 590 and 935 kg, 0 and 60%, 656 and 1038 kg, 480 and
880 mm, and 370 and 733 kg, respectively. The V-funnel test value ranges from 1.95 to
19.2, the superplasticizer dosage is between 0.74 and 21.84 kg, and the L-box test value is
between 0.6 and 1.0. Besides, the compressive strength values are in the range of 10.2 to
86.8 MPa. Specifically,
statistical analysis of input and output variables is detailed in Table 1.
Table 1
Statistical analysis of the inputs and output.
Variable               Unit   Task     Min     Average   Max     St.D1     Range
W/B                    -      Input    0.26    0.37      0.45    0.06      0.19
C                      kg     Input    590     742.63    935     121.809   345
P                      %      Input    0       28.7      60      16.5859   60
F                      kg     Input    656     852.8     1038    89.931    382
D                      mm     Input    480     660.5     880     56.108    330
B                      kg     Input    370     523.4     733     71.221    363
V-funnel test          -      Input    1.95    7.75      19.2    3.844     17.2
SP                     kg     Input    0.74    8         21.84   4.669     21.1
L-box                  -      Input    0.6     0.86      1       0.0935    0.4
Compressive strength   MPa    Output   10.2    48.22     86.8    17.555    69.8
1St.D: standard deviation.
Herein, the proposed ANN model is trained using 70 percent of the 115 experiments, while
30 percent of the data are utilized to evaluate the model. Thus, there are 81 samples for the
training data set and 34 samples used to determine the projected performance of the ANN
network. All data are scaled within the range of [0,1] to reduce the numerical errors while
conducting simulations, as recommended in Witten et al. [56], using Equation (1):
scaled =
2( - )
- 1
-
(1)
where and are respectively the min and max values of variables, and is the
corresponding variable's value to be scaled.
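Applied per variable, Eq. (1) amounts to a one-line transformation. A minimal sketch (the sample values are taken from the compressive strength bounds reported in Table 1):

```python
import numpy as np

def scale(x, x_min, x_max):
    """Linearly map a variable into [-1, 1], following Eq. (1)."""
    return 2 * (x - x_min) / (x_max - x_min) - 1

# Compressive strength bounds from Table 1: 10.2 and 86.8 MPa
strengths = np.array([10.2, 48.22, 86.8])
scaled = scale(strengths, 10.2, 86.8)   # endpoints map to -1 and 1
```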
2.3. Quality assessment criteria
In this study, six statistical indicators were employed to assess the accuracy of the proposed
model, which are the correlation coefficient (R), root mean square error (RMSE), index of
agreement (IA), mean absolute error (MAE), slope, and mean absolute percentage error
(MAPE). To measure the correlation between the actual and predicted values in regression
problems, the R criterion, which is generally in the range [-1; 1], is extensively employed in
the literature [57]. The average degree of inaccuracy between actual and predicted outputs
is measured by the root mean square error (RMSE), mean absolute error (MAE), and mean
absolute percentage error (MAPE) [58]. In terms of quantitative accuracy, the smaller the
values of RMSE, MAE, and MAPE are, as well as the closer the absolute value of the
correlation coefficient is to one, the more accurate the machine learning model is. These
values are represented by:
$RMSE = \sqrt{\dfrac{1}{N}\sum_{j=1}^{N}\left(Q_{j,AV} - Q_{j,PV}\right)^{2}}$   (2)

$MAE = \dfrac{1}{N}\sum_{j=1}^{N}\left|Q_{j,AV} - Q_{j,PV}\right|$   (3)

$MAPE = \dfrac{1}{N}\sum_{j=1}^{N}\left|\dfrac{Q_{j,AV} - Q_{j,PV}}{Q_{j,AV}}\right| \times 100\%$   (4)

$R = \dfrac{\sum_{j=1}^{N}\left(Q_{j,AV} - \bar{Q}_{AV}\right)\left(Q_{j,PV} - \bar{Q}_{PV}\right)}{\sqrt{\sum_{j=1}^{N}\left(Q_{j,AV} - \bar{Q}_{AV}\right)^{2}\sum_{j=1}^{N}\left(Q_{j,PV} - \bar{Q}_{PV}\right)^{2}}}$   (5)

$IA = 1 - \dfrac{\sum_{j=1}^{N}\left(Q_{j,AV} - Q_{j,PV}\right)^{2}}{\sum_{j=1}^{N}\left(\left|Q_{j,AV} - \bar{Q}_{AV}\right| + \left|Q_{j,PV} - \bar{Q}_{AV}\right|\right)^{2}}$   (6)

where N is the number of data points; $Q_{j,AV}$ and $\bar{Q}_{AV}$ are the actual values
and the average actual value; $Q_{j,PV}$ and $\bar{Q}_{PV}$ are the predicted values and
the average predicted value computed by the forecasting model.
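The criteria above can be computed directly from paired actual/predicted vectors. A minimal sketch (the toy data are hypothetical; since the paper does not define its Slope criterion explicitly, a best-fit slope through the origin is assumed here):

```python
import numpy as np

def metrics(actual, predicted):
    """Compute the quality criteria of Eqs. (2)-(6), plus a slope."""
    a, p = np.asarray(actual, float), np.asarray(predicted, float)
    rmse = np.sqrt(np.mean((a - p) ** 2))                    # Eq. (2)
    mae = np.mean(np.abs(a - p))                             # Eq. (3)
    mape = np.mean(np.abs((a - p) / a)) * 100                # Eq. (4)
    r = np.corrcoef(a, p)[0, 1]                              # Eq. (5)
    ia = 1 - np.sum((a - p) ** 2) / np.sum(                  # Eq. (6)
        (np.abs(a - a.mean()) + np.abs(p - a.mean())) ** 2)
    slope = np.sum(a * p) / np.sum(a * a)  # assumed: fit through origin
    return dict(R=r, RMSE=rmse, MAE=mae, MAPE=mape, IA=ia, Slope=slope)

m = metrics([10.0, 20.0, 30.0], [12.0, 19.0, 33.0])
```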
2.4. Partial Dependence Plot
The partial dependence plot (PDP) was introduced by Friedman [59] for the purpose of
interpreting complex machine learning algorithms. Some algorithms are predictive, but
they do not show whether a variable has a positive or negative effect on the model. Hence,
the partial dependence plot helps depict the functional relationship between the inputs and
the targets. At the same time, PDP can show whether the relationship between the target
and a feature is linear, monotonic or more complex.
Let $X = \{X_1, X_2, ..., X_n\}$ be the inputs for a model with the predictive function
Y(X). X is subdivided into a subset $X_M$ and its complement $X_N = X \setminus X_M$.
For the output Y(X) of a machine learning algorithm, the partial dependence of Y on the
subset of variables $X_M$ is defined as:

$Y_M(X_M) = E_{X_N}\left[Y(X_M, X_N)\right] = \int Y(X_M, X_N)\, P_N(X_N)\, dX_N$   (7)

in which $P_N(X_N)$ is the probability density of the marginal distribution of all variables
in $X_N$, determined as follows:

$P_N(X_N) = \int P(X)\, dX_M$   (8)

In practice, the PDP is simply estimated by averaging over the training data set:

$\bar{Y}_M(X_M) = \dfrac{1}{n}\sum_{i=1}^{n} Y(X_M, X_{i,N})$   (9)

where $X_{i,N}$ (i = 1, 2, ..., n) are the values of $X_N$ appearing in the training sample.
To simplify the construction of the PDP in Eq. (9), set $X_M = X_1$ as the predictor
variable of interest, with k unique values. The PDP is then built in the following steps:
Step 1: For each $i \in \{1, 2, ..., k\}$, copy the training database and replace the original
values of $X_1$ with the constant $X_{1i}$. From the modified copy of the training data,
compute the vector of predicted values, and calculate its mean to obtain $\bar{Y}_1(X_{1i})$.
Step 2: From the pairs $(X_{1i}, \bar{Y}_1(X_{1i}))$ with i = 1, 2, ..., k, draw the graph
representing the PDP.
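The two steps above can be sketched as follows (a minimal illustration with a hypothetical toy model, not the authors' code):

```python
import numpy as np

def partial_dependence(model, X_train, feature, grid):
    """PDP via Eq. (9): for each grid value, overwrite one feature in a
    copy of the training data and average the model's predictions."""
    pdp = []
    for value in grid:
        X_mod = X_train.copy()
        X_mod[:, feature] = value       # Step 1: replace X1 with constant X1i
        pdp.append(np.mean(model(X_mod)))
    return np.array(pdp)                # Step 2: plot grid vs. these means

# Toy model that depends linearly on feature 0 (purely hypothetical)
model = lambda X: 3 * X[:, 0] + X[:, 1]
X = np.random.default_rng(1).random((50, 2))
grid = np.linspace(0, 1, 5)
pd_vals = partial_dependence(model, X, 0, grid)
```

For this linear toy model the PDP is itself a straight line of slope 3, which is exactly the kind of linear-versus-complex distinction the PDP is meant to reveal.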
3. Results and discussions
3.1. Analysis of Optimal ANN-GWO Parameters
This part discusses how to optimize the structure of the proposed ANN-GWO model,
including determining the population size in the GWO algorithm and the number of
neurons in the hidden layer. In machine learning algorithms, the structure of the ANN
model plays a crucial role. As previously mentioned, the ANN structure includes three
layers, with the number of hidden layers consisting of one or more layers. In various
research, it has been proved that the ANN structure with one hidden layer is capable of
solving complicated nonlinear problems, including finding a correlation between input and
output variables [60–62]. Therefore, in this study, the structure of ANN-GWO with one
hidden layer is proposed. The next problem is to determine the number of neurons in the
hidden layer as well as the optimal population size of the GWO optimization algorithm. For
this purpose, the GWO optimization algorithm was run with the population size varying
from 30 to 300 in steps of 30, and the number of hidden-layer neurons varying from 3 to 30
in steps of 3. To determine the optimal parameters, a grid search technique was utilized.
The effects of the different values of these two parameters on the performance of the
proposed model were evaluated according to the six statistical criteria mentioned above.
Here, a maximum number of iterations of 1000 is used to define the parameters.
Fig. 2 presents 3D surfaces of the grid search. As observed, when the number of neurons is
low, the model efficiency remains low even as the population size increases. That is
reflected by the low values of R and IA, while RMSE, MAE, and MAPE are high.
Likewise, with a low population size, increasing the number of neurons does not improve
performance. Only when the number of neurons and the population size increase together
does the model's performance improve. The results of the grid search technique show that
the best performance of the ANN-GWO model is obtained when the number of neurons is
21 and the population size is 240. At that point, all the performance evaluation criteria of
the model are satisfied.
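The grid search described above can be sketched as a double loop over the candidate values (the `evaluate` callback is a placeholder standing in for training and scoring an ANN-GWO model; the toy cost used here is purely hypothetical, chosen so its minimum lands at the reported optimum of 240 wolves and 21 neurons):

```python
import numpy as np
from itertools import product

# Candidate values used in this study's grid search (Section 3.1)
populations = range(30, 301, 30)   # GWO population size: 30..300, step 30
neurons = range(3, 31, 3)          # hidden-layer neurons: 3..30, step 3

def grid_search(evaluate):
    """Return the (population, neurons) pair minimizing a cost such as RMSE."""
    best, best_cost = None, np.inf
    for pop, n in product(populations, neurons):
        cost = evaluate(pop, n)    # placeholder: train ANN-GWO and score it
        if cost < best_cost:
            best, best_cost = (pop, n), cost
    return best

# Hypothetical smooth cost with its minimum at (240, 21), for illustration only
toy_cost = lambda pop, n: (pop - 240) ** 2 / 1e4 + (n - 21) ** 2
print(grid_search(toy_cost))  # (240, 21)
```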
Fig. 2. Calibration of the optimal number of neurons and GWO's population size based on
(a) R, (b) IA, (c) Slope, (d) RMSE, (e) MAE, and (f) MAPE. The optimal zone is also
highlighted.
3.2. Analysis of Convergence of Monte Carlo simulations
The preceding part used 1000 Monte Carlo simulations to optimize the number of neurons
in the hidden layer and the population size in the GWO optimization technique. In this
section, convergence estimation of all the quality assessment criteria was performed based
on 1000 Monte-Carlo simulations. The line representing the normalized convergence of the
six statistical criteria is shown in Fig. 3. For the R, IA, and Slope indices, only roughly 200
simulations with the test set and 100 with the training set are required to obtain
convergence results (deviation of less than 0.1 percent). The RMSE index is the slowest to
converge, reaching only 1% of normalized convergence after 200 iterations. MAE and
MAPE differ distinctly from RMSE, as their convergence is similar to that of R, IA, and
Slope. The obtained results indicated that all six criteria achieve statistical convergence
within 1000 Monte Carlo simulations, meaning such runs were enough to assess the
effectiveness of the proposed model.
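The paper does not spell out the exact convergence indicator; one common choice, assumed here, is the running mean of a criterion normalized by its final value, which flattens toward 100% as the estimate stabilizes:

```python
import numpy as np

def convergence_indicator(values):
    """Normalized running mean (in %) of a criterion over successive
    Monte Carlo runs; it approaches 100% as the estimate stabilizes."""
    values = np.asarray(values, float)
    running_mean = np.cumsum(values) / np.arange(1, len(values) + 1)
    return 100 * running_mean / running_mean[-1]

# Hypothetical R values over 1000 runs (mean and spread are assumptions)
rng = np.random.default_rng(2)
r_values = rng.normal(0.95, 0.01, size=1000)
conv = convergence_indicator(r_values)
```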
Fig. 3. Statistical convergence over 1000 random samplings for (a) R, (b) IA, (c) Slope, (d)
RMSE, (e) MAE and (f) MAPE.
3.3. Analysis of Distribution of Performance Criteria
The statistical assessment of the ANN-GWO model's performance is reported in this
section. Fig. 4 shows the probability distribution over 1000 simulations of the criteria,
namely, R (Fig. 4a), IA (Fig. 4b), slope (Fig. 4c), RMSE (Fig. 4d), MAE (Fig. 4e), and
MAPE (Fig. 4f). The probability distribution functions for the training and testing sets are
presented by solid and dashed lines, respectively. In addition, the summary statistical
information including quantiles Q25, Q50, Q75, mean, StD, the max and min values of the
R, IA, slope, RMSE, MAE, and MAPE distributions for the training and testing databases
are highlighted in Table 3.
Fig. 4. Probability distribution over 1000 random samplings for (a) R, (b) IA, (c) Slope, (d)
RMSE, (e) MAE and (f) MAPE.
Table 3
Statistical analysis over 1000 random samplings quality assessment criteria.
             Min     Q25      Q50      Q75      Max      Average   StD     CV
Rtrain       0.932   0.955    0.960    0.964    0.977    0.959     0.006   0.666
Rtest        0.881   0.912    0.919    0.926    0.953    0.918     0.011   1.230
IAtrain      0.964   0.976    0.979    0.981    0.988    0.979     0.003   0.351
IAtest       0.936   0.953    0.957    0.961    0.975    0.957     0.006   0.634
Slopetrain   0.870   0.909    0.917    0.925    0.949    0.917     0.012   1.283
Slopetest    0.861   0.932    0.951    0.969    1.049    0.951     0.028   2.974
RMSEtrain    3.890   4.816    5.087    5.366    6.575    5.099     0.390   7.649
RMSEtest     5.017   6.301    6.581    6.872    8.284    6.594     0.445   6.756
MAEtrain     3.131   3.909    4.152    4.378    5.296    4.150     0.343   8.265
MAEtest      3.986   4.991    5.219    5.468    6.362    5.238     0.354   6.755
MAPEtrain    6.916   9.020    9.518    10.016   12.358   9.519     0.742   7.796
MAPEtest     9.149   11.646   12.228   12.746   14.845   12.214    0.834   6.824
It can be observed from Table 3 that the mean and standard deviation values of R for the
training database were 0.959 and 0.006, and 0.918 and 0.011 for the testing database,
respectively. With IA criterion, the mean and standard deviation values for the training
database were 0.979 and 0.003, while those values were 0.957 and 0.006 for the testing
database. The slope criterion values were 0.917 and 0.012 for the training database, and
0.951 and 0.028 for the testing database. The mean and standard deviation of RMSE for
the training database were 5.099 and 0.390, while those for the testing database were 6.594
and 0.445. For MAE, these values were 4.150 and 0.343 for the training database, and
5.238 and 0.354 for the testing database. Finally, for the MAPE criterion, the mean and
standard deviation were 9.519 and 0.742 for the training database, while for the testing
database these values were 12.214 and 0.834, respectively. The obtained results indicated
that the ANN-GWO model could be employed as a good predictor of the compressive
strength of SCC with high accuracy.
3.4. Analysis of ANN Optimization by GWO
The weight and bias values of ANN's neurons were optimized using the GWO algorithm in
this section based on three statistical criteria, namely, R, RMSE, and MAE, over the
process. Fig. 5 presents a cost function that evaluates the convergence of criteria in the
network training process. It can be seen that increasing the number of iterations decreases
the RMSE and MAE values, while the R value tends to increase. The results over five
hundred iterations likewise proved reliable: the R, RMSE, and MAE measures are
essentially unchanged from iteration 200 onwards. As a result, the maximum number of
iterations for ANN-GWO was set to 500, which ensures that the relative error between two
successive iterations is less than 0.1%.
Fig. 5. Evaluation of (a) R, (b) RMSE. and (c) MAE during training processes.
3.5. Analysis of Typical Results
This section presents typical results of SCC compressive strength prediction, namely the
best predictive result of the ANN-GWO model with structure [9-21-1] over 1000 Monte
Carlo simulations. The comparison of the actual compressive strength values with the
predicted compressive strength value by the ANN-GWO model is shown by the regression
graphs in Fig. 6. The results show the comparison for the training data (Fig. 6a) and testing
data (Fig. 6b). Linear lines were also plotted (red lines) in each graph to show the
performance of the proposed model. The corresponding correlation coefficient values (R)
were 0.951 and 0.940 for the training and the testing data, respectively, indicating the
excellent predictability of the ANN - GWO model. Thus, there was a strong linear
relationship between the predicted compressive strength and the actual compressive
strength. The detailed performance of the proposed ANN - GWO algorithm was
summarized in Table 4, including R, IA, Slope, RMSE, MAE, and MAPE.
Fig. 6. Regression graphs for (a) training and (b) testing parts.
Table 4
Performance indicators of the optimal ANN-GWO model
Criteria        R       IA      Slope   RMSE (MPa)   MAE (MPa)   MAPE (%)
Training data   0.951   0.974   0.904   5.132        4.112       9.293
Testing data    0.940   0.969   0.960   5.515        4.427       10.20
In addition, the good correlation between the predicted compressive strength results and the
actual results was confirmed by the relative error curve shown in Fig. 7. The relative error
values for the 81 samples of the training data are shown in Fig. 7a, while Fig. 7b shows the
34 samples of the testing data. In the training data, most samples have errors in the range
[-7; 8] MPa; only 5 samples had errors outside this range, and one sample had the largest
error of -13 MPa. In the testing data, the errors of the samples lie mainly in the range
[-9; 10] MPa. The errors are mostly concentrated around 0 for both the training and testing
parts, indicating that the proposed ANN-GWO model predicts with low error.
Fig. 7. Relative errors for (a) training data and (b) testing data.
In this part, uncertainty analysis was conducted with the aim of quantifying the change in
the output, specifically in this study, the compressive strength of SCC due to the change of
the input parameters. Estimating the uncertainty of prediction is needed to evaluate the
reliability of the results [63]. Quantification is usually done by estimating statistical
quantities of interest such as quantiles, the mean, and the median. Nine quantile levels of
the target compressive strength were specified, from Q10 up to Q90. The
confidence intervals for estimating the compressive strength of SCC are shown in Fig. 8. It
is worth noting that the amount of data in each quantum level is also calculated and
displayed, along with the mean curve and confidence intervals of 70%, 95%, and 99%,
respectively. It is observed that the confidence interval of the proposed ANN-GWO model
is the smallest, proving that the accuracy of the predicted model is high. This result is in
good agreement with the performance analysis of the model presented above.
Fig. 8. Confidence intervals for estimating the compressive strength of SCC.
3.6. Sensitivity analysis and discussion
The effects of the input parameters on the SCC compressive strength prediction are
discussed in this section. In this paper, a Partial Dependence Plot (PDP) was utilized to
estimate the influence of all input variables (i.e., W/B, C, P, F, D, B, V-funnel, SP, L-box).
The strategy is to vary one input parameter at a time across its range while keeping the
remaining input parameters fixed at their median values. Thus, the approach can