- 1. International Journal of Advanced Research in Engineering and Technology (IJARET), ISSN
0976 – 6480(Print), ISSN 0976 – 6499(Online) Volume 4, Issue 3, April (2013), © IAEME
93
TRAINING THE NEURAL NETWORK USING
LEVENBERG-MARQUARDT’S ALGORITHM TO OPTIMIZE THE
EVACUATION TIME IN AN AUTOMOTIVE VACUUM PUMP
Vijayashree1*, Kolla Bhanu Prakash2 and T.V. Ananthan3
1, 2, 3 Department of Computer Science and Engineering, Dr. MGR Educational and Research Institute University, Maduravoyal, Chennai 600 095, India
ABSTRACT
Neural networks have been used for engine computations in the recent past. One reason for using
neural networks is to capture the accuracy of experimental data while saving computational time, so
that system simulations can be performed within a reasonable time frame. The main aim of this study is
to optimize and arrive at a design base for a vacuum pump in an automotive engine using the
Levenberg-Marquardt (LM) algorithm for Artificial Neural Networks (ANN). Design bases are
created from previous products and by benchmarking. Effortless application of the brake is a
preferred comfort feature in automotive applications. To provide an easy and effective feel, the
braking mechanism needs to be assisted with external energy. The design is optimized with a neural
network trained by the LM algorithm to arrive at the optimum evacuation time.
Index Terms: automotive engine, braking system, evacuation time, Levenberg-Marquardt’s (LM)
Algorithm, neural networks, vacuum pump.
I. INTRODUCTION
Effortless application of the brake is a preferred comfort feature in automotive applications. To
provide an easy and effective feel, the braking mechanism needs to be assisted with external energy.
A vane-type vacuum pump serves exactly this purpose: it produces vacuum by evacuating
the air in the vacuum booster. This vacuum is used to actuate the booster for the power brakes in
diesel-powered and Gasoline Direct Injection (GDI) automobiles. The capacity of the vacuum pump varies
with the weight and brake booster capacity of the vehicle. Therefore, it is necessary to have a
design base with a proven technique, which will serve as a basis for faster product development.
Neural networks and other machine learning algorithms are increasingly being used for engine
applications [1]. These applications can be categorized as either real time control/diagnostic methods
or predictive tools for design purposes. Some applications have even moved downstream of the engine
[2]. The present work uses a neural network trained with the LM algorithm to arrive at the
appropriate evacuation time, which is a critical parameter. The particular task selected is to minimize
the evacuation time in a vane-type vacuum pump. The dataset consists of experimental results
obtained at UCAL Fuel Systems Ltd., Chennai.
II. VACUUM PUMP
A vane-type vacuum pump has a unique profile in which an eccentrically mounted rotor rotates the
vanes, as shown in Fig.1. The movement of the vanes creates a pressure difference, which creates vacuum
in the brake booster. Air enters the pump through the inlet check valve assembly. Oil is circulated inside the
pump to lubricate the rotating parts and to maintain sealing between the high-pressure and low-pressure
regions [3, 4, 5]. The air and oil mixture is then expelled from the pump through the reed valve. The
performance of the pump is specified by the evacuation time of a specified tank volume [3].
Evacuation time, $t = (V_t / Q)\,\ln(p_1 / p_2)$
where $V_t$ is the tank volume, $Q$ is the volumetric flow rate (pumping speed) of the pump, $p_1$ is atmospheric pressure and $p_2$ is the required pressure.
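As a rough illustration of the evacuation-time relation, the sketch below evaluates it under the assumption of the standard constant-pumping-speed form t = (Vt / Q) ln(p1 / p2), with Q taken as the volumetric flow rate of the pump. The function name and the numbers are hypothetical and are not taken from the experiments.

```python
import math


def evacuation_time(tank_volume, flow_rate, p_start, p_target):
    """Ideal pump-down time t = (Vt / Q) * ln(p1 / p2), assuming constant Q."""
    if min(tank_volume, flow_rate, p_start, p_target) <= 0:
        raise ValueError("all quantities must be positive")
    if p_target > p_start:
        raise ValueError("target pressure must not exceed the starting pressure")
    return (tank_volume / flow_rate) * math.log(p_start / p_target)


# Hypothetical values for illustration only: a 4 litre tank, a pumping speed
# of 1 litre/s, evacuated from 1000 mbar down to 500 mbar.
t = evacuation_time(4.0, 1.0, 1000.0, 500.0)
print(f"evacuation time: {t:.3f} s")
```

Any consistent unit system works, since only the ratios Vt / Q and p1 / p2 enter the formula.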
Experimental results obtained from the existing pump were used to train the ANN with the LM algorithm,
creating the design base for any future design. Figure 1 shows the vacuum pump of capacity 110 cc.
Fig.1 Photograph of vacuum pump of capacity 110cc
III. LEVENBERG-MARQUARDT’S ALGORITHM
The LM algorithm is an iterative technique that locates a local minimum of a multivariate
function that is expressed as the sum of squares of several non-linear, real-valued functions. It has
become a standard technique for nonlinear least-squares problems, widely adopted in various
disciplines for dealing with data-fitting applications. LM can be thought of as a combination of steepest
descent and the Gauss-Newton method. When the current solution is far from a local minimum, the
algorithm behaves like a steepest descent method: slow, but guaranteed to converge. When the current
solution is close to a local minimum, it becomes a Gauss-Newton method and exhibits fast
convergence.
Input:
A vector function $f : \mathbb{R}^m \to \mathbb{R}^n$ with $n \ge m$, a measurement vector $\mathbf{x} \in \mathbb{R}^n$ and an initial parameter estimate $\mathbf{p}_0 \in \mathbb{R}^m$.
Output:
A vector $\mathbf{p}^+ \in \mathbb{R}^m$ minimizing $\|\mathbf{x} - f(\mathbf{p})\|^2$.
Algorithm:
k := 0; ν := 2; p := p0;
A := J^T J; εp := x − f(p); g := J^T εp;
stop := (||g||∞ ≤ ε1); µ := τ * max_{i=1,…,m}(A_ii);
while (not stop) and (k < kmax)
    k := k + 1;
    repeat
        Solve (A + µI) δp = g;
        if (||δp|| ≤ ε2 ||p||)
            stop := true;
        else
            pnew := p + δp;
            ρ := (||εp||^2 − ||x − f(pnew)||^2) / (δp^T (µ δp + g));
            if ρ > 0
                p := pnew;
                A := J^T J; εp := x − f(p); g := J^T εp;
                stop := (||g||∞ ≤ ε1);
                µ := µ * max(1/3, 1 − (2ρ − 1)^3); ν := 2;
            else
                µ := µ * ν; ν := 2 * ν;
            endif
        endif
    until (ρ > 0) or (stop)
endwhile
p+ := p;
The above is the Levenberg-Marquardt nonlinear least squares algorithm. ρ is the gain ratio, defined
as the ratio of the actual reduction in the error ||εp||^2 that corresponds to a step δp to the reduction
predicted for δp by the linear model of Eq. (1). See text and [6,7] for details. When LM is applied to the
problem, the linear solve (A + µI) δp = g (enclosed in a rectangular box in the original listing) is carried
out by taking into account the sparse structure of the corresponding Hessian matrix A.
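The listing above can be sketched in plain Python as follows. This is a minimal illustration, not the implementation used in the study: it assumes a user-supplied analytic Jacobian, uses a small dense Gaussian elimination in place of a library or sparse solver, and tightens the stopping thresholds (eps1 = eps2 = 1e-10 is an assumption of this sketch). The exponential test problem and all function names are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve the small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def levenberg_marquardt(f, jac, x, p0, tau=1e-3, eps1=1e-10, eps2=1e-10, kmax=100):
    """Minimise ||x - f(p)||^2 following the control flow of the pseudocode."""
    m = len(p0)

    def linearise(p):
        J = jac(p)                                   # n rows, m columns
        eps = [xi - fi for xi, fi in zip(x, f(p))]   # residual x - f(p)
        A = [[sum(r[i] * r[j] for r in J) for j in range(m)] for i in range(m)]
        g = [sum(r[i] * e for r, e in zip(J, eps)) for i in range(m)]
        return A, eps, g

    p = list(p0)
    A, eps, g = linearise(p)
    stop = max(abs(v) for v in g) <= eps1
    mu = tau * max(A[i][i] for i in range(m))
    nu, k = 2.0, 0
    while not stop and k < kmax:
        k += 1
        while True:                                  # repeat ... until
            N = [[A[i][j] + (mu if i == j else 0.0) for j in range(m)]
                 for i in range(m)]
            dp = solve_linear(N, g)
            rho = 0.0
            norm_dp = math.sqrt(sum(d * d for d in dp))
            if norm_dp <= eps2 * math.sqrt(sum(v * v for v in p)):
                stop = True
            else:
                p_new = [pi + di for pi, di in zip(p, dp)]
                err_new = sum((xi - fi) ** 2 for xi, fi in zip(x, f(p_new)))
                predicted = sum(d * (mu * d + gi) for d, gi in zip(dp, g))
                rho = (sum(e * e for e in eps) - err_new) / predicted
                if rho > 0:                          # step accepted
                    p = p_new
                    A, eps, g = linearise(p)
                    stop = max(abs(v) for v in g) <= eps1
                    mu *= max(1.0 / 3.0, 1.0 - (2.0 * rho - 1.0) ** 3)
                    nu = 2.0
                else:                                # step rejected: raise damping
                    mu *= nu
                    nu *= 2.0
            if rho > 0 or stop:
                break
    return p


# Hypothetical test problem: recover a = 2, b = -0.5 in y = a * exp(b * t).
ts = [float(t) for t in range(8)]
data = [2.0 * math.exp(-0.5 * t) for t in ts]


def model(p):
    return [p[0] * math.exp(p[1] * t) for t in ts]


def jacobian(p):
    return [[math.exp(p[1] * t), p[0] * t * math.exp(p[1] * t)] for t in ts]


p_fit = levenberg_marquardt(model, jacobian, data, [1.0, -0.1])
print(p_fit)
```

For a neural network, p would hold the weights and biases and f would return the network outputs over the training set; the dense solve shown here is only practical for a small number of parameters.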
In the following, vectors and arrays appear in boldface and $^T$ is used to denote transposition. Also,
$\|\cdot\|$ and $\|\cdot\|_\infty$ respectively denote the 2 and infinity norms. Let $f$ be an assumed functional relation which
maps a parameter vector $\mathbf{p} \in \mathbb{R}^m$ to an estimated measurement vector $\hat{\mathbf{x}} = f(\mathbf{p})$, $\hat{\mathbf{x}} \in \mathbb{R}^n$. An initial
parameter estimate $\mathbf{p}_0$ and a measured vector $\mathbf{x}$ are provided and it is desired to find the vector $\mathbf{p}^+$ that
best satisfies the functional relation $f$ locally, i.e. minimizes the squared distance $\boldsymbol{\epsilon}^T \boldsymbol{\epsilon}$ with $\boldsymbol{\epsilon} = \mathbf{x} - \hat{\mathbf{x}}$
for all $\mathbf{p}$ within a sphere having a certain, small radius. The basis of the LM algorithm is a linear
approximation to $f$ in the neighborhood of $\mathbf{p}$. Denoting by $\mathbf{J}$ the Jacobian matrix $\partial f(\mathbf{p}) / \partial \mathbf{p}$, a Taylor series
expansion for a small $\|\delta_{\mathbf{p}}\|$ leads to the following approximation:
$f(\mathbf{p} + \delta_{\mathbf{p}}) \approx f(\mathbf{p}) + \mathbf{J}\,\delta_{\mathbf{p}}$ (1)
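Like all non-linear optimization methods, LM is iterative. Initiated at the starting point $\mathbf{p}_0$, it produces a
series of vectors $\mathbf{p}_1, \mathbf{p}_2, \ldots$ that converge towards a local minimizer $\mathbf{p}^+$ for $f$. Hence, at each iteration, it is
required to find the step $\delta_{\mathbf{p}}$ that minimizes the quantity
$\|\mathbf{x} - f(\mathbf{p} + \delta_{\mathbf{p}})\| \approx \|\mathbf{x} - f(\mathbf{p}) - \mathbf{J}\,\delta_{\mathbf{p}}\| = \|\boldsymbol{\epsilon} - \mathbf{J}\,\delta_{\mathbf{p}}\|$ (2)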
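The sought $\delta_{\mathbf{p}}$ is thus the solution to a linear least-squares problem: the minimum is attained when
$\mathbf{J}\,\delta_{\mathbf{p}} - \boldsymbol{\epsilon}$ is orthogonal to the column space of $\mathbf{J}$. This leads to $\mathbf{J}^T(\mathbf{J}\,\delta_{\mathbf{p}} - \boldsymbol{\epsilon}) = \mathbf{0}$, which yields the
Gauss-Newton step $\delta_{\mathbf{p}}$ as the solution of the so-called normal equations:
$\mathbf{J}^T \mathbf{J}\,\delta_{\mathbf{p}} = \mathbf{J}^T \boldsymbol{\epsilon}$ (3)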
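A small numerical check of the normal equations may help here. The sketch below solves J^T J δ = J^T ε for a hypothetical 3x2 Jacobian and residual (values chosen for illustration only) using Cramer's rule, then verifies that J δ − ε is orthogonal to both columns of J.

```python
def gauss_newton_step(J, eps):
    """Solve the two-parameter normal equations J^T J d = J^T eps by Cramer's rule."""
    a11 = sum(r[0] * r[0] for r in J)
    a12 = sum(r[0] * r[1] for r in J)
    a22 = sum(r[1] * r[1] for r in J)
    g1 = sum(r[0] * e for r, e in zip(J, eps))
    g2 = sum(r[1] * e for r, e in zip(J, eps))
    det = a11 * a22 - a12 * a12
    if det == 0:
        raise ValueError("J^T J is singular: the Jacobian is rank deficient")
    return [(g1 * a22 - a12 * g2) / det, (a11 * g2 - a12 * g1) / det]


# Hypothetical Jacobian (3 measurements, 2 parameters) and residual vector.
J = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
eps = [1.0, 2.0, 2.0]

d = gauss_newton_step(J, eps)

# J d - eps should be orthogonal to every column of J.
leftover = [sum(r[j] * d[j] for j in range(2)) - e for r, e in zip(J, eps)]
ortho = [sum(r[j] * l for r, l in zip(J, leftover)) for j in range(2)]
print(d, ortho)
```

The singular case raised in the function is exactly the situation that the damping term of LM is designed to handle.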
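Ignoring the second derivative terms, matrix $\mathbf{J}^T \mathbf{J}$ in Eq. (3) approximates the Hessian of $\frac{1}{2}\boldsymbol{\epsilon}^T \boldsymbol{\epsilon}$
[18]. Note also that $\mathbf{J}^T \boldsymbol{\epsilon}$ is along the steepest descent direction, since the gradient of $\frac{1}{2}\boldsymbol{\epsilon}^T \boldsymbol{\epsilon}$ is $-\mathbf{J}^T \boldsymbol{\epsilon}$.
The LM method actually solves a slight variation of Eq. (3), known as the augmented normal equations:
$\mathbf{N}\,\delta_{\mathbf{p}} = \mathbf{J}^T \boldsymbol{\epsilon}$, with $\mathbf{N} \equiv \mathbf{J}^T \mathbf{J} + \mu \mathbf{I}$ and $\mu > 0$ (4)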
where $\mathbf{I}$ is the identity matrix. The strategy of altering the diagonal elements of $\mathbf{J}^T \mathbf{J}$ is called
damping and $\mu$ is referred to as the damping term. If the updated parameter vector $\mathbf{p} + \delta_{\mathbf{p}}$ with $\delta_{\mathbf{p}}$
computed from Eq. (4) leads to a reduction in the error $\boldsymbol{\epsilon}^T \boldsymbol{\epsilon}$, the update is accepted and the process
repeats with a decreased damping term. Otherwise, the damping term is increased, the augmented
normal equations are solved again and the process iterates until a value of $\delta_{\mathbf{p}}$ that decreases the error is
found. The process of repeatedly solving Eq. (4) for different values of the damping term until an
acceptable update to the parameter vector is found corresponds to one iteration of the LM algorithm.
In LM, the damping term is adjusted at each iteration to assure a reduction in the error. If the damping
is set to a large value, matrix $\mathbf{N}$ in Eq. (4) is nearly diagonal and the LM update step $\delta_{\mathbf{p}}$ is near the
steepest descent direction $\mathbf{J}^T \boldsymbol{\epsilon}$. Moreover, the magnitude of $\delta_{\mathbf{p}}$ is reduced in this case, ensuring that
excessively large Gauss-Newton steps are not taken.
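The effect of the damping term on the step can be seen numerically. The sketch below solves the augmented system (A + µI) δ = g for a hypothetical, ill-conditioned 2x2 matrix A standing in for J^T J and a hypothetical right-hand side g standing in for J^T ε, and reports the step length and the cosine of the angle between the step and g for increasing µ. All values are illustrative assumptions.

```python
import math


def damped_step(A, g, mu):
    """Solve (A + mu*I) d = g for a 2x2 matrix A by Cramer's rule."""
    a11, a12 = A[0][0] + mu, A[0][1]
    a21, a22 = A[1][0], A[1][1] + mu
    det = a11 * a22 - a12 * a21
    return [(g[0] * a22 - a12 * g[1]) / det, (a11 * g[1] - a21 * g[0]) / det]


def cosine(u, v):
    """Cosine of the angle between two 2-vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


A = [[4.0, 1.9], [1.9, 1.0]]   # hypothetical J^T J, nearly singular
g = [1.0, 1.0]                 # hypothetical J^T eps

for mu in (1e-3, 1e-1, 1e1, 1e3):
    d = damped_step(A, g, mu)
    print(f"mu={mu:g}  |d|={math.hypot(*d):.4g}  cos(d, g)={cosine(d, g):.4f}")
```

With small µ the step is close to the Gauss-Newton step; as µ grows, the step shortens and turns towards g, the steepest descent direction.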
Damping also handles situations where the Jacobian is rank deficient and $\mathbf{J}^T \mathbf{J}$ is therefore
singular [4]. The damping term can be chosen so that matrix $\mathbf{N}$ in Eq. (4) is nonsingular and, therefore,
positive definite, thus ensuring that the $\delta_{\mathbf{p}}$ computed from it is in a descent direction. In this way, LM
can defensively navigate a region of the parameter space in which the model is highly nonlinear. If the
damping is small, the LM step approximates the exact Gauss-Newton step. LM is adaptive because it
controls its own damping: it raises the damping if a step fails to reduce $\boldsymbol{\epsilon}^T \boldsymbol{\epsilon}$; otherwise it reduces the
damping. By doing so, LM is capable of alternating between a slow descent approach when far
from the minimum and fast, quadratic convergence in the neighborhood of the minimum [8].
The LM algorithm terminates when at least one of the following conditions is met:
1. The gradient’s magnitude drops below a threshold ε1.
2. The relative change in the magnitude of δP drops below a threshold ε2.
3. A maximum number of iterations kmax is reached.
The complete LM algorithm is shown in the above pseudocode; more details regarding it can be
found in [6]. The initial damping factor is chosen equal to the product of a parameter $\tau$ with the
maximum element on the main diagonal of $\mathbf{J}^T \mathbf{J}$. Indicative values for the user-defined parameters are
$\tau = 10^{-3}$, $\varepsilon_1 = \varepsilon_2 = 10^{-2}$, $k_{max} = 100$.
IV. METHODOLOGY OF NEURAL NETWORKS IN VACUUM PUMP PERFORMANCE
OPTIMIZATION
The performance of the vacuum pump is determined by the time required to evacuate air from the
reservoir. This time depends on parameters such as temperature, oil pressure and rotation
speed. Vacuum pump development therefore requires a procedure for designing a pump of any capacity
to meet the customer's requirement.
In the first stage, training, the inputs and the desired outputs are given to the NN. The weights are
modified to minimize the error between the NN predictions and the expected outputs. Different types of
learning algorithms have been developed, but the most common and robust one is back-propagation.
The goal of the training is to minimize the error, and consequently to optimize the NN solution. Each
iterative step in which the weights are recalculated is called an epoch. When the minimum is achieved, the
weights are fixed and the training process ends. Once a neural network has been trained to a satisfactory
level, it may be used as a predictive tool for new data. To do this, only the inputs are given to the NN,
and the NN predicted outputs are calculated using the previous error minimizing weights.
V. RESULTS AND DISCUSSION
The dataset used was obtained from UCAL Fuel Systems Ltd, Chennai. There were 4 sets of
training data, each set corresponding to a different combination of pump and tank capacity, speed,
pressure and evacuation time.
There were 21x6 training data points and 4 input features. The target values were the 21x6
evacuation time values, normalized by the minimum possible evacuation time. There were also 10 such sets for testing.
No tuning set needed to be extracted from the training data: because of the large number of
training data points, the training error and the tuning error both decreased asymptotically beyond a few
hundred epochs, and early stopping did not occur. The MATLAB neural network toolbox was used to
build the baseline neural networks. The Levenberg-Marquardt algorithm [9, 10] was used with the
back-propagation algorithm. The chosen configuration had twenty-five hidden layers with an optimal 10 neurons
having sigmoid activation functions, and an output layer of ten neurons with a linear activation function.
The Nguyen-Widrow method was used to initialize the weights. Evacuation time
predictions were made using this configuration (baseline case).
The reasons to incorporate a physical model into a neural network are:
1. To make the network more robust. Even if confronted with a set of conditions very different from
those encountered in the training data, the network should output realistic results.
2. To reduce dependence on training data, i.e. to enable the network to form a reasonable hypothesis,
from small datasets.
3. To improve the prediction accuracy.
Table 1 Experimental data (evacuation time) for tank capacity 100 cc and pump capacity 3 cc.
Temperature    Speed
               400        1000       1500       2300
50             3.47       1.97       1.7        1.61
90             3.53       1.98       1.8        1.7
120            3.92       2.08       1.8        1.75
150            4.77       2.16       1.17       1.72
Table 2 ANN result (evacuation time) for tank capacity 100 cc and pump capacity 3 cc (hidden layers: 25)
Temperature    Speed
               400        1000       1500       2300
50             3.47912    1.7302     1.9189     1.60273
90             3.53071    1.98974    1.32223    1.67414
120            3.90548    2.18308    0.84523    1.73175
150            4.90085    1.78111    2.24074    1.67527
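The entries of Tables 1 and 2 can be compared cell by cell. The sketch below, which is not part of the original study, computes the percentage deviation of each ANN value from the corresponding experimental value and the mean square error over the raw (un-normalized) values exactly as printed in the two tables. Note that the error reported in the text is computed over normalized values, so the numbers need not match; the variable and function names are illustrative.

```python
speeds = [400, 1000, 1500, 2300]

# Evacuation times as printed in Table 1 (experimental), keyed by temperature.
measured = {
    50: [3.47, 1.97, 1.7, 1.61],
    90: [3.53, 1.98, 1.8, 1.7],
    120: [3.92, 2.08, 1.8, 1.75],
    150: [4.77, 2.16, 1.17, 1.72],
}

# Evacuation times as printed in Table 2 (ANN result), keyed by temperature.
predicted = {
    50: [3.47912, 1.7302, 1.9189, 1.60273],
    90: [3.53071, 1.98974, 1.32223, 1.67414],
    120: [3.90548, 2.18308, 0.84523, 1.73175],
    150: [4.90085, 1.78111, 2.24074, 1.67527],
}


def percent_deviation(pred, meas):
    """Absolute deviation of a prediction from a measurement, in percent."""
    return 100.0 * abs(pred - meas) / meas


deviations = {
    (temp, speed): percent_deviation(p, m)
    for temp in measured
    for speed, p, m in zip(speeds, predicted[temp], measured[temp])
}

squared_errors = [
    (p - m) ** 2
    for temp in measured
    for p, m in zip(predicted[temp], measured[temp])
]
mse = sum(squared_errors) / len(squared_errors)
worst = max(deviations, key=deviations.get)

for key in sorted(deviations):
    print(key, f"{deviations[key]:.2f} %")
print("raw mean square error:", mse)
print("largest deviation at (temperature, speed):", worst)
```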
The reported error is the mean square error over normalized evacuation time values. It is always the
test error, unless otherwise mentioned. It was noticed from error plots that most of the error occurred
over the -0.2396 region (Fig.2). The other regions had much smaller errors, and this region was therefore
chosen for comparison with the three new methods.
Fig.2 Error histogram
The mean square error of the model output to the target output is a typical measure of neural
network performance. However, it was found that there are practical difficulties in establishing
acceptance criteria for the mean square error. Therefore a normalised version of the mean square error
was implemented. This normalised mean square error used the nearer specification limit concept that
was modified to encompass the definition of an acceptable percentage error level. Here, the acceptable
error was equated to the typical level of propagated error that one would expect from the
instrumentation measuring the engine performance. This was consistent with the idea that it is
reasonable not to expect a higher standard of inference using the model than one could expect from
direct measurement of the engine performance.
The performance values obtained during training are:
Performance = 0.1601
trainPerformance = 8.4504e-008
valPerformance = 0.4123
testPerformance = 0.2283
During training, the progress is constantly updated in the training window. Of most interest are
the performance, the magnitude of the gradient of performance and the number of validation checks.
The magnitude of the gradient and the number of validation checks are used to terminate the training.
The gradient will become very small as the training reaches a minimum of the performance. If the
magnitude of the gradient is less than 1e-5, the training will stop (Fig.3). This limit can be adjusted by
setting the parameter net.trainParam.min_grad. The number of validation checks represents the number
of successive iterations that the validation performance fails to decrease. If this number reaches 6 (the
default value), the training will stop.
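The two stopping rules described above can be sketched as follows. This is an illustrative Python monitor, not the toolbox implementation: it assumes that a validation check fails whenever the validation performance does not improve on the best value seen so far, and the class name, the max_fail parameter name and the validation trace are all chosen for this sketch.

```python
class StopMonitor:
    """Track the gradient magnitude and successive validation failures."""

    def __init__(self, min_grad=1e-5, max_fail=6):
        self.min_grad = min_grad      # gradient threshold (1e-5 in the text)
        self.max_fail = max_fail      # allowed successive validation failures
        self.best_val = float("inf")
        self.fails = 0

    def update(self, gradient_norm, val_perf):
        """Return the reason for stopping, or None to keep training."""
        if val_perf < self.best_val:
            self.best_val = val_perf
            self.fails = 0            # validation improved: reset the counter
        else:
            self.fails += 1           # validation failed to decrease
        if gradient_norm < self.min_grad:
            return "gradient"
        if self.fails >= self.max_fail:
            return "validation"
        return None


# Hypothetical validation trace: improves twice, then gets steadily worse.
val_trace = [0.50, 0.40, 0.41, 0.42, 0.43, 0.44, 0.45, 0.46]

monitor = StopMonitor()
stop_epoch, reason = None, None
for epoch, val in enumerate(val_trace):
    reason = monitor.update(gradient_norm=1e-2, val_perf=val)
    if reason is not None:
        stop_epoch = epoch
        break

print(stop_epoch, reason)
```

In practice the weights from the epoch with the best validation performance would normally be kept, rather than those from the epoch at which training stopped.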
Fig.3 Gradient plot
The performance plot (Fig.4) shows the value of the performance function versus the iteration
number (epochs). It plots the training, validation and test performances. The best validation performance is
0.17081 at epoch 1.
Fig.4 Performance plot
The training state plot shows the progress of other training variables, such as the gradient magnitude,
the number of validation checks, etc (Fig.5). The error histogram plot shows the distribution of the
network errors. The regression plot shows a regression between network outputs and network targets.
Fig.5 Training regression plots
The three axes represent the training, validation and testing data. The dashed line in each axis
represents the perfect result – outputs = targets. The solid line represents the best fit linear regression
line between outputs and targets. The R value is an indication of the relationship between the outputs
and targets. If R = 1, this indicates that there is an exact linear relationship between outputs and targets.
If R is close to zero, then there is no linear relationship between outputs and targets. For this example,
the training data indicates a good fit. The validation and test results also show R values that are greater than
0.9. The scatter plot is helpful in showing that certain data points have poor fits. Here, the R values for the
training, validation, test and combined data are 0.083294, 0.13655, 0.80023 and 0.080557
respectively.
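The R value discussed above is the linear correlation coefficient between network outputs and targets. A minimal sketch of its computation is given below; the sample outputs and targets are hypothetical and serve only to show that an exact linear relationship gives R = 1 while scattered outputs give a smaller value.

```python
import math


def regression_r(outputs, targets):
    """Pearson correlation coefficient between network outputs and targets."""
    n = len(outputs)
    mean_o = sum(outputs) / n
    mean_t = sum(targets) / n
    cov = sum((o - mean_o) * (t - mean_t) for o, t in zip(outputs, targets))
    spread_o = math.sqrt(sum((o - mean_o) ** 2 for o in outputs))
    spread_t = math.sqrt(sum((t - mean_t) ** 2 for t in targets))
    if spread_o == 0 or spread_t == 0:
        raise ValueError("R is undefined when outputs or targets are constant")
    return cov / (spread_o * spread_t)


# Hypothetical targets and two sets of hypothetical network outputs.
targets = [1.0, 2.0, 3.0, 4.0]
exact_outputs = [3.0, 5.0, 7.0, 9.0]   # exactly linear in the targets
noisy_outputs = [1.1, 1.9, 3.2, 3.8]   # close to the targets, with scatter

r_exact = regression_r(exact_outputs, targets)
r_noisy = regression_r(noisy_outputs, targets)
print(r_exact, r_noisy)
```

R measures only linear association: the exactly linear outputs above give R = 1 even though they differ from the targets, which is why the regression plot also shows the outputs = targets line.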
VI. CONCLUSION
From the results obtained with the Levenberg-Marquardt algorithm, it can be concluded
that the algorithm works quite satisfactorily in optimizing the evacuation time in automotive
engines. The optimization has been validated and found to be accurate to within 5%. The deviation
of the NN-optimized values from the experimental results was also found to be within 5%.
VII. ACKNOWLEDGEMENT
I wish to acknowledge Mr. J. Suresh Kumar, Deputy General Manager of UCAL Fuel Systems
Ltd, Chennai for his help in conducting the experiments and generating the data set to do this project
and validate the same in their prototype.
REFERENCES
[1] Indranil Brahma, Yongsheng He and Christopher J. Rutland, Improvement of Neural Network Accuracy for
Engine Simulations, SAE Paper 2003-01-3227
[2] He, Y. and Rutland, C.J., “Application of Artificial Neural Network for Integration of Advanced Engine
Simulation Methods”, Proceedings of the 2000 Fall Technical Conference of the ASME Internal Combustion
Engine Division, ICE-Vol.35-1, 53-64, Paper No. 2000-ICE-304, 2000
[3] Chambers, A., Fitch, R. K., Halliday, B. S., “Basic Vacuum Technology,” ISBN 0-75-030495-2, 1998.
[4] Nagendiran, S., Sivanantham, R., and Kumar, J.,“Improvement of the Performance of Cam-Operated Vacuum
Pump for Multi Jet Diesel Engine,” SAE Technical Paper 2009-01-1462, 2009, doi:10.4271/2009-01-1462.
[5] Nagendiran S R, Arun Subramanian, J Suresh Kumar and Ramalingam Sivanantham, Designing of Automotive
Vacuum Pumps - Development of Mathematical Model for Critical Parameters and Optimization using
Artificial Neural Networks, SAE Paper No. 2012-01-0779
K. Madsen, H. Nielsen, and O. Tingleff. Methods for Non-Linear Least Squares Problems. Technical University
of Denmark, 2004. Lecture notes, available at http://www.imm.dtu.dk/courses/02611/nllsq.pdf.
[6] Manolis I.A. Lourakis and Antonis A. Argyros, Is Levenberg-Marquardt the Most Efficient Optimization
Algorithm for Implementing Bundle Adjustment? Proceedings of the Tenth IEEE International Conference on
Computer Vision (ICCV’05), IEEE Computer Society
[7] J. Dennis and R. Schnabel. Numerical Methods for Unconstrained Optimization and Nonlinear Equations.
Classics in Applied Mathematics. SIAM Publications, Philadelphia, 1996.
[8] Indranil Brahma, Yongsheng He and Christopher J. Rutland, Improvement of Neural Network Accuracy for
Engine Simulations SAE Paper 2003-01-3227
[9] Hagan, M.T. and Menhaj, M.B., “Training Feedforward Networks with the Marquardt Algorithm”, IEEE
Transactions on Neural Networks, Vol. 5, No. 6, pp.989-993, 1994.
[10] Pallavi.H.Agarwal, Prof.Dr.P.M.George and Prof.Dr.L.M.Manocha, “Comparison Of Neural Network
Models On Material Removal Rate Of C-Sic” International Journal Of Design And Manufacturing Technology
(IJDMT) Volume 3, Issue 1, 2012, pp. 1 – 10, ISSN Print: 0976 – 6995, ISSN Online: 0976 – 7002
[11] Dharmendra Kumar Singh, Dr.Moushmi Kar And Dr.A.S.Zadgaonkar, “Analysis Of Generated Harmonics
Due To Transformer Load On Power System Using Artificial Neural Network” International Journal of
Electrical Engineering & Technology (IJEET) Volume 4, Issue 1, 2013, pp. 81 – 90, ISSN PRINT: 0976-6545,
ISSN ONLINE: 0976-6553.