Dissertation summary – Francesco Azzena



Neural Networks for Predicting Financial Series. A Case Study: the S&P Mib Index.


The subject of this thesis is the study of so-called neural networks for forecasting purposes. These
networks consist of a system of mathematical-statistical models capable of describing the
relationship between one or more output variables and a set of inputs. What makes this type of
modelling distinctive is that no direct relationship is established between inputs and outputs: one or
more hidden layers sit between them, containing computing units called neurons, which create
fictitious intermediate variables.


These models have experienced fluctuating fortunes over the years: the first attempts appeared in
the 1940s as simple linear combinations of inputs. Neural networks were slowly refined, from the
famous Perceptron models to multi-layer models with nonlinear activation functions. The
construction of these models involved many phases of training, whose purpose was to make the
network learn the relationship between the variables under consideration. Since proper
computational tools had not yet been conceived, neural networks slowly fell out of the attention
of statisticians, above all because of their great complexity.
A second breakthrough came at the end of the 1980s, almost unexpectedly: George Cybenko, an
American professor of mathematics, proved through the study of the properties of the sigmoid
function that a well-constructed neural network with such activation functions and one hidden
layer can approximate any nonlinear data-generating process with an arbitrarily small margin of
error. Naturally, this conclusion received great prominence: working on non-linear generating
processes is always problematic, precisely because of the difficulty of establishing a good model
to use.
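In its standard formulation (the usual statement of Cybenko's 1989 result, not the thesis's own
notation), the theorem says that finite sums of the form

    $$G(x) = \sum_{j=1}^{N} \alpha_j \, \sigma\!\left(w_j^{\top} x + b_j\right)$$

are dense in $C([0,1]^n)$: for every continuous $f$ and every $\varepsilon > 0$ there exist $N$,
weights $w_j$ and scalars $\alpha_j, b_j$ such that $|G(x) - f(x)| < \varepsilon$ for all
$x \in [0,1]^n$, where $\sigma$ denotes the sigmoid activation.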


Neural networks can thus approximate, in theory at least, any type of non-linearity in a series,
through their learning system and the non-linearity of the activation functions between the layers.
For this reason, neural networks have made a comeback in many scientific fields, from economics
to medicine, biology, meteorology and many others.


We therefore thought it would be interesting to investigate in our work the theoretical basis
that underpins this tool, so much in vogue nowadays.





With this objective in mind, we decided to produce a case study that would improve our knowledge
of the issues related to the construction of a neural network. This required specific computing
tools: notably, we chose to write our own program in the Matlab language rather than rely on
ready-made statistical packages.


As this thesis was written within the framework of a degree in "Banking and Finance", we
decided to focus on the most important index of the Milan stock exchange, the Standard & Poor's
Mib (S&P Mib), and to attempt to make good predictions. The choice may seem bold: according to
the weak form of the efficient market hypothesis, we should not be able to build a proper model
for such a variable, since it should follow a random walk and therefore show completely random
variations, just like white noise. We nonetheless decided to make the attempt, hoping not only to
learn about neural networks, but also to make interesting predictions or, in the worst-case
scenario, to demonstrate the correctness of the theory under consideration.
More specifically, the problem we tried to solve was to predict the variation at the close of the
Italian stock exchange using the information available before its opening. The variables taken into
account were the latest available closing values of the Tokyo and New York stock exchanges,
together with the Euro/Yen and Euro/Dollar exchange rates published by the European Central
Bank. If such a model worked, a hypothetical operator on the Milan stock exchange could
anticipate the market and speculate by taking "short" or "long" positions on securities highly
correlated with the market.


The structure of the thesis is rather simple: after a brief introduction in the first chapter, we
describe, in the following chapter, the history of the ideas related to neural networks, with a short
digression on the biological phenomenon of the same name.
We then analyze the historical process by which, from the first simple neural networks, more and
more complex networks came to be used. The structure of neural networks has indeed evolved
over time: initially, there was the model of the physiologists McCulloch and Pitts (MCP), which
was simply a weighted sum compared with a threshold value, allowing a binary response
comparable to the activity of the brain, which works through electrical impulses. As this
methodology is suitable only for linearly separable problems, Rosenblatt proposed the Perceptron
model in 1958: for the first time, the idea of a hidden layer between inputs and outputs appeared.
The combination of the inputs remained a simple weighted sum, but by increasing the number of
neurons, and thus of summations, it became possible to delimit a precise region of the Cartesian
plane and so solve problems that are not linearly separable.
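As a minimal illustration (our own sketch, not code from the thesis), an MCP unit reduces to a
thresholded weighted sum; with the illustrative values below it computes a logical AND of two
binary inputs:

    % Minimal MCP (McCulloch-Pitts) unit: a weighted sum compared with a threshold.
    % Weights, threshold and inputs are illustrative values, not from the thesis.
    w     = [0.5 0.5];               % synaptic weights
    theta = 0.7;                     % activation threshold
    x     = [1 1];                   % binary input vector
    y     = double(w * x' >= theta)  % fires (1) iff the weighted sum reaches theta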




The Perceptron idea opened the way for increasingly complex solutions. Today, the number of
hidden layers and of neurons in each of them depends only on the structure chosen by the user.
Moreover, in every neuron we can now apply a discretionally chosen function to the sum of the
inputs from the previous layer. This high degree of customization has allowed neural networks to
adapt easily to many types of studies, thus finding applications in many branches of science.


In the next part of the thesis, we explain the characteristics that distinguish the main types of
neural networks: the presence or absence of a teacher (which allows the model to learn)
distinguishes supervised networks (the most common type, the one on which we focussed our
attention) from unsupervised ones, which are less efficient but have the advantage of operating in
real time without any human intervention.


After an overview of these models, we analyze in detail the various components of a common
supervised neural network, with particular attention to the activation functions and the weights.
When we create a model of this type, the first decisions we have to take indeed pertain to the
structure: the high level of customization allows us to create the network we believe is best suited
to the series under study. Once the number of layers and of hidden neurons in each of them has
been determined, another important choice concerns the activation functions. As already
mentioned, these are the functions applied to the weighted sum of inputs in each neuron of the
network. Various functions of this kind exist, of which we give a general picture. The most
important to date is the sigmoid, since a fundamental study concerning it has already been
produced.
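For concreteness, the most common choices can be written down directly (our illustrative
definitions, with the sigmoid being the one the thesis relies on):

    % Common activation functions applied to a neuron's weighted input sum.
    sigmoid = @(z) 1 ./ (1 + exp(-z));   % logistic sigmoid, output in (0, 1)
    tanh_f  = @(z) tanh(z);              % hyperbolic tangent, output in (-1, 1)
    linear  = @(z) z;                    % identity, common in output layers
    z = -5:0.1:5;
    plot(z, sigmoid(z), z, tanh_f(z));   % both are smooth, bounded and monotone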


Neural networks had almost fallen into disuse when the mathematician George Cybenko
demonstrated, in an article focussing on the properties of the sigmoid, that a neural network with
this type of activation function, if well structured and with the right choice of variables, is a
universal approximator. Such networks are therefore able to approximate a non-linear
data-generating function with a margin of error smaller than an arbitrarily chosen epsilon.
At the end of the second chapter of the thesis, we explain the methods for training the weights of
the network: when the network is created, the weights of the sum in each neuron are chosen
randomly. Subsequently, by comparing the estimated values with the teacher's values, the weights
are trained towards their most correct estimates.





Starting from the definition of the error, we then examine how its gradient can be used to move
towards the minimum of the error curve by changing the weights of the model. Subsequently, we
discuss the backpropagation algorithm, a method that transmits the error to the hidden layers of
the model even though it is calculated solely on the final outputs, the only observable ones.
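A minimal sketch of one such update (our illustration, with a single training pair and a single
gradient step, not the thesis code):

    % One backpropagation step for a network with one sigmoidal hidden layer
    % and a linear output neuron; all values are illustrative.
    sigmoid = @(z) 1 ./ (1 + exp(-z));
    x = [0.2; -0.4];  t = 0.3;             % example input and teacher value
    W1 = randn(3, 2); b1 = randn(3, 1);    % hidden layer: 3 neurons, 2 inputs
    w2 = randn(1, 3); b2 = randn;          % linear output neuron
    eta = 0.05;                            % learning rate

    h = sigmoid(W1 * x + b1);              % forward pass: hidden activations
    y = w2 * h + b2;                       % network output
    e = y - t;                             % error, observable only at the output

    delta2 = e;                            % output gradient (squared-error loss)
    delta1 = (w2' * delta2) .* h .* (1 - h);  % error propagated back to the hidden layer
    w2 = w2 - eta * delta2 * h';   b2 = b2 - eta * delta2;   % descend the error gradient
    W1 = W1 - eta * delta1 * x';   b1 = b1 - eta * delta1;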


We also pay attention to the methods most commonly used to reduce the time needed to compute
the weights and to improve performance. More particularly, we focus on methods based on the
learning rate, both constant and variable. The latter, in the form proposed by Silva and Almeida,
is the one subsequently used in the practical part.
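The idea behind such variable-rate schemes can be sketched as follows (our hedged reading of the
Silva-Almeida rule, with illustrative factors and a placeholder gradient): each weight keeps its own
learning rate, which grows while its gradient keeps the same sign and shrinks when the sign flips.

    % Per-weight adaptive learning rates in the spirit of Silva and Almeida.
    % The gradient g below is a placeholder (of ||W||^2); u and d are typical choices.
    u = 1.2;  d = 0.5;
    W      = randn(3, 2);                  % weights being trained
    eta    = 0.05 * ones(size(W));         % one learning rate per weight
    g_prev = zeros(size(W));               % gradient from the previous iteration
    for it = 1:100
        g = 2 * W;                         % placeholder gradient
        same_sign = (g .* g_prev) > 0;
        eta(same_sign)  = eta(same_sign)  * u;   % accelerate on a consistent slope
        eta(~same_sign) = eta(~same_sign) * d;   % brake after a sign change
        W = W - eta .* g;                  % element-wise update
        g_prev = g;
    end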


At the end of this chapter, we also discuss the most popular way to avoid overfitting, i.e. the
model learning the sample so closely that it becomes almost useless out of sample. This is one of
the greatest risks in using a model characterized by the ability to learn. The method we propose
and use is the division of the sample into three groups: the training set, used to train the model;
the cross-validation set, used to avoid overfitting by comparing the error curves on the training
and cross-validation sets; and the test set, the part of the sample used to test the forecasting
ability once training has ended and the weights are fixed.
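A sketch of the split (the proportions are illustrative, not the thesis's):

    % Three-way sample split against overfitting: train on the first block,
    % stop training when the cross-validation error starts rising, and judge
    % forecasts only on the untouched test set.
    n = 1000;                                        % sample size (illustrative)
    idx_train = 1 : round(0.6 * n);                  % training set
    idx_cv    = round(0.6 * n) + 1 : round(0.8 * n); % cross-validation set
    idx_test  = round(0.8 * n) + 1 : n;              % test set
    % After each training epoch one would compare the error on idx_cv with the
    % best value seen so far and stop as soon as it begins to rise.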


In the third chapter of the thesis, we turn to practice. First, we analyze the data chosen to test
neural networks as forecasters of financial series: our choice fell on the daily percentage changes
of the S&P Mib. We tried to explain its behaviour using the data available before the start of
trading: more particularly, the morning closing value of the Nikkei 225 index of the Tokyo stock
exchange (available thanks to the time-zone difference) and that of the previous day, the Dow
Jones Industrial Average of New York for the two previous days, and the Euro/Yen and
Euro/Dollar exchange rates for the two previous days.
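In code, assembling such an input matrix amounts to aligning lagged series (a sketch with
hypothetical variable names and random placeholder data):

    % Build the input matrix X and target y from lagged daily series.
    T = 500;                                % illustrative sample length
    nikkei = randn(T, 1);  djia   = randn(T, 1);
    eurjpy = randn(T, 1);  eurusd = randn(T, 1);
    spmib  = randn(T, 1);                   % placeholder series
    X = [nikkei(3:T)   nikkei(2:T-1) ...    % Nikkei: same morning and previous day
         djia(2:T-1)   djia(1:T-2)   ...    % Dow Jones: two previous days
         eurjpy(2:T-1) eurjpy(1:T-2) ...    % EUR/JPY: two previous days
         eurusd(2:T-1) eurusd(1:T-2)];      % EUR/USD: two previous days
    y = spmib(3:T);                         % target: daily % change of the S&P Mib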


The thesis also includes a section devoted to the weak form of the efficient market hypothesis,
which would seem to imply the unpredictability of the series under consideration, since its
variations would behave as a random variable.


We then move on to the implementation of our ideas in the Matlab language. We tried to create a
small guide to implementing a neural network with this program, focusing on the commands that
enact the options described in the theoretical part.
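The flavour of those commands can be suggested with the Neural Network Toolbox interface of
the time (newff/train/sim); this is our own minimal sketch with placeholder data, not the actual
script from the thesis:

    % One-hidden-layer network: 10 logistic-sigmoid neurons, linear output.
    P = rand(8, 500);  T = rand(1, 500);                   % placeholder inputs and targets
    net = newff(minmax(P), [10 1], {'logsig', 'purelin'}); % define the structure
    net.trainParam.epochs = 300;                           % training options
    net = train(net, P, T);                                % backpropagation training
    Y = sim(net, P);                                       % forecasts of the trained network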




We estimated fifty models for each type of one-hidden-layer network, the types being
distinguished by the number of neurons (from one to fifteen, for a total of 750 estimated models).
We then selected the best model of each type and compared their predictions on the test set: the
objective was to find a network performing better than white noise, the benchmark implied by the
theory of weak market efficiency.
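In practice the comparison reduces to a pair of mean squared errors (a sketch with hypothetical
forecast vectors):

    % Compare a model's MSE with the white-noise benchmark that always
    % predicts a zero change; the vectors below are random placeholders.
    y_test = randn(100, 1) * 0.01;          % realized % changes on the test set
    y_hat  = randn(100, 1) * 0.01;          % a network's forecasts
    mse_model = mean((y_test - y_hat) .^ 2);
    mse_white = mean(y_test .^ 2);          % zero forecast: the error is the series itself
    beats_benchmark = mse_model < mse_white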


In the last chapter of the thesis, we review the work done, drawing conclusions about the
practical results in light of the theory discussed previously.


When we chose the S&P Mib for the practical application, we expected to meet many difficulties,
and so it happened: the estimated models unfortunately did not give satisfactory results. Using the
mean squared error (MSE) to assess predictive performance, the best result was obtained by
treating the series as white noise, i.e. by setting the predicted values constantly equal to zero: the
theory of weak market efficiency therefore seems fully confirmed. In some cases, however, the
results were interesting, especially as concerns the ability to approximate well the combination of
sign and magnitude of the day's change in the stock market index.
Therefore, although they cannot be considered a tool for making certain predictions, such models
could be useful in the hands of an experienced operator: managed in conjunction with other
information circulating in the markets and with the awareness and intuition that come from
experience, they could find a profitable use.
To operate on an aggregate as wide and varied as a stock exchange index, it is first necessary to
engage properly with the economic theories underlying it. We tried to identify the most
appropriate inputs but, obviously, a financial analyst with a deeper knowledge of the mechanisms
of the market could make a better choice: Cybenko's theorem can deliver only with a good choice
not just of the structure, but also of the inputs to the network.




