# Machine Learning, Stock Market and Chaos

Dr. Roitman discusses the use of Artificial Intelligence to solve complex and insoluble problems. An artificial intelligence approach is at the root of the I Know First predictive algorithm.

### Machine Learning, Stock Market and Chaos

1. 1. Machine Learning of Chaotic Systems: Solving Complex and Insoluble Problems via Artificial Intelligence. By Lipa Roitman, PhD. November 1st, 2015
2. 2. Contents • Chaos VS Randomness • Chaotic Processes • Modeling Chaos: Statistics Approach • Modeling Chaos: Artificial Intelligence and Machine Learning Approach • Steps in Machine Learning • Financial Markets as Chaotic Processes
3. 3. Chaos and Randomness • Random noise: no known cause, no regularity, no rationality, no repeatability, no pattern. Impossible to predict.
4. 4. Chaos VS Randomness • Randomness Examples Previous coin flips do not predict the next one. Brownian motion - random walk Gaussian and non-Gaussian Random (white) noise with frequency-independent power spectrum Other modes of random processes.
5. 5. Chaos VS Randomness • Stationary process: its statistical properties (mean value, variance, moments, and probability distribution) do not change over time. • Stationary ergodic process: the process has constant statistical properties over time, AND its global statistical properties can be reliably derived from a long enough sample of the process.
6. 6. Chaos VS Randomness • Real-life chaotic processes are neither stationary nor ergodic! Their statistics have to be monitored constantly, since they drift with time. • A nonparametric analysis is needed when the probability distribution of the system is not normal.
7. 7. Chaos in Natural Processes • Astronomy: Three-Body Problem • Sunspots • Geology: Earthquakes • Oceanology: El Niño (Pacific Ocean temperature), Tides • Meteorology: Weather
8. 8. Chaos in Natural Processes • Fluid flow: laminar vs turbulent • Candle flame • Quantum chaos • Biology: Population growth • Physiology: Arrhythmia, Epilepsy, Diabetes • DNA code • Epidemiology: diseases
9. 9. Chaos in Natural Processes • Social: fashion trends • Wars • Music and speech • Stock markets, etc.
10. 10. Chaotic Processes • Three competing paradigms: Stability, Instability, Sudden and Dramatic Change
11. 11. Chaotic Systems Properties What is the pattern? • Stability: Persistent trends. • Memory: What happens next depends on prior history. • Predictable: One can predict while the pattern continues.
12. 12. Chaotic Systems Properties • Instability - “tired trend” - accumulation of small random imbalances, or of slow systematic imbalances that precede large change. • “Sand pile avalanche model” • Predictability is lower
13. 13. Chaotic Systems Properties • Change: the paradigm changes suddenly, seemingly without warning, often with a reversal of trend. • Fat tail: the change can be much stronger than what is expected under a normal (Gaussian) distribution. • Black Swan events.
14. 14. Chaotic Systems Properties • Cycles of varying lengths. • Periods of quiet followed by big jumps • Chaotic patterns are predictable, but only in terms of probabilities.
15. 15. Modeling Chaos • Measuring Chaos Statistically
16. 16. Modeling Chaos • Mathematical modeling of chaotic systems is difficult: tiny changes in parameters can sometimes lead to extreme changes in the outcome. There is no certainty, only probability.
17. 17. Modeling Chaos • The ubiquity of gradual trends and the rarity of extreme events resemble the spectral density of a stochastic process having the form $S(f) = 1/f^{\alpha}$. • In this “1/f noise model” the magnitude of the signal (event) is inversely proportional to its frequency.
18. 18. Modeling Chaos • Although 1/f noise is widely present in natural and social time series, the source of such noise is not well understood. • 1/f noise is intermediate between white noise, which has no correlation in time, and random walk (Brownian motion) noise, which has no correlation between increments. • In most real chaotic processes, random (white) frequency-independent noise overlaps the 1/f noise.
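To make the two reference points of the 1/f model concrete, the sketch below estimates the exponent α in $S(f) \propto 1/f^{\alpha}$ by fitting a straight line to a log-log periodogram. It is an illustration under stated assumptions, not the presenter's method: the helper names (`white_noise`, `random_walk`, `spectral_exponent`) are hypothetical, the transform is a naive O(n²) DFT, and the series are synthetic. White noise should give α near 0 and a random walk α near 2, with 1/f noise lying in between.

```python
import cmath
import math
import random


def white_noise(n, seed=0):
    """Gaussian white noise: uncorrelated samples."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def random_walk(n, seed=0):
    """Random walk: the running sum of white noise."""
    total, walk = 0.0, []
    for step in white_noise(n, seed):
        total += step
        walk.append(total)
    return walk


def spectral_exponent(series):
    """Estimate alpha in S(f) ~ 1/f^alpha from a log-log periodogram fit."""
    n = len(series)
    log_f, log_s = [], []
    for k in range(1, n // 2):  # skip the zero frequency
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(series))
        power = abs(coeff) ** 2 / n
        if power > 0.0:
            log_f.append(math.log(k / n))
            log_s.append(math.log(power))
    mean_f = sum(log_f) / len(log_f)
    mean_s = sum(log_s) / len(log_s)
    slope = (sum((a - mean_f) * (b - mean_s) for a, b in zip(log_f, log_s))
             / sum((a - mean_f) ** 2 for a in log_f))
    return -slope  # S(f) ~ f^(-alpha), so alpha is minus the fitted slope
```

A single periodogram is noisy, so the fitted α is only a rough estimate; averaging over several segments would tighten it.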
19. 19. Modeling Chaos • In a random autoregressive process, the autocorrelation function decays exponentially. • In a chaotic process, it leaves a small persistent residue: “long memory”.
20. 20. Scale Invariance • If one looks at a chaotic process at different degrees of magnification, one finds that the views are similar. • This self-similarity brings us to the subject of fractals. • Self-similarity = power laws, scale invariance, fractals (Mandelbrot), Hurst exponent.
21. 21. Modeling Chaos • Chaos-Fractals Connection
22. 22. Modeling Chaos • Rescaled Range • Given a power-law relation $f(x) = a x^{-k}$, scaling the argument x by a constant factor c causes only a proportionate scaling of the function itself.
23. 23. Modeling Chaos • In other words: $f(cx) = a (cx)^{-k} = c^{-k} f(x) \propto f(x)$. Scaling by a constant c simply multiplies the original power-law relation by the constant $c^{-k}$. Thus “Self-Similarity”.
24. 24. Modeling Chaos • “Power Law Signature”: the logarithms of f(x) and x have a linear relationship, which gives a straight line on the log-log plot. • Rescaled range: the slope of this line gives the Hurst exponent, H.
25. 25. Hurst Exponent H • The Hurst exponent can distinguish fractal from random time series, or find the long-memory cycles.
26. 26. Hurst Exponent H • H = 1/2: random walk, Brownian motion, normal distribution • H < 1/2: mean reverting, negative feedback, high noise, high fractal dimension
27. 27. Hurst Exponent H • 1 > H > 1/2: chaotic trending process, positive feedback, less noise, smaller fractal dimension; fractional Brownian motion, or 1/f noise
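The rescaled-range procedure behind the Hurst exponent can be sketched as follows. This is a minimal illustration, assuming the input is a series of increments (for example, returns) and using hypothetical function names and window sizes; it is not presented as the author's implementation. For each window size the R/S statistic is averaged over non-overlapping windows, and H is the slope of log(R/S) against log(window size).

```python
import math
import random


def rescaled_range(chunk):
    """R/S statistic of one window of increments (None if the window is flat)."""
    mean = sum(chunk) / len(chunk)
    cumulative, running = [], 0.0
    for value in chunk:
        running += value - mean          # cumulative deviation from the mean
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)                          # R
    std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / len(chunk))   # S
    return spread / std if std > 0 else None


def hurst_exponent(increments, window_sizes=(8, 16, 32, 64, 128, 256)):
    """Slope of log(mean R/S) against log(window size)."""
    log_n, log_rs = [], []
    for size in window_sizes:
        values = []
        for start in range(0, len(increments) - size + 1, size):
            rs = rescaled_range(increments[start:start + size])
            if rs:
                values.append(rs)
        if values:
            log_n.append(math.log(size))
            log_rs.append(math.log(sum(values) / len(values)))
    mean_n = sum(log_n) / len(log_n)
    mean_rs = sum(log_rs) / len(log_rs)
    return (sum((a - mean_n) * (b - mean_rs) for a, b in zip(log_n, log_rs))
            / sum((a - mean_n) ** 2 for a in log_n))
```

The classical R/S estimate is biased upward on short windows, so uncorrelated increments typically come out somewhat above 1/2 rather than exactly at it.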
28. 28. Maximal Lyapunov Exponent • The maximal Lyapunov exponent (MLE) is a measure of sensitivity to initial conditions, i.e. unpredictability. • Positive MLE: chaos. • The inverse of the Lyapunov exponent, 1/MLE, measures predictability. • Large MLE: shorter half-life of the signal, faster loss of predictive “power”.
29. 29. Maximal Lyapunov Exponent • The maximal Lyapunov exponent (MLE) is a measure of sensitivity to initial conditions, a property of chaos. • The Hurst exponent H is a measure of persistency.
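As an illustration of what a positive or negative MLE means, the sketch below computes the Lyapunov exponent of the logistic map $x \to r x (1 - x)$. The logistic map is a standard textbook system chosen here as an assumption; it does not appear in the slides, and the function names are hypothetical. The exponent is the long-run average of the logarithm of the local stretching factor $|r(1 - 2x)|$.

```python
import math


def logistic_lyapunov(r, x0=0.2, burn_in=1000, n_iter=10000):
    """Average log-stretching rate of the logistic map x -> r * x * (1 - x)."""
    x = x0
    for _ in range(burn_in):               # discard the transient
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n_iter):
        derivative = abs(r * (1.0 - 2.0 * x))
        total += math.log(max(derivative, 1e-300))  # guard against log(0)
        x = r * x * (1.0 - x)
    return total / n_iter


def predictability_horizon(mle):
    """The 1/MLE rule of thumb; infinite when the exponent is not positive."""
    return 1.0 / mle if mle > 0 else math.inf
```

For r = 4 the map is chaotic and the exponent is known to equal ln 2, so the horizon is roughly 1.4 iterations; for r = 3.2 the map settles into a stable cycle and the exponent is negative.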
30. 30. Modeling Chaos with Fractals • Fractal time series are good approximations of chaotic processes. • They are complex systems that have similar properties.
31. 31. Modeling Chaos with Fractals • Fat-tailed probability distribution • Memory effect: slowly decaying autocorrelation function • Power spectrum of 1/f type • Modeled with the fractal dimension and the Hurst parameter • Global or local self-similarity
32. 32. Fractal Dimension and Hurst Exponent • The fractal dimension D characterizes the local irregularity, and the Hurst exponent H characterizes the global persistence. • Thus D and H are the fractal analogues of variance and mean, which are not constant in chaotic time series.
33. 33. Fractal Dimension and Hurst Exponent • For self-affine processes, the local properties are reflected in the global ones. • For a self-affine surface in n-dimensional space, $D + H = n + 1$, where D is the fractal dimension and H is the Hurst exponent.
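As a worked instance of the relation D + H = n + 1, assume the object is the graph of a one-dimensional time series, so that n = 1 (the numerical values of H below are illustrative only):

```latex
D = n + 1 - H = 2 - H,
\qquad H = 0.5 \;\Rightarrow\; D = 1.5,
\qquad H = 0.8 \;\Rightarrow\; D = 1.2 .
```

A more persistent series (larger H) therefore has a smoother graph (smaller D), which matches the pairing of high H with smaller fractal dimension on the Hurst exponent slides.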
34. 34. Chaos and Fractals Connection • Fractals have self-similar patterns at different scales. • Fractal dimension. • Multifractal system: a continuous spectrum of exponents (the singularity spectrum).
35. 35. Two Reasons For 1/f Noise • Random shocks to the process, such as news events. The shocks can have both temporary and lasting effects. • A combination of interdependent autoregressive processes, each with its own statistical properties.
36. 36. Modeling Chaos: Artificial Intelligence and Machine Learning Approach
37. 37. Artificial Intelligence • Machine Learning Purpose: Generalization • Find the laws within the data • Predicting change • Number crunching allows finding hidden laws, not obvious to the human eye
38. 38. Artificial Intelligence Types • Rules-Based AI: man creates the rules (Expert Systems). • The rule-based approach is time consuming and not very accurate.
39. 39. Artificial Intelligence Types • Supervised learning: learning from examples. • The examples must be representative of the entire data set.
40. 40. Artificial Intelligence Types • Unsupervised learning • Classification: clustering
41. 41. Artificial Intelligence Types • Deep learning • Deep learning models high-level abstractions in data by using multiple processing layers with complex structures.
42. 42. Artificial Intelligence Types • Deep learning can automatically select the features. • For simple machine learning, a human has to tell the algorithm which combination of features to consider. • Deep learning finds the relationships on its own, with no human involvement.
43. 43. Artificial Intelligence Types • “Ultra Deep Learning”: the machine has learned so much that it can not only derive the rules, but also detect when the rules change, i.e. detect a change in paradigms. • Combines supervised, unsupervised, and rule-based machine learning into a more intelligent system.
44. 44. Steps in Machine Learning • Provide the framework: mathematical and programming tools • Data preparation • Parameter estimation • Give examples to learn from: the input (and in some methods the output)
45. 45. Steps in Machine Learning • Creating a Model (or Models). • Fitness Function: What to optimize? • Example: Make more good predictions than bad ones.
46. 46. Data Preparation • Convert the generally non-stationary data into more-or-less stationary data. • Remove the cycles and trends to reduce the uniqueness of each data point.
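Two common ways to carry out this kind of preparation are log returns and differencing. The sketch below is an assumption about what such a step might look like, since the slides do not name a specific transform; the function names are hypothetical.

```python
import math


def log_returns(prices):
    """Turn a price series into log returns, removing the price-level trend."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def difference(series, lag=1):
    """Lagged differences: lag=1 flattens a linear trend,
    lag=period removes a strictly periodic cycle."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]
```

For example, `difference([1, 3, 5, 7])` gives `[2, 2, 2]`: the linear trend becomes a constant, so individual data points are no longer unique to their position in time.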
47. 47. Parameter Estimation • Parametric OR Nonparametric? • Parametric model: a fixed number of parameters. • Nonparametric: no assumptions about the probability distributions of the variables. • In a nonparametric model, the number of parameters increases with the amount of training data.
48. 48. Creating a Model “All Models are Wrong, Some Models are Useful” – George E. P. Box
49. 49. Multivariate time series Multivariate time series modeling is required when the outcome of one process depends on other processes. Examples are systems of interdependent global and local processes, asset prices, exchange rates, interest rates, and other variables.
50. 50. Multivariate time series • To create a model, one can use the available knowledge about the interrelationship of the processes and combine it with unknowns in one or more linear or non-linear models. • A “fitness” or “error” function is then created, which compares the model with the data.
51. 51. Machine Learning • The fitness function is improved through machine learning by varying the parameters in the model. • The goal is to maximize the fitness of the model to the data presented for learning (minimize the error). • Different models are screened. • Part of the data is held out from the learning cycle to be used for testing. • A successful model should be able to perform adequately on the test data.
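The learning cycle described above (vary the parameters, minimize the error on the training data, then check the held-out data) can be sketched with a deliberately simple model. The example assumes a one-parameter autoregressive model x[t] ≈ φ·x[t-1], a brute-force parameter screen, and synthetic data; all names are hypothetical and none of this is taken from the I Know First algorithm.

```python
import random


def simulate_ar1(phi, n, seed=0):
    """Synthetic data: x[t] = phi * x[t-1] + Gaussian noise."""
    rng = random.Random(seed)
    x, series = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        series.append(x)
    return series


def mean_squared_error(phi, series):
    """Error function: one-step-ahead squared error of x[t] ~ phi * x[t-1]."""
    errors = [(series[t] - phi * series[t - 1]) ** 2
              for t in range(1, len(series))]
    return sum(errors) / len(errors)


def fit_and_test(series, train_fraction=0.7):
    """Fit phi on the first part of the data, report the error on the rest."""
    split = int(len(series) * train_fraction)
    train, test = series[:split], series[split:]
    candidates = [i / 100.0 for i in range(-99, 100)]  # parameter values to screen
    best_phi = min(candidates, key=lambda phi: mean_squared_error(phi, train))
    return best_phi, mean_squared_error(best_phi, test)
```

Comparing the test error with a naive baseline (here, always predicting zero, i.e. `mean_squared_error(0.0, test)`) is one way to judge whether the model performs adequately on unseen data.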
52. 52. Dimensionality Reduction • Speeds up algorithm execution • Improves performance • The fewer the variables, the better the generality
53. 53. Dimensionality Reduction Methods • Principal Component Analysis is one of the methods of dimensionality reduction. • It orthogonally transforms the original data set into a new set of “principal components”.
54. 54. Dimensionality Reduction Methods • Low Variance Filter • High Correlation Filter • Pruning the network • Adding and replacing inputs • Other methods
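Two of the listed methods, the low variance filter and the high correlation filter, are simple enough to sketch directly. The thresholds and function names below are illustrative assumptions, not values from the slides: a feature is dropped if it barely varies, or if it is almost a copy of a feature that has already been kept.

```python
import math


def variance(column):
    mean = sum(column) / len(column)
    return sum((v - mean) ** 2 for v in column) / len(column)


def correlation(a, b):
    """Pearson correlation; returns 0.0 if either column is constant."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                     * sum((y - mean_b) ** 2 for y in b))
    return cov / norm if norm > 0 else 0.0


def select_features(columns, min_variance=1e-3, max_correlation=0.95):
    """columns: dict of feature name -> list of values. Returns the kept names."""
    kept = []
    for name, values in columns.items():
        if variance(values) < min_variance:
            continue                      # low variance filter
        if any(abs(correlation(values, columns[k])) > max_correlation
               for k in kept):
            continue                      # high correlation filter
        kept.append(name)
    return kept
```

The result depends on the order of the columns, because the first of two highly correlated features is the one that survives.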
55. 55. Clustering • The many examples in the data can be compressed into clusters according to the similarity through fitting to one or more criteria. • Each data member that belongs to a cluster is associated with a number from 0 to 1 that shows the degree of belonging. • Each data member can also belong to multiple clusters with each specific degree of belonging. • Clustering can be a goal in itself, or a part of a general model, that includes the behavior of clusters as a whole.
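A degree of belonging between 0 and 1, with each member able to belong to several clusters, is what fuzzy clustering provides. The sketch below uses one-dimensional fuzzy c-means as an example of such a technique; the slides do not name the method, so this choice, the function names, and the evenly spaced initial centers are assumptions.

```python
def memberships(point, centers, fuzziness=2.0):
    """Degrees of belonging (0 to 1, summing to 1) of one point to each cluster."""
    distances = [abs(point - c) for c in centers]
    if 0.0 in distances:                       # point sits exactly on a center
        hit = distances.index(0.0)
        return [1.0 if i == hit else 0.0 for i in range(len(centers))]
    power = 2.0 / (fuzziness - 1.0)
    weights = [d ** -power for d in distances]  # closer centers weigh more
    total = sum(weights)
    return [w / total for w in weights]


def fuzzy_c_means(points, n_clusters=2, fuzziness=2.0, n_iter=50):
    """1-D fuzzy c-means; needs n_clusters >= 2 and at least two distinct points."""
    low, high = min(points), max(points)
    centers = [low + (high - low) * i / (n_clusters - 1)
               for i in range(n_clusters)]
    for _ in range(n_iter):
        table = [memberships(p, centers, fuzziness) for p in points]
        # each center becomes the membership-weighted mean of all points
        centers = [
            sum(row[j] ** fuzziness * p for row, p in zip(table, points))
            / sum(row[j] ** fuzziness for row in table)
            for j in range(n_clusters)
        ]
    return centers, [memberships(p, centers, fuzziness) for p in points]
```

A point halfway between two centers gets a membership of 0.5 in each, which is the "belongs to multiple clusters" case described above.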
56. 56. Time Constraint • A <insert favorite programming language> programmer knows the value of everything, but the cost of nothing. -- Alan J. Perlis
57. 57. Time Constraint • Some problems are insoluble or too complex to be completely solved in reasonable time. • Compromises are necessary, e.g. speed vs precision vs generality • Time complexity (big O notation) of an algorithm quantifies the amount of time taken by an algorithm to run as a function of the length of the string representing the input.
58. 58. Time Complexity (Big O Notation)
59. 59. Choice of Algorithm • Which Algorithm?  Depends on the task  Depends on time available  Depends on the precision required
60. 60. Local and Global Minimum (figure labels: Uphill Searching, Downhill Gradient Searching; source: accp1.org/pharmacometrics/theory.htm)
61. 61. Local Search Algorithms • Local search methods: • steepest descent or best-first criterion • stochastic search • simulated annealing • genetic selection • others
62. 62. Simulated Annealing • Make a random move altering the state. • Assess the fitness of the new state. • Compare the fitness to that of the previous state. • Decide whether to accept the new solution or reject it. • Repeat until you have converged on an acceptable answer.
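The steps above can be sketched as follows. The slides do not say how the accept/reject decision is made, so this sketch assumes the usual Metropolis rule with a geometrically cooling temperature; the objective function, parameter values, and names are all hypothetical choices for illustration.

```python
import math
import random


def objective(x):
    """Illustrative function: local minimum near x = 1.13, global near x = -1.30."""
    return x ** 4 - 3 * x ** 2 + x


def simulated_annealing(f, x0, temperature=2.0, cooling=0.999,
                        step=0.5, n_iter=5000, seed=0):
    rng = random.Random(seed)
    current, current_f = x0, f(x0)
    best, best_f = current, current_f
    for _ in range(n_iter):
        candidate = current + rng.gauss(0.0, step)   # random move
        candidate_f = f(candidate)                   # assess the new state
        delta = candidate_f - current_f              # compare with the previous state
        # always accept improvements; accept worse moves with Metropolis probability
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_f = candidate, candidate_f
            if current_f < best_f:
                best, best_f = current, current_f
        temperature *= cooling                       # cool down
    return best, best_f
```

Accepting some worse moves while the temperature is high is what lets the search climb out of a local minimum; as the temperature falls, the search behaves more and more like plain descent.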
63. 63. Global Search Algorithms • Stochastic optimization • Uphill searching • Basin hopping
64. 64. Local and Global Minimum (figure; source: accp1.org/pharmacometrics/theory.htm)
65. 65. Basin Hopping • The algorithm is iterative, with each cycle composed of the following steps: • Random perturbation of the coordinates • Local minimization • Accept or reject the new coordinates based on the minimized function value
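A one-dimensional sketch of this cycle is shown below. It is a minimal illustration, not a reference implementation: the local minimizer is a crude gradient descent with step halving, the acceptance test is assumed to be a Metropolis rule, and the objective function and all names are hypothetical.

```python
import math
import random


def objective(x):
    """Illustrative function: local minimum near x = 1.13, global near x = -1.30."""
    return x ** 4 - 3 * x ** 2 + x


def local_minimize(f, x, n_iter=200, h=1e-6):
    """Gradient descent with step halving; stops when no step lowers f."""
    for _ in range(n_iter):
        grad = (f(x + h) - f(x - h)) / (2 * h)   # numerical derivative
        step = 1.0
        while step > 1e-12 and f(x - step * grad) >= f(x):
            step /= 2
        if step <= 1e-12:
            break
        x -= step * grad
    return x


def basin_hopping(f, x0, n_hops=50, jump=2.0, temperature=1.0, seed=0):
    rng = random.Random(seed)
    current = local_minimize(f, x0)
    current_f = f(current)
    best, best_f = current, current_f
    for _ in range(n_hops):
        # random perturbation followed by local minimization
        trial = local_minimize(f, current + rng.uniform(-jump, jump))
        trial_f = f(trial)
        # accept or reject based on the minimized function value
        if trial_f <= current_f or rng.random() < math.exp((current_f - trial_f) / temperature):
            current, current_f = trial, trial_f
        if trial_f < best_f:
            best, best_f = trial, trial_f
    return best, best_f
```

Because every trial is first relaxed to the bottom of its basin, the search effectively moves between minima rather than between arbitrary points.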
66. 66. Genetic Algorithms • Many solutions are in the pool, some good, some not so. • Each solution is analogous to a chromosome in genetics
67. 67. Genetic Algorithms • Ways to improve the gene pool: • Combination: combine two or more solutions in the hope of producing a better solution. • Mutation: modify a solution in random places in the hope of producing a better solution. • Crossover: import a solution from a similar problem. • Selection: survival of the fittest.
68. 68. Genetic Algorithm (diagram labels: Gene Pool, Reproduce, Mutate, Select)
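The gene pool cycle can be sketched on a toy problem. The example assumes bit-string chromosomes, a fitness that simply counts 1-bits, tournament selection, and single-point recombination (what the slides call "Combination"); it does not implement the slides' "Crossover" in the sense of importing a solution from a similar problem. All names and parameter values are hypothetical.

```python
import random


def fitness(chromosome):
    """Toy fitness: the number of 1-bits; higher is better."""
    return sum(chromosome)


def genetic_algorithm(n_bits=20, pop_size=30, n_generations=60,
                      mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    pool = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(n_generations):
        pool.sort(key=fitness, reverse=True)
        next_pool = [row[:] for row in pool[:2]]      # keep the two fittest as they are
        while len(next_pool) < pop_size:
            # selection: the fitter of two random candidates becomes a parent
            mother = max(rng.sample(pool, 2), key=fitness)
            father = max(rng.sample(pool, 2), key=fitness)
            cut = rng.randrange(1, n_bits)            # combination of two solutions
            child = mother[:cut] + father[cut:]
            # mutation: flip each gene with a small probability
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_pool.append(child)
        pool = next_pool
    return max(pool, key=fitness)
```

Carrying the two fittest solutions over unchanged guarantees that the best fitness in the pool never decreases from one generation to the next.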
69. 69. I Know First Predictive Algorithm • Most financial time series exhibit classical chaotic behavior. Chaos theory and the classification and predictive capabilities of machine learning have been applied to the forecasting of such time series. • This artificial intelligence approach is at the root of the I Know First predictive algorithm.
70. 70. I Know First Predictive Algorithm • The following slides present the method and the results of applying the algorithm to learn a database of historical time series data.
71. 71. The I Know First Algorithm • The results are constantly improving as the algorithm learns from its successes and failures. • Tracks and predicts the flow of money from one market or investment channel to another. • The system is a predictive model based on Artificial Intelligence and Machine Learning, and incorporates elements of Artificial Neural Networks and Genetic Algorithms. • I Know First predicts 2,000 markets every day.
72. 72. Synopsis of the Algorithm The results are constantly improving as the algorithm learns from its successes and failures
73. 73. Daily Market Heat Map • Two indicators: • Signal: the predicted movement of the asset. • Predictability Indicator: the historical correlation between the prediction and the actual market movement.
74. 74. XOMA returned 61.45% in 1 month from this forecast
75. 75. Forecast vs. Actual
76. 76. I Know First Sample Portfolio
77. 77. The Performance • I Know First Live Portfolio, 2015 performance: I Know First beats the S&P500 by 96.4%.
78. 78. The Performance • I Know First beats the S&P500 by 20.8%.
79. 79. The Performance
80. 80. The Performance
81. 81. Main Features of the Algorithm • Identifies the best market opportunities daily • 6 time frames • Tracks over 3,000 markets • Self-learning, adaptable, always learning new patterns • Scalable • A Decision Support System (DSS) • Predictability indicator • Strong historical performance: 60.66% gain in 2013 • The algorithm becomes more and more accurate with every prediction as it constantly tests multiple models in different market circumstances.
82. 82. More Applications of the I Know First Algorithm • Time series forecasting of multidimensional chaotic systems. • “What if?” scenario-based forecasting.