The correlation coefficient between a random walk and time has a distribution whose histogram (density function) has a characteristic shape. Some findings are reported here, and further investigation is desirable.
Correlations about random_walks
1. Correlations About Random Walks
A (single) corr. coef. > .8 or .9 does not necessarily mean
a high degree of relationship in time series
2014-06-06 Toshiyuki Shimono
The correlation coefficient between two independent
random walks has a distribution whose histogram or
density function has a characteristic shape, as shown above.
A random walk is defined here as a sequence whose increment
at each time step is +1 or -1, each with probability 50%.
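The definition above can be sketched in a few lines of Python. This is only an illustration, not the code behind the slides: the helper names `random_walk` and `pearson`, the seed, and the walk length of 1024 are all assumptions made for the example.

```python
import random
from itertools import accumulate
from math import sqrt


def random_walk(length, rng):
    """Positions after each of `length` steps; every step is +1 or -1 with probability 1/2."""
    return list(accumulate(rng.choice((-1, 1)) for _ in range(length)))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


rng = random.Random(20140606)  # arbitrary seed, used only for reproducibility
length = 1024                  # assumed walk length for this illustration

walk_a = random_walk(length, rng)
walk_b = random_walk(length, rng)
time_axis = list(range(1, length + 1))

rho_time = pearson(walk_a, time_axis)  # (1) random walk versus time
rho_pair = pearson(walk_a, walk_b)     # (2) two independent random walks
print(f"corr(walk, time)   = {rho_time:+.4f}")
print(f"corr(walk, walk')  = {rho_pair:+.4f}")
```

Running the script with different seeds gives a feel for how widely a single correlation coefficient can swing between runs.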
2. Random walks vary widely.
Their correlation coefficients have a broad distribution
even if the walks are just chosen "randomly".
3. Remarks for time series analysis
• Very often, random walks have very high
correlation coefficients ρ, such as > 0.8 or 0.9.
• Thus, unless you know well
how the numbers (such as economic indexes) move,
do not take ρ as a measure of the degree of relationship.
• Multiple high correlations may indicate a significant deviation
from pure random chance, but a rigorous methodology for
detecting that significance is desirable.
4. § Numerical Calculations
To investigate the distributions of correlation coefficients
(1) between random walks and time, and also
(2) between two independent random walks,
a Monte Carlo approach was employed, and trillions of
random variables were generated with the Mersenne Twister.
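A much smaller version of this Monte Carlo experiment can be sketched as follows. It is a hedged illustration rather than the original program: the function `sample_correlations`, the 1000 trials, the walk length 512, and the seed are assumptions. CPython's `random` module uses the Mersenne Twister as its core generator, which matches the generator named above.

```python
import random
from itertools import accumulate
from math import sqrt


def random_walk(length, rng):
    """Positions after each of `length` steps; every step is +1 or -1 with probability 1/2."""
    return list(accumulate(rng.choice((-1, 1)) for _ in range(length)))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def sample_correlations(n_trials, length, seed):
    """Draw n_trials correlation coefficients for cases (1) and (2)."""
    rng = random.Random(seed)  # Mersenne Twister based generator
    time_axis = list(range(1, length + 1))
    with_time = []      # (1) random walk versus time
    between_walks = []  # (2) two independent random walks
    for _ in range(n_trials):
        walk_a = random_walk(length, rng)
        walk_b = random_walk(length, rng)
        with_time.append(pearson(walk_a, time_axis))
        between_walks.append(pearson(walk_a, walk_b))
    return with_time, between_walks


# Assumed sizes: far smaller than the trillions of random variables in the slides.
with_time, between_walks = sample_correlations(n_trials=1000, length=512, seed=1)

share_time = sum(abs(r) > 0.8 for r in with_time) / len(with_time)
share_pair = sum(abs(r) > 0.8 for r in between_walks) / len(between_walks)
print(f"share of |rho| > 0.8, walk vs time : {share_time:.3f}")
print(f"share of |rho| > 0.8, walk vs walk : {share_pair:.3f}")
```

With only 1000 trials the estimates are coarse; the digits quoted in the slides require vastly more samples.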
5. The density functions for
(1) corr. coef. of random walk and time
(2) corr. coef. of two independent random walks
• The Mersenne Twister produced trillions of random variables to
calculate billions of random walks here.
• Each vertical segment inside the plots marks a percentile at 1.25% intervals.
[Figure: density plots of (1) and (2); values annotated on the plots: 0.3099.., 1.018.., 1.018.., 0.6054..]
6. Main stats.
• The two distributions seem to have their modes
at ±0.910.. and 0, respectively.
• The two distributions have variances
of 0.4341.. = (0.659..)² and 0.2405.. = (0.490..)², respectively.
• When only their positive parts are considered,
the two distributions have medians
of 0.6613.. and 0.4136.., respectively.
• For 0 < ε << 1, the density functions seem to be
approximated as exp(-c/ε²) with a positive constant c.
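The variance and the median of the positive part can be estimated from a small sample as sketched below. This is an assumption-laden illustration: the helper names, the 1000 trials, the walk length 512, and the seed are not from the slides, and the variance is computed about zero on the assumption that both distributions are symmetric. The modes are not estimated here because a stable estimate needs far more samples.

```python
import random
import statistics
from itertools import accumulate
from math import sqrt


def random_walk(length, rng):
    """Positions after each of `length` steps; every step is +1 or -1 with probability 1/2."""
    return list(accumulate(rng.choice((-1, 1)) for _ in range(length)))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def variance_about_zero(values):
    """Mean of squares; equals the variance if the distribution is symmetric about 0."""
    return sum(v * v for v in values) / len(values)


def positive_median(values):
    """Median of the strictly positive values only."""
    return statistics.median([v for v in values if v > 0])


rng = random.Random(7)          # arbitrary seed
length, n_trials = 512, 1000    # assumed sizes for this illustration
time_axis = list(range(1, length + 1))

with_time, between_walks = [], []
for _ in range(n_trials):
    walk_a = random_walk(length, rng)
    walk_b = random_walk(length, rng)
    with_time.append(pearson(walk_a, time_axis))
    between_walks.append(pearson(walk_a, walk_b))

var_time = variance_about_zero(with_time)
var_pair = variance_about_zero(between_walks)
med_time = positive_median(with_time)
med_pair = positive_median(between_walks)
print(f"variance: walk vs time {var_time:.3f}, walk vs walk {var_pair:.3f}")
print(f"median of positive part: walk vs time {med_time:.3f}, walk vs walk {med_pair:.3f}")
```

The printed values can be compared with the figures listed above, keeping in mind that a sample of this size fixes at most one or two digits.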
7. Probabilities, percentiles
For the correlation coefficients between a random walk and time:
The probability of being over 0.6 is 0.2823.. ≈ sin(0.6)/2
The probability of being over 0.8 is 0.1582.. ≈ 1/40^(1/2)
The probability of being over 0.81 is 0.1501..
The probability of being over 0.8669.. (≈ π^(-1/8)) is 0.1
For the correlation coefficients between two random walks:
The probability of being over 0.4529 or below -0.4529 is 0.4529..
The 99.5 percentile is 0.9051..
The 99.95 percentile is 0.9481..
The 99.995 percentile is 0.9670..
The 99.9995 percentile is 0.9771..
The density function at 0 is 0.6055.. ≈ π -1/6 /2
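Tail probabilities and percentiles of this kind can be estimated empirically, as in the following sketch. The helpers `tail_probability` and `percentile` (a nearest-rank percentile), the 1000 trials, the walk length 512, and the seed are assumptions for illustration. Extreme percentiles such as 99.995 cannot be resolved with so few samples, so only the 99.5 percentile is computed.

```python
import random
from itertools import accumulate
from math import ceil, sqrt


def random_walk(length, rng):
    """Positions after each of `length` steps; every step is +1 or -1 with probability 1/2."""
    return list(accumulate(rng.choice((-1, 1)) for _ in range(length)))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def tail_probability(values, threshold):
    """Fraction of the sample strictly above `threshold`."""
    return sum(v > threshold for v in values) / len(values)


def percentile(values, q):
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


rng = random.Random(42)         # arbitrary seed
length, n_trials = 512, 1000    # assumed sizes for this illustration
time_axis = list(range(1, length + 1))

with_time, between_walks = [], []
for _ in range(n_trials):
    walk_a = random_walk(length, rng)
    walk_b = random_walk(length, rng)
    with_time.append(pearson(walk_a, time_axis))
    between_walks.append(pearson(walk_a, walk_b))

p_over_06 = tail_probability(with_time, 0.6)
p_over_08 = tail_probability(with_time, 0.8)
p_two_sided = tail_probability([abs(r) for r in between_walks], 0.4529)
p995 = percentile(between_walks, 99.5)

print(f"P(rho > 0.6), walk vs time      : {p_over_06:.3f}")
print(f"P(rho > 0.8), walk vs time      : {p_over_08:.3f}")
print(f"P(|rho| > 0.4529), walk vs walk : {p_two_sided:.3f}")
print(f"99.5 percentile, walk vs walk   : {p995:.3f}")
```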
8. Conclusions and summary
• The correlation coefficients above have wide distributions with
characteristic shapes, which may raise grave concerns
for time series analysis.
• The convergence behavior also depends on the length L of the
random walks; L = 512, 1024, and 2048 were used to analyze the
convergence, which is not well presented in this document.
• Quadrillions (10¹⁵) of random numbers are desirable to
determine more digits. Then the discovered numbers can
easily be compared with meaningful mathematical
constants.
• Mathematical analysis is desirable to determine
the density functions of the distributions
(as a matter of course!)
and to establish some new hypothesis tests.
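The dependence on the walk length L mentioned in the summary can be probed with a small experiment like the one below. It is only a sketch under stated assumptions: the function `variance_for_length`, the 300 trials per length, and the seeds are hypothetical choices, and only the correlation between a random walk and time is examined.

```python
import random
from itertools import accumulate
from math import sqrt


def random_walk(length, rng):
    """Positions after each of `length` steps; every step is +1 or -1 with probability 1/2."""
    return list(accumulate(rng.choice((-1, 1)) for _ in range(length)))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def variance_for_length(length, n_trials, seed):
    """Mean square of corr(walk, time) over n_trials walks of the given length."""
    rng = random.Random(seed)
    time_axis = list(range(1, length + 1))
    total = 0.0
    for _ in range(n_trials):
        rho = pearson(random_walk(length, rng), time_axis)
        total += rho * rho
    return total / n_trials


# The three lengths named in the slides; the trial count is an assumption.
variances = {
    length: variance_for_length(length, n_trials=300, seed=length)
    for length in (512, 1024, 2048)
}
for length, value in variances.items():
    print(f"L = {length:4d}: estimated variance of corr(walk, time) = {value:.3f}")
```

At this sample size the differences between the three lengths are likely to be smaller than the sampling noise, so a serious convergence study would need many more trials per length.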
9. Appendix.
• Raw data from the calculations are presented here for
further analysis in the future.