Modeling and Analysis of Bitcoin Volatility Based on ARMA-EGARCH Model (2)

2. Stationarity of the time series

If the series is non-stationary, it needs to be transformed into an approximately stationary one. The common approach is differencing: in theory, a non-stationary series can be brought close to stationarity by differencing it enough times. If the sample series is covariance-stationary, the expectation, variance and autocovariance of its observations do not change over time, which makes the series much more convenient for statistical inference.
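As a rough illustration of differencing, here is a minimal sketch in plain Python. The random walk with drift is made-up data, not the article's `kline_all`, and `difference` is a hypothetical helper name. Note that a log return is itself the first difference of the log price.

```python
import random

def difference(series, order=1):
    """Apply first differencing `order` times: y_t = x_t - x_{t-1}."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out

# Hypothetical data: a random walk with drift, which is non-stationary
random.seed(0)
steps = [0.1 + random.gauss(0, 1) for _ in range(500)]
walk = []
level = 0.0
for s in steps:
    level += s
    walk.append(level)

# One round of differencing recovers the stationary increments
diffed = difference(walk)
print(len(walk), len(diffed))
```

Each round of differencing shortens the series by one observation, so in practice the order is kept as low as the stationarity test allows.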

The unit root test used here is the ADF (Augmented Dickey-Fuller) test, which judges significance with a t-statistic compared against Dickey-Fuller critical values. In principle, if the series shows no obvious trend, only the constant term is kept in the test regression; if the series has a trend, the regression should include both a constant term and a time trend term. The lag length can be chosen with an information criterion such as AIC or BIC. The test formula is as follows:

7.png
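For reference, the ADF regression is commonly written as below. This is the standard textbook notation, so the symbols are an assumption and may differ from those in the figure above.

```latex
\Delta y_t = \alpha + \beta t + \gamma\, y_{t-1}
           + \sum_{i=1}^{p} \delta_i\, \Delta y_{t-i} + \varepsilon_t ,
\qquad H_0:\ \gamma = 0 \ \text{(unit root)}, \quad H_1:\ \gamma < 0 \ \text{(stationary)}
```

Dropping the $\beta t$ term gives the constant-only version, and $p$ is the lag length selected by AIC or BIC.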

In [8]:

# np, pd, sm (statsmodels.api) and kline_all are assumed to be defined in earlier cells
stable_test = kline_all['log_return']
# Run the ADF test twice, selecting the lag length by AIC and by BIC
adftest = sm.tsa.stattools.adfuller(np.array(stable_test), autolag='AIC')
adftest2 = sm.tsa.stattools.adfuller(np.array(stable_test), autolag='BIC')
output = pd.DataFrame(index=['ADF Statistic Test Value', "ADF P-value", "Lags", "Number of Observations",
                             "Critical Value(1%)", "Critical Value(5%)", "Critical Value(10%)"],
                      columns=['AIC', 'BIC'])
# adfuller returns (statistic, p-value, lags used, nobs, critical values, best IC).
# Use .loc[row, col] rather than chained indexing, which may not write to the frame.
for col, res in (('AIC', adftest), ('BIC', adftest2)):
    output.loc['ADF Statistic Test Value', col] = res[0]
    output.loc['ADF P-value', col] = res[1]
    output.loc['Lags', col] = res[2]
    output.loc['Number of Observations', col] = res[3]
    output.loc['Critical Value(1%)', col] = res[4]['1%']
    output.loc['Critical Value(5%)', col] = res[4]['5%']
    output.loc['Critical Value(10%)', col] = res[4]['10%']
output

Out[8]:

8.png

The null hypothesis is that the series has a unit root, that is, it is non-stationary; the alternative hypothesis is that the series is stationary. The test P-value is far below the 0.05 significance level, so we reject the null hypothesis. The log return is therefore a stationary series and can be modeled with statistical time series models.
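The decision rule can be sketched as a small helper. `interpret_adf` is a hypothetical name, and the numbers in `example` are made up for illustration, not the values from the table above; the tuple layout mirrors how the `adfuller` result is indexed in the cell above.

```python
def interpret_adf(result, alpha=0.05):
    """Interpret a tuple shaped like the adfuller output.

    H0: the series has a unit root (non-stationary).
    H1: the series is stationary.
    """
    stat, pvalue, crit = result[0], result[1], result[4]
    reject = pvalue < alpha
    return {
        "reject_unit_root": reject,
        # A statistic more negative than the critical value also favors rejection
        "stat_below_5pct_critical": stat < crit["5%"],
        "conclusion": "stationary" if reject else "unit root not rejected",
    }

# Made-up result: (statistic, p-value, lags, nobs, critical values, best IC)
example = (-12.3, 1e-22, 5, 994, {"1%": -3.44, "5%": -2.86, "10%": -2.57}, 0.0)
summary = interpret_adf(example)
print(summary["conclusion"])
```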

3. Model identification and order determination

To establish the mean equation, we need to run an autocorrelation test on the series to make sure the error term has no autocorrelation. First, plot the autocorrelation function (ACF) and the partial autocorrelation function (PACF) as follows:

In [19]:

tsplot(kline_all['log_return'], kline_all['log_return'], title='Log Return', lags=100)

Out[19]:

9.png

It can be seen that the cut-off is very clear. At that moment, this plot gave me an idea: is the market really inefficient? To verify this, we will do an autocorrelation analysis of the return series and determine the lag order of the model.

The autocorrelation coefficient measures the correlation between a series and its own past values, that is, the correlation between r(t) and r(t-l), the value l periods earlier:

10.png
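In standard notation (an assumption, since the symbols in the figure may differ), the lag-$l$ autocorrelation of a weakly stationary return series and its usual sample estimate are:

```latex
\rho_l = \frac{\operatorname{Cov}(r_t, r_{t-l})}{\operatorname{Var}(r_t)},
\qquad
\hat{\rho}_l = \frac{\sum_{t=l+1}^{T} (r_t - \bar{r})(r_{t-l} - \bar{r})}
                    {\sum_{t=1}^{T} (r_t - \bar{r})^2}
```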

Next, let's do a quantitative test. The null hypothesis is that all autocorrelation coefficients are 0, that is, there is no autocorrelation in the series. The test statistic is written as follows:

11.png
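The Q statistic returned by `sm.tsa.acf(..., qstat=True)` is the Ljung-Box statistic, which in standard notation reads:

```latex
Q(m) = T(T+2) \sum_{l=1}^{m} \frac{\hat{\rho}_l^{\,2}}{T - l}
\;\sim\; \chi^2_m \quad \text{under } H_0:\ \rho_1 = \dots = \rho_m = 0
```

Here $T$ is the sample size and $m$ is the number of lags tested; a large $Q(m)$, and hence a small P-value, speaks against the null hypothesis.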

Fifteen autocorrelation coefficients were taken for analysis, as follows:

In [9]:

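# Test 15 autocorrelation coefficients; qstat=True also returns the Ljung-Box Q statistics and P-values.
# Recent statsmodels releases name the first option `adjusted`; older releases called it `unbiased`.
acf,q,p = sm.tsa.acf(kline_all['log_return'], nlags=15, adjusted=True, qstat=True, fft=False)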
output = pd.DataFrame(np.c_[range(1,16), acf[1:], q, p], columns=['lag', 'ACF', 'Q', 'P-value'])
output = output.set_index('lag')
output

Out[9]:

12.png

According to the Q statistics and P-values, the autocorrelation function (ACF) gradually decays toward 0 after lag 0. The P-values of the Q statistics are small enough to reject the null hypothesis, so there is autocorrelation in the series.
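To make the test more concrete, here is a minimal sketch that computes the sample autocorrelations and the Ljung-Box Q statistic by hand in plain Python. The data are synthetic (white noise and a hypothetical AR(1) series with coefficient 0.5), the helper names are illustrative, and the plain estimator with denominator T is used, so the numbers would differ slightly from the adjusted estimator in the cell above.

```python
import random

def sample_acf(x, nlags):
    """Sample autocorrelations rho_1..rho_nlags (plain estimator, denominator T)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - l] for t in range(l, n)) / c0
            for l in range(1, nlags + 1)]

def ljung_box_q(x, nlags):
    """Ljung-Box Q(m) = T(T+2) * sum_{l=1..m} rho_l^2 / (T - l)."""
    n = len(x)
    rho = sample_acf(x, nlags)
    return n * (n + 2) * sum(r * r / (n - l) for l, r in enumerate(rho, start=1))

# Synthetic data: white noise, and an AR(1) series built from the same shocks
rng = random.Random(42)
noise = [rng.gauss(0, 1) for _ in range(2000)]
ar1 = [0.0]
for e in noise[1:]:
    ar1.append(0.5 * ar1[-1] + e)

# 5% critical value of the chi-square distribution with 10 degrees of freedom
CHI2_95_DF10 = 18.307

for name, series in (("white noise", noise), ("AR(1)", ar1)):
    q = ljung_box_q(series, 10)
    print(name, round(q, 2), "reject H0" if q > CHI2_95_DF10 else "do not reject H0")
```

Comparing Q with a fixed critical value is a simplification; the `sm.tsa.acf` call above reports the P-value for each lag directly.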

To be continued...