A time series is a series of data points listed (indexed) in time order. Most commonly, it is a sequence of observations taken at successive, equally spaced points in time, which makes it a sequence of discrete-time data. The correlation between the observations of a time series and the observations at previous time steps (lags) can be calculated. Because this correlation is computed against values of the same series at earlier times, it is called serial correlation, or autocorrelation.

pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language. The following modules need to be imported to carry out time series analysis:
import pandas
import numpy
from pandas import Series, DataFrame
pandas.Series.autocorr()
can be used to calculate the lag-N autocorrelation of a Series (default lag=1).
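As a minimal sketch of what this call computes: the lag-N autocorrelation is the Pearson correlation between the series and a copy of itself shifted by N steps. The random-walk data, the seed, and the choice of lag=3 below are assumptions made purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: a random walk, which is strongly autocorrelated.
rng = np.random.default_rng(0)
s = pd.Series(rng.standard_normal(200).cumsum())

lag = 3

# Built-in lag-N autocorrelation.
builtin = s.autocorr(lag=lag)

# Manual equivalent: Pearson correlation between the series
# and the same series shifted forward by `lag` steps.
manual = s.corr(s.shift(lag))

print(builtin, manual)
```

Both values should agree, since shifting introduces missing values at the start that the correlation simply ignores.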
However, to calculate the autocorrelation of each column of a DataFrame, the following function can be used:
def df_autocorr(df, lag=1, axis=0):
    """Compute full-sample column-wise autocorrelation for a DataFrame."""
    return df.apply(lambda col: col.autocorr(lag), axis=axis)

# Example: 100 rows of random noise in 6 columns
d1 = DataFrame(numpy.random.randn(100, 6))
df_autocorr(d1)
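The same idea can be extended to several lags at once. The sketch below is one possible way to do it, not part of pandas itself: the helper name df_autocorr_lags, the column names, and the random-walk data are all hypothetical and chosen for illustration only.

```python
import numpy as np
import pandas as pd

def df_autocorr_lags(df, lags):
    """Return one row per lag and one column per column of df (hypothetical helper)."""
    per_lag = {lag: df.apply(lambda col: col.autocorr(lag)) for lag in lags}
    # Dict keys become columns, so transpose to get lags as the row index.
    return pd.DataFrame(per_lag).T

# Hypothetical example data: three independent random walks.
rng = np.random.default_rng(42)
walk = pd.DataFrame(
    rng.standard_normal((100, 3)).cumsum(axis=0),
    columns=["a", "b", "c"],
)

table = df_autocorr_lags(walk, lags=[1, 2, 5])
print(table)
```

Reading down a column of the resulting table shows how quickly the correlation of that series with its own past decays as the lag grows.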
Questions? Please feel free to write to bd@agilytics.in