A Step-by-Step Approach to Time Series Analysis

#datascience #machinelearning #timeseries

Nilesh Oct 13 2020 · 4 min read

What is Time Series Analysis?

Time series analysis is the study of data recorded over time in order to understand its past behavior and patterns. The insights it yields support faster, better-informed decisions and form the basis for predicting future values.

Examples:

Predicting share prices in the stock market based on previous trends.

Predicting the demand for houses in real estate.

What is the difference between Time Series Analysis and Time Series Forecasting?

Time series analysis comprises methods for analyzing time series data in order to extract meaningful statistics and other characteristics of the data.

Time series forecasting is the use of a model to predict future values based on previously observed values. (Source: Wikipedia)

A step-wise approach to handling Time Series Data

To analyze data efficiently, we need to understand the type of data we are dealing with and interpret it accordingly. This helps us identify patterns in the data, which in turn leads to more accurate predictions.

This blog covers the following concepts:

  • Identifying Time Series Data
  • Understanding Stationary Time Series
  • Patterns in Time Series Data
  • Decomposition of Data
  • Smoothing the Time Series Data
  • Estimation of Smoothing Coefficient

    Identifying Time Series Data

    Data observed at regular time intervals is called time series data.

    A simple example of a time series model is the random walk, in which each value equals the previous value plus a random error:

    y(t) = y(t-1) + ε(t)

    where 'y(t)' is the observed value at time 't' and

    'ε(t)' is the error at time 't'.

    [Figure: an example plot of time series data]
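As a minimal sketch, the equation above (each value is the previous value plus an error term) can be simulated in a few lines of Python. The function name `simulate_random_walk`, the Gaussian error with standard deviation `sigma`, and the fixed seed are assumptions made for illustration; they are not part of the original post.

```python
import random


def simulate_random_walk(n_steps, start=0.0, sigma=1.0, seed=42):
    """Generate y(t) = y(t-1) + e(t), with e(t) drawn from a normal distribution."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    series = [start]
    for _ in range(n_steps - 1):
        error = rng.gauss(0.0, sigma)  # e(t): the random error at time t
        series.append(series[-1] + error)
    return series


walk = simulate_random_walk(10)
print(walk)
```

Each printed value differs from the previous one only by the random error drawn at that step, which is exactly what the equation states.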

    Understanding Stationary Time Series

    Many time series methods assume that the data is a stationary series, so let us understand what a stationary series is.

    A stationary series exhibits the following characteristics:

    1.) the mean is constant over time

    2.) the variance is constant over time

    3.) the covariance between two observations depends only on the gap (lag) between them, not on the time at which they are observed

    Applying such methods to a non-stationary series can result in inaccurate predictions.

    [Figure: series with constant vs. variable mean]
    [Figure: series with constant vs. variable variance]
    [Figure: series with constant vs. variable covariance]
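One rough, informal way to get a feel for the first two characteristics is to split a series into windows and compare the mean and variance of each window. The sketch below does only that; the helper name `rolling_stats`, the window size, and the two toy series are assumptions for illustration, and this eyeball check is not a substitute for a formal statistical test of stationarity.

```python
from statistics import mean, pvariance


def rolling_stats(series, window):
    """Return (mean, variance) for each consecutive, non-overlapping window."""
    stats = []
    for start in range(0, len(series) - window + 1, window):
        chunk = series[start:start + window]
        stats.append((mean(chunk), pvariance(chunk)))
    return stats


# A repeating pattern: mean and variance are the same in every window.
stationary_like = [1, 3, 2, 1, 3, 2, 1, 3, 2, 1, 3, 2]
# A steadily rising series: the window mean keeps increasing.
trending = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

print(rolling_stats(stationary_like, 3))
print(rolling_stats(trending, 3))
```

If the window means or variances drift noticeably from one window to the next, as they do for the rising series, the series is unlikely to be stationary.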

    Patterns in Time Series Data

    Past data can be analyzed in terms of the following components:

    Trend: a long-term increase or decrease in the data.

    Seasonal: a pattern that repeats over a fixed, known period, such as the time of year or the day of the week.

    Cyclic: rises and falls in the data that do not have a fixed period.

    Irregular/Random: the variation that remains once the above three components are accounted for.

    [Figure: common patterns in time series]
    [Figure: time series data and its components]

    Decomposition of Data

    Extracting the four components mentioned above (trend, seasonal, cyclic and random) is called decomposition of data.

    Decomposition helps us identify patterns in the data, which leads to more accurate predictions. The following models are used for decomposition.

    Types of Decomposition Models

    1.) Additive Model

    The additive model is used when the seasonal variations are relatively constant over time.

    The time series data may be decomposed as follows:

    Time Series Data = (Seasonal) + (Trend) + (Random)

    Seasonality makes it difficult to tell whether the data shows an upward or downward trend. We therefore remove the seasonal component (deseasonalization), which gives us the seasonally adjusted values.

    Seasonally adjusted values = Time Series Data - (Seasonal) = (Trend) + (Random)
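The deseasonalization step above can be sketched as follows. This uses one classical approach (a centered moving average to estimate the trend, then per-season averages of the detrended values), chosen here for illustration. The helper names, the quarterly period of 4, and the synthetic data are assumptions, not the author's code.

```python
from statistics import mean


def centered_moving_average(series, period):
    """Trend estimate; None where the window does not fit (even period assumed)."""
    half = period // 2
    trend = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half:t + half + 1]
        # End points get half weight so the window spans exactly one period.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period
    return trend


def additive_seasonal_indices(series, period):
    """Average of (observation - trend) for each season, shifted to sum to zero."""
    trend = centered_moving_average(series, period)
    buckets = [[] for _ in range(period)]
    for t, (y, tr) in enumerate(zip(series, trend)):
        if tr is not None:
            buckets[t % period].append(y - tr)
    raw = [mean(bucket) for bucket in buckets]
    offset = mean(raw)
    return [value - offset for value in raw]


# Synthetic quarterly data: a linear trend plus a fixed seasonal pattern.
seasonal_pattern = [5.0, -2.0, -4.0, 1.0]
series = [10.0 + 2.0 * t + seasonal_pattern[t % 4] for t in range(16)]

indices = additive_seasonal_indices(series, period=4)
# Seasonally adjusted values = Time Series Data - (Seasonal)
adjusted = [y - indices[t % 4] for t, y in enumerate(series)]
print(indices)
print(adjusted)
```

Because the synthetic data has no random component, the adjusted values fall back onto the underlying straight-line trend, which makes the effect of removing seasonality easy to see.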

    2.) Multiplicative Model

    The multiplicative model is used when the seasonal variations increase or decrease over time, typically in proportion to the level of the series.

    Time Series Data = (Seasonal) * (Trend) * (Random)

    Seasonally adjusted values = Time Series Data / (Seasonal) = (Trend) * (Random)

    [Figure: data with seasonality]
    [Figure: data after seasonality removal]
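For the multiplicative case, the same idea works with ratios instead of differences. The sketch below estimates each seasonal index as the average ratio of the observation to a centered-moving-average trend, then divides the data by that index. As before, the helper names, the period of 4, and the synthetic data are illustrative assumptions, and the estimated indices only approximate the true pattern.

```python
from statistics import mean


def centered_moving_average(series, period):
    """Trend estimate; None where the window does not fit (even period assumed)."""
    half = period // 2
    trend = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half:t + half + 1]
        trend[t] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
    return trend


def multiplicative_seasonal_indices(series, period):
    """Average of (observation / trend) for each season, scaled to average 1."""
    trend = centered_moving_average(series, period)
    buckets = [[] for _ in range(period)]
    for t, (y, tr) in enumerate(zip(series, trend)):
        if tr is not None:
            buckets[t % period].append(y / tr)
    raw = [mean(bucket) for bucket in buckets]
    scale = mean(raw)
    return [value / scale for value in raw]


# Synthetic quarterly data whose seasonal swing grows with the level.
true_pattern = [1.2, 0.9, 0.8, 1.1]
series = [(100.0 + 10.0 * t) * true_pattern[t % 4] for t in range(16)]

indices = multiplicative_seasonal_indices(series, period=4)
# Seasonally adjusted values = Time Series Data / (Seasonal)
adjusted = [y / indices[t % 4] for t, y in enumerate(series)]
print(indices)
```

The indices are scaled to average 1, so a value above 1 marks a season that runs above the trend and a value below 1 marks a season that runs below it.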

    Smoothing the Time Series Data

    When the seasonally adjusted values still vary too much, the trend may not be clear enough to forecast accurately. Smoothing removes these variations so that a clear trend pattern emerges for forecasting.

    Commonly used smoothing techniques are:

    1.) Simple moving average:

    A moving average is the average of the most recent observed values, recomputed at each time period. In the general (weighted) form, each observation is assigned a weight:

      MA = ( α(0)y(t) + α(1)y(t-1) + … + α(n-1)y(t-n+1) ) / ( α(0) + α(1) + … + α(n-1) )

      'α' is the weight assigned to the data

      'y(t)' is the latest observation (for example, the last day's data)

      'n' is the number of time periods over which the moving average is calculated.

    In the simple moving average every observation is assigned an equal weight, i.e. α(0) = α(1) = ... = α(n-1) = 1 or any other constant:

      SMA = ( y(t) + y(t-1) + … + y(t-n+1) ) / n

    *The use of moving averages for forecasting will be covered in detail in the next blog.
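A minimal sketch of both averages is given below. It assumes the window covers the latest 'n' observations, y(t) back to y(t-n+1), and that the weighted version divides by the sum of the weights so that the result stays a true average. The function names and the sample `prices` are made up for illustration.

```python
def simple_moving_average(series, n):
    """Mean of the latest n observations, for every t where n values exist."""
    if n <= 0 or n > len(series):
        raise ValueError("n must be between 1 and the length of the series")
    return [sum(series[t - n + 1:t + 1]) / n for t in range(n - 1, len(series))]


def weighted_moving_average(series, weights):
    """weights[0] applies to the latest value y(t), weights[1] to y(t-1), ..."""
    n = len(weights)
    total_weight = sum(weights)
    result = []
    for t in range(n - 1, len(series)):
        window = series[t - n + 1:t + 1][::-1]  # latest observation first
        weighted_sum = sum(w * y for w, y in zip(weights, window))
        result.append(weighted_sum / total_weight)
    return result


prices = [10, 12, 11, 13, 15, 14]
print(simple_moving_average(prices, 3))          # [11.0, 12.0, 13.0, 14.0]
print(weighted_moving_average(prices, [3, 2, 1]))
```

With equal weights the weighted version reduces to the simple moving average, whatever constant is used for the weights.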

    2.) Exponential smoothing

    Whereas the simple moving average weights past observations equally, exponential smoothing assigns weights that decrease exponentially as the observations get older.

    Exponential smoothing equation:

      S(t) = αy(t) + (1-α)S(t-1)

      S(t) = αy(t) + α(1-α)y(t-1) + α(1-α)^2 y(t-2) + … + (1-α)^(t-1) y(1)   (Expanded)

      'S(t)' is the exponentially smoothed series at period 't', where 't' > 0 and S(1) = y(1).

      'y(t)' denotes the latest observation in the series at period 't' and 'y(1)' is the first observation.

      'α' is the smoothing constant (alpha), with a value between 0 and 1.
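The recursive and expanded forms above can be checked against each other with a short sketch. It assumes the smoothed series starts at the first observation, S(1) = y(1), which is the starting point the expanded form implies. The function names and the sample data are illustrative.

```python
def exponential_smoothing(series, alpha):
    """Recursive form: S(t) = alpha * y(t) + (1 - alpha) * S(t-1), with S(1) = y(1)."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the interval (0, 1]")
    smoothed = [series[0]]
    for y in series[1:]:
        smoothed.append(alpha * y + (1 - alpha) * smoothed[-1])
    return smoothed


def expanded_last_value(series, alpha):
    """Expanded form: the last smoothed value as a weighted sum of all observations."""
    t = len(series)
    total = (1 - alpha) ** (t - 1) * series[0]  # weight on the first observation
    for k in range(t - 1):
        # The observation k steps back gets the weight alpha * (1 - alpha)^k.
        total += alpha * (1 - alpha) ** k * series[t - 1 - k]
    return total


data = [10.0, 12.0, 11.0, 13.0]
print(exponential_smoothing(data, 0.5))   # [10.0, 11.0, 11.0, 12.0]
print(expanded_last_value(data, 0.5))     # 12.0
```

A larger alpha makes the smoothed series follow the latest observations more closely, while a smaller alpha gives a smoother series that reacts more slowly.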

    Estimation of Smoothing Coefficient (alpha)

    We compare the forecasted values with the observed values for different values of alpha and choose the value of alpha that gives the minimum forecast error. The error is measured by the MSE (Mean Squared Error), the average of the squared differences between the observed and forecasted values:

      MSE = ( (y(1) - F(1))^2 + (y(2) - F(2))^2 + … + (y(n) - F(n))^2 ) / n

      where 'F(t)' is the forecasted value for period 't' and 'n' is the number of forecasts.

    The optimal value of alpha can be found with a numerical optimizer, such as those in the SciPy library.
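The selection of alpha described above can be sketched with a plain grid search, used here in place of a SciPy optimizer so that the example needs only the standard library. The sketch assumes the forecast for each period is the smoothed value of the previous period; the function names, the candidate grid, and the sample data are illustrative.

```python
def one_step_mse(series, alpha):
    """MSE of one-step-ahead forecasts, where the forecast for y(t) is S(t-1)."""
    smoothed = series[0]  # S(1) = y(1)
    squared_errors = []
    for y in series[1:]:
        squared_errors.append((y - smoothed) ** 2)  # forecast made before seeing y
        smoothed = alpha * y + (1 - alpha) * smoothed
    return sum(squared_errors) / len(squared_errors)


def best_alpha(series, candidates):
    """Return the candidate alpha with the smallest one-step-ahead MSE."""
    return min(candidates, key=lambda alpha: one_step_mse(series, alpha))


candidates = [i / 100 for i in range(1, 100)]  # 0.01, 0.02, ..., 0.99
rising = [float(v) for v in range(1, 11)]      # a steadily rising series

alpha = best_alpha(rising, candidates)
print(alpha, one_step_mse(rising, alpha))
```

For a steadily rising series like this one, the search picks the largest candidate, because heavier weight on the latest observation reduces the lag behind the trend. On noisier data a smaller alpha often wins.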

    Smoothing techniques are suitable for short-term predictions. For long-term predictions, such as forecasting values 1 or 2 years ahead, we need more advanced methods, which we will cover in upcoming blogs:

  • Seasonal Indexing (I)
  • Autoregressive (AR)
  • Moving Average (Detailed) (MA)
  • Autoregressive moving average (ARMA)
  • Autoregressive integrated moving average (ARIMA)

    Learn-Share-grow 😊

    For feedback or discussion, comment below or reach out to me on my LinkedIn profile.
