Time Series Forecasting: how to choose your algorithm?

Moving Average vs. Exponential Smoothing? What about ARIMA? Is it better to have one algorithm or many at your disposal so that you can switch if need be?

The answer to these questions will vary with the specificities of the signals you are trying to forecast. There is no universal answer, but we will try to give you some best practices to help you make this call.

One size does not fit all

That may sound obvious, but if you are trying to improve your Forecast Accuracy you need to understand each signal’s specificities and forecast it accordingly. If you have a Seasonal signal, you will need a Seasonal algorithm; if you have some clear trends, you will need an algorithm with trends, etc.

Therefore good forecasters do not have just one method at their disposal but many, and will choose the best one for each signal based on its specificities. The next question is how to select this algorithm.

“Natural choice” vs. “Black Box”

Most Forecasting software performs an automatic selection of the forecasting algorithm using the “Best in Class” approach. Because its designers understand well that one size does not fit all, the software uses its computing power to calculate forecasts on each signal with many different algorithms and selects the one that works best on each signal. The “best” algorithm is usually selected through “Backtesting”, i.e. forecasting the last few data points using the previous ones and selecting the algorithm that minimizes the forecast error on these last few data points.
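To make the “Backtesting” idea concrete, here is a minimal Python sketch. The three candidate methods, the holdout of 4 points and the quarterly signal are all our own illustrative assumptions; real software will use its own candidates and error metric.

```python
from statistics import mean

def naive(history, horizon):
    """Repeat the last observed value."""
    return [history[-1]] * horizon

def moving_average(history, horizon, window=3):
    """Repeat the mean of the last `window` values."""
    return [mean(history[-window:])] * horizon

def seasonal_naive(history, horizon, period=4):
    """Repeat the value observed one season earlier."""
    return [history[-period + (h % period)] for h in range(horizon)]

def backtest_mae(method, signal, holdout):
    """Forecast the last `holdout` points from the earlier ones; return the mean absolute error."""
    train, test = signal[:-holdout], signal[-holdout:]
    forecast = method(train, holdout)
    return mean(abs(f - a) for f, a in zip(forecast, test))

# Hypothetical quarterly signal with a peak every 4th period
signal = [10, 12, 11, 20, 11, 13, 12, 22, 12, 14, 13, 24]
candidates = {
    "naive": naive,
    "moving_average": moving_average,
    "seasonal_naive": seasonal_naive,
}
errors = {name: backtest_mae(fn, signal, holdout=4) for name, fn in candidates.items()}
best = min(errors, key=errors.get)   # the "Best in Class" winner on this holdout
```

On this signal the seasonal method wins the backtest, but the ranking depends entirely on the holdout window you pick.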

That might sound like a great idea, but it can be a very risky strategy because different forecasting algorithms are likely to give very different forecasts. Switching algorithms back and forth is likely to generate high variability in your forecast from release to release, which affects your credibility as a forecaster and can generate serious business problems down the line. If you are Producing according to your forecast, for example, you might be familiar with the bullwhip effect and know that large variations in your forecast are likely to generate Capacity and Inventory issues.

Another drawback of this approach is that it tends to treat forecasting as a “Black Box”, which makes it difficult to explain which forecasting algorithm has been used. A Forecaster’s job is less about producing the forecast itself than about convincing peers and Management that this is the one they should base their decisions upon. A forecast that is not validated by your Management or peers is of little value, and when you are challenged about how you built it, the last thing you want is to have sentences such as “the computer said…” or “I have no idea how it was calculated, the machine did it for me” as your only answers.

Finally, do not get fooled by the nice “Best in Class” label, and take some time to understand what is really behind it. The so-called “best” algorithm is generally the one that minimizes the lag-1 error, which means that you will be picking the algorithm which, based on N data points, has best forecasted the N+1 data point. The thing is that it is not so hard to forecast the next data point accurately. Many algorithms, including “naïve” forecasts (i.e. a simple forecast equal to the last signal value), can be quite accurate for this. The risk of an “unnatural” methodology being selected (such as a non-Seasonal algorithm when the signal is clearly Seasonal) is also high. Such a winner of the “Best in Class” competition might have been the best at minimizing your past lag-1 error, but is this the algorithm you want to use to forecast 12, 50 or 100 data points ahead and convince your management to use? Probably not.
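The gap between lag-1 accuracy and longer-horizon accuracy can be illustrated with a small Python sketch. The signal below is a hypothetical, perfectly regular monthly cycle that we made up for the illustration; on real data the gap will vary.

```python
import math
from statistics import mean

# Hypothetical monthly signal: level 100 with a smooth yearly cycle of amplitude 30
signal = [100 + 30 * math.sin(2 * math.pi * t / 12) for t in range(48)]

def naive_error_at_lag(signal, lag):
    """Mean absolute error of the naive forecast (last known value) made `lag` periods ahead."""
    return mean(abs(signal[t] - signal[t - lag]) for t in range(lag, len(signal)))

lag1_error = naive_error_at_lag(signal, 1)   # forecasting the next data point
lag6_error = naive_error_at_lag(signal, 6)   # forecasting six data points ahead
```

Here the naive forecast looks reasonable one step ahead, yet its error is several times larger six steps ahead because it ignores the seasonal cycle entirely.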

We strongly believe that it is best to give up a few percentage points of Forecast Accuracy to gain higher Forecast consistency and stability, as well as clarity about which forecast methodology has been used. For this reason we recommend a more “natural” approach to selecting the forecasting algorithm: first try to analyse the signal. Does it exhibit a clear trend? Pick a method with trend. Does it exhibit clear seasonality? Pick a method with seasonality. Do you want the trend to be linear or damped? Pick the right type of trend. Try not to leave these obvious choices to a computer’s “Black Box”, which could generate variability or “strange” forecasts that will later be difficult to explain.
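The “natural” choice can be written down as a simple lookup. The Python sketch below is only an illustration: the mapping to the exponential smoothing family and the function name `natural_method` are our assumptions, and the yes/no answers are meant to come from your own analysis of the signal, not from the computer.

```python
def natural_method(has_trend, has_seasonality, damped=False):
    """Map what you see in the signal to a family of exponential smoothing methods."""
    if has_trend and has_seasonality:
        name = "Holt-Winters (trend and seasonality)"
    elif has_trend:
        name = "Holt (trend, no seasonality)"
    elif has_seasonality:
        name = "Seasonal exponential smoothing (no trend)"
    else:
        name = "Simple exponential smoothing"
    if has_trend and damped:
        name += ", damped trend"
    return name

# Example: a signal with a clear trend that you expect to flatten, and clear seasonality
choice = natural_method(has_trend=True, has_seasonality=True, damped=True)
```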

That does not mean that automatic optimization is necessarily bad. It is actually quite useful for figuring out the best value of the parameters (for example if you are using an Exponential Smoothing algorithm), or for picking the best algorithm among a few “natural” ones. However, we recommend that you understand carefully how the automatic optimization works and that you narrow the “competition” to “natural” algorithms and ranges of parameters that are likely to return a forecast that makes sense to you and to your team.
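As an illustration of a “guided” optimization, the Python sketch below searches the smoothing parameter of a Simple Exponential Smoothing only within a narrow range chosen by the forecaster. The signal, the candidate range and the error metric (mean absolute one-step error) are assumptions made for the example.

```python
def ses_one_step_errors(signal, alpha):
    """Run simple exponential smoothing and collect the one-step-ahead absolute errors."""
    level = signal[0]
    errors = []
    for actual in signal[1:]:
        errors.append(abs(actual - level))   # the forecast for this period is the current level
        level = alpha * actual + (1 - alpha) * level
    return errors

def best_alpha(signal, candidates):
    """Return the candidate alpha with the lowest mean absolute one-step error."""
    def mae(alpha):
        errs = ses_one_step_errors(signal, alpha)
        return sum(errs) / len(errs)
    return min(candidates, key=mae)

signal = [102, 98, 101, 99, 103, 97, 100, 104, 96, 101]   # hypothetical flat, noisy signal
narrow_range = [0.1, 0.2, 0.3, 0.4]                         # hypothetical "natural" range
alpha = best_alpha(signal, narrow_range)
```

Restricting the candidates means the optimizer can still tune the parameter, but it cannot return a value you would be unable to justify.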

Seasonality on the side please!

Seasonality is one of the most difficult components of your signal to forecast. Get it wrong and your forecast accuracy will be seriously damaged.

There are essentially two types of forecasting algorithms: some that deal with Seasonality automatically (Holt-Winters for example), and some that do not incorporate Seasonality (Moving Average, Exponential Smoothing, ARIMA…). You can still use the latter for seasonal signals; it just means that you will have to deal with Seasonality on the side, by “de-seasonalizing” your signal first (i.e. removing the seasonality from it) and “re-seasonalizing” your forecast afterwards.

For a Signal X(t), calculate the Seasonal coefficient S(t).

Consider the de-seasonalized signal Y(t) = X(t) / S(t), and forecast Y(t+T) using the forecasting methodology of your choice (Exponential Smoothing, Moving Average, ARIMA, you name it).

Then re-seasonalize using the Seasonality coefficient of the forecasted period, S(t+T):

X(t+T) = S(t+T) * Y(t+T)
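The three steps above can be sketched in Python as follows. This is a minimal illustration under our own assumptions: the signal has no trend and covers complete cycles, the coefficients are scaled to average 1, and a moving average stands in for “the methodology of your choice”.

```python
from statistics import mean

def seasonal_coefficients(signal, period):
    """Average of each position in the cycle divided by the overall average (coefficients average 1)."""
    overall = mean(signal)
    return [mean(signal[i::period]) / overall for i in range(period)]

def forecast_with_side_seasonality(signal, period, horizon, window=4):
    """De-seasonalize, forecast the de-seasonalized signal, then re-seasonalize."""
    s = seasonal_coefficients(signal, period)
    # Y(t) = X(t) / S(t)
    y = [x / s[t % period] for t, x in enumerate(signal)]
    # Forecast Y with any method; here a moving average of the last `window` values
    y_forecast = mean(y[-window:])
    # X(t+T) = S(t+T) * Y(t+T)
    n = len(signal)
    return [s[(n + h) % period] * y_forecast for h in range(horizon)]

signal = [80, 100, 120, 100, 80, 100, 120, 100]   # hypothetical, no trend, period of 4
forecast = forecast_with_side_seasonality(signal, period=4, horizon=4)
```

On this flat example the forecast simply reproduces the seasonal pattern; with a trending signal you would first remove the trend before computing the coefficients.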

We recommend this latter approach, i.e. dealing with Seasonality on the side. Similarly to “guiding” the computer by pre-selecting some algorithms and ranges of parameters, “guiding” the algorithm on the seasonality side is one of the best ways to avoid large pitfalls and ensure good forecast accuracy. When selecting the seasonality coefficients to be used on a particular Item you have essentially two options:

  • Item Level seasonality: In that case you will be calculating the seasonality based on the historical signal of this particular item. This approach is to be used for items with a specific and distinctive seasonality.
  • Aggregated seasonality: In that case you will be using the seasonality of a bigger aggregate to which the item belongs. For example, if I am a retailer trying to forecast the sales of a new red sweater with little historical data, I could calculate the seasonality of all the other red sweaters and assign this seasonality to the new red sweater. This approach is to be used for items with no distinctive seasonality pattern, or with not enough history to define a specific pattern.
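The red sweater example can be sketched in Python as below. The item names, the quarterly sales and the annual volume of the new item are all hypothetical figures invented for the illustration.

```python
def seasonality_weights(history):
    """Share of the total carried by each period (weights sum to 1)."""
    total = sum(history)
    return [value / total for value in history]

# Hypothetical quarterly sales of existing red sweaters (one full year each)
existing_red_sweaters = {
    "red_sweater_A": [400, 100, 100, 400],
    "red_sweater_B": [300, 50, 50, 600],
}

# Aggregate the items period by period, then compute the seasonality of the aggregate
aggregate = [sum(values) for values in zip(*existing_red_sweaters.values())]
aggregated_weights = seasonality_weights(aggregate)

# Assign the aggregated seasonality to the new item, given a hypothetical annual volume
new_item_annual_forecast = 1000
new_item_by_quarter = [new_item_annual_forecast * w for w in aggregated_weights]
```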

Again, there is nothing wrong with automatic seasonality algorithms such as Holt-Winters. Feel free to use them if they have demonstrated a good and consistent forecast accuracy. However, if you are suffering from low forecast accuracy, a good first step towards improving it would be to perform a seasonality calculation on the side.

Is aggregating data good?

There is a general belief that “Forecasts at Aggregated Level are more accurate than Forecasts performed at a Lower level”. For this reason some forecasting software will offer you algorithms that forecast at an aggregated level, aggregating items with similar properties (same line, same color… you define it).

That might indeed result in a quite accurate forecast on this aggregate, but then what? How do you get back to your Forecast at item level? The answer is generally simple: you will have to split it. For example, if you forecast the sales of Shoes of different sizes, it could make sense to aggregate the sales of all the sizes together to get a smoother signal, to forecast this aggregate, and then to break your forecast down to size level using some kind of split rule. But even with a good forecast on the aggregate, if you get the split rule wrong, your forecast accuracy at item level will suffer.

From our experience, if you select your forecasting algorithm well, forecasting at aggregate level and splitting, or forecasting directly at Item level, makes no difference to your Item level forecast accuracy. Actually, too much aggregation can make you miss some important patterns that you would not have missed by forecasting at item level.
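One common kind of split rule is a proportional split based on each item's historical share. The Python sketch below shows it on hypothetical shoe sizes; the rule itself, the sales figures and the aggregate forecast of 330 are assumptions made for the example, and other split rules are possible.

```python
def split_forecast(aggregate_forecast, history_by_item):
    """Split an aggregate forecast in proportion to each item's share of historical sales."""
    total = sum(sum(h) for h in history_by_item.values())
    return {
        item: aggregate_forecast * sum(h) / total
        for item, h in history_by_item.items()
    }

# Hypothetical sales history per size over the last three periods
history_by_size = {
    "size_40": [30, 34, 36],
    "size_42": [50, 48, 52],
    "size_44": [20, 18, 12],
}

# Hypothetical forecast for all sizes together, broken down to size level
by_size = split_forecast(330, history_by_size)
```

Note that this rule freezes the historical mix: if size 44 is declining, as in the figures above, a share computed over the whole history will overstate it.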

Let’s imagine that you forecast Shoes and aggregate the sales of a model which comes in Black and in White. The two colors have very different seasonality patterns: winter for Black, summer for White. Now let’s assume the Black color sells much more than the White (10 times more on average). The risk here is that your aggregate signal will be very close to the Black signal. You will forecast it, and then when you split you are likely to apply a Black / Winter seasonality pattern to your White shoes, something you would likely not have done if you had forecast the colors separately.

That does not make the aggregating / splitting approach irrelevant. This approach can be quite useful if you have to forecast many items and do not have the resources to look at all of them individually. But be careful to apply this method to aggregates that “make sense” (for example the sales of all the sizes of the Black shoe sounds fine), and avoid mixing signals with different seasonality or where the split rule to go back to the Item level is not obvious.
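A few lines of Python make the Black and White example tangible. The quarterly figures are hypothetical and chosen so that Black sells 10 times more than White over the year.

```python
def seasonality_weights(history):
    """Share of the yearly total carried by each quarter (weights sum to 1)."""
    total = sum(history)
    return [v / total for v in history]

# Hypothetical sales by quarter (Q1 and Q4 are the winter quarters)
black = [4000, 1000, 1000, 4000]   # sells 10 times more over the year, peaks in winter
white = [100, 400, 400, 100]       # peaks in summer

aggregate = [b + w for b, w in zip(black, white)]
aggregate_weights = seasonality_weights(aggregate)   # dominated by the Black pattern
white_weights = seasonality_weights(white)           # the pattern White really follows
```

The aggregate still peaks in the winter quarters, so splitting an aggregate forecast with these weights would give the White shoe a winter peak it never had.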

Keep it simple

The French have a saying: “Le Mieux est l’ennemi du Bien”, which literally translates into The Better is the enemy of the Good, and could be interpreted as Past a certain point, adding complexity will do more Damage than Good.

The complexity level you need to introduce in your algorithm should be just enough to capture the main specificities of the signal you are trying to forecast. However, you do not want to introduce so much complexity that you lose the big picture or a clear understanding of how your algorithm works.

My teacher once said “What is easy to conceive should be easy to explain”. Never forget that a Forecaster’s job is less to build the forecast itself than to be able to explain how it was built, in simple terms, to non-specialists, in order to convince them that it is the correct path towards decision making. Therefore “technical models” such as ARIMA, which are hard for non-specialists to comprehend, are not recommended unless you can demonstrate their added value.

Bottom Line: Analyse your signal well and introduce the complexity you need to take its main specificities into consideration, but keep your forecasting methodology as simple and as easy to explain to non-specialists as you possibly can.

AnalystMaster

 

Time Series Forecasting: Don’t forget Seasonality!

Why is Seasonality important

Seasonality is in everything we do. We even take it into account unconsciously, for example when we leave for work early to beat morning traffic, or when we book our Summer holidays early to avoid peak prices.

If you are running a Winter attire business, you might only sell a few pieces during summer but your sales might boom when the weather gets colder, therefore requiring additional resources such as Inventory, Staff Availability… etc.

Seasonality is therefore a very important component of Planning, and especially of Forecasting.

How to calculate Seasonality

Seasonality is usually calculated using the Time Series Decomposition Method.

This method assumes the Signal can be broken down into 3 components:

  • The Trend: is your signal flat? increasing or decreasing?
  • The Seasonality: does your signal show peaks and drops at specific Time periods (for example a peak of Sales for Christmas in December)?
  • The Noise: this is the part of the signal that cannot be explained. If the Signal is well decomposed, the Noise component should be a process centred on 0 in the Additive case, and on 1 in the Multiplicative case
  • Sometimes a Cycle component is also added. We will assume there is no Cycle going forward

Multiplicative Time Series Decomposition

Signal(t) = Trend(t) * Seasonality(t) * Noise (t)

Additive Time Series Decomposition

Signal(t) = Trend(t) + Seasonality(t) + Noise (t)

At AnalystMaster we generally prefer to consider Time Series as Multiplicative (and that is what we will use going forward). In that case, the Seasonality of each time period can be seen as a weight, and the Sum of the Seasonality weights over all the time periods of a cycle is equal to 1 or 100%.
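Seasonality weights that sum to 1 and seasonal indices that average 1 (the scaling returned by R's decompose function for a multiplicative model) describe the same pattern on two different scales. The Python sketch below converts between the two; the quarterly weights and the annual level of 2000 are hypothetical.

```python
def weights_to_indices(weights):
    """Rescale seasonality weights (sum = 1) into seasonal indices (average = 1)."""
    return [w * len(weights) for w in weights]

def indices_to_weights(indices):
    """Rescale seasonal indices (average = 1) into weights (sum = 1)."""
    total = sum(indices)
    return [i / total for i in indices]

# Hypothetical quarterly weights: 35% of the year in Q1, 15% in Q2, 10% in Q3, 40% in Q4
weights = [0.35, 0.15, 0.10, 0.40]
indices = weights_to_indices(weights)

# Multiplicative model without noise: Signal(t) = Trend(t) * Seasonality(t)
annual_level = 2000                                # trend expressed per year, used with weights
signal_from_weights = [annual_level * w for w in weights]
quarterly_level = annual_level / len(weights)      # trend expressed per quarter, used with indices
signal_from_indices = [quarterly_level * i for i in indices]
```

Both versions rebuild the same signal, so the choice of scale only matters for consistency: weights go with a trend expressed per cycle, indices with a trend expressed per period.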

Calculate Seasonality in Excel

You can also use the attached Excel Model to calculate your Monthly Seasonality in Excel.

This model works with 24 months of Historical data.

  1. It first evaluates the Trend using a Centred Moving Average (only possible from time bucket 6 to 18), and extrapolates this trend linearly to time buckets 1 to 24.
  2. Then the trend is removed (we divide the original signal by it, as we consider the Time Series as Multiplicative), in order to leave Seasonality and Noise as the only components of the Time Series.
  3. Finally, the Seasonality coefficients of the same calendar month in the two years are averaged.
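For readers who prefer code to spreadsheets, the three steps above can be sketched in Python. This is our own approximation, not a transcription of the Excel model: we assume a 2 x 12 centred moving average (which is defined for time buckets 7 to 18 when counting from 1), a least-squares line for the extrapolation, and coefficients scaled to average 1.

```python
def centred_moving_average(signal, period=12):
    """2 x `period` centred moving average; defined only away from both ends."""
    half = period // 2
    cma = {}
    for t in range(half, len(signal) - half):
        window = signal[t - half:t + half + 1]
        # First and last points get half weight so the window is centred on t
        cma[t] = (sum(window) - 0.5 * (window[0] + window[-1])) / period
    return cma

def linear_fit(points):
    """Least-squares line through {t: value}; returns (slope, intercept)."""
    n = len(points)
    mean_t = sum(points) / n
    mean_v = sum(points.values()) / n
    slope = (sum((t - mean_t) * (v - mean_v) for t, v in points.items())
             / sum((t - mean_t) ** 2 for t in points))
    return slope, mean_v - slope * mean_t

def monthly_seasonality(signal, period=12):
    """Steps 1 to 3: trend, de-trended signal, average per calendar month."""
    slope, intercept = linear_fit(centred_moving_average(signal, period))   # step 1
    trend = [intercept + slope * t for t in range(len(signal))]
    detrended = [x / tr for x, tr in zip(signal, trend)]                    # step 2
    n_cycles = len(signal) // period
    return [sum(detrended[m::period]) / n_cycles for m in range(period)]   # step 3

# Hypothetical 24 months: flat level of 100 multiplied by a known monthly pattern
pattern = [0.8, 0.9, 1.0, 1.1, 1.2, 1.0, 0.9, 0.8, 1.0, 1.1, 1.2, 1.0]
history = [100 * p for p in pattern] * 2
coefficients = monthly_seasonality(history)
```

On this noise-free, trend-free history the sketch recovers the monthly pattern that was used to build it, which is a quick sanity check before applying it to real data.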

Calculate Seasonality with R

R is a great tool to calculate the Seasonality of a Time Series. You can use the following piece of code to read a monthly time series from a data.csv file and return the Seasonality coefficients.

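R will return the Seasonal, Trend and Random components of the Multiplicative Time Series decomposition.

# Base R is enough here: read.csv(), ts() and decompose() need no extra packages
mydata <- read.csv("data.csv")
# Assumption: the monthly values are in the first column of data.csv
signal <- ts(mydata[[1]], start = c(2014, 1), end = c(2017, 6), frequency = 12)
seascoef <- decompose(signal, type = "multiplicative")
# The 12 monthly Seasonality coefficients (scaled by R to average 1)
seascoef$figure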

[Figure: Seasonality output from R]

The next levels

Including Seasonality in your planning can significantly improve your planning accuracy. However, you might find out that only considering the Seasonality at Monthly level is not good enough and that you also need to include seasonality at a more detailed level to maximize your planning accuracy.

Seasonality also exists at a more detailed level: weekly, daily, hourly… For example:

  • Weekly seasonality: if you are a Retailer, although December is a peak month, not all the December weeks are equal. The seasonality is much stronger in the last week before Christmas, and failing to anticipate this can result in capacity shortages
  • Daily seasonality: if you own a shop or a restaurant, Seasonality is usually stronger on some days of the week, Saturday for example
  • Hourly seasonality: peak times also vary hour by hour. If you are running a call centre you need to plan your capacity accordingly. Or if you are going shopping at Harrods, you might want to go when the store is less crowded, according to the chart below, which is available on Google

 

[Figure: Harrods popular times chart, from Google]
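The same logic used for months applies to days of the week. Here is a minimal Python sketch of a day-of-week Seasonality calculation; the two weeks of daily sales are hypothetical, and the coefficients are scaled so that an average day equals 1.

```python
from collections import defaultdict
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

def day_of_week_seasonality(start, daily_values):
    """Average value per weekday divided by the overall daily average."""
    by_weekday = defaultdict(list)
    for offset, value in enumerate(daily_values):
        day = start + timedelta(days=offset)
        by_weekday[WEEKDAYS[day.weekday()]].append(value)
    overall = sum(daily_values) / len(daily_values)
    return {name: (sum(v) / len(v)) / overall for name, v in by_weekday.items()}

# Hypothetical shop: two weeks of daily sales starting on Monday 2 January 2017
sales = [50, 50, 50, 50, 100, 150, 110] * 2
coefficients = day_of_week_seasonality(date(2017, 1, 2), sales)
busiest_day = max(coefficients, key=coefficients.get)
```

The same function works for hourly data if you replace the weekday grouping by an hour-of-day grouping.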

Sometimes Seasonality can even be more complex as it does not necessarily follow a regular Month-Week-Day-Hour pattern.

A well-known example among Retailers is Chinese New Year, which follows the Chinese lunar calendar and will therefore fall on a different week, and sometimes a different month, every year. Easter and Ramadan are also moving holidays, and their seasonality can be hard to evaluate.

[Figure: Chinese New Year]
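To see how much a moving holiday shifts, the Python sketch below looks up the ISO week and the month of Chinese New Year for a few years. The dates are entered by hand, since they cannot be derived from the Gregorian calendar alone.

```python
from datetime import date

# Chinese New Year dates, listed by hand (the Gregorian date changes every year)
chinese_new_year = {
    2014: date(2014, 1, 31),
    2015: date(2015, 2, 19),
    2016: date(2016, 2, 8),
    2017: date(2017, 1, 28),
}

# ISO week number and month in which the holiday falls, year by year
holiday_week = {year: d.isocalendar()[1] for year, d in chinese_new_year.items()}
holiday_month = {year: d.month for year, d in chinese_new_year.items()}
```

Over these four years the holiday moves between week 4 and week 8, so a weekly Seasonality profile averaged by week number would smear the peak across several weeks. One possible workaround, which we only mention as an option, is to index the weeks relative to the holiday rather than to the calendar.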

How far should I go?

Seasonality is important and should be included in your forecasting activities. However, you need to keep it at a level which is both relevant for your activity and simple enough to implement. For example, if you plan Production at a Monthly level, keep your signal at a Monthly level and evaluate Seasonality at this level too. If you are planning a Warehouse Capacity at Weekly level, then get your Signal and Seasonality at Weekly level, and if you are planning how to staff a Call Centre or a Shop on an Hourly basis, then plan and measure Seasonality hour by hour. But do not introduce unnecessary complexity by getting a signal at hourly level if you only need a monthly plan.
