France’s World Cup win: the failure of statistics?

Let’s be honest, we did not see it coming. When asked who would lift the famous World Cup trophy in Moscow on July 15th 2018, Germany, Brazil and Spain were the easy bets, with Belgium and France the top outsiders. But none of us saw Croatia (who struggled in qualification) reaching the Final, and very few would have bet on Germany or Spain going home early, or on Russia making it to the quarter-finals. It was difficult as well to imagine France lifting the trophy after their laborious first round. And yet, here we are… so the question is: how did we not see it coming? If we were looking, many signs were actually pointing in that direction.

There is more than one way to play (and win)

First, the so-called “curse of the defending champion”, which sent Germany home after the first round, is actually a trend: Spain in 2014, Italy in 2010, France in 2002… you name it! Not so surprising after all: coaches who have won a World Cup tend to feel they owe their winning players a second tournament. Those players can arrive overconfident, while their opponents draw extra motivation from facing the defending champion.

Second, like all good things, football (the most popular sport ever invented, and it is easy to see why) constantly reinvents itself, and we are witnessing a new era. Whereas the late 2000s were dominated by the famous ball-possession style of FC Barcelona, Spain (2008-2012) and, to some extent, Germany (2014), the trend is now towards a more defensive style with fast, clinical counter-attacks. Where Barcelona and Spain loved to drive their opponents crazy by moving the ball from side to side while enjoying 70% possession, the new top teams sit back, often leaving possession to the other team, and project themselves vertically at high speed towards the goal. Zidane’s Real Madrid won the last three Champions Leagues that way (2016-2018), Portugal won Euro 2016 the same way with only 40% possession, and so did Belgium to send Neymar and his teammates home early. If France won the World Cup, it is first and foremost because they mastered this combination of solid defence and fast, lethal counter-attacks.

As in chess, there are many ways to win in football, and France just showed us that there is an alternative to Spain’s possession style. Just as Spain needed Iniesta, Xavi or Fabregas to make the possession strategy work, France needed to be confident in its defence and in its ability to score from very few opportunities. In fairness, with Mbappe’s speed, Pogba’s and Griezmann’s technique, and players like Kante, Varane and Lloris at the back, they had the team for it.

The importance of Management

What statisticians and bookmakers probably underrated is the importance of Management in a tournament where, preparation included, squads live together for about two months. Past performance matters less than a team’s ability to live together and stay united over that period. France learned this lesson the hard way in 2010, when its players went on strike in South Africa after egos got in the way of the team’s common objective. Likewise, Neymar’s obsession with single-handedly saving Brazil against Belgium made him misjudge situations, playing too selfishly when passing the ball was the better option, and simulating fouls in a way that damaged his reputation and the team’s. On the other side, Didier Deschamps, the French coach, built a young team (25 years old on average, the 2nd youngest ever to win a World Cup), without big egos (Benzema and Rabiot were left at home), clearly designed to live well together throughout the tournament. He protected the players criticized by the media (Pogba, Griezmann, Lloris…) and they repaid him with stunning performances. Finally, Deschamps adapted to the situation, quickly abandoning his offensive 4-3-3 with Griezmann-Mbappe-Dembele, designed for ball possession and so effective in the warm-up matches, and returning to his more defensive 4-2-3-1 with Giroud and Matuidi to solidify the team after the laborious win against Australia in the first round.

Deschamps, who won the World Cup as France’s captain in 1998, is clearly a master strategist and a great leader, and all credit to him for spotting, much better than we did, the key success factors of this World Cup, and for building a strong team and strategy around them. In the end, maybe France’s win was predictable after all.

AnalystMaster

Champions League Semi-Final Predictions (2nd leg)

Hi football fans,

This is it: tonight’s and tomorrow’s games will decide which teams go to Kiev for the Champions League Final on May 26th.

Real Madrid and Liverpool are both in a good position to go through, thanks to away goals (Real Madrid won 2-1 away at Bayern Munich) and a comfortable lead (Liverpool beat Roma 5-2 at home). But we know better than to think these ties are over, especially after Juventus’s and Roma’s three-goal “remontadas” in the Quarter-Finals after losing the first leg. Bayern and Roma have little to lose and are expected to attack.

Both games are expected to be close despite the first-leg results. We will bet on a win for Real Madrid and for AS Roma (on this last one Bing disagrees and favours a Liverpool win instead).

Enjoy the games!

AnalystMaster

Champions League Semi-Final (2nd leg)

Home        | Away          | AnalystMaster H / D / A | Bing H / D / A
Real Madrid | Bayern Munich | 44% / 23% / 33%         | 39% / 27% / 34%
Roma        | Liverpool     | 42% / 23% / 36%         | 28% / 30% / 42%

Champions League Semi-Final Predictions (1st leg)

Hi Football fans,

Here are our predictions for the First leg of the Champions League Semi-Finals.

Like Bing, we foresee Bayern Munich taking full advantage of home advantage to win that first game, with a 52% chance (51% according to Bing).

In the other game, between the two challengers who knocked out favourites Man City and Barcelona in the Quarter-Finals, we also give Liverpool the better odds (45%), but we are not as optimistic as Bing (58%) and give Roma a 55% chance of spoiling the home team’s evening with a draw or an away win.

Champions League Semi-Final (1st leg)

Home          | Away        | AnalystMaster H / D / A | Bing H / D / A
Liverpool     | Roma        | 45% / 25% / 31%         | 58% / 24% / 18%
Bayern Munich | Real Madrid | 52% / 21% / 27%         | 51% / 27% / 22%

Enjoy the games!

AnalystMaster

Time Series Forecasting: how to choose your algorithm?

Moving Average vs. Exponential Smoothing? What about ARIMA? Is it better to have one algorithm or many at your disposal so that you can switch if need be?

The answer to these questions will vary with the specific characteristics of the signals you are trying to forecast. There is no universal answer; however, we will try to give you some best practices for making this call.

One size does not fit all

That may sound obvious, but if you are trying to improve your Forecast Accuracy you need to understand each signal’s specific characteristics and forecast it accordingly. If you have a seasonal signal, you will need a seasonal algorithm; if you have a clear trend, you will need an algorithm that models trend; and so on.

Therefore good forecasters do not have just one method at their disposal but many, and they choose the best one for each signal based on its characteristics. The next question is: how do you select this algorithm?

“Natural choice” vs. “Black Box”

Most forecasting software performs an automatic selection of the forecasting algorithm using a “Best in Class” approach. Understanding well that one size does not fit all, it uses computing power to produce forecasts for each signal with many different algorithms, and selects the one that works best on each signal. The “best” algorithm is usually selected through “backtesting”, i.e. forecasting the last few data points from the previous ones and selecting the algorithm that minimizes the forecast error on those last few points.
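To make the mechanics concrete, here is a minimal sketch of such a selection; the candidate methods, data and backtest window are our own illustrative choices, not any particular software’s:

```python
# Sketch: "best in class" selection by backtesting the last few data points.
import numpy as np

def moving_average(history, window=3):
    return np.mean(history[-window:])

def naive(history):
    return history[-1]

def backtest_mae(signal, method, holdout=4):
    """Mean absolute error of one-step-ahead forecasts over the last `holdout` points."""
    errors = [abs(signal[i] - method(signal[:i]))
              for i in range(len(signal) - holdout, len(signal))]
    return np.mean(errors)

signal = np.array([10, 12, 11, 13, 12, 14, 13, 15, 14, 16], dtype=float)
candidates = {"moving average": moving_average, "naive": naive}
best = min(candidates, key=lambda name: backtest_mae(signal, candidates[name]))
print("selected:", best)   # the lag-1 winner, not necessarily the best long-range choice
```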

That might sound like a great idea; however, it can be a very risky strategy, because different forecasting algorithms are likely to give very different forecasts. Switching algorithms back and forth is likely to generate high variability in your forecast from release to release, hurting your credibility as a forecaster through this lack of consistency, if not generating serious business problems down the line. If you are producing according to your forecast, for example, you may be familiar with the bullwhip effect, and know that large swings in your forecast will likely generate capacity and inventory issues.

Another drawback of this approach is that it tends to treat forecasting as a “Black Box”, which makes it difficult to explain which forecasting algorithm was used. A forecaster’s job is less about producing the forecast itself than about convincing peers and Management that it is the one they should base their decisions upon. A forecast that has not been validated by your Management or peers is of little value, and when challenged about how you built it, the last thing you want is for “the computer said…” or “I have no idea how it was calculated, the machine did it for me” to be your only answers.

Finally, do not be fooled by the nice “Best in Class” label; take some time to understand what is really behind it. The so-called “best” algorithm is generally the one that minimizes the lag-1 error, meaning you pick the algorithm which, based on N data points, best forecast the (N+1)th data point. The thing is, it is not that hard to forecast the next data point accurately: many algorithms, including “naïve” forecasts (i.e. a forecast simply equal to the last observed value), can be quite accurate for this. The risk of an “unnatural” methodology being selected (such as a non-seasonal algorithm when the signal is clearly seasonal) is also high. Such a winner of the “best in class” competition may have minimized your past lag-1 error, but is it the algorithm you want to use to forecast 12, 50 or 100 data points ahead, and to convince your Management with? Probably not.
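A hedged illustration on synthetic data of why a low lag-1 error says little about long horizons: a naive forecast of a steadily trending signal is only 2 units off one step ahead, but 100 units off 50 steps ahead, while a simple drift method stays on track:

```python
# Naive vs. drift on a trending signal: a small lag-1 error hides horizon error.
import numpy as np

t = np.arange(60)
signal = 100 + 2.0 * t                                 # trend: +2 units per period

last = signal[-1]
drift = (signal[-1] - signal[0]) / (len(signal) - 1)   # estimated slope (= 2.0 here)

for horizon in (1, 12, 50):
    actual = 100 + 2.0 * (t[-1] + horizon)             # the trend simply continues
    naive_error = abs(actual - last)                   # naive: repeat the last value
    drift_error = abs(actual - (last + drift * horizon))
    print(f"h={horizon:>2}  naive error={naive_error:6.1f}  drift error={drift_error:4.1f}")
```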

We strongly believe it is worth giving up a few percentage points of Forecast Accuracy to gain higher forecast consistency and stability, as well as clarity about which forecast methodology was used. For this reason we recommend a more “natural” approach to selecting the forecasting algorithm: first analyse the signal. Does it exhibit a clear trend? Pick a method with trend. Does it exhibit clear seasonality? Pick a method with seasonality. Do you want the trend to be linear or damped? Pick the right type of trend. Try not to leave these obvious choices to a computer’s “Black Box”, which could generate variability or “strange” forecasts that will later be difficult to explain.

That does not mean that automatic optimization is necessarily bad. It is actually quite useful for finding the best values of the parameters (for example, if you are using an Exponential Smoothing algorithm), or for picking the best algorithm among a few “natural” ones. However, we recommend that you understand exactly how the automatic optimization works, and that you narrow the “competition” to “natural” algorithms and ranges of parameters likely to return a forecast that makes sense to you and your team.
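For example, here is a sketch of such a “narrowed competition”: we keep a single natural method (simple exponential smoothing) and only optimize its smoothing parameter over a range we consider sensible; the data and the bounds are illustrative assumptions:

```python
# Sketch: constrained parameter optimization for simple exponential smoothing.
import numpy as np

def exp_smooth_forecasts(signal, alpha):
    """One-step-ahead forecasts: forecasts[i] is the forecast of signal[i]."""
    forecasts = [signal[0]]
    for x in signal[:-1]:
        forecasts.append(alpha * x + (1 - alpha) * forecasts[-1])
    return np.array(forecasts)

signal = np.array([112, 118, 132, 129, 121, 135, 148, 148, 136, 119], dtype=float)

# Keep alpha well below 1, where the method would degenerate into a naive forecast
grid = np.arange(0.05, 0.55, 0.05)
mae = {round(float(a), 2): np.mean(np.abs(signal - exp_smooth_forecasts(signal, a)))
       for a in grid}
best_alpha = min(mae, key=mae.get)
print("best alpha in [0.05, 0.50]:", best_alpha)
```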

Seasonality on the side please!

Seasonality is one of the most difficult components of a signal to forecast. Get it wrong and your forecast accuracy will be seriously damaged.

There are essentially two types of forecasting algorithms: those that deal with seasonality automatically (Holt-Winters, for example) and those that do not incorporate seasonality (Moving Average, Exponential Smoothing, ARIMA…). You can still use the latter for seasonal signals; it just means you will have to deal with seasonality on the side, by “de-seasonalizing” your signal first (i.e. removing the seasonality from it) and “re-seasonalizing” it after you have forecast:

  • For a signal X(t), calculate the seasonal coefficients S(t)
  • Consider the de-seasonalized signal Y(t) = X(t) / S(t), and forecast Y(t+T) using the forecasting methodology of your choice (Exponential Smoothing, Moving Average, ARIMA… you name it)
  • Re-seasonalize using the corresponding seasonal coefficient: X(t+T) = S(t+T) * Y(t+T)
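A minimal sketch of these three steps, assuming multiplicative monthly seasonality; the inner forecast is a plain moving average used as a placeholder, so substitute whichever non-seasonal method you prefer:

```python
# Sketch: de-seasonalize, forecast, re-seasonalize (multiplicative monthly model).
import numpy as np

np.random.seed(0)
period = 12
x = 100 + 20 * np.sin(2 * np.pi * np.arange(48) / period) + np.random.normal(0, 2, 48)

# 1) Seasonal coefficients S: average of each month over the overall average
monthly_mean = np.array([x[m::period].mean() for m in range(period)])
S = monthly_mean / x.mean()

# 2) De-seasonalize, then forecast Y(t+T) with any non-seasonal method
y = x / S[np.arange(len(x)) % period]
y_forecast = y[-6:].mean()          # placeholder: 6-point moving average

# 3) Re-seasonalize: X(t+T) = S(t+T) * Y(t+T)
T = 3                               # three steps ahead
x_forecast = S[(len(x) + T - 1) % period] * y_forecast
print(round(x_forecast, 1))
```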

We recommend this latter approach. Just as you can “guide” the computer by pre-selecting algorithms and ranges of parameters, “guiding” the algorithm on the seasonality side is one of the best ways to avoid large pitfalls and ensure good forecast accuracy. When selecting the seasonality coefficients to use for a particular item, you essentially have two options:

  • Item-level seasonality: calculate the seasonality from the historical signal of this particular item. Use this approach for items with a specific and distinctive seasonality.
  • Aggregated seasonality: use the seasonality of a bigger aggregate to which the item belongs. For example, a retailer forecasting the sales of a new red sweater with little historical data could compute the seasonality of all the other red sweaters and assign it to the new one (see the sketch below). Use this approach for items with no distinctive seasonality pattern, or without enough history to derive a specific pattern.
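As promised, a sketch of the aggregated-seasonality option; the item names and sales figures are made up for illustration:

```python
# Sketch: borrow the monthly seasonality of a group ("all red sweaters")
# for a new item with little history of its own. Data is illustrative.
import pandas as pd

sales = pd.DataFrame({
    "item":  ["sweater_A"] * 12 + ["sweater_B"] * 12,
    "month": list(range(1, 13)) * 2,
    "qty":   [30, 28, 22, 15, 10, 6, 5, 8, 18, 27, 35, 40,
              60, 55, 44, 30, 20, 12, 10, 16, 36, 54, 70, 80],
})

# Multiplicative monthly coefficients of the aggregate (~1.0 on an average month)
monthly = sales.groupby("month")["qty"].sum()
group_seasonality = monthly / monthly.mean()

# Apply the group's pattern to the new item (assumed annual volume: 120 pcs)
new_item_forecast = 120 / 12 * group_seasonality
print(new_item_forecast.round(1))
```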

Again, there is nothing wrong with automatic seasonal algorithms such as Holt-Winters. Feel free to use them if they have demonstrated good and consistent forecast accuracy. However, if you are suffering from low forecast accuracy, a good first step towards improving it is to perform the seasonality calculation on the side.

Is aggregating data good?

There is a general belief that “forecasts at an aggregated level are more accurate than forecasts performed at a lower level”. For this reason, some forecasting software will offer algorithms that forecast at an aggregated level, grouping items with similar properties (same line, same colour… you define it).

That may indeed result in quite an accurate forecast on the aggregate, but then what? How do you get back to your forecast at item level? The answer is generally simple: you will have to split it. For example, if you forecast the sales of shoes that come in different sizes, it could make sense to aggregate the sales of all the sizes to get a smoother signal, forecast that, and then break the forecast down to size level using some kind of split rule. Despite a good forecast on the aggregate, mess up the split rule and your forecast accuracy at item level will be damaged.

From our experience, if you select your forecasting algorithm well, forecasting at aggregate level and splitting makes little difference to item-level forecast accuracy compared with forecasting directly at item level. Worse, too much aggregation can make you miss important patterns that you would have caught by forecasting at item level.

Imagine you forecast shoes and aggregate the sales of a model that comes in black and in white. The two colours have very different seasonality patterns: winter for black, summer for white. Now assume the black model sells much more than the white one (10 times more on average). The risk is that your aggregate signal will be very close to the black signal: you will forecast it, and when you split, you are likely to apply a black/winter seasonality pattern to your white shoes, something you would probably not have done had you forecast the colours separately.
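Here is the same example in numbers (all made up): the aggregate inherits the dominant black/winter pattern, so a fixed-proportion split wrongly pushes a winter peak onto the white shoe:

```python
# Numeric version of the black/white shoe example.
import numpy as np

months = np.arange(12)
winter = 1 + 0.8 * np.cos(2 * np.pi * months / 12)   # peaks in month 0 (January)
summer = 1 - 0.8 * np.cos(2 * np.pi * months / 12)   # peaks in month 6 (July)

black = 1000 * winter      # black shoe: ~10x the volume, winter-seasonal
white = 100 * summer       # white shoe: summer-seasonal
total = black + white

print((total / total.mean()).round(2))   # aggregate pattern ~ black pattern
white_split = total * (white.sum() / total.sum())   # fixed-proportion split
print(int(white_split.argmax()))         # peaks in month 0, not month 6!
```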

That does not make the aggregate-and-split approach irrelevant. It can be quite useful if you have to forecast many items and do not have the resources to look at all of them individually. But be careful to apply this method only to aggregates that “make sense” (for example, the sales of all the sizes of the black shoe sounds fine), and avoid mixing signals with different seasonalities, or cases where the split rule back to item level is not obvious.

Keep it simple

The French have a saying, “Le mieux est l’ennemi du bien”, which literally translates to “The better is the enemy of the good”, and can be interpreted as: past a certain point, adding complexity does more damage than good.

The complexity you introduce in your algorithm should be just enough to capture the main characteristics of the signal you are trying to forecast. You do not want to introduce so much complexity that you lose the big picture, or a clear understanding of how your algorithm works.

My teacher once said: “What is easy to conceive should be easy to explain.” Never forget that a forecaster’s job is less to build the forecast itself than to explain how it was built, in simple terms, to non-specialists, in order to convince them that it is the right basis for decision making. Therefore “technical” models that are hard for non-specialists to comprehend, such as ARIMA, are not recommended unless you can demonstrate their added value.

Bottom Line: analyse your signal well and introduce just enough complexity to capture its main characteristics, but keep your forecasting methodology as simple, and as easy to explain to non-specialists, as you possibly can.

AnalystMaster

 

Weekly Football Predictions

Man United vs. City, Real vs. Atletico Madrid… who will win this weekend’s big games? Find our latest predictions below!

AnalystMaster

Premier League Week 33

Home        | Away           | AnalystMaster H / D / A | Bing H / D / A
Everton     | Liverpool      | 38% / 24% / 39%         | 8% / 16% / 76%
Watford     | Burnley        | 37% / 26% / 36%         | 21% / 46% / 33%
Leicester   | Newcastle      | 48% / 25% / 27%         | 48% / 24% / 28%
Brighton    | Huddersfield   | 51% / 25% / 24%         | 56% / 28% / 16%
Stoke       | Tottenham      | 24% / 24% / 53%         | 17% / 16% / 67%
Bournemouth | Crystal Palace | 45% / 25% / 30%         | 81% / 15% / 4%
West Brom   | Swansea        | 44% / 27% / 29%         | 21% / 36% / 43%
Man City    | Man United     | 56% / 22% / 22%         | 33% / 36% / 31%
Arsenal     | Southampton    | 61% / 21% / 18%         | 89% / 5% / 6%
Chelsea     | West Ham       | 60% / 21% / 19%         | 89% / 5% / 6%

La Liga Week 31

Home        | Away       | AnalystMaster H / D / A | Bing H / D / A
La Coruna   | Malaga     | 45% / 27% / 27%         | 45% / 23% / 32%
Alaves      | Getafe     | 37% / 30% / 33%         | 21% / 46% / 33%
Celta       | Sevilla    | 47% / 24% / 28%         | 58% / 28% / 14%
Betis       | Eibar      | 46% / 24% / 30%         | 89% / 5% / 6%
Barcelona   | Leganes    | 75% / 17% / 9%          | 89% / 5% / 6%
Levante     | Las Palmas | 48% / 25% / 28%         | 50% / 42% / 8%
Real Madrid | Atl Madrid | 46% / 23% / 31%         | 41% / 29% / 30%
Sociedad    | Girona     | 42% / 24% / 34%         | 56% / 28% / 16%
Valencia    | Espanyol   | 56% / 23% / 21%         | 89% / 5% / 6%
Villarreal  | Bilbao     | 53% / 24% / 23%         | 89% / 5% / 6%

Serie A Week 31

Home      | Away       | AnalystMaster H / D / A | Bing H / D / A
Benevento | Juventus   | 15% / 20% / 65%         | 17% / 16% / 67%
Roma      | Fiorentina | 48% / 24% / 28%         | 81% / 15% / 4%
Spal      | Atalanta   | 28% / 25% / 47%         | 8% / 16% / 76%
Sampdoria | Genoa      | 51% / 27% / 22%         | 64% / 20% / 16%
Torino    | Inter      | 37% / 26% / 37%         | 8% / 16% / 76%
Crotone   | Bologna    | 39% / 27% / 33%         | 28% / 32% / 40%
Napoli    | Chievo     | 66% / 20% / 14%         | 89% / 5% / 6%
Verona    | Cagliari   | 32% / 25% / 42%         | 12% / 25% / 63%
Udinese   | Lazio      | 33% / 23% / 44%         | 8% / 16% / 76%
Milan     | Sassuolo   | 53% / 24% / 23%         | 64% / 20% / 16%

Ligue 1 Week 32

Home       | Away        | AnalystMaster H / D / A | Bing H / D / A
St Etienne | Paris SG    | 24% / 23% / 53%         | 8% / 16% / 76%
Monaco     | Nantes      | 63% / 20% / 17%         | 89% / 5% / 6%
Toulouse   | Dijon       | 56% / 23% / 21%         | 45% / 23% / 32%
Guingamp   | Troyes      | 51% / 26% / 24%         | 37% / 33% / 30%
Amiens     | Caen        | 47% / 29% / 24%         | 45% / 23% / 32%
Angers     | Strasbourg  | 52% / 24% / 25%         | 58% / 28% / 14%
Bordeaux   | Lille       | 47% / 27% / 26%         | 81% / 15% / 4%
Nice       | Rennes      | 43% / 26% / 31%         | 55% / 31% / 14%
Metz       | Lyon        | 22% / 22% / 56%         | 17% / 16% / 67%
Marseille  | Montpellier | 55% / 24% / 21%         | 39% / 28% / 33%

Bundesliga Week 29

Home       | Away          | AnalystMaster H / D / A | Bing H / D / A
Hannover   | Werder Bremen | 43% / 29% / 28%         | 50% / 42% / 8%
Freiburg   | Wolfsburg     | 40% / 30% / 30%         | 48% / 24% / 28%
Koln       | Mainz         | 46% / 25% / 29%         | 22% / 28% / 50%
Mgladbach  | Hertha        | 42% / 27% / 31%         | 45% / 23% / 32%
Augsburg   | Bayern Munich | 25% / 25% / 50%         | 6% / 19% / 75%
Hamburger  | Schalke       | 32% / 28% / 40%         | 21% / 36% / 43%
Dortmund   | Stuttgart     | 59% / 22% / 19%         | 55% / 31% / 14%
Frankfurt  | Hoffenheim    | 44% / 24% / 31%         | 33% / 38% / 29%
RB Leipzig | Leverkusen    | 46% / 24% / 31%         | 45% / 23% / 32%

Champions League Quarter-Final Predictions (1st leg)

The end of the season is coming up and so are the big games! Below you will find our predictions, as well as Bing’s, for the first-leg games of the Champions League Quarter-Finals.

Our predictions are pretty similar: only Barcelona seems to have a strong chance of winning (against Roma), while the other games are projected to be very close.

Although the statistics give Liverpool a slightly stronger chance of winning, Guardiola’s team’s chances are almost equivalent, so even a draw would not be an unlikely outcome.

Like Bing, we are betting on a Juventus win; however, Real Madrid is coming on strong at this end of the season and has always performed well in the Champions League (back-to-back titles, plus beating PSG twice in the previous round). Their chances here are probably underestimated by our algorithms.

Enjoy the games and bet responsibly!

AnalystMaster

Champions League Quarter-Finals (1st leg)

Home      | Away          | AnalystMaster H / D / A | Bing H / D / A
Sevilla   | Bayern Munich | 32% / 25% / 43%         | 28% / 31% / 41%
Juventus  | Real Madrid   | 47% / 23% / 30%         | 43% / 31% / 26%
Liverpool | Man City      | 40% / 24% / 36%         | 41% / 32% / 27%
Barcelona | Roma          | 53% / 23% / 24%         | 72% / 17% / 11%

Decision Making: why you need to know about the Bayes Theorem

In this section of the blog we will take you into the process of decision making, starting with an introduction to the Bayes Theorem and why it is so important to decision making.

But let’s start at the beginning: how do you make a decision?

Decision Making and Probabilities

If you decide to go out without an umbrella, it could be because you simply forgot it, but a more likely reason is that you think it will not rain. The weather is all but impossible to forecast with 100% certainty (especially in my hometown in Ireland), but somehow you have estimated that the probability of rain was low (say, below 10%) and that carrying an umbrella was not worth the trouble compared with the benefit of having it in the unlikely event of rain.

We make such decisions unconsciously all the time, and this is the basis of Risk Management and Decision Making.

Let’s take another example: you are a retailer and need to decide how many pieces of item X to carry in your store to avoid running out of stock and losing sales. If you decide to carry 5 pcs of item X (which is replenished every day), it is, or at least it should be, because you have estimated that you almost never sell more than 5 pcs a day, and that when you do, it happens so rarely (say, less than 1% of the time) that you are willing to accept running out of stock 1% of the time rather than carry additional inventory to guard against every possible outcome.
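For illustration, assume daily demand for item X follows a Poisson distribution with a mean of 1.5 units (our assumption, purely for the sketch); the smallest stock level whose chance of being exceeded stays below 1% can then be computed directly:

```python
# Sketch: pick the smallest stock level with < 1% daily stock-out risk,
# assuming Poisson(1.5) daily demand (an illustrative assumption).
from math import exp, factorial

def p_demand_at_most(k, lam=1.5):
    """P(daily demand <= k) under a Poisson(lam) model."""
    return sum(lam**i * exp(-lam) / factorial(i) for i in range(k + 1))

stock = 0
while p_demand_at_most(stock) < 0.99:
    stock += 1
print(stock, "pcs, stock-out risk =", round(1 - p_demand_at_most(stock), 4))
# -> 5 pcs, with a residual stock-out risk of about 0.5%
```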

This probabilistic decision-making process requires a good understanding of the probabilities of events in the world (such as rain, or selling more than 5 pcs in a day). So how do we evaluate them?

Rethinking Reality: nothing is certain!

“Evaluate” and “estimate” are carefully chosen words when talking about the probabilities of events. This probabilistic approach invites us to rethink reality and what we hold as certain.

A prediction like “The Sun will rise tomorrow” sounds so obvious that most of us would hold it as a universal truth. The probabilistic decision maker would instead say “The Sun will rise tomorrow with a 99% chance”. Then, every day, as the Sun rises, the probabilistic decision maker refines his estimate, which eventually becomes “The Sun will rise tomorrow with a 99.9% chance”, then 99.99%, then 99.99999%. However, the probabilistic decision maker will never give in to certainty and hold “The Sun will rise tomorrow” as an absolute truth. For him, nothing in this world is certain, and a 100% probability does not exist! (As a matter of fact, we now know that about 5 billion years from now the Sun will begin to die, so one day the Sun will indeed NOT rise.)

Therefore we will never know for sure the true probabilities needed to evaluate risks and make decisions, whether about carrying an umbrella, selling more than 5 pcs a day, or seeing the Sun rise tomorrow. What we can do, however, is estimate them through observations and tests.

It is very important to distinguish between tests and absolute reality: they are not the same thing, and tests carry a risk of error:

  • Tests and reality are not the same thing: being tested positive for Cancer and having Cancer are not the same thing
  • Tests are flawed: tests can be wrong. You can be tested positive for Cancer and not have it at all (a false positive), or be tested negative and have it (a false negative)

The Bayes Theorem and its applications

Instead of holding universal truths, we are now invited to think about the world (even the most certain things, like the Sun rising every day) in terms of probabilities, to evaluate these probabilities through objective tests and observations, and to continuously refine our estimates as new evidence comes up.

In a probabilistic world, this translates into the Bayes Theorem:

Bayes Theorem:

P(A|X) = P(X|A) * P(A) / P(X)

i.e. the probability of A given that X happened = the probability of X given that A is true * the prior probability of A / the probability of X

or its equivalent form

P(A|X) = P(X|A) * P(A) / ( P(X|A) * P(A) + P(X|not A) * P(not A) )

Now let’s see how the Bayes Theorem works on a practical example. Let’s evaluate P(A|X), the probability of having Cancer (A), given a positive test result (X).

Prior probability of having Cancer before the test P(A) = 1%

We know that P(not A) = 1 – P(A) = 99%

New Event occurs: tested positive for Cancer

  • P(X|A) is the true-positive rate: the probability of being tested positive knowing that you have Cancer = 80%
  • P(X|not A) is the false-positive rate: the probability of being tested positive when you do not have Cancer = 10%

Posterior probability

P(A|X) = P(X|A) * P(A) / ( P(X|A) * P(A) + P(X|not A) * P(not A) ) = 7.5%

The Bayes Theorem invites us to start with an initial estimate of a 1% chance of having Cancer, which increases to 7.5% after a positive test, once the true-positive and false-positive rates are taken into account.

A second positive test would increase the probability of having Cancer further:

Prior probability of having Cancer before the test P(A) = 7.5%

We know that P(not A) = 1 – P(A) = 92.5%

New Event occurs: tested positive for Cancer

  • P(X|A) is the true-positive rate: the probability of being tested positive knowing that you have Cancer = 80%
  • P(X|not A) is the false-positive rate: the probability of being tested positive when you do not have Cancer = 10%

Posterior probability

P(A|X) = P(X|A) * P(A) / ( P(X|A) * P(A) + P(X|not A) * P(not A) ) ≈ 39%

After this second positive test, we now have roughly a 39% chance of having Cancer.
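As a check, here are the same two updates as a small script (prior 1%, sensitivity 80%, false-positive rate 10%):

```python
# Two successive Bayes updates after two positive tests.
def bayes_update(prior, p_pos_given_a=0.80, p_pos_given_not_a=0.10):
    evidence = p_pos_given_a * prior + p_pos_given_not_a * (1 - prior)
    return p_pos_given_a * prior / evidence

p = 0.01                  # prior: 1% chance of having cancer
p = bayes_update(p)       # first positive test
print(round(p, 3))        # ~0.075
p = bayes_update(p)       # second positive test
print(round(p, 3))        # ~0.393
```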

Bottom Line

The Bayes theorem is all about acknowledging that we never know the events of this world for sure, that we need to think about them probabilistically, and that we need to refine our estimates of these probabilities as new data becomes available:

Old Forecast + New & Objective data = New Forecast

This sounds obvious, but it is the core of Forecasting, Risk Management and Decision Making.

AnalystMaster