As I already did for the ZIP Poisson model, I have also added data for some smaller leagues to my Vanilla Poisson model: Championship, Serie B, La Liga 2, Eredivisie and Liga Portugal. All of these additional leagues are already available through my data service. So it’s time to take a look at how profitable these new leagues are with the Vanilla Poisson model.
As in every other betting simulation, I used a flat-stake strategy of 1 unit per bet. Every bet that indicates value is selected. The bets are placed against fair Bet365 odds, so the betting is free from any margin. All bets placed since the start of the 2018/19 season result in a loss of 584.7 units.
This corresponds to an average yield of -3.82%, or a loss of about 0.04 units per bet. The ZIP Poisson model provided an average yield of -5.22% for the same leagues and the same period. But let’s look at the details.
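To make the mechanics of the simulation concrete, here is a minimal Python sketch of flat-stake value betting against margin-free odds. It is not my production code: the function names (`fair_odds`, `is_value_bet`, `flat_stake_result`) and the sample bets are made up for illustration, and the proportional margin removal is just one common assumption for turning bookmaker odds into fair odds.

```python
def fair_odds(bookmaker_odds):
    """Remove the bookmaker margin from a set of odds for one match.

    Assumption: proportional normalisation of the implied probabilities.
    Other margin-removal methods exist and give slightly different odds.
    """
    implied = [1.0 / o for o in bookmaker_odds]
    overround = sum(implied)  # greater than 1 when the odds contain a margin
    # fair probability = implied / overround, so fair odds = odds * overround
    return [o * overround for o in bookmaker_odds]


def is_value_bet(model_prob, odds):
    """A bet indicates value when the expected return per unit is above 1."""
    return model_prob * odds > 1.0


def flat_stake_result(bets, stake=1.0):
    """Simulate flat-stake betting.

    bets: iterable of (model_prob, odds, won) tuples.
    Returns (number_of_bets, profit_in_units, yield_in_percent).
    """
    n_bets = 0
    profit = 0.0
    for model_prob, odds, won in bets:
        if not is_value_bet(model_prob, odds):
            continue  # no value indicated, so no bet is placed
        n_bets += 1
        profit += stake * (odds - 1.0) if won else -stake
    yield_pct = 100.0 * profit / (n_bets * stake) if n_bets else 0.0
    return n_bets, profit, yield_pct


# Made-up sample: (model probability, fair odds, bet won?)
sample_bets = [
    (0.55, 2.0, True),
    (0.40, 2.0, False),  # skipped: 0.40 * 2.0 is below 1
    (0.30, 4.0, False),
    (0.60, 1.8, True),
]
print(flat_stake_result(sample_bets))
```

With a stake of 1 unit, the yield in percent is simply 100 times the average profit per bet, which is why a yield of -3.82% and a loss of about 0.04 units per bet describe the same result.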
As with all of my models, it’s again the case that you should not place a bet when the model indicates value for the draw. Determining the real probability of a draw is just too hard, and the average yield for draw bets is -73.75%.
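To illustrate where a draw probability comes from in a Poisson-type model, here is a small Python sketch. It assumes two independent Poisson distributions, one for each team’s goals, and sums the probabilities of all score lines. The function names and the expected-goals inputs are hypothetical, and the sketch is a simplified illustration rather than the exact implementation of my Vanilla Poisson model.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals for an expected goal value lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def outcome_probs(home_exp, away_exp, max_goals=10):
    """Return (home win, draw, away win) probabilities.

    Assumption: home and away goals are independent Poisson variables.
    Score lines above max_goals are ignored, so the three values sum to
    slightly less than 1.
    """
    home = draw = away = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_exp) * poisson_pmf(a, away_exp)
            if h > a:
                home += p
            elif h == a:
                draw += p  # the draw is the diagonal of the score matrix
            else:
                away += p
    return home, draw, away


# Hypothetical expected goals for one match
print(outcome_probs(1.5, 1.1))
```

The draw probability is the sum over the diagonal of the score matrix (0:0, 1:1, 2:2, ...), so it depends on both expected-goals estimates at the same time.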
Looking at the average profit per division without draw bets shows a somewhat surprising picture. The newly added leagues are more or less all at the end of the list. The Championship, with a yield of 2.63%, is the only new league that breaks into the top of the list alongside the big European leagues that already performed well. Then there’s the block of new smaller leagues with a similarly bad performance overall. Serie A again provides the biggest loss. But that’s expected, as all my models have problems predicting this league.
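A per-league breakdown like this can be sketched in a few lines of Python. The record layout (`league`, `selection`, `profit`) and the sample rows are invented for this example and do not reflect my actual data structure or results. The sketch just shows the idea of dropping draw bets first and then ranking the leagues by yield.

```python
from collections import defaultdict


def yield_by_league(bets, exclude_selections=("D",)):
    """Rank leagues by yield for 1 unit flat-stake bets.

    bets: iterable of dicts with the hypothetical keys
          'league', 'selection' ('H', 'D' or 'A') and 'profit' (in units).
    Returns a list of (league, n_bets, profit, yield_pct), best league first.
    """
    totals = defaultdict(lambda: [0, 0.0])  # league -> [n_bets, profit]
    for bet in bets:
        if bet["selection"] in exclude_selections:
            continue  # draw bets are left out of the breakdown
        entry = totals[bet["league"]]
        entry[0] += 1
        entry[1] += bet["profit"]
    rows = [
        (league, n, profit, 100.0 * profit / n)
        for league, (n, profit) in totals.items()
    ]
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Made-up sample rows, only to show the call
sample = [
    {"league": "Championship", "selection": "H", "profit": 1.1},
    {"league": "Championship", "selection": "A", "profit": -1.0},
    {"league": "Championship", "selection": "D", "profit": -1.0},
    {"league": "Serie B", "selection": "H", "profit": -1.0},
    {"league": "Serie B", "selection": "A", "profit": 0.5},
]
for row in yield_by_league(sample):
    print(row)
```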
I would have expected a better performance from the Vanilla Poisson model for the new leagues, as it again uses xG data. But the running profit comparison between the goals-based ZIP Poisson model and the xG-based Vanilla Poisson model indicates a similar performance. The ZIP Poisson model even shows a slightly smaller loss of 17.4 units, compared to 30.1 units for the Vanilla Poisson model.
This is somewhat at odds with my blog post on the predictive power of xG data and why it is superior to goals data. Looking at the model performance at league level, it’s possible to spot the differences. Eredivisie and Serie B stand out. For the ZIP Poisson model, the Eredivisie provides a really good-looking profit, but this profit gets eaten up by the loss in Serie B. For the Vanilla Poisson model, the variation between the leagues is much smaller. The xG data provides a better profit for 4 of 6 leagues.
Here we should also take into consideration how xG data is produced and which source provides the xG data for the new leagues. xG is not a statistic that you collect during a match by counting specific events. It is always the output of a model. And each website or data provider uses its own xG model, which of course also produces different xG values. The xG data I am using for the Big5 leagues is scraped from understat.com. The xG data for the minor leagues is scraped from the website FiveThirtyEight. So we might also face a data quality problem here. The xG data from FiveThirtyEight might not have the same quality, and therefore not the same predictive power, as the data from understat. But this can only be determined by using the same model and the same leagues for both sources.
Nevertheless, we can take a look at the overall profit of the Vanilla Poisson model when respecting the insights of this analysis. Excluding draw bets as well as the 4 unprofitable leagues provides an overall profit of 537 units since the 2018/19 season. If you had only used the model for a shorter time, since 2021/2022, you would have faced a loss. But that might always be the case when looking at only a smaller timeframe of betting.
If you have further questions, feel free to leave a comment or contact me @Mo_Nbg.