A few days ago I read an interesting article about how bookies distribute their margin across the possible outcomes. Of course, all bookies keep this secret, as it gives them a range within which they can shorten or lengthen the odds depending on the amounts bet. But I need this information, because I simulate my prediction models for back and lay markets using only the odds of the back markets. This post explains how you should calculate the bookie margin, and how you should not. I handled this topic a little naively while developing my Poisson model, which caused some problems.
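To make the terms concrete, here is a minimal Python sketch of the margin (overround) implied by a set of decimal odds, together with the simplest way of removing it: scaling every implied probability by the same factor. The odds are invented, the function names are my own for illustration, and the equal scaling is only the basic approach, not necessarily how a real bookie distributes the margin.

```python
def overround(odds):
    """Return the bookie margin implied by decimal odds (0.05 means 5 %)."""
    return sum(1.0 / o for o in odds) - 1.0


def proportional_fair_probs(odds):
    """Remove the margin by scaling all implied probabilities equally.

    This assumes the margin is spread proportionally over all outcomes,
    which is a simplification.
    """
    implied = [1.0 / o for o in odds]
    total = sum(implied)
    return [p / total for p in implied]


# Hypothetical home / draw / away back odds, for illustration only.
example_odds = [2.10, 3.40, 3.60]

margin = overround(example_odds)
fair_probs = proportional_fair_probs(example_odds)
fair_odds = [1.0 / p for p in fair_probs]

print(f"margin: {margin:.4f}")
print("fair probabilities:", [round(p, 4) for p in fair_probs])
print("fair odds:", [round(o, 2) for o in fair_odds])
```

The fair odds derived this way are always longer than the quoted back odds, and how much longer depends entirely on the assumption about where the margin sits.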
Another 10 match days have been played in the German Bundesliga. Bayern Munich are (again) already champions, and HSV, the last never-relegated member, are moving closer to relegation. It is time to take a look at the performance of my Poisson model since the start of 2018.
The first and second parts of this series explained some basic methods for optimising the regression models for the GS & PPG match ratings. You now have a set of 3 different regression models (linear, polynomial and polynomial without outliers) for each predictor variable. These models now have to compete not only against each other, but of course also against the bookie odds and the Poisson prediction model.
The first part of this series took a look at the GS match rating model. That post described how to identify a non-linear relationship between the predictor variables and the outcome variable. The same methods will now be applied to the PPG match rating model, so that we can compare the two polynomial regression models. On top of that, I want to show how you can find out whether outliers in your data influence your regression model.
Another 10 match days have been played in the German Bundesliga, so it is time for the next summary of the betting history of my Poisson prediction model. In my summary of the first 10 match days I faced a really big loss. This trend continued over the following match days, but some interesting observations can be made.
In the last post I described how the features for the GS & PPG match rating models are calculated. Based on these features, I will now describe how to build and optimise a linear regression model with R. The first part describes the optimisation of the linear regression model for the GS match rating in detail. The second part covers the PPG match rating model. The third and final part compares the prediction performance of the different models.
With 10 match days played in the German Bundesliga, it is a good time for a short summary of the current state of the betting history. If you follow my blog, you will know that I publish every pick at Pyckio. So far I have only published the picks of the Poisson model. I am still investigating new prediction models.
In the last post I described the predictive models that will be explained in this series. Following the development process for predictive models, the next step is to provide the raw data for them. Fortunately, football-data.co.uk already offers all the data these models need. So this post will explain how to implement the features for the GS and PPG match rating models based on the existing Raw Data Vault model.
This post is the start of a new series, in which I explain how to implement another predictive model in the TripleA DWH architecture. When I started developing predictive models with R, I was a little overwhelmed by the different plots R provides for analysing and optimising a predictive model. That's why I wanted to learn and understand the whole optimisation process in R on the basis of a simple predictive model. Football-data.co.uk provides an explanation of a small rating system, which uses a linear regression to predict the probability of a home win, draw or away win. I have chosen this linear regression model because linear regression is a frequently used and easy-to-understand predictive method. With a linear regression you can investigate the relationship between the variable to be predicted and one or more features.
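As a rough illustration of the idea, the sketch below fits a straight line through a handful of (match rating, home-win share) pairs using ordinary least squares. It is written in plain Python rather than R, the data points are invented and not taken from football-data.co.uk, and the helper names `fit_line` and `predict` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Evaluate the fitted line at feature value x."""
    return intercept + slope * x


# Invented example data: a match rating and the share of home wins
# observed among matches with that rating.
ratings = [-10, -5, 0, 5, 10]
home_win_share = [0.30, 0.38, 0.46, 0.53, 0.62]

slope, intercept = fit_line(ratings, home_win_share)
print(f"slope: {slope:.4f}, intercept: {intercept:.4f}")
print(f"predicted home-win share at rating 3: {predict(slope, intercept, 3):.3f}")
```

A straight line can return values below 0 or above 1 for extreme ratings, which is one reason to inspect the fit before treating the prediction as a probability.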
In the last post the prototype of the Poisson prediction model showed that the optimised model is able to beat the bookie, at least for the German Bundesliga. The next step in the predictive model development process is to implement the model for forecasting the current fixtures. For this model that part is very easy, as you do not need to implement a trained model, just the prediction logic.
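To give an idea of what such prediction logic can look like, here is a minimal Python sketch that turns two expected-goals values into home, draw and away probabilities. It assumes the goals of both teams follow independent Poisson distributions and cuts the score grid off at 10 goals per team. The expected-goals values are invented, and the function names are hypothetical, not the ones used in my implementation.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals given an expected value lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def outcome_probs(home_exp, away_exp, max_goals=10):
    """Return (home win, draw, away win) probabilities.

    Assumes independent Poisson goal counts; scores above max_goals
    are ignored, so the three values sum to slightly less than 1.
    """
    home = draw = away = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_exp) * poisson_pmf(a, away_exp)
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    return home, draw, away


# Invented expected goals for a single fixture.
p_home, p_draw, p_away = outcome_probs(1.6, 1.1)
print(f"home: {p_home:.3f}, draw: {p_draw:.3f}, away: {p_away:.3f}")
```

Inverting these probabilities gives the model's odds, which can then be compared with the bookie odds to decide whether a bet has value.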