Inflated ML Poisson model to predict football matches

My last blog post, “Poisson vs Reality”, changed something in my head. I realized that I had not yet checked the individual parts of my model closely enough to see whether they differ from reality, and whether I could reduce that difference and improve model performance. That’s why I started building a new model approach for the new season, focusing on improving individual steps of the modeling process. After training multiple models, I will test against the fair profit to see which kinds of adaptations improve a Poisson distribution model the most.

Using xG & advanced stats to predict football matches

With the BeatTheBookieDataService in place, it’s also time to provide some new models. This post takes a look at possible models using the team statistics that understat.com provides for each match. To that end, I will compare 3 of the most widely used machine learning algorithms. Besides this, it’s also time to test again a basic question of predictive modeling for football: “To differentiate between home/away performance or not to differentiate?” For my Poisson models I have always differentiated between home and away performance. But is this also needed when using ML algorithms?

Why every data scientist should learn SQL

It’s been quite a long time since the last post on my blog, and for a specific reason: I participated in the 2nd DFB Hackathon, which consumed a huge amount of the free time I normally spend creating content for my blog. The Hackathon was again a great experience, as all this deep data science stuff is still a challenge for me. But there is again one big question on my side: why do data scientists often use just Python (or R) and not know how and when to use SQL?
