Why every data scientist should learn SQL

It’s been quite a long time since my last blog post, and there is a specific reason for that: I participated in the 2nd DFB Hackathon, which consumed a huge amount of the free time I normally spend creating content for my blog. The hackathon was again a great experience, as all this deep data science stuff is still a challenge for me. But one big question came up again on my side: why do data scientists often use only Python (or R) and not know how and when to use SQL?

Migrating Exasol Community Edition

In one of my older posts I described the data architecture I am using for all my examples. As the database I use the Exasol Community Edition. From time to time it is necessary to update your software to the current version because of new features. This post describes how to migrate one Exasol Community Edition to another. These steps can also be used to migrate nearly every database to Exasol.

xG data journey – the rise of M. Gladbach

After getting all this expected goals data, the obvious next step is to look at the insights such data can produce and at the ways xG can be interpreted. I have decided to take a look at the current development of Borussia Moenchengladbach in the Bundesliga. Even though RB Leipzig has now taken over first place, the development of Gladbach compared to the last seasons is impressive. And now I just want to know: does xG data reveal the secret of Marco Rose?

From Business Analytics to Sports Analytics

Before I started analyzing data for sports betting, I worked as a Business Intelligence (BI) consultant in different industries. During this time I learned how Business Analytics helps you improve your business performance by analyzing data. This also helped me understand what’s needed to improve the performance of a sports team or the betting performance of a punter with the help of data.

xG data journey – scraping dynamic webpages

In the first part of this data journey, I took a look at the general definition of expected goals (xG) and the usage of this metric. As the next step in testing the predictive power of xG, I need to get some data. This part will focus on getting the team expected goals statistics. In one of the following parts, I will also take a look at getting the player expected goals statistics, as these of course offer even deeper insights.

xG data journey – What are Expected Goals?

After I realized that my available data is definitely not enough to beat the bookie, I decided to start a new data journey and take a look at some more advanced statistics. And what could be better suited than Expected Goals (xG)? This statistic is used more and more to explain the luck / bad luck factor you feel when watching a football match. In the first part of this journey I will explain what xG are and what they tell you about a football match.

Retrospective for Bundesliga season 2018/19

Before the new season starts, I should take a look at the last season. Everybody following my pick history already knows: the last season was again very disappointing! But I again have to point out that I of course did not expect to find the “holy grail” after just two seasons of model testing. So how bad do the numbers really look, and what are the most important “lessons learned”…

Overcome your confirmation bias (guest blog)

If you follow my Twitter account, you may have noticed that for several months I have also been writing blogs and articles for other platforms. Even though these are most of the time not about sports betting, I thought it would be a good idea to share them via my blog as well, along with some thoughts about the topics, as the main message is often the same: Get the most out of your data!

A data journey – market values (part 2)

In the last post I described how I collected the market value data as the first step of my journey. The second step is – in my opinion – one of the most important ones: get to know your data! Of course many predictive methods can be used as a black box. But that's something I would not suggest. At the very least you should understand how your values are distributed. And it's even better when you build some kind of domain knowledge. Knowing your data lets you shorten the training process of your predictive models. And visualizations always help to better understand your data.