Part one defined the basic architecture of the Team Strength MLP (multi-layer perceptron). Part two explained the training process and its monitoring via TensorBoard. Now it is time to take a look at the prediction of football matches. This primarily consists of the following steps:
- Load the prediction data set
- Re-build neural network architecture and load pre-trained weights
- Execute prediction
The Bundesliga season 2017/18 will be the test case for this example. The seasons 2008 to 2016 were used to train the model.
Load prediction data set
Compared to the training data set, the prediction data set contains just the different team strength values; the outcome is omitted, since that is what we want to predict. Apart from that, the procedure is exactly the same as for the training data set: connect to the database, load the data, and close the connection again.
#read prediction data set
df_data = v_Con.export_to_pandas("""
    select
        his.football_match_his_lid,
        str.home_attacking_strength,
        str.home_defence_strength,
        str.away_attacking_strength,
        str.away_defence_strength
    from betting_dv.football_match_his_l his
    join betting_dv.football_match_his_l_s_statistic stat
        on his.football_match_his_lid = stat.football_match_his_lid
    join betting_dv.football_match_his_l_s_attack_defence_strength_30 str
        on his.football_match_his_lid = str.football_match_his_lid
    join betting_dv.football_division_h division
        on his.football_division_hid = division.football_division_hid
    join betting_dv.football_season_h season
        on his.football_season_hid = season.football_season_hid
    join betting_dv.football_team_h home
        on his.football_team_home_hid = home.football_team_hid
    join betting_dv.football_team_h away
        on his.football_team_away_hid = away.football_team_hid
    where division.division = 'D1'
        and season.season in ('2017_2018')
""")
Additionally, the data set contains the hash key of each football match, so that the prediction results can easily be linked back to the existing data model. Everything is stored in a Pandas data frame, which simplifies the further processing.
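To illustrate why the hash key matters, here is a minimal sketch of joining the prediction output back to another table of the data model via that key. The frames, column names and values are purely illustrative, not taken from the actual database schema:

```python
import pandas as pd

# Hypothetical prediction output, keyed by the match hash key
df_pred = pd.DataFrame({
    'FOOTBALL_MATCH_HIS_LID': ['a1', 'b2'],
    'PROB_HOME_WIN': [0.5, 0.3],
})

# Hypothetical table holding the actual full-time results
df_result = pd.DataFrame({
    'FOOTBALL_MATCH_HIS_LID': ['a1', 'b2'],
    'FULL_TIME_RESULT': ['H', 'A'],
})

# The shared hash key allows a straightforward join
df_joined = df_pred.merge(df_result, on='FOOTBALL_MATCH_HIS_LID')
```

The same join works on the database side via the link table, which is exactly what makes writing the predictions back with their hash key worthwhile.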
Reload network weights
In contrast to many other machine learning frameworks, TensorFlow does not save the complete model; just the weights of the neurons are stored. So you first have to recreate the network architecture. It is important to use the parameter restore=True, otherwise the saved weights will not overwrite the initial ones.
#rebuild layers
v_net = tflearn.input_data(shape=[None, 4], name="InputLayer")
v_net = tflearn.fully_connected(v_net, 10, activation="relu", bias=False,
                                weights_init="truncated_normal", bias_init="zeros",
                                trainable=True, restore=True, reuse=False,
                                scope=None, name="FullyConnectedLayer_relu")
v_net = tflearn.fully_connected(v_net, 3, activation="softmax", name="OutputLayer")
v_net = tflearn.regression(v_net,
                           optimizer=tflearn.Adam(learning_rate=0.01, epsilon=0.01),
                           metric=tflearn.metrics.accuracy(),
                           loss="categorical_crossentropy")

#define model
model = tflearn.DNN(v_net)
The load() function of the model loads the weights from the stored model file passed as argument.
#load saved weights
model.load('C:/temp/Tensorflow/v01/relu_adam_001.tf_model')
The data frame columns containing the input variables have to be passed to the predict() function. As the result you get an array, which should be added to the existing data frame, so that the complete data can be written back to the database.
import numpy as np

#prediction (converted to a NumPy array so that column slicing works)
prediction = np.array(model.predict(df_data[df_data.columns[1:5]]))

#add predictions to data frame
df_data['PROB_HOME_WIN'] = prediction[:, 0]
df_data['PROB_DRAW'] = prediction[:, 1]
df_data['PROB_AWAY_WIN'] = prediction[:, 2]
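If you also want a single predicted outcome per match instead of three probabilities, the most probable class can be derived per row. This is a small sketch with made-up probabilities; only the probability column names mirror the ones used above:

```python
import numpy as np
import pandas as pd

# Hypothetical prediction output: class probabilities per match
# in the order home win, draw, away win
df = pd.DataFrame({
    'PROB_HOME_WIN': [0.55, 0.20, 0.30],
    'PROB_DRAW':     [0.25, 0.30, 0.40],
    'PROB_AWAY_WIN': [0.20, 0.50, 0.30],
})

# Map the index of the most probable class per row to a readable label
labels = np.array(['H', 'D', 'A'])
probs = df[['PROB_HOME_WIN', 'PROB_DRAW', 'PROB_AWAY_WIN']].values
df['PREDICTED_OUTCOME'] = labels[probs.argmax(axis=1)]
```

Keeping the raw probabilities in the data frame is still useful, though, since score metrics like the Brier score and the comparison against bookmaker odds need them.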
As already described in another post, the import_from_pandas() function writes a complete Pandas data frame to a database table. But you have to make sure that the table exists and that the structure of the data frame matches the structure of the table.
#write predictions to db table
v_Con.import_from_pandas(df_data, 'TEAM_STRENGTH_MLP')
All sources are again available on GitHub:
The following SQL statement can be used to calculate the accuracy for the predicted season and to compare it to the Poisson model and the bookmaker Bet365:
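The same evaluation can also be done directly in pandas once predictions and actual outcomes sit in one data frame. This is a minimal sketch with invented numbers; the one-hot outcome columns (HOME_WIN, DRAW, AWAY_WIN) are an assumption for illustration, not the actual table layout:

```python
import pandas as pd

# Hypothetical evaluation frame: predicted probabilities plus the
# actual outcome, one-hot encoded
df = pd.DataFrame({
    'PROB_HOME_WIN': [0.6, 0.2],
    'PROB_DRAW':     [0.3, 0.3],
    'PROB_AWAY_WIN': [0.1, 0.5],
    'HOME_WIN':      [1, 0],
    'DRAW':          [0, 0],
    'AWAY_WIN':      [0, 1],
})

probs = df[['PROB_HOME_WIN', 'PROB_DRAW', 'PROB_AWAY_WIN']].values
actual = df[['HOME_WIN', 'DRAW', 'AWAY_WIN']].values

# Multiclass Brier score: squared difference between predicted
# probabilities and the one-hot outcome, summed over the three
# classes and averaged over all matches (lower is better)
brier = ((probs - actual) ** 2).sum(axis=1).mean()

# Accuracy: share of matches where the most probable class was correct
accuracy = (probs.argmax(axis=1) == actual.argmax(axis=1)).mean()
```

Running the same computation over the probability columns of the Poisson model and the implied probabilities of Bet365 yields the comparison discussed below.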
Even though it is just one season, the model shows improved performance. The Brier score improved by 0.07, although the MLP uses the same variables as the Poisson model. This equals roughly a 33% improvement relative to the gap to the bookie.
In the next post the model has to prove its performance in a multi-season backtest. This will also show whether the model is profitable against the bookie.
If you have further questions, feel free to leave a comment or contact me @Mo_Nbg.