Football Match Result Prediction Using Neural Networks and Deep Learning

Conference Paper · June 2020


DOI: 10.1109/ICRITO48877.2020.9197811


2020 8th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO)
Amity University, Noida, India. June 4-5, 2020

Football Match Result Prediction Using Neural Networks and Deep Learning
Ekansh Tiwari, Prasanjit Sardar, Sarika Jain
Department of Computer Applications, National Institute of Technology, Kurukshetra, India

978-1-7281-7016-9/20/$31.00 ©2020 IEEE

Abstract: The results of football matches are today predicted by both football experts and machines. Football produces a large amount of statistical data about the players of each team, the matches played between teams, and the environment in which a match is played. This data can be exploited with machine learning techniques to predict information related to a particular match, such as the result of the game, the injury of a player, the performance of a player in the match, or the emergence of new talent. In this project we attempt to design a prediction system powered by Recurrent Neural Networks and LSTMs.

Keywords: Neural Networks, LSTMs, Recurrent Neural Networks.

I. INTRODUCTION

Football, one of the most popular sports around the globe, has a huge number of fans following the day-to-day events of the game. The people associated with football teams, as well as the followers of those teams, often find it hard to guess how their team will perform in a particular game. This is where game result prediction systems come into play. Soccer score prediction can be a helpful means of assessing the readiness of a team before it goes into a game. The prediction can also help the management of different football associations prepare their teams for upcoming matches so as to get the best possible result out of the fixture. Accurate football match outcome prediction is valuable for small and big sports telecommunication companies, as it can help them increase their revenue and make the game more interesting for viewers.

There have been several attempts to predict the outcome of a football match, but none of them has been able to match the accuracy of human prediction. Humans are still by far superior at predicting the outcome of a match because they take into account the technical as well as the emotional factors that affect the game. This helps them predict well, but it has its drawbacks: humans can let their emotions overpower them, which often leads to a wrong prediction of the outcome. Various factors affect the outcome of a football match, such as the number of goals scored by a team in previous matches, the winning streak with which the team comes into the present game, the environment in which the team is playing the current fixture, the number of cards carried by the players involved in the game, and many other factors associated with the current form of the players as well as the current form of the team itself. The aim of this paper is to predict the result of a football match. For the prediction process, we make use of Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM). We aim to increase the accuracy of the prediction by taking into consideration the events that take place during a football match and their effects on the outcome of the match.

II. PROPOSED METHOD

Previous work on predicting the results of football matches with machine learning techniques has mainly focused on the data available about the team, and has been limited to a small number of leagues. The approach of this project is to derive better features from the results of the previous matches that a team has played and, taking into consideration the current form of the team, to predict the result of the game accurately.

A match is played in a particular environment, and the events happening on the field have a huge effect on the outcome of the match. For this reason, a Long Short-Term Memory system is used in this paper.

A. Dataset

The dataset for the project has been taken from "http://football-data.co.uk/data.php". It contains the data of the English Premier League seasons from 2010-11 to 2017-18. The advantage of this dataset is that the number of matches played by each team and the total number of matches in the tournament are fixed. This was beneficial in removing unnecessary information from the system. Each season's dataset contains around 380 records in chronological order, each with around 60 attributes. Manual feature generation was performed on the given dataset to obtain new attributes. For example, from the values of 'FTAG' (Full Time Away Goals) and 'FTHG' (Full Time Home Goals), two new attributes 'HTGS' (Home Team Goals Scored) and 'ATGS' (Away Team Goals Scored) were calculated by summing over the attributes through all the rows. The form of both contenders in every match was calculated by finding the winning streaks of the teams and recording whether a team had won 3 games in a row or 5 games in a row. Finally, one-hot encoding was applied for classification purposes.

B. Methodology

Every neural network has a few parameters that do not change while the model is running, even when the accuracy of the model is being calculated. Such parameters are known as hyper-parameters. Once the model has been run with a certain set of hyper-parameters,
these parameters can be adjusted to gain a better understanding of the model and to improve its accuracy.

The equation for the value of a state of an RNN cell is given by

$h_t = f(h_{t-1}, x_t)$

where $h_t$ is the new value of the state, $h_{t-1}$ is the previous value of the state and $x_t$ is the input. RNNs can remember not only the sequence within one instance but also the sequence of successive instances, because the same nodes, weights and activation functions are reused at every step. With respect to football match outcome prediction, the dataset is relational (tabular) data, as described in Section II-A. Also, the order of the various columns is unimportant, as it does not affect the outcome of the match.

LSTM cells are a modification of RNNs. An LSTM cell behaves like an RNN cell but performs an additional calculation that combines the previous output of the cell with the new input to generate the new output. LSTM cells are mainly composed of three gates:

• Forget gate: this gate applies a sigmoid function. It is responsible for discarding information that the system no longer requires when moving to the next step of the sequence.
• Input gate: this gate controls what new information enters the network. It uses the tanh function as well as the sigmoid function.
• Output gate: this gate is responsible for producing the output computed from the processing done at the previous two gates.

The model is implemented with the TensorFlow library in Python. All the data in this library is processed in the form of arrays. Code written in TensorFlow works in two steps. First, a computational graph is built, in which all the connections among the layers of the neural network and the calculations of the weights are defined, but none of the variables holds a value yet; they are initialized as TensorFlow objects without any numerical values. The second step is the execution, in which a session is run and all the variables that have to be calculated are passed as parameters. This generates the numerical values that are assigned to the variables.

C. Implementation

The steps for the implementation of the LSTM and RNN used for the prediction are given below.

Pseudocode:

Step 1: Import the necessary libraries.

    import tensorflow as tf
    from tensorflow.contrib import rnn  # TensorFlow 1.x only; tf.contrib was removed in TensorFlow 2
    import pandas as pd

Step 2: Import the one-hot encoded dataset using pandas.

Step 3: Define the hyper-parameters.

    hm_epochs = 10    # number of epochs the model runs for
    n_classes = 2     # total number of output classes
    batch_size = 1    # number of rows fed to the model in one iteration
    chunk_size = 27   # number of attributes in each chunk
    n_chunks = 1      # number of chunks in one instance of data
    rnn_size = 512    # number of LSTM cells in the hidden layer

Step 4: Create the RNN model with an LSTM cell.

Step 5: Feed the system with the data and the LSTM cell mechanism.

Step 6: Train the model using the function created in Step 4.

The values of the hyper-parameters can be varied to check which combination is best for the prediction system.

III. RESULTS

The experiment was performed on the above dataset using a BasicLSTMCell from the TensorFlow library. The model was run with various sets of hyper-parameters to find the best model. The hyper-parameters are:

1) n_classes - total number of output classes.
2) batch_size - number of rows fed in one iteration to the model.
3) hm_epochs - number of epochs the model runs for.
4) chunk_size - number of attributes in each chunk.
5) n_chunks - number of chunks in one instance of data.
6) rnn_size - number of LSTM cells in the hidden layer of the recurrent neural network.

Tables I and II show the train and test accuracies for batch sizes from 1 to 124, with chunk_size = 27 (Table I) and with chunk_size = 3 and n_chunks = 9 (Table II). The model with chunk_size = 27 was considered for further analysis. Table III shows the accuracies of the model with varying size of the hidden layer. Here, as rnn_size increases, the training accuracy increases but there is no comparable increase in the test accuracy. This suggests that the model is over-fitting the data. Hence, for the present dataset, the best model values are as follows:

1) n_classes - 2
2) batch_size - 1
3) hm_epochs - 10
4) chunk_size - 27
5) n_chunks - 1
6) rnn_size - 64

TABLE I. RNN WITH CHUNK SIZE 27 AND VARYING BATCH SIZE

Batch size        1           30          60          124
Accuracy (train)  0.98118286  0.7688172   0.7413978   0.7354839
Accuracy (test)   0.8075      0.69875     0.69375     0.69
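The chunk_size and n_chunks hyper-parameters varied in Section III control how one match record is presented to the recurrent network. The following is a minimal sketch, assuming each one-hot encoded row holds 27 attributes (inferred from 27 x 1 = 3 x 9 = 27; the paper does not state this explicitly). The helper `split_into_chunks` is hypothetical and only illustrates splitting a flat row into n_chunks time steps of chunk_size attributes each.

```python
from typing import List, Sequence


def split_into_chunks(row: Sequence[float], chunk_size: int, n_chunks: int) -> List[List[float]]:
    """Split one flat feature row into n_chunks time steps of chunk_size values.

    Hypothetical helper for illustration; the paper does not name such a function.
    """
    if len(row) != chunk_size * n_chunks:
        raise ValueError(
            f"row has {len(row)} values, expected {chunk_size * n_chunks}"
        )
    return [list(row[i * chunk_size:(i + 1) * chunk_size]) for i in range(n_chunks)]


# Assumed 27-attribute row (placeholder values 0..26, not real match data).
row = [float(i) for i in range(27)]

one_step = split_into_chunks(row, chunk_size=27, n_chunks=1)   # 1 time step of 27 values
nine_steps = split_into_chunks(row, chunk_size=3, n_chunks=9)  # 9 time steps of 3 values

print(len(one_step), len(one_step[0]))      # 1 27
print(len(nine_steps), len(nine_steps[0]))  # 9 3
```

Under this assumption, chunk_size = 27 lets the network read the whole record in a single step, whereas chunk_size = 3 with n_chunks = 9 makes it read the same record over nine steps.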

TABLE II. RNN WITH CHUNK SIZE 3 AND N CHUNKS AS 9 AND VARYING BATCH SIZE

Batch size        1           30          60          124
Accuracy (train)  0.922043    0.6973118   0.6655914   0.6704301
Accuracy (test)   0.79875     0.6525      0.64625     0.6525

TABLE III. RNN WITH DIFFERENT SIZES OF HIDDEN LAYER

RNN size          32          64          256         512
Accuracy (train)  0.97311834  0.9704301   0.9919355   0.9892473
Accuracy (test)   0.8125      0.805       0.805       0.8025

IV. CONCLUSION

The popularity and international reach of football make it an interesting problem to solve. Moreover, the number of factors that affect the outcome of a match is enormous. From the results, RNNs with LSTM cells show a clear advantage over a conventional ANN and traditional machine learning techniques. Hence, beyond the mainstream uses of LSTMs, using them to predict the outcome of sporting events has also shown promising results. This model can still be improved. One way to do that is to use a better set of attributes, which could include statistics for each individual player. This could also help in predicting the form of a particular player from season to season. Larger datasets would also help in training the neural network better.

An LSTM network performs well on classification and has the potential to perform well on predicting the outcome of football games. In their research, Hucaljuk et al. showed that an Artificial Neural Network predicted the winners of football matches with a best accuracy of 68.8%. In contrast, the prediction system that worked with the LSTM form of RNNs showed a large improvement, with a test accuracy of 80.75%.

This is no surprise, since more information should lead to better predictions. However, the increased prediction accuracy over minutes played in a match indicates that the network is able to learn about football. Future work can be performed on this subject; for example, different data could be used, different inputs and architectures could be tested, and targets other than the winner could be predicted, such as the number of goals or cards.
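The over-fitting argument in Section III rests on the gap between training and test accuracy. As a small worked check, the sketch below computes that gap per hidden-layer size using only the values reported in Table III. The variable names `rnn_sizes`, `train_acc`, `test_acc` and `gap` are illustrative and do not come from the paper's code.

```python
# Accuracies copied from Table III (hidden-layer size vs. accuracy).
rnn_sizes = [32, 64, 256, 512]
train_acc = [0.97311834, 0.9704301, 0.9919355, 0.9892473]
test_acc = [0.8125, 0.805, 0.805, 0.8025]

# Generalization gap: training accuracy minus test accuracy, rounded to 4 places.
gap = [round(tr - te, 4) for tr, te in zip(train_acc, test_acc)]

for size, g in zip(rnn_sizes, gap):
    print(f"rnn_size={size}: train-test gap = {g:.4f}")
```

The gap grows from about 0.16 at 32 cells to about 0.19 at 256 and 512 cells, while the test accuracy changes by at most 0.01. This is consistent with the over-fitting reading given in Section III.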
