Artificial Intelligence and Machine Learning in Sport Research: An Introduction for Non-data Scientists
Institute for Health and Sport, Victoria University, Melbourne, VIC, Australia
In the last two decades, artificial intelligence (AI) has transformed the way in which we consume and analyse sports. The role of AI in improving decision-making and forecasting in sports, amongst many other advantages, is rapidly expanding and gaining more attention in both the academic sector and the industry. Nonetheless, for many sports audiences, professionals and policy makers, who are not particularly au courant or experts in AI, the connexion between artificial intelligence and sports remains fuzzy. Likewise, for many, the motivations for adopting a machine learning (ML) paradigm in sports analytics are still either faint or unclear. In this perspective paper, we present a high-level, non-technical overview of the machine learning paradigm that motivates its potential for enhancing sports (performance and business) analytics. We provide a summary of some relevant research literature on the areas in which artificial intelligence and machine learning have been applied to the sports industry and in sport research. Finally, we present some hypothetical scenarios of how AI and ML could shape the future of sports.

Keywords: artificial intelligence, machine learning, sports business, sports analytics, sport research, future of sports

Edited by: Leigh Robinson, Cardiff Metropolitan University, United Kingdom
Reviewed by: Daniel Mason, University of Alberta, Canada; Jaret Karnuta, University of Pennsylvania Health System, United States
*Correspondence: Nader Chmait [email protected]
Specialty section: This article was submitted to Sports Management, Marketing and Business, a section of the journal Frontiers in Sports and Active Living
Received: 18 March 2021
Accepted: 15 November 2021
Published: 08 December 2021
Citation: Chmait N and Westerbeek H (2021) Artificial Intelligence and Machine Learning in Sport Research: An Introduction for Non-data Scientists. Front. Sports Act. Living 3:682287. doi: 10.3389/fspor.2021.682287

INTRODUCTION

It was in Moneyball (Lewis, 2004), the famous success story of the Major League Baseball team “Oakland Athletics,” that using in-game play statistics came under focus as a means to assemble an exceptional team. Despite Oakland Athletics’ relatively small budget, the adoption of a rigorous data-driven approach to assemble a new team led the club to the playoffs in the year 2002. An economic evaluation of the Moneyball hypothesis (Hakes and Sauer, 2006) describes how, at the time, a baseball hitter’s salary was not truly explained by the contribution of a player’s batting skills to winning games. Oakland Athletics gained a big advantage over their competitors by identifying and exploiting this information gap. It’s been almost two decades since Moneyball principles, or SABRmetrics (Lewis, 2004), were introduced to baseball. SABR stands for Society for American Baseball Research and SABRmetricians are those scientists who gather the in-game data and analyse it to answer questions that will lead to improving team performance. Since the success of the Oakland Athletics, most MLB teams started employing SABRmetricians. The ongoing and exponential increase of computer processing power has further accelerated the ability to analyse “big data,” and indeed, computers increasingly are taking charge of the deeper analysis of data sets, through means of artificial intelligence (AI). Likewise, the surge in high-quality data collection and data aggregation (accomplished by organisations like Baseball Savant/StatCast, ESPN and others) is a key ingredient in the spike in the accuracy and breadth of analytics that was observed in the MLB in recent years.
Frontiers in Sports and Active Living | www.frontiersin.org 1 December 2021 | Volume 3 | Article 682287
Chmait and Westerbeek Artificial Intelligence and Machine Learning in Sport Research
The adoption of AI and statistical modelling in sports has therefore become more prominent in recent years as new technologies and research applications are impacting professional sports at various levels of sophistication. The wide applicability of machine learning algorithms, combined with increasing computing processing power as well as access to more and new sources of data in recent years, has made sports organisations hungry for new applications and strategies. The overriding aim is still to make them more competitive on and off the field, in both athletic and business performance. The benefits of leveraging the power of AI can, in that regard, take different forms, from optimising business or technical decision making to enhancing athlete/team performance, but also increasing demand for attendance at sporting events, as well as promoting alternative entertainment formats of the sport.

We next list some areas where AI and machine learning (ML) have left their footprints in the world of sports (Beal et al., 2019) and provide some examples of applications in each (some of the listed applications could overlap with one or more of the areas).

• Game activity/analytics: match outcome modelling, player/ball tracking, match event (e.g., shot) classification, umpire assistance, sports betting.
• Talent identification and acquisition: player recruitment, player performance measurement, biomechanics.
• Training and coaching: assessment of team formation efficacy, tactical planning, player injury modelling.
• Fan and business focused: measurement of a player’s economic value, modelling demand for event attendance, ticket pricing optimisation (variable and dynamic), wearable and sensor design, highlight packaging, virtual and augmented reality sport applications, etc.

The field of AI (particularly ML) offers new methodologies that have proven to be beneficial for tackling the above challenges. In this perspective paper we aim to provide sports business professionals and non-technical sports audiences, coaches, business leaders, policy makers and stakeholders with an overview of the range of AI approaches used to analyse sport performance and business centric problems. We also discuss perspectives on how AI could shape the future of sports in the next few years.

RESEARCH ON AI AND ML IN SPORTS

In this section, we will not be reviewing examples of how AI has been applied to sports for a specific application, but rather, we will look at the intersection of AI and sports at a more abstract level, discussing some research that either surveyed or summarised the application of AI and ML in sports.

One of the earliest works discussing the potential applications of artificial intelligence in sports performance, and its positive impact on improving decision-making, is by Lapham and Bartlett (1995). The paper discusses how expert systems (i.e., a knowledge-based database used for reasoning) can be used for sports biomechanics purposes. Bartlett (2006) reviewed developments in the use of AI in sports biomechanics (e.g., throwing, shot putting, football kicking, ...) to show that, at the time of writing, expert systems were marginally used in sports biomechanics despite being popular for “gait analysis” whereas Artificial Neural Networks were used for applications such as performance patterns in training and movement patterns of sports performers. An Artificial Neural Network (ANN) is a system that mimics the functionality of a human brain. ANNs are used to solve computational problems or estimate functions from a given data input, by imitating the way neurons are fired or activated in the human brain. Several (layers of) artificial neurons, known as perceptrons, are connected to perform computations which return an output as a function of the provided input (Anderson, 1995).

Bartlett (2006) predicted that multi-layer ANNs will play a big role in sports technique analysis in the future. Indeed, as we discuss later, multi-layer ANNs, now commonly referred to as Deep Learning, have become one of the most popular techniques in sports related analytics. Last but not least, Bartlett (2006) described the applications of Evolutionary Computation and hybrid systems in the optimisation of sports techniques and skill learning. Further discussion around the applications of AI in sports biomechanics can be found in Ratiu et al. (2010). McCabe and Trevathan (2008) discussed the use of artificial intelligence for prediction of sporting outcomes, showing how the behaviour of teams can be modelled in different sporting contests using multi-layer ANNs.

Between 2006 and 2010, machine learning algorithms, particularly ANNs, were becoming more popular amongst computer scientists. This was aided by the impressive improvements in computer hardware, but also due to a shift in mindset in the AI community. Large volumes of data were made public amongst researchers and scientists (e.g., ImageNet, a visual database delivered by Stanford University), and new open-source machine learning competitions were organised (such as Netflix Prize and Kaggle). It is these types of events that have shaped the adoption of AI and machine learning in many different fields of study from medicine to econometrics and sports, by facilitating access to training data and offering free open-source tools and frameworks for leveraging the power of AI. Note that, in addition to ANNs, other machine learning techniques are utilised in such competitions, and sometimes these can be used in combination with one another. For instance, some of the techniques that went into the winning of the Netflix Prize include singular value decomposition combined with restricted Boltzmann machines and gradient boosted decision trees.

Other examples discussing ANNs in sports include Novatchkov and Baca (2013) who discuss how ANNs can be used for understanding the quality of execution, assisting athletes and coaches, and training optimisation. However, the applications of AI to sports analytics go beyond the use of ANNs. For example, Fister et al. (2015) discussed how nature-inspired AI algorithms can be used to investigate unsolved research problems regarding safe and effective training plans. Their approach (Fister et al., 2015) relies on the notion of artificial collective intelligence (Chmait et al., 2016; Chmait, 2017) and the ability of algorithms to adapt to a changing environment. The authors show how such algorithms can be
used to develop an artificial trainer to recommend an informed training strategy to athletes after taking into consideration various factors related to the athlete’s physique and readiness. Other types of scientific methods that include Bayesian approaches have been applied to determining player abilities (Whitaker et al., 2021) but also predicting match outcomes (Yang and Swartz, 2004). Bayesian analysis and learning is an approach for building (statistical and inference) models by using Bayes’ theorem to update the probability for a hypothesis as more evidence or information becomes available (Ghosh et al., 2007).

There are numerous research papers in which AI and ML are applied to sport, and it is not our aim to comprehensively discuss these works here1. However, we refer to a recent survey that elaborates on this topic. Beal et al. (2019) surveyed the applications of AI in team sports. The authors summarised existing academic work, in a range of sports, tackling issues such as match outcome modelling, in-game tactical decision making, player performance in fantasy sport games, and managing professional players’ sport injuries. Work by Nadikattu (2020) presents, at an abstract level, discussions on how AI can be implemented in (American) sports from enhancing player performance, to assisting coaches to come up with the right formations and tactics, to developing automated video highlights of sports matches and supporting referees using computer vision applications.

1 For conferences and published articles on AI and sports analytics see Swartz (2020).

We emphasise that the application of AI in sports is not limited to topics of sports performance, athlete talent identification or the technical analysis of the game. The (off the field) business side of sports organisations is rapidly shifting towards a data driven culture led by developing profiles of their fans and their consumer preferences. As fans call for superior content and entertainment, sport organisations must react by delivering a customised experience to their patrons. This is often achieved by the use of statistical modelling as well as other machine learning solutions, for example, to understand the value of players from an economic perspective. As shown in Chmait et al. (2020a), investigating the relationship between the talent and success of athletes (to determine the existence of what is referred to as superstardom phenomenon or star power) is becoming an important angle to explore value created in sport. To provide an idea of the extent of such work, we note some sports in which the relationship between famous players/teams and their effect on audience attendance or sport consumption has been studied:

• In soccer (Brandes et al., 2008; Jewell, 2017),
• In Major League Baseball (Ormiston, 2014; Lewis and Yoon, 2016)
• In the National Basketball Association (Berri et al., 2004; Jane, 2016)
• In tennis: superstar player effect in demand for tennis tournament attendance (Chmait et al., 2020a), the presence of a stardom effect in social media (Chmait et al., 2020b), player effect on German television audience demand for live broadcast tennis matches (Konjer et al., 2017)
• And similarly, in Cricket (Paton and Cooke, 2005), Hockey (Coates and Humphreys, 2012), and in the Australian Football League (Lenten, 2012).

AI algorithms are being used in Formula 1 (F1) to improve the racing tactics of competing teams by analysing data from hundreds of sensors in the F1 car. Recent work by Piccinotti (2021) shows how artificial intelligence can provide F1 with automated ways for identifying tyre replacement strategies by modelling pit-stop timing and frequency as sequential decision-making problems.

Researchers from Tennis Australia and Victoria University devised a racket recommendation technique based on real HawkEye (computer vision system) data. An algorithm was used to recommend a selection of rackets based on movement, hitting pattern and style of the player with the aim to improve the player’s performance (Krause, 2019).

Accurate and fair judging of sophisticated skills in sports like gymnastics is a difficult task. Recently, a judging system was developed by Fujitsu Ltd. The system scores a routine based on the angles of a gymnast’s joints. It uses AI to analyse data from 3D laser sensors that capture the gymnasts’ movements (Atiković et al., 2020).

Finally, it is important to note the exceptionally successful adoption of AI in board games like Chess, Checkers, Shogi and the Chinese game of Go, as well as virtual games (like Dota 2 and StarCraft). In the last couple of decades, AI has delivered a staggering rise in performance in such areas to the point that machines (almost) consistently defeat human world champions. We refer to some notable solutions like the Checkers algorithm of Schaeffer et al. (2007), Deep Blue defeating Kasparov in Chess (Campbell et al., 2002), AlphaGo defeating Lee Sedol in Go and its successor AlphaGo Zero (Silver et al., 2017) (noting that AlphaZero also reached superhuman strength in chess), AlphaStar in StarCraft II (Vinyals et al., 2019), as well as superhuman AI for multiplayer poker (Brown and Sandholm, 2019). Commonly, in these types of games or sports, AI algorithms rely on a Reinforcement Learning approach (which we will describe later) as well as using techniques like Monte Carlo Tree Search to explore the game and devise robust strategies to solve and play these games. Some of the recent testbeds used to evaluate AI agents and algorithms are discussed in Hernández-Orallo et al. (2017). For a broader investigation of AI in board and virtual/computer games refer to Risi and Preuss (2020).

The rise of applying AI and ML is unstoppable and to that end, one might be wondering how AI and ML tools work and why they are different from traditional summary analytics. We touch upon these considerations in the next section.

THE MACHINE LEARNING PARADIGM

To understand why ML is used in a wide range of applications, we need to look at the difference between recent AI approaches to learning and traditional analytics approaches. At a higher conceptual level, one can describe old or traditional
approaches to sports analytics as starting off with some set of rules that constitute the problem definition, and some data that is to be processed using a program/application which will then deliver answers to the given problem. In contrast, in a machine learning/predictive analytics paradigm, the way this process works is fundamentally different. For instance, in some approaches of the ML paradigm, one typically starts by feeding the program with answers and corresponding data to a specific problem, with an algorithm narrowing down the rules of the problem. These rules are later used for making predictions and they are evaluated or validated by testing their accuracy over new (unseen) data.

To that end, machine learning is an area of AI that is concerned with algorithms that learn from data by performing some form of inductive learning. In simple terms, ML prediction could be described as a function2 from a set of inputs $i_1, i_2, \ldots, i_n$, to forecast an unknown value $y$, as follows: $f(w_1 \ast i_1, w_2 \ast i_2, \ldots, w_n \ast i_n) = y$, where $w_t$ is the weight of input $i_t$.

2 Note that such a function is also found in regression techniques where the weights/coefficients are unknown. In ML, it is usually the case where both the function and its weights are unknown and are determined using various search techniques and algorithms.

Different types or approaches of ML are used for different types of problems. Some of the most popular are supervised learning, unsupervised learning, and reinforcement learning:

• In supervised learning, we begin by observing and recording both inputs (the i’s) and outputs (the y’s) of a system, for a given period of time. This data (collection of correct examples of inputs and their corresponding outputs) is then analysed to derive the rules that underlie the dynamics of the observed system, i.e., the rules that map a given input to its correct output.
• Unlike the above, in unsupervised learning, the correct examples or outputs from a given system are not available. The task of the algorithm is to discover (previously unnoticed) patterns in the input data.
• In reinforcement learning, an algorithm (usually referred to as an agent) is designed to take a series of actions that maximise its cumulative payoff or rewards over time. The agent then builds a policy (a map of action selection rules) that returns a probability of taking a given action under different conditions of the problem.

For a thorough introduction to the fundamentals of machine learning and the popular ML algorithms see Bonaccorso (2017). The majority of AI applications in sports are based on one or more of the above approaches to ML. In fact, in most predictive modelling applications, the nature of the output y that needs to be predicted or analysed could influence the architecture of the learning algorithm.

Explaining the details of how different ML techniques work is outside the scope of this paper. However, to provide an insight into how such algorithms function in layman’s terms and the differences between them, we briefly present (hypothetical) supervised, unsupervised and reinforcement learning problems in the context of sports. These examples will assist the professionals but also applied researchers who work in sport to better understand the way that data scientists think so as to facilitate talking to them about their approach and methodology, without having to dive deep into the details of the underlying analytics.

Supervised Learning: Predicting Player Injury

Many sports injuries (e.g., muscle strain) can be effectively treated or prevented if one is able to detect them early or predict the likelihood of sustaining them. There could be many different (combinations of) reasons/actions leading to injuries like muscle strain. For example, in the Australian Football League, some of the hypotheses put forward for what leads to muscle strain include: muscle weakness and lack of flexibility, fatigue, inadequate warm-up, and poor lumbar posture (Brockett et al., 2004). Detecting the patterns that can lead to such injuries is extremely important both for the safety of the players, and for the success and competitiveness of the team.

In a supervised learning scenario, data about the players would be collected from previous seasons including details such as the number of overall matches and consecutive matches they played, total time played in each match, categorised by age, number of metres run, whether or not they warmed up before the match, how many times they were tackled by other players, and so on, but more importantly, whether or not the players ended up injured and missed their next match.

The last point is very important as it is the principal difference between supervised learning and other approaches: the outcome (whether or not the player was injured) is known in the historical data that was collected from previous seasons. This historical data is then fed (with the outcome) to a machine learning algorithm with the objective of learning the patterns (combination of factors) which led to an injury (and usually assigning a probability of the likelihood of an injury given these patterns). Once these patterns are learnt, the algorithm or model is then tested on new (unseen) data to see if it performs well and indeed predicts/explains injury at a high level of accuracy (e.g., 70% of the time). If the accuracy of the model is not as required, the model is tuned (or trained with slightly different parameters) until it reaches the desired or acceptable accuracy. Note here that we did not single out a specific algorithm or technique to achieve the above. Indeed, this approach can be applied using many different ML algorithms such as Neural Networks, Decision Trees and regression models.

Unsupervised Learning: Fan Segmentation

We will use a sport business example to introduce the unsupervised learning approach. Most sports organisations keep track of historical data about their patrons who attended their sporting events, recording characteristics such as their gender, postcode, age, nationality, education, income, marital status, etc. A natural question of interest here is to understand if the different segments of customers/patrons will purchase different categories (e.g., price, duration, class etc.) of tickets.

Some AI algorithms are designed to help split the available data, so that each data point (historical ticket sale) sits in a group/class that is similar to the other data points (other sales)
in that same class given the recorded features. The algorithm will then use some sort of a similarity or distance metric to classify the patrons according to the category of tickets that they might purchase.

This is different from how supervised learning algorithms, like those discussed in the previous section, work. As we described before, in supervised learning we instruct the algorithm with the outcome in advance while training it (i.e., we classify/label each observation based on the outcome: injury or no injury, cheap or expensive seats, ...). In the unsupervised learning approach, there is no such labelling or classification of existing historical data. It is the mission of the unsupervised learning algorithm to discover (previously unnoticed) patterns in the input data and group it into (two or more) classes.

Imagine the following use case where an Australian Football League club aims to identify a highly profitable customer segment within its entire set of stadium attendees, with the aim to enhance its marketing operations. Mathematical models can be used to discover (segments of) similar customers based on variations in some customer attributes within and across each segment. A popular unsupervised learning algorithm to achieve such a goal is the K-means clustering algorithm which finds the class labels from the data. This is done by iteratively assigning the data points (e.g., customers) from the input into a group/class based on the characteristics of this input. The essence is that the groups or classes to which the data points are assigned are not defined prior to exploring the input data (although the number of groups or segments can be pre-defined) but are rather dynamically formed as the K-means algorithm iterates over the data points. In the context of customer segmentation, when presenting the mathematical model (K-means algorithm) with customer data, there is no requirement to label a portion (or any) of this data into groups in advance in order to train the model as usually done in supervised models.

Reinforcement Learning: Simulations and Fantasy Sports

As mentioned before, in reinforcement learning, an algorithm (such as Q-learning and SARSA algorithms) learns how to complete a series of tasks (i.e., solve a problem) by interacting with an (artificial) environment that was designed to simulate the real environment/problem at hand. Unlike the case with supervised learning, the algorithm is not explicitly instructed about the right/accurate action in different states/conditions of the environment (or steps of the problem it is trying to solve). But rather it incrementally learns such a protocol through reward maximisation.

In simple terms, reinforcement learning approaches represent problems using what are referred to as: an agent (a software algorithm), and a table of states and actions. When the agent executes an action, it transitions from one state to another and it receives a reward or a penalty (a positive or negative numerical score respectively) as a result. The reward/penalty associated with the action-state combination is then stored in the agent’s table for future reference and refinement. The agent’s goal is to take the action that maximises its reward. When the agent is still unaware of the expected rewards from executing a given action when at a given state, it takes a random action and updates its table following that action. After many (thousands of) iterations over the problem space, the agent’s table holds (a weighted sum of) the expected values of the rewards of all future actions starting from the initial state.

Reinforcement learning has been applied to improve the selection of team formations in fantasy sports (Matthews et al., 2012). Likewise, the use of reinforcement learning is prominent in online AI bots and simulators for games like chess, checkers, Go, poker, StarCraft, etc.

Finally, it is important to also note the existence of genetic or evolutionary algorithms, sometimes referred to as nature/bio-inspired algorithms. While such algorithms are not typically considered to be ML algorithms (but rather search techniques and heuristics), they are very popular in solving similar types of problems tackled by ML algorithms. In short, the idea behind such algorithms is to run (parallel) search, selection and mutation techniques, by going over possible candidate solutions of a problem. The solutions are gradually optimised until reaching a local (sub-optimal) or global maximum (optimal solution). To provide a high-level understanding of evolutionary algorithms, consider the following sequence of steps:

• We start by creating (a population of) initial candidate or random strategies/solutions to the problem at hand.
• We assess these candidate solutions (using a fitness function) and assign scores to each according to how well they solve the problem at hand.
• We then pick a selection of these candidate solutions that performed best at stage two above. We then combine (crossbreed) these together to generate (breed) new solutions (e.g., take some attributes from one candidate solution and others from another candidate solution in order to come up with a new solution).
• We then apply random changes (mutations) to the resulting solutions from the previous step.
• We repeat the solution combination/crossbreeding process until a satisfactory solution is reached.

Evolutionary algorithms can be used as alternative means for training machine learning algorithms such as reinforcement learning algorithms and deep neural networks.

THE FUTURE OF AI IN SPORT

There is no doubt that AI will continue to transform sports, and the ways in which we play, watch and analyse sports will be innovative and unexpected. In fact, machine learning has drastically changed the way we think about match strategies, player performance analytics but also how we track, identify and learn about sport consumers. A Pandora’s box of ethical issues is emerging and will increasingly need to be considered when machines invade the traditionally human centred and naturally talented athlete base of sport. It is unlikely that AI will completely replace coaches and human experts, but there is no doubt that leveraging the power of AI will provide coaches and players with
a big advantage and lead over those who only rely on human expertise. It will also provide sport business managers with deeper, real-time insights into the behaviours, needs and wants of sport consumers and in turn AI will become a main producer of sport content that is personalised and custom made for individual consumers. But human direction and intervention seem to be, at least in the near future, still essential in working towards elite sport performance and strategic decision making in sport business.
The sporting performance on the field is often produced as an entertainment spectacle, where the sporting context is the platform for generating the business of sport. Replacing referees with automated AI is clearly possible and increasingly adopted in various sports, because it is more accurate and efficient, but is it what the fans want?
What might the future of sport with increasingly integrated AI look like? Currently, most of the research in AI and sports is specialised; that is, it aims to provide performance or business solutions and to solve specific on- and off-field problems. For instance, scientists have successfully devised solutions to tackle problems like player performance measurement, and quantifying the effect of a player/team on demand for gate attendance. Nevertheless, our research has not (yet) identified studies that provide a 360-degree analysis of, for example, the absolute value of an athlete by taking into account all the dimensions of his or her performance and their bearing on how much business can be developed, for example in regard to ticket sales or endorsement deals.
One of the main challenges in achieving such a comprehensive analysis is that data about players and teams, and commercial data such as ticket sales and attendance numbers, are kept proprietary and are not made public to avoid providing other parties with competitive information. Moreover, privacy is an important consideration as well. Regulations about data privacy and leakage of personal identification details must be put in place to govern the use and sharing of sports (performance and consumption) data. Data ownership, protection, security, privacy and access will all drive the need for comprehensive and tight legislation and regulation that will strongly influence the speed and comprehensiveness of the adoption of AI in sport.
To that end, it is worth considering privacy and confidentiality implications independently when studying the leagues’ journey of AI adoption compared to that of individual teams and ultimately the individual players. Eventually, the successful adoption of AI in a sports league will likely depend on the willingness of the teams in that league and their players to share proprietary data or insights with other teams in the league. Performance data of players in particular is becoming a hot topic of dispute. It may well be AI that will determine the bargaining power of players and their agents in regard to the value of their contracts. As an extension of this it will then also be AI providing the information that will determine if players are achieving the performance objectives set by coaches and as agreed to in contracts. In other words, confidentiality and ownership of league, team or player level data will become an increasing bone of legal contention and this will be reflected in the complexity of contractual agreements and possible disputes in the change rooms and on the field of play. Being in control of which data can or cannot, and will or will not, be used is at stake.
From an economic perspective, relying on AI algorithms could increase the revenue of sports organisations and event organisers by enabling them to apply efficient variable and dynamic pricing strategies and to build comprehensive and deep consumer knowledge platforms. Different types of ML algorithms can be adopted to deliver more effective customer marketing via personalisation and to increase sales funnel conversion rates.
Finally, for a window on the future of data privacy, it might be useful to return to baseball where the addiction to big data started its spread across the high-performance sport industry. Hattery (2017, p. 282) explains that in baseball “using advanced data collection systems . . . the MLB teams compete to create the most precise injury prediction models possible in order to protect and optimise the use of their player-assets. While this technology has the potential to offer tremendous value to both team and player, it comes with a potential conflict of interest. Players’ goals are not always congruent with those of the organisation: the player strives to protect his own career while the team is attempting to capitalise on the value of an asset. For this reason, the player has an interest in accessing data that analyses his potential injury risk. This highlights a greater problem in big data: what rights will individuals possess regarding their own data points?”
This privacy issue can be further extended to the sport business space. Dezfouli et al. (2020) have shown how AI can be designed to manipulate human behaviour. Algorithms learned from the responses of humans who were participating in controlled experiments. The algorithms identified and targeted vulnerabilities in human decision-making. The AI succeeded in steering participants towards executing particular actions. So, will AI one day be shaping the spending behaviour of sports fans by exploiting their fan-infused emotional vulnerabilities and monitoring their (for example) gambling inclinations? Will AI sacrifice the health of some athletes in favour of the bigger team winning the premiership? Or is this already happening? Time will tell.
DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.
AUTHOR CONTRIBUTIONS
NC and HW made major contributions to the writing of this manuscript. NC contributed to the writing of the parts around artificial intelligence and machine learning and provided examples of these. HW shaped the scope of the manuscript and wrote and edited many of its sections, particularly the introduction and the discussion. Both authors contributed to the article and approved the submitted version.
Frontiers in Sports and Active Living | www.frontiersin.org 6 December 2021 | Volume 3 | Article 682287
REFERENCES
Anderson, J. A. (1995). An introduction to Neural Networks. Cambridge, MA: MIT Press.
Atiković, A., Kamenjašević, E., Nožinović, M. A., Užičanin, E., Tabaković, M., and Curić, M. (2020). Differences between all-around results in women’s artistic gymnastics and ways of minimizing them. Balt. J. Health Phys. Act. 12, 80–91. doi: 10.29359/BJHPA.12.3.08
Bartlett, R. (2006). Artificial intelligence in sports biomechanics: new dawn or false hope? J. Sports Sci. Med. 5, 474–479.
Beal, R., Norman, T. J., and Ramchurn, S. D. (2019). Artificial intelligence for team sports: a survey. Knowl. Eng. Rev. 34. doi: 10.1017/S0269888919000225
Berri, D. J., Schmidt, M. B., and Brook, S. L. (2004). Stars at the gate: the impact of star power on NBA gate revenues. J. Sports Econom. 5, 33–50. doi: 10.1177/1527002503254051
Bonaccorso, G. (2017). Machine Learning Algorithms. Birmingham: Packt Publishing Ltd.
Brandes, L., Franck, E., and Nuesch, S. (2008). Local heroes and superstars: an empirical analysis of star attraction in German soccer. J. Sports Econom. 9, 266–286. doi: 10.1177/1527002507302026
Brockett, C. L., Morgan, D. L., and Proske, U. W. E. (2004). Predicting hamstring strain injury in elite athletes. Med. Sci. Sports Exerc. 36, 379–387. doi: 10.1249/01.MSS.0000117165.75832.05
Brown, N., and Sandholm, T. (2019). Superhuman AI for multiplayer poker. Science 365, 885–890. doi: 10.1126/science.aay2400
Campbell, M., Hoane Jr, A. J., and Hsu, F. H. (2002). Deep Blue. Artif. Intell. 134, 57–83. doi: 10.1016/S0004-3702(01)00129-1
Chmait, N. (2017). Understanding and measuring collective intelligence across different cognitive systems: an information-theoretic approach. in IJCAI (Melbourne), 5171–5172.
Chmait, N., Dowe, D. L., Li, Y. F., Green, D. G., and Insa-Cabrera, J. (2016). Factors of collective intelligence: how smart are agent collectives? in Proceedings of the Twenty-second European Conference on Artificial Intelligence (Prague), 542–550.
Chmait, N., Robertson, S., Westerbeek, H., Eime, R., Sellitto, C., and Reid, M. (2020a). Tennis superstars: the relationship between star status and demand for tickets. Sport Manag. Rev. 23, 330–347. doi: 10.1016/j.smr.2019.03.006
Chmait, N., Westerbeek, H., Eime, R., Robertson, S., Sellitto, C., and Reid, M. (2020b). Tennis influencers: the player effect on social media engagement and demand for tournament attendance. Telemat Inform. 50:101381. doi: 10.1016/j.tele.2020.101381
Coates, D., and Humphreys, B. R. (2012). Game attendance and outcome uncertainty in the National Hockey League. J. Sports Econom. 13, 364–377. doi: 10.1177/1527002512450260
Dezfouli, A., Nock, R., and Dayan, P. (2020). Adversarial vulnerabilities of human decision-making. Proc. Nat. Acad. Sci. U.S.A. 117, 29221–29228. doi: 10.1073/pnas.2016921117
Fister Jr, I., Ljubič, K., Suganthan, P. N., Perc, M., and Fister, I. (2015). Computational intelligence in sports: challenges and opportunities within a new research domain. Appl. Math. Comput. 262, 178–186. doi: 10.1016/j.amc.2015.04.004
Ghosh, J. K., Delampady, M., and Samanta, T. (2007). An Introduction to Bayesian Analysis: Theory and Methods. Berlin: Springer Science and Business Media.
Hakes, J. K., and Sauer, R. D. (2006). An economic evaluation of the Moneyball hypothesis. J. Econ. Perspect. 20, 173–186. doi: 10.1257/jep.20.3.173
Hattery, M. (2017). Major League Baseball players, big data, and the right to know: the duty of Major League Baseball teams to disclose health modelling analysis to their players. Marquette Sports Law Rev. 28, 257–283. https://scholarship.law.marquette.edu/sportslaw/vol28/iss1/9/
Hernández-Orallo, J., Baroni, M., Bieger, J., Chmait, N., Dowe, D. L., Hofmann, K., et al. (2017). A new AI evaluation cosmos: ready to play the game? AI Mag. 38, 66–69. doi: 10.1609/aimag.v38i3.2748
Jane, W.-J. (2016). The effect of star quality on attendance demand: the case of the National Basketball Association. J. Sports Econom. 17, 396–417. doi: 10.1177/1527002514530405
Jewell, R. T. (2017). The effect of marquee players on sports demand: the case of US Major League Soccer. J. Sports Econom. 18, 239–252. doi: 10.1177/1527002514567922
Konjer, M., Meier, H. E., and Wedeking, K. (2017). Consumer demand for telecasts of tennis matches in Germany. J. Sports Econom. 18, 351–375. doi: 10.1177/1527002515577882
Krause, L. (2019). Exploring the influence of practice design on the development of tennis players (Doctoral dissertation). Victoria University, Footscray, VIC, Australia.
Lapham, A. C., and Bartlett, R. M. (1995). The use of artificial intelligence in the analysis of sports performance: a review of applications in human gait analysis and future directions for sports biomechanics. J. Sports Sci. 13, 229–237. doi: 10.1080/02640419508732232
Lenten, L. J. (2012). Comparing attendances and memberships in the Australian Football League: the case of Hawthorn. Econ Labour Relat. Rev. 23, 23–38. doi: 10.1177/103530461202300203
Lewis, M. (2004). Moneyball: The Art of Winning an Unfair Game. New York, NY: WW Norton and Company.
Lewis, M., and Yoon, Y. (2016). An empirical examination of the development and impact of star power in Major League Baseball. J. Sports Econom. 19, 155–187. doi: 10.1177/1527002515626220
Matthews, T., Ramchurn, S., and Chalkiadakis, G. (2012). Competing with humans at fantasy football: Team formation in large partially-observable domains. in Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 26 (Vancouver, BC), 1394–1400.
McCabe, A., and Trevathan, J. (2008). Artificial intelligence in sports prediction. in Fifth International Conference on Information Technology: New Generations (IEEE: Las Vegas, NV), 1194–1197. doi: 10.1109/ITNG.2008.203
Nadikattu, R. R. (2020). Implementation of new ways of artificial intelligence in sports. J. Xidian Univ. 14, 5983–5997. doi: 10.2139/ssrn.3620017
Novatchkov, H., and Baca, A. (2013). Artificial intelligence in sports on the example of weight training. J. Sports Sci. Med. 12, 27–37.
Ormiston, R. (2014). Attendance effects of star pitchers in major league baseball. J. Sports Econom. 15, 338–364. doi: 10.1177/1527002512461155
Paton, D., and Cooke, A. (2005). Attendance at county cricket: an economic analysis. J. Sports Econom. 6, 24–45. doi: 10.1177/1527002503261487
Piccinotti, D. (2021). Open Loop Planning for Formula 1 Race Strategy Identification. Menlo Park, CA: Association for the Advancement of Artificial Intelligence.
Ratiu, O. G., Badau, D., Carstea, C. G., Badau, A., and Paraschiv, F. (2010). Artificial intelligence (AI) in sports, in Proceedings of the 9th WSEAS International Conference on Artificial Intelligence, Knowledge Engineering, and Data Bases (Cambridge, UK), 93–97.
Risi, S., and Preuss, M. (2020). From chess and Atari to StarCraft and beyond: how game AI is driving the world of AI. KI-Künstliche Intell. 34, 7–17. doi: 10.1007/s13218-020-00647-w
Schaeffer, J., Burch, N., Björnsson, Y., Kishimoto, A., Müller, M., Lake, R., et al. (2007). Checkers is solved. Science 317, 1518–1522. doi: 10.1126/science.1144079
Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., et al. (2017). Mastering the game of Go without human knowledge. Nature 550, 354–359. doi: 10.1038/nature24270
Swartz, T. B. (2020). Where should I publish my sports paper? Am. Stat. 74, 103–108. doi: 10.1080/00031305.2018.1459842
Vinyals, O., Babuschkin, I., Czarnecki, W. M., Mathieu, M., Dudzik, A., Chung, J., et al. (2019). Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature 575, 350–354. doi: 10.1038/s41586-019-1724-z
Whitaker, G. A., Silva, R., Edwards, D., and Kosmidis, I. (2021). A Bayesian approach for determining player abilities in football. J. R. Stat. Soc. Series C 70, 174–201. doi: 10.1111/rssc.12454
Yang, T. Y., and Swartz, T. (2004). A two-stage Bayesian model for predicting winners in major league baseball. J. Data Sci. 2, 61–73. doi: 10.6339/JDS.2004.02(1).142
Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s Note: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Copyright © 2021 Chmait and Westerbeek. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.