Mini Project Documentation
Ch.0 ABSTRACT 09
Ch.1 INTRODUCTION 10-12
1.1 Problem statement 10
1.2 Objectives 10
1.3 Motivations 11
1.4 Existing systems 11
1.5 Proposed system 11
1.6 Scope 12
Ch.2 LITERATURE SURVEY 13
Ch.3 SYSTEM REQUIREMENT SPECIFICATION 14
3.1 Hardware Requirements
3.2 Software Requirements
Ch.4 ARCHITECTURE OF PROPOSED SYSTEM 15
Ch.5 IMPLEMENTATION 16-29
5.1 Algorithm 16
5.2 Required Modules/Libraries/Framework 17
5.3 Installation 20
5.4 Datasets 28
Ch.6 APPLICATIONS OVERVIEW AND RESULTS 30-42
6.1 Home 30
6.2 Exploratory Data Analysis 31
6.3 Data Preprocessing 34
6.4 Trends 36
6.5 Prediction 41
Ch.7 CONCLUSIONS AND FUTURE SCOPE 43
Ch.8 REFERENCES 44
APPENDIX 45-51
Ch.0 Abstract
The Olympics is one of the world's leading sporting events, and this project revolves around performing careful data analytics on the data collected from it. For this objective, two datasets containing information about the various events and the participating athletes have been analyzed. This project finds its base in the descriptive and predictive forms of analytics.
Ch.1 Introduction
The modern Olympic Games, or Olympics, are the leading international sporting events, featuring summer and winter sports competitions in which thousands of athletes from around the world take part in a variety of competitions. The 'modern Olympics' comprises all the Games from Athens 1896 to Rio 2016. The Olympic Games are considered the world's foremost sports competition, with more than 200 nations participating.
1.2 Objectives
The objective of our analysis is to answer the following questions, though it is not limited to these:
1. How does the weight of an athlete depend on his/her height?
2. What is the total number of medals won by male and female athletes?
3. Determine the participation trend in the Summer and Winter seasons.
4. Which countries have the most medals?
5. Name the athletes with the most medals.
6. Determine the countries winning the most gold medals in a specific year.
7. Predict the weight of an athlete, given the height.
8. Predict the sport a person is apt for, depending on his/her BMI value.
9. Analyze women's participation over the years.
1.3 Motivation
The motivation for this project lies in our curiosity about the field of Data Analytics. The previously documented analyses of Olympics data have also continuously motivated us to perform better.
1.6 Scope
• This project is an interactive application that helps users learn more about the Olympics.
• It is especially useful to the authorities managing the Olympics.
• It can lay the foundation for more data analytics applications that present results in an easy, user-friendly way.
Ch.2 Literature Survey
1)
Ø Title : Performance Analysis in Olympic Games using Exploratory Data
Analysis Techniques
2)
Ø Title : 120 years of Olympic Games — How to analyze and visualize the
history with R
3)
Ø Title : Analyzing Evolution of the Olympics by Exploratory Data Analysis
using R
Ø Author(s): Rahul Pradhan, Karthik Agrawal, Anubhav Bag
Ø Year: March 2021
Ø Observations: Neat presentation, proper vision, excellent findings and
documentation.
Ø Limitations: Lack of Interface, unappealing Visualization.
Ch.3 System Requirement Specification
3.1 Hardware requirements:
Ø RAM: 1 GB
3.2 Software requirements:
Ø Python with the libraries listed in Section 5.2 (NumPy, Pandas, Matplotlib, Seaborn, Plotly, Scikit-learn)
Ø Streamlit framework
Ø Anaconda Navigator
Ch.4 Architecture of Proposed system
Ch.5 Implementation
5.1 Algorithm
Linear regression
o Linear Regression is a machine learning algorithm based on supervised
learning. It performs a regression task.
o Regression models a target prediction value based on independent variables.
o It is mostly used for finding out the relationship between variables and
forecasting.
o Different regression models differ based on the kind of relationship between the dependent and independent variables they consider, and the number of independent variables being used.
o Linear regression performs the task of predicting a dependent variable value (y) based on a given independent variable (x).
o So, this regression technique finds a linear relationship between x (input) and y (output); hence the name Linear Regression.
o For example, x (input) could be the work experience and y (output) the salary of a person.
o The regression line is the best-fit line for our model.
Hypothesis function for Linear Regression:
y_pred = θ1 + θ2 · x
When training the model, it fits the best line to predict the value of y for a given value of x. The model gets the best regression fit line by finding the best θ1 and θ2 values.
θ1: intercept
θ2: coefficient of x
Once we find the best θ1 and θ2 values, we get the best-fit line. So when we are finally using our model for prediction, it will predict the value of y for the input value of x.
The cost function (J) of Linear Regression is the Root Mean Squared Error (RMSE) between the predicted y value (pred) and the true y value (y):
J = sqrt( (1/n) · Σ (pred_i − y_i)² )
Gradient Descent:
To update the θ1 and θ2 values in order to reduce the cost function (minimizing the RMSE value) and achieve the best-fit line, the model uses Gradient Descent. The idea is to start with random θ1 and θ2 values and then iteratively update them, reaching the minimum cost, as sketched below.
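To make the update rule concrete, here is a minimal sketch (not from the project code) of gradient descent for simple linear regression on synthetic data; the names theta1, theta2 and lr are illustrative placeholders for the intercept, the coefficient of x and the learning rate.

import numpy as np

# Synthetic data: y is roughly linear in x
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 3.0 + 2.5 * x + rng.normal(0, 1, size=200)

theta1, theta2 = 0.0, 0.0   # intercept and coefficient, started at arbitrary values
lr = 0.01                   # learning rate
for _ in range(5000):
    y_pred = theta1 + theta2 * x
    error = y_pred - y
    # Gradients of the mean squared error with respect to each parameter
    theta1 -= lr * 2 * error.mean()
    theta2 -= lr * 2 * (error * x).mean()

rmse = np.sqrt(((theta1 + theta2 * x - y) ** 2).mean())
print(theta1, theta2, rmse)   # theta1 and theta2 should approach 3.0 and 2.5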
5.2 Required Modules/Libraries/Framework:
Numpy:
• Arrays of NumPy offer modern mathematical implementations on huge amounts of data.
• NumPy makes the execution of such projects much easier and hassle-free.
• When you change the shape of an N-dimensional array, NumPy creates a new array and deletes the old one.
• This Python package provides useful tools for integration; you can easily integrate NumPy code with languages such as C, C++, and Fortran.
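As a small, hedged illustration (not taken from the project code), the snippet below shows the kind of array reshaping on which the prediction code in the Appendix relies when feeding a single feature column to scikit-learn.

import numpy as np

heights = np.array([170.0, 182.0, 165.0, 190.0])

# scikit-learn expects a 2-D array of shape (n_samples, n_features),
# so a flat column is reshaped into a single-feature matrix.
X = heights.reshape(-1, 1)
print(X.shape)   # (4, 1)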
Pandas:
• Pandas provides Series and DataFrames, which allow you to easily organize, explore, represent, and manipulate data.
• Pandas has special features that allow you to handle missing data or values with a proper measure.
• This package offers such clean code that even people with little or no programming knowledge can easily work with it.
• It provides a collection of built-in tools that allow you to both read and write data across different web services, data structures, and databases.
• Pandas supports JSON, Excel, CSV, HDF5, and many other formats; in fact, you can merge data from different sources with Pandas.
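A minimal sketch of the Pandas operations this project leans on: reading a CSV, counting missing values, and filling them with the column mean. The file name matches the project's dataset, but the local path is an assumption.

import pandas as pd

# Assumed path; adjust to wherever the CSV actually lives.
df = pd.read_csv("athlete_events.csv")

print(df.shape)            # (number of rows, number of columns)
print(df.isnull().sum())   # missing values per column

# Fill numeric gaps with the column mean, as done on the Data Preprocessing page.
df["Age"] = df["Age"].fillna(df["Age"].mean())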
Streamlit:
• It is a Python-based library specifically designed for machine learning engineers.
• Data scientists and machine learning engineers are typically not web developers, and they are not interested in spending weeks learning to use web frameworks to build apps.
• Instead, they want a tool that is easier to learn and to use, as long as it can display data and collect the parameters needed for modeling.
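For context, a minimal Streamlit script looks like the sketch below (saved as app.py and run with streamlit run app.py); the widgets and labels here are generic examples, not the project's exact layout.

import streamlit as st
import pandas as pd

st.title("Olympics Data App")

# Sidebar navigation, similar in spirit to the nav variable used in the Appendix.
nav = st.sidebar.radio("Go to", ["Home", "Exploratory Data Analysis"])

if nav == "Home":
    st.write("Welcome! Load a dataset to begin.")
elif nav == "Exploratory Data Analysis":
    df = pd.DataFrame({"Height": [170, 182], "Weight": [65, 80]})
    if st.checkbox("Show sample data"):
        st.dataframe(df)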
Matplotlib:
• Matplotlib is Python's foundational 2-D plotting library; its goal is to make easy things easy and hard things possible.
Plotly:
• Python Plotly Library is an open-source library that can be used for data
visualization and understanding data simply and easily.
• Plotly supports various types of plots like line charts, scatter plots, histograms, box plots, etc.
• Plotly has hover-tool capabilities that allow us to detect any outliers or anomalies in a large number of data points.
• It is visually attractive and can be accepted by a wide range of audiences.
• It allows endless customization of our graphs, which makes our plots more meaningful and understandable to others.
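As an illustration of the plot types mentioned above, here is a small Plotly Express histogram on dummy data (the values are placeholders, not the project's dataset); inside Streamlit the figure would be passed to st.write or st.plotly_chart instead of fig.show().

import pandas as pd
import plotly.express as px

df = pd.DataFrame({
    "Season": ["Summer", "Summer", "Winter", "Summer", "Winter"],
    "Sex": ["M", "F", "M", "M", "F"],
})

# Grouped histogram of participation by season, split by sex.
fig = px.histogram(df, x="Season", color="Sex", barmode="group")
fig.show()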
Seaborn:
• Seaborn is an amazing visualization library for statistical graphics plotting in Python.
• It provides beautiful default styles and color palettes to make statistical plots more attractive.
• It is built on top of the matplotlib library and is also closely integrated with the data structures from pandas.
• Seaborn aims to make visualization a central part of exploring and understanding data.
• It provides dataset-oriented APIs, so that we can switch between different visual representations of the same variables for a better understanding of the dataset.
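A short Seaborn sketch in the same spirit as the bar plot used for the gold-medal query; the country names and counts are dummy values.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

medals = pd.Series({"Country A": 12, "Country B": 9, "Country C": 5})

# Horizontal bar plot: counts on the x-axis, country names on the y-axis.
sns.barplot(x=medals.values, y=medals.index)
plt.xlabel("Gold medals")
plt.show()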
Scikit-Learn:
• This module is a simple and efficient tool for predictive data analysis.
• It also provides model-selection utilities such as train_test_split, which this project uses to hold out test data (see the sketch below).
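A hedged sketch of the scikit-learn workflow the Prediction page follows: split the data with train_test_split, fit a LinearRegression model, and predict for a new input. The height/weight values here are synthetic.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic height (cm) and weight (kg) data, roughly linear.
rng = np.random.default_rng(1)
height = rng.uniform(150, 200, size=300).reshape(-1, 1)
weight = 0.9 * height.ravel() - 90 + rng.normal(0, 5, size=300)

X_train, X_test, y_train, y_test = train_test_split(height, weight, test_size=0.2)

model = LinearRegression()
model.fit(X_train, y_train)
print(model.score(X_test, y_test))         # R^2 on held-out data
print(model.predict(np.array([[180.0]])))  # predicted weight for a 180 cm athlete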
5.3 Installation
1. Anaconda Navigator:
The installation steps for Anaconda Navigator are:
§ Anaconda is open-source software that bundles Jupyter, Spyder, and other tools used for large-scale data processing, data analytics, and heavy scientific computing.
§ Anaconda works with the R and Python programming languages. Spyder (a sub-application of Anaconda) is used for Python.
§ OpenCV for Python works in Spyder. Package versions are managed by the package management system called conda.
§ To begin working with Anaconda, one must install it first.
§ Follow the instructions below to download and install Anaconda on your system:
Download and install Anaconda:
Head over to anaconda.com and install the latest version of Anaconda.
Make sure to download the "Python 3.7 Version" for the appropriate architecture.
Ø Select Installation Type: Select Just Me if you want the software
to be used by a single User
Ø Getting through the Installation Process:
• Working with Anaconda:
Once the installation process is done, Anaconda can be used to
perform multiple operations. To begin using Anaconda, search for
Anaconda Navigator from the Start Menu in Windows
• Go to Environments
• Click on create at the bottom, and name the environment as shown
• Click "Open Terminal" to install any modules or packages.
3. Download matplotlib:
• To download Matplotlib, click 'Open Terminal' as shown above, and enter the code: pip install matplotlib
4. Download seaborn:
• To download Seaborn click on 'Open Terminal' as shown above, and
enter the code: pip install seaborn
• The library will be successfully installed.
5. Download sklearn:
• To download scikit-learn, click 'Open Terminal' as shown above, and enter the code: pip install scikit-learn
6. Download Streamlit:
• To download streamlit click on 'Open Terminal' as shown above, and
enter the code: pip install streamlit
• The framework will be successfully installed.
• Similarly, Plotly can be installed using: pip install plotly
5.4 Datasets
• The datasets are imported in CSV file format.
• Two datasets are imported: one is the athlete_events CSV file, the other the athlete_BMI file.
• Athlete Events dataset:
• It has been imported as a CSV file from the source https://www.kaggle.com/datasets/heesoo37/120-years-of-olympic-history-athletes-and-results
• It contains a total of 277116 row tuples, mapped over 15 attributes or columns.
• Each row corresponds to an individual athlete competing in an individual
Olympic event (athlete-events). The columns are:
ID - Unique number for each athlete
Name - Athlete's name
Sex - M or F
Age - Integer
Height - In centimeters
Weight - In kilograms
Team - Team name
NOC - National Olympic Committee 3-letter code
Games - Year and season
Year - Integer
Season - Summer or Winter
City - Host city
Sport - Sport
Event - Event
Medal - Gold, Silver, Bronze, or NA
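A minimal sketch of loading and inspecting this dataset with Pandas (the file name matches the Kaggle download; the local path is an assumption).

import pandas as pd

athletes = pd.read_csv("athlete_events.csv")

print(athletes.columns.tolist())                      # the 15 attributes listed above
print(athletes["Season"].value_counts())              # rows per Summer/Winter Games
print(athletes["Medal"].value_counts(dropna=False))   # Gold/Silver/Bronze/NaN counts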
Ch.6 Applications Overview and Results
6.1 Home:
• The Home page of the data application contains a basic introduction to the Olympics application, along with its logo of five connected circles.
• The Home page also contains a navigation bar that links all five pages.
• Moreover, it helps identify the current page and navigate to the other pages.
2. Load Athlete BMI dataset -
• Body mass index (BMI) is a value derived from the mass (weight) and height of a person.
• The BMI is defined as the body mass divided by the square of the body height, and is expressed in units of kg/m², resulting from mass in kilograms and height in metres.
• BMI = weight in kg / (height in m)²
• These BMI values of athletes, and the corresponding sports, are used to predict the apt sport for a test case.
• The concerned dataset has been given in the Appendix.
• On click, this button shows a flowchart that guides users to the Prediction page.
• Note that this dataset doesn't require preprocessing.
• The result is shown below:
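As a quick, standalone illustration of the formula above (not the project's prediction code):

def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight divided by the square of height, in kg/m^2."""
    return weight_kg / (height_m ** 2)

print(round(bmi(72.0, 1.80), 1))   # 22.2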
6.2 Exploratory Data Analysis:
• The next two pages, Data Preprocessing and Trends, are also part of Exploratory Data Analysis.
• For simplicity, assume this page just explores the Athlete Events dataset and provides statistical information when asked by the user.
• This page contains five exploration checkboxes, which the user can click to see the results.
• Every checkbox has a query embedded into it, which makes it more appealing to the user.
• Here is a screenshot of the five checkboxes considered:
• The first checkbox, Show Dataset, when clicked, presents the user with the entire dataset with the help of a scroll bar.
• The second checkbox, First 5 values of the dataset, returns the head of the dataset. For this purpose df.head() has been used.
• The third checkbox, Get the total number of Rows and Columns, uses the shape attribute to return the total number of data tuples and attributes.
• The fourth checkbox, Show the Statistical Information of the Columns/Attributes, returns summary statistics via df.describe().
• The fifth and final checkbox, Null values in Columns, gives the total number of NaN values in each column of the dataset.
• Again, note that these operations are performed only on the Athlete Events dataset; the BMI dataset has been used only for prediction.
• The code for this has been given in the Appendix.
6.3 Data Preprocessing:
• From these observations, it is clear that data cleaning should be performed.
• Data cleaning routines attempt to fill in missing values, smooth out noise while identifying outliers, and correct inconsistencies in the data.
• Here, its application is limited to filling in the NaN values and removing one attribute only.
• The various methods for handling the problem of missing values in data tuples include the following (see the sketch after this list):
A. Ignoring the tuple: This is usually done when the class label is missing
(assuming the mining task involves classification or description). This
method is not very effective unless the tuple contains several attributes
with missing values.
C. Using a global constant to fill in the missing value: Replace all missing
attribute values by the same constant, such as a label like “Unknown,”
or −∞. If missing values are replaced by, say, “Unknown,” then the
mining program may mistakenly think that they form an interesting
concept, since they all have a value in common — that of “Unknown.”
Hence, although this method is simple, it is not recommended.
D. Using the attribute mean for quantitative (numeric) values or the attribute mode for categorical (nominal) values can be a feasible approach.
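The approaches above map onto one-line Pandas operations; a hedged sketch follows (the column names match this project's dataset, the path is assumed).

import pandas as pd

df = pd.read_csv("athlete_events.csv")   # assumed path

# A. Ignore (drop) tuples that are missing the value of interest.
dropped = df.dropna(subset=["Medal"])

# C. Fill with a global constant such as "Unknown".
constant_filled = df["Medal"].fillna("Unknown")

# D. Fill numeric columns with the attribute mean.
df["Age"] = df["Age"].fillna(df["Age"].mean())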
• In this project, we considered replacing NaN values with the column mean for Age, Height and Weight, while NaN values in Medal are replaced by 0, which indicates no medal.
• Also, the Medal column has been converted to numeric form, with 1 representing Gold, 2 representing Silver and 3 representing Bronze, using the replace function.
• On this page a total of 7 queries have been answered. The first three checkboxes remove NaN values in the Age, Height and Weight columns and output the dataset with the new values.
• The fourth checkbox performs the operation discussed above for the Medal column.
• One can check the updated NaN/Null values by ticking the fifth checkbox, Updated Null Values.
• A further checkbox, Remove redundant column, drops the Games attribute.
• The polished dataset can be viewed with the aid of the last button, "Final Dataset".
6.4 Trends:
• Trends determine the relationships between attributes and are helpful in answering the questions presented in the Objectives section.
• The main purpose of this project is to find answers to those questions.
1) Consider the query "Analyze the relationship between the Height and Weights of an athlete"; for this purpose a simple scatter plot is used to check the correlation between these attributes.
Ø The scatter plot's code is given in the Appendix; the user is also presented with the option of selecting a Histogram to examine the relationship between the attributes. Consider these outputs.
Ø Result: it can be inferred that approximately 200k men and 75k women have participated in the Olympics so far.
Ø Result: The Summer Olympics see more participation than the Winter Olympics.
5) Consider "The total number of medals won by Male and female
athletes", which is found using a histogram from plotly express.
Ø Result : Men have won 9625 Gold, 9524 silver and 9381 Bronze, while
women have won 3747 gold, 3771 silver and 3735 bronze.
7) Consider the query "Countries with most medals"; the answer can be found using the get_dummies() function from Pandas (the code for which is given in the Appendix).
Ø Given below is the list of the ten countries with the most medals.
Ø This checkbox is also used to find whether a country is in the zero-medal list.
Ø A zero-medal list is basically the list of countries with zero Olympic medals.
Ø It asks the user to enter a country name and check the result. If the country is in the zero-medal list, it outputs "sorry to break it to u, your country is in the zero medal list".
Ø If the country is not in the zero-medal list, it outputs "your country has won atleast one medal so chill".
Ø If the country is not in the dataset, it outputs "country not listed in the data set so be optimistic about your country winning a medal".
8) Consider the query, "Countries winning the most gold medals in a specific
year."
Ø Here, an Olympic year is asked from the user, and then the year is fed
to the function, that outputs bar plot from the highest to lowest gold
winning countries.
Ø This bar plot is derived from barplot() of seaborn.
Ø Result: The countries winning the most gold medals in a specific year are determined. For example, in 1896, Germany won the most Gold medals.
Ø An Observations checkbox is provided to note down any results acquired.
Ø The code is given in the Appendix.
6.5 Prediction:
• In this part, simple predictions are made using the Linear Regression algorithm.
• The first prediction deals with finding the Weight value of an athlete given his/her Height.
• The other prediction deals with finding an apt sport for a person based on his/her BMI value.
• The first prediction is done by constructing a model based on the Heights and Weights of athletes from the Athlete Events dataset.
• The second is done by constructing a model that predicts a sport based on the BMI values and corresponding sports from the Athlete BMI dataset.
• The accuracy of the Height vs Weight model is 62 percent, while that of the BMI model is 89.87 percent.
• The Show Athlete BMI Dataset button displays the corresponding dataset.
Ch.7 Conclusions and Future Scope
7.1 Conclusions:
Ch.8 References
Ø http://www.researchgate.net/publication/330847008_Performance_analysis_in_olympic_games_using_exploratory_data_analysis_techniques
Ø https://www.researchgate.net/publication/265033380_Data_mining_of_sports_performance_data
Ø https://www.researchgate.net/publication/23756788_Economics_and_Olympics_An_Efficiency_Analysis
Ø https://ieeexplore.ieee.org/abstract/document/9725496
Ø https://docs.streamlit.io/
Appendix
import streamlit as st
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px
import seaborn as sns
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
import time
# 'nav' is assumed to be set earlier by a sidebar/navigation widget (not shown in the source).
if nav == 'Home':
    st.image('https://cdn.pixabay.com/photo/2013/02/15/10/58/blue-81847__340.jpg', width=800)
    st.write("The main theme of this app is to perform data analytics on Olympic datasets "
             "(namely Athlete Events and Athlete BMI datasets). The Athlete Events dataset "
             "contains historic data ranging from the Athens Olympics 1896 to Rio 2016, while "
             "the other dataset contains data of athletes with their BMI values and their "
             "concerned sport.")
    if st.button('Load Main/Athlete Events dataset'):
        st.write('Loaded successfully! Can perform analysis on it by following this flowchart')
        st.graphviz_chart("""
        digraph{
            Home -> ExploratoryDataAnalysis
            ExploratoryDataAnalysis -> DataPreprocessing
            DataPreprocessing -> Trends
            Trends -> Prediction
            DataPreprocessing -> Prediction
        }
        """)
    if st.button('Load Athlete BMI dataset'):
        st.write('Loaded successfully! Can perform predictions on it')
        st.graphviz_chart("""
        digraph{
            Home -> Prediction
        }
        """)

# Load the Athlete Events dataset (local path as given in the original source).
data = pd.read_csv(r"C:\Users\chand\Downloads\athlete_events.csv (1)\athlete_events.csv")
p = pd.DataFrame(data)
if nav == 'Exploratory Data Analysis':
    st.header('Exploratory Data Analysis')
    if st.checkbox("Show Dataset"):
        st.dataframe(p)
        st.write("Note that the Winter and Summer Games were held in the same year up until "
                 "1992. After that, they staggered them such that Winter Games occur on a "
                 "four year cycle starting with 1994, then Summer in 1996, then Winter in "
                 "1998, and so on. A common mistake people make when analyzing this data is "
                 "to assume that the Summer and Winter Games have always been staggered.")
    if st.checkbox("Show first 5 values of the Dataset"):
        st.dataframe(p.head())
    if st.checkbox("Get the total number of Rows and Columns"):
        st.write(p.shape)
    if st.checkbox("Show the Statistical Information Of the Columns/Attributes"):
        st.write(p.describe())
    if st.checkbox("Null Values in columns"):
        d = pd.DataFrame(p.isnull().sum()).transpose()
        st.write(d)
if nav == 'Data Preprocessing':
    st.header('Data Preprocessing')
    if st.checkbox("Remove Null Values in Age Column"):
        p['Age'] = p['Age'].fillna(p.Age.mean())
        st.dataframe(p)
    if st.checkbox("Remove Null Values in Height Column"):
        p['Height'] = p['Height'].fillna(p.Height.mean())
        st.dataframe(p)
    if st.checkbox("Remove Null Values in Weight Column"):
        p['Weight'] = p['Weight'].fillna(p.Weight.mean())
        st.dataframe(p)
    if st.checkbox('Convert Medals to Numeric datatype and Remove Null Values'):
        p['Medal'] = p.Medal.replace({'Gold': 1, 'Silver': 2, 'Bronze': 3})
        p['Medal'] = p['Medal'].fillna(0)
        st.write(p)
    if st.checkbox("Updated Null Values"):
        d = pd.DataFrame(p.isnull().sum()).transpose()
        st.write(d)
    if st.checkbox("Remove redundant column"):
        p = p.drop(['Games'], axis=1)
        st.write(p)
    if st.button("Final Dataset"):
        st.write(p)
if nav == 'Trends':
    st.header('Trends')
    if st.checkbox("Analyze the relationship between the Height and Weights of an athlete"):
        graph = st.selectbox("What kind of Plot do you want?", ['Scatter Plot', 'Histogram'])
        if graph == 'Histogram':
            fig = px.histogram(p, x=p.Height, color=p.Weight)
            st.write(fig)
        if graph == 'Scatter Plot':
            plt.scatter(p['Height'], p['Weight'])
            plt.xlabel('Height')
            plt.ylabel('Weight')
            plt.title('Height Vs Weight')
            st.set_option('deprecation.showPyplotGlobalUse', False)
            st.pyplot()
    if st.checkbox('Approximate Number of Males And Females Participated in the Olympics'):
        p['Sex'].value_counts().plot.bar()
        st.set_option('deprecation.showPyplotGlobalUse', False)
        plt.grid()
        st.pyplot()
    if st.checkbox("Determine the Participation trend in the Summer and Winter Seasons"):
        fig = px.histogram(p, x=p.Season, color=p.Sex, barmode="group")
        st.write(fig)
    if st.checkbox('Women Participation over the years'):
        # Histogram of participation years for female athletes only.
        y = p[p['Sex'] == 'F']
        fig = px.histogram(y, x='Year')
        st.write(fig)
    if st.checkbox('Number of Medals Won by M and F'):
        p['Medal'] = p.Medal.replace({'Gold': 1, 'Silver': 2, 'Bronze': 3})
        p['Medal'] = p['Medal'].fillna(0)
        fig = px.histogram(p, x=p.Sex, color=p.Medal)
        st.write(fig)
    if st.checkbox("Athletes with Most Medals"):
        p['Medal'] = p.Medal.replace({'Gold': 1, 'Silver': 2, 'Bronze': 3})
        p['Medal'] = p['Medal'].fillna(0)
        df = p[['Medal']]
        df = pd.get_dummies(df.Medal)
        df = df.drop([0], axis=1)
        df['Name'] = p['Name']
        df['Total'] = df[1] + df[2] + df[3]
        f = df.groupby(df['Name'])['Total'].sum().sort_values(ascending=False).head(10)
        x = pd.DataFrame(f)
        st.write(x)
        fig = plt.figure()
        ax = fig.add_axes([0, 0, 1, 1])
        ax.axis('equal')
        Team = list(f.index.values)
        Count_of_Medal = f
        ax.pie(Count_of_Medal, labels=Team, autopct='%1.2f%%')
        plt.show()
        st.pyplot()
if st.checkbox("Countries with Most Medals"):
p['Medal']= p.Medal.replace({'Gold':1,'Silver':2,'Bronze':3})
p['Medal']= p['Medal'].fillna(0)
df = p[['Medal']]
df = pd.get_dummies(df.Medal)
df = df.drop([0],axis = 1)
df['Team'] = p['Team']
df['Total'] = df[1]+df[2]+df[3]
f = df.groupby(df['Team'])['Total'].sum().sort_values(ascending = False)
k=f.head(10)
k = list(k.index.values)
k = pd.DataFrame(k)
st.write(k)
    # The checkbox for this query and the 'max_year' input (an Olympic year entered by
    # the user) are not shown in the source appendix; 'max_year' is assumed to be defined above.
    if max_year not in p.Year:
        st.write("Enter a valid year")
    else:
        if st.button('Show'):
            if max_year not in p.Year:
                st.write("Enter a valid year")
            else:
                team_list = p[(p.Year == max_year) & (p.Medal == 'Gold')].Team
                if len(team_list) != 0:
                    sns.barplot(x=team_list.value_counts().head(),
                                y=team_list.value_counts().head().index)
                    st.set_option('deprecation.showPyplotGlobalUse', False)
                    st.pyplot()
                else:
                    st.write('Enter valid year')
    if st.checkbox('Observations'):
        a = st.text_area("Observations")
        st.write(a)
if nav == 'Prediction':
    st.header('Prediction')
    st.subheader('Predict the Weight of an athlete with his/her Height')
    model = LinearRegression()
    p['Height'] = p['Height'].fillna(p.Height.mean())
    p['Weight'] = p['Weight'].fillna(p.Weight.mean())
    x = p['Height']
    x = np.array(x).reshape(-1, 1)
    y = p['Weight']
    y = np.array(y).reshape(-1, 1)
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
    model.fit(x_train, y_train)
    t = st.number_input('Enter the Height')
    t = np.array(t).reshape(-1, 1)
    d = model.predict(t)
    if st.button('Predict Weight'):
        st.write(d)

    st.subheader('Predict the suitable Sport with the BMI values')
    data = pd.read_csv(r"C:\Users\chand\Prediction.csv")
    k = pd.DataFrame(data)
    if st.checkbox("Show Athlete BMI Dataset"):
        st.dataframe(k)
        st.write("Note:")
        q = {"Value": ['1', '2', '3', '4'],
             "Corresponding Sport": ['Marathon', 'Basketball', 'Rugby', 'Shot Put']}
        m = pd.DataFrame(q)
        st.write(m)
    model1 = LinearRegression()
    j = k['BMI']
    j = np.array(j).reshape(-1, 1)
    l = k['Sport']
    l = np.array(l).reshape(-1, 1)
    j_train, j_test, l_train, l_test = train_test_split(j, l, test_size=0.3)
    model1.fit(j_train, l_train)
    t = st.number_input('Enter BMI')
    t = np.array(t).reshape(-1, 1)
    d = model1.predict(t)
    d = d * 10
    d = int(np.round(d[0][0]))   # scale and round the predicted sport code for the buckets below
    if st.button('Results'):
        my_bar = st.progress(0)
        for percent_complete in range(100):
            time.sleep(0.001)
            my_bar.progress(percent_complete + 1)
        if d in range(0, 12):
            st.write('Definitely Marathon')
        elif d in range(12, 19):
            st.write('Marathon, But also suitable for Basketball')
        elif d in range(19, 23):
            st.write('Definitely Basketball')
        elif d in range(23, 27):
            st.write('Basketball, But also suitable for Rugby')
        elif d in range(27, 33):
            st.write('Definitely Rugby')
        elif d in range(33, 38):
            st.write('Rugby, can also try shot put')
        else:
            st.write('Opt for Shot Put')