Project File

The document outlines two projects completed by Nishika Yadav: 'Meteor Shower Prediction' and 'Using Basketball Statistics to Optimize Gameplay.' The first project focuses on predicting meteor showers by analyzing historical data and considering geographical and atmospheric factors, while the second project explores the application of data science and machine learning in basketball to enhance gameplay strategies. Both projects include sections on objectives, theories, libraries used, datasets, and acknowledgments.
Contents

Project I- Meteor Shower Prediction

S.no. Content

1. Certificate
2. Acknowledgement
3. System Requirement
4. Objective
5. Theory
6. Libraries used & their purpose
7. Data files used & their purpose
8. Source code & output
9. Project 2
10. Bibliography
Certificate
Project 1- Meteor Shower Prediction

This is to certify that Nishika Yadav, student of Blue Bells Model Sr. Sec. School
has successfully completed the project titled- "Meteor Shower Prediction" during
the academic year 2024-2025 under the guidance of Ms Sanjana Arora (PGT
Computer Science) as a part of the Data Science curriculum.
The findings and conclusions drawn in this project are based on the data collected
and analysed by the student and this project may be considered part of the
practical exam conducted by CBSE.

___________________ _____________________
Subject Teacher’s Signature External Examiner’s Signature
Acknowledgement
I am deeply grateful to my project guide, Ms. Sanjana Arora, for guiding me
immensely throughout the Meteor Shower Prediction project. She always showed
keen interest in my project, and her constructive advice and constant motivation
have been responsible for its successful completion.
My sincere thanks go to our principal, Ms. Alka Singh, for her coordination and
for extending every possible support to complete my project.
I must thank my classmates for their timely help and support in completing this
project.
System Requirements
Hardware Requirements:
1. Processor: x86_64 Processor with at least 2 CPU cores above 1GHz
2. RAM: Minimum 8 GB
3. Hard disk space: Approximately 10 GB

Software Requirements:
1. Operating system: Windows 8.1 and above
2. Python 3.11.14
3. Visual Studio Code with Jupyter Notebook
Objective
The objective of this project is to explore and predict meteor showers, a
captivating celestial phenomenon. Meteors, often called shooting stars, are visible
when tiny space debris enters Earth's atmosphere, burning up and producing light
streaks. Meteor showers, a more spectacular event, occur when Earth passes
through debris trails left by comets orbiting the sun, so that many meteors
appear to radiate from a single point in the sky.

This project focuses on predicting when and where these meteor showers can be
observed from any given city. To do this, we'll analyse historical data about comet
paths and previous meteor showers. Understanding the orbits of these comets
and Earth’s position in its orbit during meteor showers is crucial. It helps
determine the timing and visibility of these events.

Additionally, we'll consider geographical factors, such as a viewer’s location, and
atmospheric conditions like sky clarity and light pollution, which greatly impact
the visibility of meteor showers.

The aim is to use past patterns of meteor showers—like their frequency, intensity,
and duration—to forecast future occurrences. This will not only deepen our
understanding of celestial events but also provide a practical guide for astronomy
enthusiasts, especially students, to experience these natural wonders. Through
this project, we hope to foster a greater appreciation for astronomy and the night
sky among peers and the community.
Theory
As we start this project, it's essential first to explore the fascinating world of comets and meteors,
the primary subjects of our study. Understanding these celestial bodies is crucial for tracking
the meteor showers they create. Moreover, knowing about the moon's phases is equally
important, as it helps determine the best times to observe these meteor showers.
In our solar system, numerous comets orbit the sun, each with unique characteristics and
trajectories. For this project, we will concentrate on several well-known comets of particular
interest to astronomers over the years.
● Comet Thatcher: This comet was discovered in 1861 and has an orbital period of about
415.5 years. It is the source of the Lyrids meteor shower, visible in April. The debris from
Comet Thatcher entering Earth's atmosphere results in this annual spectacle.
● Comet Halley: A famous comet whose 1531 appearance was among those Edmond Halley later
used to show that it returns periodically. It orbits the sun about every 76 years.
Comet Halley is responsible for two meteor showers: the Eta Aquarids in May and the
Orionids in October. Both showers offer breathtaking views as Earth passes through
Halley's trail of cosmic debris.
● Comet Swift-Tuttle: Discovered in 1862, this comet takes approximately 133 years to
complete its orbit around the sun. From its debris, the Perseids meteor shower emerges,
peaking in August. This shower is known for its brightness and is associated with the
constellation Perseus.
● Comet Tempel-Tuttle: This comet was independently discovered in 1865 and 1866. It has a
relatively short orbital period of 33 years. The debris from Comet Tempel-Tuttle gives
rise to the Leonids meteor shower, typically observed in November. On occasion, the
Leonids can intensify into a meteor storm, offering a rare and spectacular sight.
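
The comet and shower pairings listed above can be held in a small lookup table. The sketch below is only an illustration: the names `COMET_SHOWERS` and `showers_in_month` are our own and are not part of the project files, and the months are the ones quoted in the list above.

```python
# Comet-to-shower lookup built from the facts listed in the Theory section.
# Each comet maps to a list of (shower name, month quoted above) pairs.
COMET_SHOWERS = {
    "Thatcher": [("Lyrids", "April")],
    "Halley": [("Eta Aquarids", "May"), ("Orionids", "October")],
    "Swift-Tuttle": [("Perseids", "August")],
    "Tempel-Tuttle": [("Leonids", "November")],
}


def showers_in_month(month):
    """Return (shower, parent comet) pairs whose quoted month equals `month`."""
    return [
        (shower, comet)
        for comet, showers in COMET_SHOWERS.items()
        for shower, shower_month in showers
        if shower_month == month
    ]


print(showers_in_month("October"))
```

Keeping the parent comet next to each shower makes it easy to see that Comet Halley is the only comet here that produces two showers.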

As the moon orbits Earth, the portion of its sunlit half that we can see changes, so its
apparent shape changes over the month; these changes are known as lunar phases. The phases
depend on the moon's position relative to Earth and the sun.

The different phases of the Moon are as follows:


• New Moon
• Waxing crescent
• First-quarter
• Waxing gibbous
• Full Moon
• Waning gibbous
• Third-quarter
• Waning crescent
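
Because a bright moon washes out faint meteors, one simple way to use the phases is to attach a rough illuminated fraction to each of them. The numbers below are our own approximations for illustration (0 for a new moon up to 1 for a full moon), not values taken from the project data, and `darkest_phases` is a hypothetical helper.

```python
# Assumed, approximate illuminated fraction for each named phase.
ILLUMINATION = {
    "New Moon": 0.0,
    "Waxing crescent": 0.25,
    "First-quarter": 0.5,
    "Waxing gibbous": 0.75,
    "Full Moon": 1.0,
    "Waning gibbous": 0.75,
    "Third-quarter": 0.5,
    "Waning crescent": 0.25,
}


def darkest_phases(phases):
    """Return the phases from `phases` with the lowest assumed illumination."""
    lowest = min(ILLUMINATION[p] for p in phases)
    return [p for p in phases if ILLUMINATION[p] == lowest]


print(darkest_phases(["Full Moon", "First-quarter", "Waning gibbous"]))
```

Under these assumptions, nights near the new moon are the best candidates for observing a shower.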
Libraries Used
1. NumPy (Numerical Python):
NumPy is a core library for numerical computing in Python, known for its large, multidimensional
array and matrix data structures. It offers efficient storage and operations on large datasets,
dramatically enhancing performance over traditional Python lists. NumPy's capabilities include
advanced mathematical functions, linear algebra, and random number generation, making it a
fundamental package for scientific computing. Its speed and versatility stem from its C language
implementation, and it serves as a critical foundation for many other Python data science
libraries.

2. Pandas:
Pandas is a powerful Python library for data manipulation and analysis, offering structured data
storage via its DataFrame and Series objects. Renowned for its ease in handling and
transforming data, Pandas provides essential functions for reading, writing, and processing data
in various formats (like CSV, Excel, and SQL databases). It simplifies tasks like data cleaning,
statistical analysis, and visualization. Widely used in data science and financial analysis, Pandas
integrates seamlessly with other libraries, making it indispensable for exploratory data analysis,
data aggregation, time-series analysis, and cross-sectional data handling.
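
As a small illustration of how the two libraries work together, the sketch below places the orbital periods quoted in the Theory section in a NumPy array and then wraps them in a Pandas DataFrame. The column names are our own choice and do not come from the project files.

```python
import numpy as np
import pandas as pd

# Orbital periods in years, as quoted in the Theory section:
# Thatcher, Halley, Swift-Tuttle, Tempel-Tuttle.
periods = np.array([415.5, 76.0, 133.0, 33.0])

# A single vectorised expression replaces an explicit Python loop.
orbits_per_millennium = 1000.0 / periods

# Wrap the arrays in a labelled table (column names are illustrative).
comets = pd.DataFrame({
    "comet": ["Thatcher", "Halley", "Swift-Tuttle", "Tempel-Tuttle"],
    "period_years": periods,
    "orbits_per_millennium": orbits_per_millennium,
})

# Sort by period and read off the comet that returns most often.
shortest = comets.sort_values("period_years").iloc[0]["comet"]
print(shortest)
```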
Datasets Used
For our project, the required data is hosted on GitHub and can be accessed via
the link below. The dataset is organised into four key files, each offering specific
information essential for analysing and predicting meteor showers.

1. moonphases.csv: This file plays a critical role as it provides detailed
information on the moon phases for every day of the year 2020. Understanding
these phases is crucial since they significantly affect the visibility of meteor
showers.
2. meteorshowers.csv: This file is a comprehensive source of data for the five
meteor showers discussed in our project. It includes important details such as the
preferred viewing month for each shower, the range of months when they are
visible, and the hemisphere where they can be best observed. This information is
vital for predicting the occurrence and visibility of these meteor showers.
3. constellations.csv: This file contains information about the four constellations
that are radiant points for the five meteor showers. It includes data like the
latitudes where these constellations are visible and the months when viewing them
is optimal. This is important for understanding where and when to look in the sky
during meteor showers.
4. cities.csv: This dataset provides a list of country capitals along with their
corresponding latitudes. This geographic information is essential for correlating
the visibility of meteor showers with specific locations around the world.

linktr.ee/nishikayadav
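
To show how the city and constellation files described above could be combined, here is a minimal sketch that checks which constellations are visible from a city's latitude. The column names (`city`, `latitude`, `constellation`, `latitudestart`, `latitudeend`) and every value in the two small tables are assumptions made for illustration; the real files may be laid out differently.

```python
import io

import pandas as pd

# Tiny stand-ins for cities.csv and constellations.csv.
# Column names and all values here are illustrative assumptions.
cities = pd.read_csv(io.StringIO(
    "city,latitude\n"
    "New Delhi,28.6\n"
    "Canberra,-35.3\n"
))
constellations = pd.read_csv(io.StringIO(
    "constellation,latitudestart,latitudeend\n"
    "Lyra,90,-40\n"
    "Perseus,90,-35\n"
))


def visible_constellations(city):
    """Return constellations whose latitude band contains the city's latitude."""
    lat = cities.loc[cities["city"] == city, "latitude"].iloc[0]
    in_band = (constellations["latitudeend"] <= lat) & (lat <= constellations["latitudestart"])
    return constellations.loc[in_band, "constellation"].tolist()


print(visible_constellations("New Delhi"))
print(visible_constellations("Canberra"))
```

The same pattern could be extended by joining the result with the meteor shower and moon phase tables to suggest dates.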
Source Code & Output 1
Contents
Project 2- Using Basketball statistics to optimize gameplay

S.no. Content

1. Certificate
2. Acknowledgement
3. System Requirement
4. Objective
5. Theory
6. Libraries used & their purpose
7. Data files used & their purpose
8. Source code & output
9. Bibliography
Certificate
Project II- Using Basketball statistics to optimize gameplay

This is to certify that Nishika Yadav, student of Blue Bells Model Sr. Sec. School
has successfully completed the project titled- "Using basketball statistics to
optimize gameplay" during the academic year 2024-2025 under the guidance of
Ms Sanjana Arora (PGT Computer Science) as a part of the Data Science
curriculum.
The findings and conclusions drawn in this project are based on the data collected
and analysed by the student and this project may be considered part of the
practical exam conducted by CBSE.

___________________ _____________________
Subject Teacher’s Signature External Examiner’s Signature
Acknowledgement
I would like to express my deep gratitude to my project guide, Ms. Sanjana Arora,
for guiding me immensely throughout the project “Using basketball statistics to
optimize gameplay”. She always displayed a keen interest in my project. Her
constructive advice and constant motivation have been responsible for the
successful completion of this project.
My sincere thanks go to our Principal Ms. Alka Singh for her co-ordination in
extending every possible support for completing my project.
I must thank my classmates for their timely help and support in completing this
project.
System Requirements
Hardware Requirements:
1. Processor: x86_64 Processor with at least 2 CPU cores above 1GHz
2. RAM: Minimum 8 GiB
3. Hard disk space: Approximately 20 GiB

Software Requirements:
1. Operating system: Windows 8.1 and above
2. Python 3.11.14
3. Visual Studio Code with Jupyter Notebook
Objective
In the captivating hybrid of animation and live-action "Space Jam: A New Legacy," basketball
icon LeBron James teams up with the classic cartoon character Bugs Bunny in a grand
adventure. Directed by Malcolm D. Lee and produced by a forward-thinking creative team
including Ryan Coogler and Maverick Carter, this film bridges the gap between traditional sports
and the imaginative realm of animation.

Taking a cue from this innovative film, our project delves into the intricate relationship between
sports, particularly basketball, and the evolving fields of data science and machine learning.
This exploration is particularly relevant in today’s sports where player statistics (stats) are not
just numbers, but integral components that shape the game’s dynamics. For sports enthusiasts,
these stats are often the backbone of fantasy leagues and passionate discussions.

This project goes a step further, offering insights into how these stats are employed by coaches
and team strategists in real-world scenarios. Basketball, like many other sports, is increasingly
reliant on data for decision-making. From player selection and training to game strategy and
performance analysis, every aspect is data-driven.

Our focus is to unravel how machine learning and data science techniques can transform raw
player stats into valuable insights. This involves understanding how player efficiency is
measured, predicting game outcomes, and even tailoring training programs. The aim is to
demonstrate the practical applications of these stats in enhancing team performance and
strategy.
Theory
This project delves into basketball statistics, encompassing both real and animated players. It
focuses on understanding how different metrics contribute to a player's PER (Player efficiency
rating), a measure of their per-minute effectiveness on the court. The primary aim is to apply
machine learning techniques to formulate a refined and accurate dataset of these players. This
curated dataset will be instrumental in facilitating rapid, strategic decision-making during
games, thereby enhancing a team's probability of securing victories.

By analysing and interpreting various statistics, the project seeks to establish a clear correlation
between these stats and the PER. This will enable a deeper insight into what makes a player
efficient and how their performance impacts the overall dynamics of the game. The machine
learning aspect involves cleaning and organizing data, ensuring it is reliable and useful for real-
time analysis.
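
The correlation between individual stats and the PER can be measured directly with Pandas. The sketch below uses a handful of invented numbers, not the project data, purely to show the calculation; the column labels follow the abbreviations used in this project.

```python
import pandas as pd

# Invented numbers for illustration only; not taken from the project data.
stats = pd.DataFrame({
    "TS%": [0.52, 0.55, 0.58, 0.61, 0.64],
    "TO": [14.0, 13.0, 12.5, 11.0, 10.0],
    "PER": [12.0, 14.5, 16.0, 19.0, 22.0],
})

# Pearson correlation of every other column with PER, strongest first.
corr_with_per = (
    stats.corr()["PER"]
    .drop("PER")
    .sort_values(key=abs, ascending=False)
)
print(corr_with_per)
```

In this made-up sample, a higher true shooting percentage goes with a higher PER and a higher turnover ratio goes with a lower one; real data would need the same check before drawing any conclusion.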

This project is not just about data collection, but about creating a dynamic tool for coaches and
analysts. By leveraging this dataset, they can make informed decisions about player rotations,
strategies, and overall game plans. The ultimate goal is to provide a competitive edge to teams
by optimizing their lineup and tactics based on the most current and comprehensive player
performance data available. This innovative approach combines sports analytics with cutting-
edge technology to revolutionize how basketball games are strategized and played.
Libraries Used
Pandas:
Pandas is a powerful Python library used for data manipulation and analysis. It provides easy-to-use data
structures and data analysis tools, making it ideal for tasks like cleaning, transforming, and analysing
structured data. Pandas excels in handling tabular data with heterogeneously-typed columns, as found in
many real-world datasets. Its DataFrame object facilitates importing, cleaning, and analysing large
datasets, making it a staple in data science and statistical modelling workflows.

NumPy:
NumPy is a fundamental package for scientific computing in Python. It offers comprehensive
mathematical functions, random number generators, linear algebra routines, Fourier transforms, and
more. At its core, NumPy provides an efficient N-dimensional array object, which is used as a container
of generic data. These arrays allow for efficient operations on large amounts of data, making NumPy
essential for performing mathematical and logical operations on arrays and matrices, a cornerstone in
various scientific computing tasks.

Matplotlib:
Matplotlib is a widely used Python library for creating static, interactive, and animated visualizations. It
provides an object-oriented API for embedding plots into applications, and a scripting layer for quick and
easy generation of plots. Matplotlib is highly customizable and can create a wide variety of plots and
charts, such as line graphs, scatter plots, bar charts, histograms, and more. Its versatility and ease of use
make it a go-to library for data visualization in Python.

Scikit-learn:
Scikit-learn is a popular machine-learning library for Python. It features various algorithms for
classification, regression, clustering, and dimensionality reduction, along with tools for model selection
and evaluation. Its consistency and simple interface make it accessible for beginners yet flexible enough
for advanced users. Scikit-learn integrates well with other Python libraries and is widely used for practical,
real-world machine-learning applications due to its robustness and ease of use.

Seaborn:
Seaborn is a Python data visualization library based on Matplotlib. It provides a high-level interface for
drawing attractive and informative statistical graphics. Seaborn simplifies the creation of complex
visualizations, like heat maps, time series, and violin plots, and is particularly suited for visualizing
complex datasets. Its integration with Pandas DataFrames makes it an efficient tool for data exploration.
Seaborn’s beautiful default styles and colour palettes enhance the aesthetics and readability of data
visualizations.
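
As a rough sketch of how NumPy, Pandas and Scikit-learn fit together, the example below builds a synthetic table of players, fills in a few missing values and fits a linear regression that predicts PER. The data is generated from a made-up linear rule, so the result says nothing about real players, and the choice of linear regression is our assumption for illustration rather than the model used in the project. Plotting with Matplotlib and Seaborn is not shown here.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data: PER is generated from a made-up linear rule plus noise.
rng = np.random.default_rng(0)
n = 200
players = pd.DataFrame({
    "TS%": rng.uniform(0.45, 0.65, n),
    "USG": rng.uniform(15.0, 35.0, n),
})
players["PER"] = (
    40.0 * players["TS%"] + 0.3 * players["USG"] - 12.0 + rng.normal(0.0, 0.5, n)
)

# Simulate a few missing values, then fill them with the column mean.
players.loc[[3, 17, 42], "USG"] = np.nan
players["USG"] = players["USG"].fillna(players["USG"].mean())

# Hold out a quarter of the rows to check the fit on unseen data.
X = players[["TS%", "USG"]]
y = players["PER"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
r2 = model.score(X_test, y_test)  # coefficient of determination on the held-out rows
print(round(r2, 3))
```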
Datasets Used
The dataset in our project comprises four files, each serving a distinct purpose in the
analysis of basketball players' performance:

1. tune_squad.csv:
This file acts as a directory of basketball players, linking player IDs to their names. It’s used for
identifying players in the dataset, ensuring that the data associated with each ID corresponds
correctly to the right player.

2. player_data.csv:
It houses detailed statistics for each player. The data includes a unique player ID, total points
scored, number of possessions, and team pace. Additionally, it covers advanced metrics like
games played (GP), average minutes per game (MPG), true shooting percentage (TS%), assist
ratio (AST), turnover ratio (TO), usage rate (USG), offensive (ORR) and defensive (DRR) rebound
rates, total rebound rate (REBR), and the player efficiency rating (PER). These statistics provide
an in-depth view of each player's performance and contribution on the court.

3. game_stats.csv:
Focused on individual game performances, this file contains data on minutes played, player
names, and key performance metrics such as TS%, AST, TO, USG, ORR, DRR, REBR, and PER. This
file is crucial for analysing players' performance in specific games, offering insights into how
various factors like playing time and in-game efficiency contribute to overall outcomes.

4. player_data_final.csv:
This is a refined compilation of player statistics. It amalgamates basic information like player
ID and name with a comprehensive set of performance metrics (points, possessions, team pace,
GP, MPG, TS%, AST, TO, USG, ORR, DRR, REBR, PER). This file represents a consolidated view of
player data, ideal for final analysis and decision-making processes regarding player
performance and team strategies.

linktr.ee/nishikayadav
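
The link between the player directory and the statistics file can be sketched as a merge on the shared player ID. The column names (`ID`, `player_name`, `points`) and all values below are invented for illustration and may not match the actual files.

```python
import pandas as pd

# Stand-ins for tune_squad.csv and player_data.csv; IDs, names and values are invented.
names = pd.DataFrame({
    "ID": [1, 2, 3],
    "player_name": ["Player A", "Player B", "Player C"],
})
stats = pd.DataFrame({
    "ID": [3, 1, 2],
    "points": [1500, 1800, 1200],
    "PER": [15.2, 21.4, 11.8],
})

# Attach names to stats by matching on the shared ID column.
# validate="one_to_one" raises an error if an ID is duplicated in either table.
merged = names.merge(stats, on="ID", how="inner", validate="one_to_one")

# Player with the highest PER in this toy table.
top = merged.sort_values("PER", ascending=False).iloc[0]["player_name"]
print(top)
```

Checking that each ID appears only once on both sides guards against a player's stats being attached to the wrong name.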
Source Code & Output 2
Bibliography
The following resources were referred to for research during this project:
● https://learn.microsoft.com/en-us/shows/learn-with-dr-g/predicting-meteor-showers-using-python-and-visual-studio-code
● https://learn.microsoft.com/en-us/shows/learn-with-dr-g/predict-basketball-per-with-machine-learning-and-visual-studio-code
