INDUSTRIAL TRAINING
REPORT
Submitted in partial fulfillment of the
Requirements for the award of the degree
of
Bachelor of Technology
in
Artificial Intelligence & Data Science
By:
PRAKHAR SHARMA (50113211921/AIDS/21)
Department of Artificial Intelligence & Data Science
Guru Tegh Bahadur Institute of Technology
Guru Gobind Singh Indraprastha University
Dwarka, New Delhi
Year 2021-2025
MACHINE LEARNING INTERN
Duration
5th August 2023 – 5th September 2023
By:
Prakhar Sharma (50113211921/AIDS/2021)
At
Prodigy InfoTech
DECLARATION
I declare that the content presented in this Industrial Training Report, which contributes to the
fulfillment of the requirements for the Bachelor of Technology degree in Artificial
Intelligence & Data Science at Guru Tegh Bahadur Institute of Technology, affiliated to Guru
Gobind Singh Indraprastha University Delhi, represents an authentic account of my
independent work during the internship at Prodigy Infotech. This endeavor took place from
August 5, 2023, to September 5, 2023.
Date: 2 December 2023
Prakhar Sharma (50113211921/AIDS/2021)
CERTIFICATE
PRAKHAR SHARMA
ACKNOWLEDGEMENT
I would like to express my sincere gratitude to Ms. Nandini, who provided constant
support and suggestions. Without her help, I could not have brought this work up to its
present standard. I also take this opportunity to thank everyone else who supported me
during the project and in other aspects of my studies at Guru Tegh Bahadur
Institute of Technology.
Date: December 2023
Prakhar Sharma (50113211921/AIDS/2021)
prakharsharma479@[Link]
ABSTRACT
At Prodigy Infotech, I served as a Machine Learning Intern. The internship revolved around
two impactful projects: "House Prices - Advanced Regression Techniques" and "Mall
Customer Segmentation Data Market Basket Analysis." These projects not only enriched my
understanding of machine learning concepts but also provided valuable insights into their
practical applications in the industry.
In the first project, "House Prices - Advanced Regression Techniques," the objective was to
predict sales prices by employing advanced regression techniques. The project involved a
multifaceted approach, including feature engineering, Random Forests (RFs), and gradient
boosting. The application of these techniques allowed for a nuanced understanding of the
factors influencing house prices and provided a hands-on experience in optimizing predictive
models.
The second project, "Mall Customer Segmentation Data Market Basket Analysis," delved
into the realm of customer behavior analysis using market basket analysis techniques. The
goal was to extract meaningful insights into customer segmentation based on their purchasing
patterns. By employing advanced analytics, the project aimed to enhance marketing strategies
and optimize business operations for improved customer satisfaction.
Throughout the internship, I was exposed to the challenges and intricacies of real-world
machine learning applications. The hands-on experience not only strengthened my technical
skills but also honed my ability to collaborate effectively within a professional team
environment. The projects required a combination of domain knowledge, data preprocessing,
and model optimization, highlighting the interdisciplinary nature of machine learning in
solving practical business problems.
This internship at Prodigy Infotech provided a holistic understanding of machine learning's
application in addressing real-world challenges. The projects not only added significant value
to the organization but also equipped me with invaluable skills and experiences that will
undoubtedly shape my future endeavors in the field of data science and machine learning.
LIST OF FIGURES AND TABLES
Fig No Figure Name Page
1 PRODIGY INFOTECH LOGO 1
CONTENTS
Chapter Page No.
Title Page i
Declaration and Certificate ii
Acknowledgement iii
Abstract iv
Tables and figures v
1. Introduction 1
1.1 About Prodigy Infotech 1
1.2 Services 2
2. Results and observations 4
3. Conclusions 6
4. References 9
5. Appendix 11
INTRODUCTION
1.1 About Prodigy InfoTech
The company is dedicated to providing students with valuable work experience,
offering them the opportunity to gain practical skills and knowledge that can
significantly benefit their future careers. The internships are meticulously designed to
simulate real-world work experiences, providing students with the chance to engage in
projects and assignments that are directly relevant to their chosen fields of study.
The mission of the organization is to create innovative and accessible learning solutions
that empower individuals of all ages and backgrounds to realize their full potential.
Whether individuals are students seeking academic improvement, professionals aiming
to upskill, or organizations looking to enhance employee training, the company
provides the necessary tools and resources for success.
Prospective participants are invited to embark on an innovative and dynamic learning
experience that will aid them in achieving their goals and unlocking their full potential.
The company envisions a transformative journey where, together, they can
revolutionize the way people learn, ultimately contributing to the creation of a better
future for all.
1.2 Services
Exploring Opportunities: A Comprehensive Overview of Internship Offerings at
Prodigy Infotech
Prodigy Infotech takes pride in offering a diverse range of internship opportunities across
various fields, providing students with hands-on experience and insights into their chosen
industries.
These internships are meticulously crafted to simulate real-world scenarios, allowing interns
to actively contribute to projects, collaborate with experienced teams, and gain practical
knowledge that goes beyond the classroom setting.
Machine Learning Internship:
Overview:
The Machine Learning Internship at Prodigy Infotech is a comprehensive program that covers
various facets of machine learning. Interns are immersed in a structured curriculum that
includes Data Analysis, Supervised Learning, Unsupervised Learning, and Deep Learning.
This hands-on experience equips interns with the skills needed to navigate the intricacies of
machine learning and apply them to real-world challenges.
For:
Enthusiastic individuals seeking to delve into the realm of machine learning are encouraged
to apply. This internship promises a dynamic learning environment where participants can
actively engage with cutting-edge technologies and contribute to innovative projects.
Web Development Internship:
Overview:
The Web Development Internship at Prodigy Infotech caters to those aspiring to become
proficient in web development. The program covers HTML5 & CSS3, JavaScript, Responsive
Website Design, and Web Applications. Interns gain practical experience in developing
websites and applications, honing their skills in both front-end and back-end development.
For:
Individuals passionate about creating visually appealing and functional web solutions are
invited to apply. This internship offers practical, sought-after experience in the
ever-evolving field of web development.
Data Science Internship:
Overview:
The Data Science Internship at Prodigy Infotech provides a comprehensive exploration of the
field. Interns engage in Exploratory Data Analysis (EDA), Data Pre-processing, Data
Visualization, and gain exposure to Business Intelligence (BI) tools. This internship equips
participants with the skills required for meaningful data analysis and interpretation.
For:
Aspiring data scientists looking to enhance their analytical capabilities are encouraged to
apply. This internship offers a unique opportunity to work with real-world datasets and
develop a profound understanding of the data science workflow.
Android Development Internship:
Overview:
The Android Development Internship at Prodigy Infotech is designed for those interested in
mobile app development. Interns delve into Kotlin programming, creating Simple Apps,
Advanced Apps, and Cloud Apps. This program provides practical experience in building
scalable and innovative Android applications.
For:
Individuals passionate about mobile app development are invited to apply. This internship
promises exposure to the latest trends in Android development and an opportunity to
contribute to the creation of diverse and functional mobile applications.
Internships provide a platform for aspiring professionals to bridge the gap between
theoretical knowledge and practical application.
RESULTS AND OBSERVATIONS
Project 1: House Prices - Advanced Regression Techniques
Predictive Model Development:
Implemented advanced regression techniques, including Linear Regression, Random Forests
(RFs), and Gradient Boosting.
Utilized feature engineering to enhance the predictive power of the model.
Experimented with various algorithms to identify the most effective approach for predicting
house prices.
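As a minimal sketch (not the exact notebook code), the three regression families named above can be compared on a common validation split with scikit-learn; the file path and feature list below are illustrative placeholders.

import pandas as pd
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

df = pd.read_csv("train.csv")                        # assumed local copy of the Kaggle training file
X = df[["OverallQual", "GrLivArea", "GarageCars"]]   # illustrative numeric features
y = df["SalePrice"]
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(random_state=0),
    "Gradient Boosting": GradientBoostingRegressor(random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    mse = mean_squared_error(y_val, model.predict(X_val))
    print(f"{name}: validation MSE = {mse:.2f}")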
Model Evaluation and Optimization:
Conducted thorough model evaluations using metrics such as Mean Squared Error (MSE)
and R-squared.
Employed cross-validation techniques to ensure the robustness of the models.
Optimized hyperparameters to enhance the overall performance of the predictive models.
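A minimal sketch of this evaluation workflow, assuming the X_train and y_train arrays from the previous sketch; the parameter grid here is illustrative, not the exact one used during the internship.

from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, cross_val_score

rf = RandomForestRegressor(random_state=0)

# 5-fold cross-validated R-squared for the untuned model
scores = cross_val_score(rf, X_train, y_train, cv=5, scoring="r2")
print("Mean R-squared:", scores.mean())

# small grid search over two hyperparameters, scored by negative MSE
grid = GridSearchCV(
    rf,
    param_grid={"n_estimators": [100, 300], "max_depth": [None, 10, 20]},
    cv=5,
    scoring="neg_mean_squared_error",
)
grid.fit(X_train, y_train)
print("Best parameters:", grid.best_params_)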
Insights and Interpretations:
Extracted valuable insights into the factors influencing house prices.
Identified key features that significantly impact the accuracy of predictions.
Provided actionable recommendations based on the model's findings for potential
improvements in the real estate domain.
Results:
After thorough experimentation and tuning, the best-performing model was linear regression,
with an RMSE of 50045.870 on the test data. Detailed results and insights can be found in the
notebooks. [Link]
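For reference, an RMSE figure like the one above is computed from held-out predictions as in the sketch below; this is a generic illustration using the names from the earlier sketches, not the notebook's exact code.

import numpy as np
from sklearn.metrics import mean_squared_error

y_val_pred = model.predict(X_val)                      # any fitted regressor from the sketches above
rmse = np.sqrt(mean_squared_error(y_val, y_val_pred))
print(f"RMSE: {rmse:.3f}")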
Project 2: Mall Customer Segmentation Data
Market Basket Analysis:
Conducted thorough exploratory data analysis (EDA) to understand patterns in customer
behavior.
Implemented market basket analysis techniques to identify associations between products and
customer segments.
Uncovered meaningful relationships and purchasing patterns that informed strategic decision-making.
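The appendix code for this project focuses on clustering; purely as an illustration of the market basket technique named here, the sketch below mines association rules with the Apriori algorithm, assuming a small one-hot encoded basket table and the third-party mlxtend library (not part of the internship deliverables).

import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# toy one-hot basket table: one row per transaction, one boolean column per product
transactions = pd.DataFrame(
    [[1, 1, 0], [1, 0, 1], [1, 1, 1], [0, 1, 1]],
    columns=["bread", "milk", "butter"],
).astype(bool)

frequent_itemsets = apriori(transactions, min_support=0.5, use_colnames=True)
rules = association_rules(frequent_itemsets, metric="confidence", min_threshold=0.6)
print(rules[["antecedents", "consequents", "support", "confidence"]])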
Customer Segmentation:
Utilized clustering algorithms to segment customers based on their buying behavior.
Developed comprehensive customer profiles, allowing for targeted marketing strategies.
Provided insights into the preferences and characteristics of different customer segments.
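A minimal sketch of profiling the resulting segments, assuming a DataFrame df with the Kaggle mall columns and a hypothetical 'Cluster' column holding labels from a clustering model such as the K-means fit in Appendix B:

profile_cols = ["Age", "Annual Income (k$)", "Spending Score (1-100)"]
profile = df.groupby("Cluster")[profile_cols].mean()   # average profile of each segment
print(profile)
print(df["Cluster"].value_counts())                    # segment sizes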
Business Impact and Recommendations:
Discussed the practical implications of the analysis for marketing and sales strategies.
Offered actionable recommendations for optimizing product placements, promotions, and
customer engagement.
Demonstrated how market basket analysis can be a valuable tool for enhancing the overall
shopping experience and increasing revenue.
Results:
After thorough experimentation and tuning, the best-performing model was the KNN classifier,
with an accuracy of 0.925 on the test data. Detailed results and insights can be found in the
notebooks. [Link]
SUMMARY & CONCLUSIONS
Mall Customer Segmentation Data Analysis
This project focuses on understanding and segmenting customers based on their shopping
behavior using a dataset from Kaggle.
Customer segmentation is a crucial strategy in marketing and business. This project aims to
analyze customer data from a mall and group customers into distinct segments based on their
purchasing patterns and demographics. By understanding these segments, businesses can
tailor their marketing strategies to specific customer groups.
Dataset
The dataset used in this project is the Mall Customer Segmentation Data from Kaggle. It
contains information about customers, including their age, gender, annual income, and
spending score.
1. Explore the distribution of variables such as age, annual income, and spending score.
2. Analyze relationships between variables using scatter plots, histograms, and correlation analysis.
3. Identify potential outliers and understand their impact on the analysis.
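A minimal sketch of these three EDA steps on the mall dataset (the file name is assumed to be a local copy of the Kaggle CSV):

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv("Mall_Customers.csv")   # assumed local copy of the Kaggle file

# 1. distributions of age, annual income, and spending score
df[["Age", "Annual Income (k$)", "Spending Score (1-100)"]].hist(bins=20)
plt.show()

# 2. relationships between variables
sns.scatterplot(data=df, x="Annual Income (k$)", y="Spending Score (1-100)", hue="Gender")
plt.show()
sns.heatmap(df.select_dtypes("number").corr(), annot=True)
plt.show()

# 3. outlier check with box plots
sns.boxplot(data=df[["Age", "Annual Income (k$)", "Spending Score (1-100)"]])
plt.show()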
Customer Segmentation
Utilize unsupervised learning techniques like K-means clustering to segment customers.
Determine the optimal number of clusters using techniques such as the elbow method.
Visualize customer segments using scatter plots or other relevant visualizations.
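A minimal sketch of this step, mirroring the approach in Appendix B and assuming df is loaded as in the previous sketch:

import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

X = df[["Annual Income (k$)", "Spending Score (1-100)"]].values

# elbow method: inertia (within-cluster sum of squares) for k = 1..10
inertias = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_ for k in range(1, 11)]
plt.plot(range(1, 11), inertias, marker="o")
plt.xlabel("Number of clusters k")
plt.ylabel("Inertia")
plt.show()

# fit the chosen number of clusters and visualise the segments
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)
plt.scatter(X[:, 0], X[:, 1], c=labels)
plt.xlabel("Annual Income (k$)")
plt.ylabel("Spending Score (1-100)")
plt.show()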
Insights
The project aims to uncover insights such as:
1. Characteristics of different customer segments (e.g., high-income, low-spending
customers).
2. How age and gender correlate with spending behavior.
3. Strategies that can be employed to target different customer segments effectively.
Housing Prices Prediction Using Random Forest
The project is based on a dataset from Kaggle, specifically the Housing Prices Competition
dataset.
Predicting housing prices is a fundamental problem in the field of machine learning and data
science. This project aims to explore various regression techniques to predict housing prices
using features provided in the dataset. By experimenting with different algorithms and
techniques, we aim to find the most accurate model for this specific problem.
Dataset
The dataset used in this project is the Housing Prices Competition dataset from Kaggle. It
contains a comprehensive set of features related to residential properties. The dataset includes
both training and testing data, with corresponding target values.
Explore the Jupyter notebooks in the notebooks/ directory to see the step-by-step process of
data preprocessing, feature engineering, model training, and evaluation.
Techniques
The following techniques are implemented in this project:
1. Random Forest
2. KNNImputer
3. Heatmap (correlation graph)
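A minimal sketch combining these three techniques on the housing data, assuming df is the training DataFrame from the Kaggle competition (feature handling is simplified relative to the appendix code):

import pandas as pd
import seaborn as sns
from sklearn.impute import KNNImputer
from sklearn.ensemble import RandomForestRegressor

numeric = df.select_dtypes("number")
features = numeric.drop(columns=["SalePrice"])
target = numeric["SalePrice"]

# 2. KNNImputer fills missing values using nearest-neighbour averages
features = pd.DataFrame(KNNImputer().fit_transform(features), columns=features.columns)

# 3. heatmap of the correlations between the imputed features
sns.heatmap(features.corr())

# 1. Random Forest regressor trained on the imputed features
rf = RandomForestRegressor(random_state=0).fit(features, target)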
REFERENCES
House Prices - Advanced Regression Techniques:
Kaggle. (n.d.). House Prices: Advanced Regression Techniques. Retrieved from:
[Link]
Brownlee, J. (2016). Feature Engineering for Machine Learning: A Comprehensive
Overview. Retrieved from: [Link]
engineering-how-to-engineer-features-and-how-to-get-good-at-it/
Chen, T., & Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting System. In
Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining.
Friedman, J. H. (2001). Greedy Function Approximation: A Gradient Boosting Machine.
The Annals of Statistics, 29(5), 1189-1232.
Liaw, A., & Wiener, M. (2002). Classification and Regression by randomForest. R News,
2(3), 18-22.
Pedregosa, F., et al. (2011). Scikit-learn: Machine Learning in Python. Journal of
Machine Learning Research, 12, 2825-2830.
James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical
Learning. Springer.
Mall Customer Segmentation Data - Market Basket Analysis:
Kaggle. (n.d.). Mall Customer Segmentation Data. Retrieved from:
[Link]
Agrawal, R., Imielinski, T., & Swami, A. (1993). Mining Association Rules Between
Sets of Items in Large Databases. In Proceedings of the 1993 ACM SIGMOD
International Conference on Management of Data.
Tan, P. N., Steinbach, M., & Kumar, V. (2005). Introduction to Data Mining. Pearson.
Ransbotham, S., & Kiron, D. (2017). Analytics as a Source of Business Innovation. MIT
Sloan Management Review, 58(1), 1-14.
Berry, M. J. A., & Linoff, G. (2004). Data Mining Techniques: For Marketing, Sales, and
Customer Relationship Management. John Wiley & Sons.
Fournier-Viger, P., et al. (2016). A Survey of Sequential Pattern Mining. Data Science
and Pattern Recognition, 3(2), 54-77.
Jain, P., et al. (2017). A Comprehensive Review on Apriori Algorithm and its
Improvements. International Journal of Computer Applications, 173(8), 43-47.
Turban, E., et al. (2015). Data Mining for Business Intelligence: Concepts, Techniques,
and Applications in Microsoft Office Excel with XLMiner. Wiley.
General Machine Learning and Data Science References:
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.
Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer.
VanderPlas, J. (2016). Python Data Science Handbook. O'Reilly Media.
McKinney, W. (2018). Python for Data Analysis. O'Reilly Media.
Müller, A. C., & Guido, S. (2016). Introduction to Machine Learning with Python: A
Guide for Data Scientists. O'Reilly Media.
APPENDIX A
(Screenshots Results)
House Prices - Advanced Regression Techniques:
(Screenshots of the notebook results are not reproduced in this text version.)
Mall Customer Segmentation Data - Market Basket Analysis:
(Screenshots of the notebook results are not reproduced in this text version.)
APPENDIX B
(Source Code)
House Prices - Advanced Regression Techniques:
#importing all the necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# %% [markdown]
# **Import the training data**
df = pd.read_csv(r"/kaggle/input/house-prices-advanced-regression-techniques/train.csv")
df.head()
# let's check the information and columns of this data
df.info()
df2 = pd.read_csv(r"/kaggle/input/house-prices-advanced-regression-techniques/sample_submission.csv")
df2
# as we can observe:
# 72  PoolQC       7 non-null
# 73  Fence        281 non-null
# 74  MiscFeature  54 non-null
# these features have almost 90 percent null values, so we can remove them.
# (kept commented out because the columns have already been dropped and cannot be dropped again)
# df = df.drop('PoolQC', axis=1)
# df = df.drop('Fence', axis=1)
# df = df.drop('MiscFeature', axis=1)
# df = df.drop('Alley', axis=1)
df.info()
# %% [markdown]
# ## Feature Selection
# check the correlation between the target column and all other columns to see their significance
df[df.columns[1:]].corr()['SalePrice'][:]
print("The important features are :\n")
dfc = df[['Id','OverallQual','GrLivArea','GarageCars','GarageArea','TotalBsmtSF','1stFlrSF','FullBath',
          'TotRmsAbvGrd','YearBuilt','LotArea','SalePrice']]
dfc
plt.figure(figsize=(16, 5))
sns.heatmap(dfc.corr(), annot=True)
plt.hist(df['SalePrice'], bins=100)
print("Right Skewed Data: More houses with price between 1 million and 3 million ")
# %% [markdown]
# ## Outliers in Data
#using box plot
plt.figure(figsize=(16, 5))
sns.boxplot(x='OverallQual', y='SalePrice', data=dfc)
# %% [markdown]
# ## Imputation using sklearn (handling missing values)
from sklearn.impute import KNNImputer
imp = KNNImputer()
# assign the imputed values back so the missing entries are actually filled
dfc = pd.DataFrame(imp.fit_transform(dfc), columns=dfc.columns)
print(dfc.isnull().sum())
print("\n\n\nNo missing values")
# %% [markdown]
# # Random Forest Regressor
# dividing the dataset into independent and dependent variables
X = dfc.iloc[:, :-1]
y = dfc.iloc[:, -1]  # TARGET
X
y
#splitting the dataset
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
rf_reg = RandomForestRegressor()
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2,random_state=0)
#feature scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
rf_reg.fit(X_train, y_train)
# predicting the test set results
y_pred = rf_reg.predict(X_test)
# Plotting a scatter graph to show the prediction
plt.scatter(y_test, y_pred)
plt.xlabel("Price: in $1000's")
plt.ylabel("Predicted value")
plt.title("True value vs predicted value : Random Forest Regression")
plt.show()
import math
from sklearn.metrics import mean_squared_error, mean_absolute_error
print("New RMSE: ", math.sqrt(mean_squared_error(y_pred, y_test)))
y_pred.size
# %% [markdown]
# # TEST Data
#cleaning test data
df1 = pd.read_csv(r"/kaggle/input/house-prices-advanced-regression-techniques/test.csv")
df1
df1.info()
df1 = df1[['Id','OverallQual','GrLivArea','GarageCars','GarageArea','TotalBsmtSF','1stFlrSF','FullBath',
           'TotRmsAbvGrd','YearBuilt','LotArea']]
df1
from sklearn.impute import KNNImputer
imp = KNNImputer()
# assign the imputed values back so the missing entries are actually filled
df1 = pd.DataFrame(imp.fit_transform(df1), columns=df1.columns)
df1[df1['GarageCars'].isnull()]
df1[['GarageCars','GarageArea','TotalBsmtSF']] = df1[['GarageCars','GarageArea','TotalBsmtSF']].fillna(0)
df1.isnull().sum()
# apply the same scaling used on the training data before predicting
y_pred1 = rf_reg.predict(sc.transform(df1))
# SUBMISSIONS
submission = pd.DataFrame({
    "Id": range(1461, 2920),
    "SalePrice": y_pred1
})
submission.to_csv("/kaggle/working/submission.csv", index=False)
Mall Customer Segmentation Data - Market Basket Analysis:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))
df = pd.read_csv(r'/kaggle/input/customer-segmentation-tutorial-in-python/Mall_Customers.csv')
df
df.info()
print("\n\n\n NO missing values")
df['Gender'] = pd.get_dummies(df['Gender'], drop_first=True)
# one-hot encoding successful, as we can see in the dtype
df.info()
import matplotlib.pyplot as plt
import seaborn as sns
# let's see the plots between different columns of the dataset
sns.pairplot(df)
sns.heatmap(df.corr(), annot=True)
# outliers: 0 = Female, 1 = Male
sns.boxplot(x='Gender', y='Age', data=df)
sns.scatterplot(data=df, x='Age', y='Spending Score (1-100)', hue='Gender')
plt.title("Blue is female and orange is Male")
plt.show()
Gen = df.groupby('Gender')
print("\t\t\t0 is female and 1 is male")
Gen.mean()
# max and min
print(Gen.max())
print('\n\n')
print(Gen.min())
# %% [markdown]
# # K-Means Clustering and KNN Classification
X = df.iloc[:, [3, 4]].values
from sklearn.cluster import KMeans

# elbow method: within-cluster sum of squares for k = 1 to 10
wcss = []
for i in range(1, 11):
    km = KMeans(n_clusters=i)
    km.fit_predict(X)
    wcss.append(km.inertia_)

km = KMeans(n_clusters=5)
y_means = km.fit_predict(X)
plt.scatter(X[y_means == 0, 0], X[y_means == 0, 1], color='blue')
plt.scatter(X[y_means == 1, 0], X[y_means == 1, 1], color='red')
plt.scatter(X[y_means == 2, 0], X[y_means == 2, 1], color='green')
plt.scatter(X[y_means == 3, 0], X[y_means == 3, 1], color='yellow')
plt.scatter(X[y_means == 4, 0], X[y_means == 4, 1], color='black')
plt.title('Clusters of customers')
plt.xlabel('Annual Income (k$)')
plt.ylabel('Spending Score (1-100)')
plt.show()
model = KMeans(n_clusters=5)
model.fit(df)
pre = model.predict(df)
df["Target"] = y_means
df.head()
X = df.iloc[:, 1:5]
y = df.iloc[:, -1]
X.head()
y.head()
#splitting the dataset
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2,random_state=0)
# Standardize the variables
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
from sklearn.neighbors import KNeighborsClassifier

# find the k with the lowest error rate on the test split
error_rate = []
for i in range(1, 40):
    knn = KNeighborsClassifier(n_neighbors=i)
    knn.fit(X_train, y_train)
    pred_i = knn.predict(X_test)
    error_rate.append(np.mean(pred_i != y_test))
plt.figure(figsize=(10, 5))
plt.plot(range(1, 40), error_rate, color='blue', linestyle='dashed', marker='o', markersize=12)
plt.title("Error rate vs k value")
plt.xlabel("k")
plt.ylabel("Error rate")
plt.show()
knn =KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
pred_5 = knn.predict(X_test)
from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_test, pred_5)
accuracy