Report 3

This report analyzes an airline passenger satisfaction survey dataset using unsupervised machine learning techniques. The data was preprocessed by handling null values, removing outliers, and encoding categorical features. Feature engineering included normalization and removing highly correlated features. KMeans and DBSCAN clustering were applied after reducing dimensions using PCA. KMeans was run with 3 clusters and DBSCAN with min_samples of 5 and an epsilon of 0.5. Clusters were evaluated and compared using silhouette scores. Visualizations of the clusters are provided. Limitations of the clustering algorithms are discussed and potential improvements are suggested. The analysis concludes by summarizing the insights gained.


MEMOONA WAZIR

22i 1435
Machine Learning assignment #3
Report

a. Introduction: Briefly introduce the problem statement and the dataset.
Context
This dataset contains an airline passenger satisfaction survey.

Content
Gender: Gender of the passengers (Female, Male)
Customer Type: The customer type (Loyal customer, disloyal customer)
Age: The actual age of the passengers
Type of Travel: Purpose of the flight of the passengers (Personal Travel, Business Travel)
Class: Travel class in the plane of the passengers (Business, Eco, Eco Plus)
Flight distance: The flight distance of this journey
Inflight wifi service: Satisfaction level of the inflight wifi service (0: Not Applicable; 1-5)
Departure/Arrival time convenient: Satisfaction level with the convenience of the departure/arrival time
Ease of Online booking: Satisfaction level of online booking
Gate location: Satisfaction level of Gate location
Food and drink: Satisfaction level of Food and drink
Online boarding: Satisfaction level of online boarding
Seat comfort: Satisfaction level of Seat comfort
Inflight entertainment: Satisfaction level of inflight entertainment
On-board service: Satisfaction level of On-board service
Leg room service: Satisfaction level of Leg room service
Baggage handling: Satisfaction level of baggage handling
Check-in service: Satisfaction level of Check-in service
Inflight service: Satisfaction level of inflight service
Cleanliness: Satisfaction level of Cleanliness
Departure Delay in Minutes: Minutes of delay at departure
Arrival Delay in Minutes: Minutes of delay at arrival
TARGET: Satisfaction: Airline satisfaction level (satisfaction, neutral or dissatisfaction)
b. Data Preprocessing: Describe the data preprocessing steps performed in the analysis.
• Checked for null values and replaced them with the column mean
• Detected outliers with the interquartile range (IQR) rule and removed them
• Converted the categorical columns into 0/1 indicator columns through one-hot encoding
• Removed the name and id columns using drop()
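The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration on a made-up six-row frame, not the code used in the assignment: the sample values, the 1.5 x IQR fence, the order of the steps, and the choice of columns are all assumptions.

```python
import pandas as pd

# Made-up stand-in for the survey data; values are illustrative only.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "Age": [25, 40, 33, 51, 29, 38],
    "Arrival Delay in Minutes": [0.0, 5.0, None, 10.0, 900.0, 3.0],
    "Class": ["Eco", "Business", "Eco Plus", "Eco", "Business", "Eco"],
})

# Drop identifier columns that carry no information for clustering.
df = df.drop(columns=["id"])

# Replace missing numeric values with the mean of their column.
numeric_cols = df.select_dtypes(include="number").columns
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].mean())

# IQR rule (assumed 1.5 * IQR fence): keep rows whose numeric values
# all lie inside [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR].
q1 = df[numeric_cols].quantile(0.25)
q3 = df[numeric_cols].quantile(0.75)
iqr = q3 - q1
inside = (
    (df[numeric_cols] >= q1 - 1.5 * iqr) & (df[numeric_cols] <= q3 + 1.5 * iqr)
).all(axis=1)
df = df[inside].reset_index(drop=True)

# One-hot encode the categorical column into 0/1 indicator columns.
df = pd.get_dummies(df, columns=["Class"], dtype=int)
```

One design point worth noting: imputing with the mean before removing outliers lets extreme values pull the imputed value upward, so reversing the two steps is a reasonable alternative.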

c. Feature Engineering: Describe the feature engineering tasks performed in the analysis.
• Normalized the data using min-max scaling
• Computed the correlations between features and removed the highly correlated ones
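A rough sketch of these two steps is shown below. The report does not state the correlation cutoff, so the 0.9 threshold here is an assumption, as are the sample values; the approach of scanning the upper triangle of the correlation matrix is one common way to drop one feature from each highly correlated pair.

```python
import numpy as np
import pandas as pd

# Made-up numeric features; values are illustrative only.
df = pd.DataFrame({
    "Flight Distance": [3000.0, 200.0, 1500.0, 800.0, 450.0],
    "Departure Delay in Minutes": [0.0, 10.0, 5.0, 30.0, 2.0],
    "Arrival Delay in Minutes": [1.0, 12.0, 6.0, 33.0, 2.0],
    "Seat comfort": [3.0, 5.0, 1.0, 2.0, 4.0],
})

# Min-max normalization: rescale every column to the range [0, 1].
col_min = df.min()
col_range = (df.max() - col_min).replace(0, 1)  # guard against constant columns
scaled = (df - col_min) / col_range

# Absolute pairwise correlations; keep only the upper triangle so that
# each pair of features is inspected once.
corr = scaled.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

threshold = 0.9  # assumed cutoff, not stated in the report
to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
reduced = scaled.drop(columns=to_drop)
```

In this toy data the two delay columns are almost perfectly correlated, so the later one is dropped and the earlier one is kept.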

d. Clustering: Describe the KMeans and DBSCAN algorithms used in the analysis and the performance metrics used to evaluate them. Also, describe the fine-tuning of the clustering algorithms and the comparison of their performance.

Feature extraction is performed with PCA, which reduces the dataset to 3 principal
components using pca = PCA(n_components=3) and pca_data =
pca.fit_transform(data).
KMeans clustering is performed with 3 clusters using kmeans = KMeans(n_clusters=3,
random_state=42) and kmeans_labels = kmeans.fit_predict(pca_data). The silhouette score
is calculated using kmeans_silhouette = silhouette_score(pca_data, kmeans_labels).
DBSCAN clustering is performed with an epsilon (neighborhood radius) of 0.5 and
min_samples=5, the number of points required within that radius for a point to count as a
core point, using dbscan = DBSCAN(eps=0.5, min_samples=5) and dbscan_labels =
dbscan.fit_predict(pca_data). The silhouette score is calculated using dbscan_silhouette
= silhouette_score(pca_data, dbscan_labels).
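The calls quoted above can be assembled into a runnable sketch like the one below. The survey data is not reproduced here, so a synthetic dataset from make_blobs stands in for it; that dataset, the explicit n_init=10, and the handling of DBSCAN noise points are assumptions added for illustration and are not part of the original analysis.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the preprocessed survey features (assumption).
X, _ = make_blobs(n_samples=300, n_features=8, centers=3, random_state=42)
data = MinMaxScaler().fit_transform(X)

# Reduce to 3 principal components, as in the report.
pca = PCA(n_components=3)
pca_data = pca.fit_transform(data)

# KMeans with 3 clusters; n_init is set explicitly (assumption) so the
# result does not depend on the library's version-specific default.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
kmeans_labels = kmeans.fit_predict(pca_data)
kmeans_silhouette = silhouette_score(pca_data, kmeans_labels)

# DBSCAN with the report's parameters. Points labelled -1 are noise.
dbscan = DBSCAN(eps=0.5, min_samples=5)
dbscan_labels = dbscan.fit_predict(pca_data)

# The silhouette score needs at least two clusters. Noise points are
# excluded here (assumption) so they are not scored as a cluster.
mask = dbscan_labels != -1
n_dbscan_clusters = len(set(dbscan_labels[mask]))
if n_dbscan_clusters >= 2:
    dbscan_silhouette = silhouette_score(pca_data[mask], dbscan_labels[mask])
else:
    dbscan_silhouette = float("nan")

print("KMeans silhouette:", kmeans_silhouette)
print("DBSCAN clusters:", n_dbscan_clusters, "silhouette:", dbscan_silhouette)
```

On min-max scaled data every feature lies in [0, 1], so eps=0.5 is a large radius and DBSCAN may merge most points into a single cluster; in that case the silhouette score is undefined and the sketch reports NaN rather than raising an error.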

e. Results: Describe the results of the analysis and interpret the clusters obtained.

f. Visualization: Include the visualization plots of the clusters obtained.

g. Limitations and Future Work: Identify the limitations and drawbacks of the clustering algorithms and suggest possible improvements.

h. Conclusion: Provide a summary of the analysis and the insights obtained from it.
