Final 1
Submitted To:
Prof. Brejesh Lall,
Department of Electrical Engineering,
Indian Institute of Technology, Delhi
Submitted By:
Aryan Mishra,
Roll No.: 22117023,
Semester: V,
Department of Electrical Engineering,
National Institute of Technology, Raipur
CERTIFICATE
This is to certify that the minor project report entitled "Impact of Feature
Engineering on the performance of Machine Learning Model",
submitted by Aryan Mishra, is a bona fide work completed under
my supervision and guidance during his research internship at
the Indian Institute of Technology, Delhi.
...........
Prof. Brejesh Lall,
Department of Electrical Engineering,
Indian Institute of Technology, Delhi
ACKNOWLEDGEMENTS
Success is the product of the efforts of many individuals. It is the constant
support of the people who give you the initiative and inspire you at each step of your
endeavor that eventually helps you reach your goal. I would like to express my sincere gratitude
to Prof. Brejesh Lall for providing me with the opportunity to undertake this internship
project. I would also like to extend my heartfelt thanks to Amit Oberoi Sir for his continuous
guidance and mentorship throughout the internship. His knowledge, expertise, and
patience have been instrumental in shaping my research and enhancing my understanding
of the subject matter. I am grateful for his constant support, insightful discussions, and
valuable suggestions that have significantly enriched my learning experience.
Aryan Mishra
ROLL NO: 22117023
5th Sem
B.Tech (Electrical Engineering)
National Institute of Technology Raipur
ABSTRACT
This internship report focuses on the analysis and preprocessing of datasets, along with the
study of machine learning algorithms and neural networks for pattern recognition. The
report begins with an exploration of the importance and limitations of datasets, followed by
an examination of different types of data and techniques for converting categorical and
continuous data. Additionally, correlation, covariance, and outlier detection methods are
discussed, along with strategies for treating outliers. Feature scaling and the application of
Principal Component Analysis (PCA) are also explored.
Overall, this internship report offers valuable insights into the preprocessing of
datasets, the application of machine learning algorithms, and the fundamentals of neural
networks. The practical project demonstrates the impact of feature engineering on model
performance, while the exploration of neural networks expands the understanding of
pattern recognition. The findings from this report contribute to the broader field of
data analysis and machine learning, providing a foundation for further research and
application.
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
CHAPTER 1: INTRODUCTION
1.1 Introduction
2.6.1 Correlation
2.6.2 Covariance
2.7 Detecting the Outliers in the Dataset
2.11 Summary
3.1 Introduction
3.3.3 Summary