Data Analysis and Visualization Summer Training Report
On
Submitted By
SUBMITTED TO:
I hereby declare that the Industrial Training Report on Data Analysis and Visualization
during the period from 1st Sept to 26th Sept for the award of degree of B. Tech. CSE,
Trinity Institute of Innovations in Professional Studies, Greater Noida (U.P.), under the
(Signature of student)
Himanshu Sharma
00227902721
Date: 26/10/2024
Certified that the above statement made by the student is correct to the best of our knowledge and
belief.
Examined by:
(Signature)
Dr Shailendra Kumar
(Signature)
Head of Department
ACKNOWLEDGEMENT
First and foremost, I wish to express my sincere thanks and gratitude to my esteemed mentor,
“Accenture North America”, who has contributed so much to the successful completion of my
Next, I would like to tender my sincere thanks to “Gunjan Arya” (Head of CSE Department)
(Signature of student)
Himanshu Sharma
00227902721
LIST OF CONTENTS
Certificate by Company/Industry
Declaration by student
Acknowledgement
Table of Contents
List of Tables
List of Figures
Abbreviations and Nomenclature (If any)
1. Chapters 1-10
1.1 Introduction
1.2 Theory
LIST OF FIGURES
DATA UNDERSTANDING - Understanding data and data analysis involves grasping various
concepts, including the types of data, methods of collection, and how to interpret findings. Data
can be quantitative (numerical) or qualitative (categorical), and it can be gathered through
surveys, experiments, or observational studies. Structuring this data effectively, often in tables
or databases, is crucial for analysis.
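The distinction between quantitative and qualitative data can be illustrated with a short sketch. The records, column names, and helper functions below are hypothetical and were not part of the training material; the sketch only assumes that the data is structured as a table of rows. Numeric columns are summarized by their mean, and categorical columns by a count of each value.

```python
from collections import Counter
from statistics import mean

# Hypothetical survey records; column names and values are illustrative only.
records = [
    {"age": 34, "city": "Delhi", "spend": 120.5},
    {"age": 28, "city": "Noida", "spend": 80.0},
    {"age": 41, "city": "Delhi", "spend": 99.5},
]


def is_quantitative(values):
    """Treat a column as quantitative when every value is an int or float."""
    return all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )


def summarize(rows):
    """Summarize each column: mean for numeric data, value counts otherwise."""
    summary = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        if is_quantitative(values):
            summary[column] = {"type": "quantitative", "mean": mean(values)}
        else:
            summary[column] = {
                "type": "qualitative",
                "counts": dict(Counter(values)),
            }
    return summary


summary = summarize(records)
print(summary)
```

Knowing the type of each column in advance helps decide which summary statistics and which charts are meaningful for it.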
DATA CLEANING - Data cleaning is a crucial step in data analysis that ensures the accuracy
and reliability of the dataset. It involves identifying and correcting errors or inconsistencies,
starting with the detection of missing values, which can be addressed by removing records or
imputing values.
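The two ways of handling missing values mentioned above, removing records and imputing values, can be sketched as follows. The rows, the "amount" column, and the function names are assumptions made for illustration, and mean imputation is only one of several possible imputation methods.

```python
from statistics import mean

# Hypothetical sales rows; None marks a missing value.
rows = [
    {"order_id": 1, "amount": 250.0},
    {"order_id": 2, "amount": None},
    {"order_id": 3, "amount": 150.0},
]


def find_missing(rows, column):
    """Return the indexes of rows whose value in `column` is missing."""
    return [i for i, row in enumerate(rows) if row[column] is None]


def drop_missing(rows, column):
    """Strategy 1: remove every record with a missing value in `column`."""
    return [row for row in rows if row[column] is not None]


def impute_mean(rows, column):
    """Strategy 2: fill missing values with the mean of the observed values."""
    observed = [row[column] for row in rows if row[column] is not None]
    fill = mean(observed)
    # Build new dictionaries so the original rows are left unchanged.
    return [
        {**row, column: fill if row[column] is None else row[column]}
        for row in rows
    ]


missing = find_missing(rows, "amount")
dropped = drop_missing(rows, "amount")
imputed = impute_mean(rows, "amount")
```

Removing records is simpler but shrinks the dataset, while imputation keeps every record at the cost of introducing estimated values, so the choice depends on how much data is missing.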
1. Conceptual Models: High-level representations that outline the overall structure and
relationships without delving into technical details.
2. Logical Models: More detailed than conceptual models, these specify the data elements,
attributes, and relationships while remaining independent of a specific database management
system.
3. Physical Models: These represent how data is stored in the database, including specific
tables, columns, data types, and indexing strategies.
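As a rough illustration of the physical level, the sketch below creates two tables in an in-memory SQLite database using Python's built-in sqlite3 module. The customers and orders tables, their columns, and the index name are hypothetical choices, not the schema used in the training project.

```python
import sqlite3

# Conceptual level: a Customer places Orders.
# Logical level: Customer(customer_id, name, email),
#                Order(order_id, customer_id, order_date, total).
# Physical level: the concrete tables, column types, and index below.

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript(
    """
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        email       TEXT NOT NULL UNIQUE
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        order_date  TEXT NOT NULL,
        total       REAL NOT NULL
    );
    CREATE INDEX idx_orders_customer ON orders(customer_id);
    """
)

# List the user-defined tables and indexes that now exist in the database.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
indexes = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND name NOT LIKE 'sqlite_%'"
    )
]
print(tables, indexes)
```

The same conceptual and logical models could be mapped to a different physical design in another database system, which is why the first two levels are kept independent of it.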
Types of Visualizations
1. Charts: Bar charts, line charts, pie charts, and scatter plots are commonly used to represent
relationships and trends.
2. Maps: Geographic data is effectively displayed using heat maps and choropleth maps, which
show patterns across different locations.
3. Dashboards: Combining multiple visualizations into a single interface allows for
comprehensive data analysis at a glance.
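To make the idea of a bar chart concrete, the following minimal sketch draws one as plain text. It is only a stand-in for a real charting tool; the bar_chart function and the category totals are hypothetical and do not come from the project data.

```python
def bar_chart(data, width=20, symbol="#"):
    """Render a horizontal text bar chart; the largest value spans `width`."""
    largest = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        # Scale each bar relative to the largest value.
        bar = symbol * round(value / largest * width)
        lines.append(f"{label.ljust(label_width)} | {bar} {value}")
    return "\n".join(lines)


# Hypothetical category totals used only for illustration.
totals = {"Animals": 40, "Science": 30, "Food": 20}
chart = bar_chart(totals)
print(chart)
```

Each bar's length is proportional to its value, which is the property that lets a reader compare categories quickly.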
INTRODUCTION
Data Modelling
1. Entities and Attributes
Entities: These are the primary objects or concepts that hold data (e.g., customers,
orders).
Attributes: These are the characteristics or properties of entities (e.g., a customer’s name,
email, and purchase history).
2. Relationships
Data modelling defines how entities relate to each other, such as one-to-one,
one-to-many, or many-to-many relationships. Understanding these relationships is
crucial for accurately representing the data structure.
3. Normalization
This process involves organizing data to minimize redundancy and improve data
integrity. It helps ensure that dependencies are properly maintained, and that data is
stored efficiently.
4. Data Modelling Tools
Various tools are available for data modelling, including ERD (Entity-Relationship
Diagram) software like Lucidchart and MySQL Workbench, and modelling languages like
UML (Unified Modelling Language).
5. Importance of Data Modelling
Effective data modelling leads to better database design, enhances data consistency,
supports improved data management, and provides a clear framework for understanding
and analysing data relationships.
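The concepts above (entities, attributes, a one-to-many relationship, and normalization) can be sketched together in a few lines. The Customer and Order classes, the sample rows, and the normalize function are hypothetical names chosen for illustration. The sketch assumes a flat table in which customer details are repeated on every order row, and splits it so that each customer is stored once.

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    """Entity with two attributes."""
    order_id: int
    total: float


@dataclass
class Customer:
    """Entity whose `orders` list models a one-to-many relationship."""
    customer_id: int
    name: str
    email: str
    orders: list = field(default_factory=list)


# Denormalized input: customer details are repeated on every order row.
flat_rows = [
    {"order_id": 1, "total": 50.0, "customer_id": 10,
     "name": "Asha", "email": "asha@example.com"},
    {"order_id": 2, "total": 75.0, "customer_id": 10,
     "name": "Asha", "email": "asha@example.com"},
    {"order_id": 3, "total": 20.0, "customer_id": 11,
     "name": "Ravi", "email": "ravi@example.com"},
]


def normalize(rows):
    """Store each customer once and attach that customer's orders to it."""
    customers = {}
    for row in rows:
        customer = customers.setdefault(
            row["customer_id"],
            Customer(row["customer_id"], row["name"], row["email"]),
        )
        customer.orders.append(Order(row["order_id"], row["total"]))
    return customers


customers = normalize(flat_rows)
```

After normalization, a change to a customer's email needs to be made in one place only, which is the redundancy and integrity benefit described under Normalization.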
What Have I Created
In this project, we worked with data sets and carried out data cleaning, data modelling,
data analysis, and data visualization. Let’s take a look at what I have done in this project.
PROCESS 1.
PROCESS 2.
In this step, we carried out requirement gathering.
PROCESS 3.
In this step, we have different data sets, on which we perform data cleaning.
Data sets Before Data Cleaning and Modelling
Internship Experience
The internship experience was both challenging and rewarding. One major challenge I faced
was dealing with incomplete data that hindered initial analysis. To overcome this, I developed a
systematic approach to identify missing values, use statistical methods for imputation, and
document my findings for transparency.
This experience has set a strong foundation for my professional journey, and I am enthusiastic
about continuing to develop my skills in the field of data analytics.
REFERENCE
For online/Google Search
Website - https://www.accenture.com/gb-en/careers/local/virtual-experience-program
For Help - https://www.kaggle.com/datasets/ahmedsamir11111/project-data-analysis-using-excel
GitHub - https://github.com/Mabrar92/Data-Analysis-Projects-Portfolio
Book reference - https://files.eric.ed.gov/fulltext/ED536788.pdf