Data Visualization and EDA
• Bar charts
• Histograms
• Heat maps
• Scatter plots
• Infographics
• Maps
1) Histogram
A histogram plots the distribution of values in a numerical
column. It groups the values into bins, each covering a range
of values, and draws one bar per bin whose height is the number
of values in that range. This lets us see where most values lie,
for example on the positive side, the negative side, or around
the center (mean). The Age column is a typical example.
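The binning described above can be sketched in Python as follows. This is only an illustration: the ages are made-up values standing in for an Age column, the bin width of 10 is an arbitrary choice, and `bin_counts` and `plot_histogram` are hypothetical helper names. The plotting helper assumes matplotlib is installed.

```python
from collections import Counter

# Hypothetical sample standing in for an Age column.
ages = [22, 38, 26, 35, 35, 54, 2, 27, 14, 4, 58, 20, 39, 14, 55, 31]
BIN_WIDTH = 10  # arbitrary choice for this sketch


def bin_counts(values, width):
    """Count the values in each bin [start, start + width), keyed by start."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))


counts = bin_counts(ages, BIN_WIDTH)

# Text rendering: one row per non-empty bin, one '#' per value in it.
for start, n in counts.items():
    print(f"{start:>2}-{start + BIN_WIDTH - 1:<2} {'#' * n}")


def plot_histogram(values, width):
    """Draw the same bins with matplotlib (assumed to be installed)."""
    import matplotlib.pyplot as plt  # local import: the code above is stdlib-only

    # Bin edges start at 0, which assumes non-negative values such as ages.
    edges = list(range(0, max(values) + width, width))
    fig, ax = plt.subplots()
    ax.hist(values, bins=edges, edgecolor="black")
    ax.set_xlabel("Age")
    ax.set_ylabel("Count")
    return fig
```

Calling `plot_histogram(ages, BIN_WIDTH)` returns a figure that can be displayed or saved. The tallest bar marks the range where most values lie.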
2) Pie Chart
A pie chart shows the same information as a count plot, and
additionally shows the percentage share of each category, that
is, how much weight each category carries in the data. This
classic chart type is effective when you want to illustrate the
proportion of each category in the dataset. However, avoid it
when the data has many categories, as too many slices create
confusion. The chart is suitable when you have a limited number
of categories, ideally fewer than six or seven.
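As a rough sketch, the percentage share that each slice represents can be computed like this. The `plans` column and its category names are invented for illustration, `MAX_SLICES` simply encodes the rule of thumb from the text, and `plot_pie` is a hypothetical helper that assumes matplotlib is installed.

```python
from collections import Counter

# Hypothetical categorical column with three categories.
plans = ["basic", "pro", "basic", "basic", "team",
         "basic", "pro", "basic", "basic", "pro"]

counts = Counter(plans)
total = sum(counts.values())

# Percentage share of each category: the quantity a pie slice encodes.
shares = {category: 100 * n / total for category, n in counts.items()}

MAX_SLICES = 6  # rule of thumb from the text: about six or seven at most
too_many_slices = len(shares) > MAX_SLICES


def plot_pie(shares):
    """Draw one slice per category with matplotlib (assumed to be installed)."""
    import matplotlib.pyplot as plt  # local import: the code above is stdlib-only

    fig, ax = plt.subplots()
    ax.pie(list(shares.values()), labels=list(shares.keys()), autopct="%1.1f%%")
    return fig
```

If `too_many_slices` is true, a bar plot of the same counts is usually easier to read than a pie chart.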
3) Bar Plot
A bar plot is a simple plot with a categorical variable on the
x-axis and a numerical variable on the y-axis, used to explore
the relationship between the two variables.
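A minimal sketch of the data behind a bar plot is shown below. It assumes the bar height is the mean of the numerical variable within each category, which is one common choice and not the only one. The records, the department names, and the `plot_bars` helper are hypothetical, and the plotting helper assumes matplotlib is installed.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (category, value) records, e.g. department and salary in thousands.
records = [("HR", 40), ("IT", 60), ("HR", 44),
           ("Sales", 50), ("IT", 66), ("Sales", 52)]

groups = defaultdict(list)
for category, value in records:
    groups[category].append(value)

# One bar per category; the bar height is the mean of that category's values.
bar_heights = {category: mean(values) for category, values in sorted(groups.items())}


def plot_bars(bar_heights):
    """Draw categories on the x-axis and mean values on the y-axis."""
    import matplotlib.pyplot as plt  # local import: the code above is stdlib-only

    fig, ax = plt.subplots()
    ax.bar(list(bar_heights.keys()), list(bar_heights.values()))
    ax.set_xlabel("Category")
    ax.set_ylabel("Mean value")
    return fig
```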
4) Scatter plots
A scatter plot is a visualization that shows a collection of
data points ‘scattered’ across the graph. The data points
can be evenly or unevenly distributed. Scatter plots are ideal
for exploring relationships and patterns between two
continuous variables. They can help you identify trends,
correlations, or potential clusters in the data.
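The correlation that a scatter plot makes visible can also be quantified. The sketch below computes the Pearson correlation coefficient for two made-up continuous variables; the variable names, the data, and the `pearson` and `plot_scatter` helpers are all illustrative assumptions, and the plotting helper assumes matplotlib is installed.

```python
from math import sqrt

# Hypothetical paired observations of two continuous variables.
hours_studied = [1, 2, 3, 4, 5, 6]
exam_score = [52, 55, 61, 64, 70, 74]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson(hours_studied, exam_score)
print(f"correlation: {r:.3f}")


def plot_scatter(xs, ys):
    """Draw one point per (x, y) pair with matplotlib (assumed installed)."""
    import matplotlib.pyplot as plt  # local import: the code above is stdlib-only

    fig, ax = plt.subplots()
    ax.scatter(xs, ys)
    ax.set_xlabel("Hours studied")
    ax.set_ylabel("Exam score")
    return fig
```

A value of `r` close to 1 or -1 corresponds to points lying near a straight line in the plot, while a value near 0 corresponds to a cloud with no linear trend.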
5) Line charts
A line chart connects distinct data points with straight
lines. Its best use case is to highlight trends, patterns, and
changes in a variable. With one line per group, it also helps
compare how different groups relate to each other. Line charts
are effective for demonstrating progression, which makes them
suitable for scenarios like project timelines, production
cycles, or population growth.
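To illustrate, the sketch below takes a made-up yearly series and computes the change between consecutive points, which is what the slope of each line segment shows. The figures and the `plot_line` helper are hypothetical, and the plotting helper assumes matplotlib is installed.

```python
# Hypothetical yearly production figures.
years = [2019, 2020, 2021, 2022, 2023]
units = [120, 135, 150, 148, 170]

# Change between consecutive points: the rise or fall of each line segment.
changes = [later - earlier for earlier, later in zip(units, units[1:])]
trend = ["up" if c > 0 else "down" if c < 0 else "flat" for c in changes]


def plot_line(xs, ys):
    """Connect the points in order with straight lines."""
    import matplotlib.pyplot as plt  # local import: the code above is stdlib-only

    fig, ax = plt.subplots()
    ax.plot(xs, ys, marker="o")
    ax.set_xticks(xs)  # one tick per year instead of fractional ticks
    ax.set_xlabel("Year")
    ax.set_ylabel("Units produced")
    return fig
```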
6) Heatmap charts
Heatmap charts are a type of data visualization that uses
color coding to represent values. Each cell in the matrix is
assigned a color based on the value it holds. This type of
chart is commonly used to show how a value varies across a
grid defined by two variables. The intensity of the colors
reflects the magnitude of each value, making it easy to
identify patterns and trends.
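The color coding can be sketched as a two-step process: scale each cell to a number between 0 and 1, then map that number to a shade. The grid below is invented, the text shades are a stand-in for a real colormap, and `plot_heatmap` is a hypothetical helper that assumes matplotlib is installed.

```python
# Hypothetical 3x3 grid, e.g. rows are weekdays and columns are time slots.
grid = [
    [1, 3, 5],
    [2, 4, 8],
    [0, 6, 10],
]

lo = min(min(row) for row in grid)
hi = max(max(row) for row in grid)

# Scale each cell to [0, 1]; a colormap then turns this number into a color.
# This assumes the grid is not constant, so that hi > lo.
intensity = [[(v - lo) / (hi - lo) for v in row] for row in grid]

# Text rendering: darker characters stand for larger values.
SHADES = " .:*#"
for row in intensity:
    print("".join(SHADES[round(x * (len(SHADES) - 1))] * 2 for x in row))


def plot_heatmap(grid):
    """Draw the grid as colored cells with a color scale beside it."""
    import matplotlib.pyplot as plt  # local import: the code above is stdlib-only

    fig, ax = plt.subplots()
    image = ax.imshow(grid, cmap="viridis")
    fig.colorbar(image, ax=ax)
    return fig
```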
Data Visualization Tools
1. Tableau
Tableau is one of the most popular data visualization tools on
the market for two main reasons: It’s relatively easy to use
and incredibly powerful. The software can integrate with
hundreds of sources to import data and output dozens of
visualization types—from charts to maps and more. Owned
by Salesforce, Tableau boasts millions of users and
community members, and it’s widely used at the enterprise
level.
Tableau offers several products, including desktop, server, and
web-hosted versions of its analytics platform, along with
customer relationship management (CRM) software.
A free option, called Tableau Public, is also available. It’s
important to note, however, that any visualizations created on
the free version are available for anyone to see. This makes it
a good option to learn the software's basics, but it’s not ideal
for any proprietary or sensitive data.
Tableau can be used by data analysts, data scientists,
statisticians, and others to visualize data and draw clear
conclusions from the analysis. It is well known for turning
input data into the required visualization in a very short
time. It is also promoted as doing this with a high level of
security, with security issues handled as soon as they arise or
are reported by users. The public version of Tableau is free
for anyone looking for a powerful way to create data
visualizations that can be used in a variety of settings.
Tableau also allows its users to prepare, clean, and format
their data and then create data visualizations to obtain
actionable insights that can be shared with other users.
Tableau is available for individual data analysts or at scale for
business teams and organizations.
2. Microsoft Excel and Microsoft Power BI
In the strictest sense, Microsoft Excel is spreadsheet
software, not a data visualization tool. Even so, it has useful
data visualization capabilities. Given that Microsoft products
are widely used at the enterprise level, you may already have
access to it.
According to Microsoft’s documentation, you can use Excel
to design at least 20 types of charts using data in spreadsheets.
These range from common options, such as bar charts, pie
charts, and scatter plots, to more advanced ones like radar
charts, histograms, and treemaps.
There are limitations to what you can create in Excel. If your
organization is looking for a more powerful data visualization
tool but wants to stay within the Microsoft ecosystem, Power
BI is an excellent alternative. Built specifically as a data
analytics and visualization tool, Power BI can import data
from various sources and output visualizations in a range of
formats.
Microsoft Power BI is a data visualization platform aimed at
building a data-driven business intelligence culture within
organizations. To that end, it offers self-service analytics
tools that can be used to analyze, aggregate, and share data in
a meaningful way.
Microsoft Power BI offers hundreds of data visualizations to
its customers, along with built-in artificial intelligence
capabilities and Excel integration.
3. Zoho Analytics
Zoho Analytics is business intelligence and data analytics
software that can help you create attractive data
visualizations from your data in a few minutes. You can pull
data from multiple sources and blend it to create
multidimensional data visualizations that let you view your
business data across departments.
Zoho Analytics is a data visualization tool specifically
designed for professionals looking to visualize business
intelligence. As such, it’s most commonly used to visualize
information related to sales, marketing, profit, revenues, costs,
and pipelines with user-friendly dashboards. More than
500,000 businesses and two million users currently leverage
the software.
Zoho Analytics has several paid options, depending on your
needs. There’s also a free version that allows you to build a
limited number of reports, which can be helpful if you’re
testing the waters to determine which tool is best for your
business.
4. Domo
Domo is a business intelligence platform that brings multiple
data visualization tools together in one place. You can
perform data analysis there and then create interactive data
visualizations that help other people understand your
conclusions. You can combine cards, text, and images in the
Domo dashboard to guide other people through the data and tell
a data story as they go.
If you are unsure where to start, you can use Domo's pre-built
dashboards to obtain quick insights from the data.
5. Infogram
Infogram is a fully-featured drag-and-drop visualization tool
that allows even non-designers to create effective
visualizations of data for marketing reports,
infographics, social media posts, maps, dashboards, and more.
Infogram is a popular option that can be used to generate
charts, reports, and maps.
What sets Infogram apart from the other tools on this list is
that you can use it to create infographics (where its name
comes from), making it especially popular among creative
professionals. Additionally, the tool includes a drag-and-drop
editor, which can be helpful for beginners.
Visualizations can be saved as image files and GIFs to be
embedded in reports and documents, or in HTML to be used
online. Like most of the other tools on this list, Infogram has
tiered pricing, ranging from a free version to an
enterprise-level version.
Finished visualizations can be exported into a number of
formats: .PNG, .JPG, .GIF, .PDF, and .HTML. Interactive
visualizations are also possible, perfect for embedding into
websites or apps. Infogram also offers a WordPress plugin
that makes embedding visualizations even easier for
WordPress users.
6. Google Charts
For professionals interested in creating interactive data
visualizations destined to live on the internet, Google
Charts is a popular free option.
The tool can pull data from various sources—including
Salesforce, SQL databases, and Google Sheets—and uses
HTML5/SVG technology to generate charts, which makes
them incredibly accessible. It offers 18 types of charts,
including bar charts, pie charts, histograms, geo charts, and
area charts.
Members of the Google community occasionally create new
charts and share them with other users. These charts are
arranged in a gallery on Google's website. They tend to be
more advanced but may not be HTML5-compliant.
7. RStudio
In R, we can create visually appealing data visualizations by
writing a few lines of code, using the wide range of
functionality that R provides. Data visualization is an
efficient technique for gaining insight into data through a
visual medium. With the help of visualization techniques, a
person can easily spot hidden patterns in data that might
otherwise be overlooked.
Data visualization also lets us work with large datasets and
efficiently obtain key insights from them.
R provides a number of packages for data visualization, such
as ggplot2, plotly, and tidyquant.
Types of EDA
Here are key types of EDA techniques:
• Univariate Analysis: Univariate analysis is the simplest