Data Analysis
Detectives and data analysts have a lot in common. Both depend on facts and clues to make decisions.
Both collect and look at the evidence. Both talk to people who know part of the story. And both might
even follow some footprints to see where they lead. Whether you’re a detective or a data analyst, your
job is all about following steps to collect and understand facts.
Analysts use data-driven decision-making and follow a step-by-step process. You have learned that there
are six steps to this process: ask, prepare, process, analyze, share, and act.
But there are other factors that influence the decision-making process. You may have read mysteries
where the detective used their gut instinct, and followed a hunch that helped them solve the case. Gut
instinct is an intuitive understanding of something with little or no explanation. This isn’t always
something conscious; we often pick up on signals without even realizing. You just have a “feeling” it’s
right.
Consider an example of a real estate developer bidding to redevelop a part of a city's central district.
They were well known for preserving historical buildings. Banking on that reputation, the developer's
planners followed gut instinct and included the preservation of several buildings to gain support and win
approval for the project. However, private donations fell short and a partnership failed to materialize
and save the day. The buildings eventually had to be torn down after much delay and an expensive
dispute with the city.
The more you understand the data related to a project, the easier it will be to figure out what is
required. These efforts will also help you identify errors and gaps in your data so you can communicate
your findings more effectively. Sometimes past experience helps you make a connection that no one
else would notice. For example, a detective might be able to crack open a case because they remember
an old case just like the one they’re solving today. It's not just gut instinct.
Blending data with business knowledge, plus maybe a touch of gut instinct, will be a common part of
your process as a junior data analyst. The key is figuring out the exact mix for each particular project. A
lot of times, it will depend on the goals of your analysis. That is why analysts often ask, “How do I define
success for this project?”
In addition, try asking yourself these questions about a project to help find the perfect balance:
For instance, if you are working on a rush project, you might need to rely on your own knowledge and
experience more than usual. There just isn’t enough time to thoroughly analyze all of the available data.
But if you get a project that involves plenty of time and resources, then the best strategy is to be more
data-driven. It’s up to you, the data analyst, to make the best possible choice. You will probably blend
data and knowledge a million different ways over the course of your data analytics career. And the more
you practice, the better you will get at finding that perfect blend.
Origins of the data analysis process
When you decided to join this program, you proved that you are a curious person. So
let’s tap into your curiosity and talk about the origins of data analysis. We don’t fully
know when or why the first person decided to record data about people and things. But
we do know it was useful because the idea is still around today!
We also know that data analysis is rooted in statistics, which has a pretty long history
itself. Archaeologists mark the start of statistics in ancient Egypt with the building of the
pyramids. The ancient Egyptians were masters of organizing data. They documented
their calculations and theories on papyri (paper-like materials), which are now viewed as
the earliest examples of spreadsheets and checklists. Today’s data analysts owe a lot
to those brilliant scribes, who helped create a more technical and efficient process.
It is time to enter the data analysis life cycle—the process of going from data to decision.
Data goes through several phases as it gets created, consumed, tested, processed, and
reused. With a life cycle model, all key team members can drive success by planning
work both up front and at the end of the data analysis process. While the data analysis
life cycle is well known among experts, there isn't a single defined structure of those
phases. There might not be one single architecture that’s uniformly followed by every
data analysis expert, but there are some shared fundamentals in every data analysis
process. This reading provides an overview of several, starting with the process that
forms the foundation of the Google Data Analytics Certificate.
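The process presented as part of the Google Data Analytics Certificate is one that will
be valuable to you as you keep moving forward in your career:
1. Ask
2. Prepare
3. Process
4. Analyze
5. Share
6. Act
The data analytics life cycle from EMC Corporation also has six steps: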
1. Discovery
2. Pre-processing data
3. Model planning
4. Model building
5. Communicate results
6. Operationalize
EMC Corporation is now Dell EMC. This model, created by David Dietrich, reflects the
cyclical nature of real-world projects. The phases aren’t static milestones; each step
connects and leads to the next, and eventually repeats. Key questions help analysts
test whether they have accomplished enough to move forward and ensure that teams
have spent enough time on each of the phases and don’t start modeling before the data
is ready. It is a little different from the data analysis life cycle this program is based on,
but it has some core ideas in common: the first phase is about discovering and
asking questions; data has to be prepared before it can be analyzed and used; and then
findings should be shared and acted on.
For more information, refer to The Genesis of EMC's Data Analytics Lifecycle.
SAS, a data analytics software company, uses an iterative life cycle with seven steps:
1. Ask
2. Prepare
3. Explore
4. Model
5. Implement
6. Act
7. Evaluate
SAS emphasizes the cyclical nature of its model by visualizing it as an
infinity symbol. The life cycle has seven steps, many of which we have seen in the
other models, like Ask, Prepare, Model, and Act. But this life cycle is also a little
different; it includes a step after the act phase designed to help analysts evaluate their
solutions and potentially return to the ask phase again.
For more information, refer to Managing the Analytics Life Cycle for Decisions at Scale.
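To make the idea of a cyclical life cycle concrete, here is a minimal sketch in Python of the seven SAS steps listed above, treated as a loop in which Evaluate leads back to Ask. The names SAS_PHASES and next_phase are hypothetical and chosen only for illustration; they do not come from SAS or from this program.

```python
# Hypothetical sketch: the seven SAS life cycle steps treated as a repeating cycle.
# SAS_PHASES and next_phase are illustrative names, not part of any SAS product.
SAS_PHASES = ["Ask", "Prepare", "Explore", "Model", "Implement", "Act", "Evaluate"]


def next_phase(current: str) -> str:
    """Return the step that follows `current`, wrapping from Evaluate back to Ask."""
    index = SAS_PHASES.index(current)  # raises ValueError for an unknown step name
    return SAS_PHASES[(index + 1) % len(SAS_PHASES)]


if __name__ == "__main__":
    print(next_phase("Act"))       # Evaluate
    print(next_phase("Evaluate"))  # Ask
```

The modulo step is what makes the list behave like a cycle rather than a straight line: after the last step, the analyst returns to the first one with new questions.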
For more information, refer to Understanding the data analytics project life cycle.
For more information, refer to Big Data Adoption and Planning Considerations.
Key takeaway
From our journey to the pyramids and data in ancient Egypt to now, the way we analyze
data has evolved (and continues to do so). The data analysis process is like real-life
architecture: there are different ways to do things, but the same core ideas still appear in
each model of the process. Whether you use the structure of this Google Data Analytics
Certificate or one of the many other iterations you have learned about, we are here to
help guide you as you continue on your data journey.