Google Certificate (Notes)
Data analysis: the collection, transformation, and organization of data to draw conclusions,
make predictions, and drive informed decision-making.
Data analyst: someone who collects, transforms, and organizes data in order to help make
informed decisions.
COURSE OVERVIEW
The six steps of the data analysis process that you have been learning in this program are: ask,
prepare, process, analyze, share, and act. These six steps apply to any data analysis.
The analysts asked questions to define both the issue to be solved and what would equal a
successful result.
Next, they prepared by building a timeline and collecting data with employee surveys,
which should be inclusive.
They processed the data by cleaning it to make sure it was complete, correct, relevant,
and free of errors and outliers.
They analyzed the clean employee survey data. Then the analysts shared their findings
and recommendations with team leaders. Afterward, leadership acted on the results and
focused on improving key areas.
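The process step above (checking that data is complete, correct, and free of errors and outliers) can be sketched in Python. This is only a rough illustration, not the method used in the case study: the survey records, the field names `employee_id` and `satisfaction`, and the 1 to 5 rating scale are all hypothetical.

```python
# Hypothetical survey responses; field names and the 1-5 scale are assumptions.
responses = [
    {"employee_id": 1, "satisfaction": 4},
    {"employee_id": 2, "satisfaction": None},  # incomplete: no answer given
    {"employee_id": 3, "satisfaction": 5},
    {"employee_id": 3, "satisfaction": 5},     # duplicate of the previous row
    {"employee_id": 4, "satisfaction": 11},    # error: outside the 1-5 scale
    {"employee_id": 5, "satisfaction": 2},
]


def clean_responses(rows, low=1, high=5):
    """Keep rows that are complete, within the valid range, and not duplicated."""
    seen_ids = set()
    cleaned = []
    for row in rows:
        score = row.get("satisfaction")
        if score is None:                   # drop incomplete rows
            continue
        if not low <= score <= high:        # drop out-of-range values
            continue
        if row["employee_id"] in seen_ids:  # drop repeated employees
            continue
        seen_ids.add(row["employee_id"])
        cleaned.append(row)
    return cleaned


cleaned = clean_responses(responses)
print([row["employee_id"] for row in cleaned])
```

Only the rows for employees 1, 3, and 5 survive. Real cleaning rules depend on the project; dropping rows is just one option, and sometimes a missing or odd value should be investigated instead.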
Dimensions of data analytics
A data analyst is an explorer, a detective and an artist all rolled into one.
An ecosystem is a group of elements that interact with one another. Ecosystems can be large or tiny.
Data ecosystems are the various elements that interact with one another in order to produce,
manage, store, organize, analyze, and share data. An example of a data ecosystem is a retail store
database, which is filled with customer names, addresses, previous purchases, and customer
reviews. A data analyst can use this data to predict what customers will buy in the future and make
sure the store has the products in stock when they are needed.
Data can be found in the cloud. The cloud is a place to keep data online, rather than on a computer
hard drive.
Data science is defined as creating new ways of modeling and understanding the unknown by
using raw data.
The first step of data-driven decision-making is to define the business need (a problem that needs to
be solved). Once the problem is defined, a data analyst finds data, analyzes it, and uses it to uncover
trends, patterns, and relationships.
Data + business knowledge = mystery solved
Blending data with business knowledge, plus maybe a touch of gut instinct, will be a common part
of your process as a junior data analyst. The key is figuring out the exact mix for each particular
project. A lot of times, it will depend on the goals of your analysis. That is why analysts often ask,
“How do I define success for this project?”
In addition, try asking yourself these questions about a project to help find the perfect balance:
Data analysis is rooted in statistics, which has a pretty long history itself. Archaeologists mark the
start of statistics in ancient Egypt with the building of the pyramids. The ancient Egyptians were
masters of organizing data. They documented their calculations and theories on papyri (paper-like
materials), which are now viewed as the earliest examples of spreadsheets and checklists. Today’s
data analysts owe a lot to those brilliant scribes, who helped create a more technical and efficient
process.
The process presented as part of the Google Data Analytics Certificate is one that will be valuable
to you as you keep moving forward in your career:
Analytical Skills
Qualities and characteristics associated with solving problems using facts.
Being strategic: strategizing helps data analysts see what they want to achieve with the data and
how they can get there.
Problem-orientation: keeping the problem top of mind throughout the entire project.
Correlation: being able to identify a correlation between two or more pieces of data.
Big-picture and detail-oriented thinking: big-picture thinking is important because it helps you
zoom out and see possibilities and opportunities, which leads to exciting new ideas or innovations.
On the flip side, detail-oriented thinking is figuring out all the aspects that will help you execute
the plan.
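The correlation skill listed above can be made concrete with a small calculation. The sketch below computes the Pearson correlation coefficient, one common way to measure how closely two sets of numbers move together. The `ad_spend` and `sales` figures are invented for illustration, and the helper name `pearson_correlation` is my own.

```python
from math import sqrt


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return covariation / sqrt(spread_x * spread_y)


# Hypothetical data: sales rise in a straight line with ad spend.
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 45, 65, 85, 105]

r = pearson_correlation(ad_spend, sales)
print(round(r, 3))
```

A value near 1 means the two series rise together, a value near -1 means one falls as the other rises, and a value near 0 means no linear relationship. A strong correlation does not by itself show that one thing causes the other.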
What is the root cause of the problem? (Ask “why?” five times to reveal the root
cause.) The Five Whys process is used to reveal the root cause of a problem through the
answer to the fifth question.
Where are the gaps in our process? For this, many people use gap analysis, a
method for examining and evaluating how a process works currently in order to get where
you want to be in the future.
What did we not consider before?
(Given that knowledge, which of your skills or ways of thinking do you think will make you a
successful data analyst? )
I think curiosity and a technical mindset are two skills that are indispensable for solving many
data analysis problems. Asking the right questions is very important when solving a
problem. It is not enough to ask simple questions that only scratch the surface of what could
be something more complex and difficult to solve. After we dig into the root of the problem and
understand the context well, we need to break the information down into smaller pieces so we
can understand what is really going on and find correlations in the data that can answer
important questions about our problem. After we study the problem and break it into smaller
pieces, we can zoom out to the big picture and get a clear view of where we stand.
Plan: during planning, a business decides what kind of data it needs, how it will be managed
throughout its life cycle, who will be responsible for it, and the optimal outcomes.
Capture: this is where data is collected from a variety of different sources and brought into the
organization. So much data is created every day that there are many ways to collect it. One
common method is getting data from outside resources.
Manage: how we care for our data, how and where it is stored, the tools used to keep it safe and
secure, and the actions taken to make sure it is maintained properly.
Analyze: this is where data analysts really shine. In this phase, the data is used to solve problems,
make great decisions, and support business goals.
Archive: storing data in a place where it is still available but may not be used again. It
makes more sense to archive such data than to keep it around.
Destroy: destroying data that is no longer needed is important for protecting a company's private
information, as well as private data about its customers.
Plan: What plans and decisions do you need to make? What data do you need to answer
your question?
Capture: Where does your data come from? How will you get it?
Manage: How will you store your data? What should it be used for? How do you keep this
data secure and protected?
Analyze: How will the company analyze the data? What tools should they use?
Archive: What should they do with their data when it gets old? How do they know when
it's time?
Destroy: Should they ever dispose of any data? If so, when and how?
Key takeaway: Understanding the importance of the data life cycle will set you up for success as a
data analyst. Individual stages in the data life cycle will vary from company to company or by
industry or sector.
(You’ve been learning about the six phases of the data analysis process: ask, prepare, process,
analyze, share, and act. Based on what you’ve discovered, do you think data analysts find any
one step more important than others? If so, which one? And why do you feel that way?)
Every step is important in the data analysis process. If we miss any of these phases, the
analysis will not work. But if I had to say which steps are most important to me, I
would say ask, then analyze, and then act. First, we need to ask the right questions to
guide ourselves down the right path to achieve our goal and solve the problem. If we don’t ask the
right questions, we could be doing all the work for nothing. So first we need to have a clear idea
of the problem we are trying to solve, and then work on it. The next most
important step is to analyze the data that we have stored. In this phase we
identify patterns and draw the important conclusions that we will later share with our
team and the people interested in the project. This is where we reach the conclusions that
lead to our data-driven decisions.
The act phase is the third most important phase on my list. Putting it last does not
mean it is the least important of the three. Without all the work that came before, it would
not be possible to act on the problem, because we would not know how to act on it. At the same
time, it is one of the most important phases because if we don’t do anything with the conclusions
we have reached, all the work would be in vain.
What is the relationship between the data life cycle and the data analysis process? How are the
two processes similar? How are they different?
The two processes complement each other. In the first phase of the data analysis process, we
must ask ourselves the right questions to find the root of the problem. Once we finish
this phase, we can begin the first phase of the data life cycle, in which we ask
ourselves what data we need to answer our questions. Then we move into the
prepare phase of the data analysis process, which complements the capture phase of the
data life cycle. In the prepare phase we need to understand how data is generated and collected,
and the capture phase of the data life cycle helps us understand this more clearly. This
pairing continues: the process and analyze phases of the data analysis process line up with the
manage and analyze phases of the data life cycle.
We then move to the share phase of the data analysis process. This is where it starts to
diverge from the data life cycle. Here the data analysis process takes the conclusions and the
information that we collected and shares them with the people who are interested in the project.
By contrast, in the fifth phase of the data life cycle we
archive the information that we have already shared and store it in a safe place. Then comes
the last phase of the data analysis process, where we act on the conclusions that we reached
throughout the process. In the last phase of the data life cycle, we destroy the information that
is no longer useful to us. These two cycles have their similarities and their differences, but they
complement each other at the same time.
What is the relationship between the Ask phase of the data analysis process and the Plan phase of
the data life cycle? How are they similar? How are they different?
As I mentioned earlier, in the ask phase we ask effective questions to define the problem. In the plan
phase we decide which data is the right data to collect to solve our problem. Each phase has its own
purpose, but they complement each other. If we don’t get the ask phase right, then we will
get the plan phase wrong. As I mentioned earlier, the two cycles have their similarities and their
differences, but they complement each other at the same time. One cannot exist without the
other.
Columns and rows and cells, oh my!
Attribute: A characteristic or quality of data used to label a column in a table.
Formula: A set of instructions that performs a specific action using the data in a spreadsheet.
Observation: all the attributes for something contained in a row of a data table.
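The three terms above can be illustrated with a tiny table in Python. This is only a sketch: the products, prices, and the `line_total` calculation are made up, and a real spreadsheet formula would be typed into a cell rather than written as a function.

```python
# Hypothetical table: each dict is one observation (row),
# and each key is an attribute (column label).
table = [
    {"product": "notebook", "unit_price": 2.5, "quantity": 4},
    {"product": "pen", "unit_price": 1.0, "quantity": 10},
]

# Attributes: the labels shared by every row.
attributes = list(table[0].keys())

# Observation: all the attribute values for one item, i.e. one row.
first_observation = table[0]


def add_line_total(row):
    """Formula-like step: compute a new value from other cells in the same row."""
    return {**row, "line_total": row["unit_price"] * row["quantity"]}


with_totals = [add_line_total(row) for row in table]
grand_total = sum(row["line_total"] for row in with_totals)

print(attributes)
print(first_observation)
print(grand_total)
```

In spreadsheet terms, `add_line_total` plays the role of a formula that multiplies a price cell by a quantity cell, and `grand_total` plays the role of a sum over the resulting column.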
A business task is the question or problem data analysis answers for a business.
Example:
Let's say we have a company that's kind of notorious for being a boys club. There isn't much
representation of other genders. This company wants to see which employees are doing well, so
they start gathering data on employee performance and their own company culture. The data
shows that men are the only people succeeding at this company. Their conclusion? That they
should hire more men. After all, they're doing really well here, right? But that's not a fair
conclusion for a couple of reasons. First, it doesn't even consider all of the available data on
company culture, so it paints an incomplete picture. Second, it doesn't think about the other
surrounding factors that impact the data, or in other words, the conclusion doesn't consider
the difficulties that people of different gender identities have trying to navigate a toxic work
environment. If the company only looks at this conclusion, they won't acknowledge and address
how harmful their culture is and they won't understand why certain people are set up to fail
within it. That's why it's important to keep fairness in mind when analyzing data. The conclusion
that only men are succeeding at this company is true, but it ignores other systemic factors that
are contributing to this problem. But don't worry, there's a way to make a fair conclusion here. An
ethical data analyst can look at the data gathered and conclude that the company culture
is preventing some employees from succeeding, and the company needs to address those
problems to boost performance. See how this conclusion paints a much more complete and fair
picture. It recognizes the fact that some people aren't doing as well in this company and factors
in why that could be instead of discriminating against a huge number of applicants in the future.