Data Analystic

Uploaded by

boytuminh1998
Copyright
© © All Rights Reserved

DATA ANALYTICS

Chapter II.

The life cycle of data is plan, capture, manage, analyze, archive, and destroy.

- Plan: decide what kind of data is needed, how it will be managed throughout its
life, who will be responsible for it, and the optimal outcomes.

- Capture: collect or bring in data from a variety of different sources.

A database is a collection of data stored in a computer system.

- Manage: how and where the data is stored, the tools used to keep it safe and secure,
and the actions taken to make sure that it’s maintained properly.

- Analyze: the data is used to solve problems, make great decisions, and support
business goals.

- Archive: store data in a place where it’s still available, but may not
be used again.

- Destroy: remove data from storage and delete any shared copies of the data.

B2:

The phases of data analysis and this program

Ask: we define the problem to be solved and we make sure that we fully
understand stakeholder expectations.

Takes the time to fully understand stakeholder expectations

Decides which questions to answer in order to solve the problem

Stakeholders: People who have invested time and resources into a project
and are interested in its outcome

Look at the current state and identify how it differs from the ideal
state.
Determine who the stakeholders are.

Prepare: This is where data analysts collect and store data they’ll use for the
upcoming analysis process.

Process: here data analysts find and eliminate any errors and inaccuracies that
can get in the way of results.

- Cleaning data
- Transforming data into a more useful format
- Combining two or more datasets to make information more complete
- Removing outliers

Analyze: Analyzing the data you’ve collected involves using tools to transform
and organize that information so that you can draw useful conclusions, make
predictions, and drive informed decision-making.

Share:

Explore data analyst tools

Formula: A set of instructions that performs a specific calculation using the data in a
spreadsheet.

Function: A preset command that automatically performs a specific process or
task using the data in a spreadsheet.

Query language: a computer programming language that allows you to retrieve and
manipulate data from a database.

Database: a collection of data stored in a computer system.

Data visualization: The graphical representation of information.

Key data analyst tools

Spreadsheets: data analysts rely on spreadsheets to collect and organize data.

Collect, store, organize, and sort information


Identify patterns and piece the data together in a way that works for each
specific data project.

Create excellent data visualizations, like graphs and charts.

Database and query languages:

Query language: Allows analysts to isolate specific information from a
database

Makes it easier for you to learn and understand the requests made to
databases

Allows analysts to select, create, add, or download data from a database for
analysis.

Visualization tools:

Data analysts use a number of visualization tools, like graphs, maps,
tables, charts, and more.

- turn complex numbers into a story that people can understand

- help stakeholders come up with conclusions that lead to informed
decisions and effective business strategies.

- have multiple features.

Chooses the right tool for the job

SQL guide: getting started

A query is a request for data or information from a database.

Syntax is the predetermined structure of a language that includes all required
words, symbols, and punctuation, as well as their proper placement.

Every basic SQL query follows the same structure:

Use SELECT to choose the columns you want to return.

Use FROM to choose the tables where the columns you want are located.

Use WHERE to filter for certain information.
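
To make the SELECT / FROM / WHERE structure concrete, here is a minimal sketch that runs a query against an in-memory SQLite database from Python. The table name `customers`, its columns, and the sample rows are invented for illustration and are not part of the course material.

```python
import sqlite3

# Hypothetical in-memory table; names and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT, purchases INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [("An", "Hanoi", 5), ("Binh", "Hue", 2), ("Chi", "Hanoi", 9)],
)

# SELECT chooses the columns, FROM names the table, WHERE filters the rows.
query = """
    SELECT name, purchases
    FROM customers
    WHERE city = 'Hanoi'
"""
rows = conn.execute(query).fetchall()
print(rows)  # each row is a (name, purchases) tuple for a Hanoi customer
conn.close()
```

Only the two Hanoi customers come back, and only the two selected columns are returned for each of them.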

BECOME A DATA VIZ WHIZ

Data visualization is the graphical representation of information.

Plan a data visualization

Step 1: Explore the data for patterns

Step 2: Plan your visuals

Refine the data and present the results of your data analysis.

Step 3: Create your visuals

The power of data in business

Issue: A topic or subject to investigate

Question: Designed to discover information

Problem: An obstacle or complication that needs to be worked out

Business task: The question or problem data analysis answers for a business

Analyze weather data from the last decade to identify predictable patterns.

Data-driven decision-making: Using facts to guide business strategy

Understand data and fairness


Data analysts must make sure that their analyses are fair.

Fairness means ensuring that your analysis doesn’t create or reinforce bias.

Consider fairness

Best practices and their explanations:

Consider all of the available data: Part of your job as a data analyst is
to determine what data is going to be useful for your analysis. Often
there will be data that isn’t relevant to what you’re focusing on or
doesn’t seem to align with your expectations. But you can’t ignore
it; it’s critical to consider all of the available data so that your analysis
reflects the truth and not just your own expectations.

Identify surrounding factors: As you’ll learn throughout these
courses, context is key for you and your stakeholders to understand the
final conclusions of any analysis. Similar to considering all of the data,
also consider the factors that could influence the insights you’re gaining.

Include self-reported data: Self-reporting is a data collection
technique where participants provide information about themselves.

Use oversampling effectively: Oversampling is the process of
increasing the sample size of nondominant groups in a population.

Think about fairness from beginning to end.
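
As a rough illustration of oversampling, the sketch below balances groups by randomly duplicating records from the smaller groups until each group matches the largest one. This is only one possible approach (random oversampling with replacement); the function name `oversample`, the `group` field, and the sample data are assumptions made for this example.

```python
import random

def oversample(records, group_key, seed=0):
    """Duplicate randomly chosen records from smaller groups until
    every group is as large as the largest group."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    groups = {}
    for record in records:
        groups.setdefault(record[group_key], []).append(record)
    target = max(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)
        # Draw extra copies (with replacement) from the under-represented group.
        balanced.extend(rng.choice(members) for _ in range(target - len(members)))
    return balanced

# Hypothetical survey: group B is the nondominant group.
survey = [{"group": "A"}] * 8 + [{"group": "B"}] * 2
balanced = oversample(survey, "group")

counts = {}
for record in balanced:
    counts[record["group"]] = counts.get(record["group"], 0) + 1
print(counts)
```

Duplicating records does not add new information, so results based on an oversampled dataset still need to be interpreted with care.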

Data analysts in different industries

Data analyst roles and job descriptions

The data analyst role is one of many job titles that contain the word “analyst”.

Business analyst - analyzes data to help businesses improve processes,
products, or services.

Data analytics consultant - analyzes the systems and models for using data.

Data scientist - uses expert skills in technology and social science to find trends
through data analysis.
A

Analytical skills: Qualities and characteristics associated with using facts to
solve problems

Analytical thinking: The process of identifying and defining a problem, then
solving it by using data in an organized, step-by-step manner

Attribute: A characteristic or quality of data used to label a column in a table

Business task: The question or problem data analysis resolves for a business

Context: The condition in which something exists or happens

Data: A collection of facts

Data analysis: The collection, transformation, and organization of data in order
to draw conclusions, make predictions, and drive informed decision-making

Data analyst: Someone who collects, transforms, and organizes data in order to
draw conclusions, make predictions, and drive informed decision-making

Data analytics: The science of data

Data design: How information is organized

Data-driven decision-making: Using facts to guide business strategy


Data ecosystem: The various elements that interact with one another in order to
produce, manage, store, organize, analyze, and share data

Data science: A field of study that uses raw data to create new ways of modeling
and understanding the unknown

Data strategy: The management of the people, processes, and tools used in data
analysis

Data visualization: The graphical representation of data

Database: A collection of data stored in a computer system

Dataset: A collection of data that can be manipulated or analyzed as one unit

Fairness: A quality of data analysis that does not create or reinforce bias

Formula: A set of instructions used to perform a calculation using the data in a
spreadsheet

Function: A preset command that automatically performs a specified process or
task using the data in a spreadsheet

Gap analysis: A method for examining and evaluating the current state of a
process in order to identify opportunities for improvement in the future


H

Oversampling: The process of increasing the sample size of nondominant groups
in a population. This can help you better represent them and address imbalanced
datasets

Observation: The attributes that describe a piece of data contained in a row of a
table

Query: A request for data or information from a database

Query language: A computer programming language used to communicate with
a database
R

Root cause: The reason why a problem occurs

Self-reporting: A data collection technique where participants provide
information about themselves

Stakeholders: People who invest time and resources into a project and are
interested in its outcome

Structured Query Language: A computer programming language used to
communicate with a database

Spreadsheet: A digital worksheet

SQL: (Refer to Structured Query Language)

Technical mindset: The ability to break things down into smaller steps or pieces
and work with them in an orderly and logical way

Visualization: (Refer to data visualization)

Data engineer - prepares and integrates data from different sources for analytical
use

Data specialist - organizes or converts data for use in databases or software
systems

Operations analyst - analyzes data to assess the performance of business
operations and workflows.

Job specialization by industry

Marketing analyst - analyzes market conditions to assess the potential sales of
products and services

HR/payroll analyst - analyzes payroll data for inefficiencies and errors

Financial analyst - analyzes financial status by collecting, monitoring, and
reviewing data

Risk analyst - analyzes financial documents, economic conditions, and client
data to help companies determine the level of risk involved in making a
particular business decision.

Healthcare analyst - analyzes medical data to improve the business aspect of
hospitals and medical facilities.

Think about a time when you’ve used data to solve a problem, whether it’s in
your professional or personal projects.

Part II:

Module 1:

Unit 1: Introduction to problem-solving and effective questioning

Structured thinking: the process of recognizing the current problem or situation,
organizing available information, revealing gaps and opportunities, and
identifying the options.

In the ask step, we define the problem we're solving and make sure that we fully
understand stakeholder expectations.

Unit 2: Refresher: Your Google Data Analytics Certificate roadmap

Course 1: Foundations

Will learn:

- Real-life roles and responsibilities of a junior data analyst
- How businesses transform data into actionable insights
- Spreadsheet basics
- Database and query basics
- Data visualization basics

Will build:

- Using data in everyday life
- Thinking analytically
- Applying tools from the data analytics toolkit
- Showing trends and patterns with data visualizations
- Ensuring your data analysis is fair

Course 2: Ask

Will learn:
- How data analysts solve problems with data
- The use of analytics for making data-driven decisions
- Spreadsheet formulas and functions
- Dashboard basics, including an introduction to Tableau
- Data reporting basics

Will build:

- Asking smart and effective questions
- Structuring how you think
- Summarizing data
- Putting things into context
- Managing team and stakeholder expectations
- Problem-solving and conflict-resolution

Course 3: Prepare

Will learn:

- How data is generated
- Features of different data types, fields, and values
- Database structures
- The function of metadata in data analytics
- Structured Query Language (SQL) functions

Will build:

- Ensuring ethical data analysis practices
- Addressing issues of bias and credibility
- Accessing databases and importing data
- Writing simple queries
- Organizing and protecting data
- Connecting with the data community (optional)

Course 4: Process

Will learn:

- Data integrity and the importance of clean data
- The tools and processes used by data analysts to clean data
- Data-cleaning verification and reports
- Statistics, hypothesis testing, and margin of error
- Resume building and interpretation of job postings (optional)

Will build:

- Connecting business objectives to data analysis
- Identifying clean and dirty data
- Cleaning small datasets using spreadsheet tools
- Cleaning large datasets by writing SQL queries
- Documenting data-cleaning processes

Course 5: Analyze

Will learn:

- Steps data analysts take to organize data
- How to combine data from multiple sources
- Spreadsheet calculations and pivot tables
- SQL calculations
- Temporary tables
- Data validation

Will build:

- Sorting data in spreadsheets and by writing SQL queries
- Filtering data in spreadsheets and by writing SQL queries
- Converting data
- Formatting data
- Substantiating data analysis processes
- Seeking feedback and support from others during data analysis

Course 6: Share

Will learn:

- Design thinking
- How data analysts use visualizations to communicate about data
- The benefits of Tableau for presenting data analysis findings
- Data-driven storytelling
- Dashboards and dashboard filters
- Strategies for creating an effective data presentation

Will build:

- Creating visualizations and dashboards in Tableau
- Addressing accessibility issues when communicating about data
- Understanding the purpose of different business communication tools
- Telling a data-driven story
- Presenting to others about data
- Answering questions about data

Course 7:

Will learn:

- Programming languages and environments
- R packages
- R functions, variables, data types, pipes, and vectors
- R data frames
- Bias and credibility in R
- R visualization tools
- R Markdown for documentation, creating structure, and emphasis

Will build:

- Coding in R
- Writing functions in R
- Accessing data in R
- Cleaning data in R
- Generating data visualizations in R
- Reporting on data analysis to stakeholders

Course 8: Capstone

Will learn:

- How a data analytics portfolio distinguishes you from other candidates
- Practical, real-world problem-solving
- Strategies for extracting insights from data
- Clear presentation of data findings
- Motivation and ability to take initiative

Unit 3: Data in action

Problem: Determine what advertising method is best for reaching Anywhere
target audience

Unit 4: From issue to action: The six data analysis phases

Step 1: Ask
It’s impossible to solve a problem if you don’t know what it is. There are some
things to consider:
- Define the problem you’re trying to solve
- Make sure you fully understand the stakeholder’s expectations
- Focus on the actual problem and avoid any distractions
- Collaborate with stakeholders and keep an open line of communication
- Take a step back and see the whole situation in context
Questions to ask yourself in this step:
1. What are my stakeholders saying their problems are?
2. Now that I’ve identified the issues, how can I help the stakeholders resolve their
questions?

Step 2: Prepare
You will decide what data you need to collect in order to answer your questions and how to
organize it so that it is useful. You might use your business task to decide:

- What metrics to measure
- Locate data in your database
- Create security measures to protect that data
Questions to ask yourself in this step:
1. What do I need to figure out how to solve this problem?
2. What research do I need to do?

Step 3: Process
Clean data is the best data and you will need to clean up your data to get rid of any possible
errors, inaccuracies, or inconsistencies. This might mean:

- Using spreadsheet functions to find incorrectly entered data
- Using SQL functions to check for extra spaces
- Removing repeated entries
- Checking as much as possible for bias in the data
Questions to ask yourself in this step:
1. What data errors or inaccuracies might get in my way of getting the best possible
answer to the problem I am trying to solve?
2. How can I clean my data so the information I have is more consistent?
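
Two of the cleaning tasks listed for this step, checking for extra spaces and removing repeated entries, can be sketched with standard SQL functions. The example below runs them through Python's built-in `sqlite3` module; the `signups` table, its `email` column, and the sample values are hypothetical.

```python
import sqlite3

# Hypothetical table with one padded value and one repeated value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signups (email TEXT)")
conn.executemany(
    "INSERT INTO signups VALUES (?)",
    [("a@example.com",), (" a@example.com ",), ("b@example.com",), ("b@example.com",)],
)

# A value whose length changes after TRIM has leading or trailing spaces.
with_spaces = conn.execute(
    "SELECT email FROM signups WHERE LENGTH(email) <> LENGTH(TRIM(email))"
).fetchall()

# TRIM removes the extra spaces; DISTINCT then collapses repeated entries.
cleaned = conn.execute(
    "SELECT DISTINCT TRIM(email) AS clean_email FROM signups ORDER BY clean_email"
).fetchall()

print(with_spaces)
print(cleaned)
conn.close()
```

Trimming before deduplicating matters here: the padded address only becomes a recognizable duplicate once its extra spaces are removed.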

Step 4: Analyze
You will want to think analytically about your data. At this stage, you might sort and format
your data to make it easier to:

- Perform calculations
- Combine data from multiple sources
- Create tables with your results
Questions to ask yourself in this step:
1. What story is my data telling me?
2. How will my data help me solve this problem?
3. Who needs my company’s product or service? What type of person is most likely to
use it?

Step 5: Share
Everyone shares their results differently, so be sure to summarize your results with clear and
enticing visuals of your analysis, using tools like graphs or dashboards. This is your
chance to show the stakeholders you have solved their problem and how you got there.
Sharing will certainly help your team:

- Make better decisions
- Make more informed decisions
- Lead to stronger outcomes
- Successfully communicate your findings
Questions to ask yourself in this step:
1. How can I make what I present to the stakeholders engaging and easy to
understand?
2. What would help me understand this if I were the listener?

Step 6: Act
Now it’s time to act on your data. You will take everything you have learned from your data
analysis and put it to use. This could mean providing your stakeholders with
recommendations based on your findings so they can make data-driven decisions.

Questions to ask yourself in this step:
1. How can I use the feedback I received during the share phase (step 5) to actually meet
the stakeholder’s needs and expectations?

These six steps can help you to break the data analysis process into smaller,
manageable parts, which is called structured thinking. This process
involves four basic activities:

1. Recognizing the current problem or situation
2. Organizing available information
3. Revealing gaps and opportunities
4. Identifying your options

Unit 5: The data process works

Unit 6: Common problem types

There are six common problem types:

Making predictions: using data to make an informed decision about
how things may be in the future.

Categorizing things: assigning information to different groups or
clusters based on common features.

Spotting something unusual: identifying data that is different from the
norm.

Identifying themes: taking categorization a step further by
grouping information into broader concepts.

Discovering connections: finding similar challenges faced by
different entities, and then combining data and insights to address them.

Finding patterns: using historical
data to understand what happened in the past and is therefore likely to happen
again.
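
As one possible illustration of "spotting something unusual", the sketch below flags values that sit far from the rest of the data. The rule used here (more than two sample standard deviations from the mean) is a common rule of thumb chosen for this example, not a method prescribed by the course; the function name and the sample numbers are invented.

```python
from statistics import mean, stdev

def flag_unusual(values, threshold=2.0):
    """Return the values lying more than `threshold` sample standard
    deviations away from the mean (needs at least two values)."""
    centre = mean(values)
    spread = stdev(values)
    if spread == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - centre) > threshold * spread]

# Hypothetical daily order counts with one day far from the norm.
daily_orders = [102, 98, 101, 97, 103, 99, 100, 180]
print(flag_unusual(daily_orders))
```

A flagged value is only a candidate for investigation; whether it is an error or a real event still depends on context.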

Unit 7: Six common problem types

Making predictions: A company that wants to know the best advertising
method to bring in new customers is an example of a problem requiring analysts
to make predictions. Analysts with data on location, type of media, and number
of new customers acquired as a result of past ads can’t guarantee
future results, but they can help predict the best placement of
advertising to reach the target audience.

Categorizing things: An example of a problem requiring analysts to
categorize things is a company’s goal to improve customer satisfaction.
Analysts might classify customer service calls based on
certain keywords or scores.
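
The idea of classifying customer service calls by keywords can be sketched in a few lines. The categories, keyword lists, and call summaries below are all hypothetical; a real project would derive them from the business task and the actual call data.

```python
# Hypothetical categories and keywords, invented for this sketch.
CATEGORY_KEYWORDS = {
    "billing": ["invoice", "charge", "refund"],
    "delivery": ["late", "shipping", "package"],
}

def categorize_call(summary):
    """Return the first category whose keywords appear in the call summary."""
    text = summary.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "other"  # no keyword matched

calls = [
    "Customer asked for a refund",
    "Package arrived late",
    "General feedback",
]
labels = [categorize_call(call) for call in calls]
print(labels)
```

Simple substring matching like this can mislabel calls (for example, a word that merely contains a keyword), so the results are best treated as a first pass.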

Unit 8: continue exploring business applications

Unit 9: From hypothesis to outcome

Unit 10: Smart questions

Specific: Is the question specific? Does it address the problem? Does it have
context? Will it uncover a lot of the information you need?
Measurable: Will the question give you answers that you can measure?

Action-oriented: Will the answers provide information that helps you devise
some type of plan?

Relevant: Is the question about the particular problem you are trying to solve?

Time-bound: Are the answers relevant to the specific time being studied?

Glossary terms from module 1


Terms and definitions for Course 2, Module 1
Action-oriented question: A question whose answers lead to change

Cloud: A place to keep data online, rather than a computer hard drive

Data analysis process: The six phases of ask, prepare, process, analyze, share, and
act whose purpose is to gain insights that drive informed decision-making

Data life cycle: The sequence of stages that data experiences, which include plan,
capture, manage, analyze, archive, and destroy

Leading question: A question that steers people toward a certain response

Measurable question: A question whose answers can be quantified and assessed

Problem types: The various problems that data analysts encounter, including
categorizing things, discovering connections, finding patterns, identifying themes, making
predictions, and spotting something unusual

Relevant question: A question that has significance to the problem to be solved

SMART methodology: A tool for determining a question’s effectiveness based on
whether it is specific, measurable, action-oriented, relevant, and time-bound

Specific question: A question that is simple, significant, and focused on a single topic
or a few closely related ideas

Structured thinking: The process of recognizing the current problem or situation,
organizing available information, revealing gaps and opportunities, and identifying
options
Time-bound question: A question that specifies a timeframe to be studied

Unfair question: A question that makes assumptions or is difficult to answer honestly

Unit 11: Data and decisions

Unit 12: How data empowers decisions

Data is a collection of facts

Data analysis can help us make more informed decisions

+ Data-driven decision-making

+ Data-inspired decision-making

Data-inspired decision-making: explores different data sources to find out what
they have in common.

Algorithm: A process or set of rules to be followed for a specific task

Unit 13: Data trials and triumphs

Data does not make decisions, but it does improve them.

Data-driven decisions: using facts to guide business strategy. This
approach is limited by the quantity and quality of readily available data.

Data-inspired decisions: these involve the same considerations as data-driven
decisions, but they create space for people using data to consider a broader
range of ideas: drawing on comparisons to related concepts, giving
weight to feelings and experiences, and considering other qualities that may be
more difficult to measure.

A data analysis triumph: When data is used strategically, businesses can
transform and grow their revenue.
Data analysis failures:

Unit 14: Qualitative and quantitative data

Quantitative data: specific and objective measures of
numerical facts, typically presented in charts and graphs.

This includes the what, how many, and how often.

Qualitative data: subjective or explanatory measures of
qualities and characteristics. It is great for helping us answer why questions.

This is include

Unit 15: Qualitative and quantitative data in business

Qualitative data tools: focus groups, social media text analysis, in-person
interviews.

Quantitative data tools: structured interviews, surveys, polls.

Unit 16: The big reveal: Sharing your findings

Report: A static collection of data given to stakeholders periodically.

Pros: High-level historical data

Easy to design

Pre-cleaned and stored data

Cons: Continual maintenance

Less visually appealing

Static

Dashboard: Monitors live, incoming data.

Pros: Dynamic, automatic, and interactive

More stakeholder access

Low maintenance

Cons: Labor-intensive design

Can be confusing

Potentially uncleaned data

Pivot table: A data summarization tool that is used in data processing. Pivot
tables are used to summarize, sort, reorganize, group, count, total, or average
data stored in a database.
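
To show what "summarize, group, and total" means in practice, here is a minimal sketch of a pivot-style summary written in plain Python. It groups rows by one field, spreads a second field across columns, and totals a third. The helper `pivot_sum` and the sample sales rows are assumptions made for illustration; spreadsheet pivot tables do the same job interactively.

```python
from collections import defaultdict

# Hypothetical sales records.
sales = [
    {"region": "North", "product": "Tea", "amount": 120},
    {"region": "North", "product": "Coffee", "amount": 80},
    {"region": "South", "product": "Tea", "amount": 50},
    {"region": "North", "product": "Tea", "amount": 30},
]

def pivot_sum(rows, row_key, col_key, value_key):
    """Total `value_key` for each (row_key, col_key) combination."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        table[row[row_key]][row[col_key]] += row[value_key]
    # Convert nested defaultdicts to plain dicts for a clean result.
    return {r: dict(cols) for r, cols in table.items()}

summary = pivot_sum(sales, "region", "product", "amount")
print(summary)
```

The two North/Tea rows are combined into a single total, which is the kind of grouping a pivot table performs.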

Unit 17: Data versus metrics

Metric: A single, quantifiable type of data that can be used for measurement.

ROI, or return on investment, is essentially a formula designed using metrics
that lets a business know how well an investment is doing. The ROI is made up
of two metrics: the net profit over a period of time and the cost of investment.
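
The two metrics named above can be combined in a short calculation. This sketch assumes the common definition of ROI as net profit divided by the cost of investment; the function name and the sample figures are invented for illustration.

```python
def roi(net_profit, cost_of_investment):
    """Return on investment as a ratio (assumed: net profit / cost of investment)."""
    if cost_of_investment == 0:
        raise ValueError("cost_of_investment must be non-zero")
    return net_profit / cost_of_investment

# Hypothetical campaign: 10,000 invested, 2,500 net profit over the period.
campaign_roi = roi(2500, 10000)
print(f"{campaign_roi:.0%}")
```

Expressing the ratio as a percentage makes it easier to compare investments of different sizes.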

Metrics can be used to help calculate customer retention rates, or a company’s
ability to keep its customers over time.

A metric goal is a measurable goal set by a company and evaluated using
metrics.

Unit 18: Tool for visualizing data

Unit 19: Design compelling dashboards

Dashboards are powerful visual tools that help you tell your data story. A dashboard is a tool that
monitors live, incoming data.

Creating a dashboard:

1. Identify the stakeholders who need to see the data
and how they will use it
2. Design the dashboard (what should be displayed)
Use these tips to help make your dashboard design clear and easy to follow:

- Use a clear header to label the information.
- Add short text descriptions to each visualization.
- Show the most important information at the top.
3. Create mockups if desired
A mockup is a simple draft of a visualization used for planning a dashboard and evaluating its
progress.

4. Select the visualizations
5. Create filters as needed

The three most common categories of dashboards are:

Strategic: focuses on long term goals and strategies at the highest level of metrics

They typically contain information that is useful for enterprise-wide decision-making

Operational: short-term performance tracking and intermediate goals

Because these dashboards contain information on a time scale of days, weeks, or months, they
can provide performance insight almost in real-time.

Analytical: consists of the datasets and the mathematics used in these sets

Small data:

Specific

Short time-period

Mathematical thinking is a powerful skill you can use to help you solve
problems and see new solutions.

Big Data:

Large and less specific

Long time period

Big decisions
Challenges and benefits
Here are some challenges you might face when working with big data:

- A lot of organizations deal with data overload and have too much unimportant or
irrelevant information.
- Important data can be buried deep among all of the unimportant data, which makes it
harder to find and use. This can lead to slower and less efficient
decision-making time frames.
- The data you need isn’t always easily accessible.
- Current technology tools and solutions still struggle to provide
measurable and reportable data. This can lead to unfair algorithmic
bias.
- There are gaps in many big data business solutions.
Now for the good news! Here are some benefits that come with big data:

- When large amounts of data can be stored and analyzed, it can help companies
identify more efficient ways of doing business and save a lot of time and money.
- Big data helps organizations spot trends in customer buying patterns and
satisfaction levels, which can help them create new products and solutions that make
customers happy.
- By analyzing big data, businesses get a better understanding of current market
conditions, which can help them stay ahead of the competition.
- As in the earlier social media example, big data helps
companies keep track of their online presence, especially feedback, both good and bad,
from customers. This gives them the information they need to improve and protect
their brand.
Next: Get to work with spreadsheets

Spreadsheet tasks:

Organize your data:

+ Pivot table: sort and filter

Calculate your data:

Spreadsheets and the data life cycle

Plan for the users who will work within a spreadsheet by developing
organizational standards.

Capture data at the source by connecting spreadsheets to other data
sources, such as an online survey application or a database.

Manage different kinds of data with a spreadsheet. This can involve storing,
organizing, filtering, and updating information.

Analyze data in a spreadsheet to help make better decisions.

Archive any spreadsheet that you don’t use often, but might need to reference
later with built-in tools.

Destroy your spreadsheet when you are certain that you will never need it again.

Introduction to Google Sheets

Step 1: There are many excellent spreadsheet applications available to data
analysts. These notes use Google Sheets.

Step 2: Create a new spreadsheet

- To start, go to google.com

- Click the Google apps icon

- Then click the Sheets icon.

- In the Start a new spreadsheet section, click Blank to create a new blank
spreadsheet.

Step 3: Edit and format your spreadsheet

Formulas for success

Cell reference: A cell or a range of cells in a worksheet that can
be used in a formula.

Structured thinking: the process of recognizing the current problem or situation,
organizing available information, revealing gaps and opportunities, and
identifying the options.

Problem domain: the specific area of analysis that encompasses every activity
affecting or affected by the problem.

Scope of work: an agreed-upon outline of the work you’re going to perform on a
project.

Context in data analytics is the conditions and circumstances that surround and
give meaning to the data.

Glossary terms from module 3

Terms and definitions for Course 2, Module 3

AVERAGE: A spreadsheet function that returns an average of the values from a
selected range

Borders: Lines that can be added around two or more cells on a spreadsheet

Cell reference: A cell or a range of cells in a worksheet typically used in
formulas and functions

COUNT: A spreadsheet function that counts the number of cells in a range that
meet specific criteria

Equation: A calculation that involves addition, subtraction, multiplication, or
division (also called a math expression)

Fill handle: A box in the lower-right-hand corner of a selected spreadsheet cell
that can be dragged through neighboring cells in order to continue an instruction

Filtering: The process of showing only the data that meets specified criteria
while hiding the rest

Header: The first row in a spreadsheet that labels the type of data in each
column

Math expression: A calculation that involves addition, subtraction,
multiplication, or division (also called an equation)

Math function: A function that is used as part of a mathematical formula

MAX: A spreadsheet function that returns the largest numeric value from a
range of cells

MIN: A spreadsheet function that returns the smallest numeric value from a
range of cells

Open data: Data that is available to the public

Operator: A symbol that names the operation or calculation to be performed

Order of operations: Using parentheses to group together spreadsheet values in
order to clarify the order in which operations should be performed

Problem domain: The area of analysis that encompasses every activity affecting
or affected by a problem

Range: A collection of two or more cells in a spreadsheet

Report: A static collection of data periodically given to stakeholders

Return on investment (ROI): A formula that uses the metrics of investment and
profit to evaluate the success of an investment

Revenue: The total amount of income generated by the sale of goods or services

Scope of work (SOW): An agreed-upon outline of the tasks to be performed
during a project

Sorting: The process of arranging data into a meaningful order to make it easier
to understand, analyze, and visualize

SUM: A spreadsheet function that adds the values of a selected range of cells
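
The spreadsheet functions defined in this glossary (SUM, AVERAGE, COUNT, MAX, MIN), along with filtering and sorting, have close counterparts in most programming languages. The sketch below shows rough Python equivalents. The sales figures and variable names are hypothetical and stand in for a spreadsheet range; this is an illustration, not part of the course material.

```python
from statistics import mean

# Hypothetical sales figures standing in for a spreadsheet range such as B2:B6.
sales = [120, 85, 240, 60, 195]

total = sum(sales)       # like SUM: adds the values in the range
average = mean(sales)    # like AVERAGE: arithmetic mean of the values
count = len(sales)       # like COUNT: number of entries (all are numeric here)
largest = max(sales)     # like MAX: largest numeric value
smallest = min(sales)    # like MIN: smallest numeric value

# Filtering: keep only the values that meet a criterion and hide the rest.
at_least_100 = [value for value in sales if value >= 100]

# Sorting: arrange the values into a meaningful order (ascending here).
ascending = sorted(sales)

print(total, average, count, largest, smallest)
print(at_least_100)
print(ascending)
```

Note that a spreadsheet COUNT skips non-numeric cells, while `len` counts every entry, so the two only agree when the range holds numbers only.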

Balance needs and expectations across your team

Terms and definitions from Course 2

Action-oriented question: A question whose answers lead to change

Algorithm: A process or set of rules followed for a specific task

AVERAGE: A spreadsheet function that returns an average of the values from a
selected range

Big data: Large, complex datasets typically involving long periods of time,
which enable data analysts to address far-reaching business problems

Borders: Lines that can be added around two or more cells on a spreadsheet

Cell reference: A cell or a range of cells in a worksheet typically used in
formulas and functions

Cloud: A place to keep data online, rather than a computer hard drive

COUNT: A spreadsheet function that counts the number of cells in a range that
meet specific criteria

Dashboard: A tool that monitors live, incoming data

Data analysis process: The six phases of ask, prepare, process, analyze, share,
and act whose purpose is to gain insights that drive informed decision-making

Data-inspired decision-making: The process of exploring different data sources
to find out what they have in common

Data life cycle: The sequence of stages that data experiences, which include
plan, capture, manage, analyze, archive, and destroy

Equation: A calculation that involves addition, subtraction, multiplication, or
division (also called a math expression)

Fill handle: A box in the lower-right-hand corner of a selected spreadsheet cell
that can be dragged through neighboring cells in order to continue an instruction

Filtering: The process of showing only the data that meets specified criteria
while hiding the rest

Header: The first row in a spreadsheet that labels the type of data in each
column

Leading question: A question that steers people toward a certain response



Math expression: A calculation that involves addition, subtraction,
multiplication, or division (also called an equation)

Math function: A function that is used as part of a mathematical formula

MAX: A spreadsheet function that returns the largest numeric value from a
range of cells

Measurable question: A question whose answers can be quantified and assessed

Metric: A single, quantifiable type of data that is used for measurement

Metric goal: A measurable goal set by a company and evaluated using metrics

MIN: A spreadsheet function that returns the smallest numeric value from a
range of cells

Open data: Data that is available to the public

Operator: A symbol that names the operation or calculation to be performed

Order of operations: Using parentheses to group together spreadsheet values in
order to clarify the order in which operations should be performed

Pivot chart: A chart created from the fields in a pivot table

Pivot table: A data summarization tool used to sort, reorganize, group, count,
total, or average data

Problem domain: The area of analysis that encompasses every activity affecting
or affected by a problem

Problem types: The various problems that data analysts encounter, including
categorizing things, discovering connections, finding patterns, identifying
themes, making predictions, and spotting something unusual

Qualitative data: A subjective and explanatory measure of a quality or
characteristic

Quantitative data: A specific and objective measure, such as a number, quantity,
or range

Range: A collection of two or more cells in a spreadsheet

Reframing: Restating a problem or challenge, then redirecting it toward a
potential resolution

Relevant question: A question that has significance to the problem to be solved

Report: A static collection of data periodically given to stakeholders

Return on investment (ROI): A formula that uses the metrics of investment and
profit to evaluate the success of an investment

Revenue: The total amount of income generated by the sale of goods or services

Scope of work (SOW): An agreed-upon outline of the tasks to be performed
during a project

Small data: Small, specific data points typically involving a short period of time,
which are useful for making day-to-day decisions

SMART methodology: A tool for determining a question’s effectiveness based
on whether it is specific, measurable, action-oriented, relevant, and time-bound

Sorting: The process of arranging data into a meaningful order to make it easier
to understand, analyze, and visualize

Specific question: A question that is simple, significant, and focused on a single
topic or a few closely related ideas

Structured thinking: The process of recognizing the current problem or situation,
organizing available information, revealing gaps and opportunities, and
identifying options

SUM: A spreadsheet function that adds the values of a selected range of cells

Time-bound question: A question that specifies a timeframe to be studied

Turnover rate: The rate at which employees voluntarily leave a company

Unfair question: A question that makes assumptions or is difficult to answer
honestly
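
The glossary describes return on investment (ROI) as a formula built from investment and profit, without writing the formula out. A common formulation is net profit divided by the cost of the investment; the sketch below assumes that version. The function name and the figures are hypothetical, chosen only for illustration.

```python
def return_on_investment(net_profit: float, investment_cost: float) -> float:
    """Return ROI as a fraction: net profit divided by the cost of the investment.

    Assumes the common formulation ROI = net profit / cost of investment.
    """
    if investment_cost == 0:
        raise ValueError("investment_cost must be non-zero")
    return net_profit / investment_cost


# Hypothetical figures: a campaign that cost 2,000 and produced 500 in net profit.
roi = return_on_investment(net_profit=500, investment_cost=2000)
print(f"ROI: {roi:.0%}")
```

Returning a fraction instead of a percentage keeps the value easy to reuse in later calculations; formatting as a percentage can happen at display time.
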
DATA EXPLORATION

Question:

Understanding the different types of data and data structures

What type of data is right for the question you’re answering

How data is generated

Different formats, types, and structures of data

Analyze data for bias and credibility

What “clean data” means

Database

Extract your own data using spreadsheets and SQL

How the data is collected

Interviews

Observations

Forms

Questionnaires

Surveys

Cookies

The right data needs to be collected


Data collection considerations (points to weigh when collecting data)

How the data will be collected

Choose data sources

First-party data: Data collected by an individual or group using their own
resources

Second-party data: Data collected by a group directly from its audience and
then sold

Third-party data: Data collected from outside sources who did not collect
it directly

Decide what data to use

How much data to collect

Select the right data type

Determine the time frame

Population: All possible data values in a certain dataset

Sample: A part of a population that is representative of the population
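
One simple way to draw a sample from a population is random sampling, where every member has the same chance of being picked. The sketch below assumes a hypothetical population of 1,000 customer IDs and a sample size of 50. Random selection helps avoid selection bias, but it does not by itself guarantee that a small sample is representative.

```python
import random

# Hypothetical population: IDs for 1,000 customers.
population = list(range(1, 1001))

# A fixed seed makes the draw repeatable, which is useful when sharing an analysis.
rng = random.Random(42)

# Draw 50 distinct members of the population without replacement.
sample = rng.sample(population, k=50)

print(len(sample))                      # size of the sample
print(set(sample) <= set(population))   # every sampled ID comes from the population
```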

Analyzing data and data structures

Discrete data

Data that is counted and has a limited number of values

Continuous data

Data that is measured and can have almost any numeric value

Nominal data

A type of qualitative data that is categorized without a set order

Ordinal data

A type of qualitative data with a set order or scale

Internal data

Data that lives within a company’s own systems

External data

Data that lives and is generated outside of an organization

Structured data

Data organized in a certain format such as rows and columns

Unstructured data

Data that is not organized in any easily identifiable manner
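
The data types above can be seen side by side in a small structured dataset. The sketch below uses hypothetical survey responses stored as rows with the same fields, and shows how ordinal data differs from nominal data: ordinal values can be sorted by their scale, while nominal values have no meaningful order. All field names and values are assumptions made for illustration.

```python
# Structured data: every record follows the same layout, like rows and columns.
responses = [
    {"id": 1, "visits": 3, "spend": 42.50, "color": "blue", "satisfaction": "high"},
    {"id": 2, "visits": 1, "spend": 9.99, "color": "red", "satisfaction": "low"},
    {"id": 3, "visits": 5, "spend": 120.00, "color": "green", "satisfaction": "medium"},
]

# visits       -> discrete data (counted, limited number of values)
# spend        -> continuous data (measured, almost any numeric value)
# color        -> nominal data (categories without a set order)
# satisfaction -> ordinal data (categories with a set order or scale)

# The scale that gives the ordinal values their order.
SATISFACTION_ORDER = ["low", "medium", "high"]

# Sort rows from least to most satisfied by each value's position on the scale.
by_satisfaction = sorted(
    responses,
    key=lambda row: SATISFACTION_ORDER.index(row["satisfaction"]),
)
ordered_ids = [row["id"] for row in by_satisfaction]
print(ordered_ids)
```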
