PBI Desktop Fundamentals Training Session 1

The document provides an overview of a two-session Power BI Desktop training. Session one covers downloading and installing Power BI Desktop, connecting to and transforming data sources, and building a relational data model. Session two focuses on analyzing data using DAX measures and calculated columns as well as creating interactive reports and dashboards. The training aims to provide foundational understanding of Power BI Desktop and covers some key concepts and tools but not more advanced features.

Uploaded by Mai Anh Le
© All Rights Reserved

OPCO SC DIGITAL

POWER BI DESKTOP TRAINING

TRAINING STRUCTURE

Session 01: Thursday, 26/08/2021


 Download & Install, Introduction about Power BI Desktop
 Connect, transform data from sources
 Build relational data model

Session 02: Friday, 27/08/2021


 Analyze data with DAX measures, calculated columns
 Create, design interactive reports and dashboards

OBJECTIVES

Use Power BI Desktop to:
• Connect and transform the raw data
• Build a relational data model
• Create new calculated columns and DAX measures
• Design an interactive report to analyze and visualize the data
SETTING EXPECTATIONS

This training aims to get you up & running with Power BI Desktop
 The goal is to provide a foundational understanding of Power BI Desktop
 Some concepts may be simplified, and we will not cover some of the more advanced tools (M code, custom R visuals, advanced DAX, etc.)

Power BI and Power Pivot in Excel are built on the exact same engine
 Expect familiar content if you're already comfortable with Power Query and data modeling fundamentals

We will not cover Power BI Service as part of this course
 This course focuses on Power BI Desktop specifically; online sharing and collaboration features (app.powerbi.com) are covered in a separate training
INTRODUCING POWER BI

MEET POWER BI

 Power BI is a standalone Microsoft business intelligence product, which includes both desktop and web-based applications for loading, modeling, and visualizing data

 More information at powerbi.microsoft.com
WHY POWER BI?

Connect, transform and analyze millions of rows of data
 Access data from virtually anywhere (database tables, flat files, cloud services, folders, etc.), and create fully automated data shaping and loading (ETL) procedures

Build relational models to blend data from multiple sources
 Create table relationships to analyze holistic performance across an entire data model

Define complex calculations using Data Analysis Expressions (DAX)
 Enhance datasets and enable advanced analytics with powerful and portable DAX expressions

Visualize data with interactive reports & dashboards
 Build custom business intelligence tools with best-in-class visualization and dashboard features

Power BI is the industry leader among BI platforms
 Microsoft Power BI is intuitive, powerful and absolutely FREE to get started
POWER BI VS EXCEL

Shared engine: Data Shaping (Power Query), Data Modeling (Power Pivot), Calculated Fields (DAX)
 EXCEL adds: PivotTables, PivotCharts, Power Map/Power View
 POWER BI adds: Report View, Custom Visualization Tools (R-Visuals, Bookmarks, Interactions, etc.), Publishing & Collaboration Options (Power BI Service)

Excel and Power BI are built on top of the same engine
• Power BI takes the same data shaping, modeling and analytics capabilities and adds new reporting and publishing tools
• Transitioning is easy; you can import an entire data model directly from Excel
THE POWER BI INTERFACE

Three core views: Report, Data, and Relationships (Model)
 Key panes: Filter, Visualization, and Fields tools
THE POWER BI WORKFLOW

 Connect, shape and transform raw data
 Build table relationships to create a data model
 Design interactive reports to explore and visualize data
REPORT VIEW TABS

HOME*: Home tools are split across Home & Insert

INSERT:

MODELING:

VIEW: Performance analyzer to tune & optimize reports; new report theme designs & previews

HELP:
DATA VIEW TABS (CONTEXTUAL)
TABLE TOOLS:
Access table attributes,
manage relationships, add
new calculations, etc.

COLUMN TOOLS:
Access column attributes, set
data types and formats, use
sorting and grouping tools, etc.

MEASURE TOOLS:
Access measure attributes,
determine home table, set
formats and categories, etc.
CONNECTING & SHAPING DATA

TYPES OF DATA CONNECTORS

 Power BI can connect to several data sources, including:
• Files & Folders (csv, text, xls, etc.)
• Databases (SQL, Access, Oracle, IBM, Azure, etc.)
• Online Services (SharePoint, GitHub, Dynamics 365, Google Analytics, Salesforce, Power BI Service, etc.)
• Others (Web, R scripts, Spark, Hadoop, etc.)

Practice – OneDrive/SharePoint File Connection
THE QUERY EDITOR

Key areas: Query Editing Tools (ribbon), Formula Bar (M code), Table Name & Properties, Query Pane, Applied Steps
QUERY EDITING TOOLS

The HOME tab includes general settings and common table transformation tools

The TRANSFORM tab includes tools to modify existing columns (splitting/grouping, transposing, extracting text, etc)

The ADD COLUMN tools create new columns (based on conditional rules, text operations, calculations, dates, etc)
BASIC TABLE TRANSFORMATIONS

• Change data type (date, $, percentages)
• Promote header row
• Choose or remove columns
• Keep or remove rows

 You can also right-click the column header to access similar tools
Practice – Basic Table Transformations

 Connect the Products CSV file to Power BI
 Rename the query to a structured, meaningful name
 Remove unnecessary columns
TEXT-SPECIFIC TOOLS

 Split a text column based on either a specific delimiter or a number of characters

 Extract characters from a text column based on fixed lengths, first/last, ranges or delimiters

 Format a text column to upper, lower or proper case, or add a prefix or suffix

Tip: Select two or more columns to merge (or concatenate) fields

 You can access many of these tools in both the "Transform" and "Add Column" menus; the difference is whether you want to add a new column or modify an existing one
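These text tools generate M code through the Query Editor UI; as an illustrative parallel only (not what Power BI runs internally), here is the same split/extract/merge logic sketched in plain Python, with hypothetical function names:

```python
# Conceptual parallel to the Query Editor's text tools.
# Function names here are illustrative, not Power BI APIs.

def split_by_delimiter(value, delimiter):
    """Like Split Column > By Delimiter."""
    return value.split(delimiter)

def text_before_delimiter(value, delimiter):
    """Like Add Column > Extract > Text Before Delimiter."""
    return value.split(delimiter, 1)[0]

def merge_columns(parts, separator=" "):
    """Like selecting two columns and choosing Merge Columns."""
    return separator.join(parts)

email = "mai.le@example.com"          # hypothetical sample value
username = text_before_delimiter(email, "@")   # "mai.le"
full_name = merge_columns(["Mai", "Le"])       # "Mai Le"
```

The same idea underlies the EmailAddress exercises on the next slide: the username is simply the text before the "@" delimiter.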
Practice – Text Tools

 Connect the Customers CSV file to Power BI
 Rename the query to a structured, meaningful name
 Remove unnecessary columns
Practice – Text Tools

 Merge the name columns into a new column
 Extract the username from EmailAddress using Add Column – Extract – Text Before Delimiter
 Try extracting the domain name from EmailAddress using Add Column – Extract – Text Between Delimiters
NUMBER-SPECIFIC TOOLS

 Statistics tools allow you to evaluate basic stats for the selected column (sum, min/max, average, count, count distinct, etc.)

 Standard, Scientific and Trigonometry tools allow you to apply standard operations (addition, multiplication, division, etc.) or more advanced calculations (power, logarithm, sine, tangent, etc.) to each value in a column

 Information tools allow you to define binary flags (TRUE/FALSE or 1/0) to mark each row in a column as even, odd, positive or negative
Practice – Number Tools

 Round the Product Cost and Price columns to 2 decimal places
 Calculate a new DiscountPrice column using Add Column – Standard – Multiply
DATE-SPECIFIC TOOLS

 Date & Time tools are relatively straightforward, and include the following options:
• Age: Difference between the current time and the date in each row
• Date Only: Removes the time component of a date/time field
• Year/Month/Quarter/Week/Day: Extracts individual components from a date field (Time-specific options include Hour, Minute, Second, etc.)
• Earliest/Latest: Evaluates the earliest or latest date from a column as a single value (can only be accessed from the "Transform" menu)

 Note: You will almost always want to perform these operations from the "Add Column" menu to build out new fields, rather than transforming an individual date/time column
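The options above map onto standard date arithmetic. As a conceptual sketch only (Power BI generates M code for these steps; the sample date below is hypothetical), the same extractions in plain Python look like:

```python
# Conceptual parallel to the Add Column > Date tools.
from datetime import date

order_date = date(2017, 6, 15)   # hypothetical sample row value

year = order_date.year                     # Year
month = order_date.month                   # Month
day_name = order_date.strftime("%A")       # Day Name, e.g. "Thursday"

# Start of Week (Monday): step back by the Monday-based weekday offset
start_of_week = date.fromordinal(order_date.toordinal() - order_date.weekday())

# Age: difference between a reference date and the date in each row
age_days = (date(2021, 8, 26) - order_date).days
```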
Practice – Date Tools

 Connect the Customers CSV file to Power BI
 Generate a Day Name column using the Date tools
Practice – Date Tools

 Create a Start of Week column with Monday as the starting day
 Generate the 4 new columns shown using the Date tools
ADDING INDEX COLUMNS

 Index Columns contain a list of sequential values that can be used to identify each unique row in a table (typically starting from 0 or 1)

 These columns are often used to create unique IDs that can be used to form relationships between tables (more on that later!)
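An index column is just a running counter attached to each row. As an illustrative sketch (the product names below are hypothetical sample data, and Power BI does this through Add Column > Index Column rather than code):

```python
# Conceptual parallel to Add Column > Index Column (From 1).
rows = [
    {"product": "Cream Soda"},
    {"product": "Diet Cream Soda"},
    {"product": "Root Beer"},
]

# Attach a unique sequential ID to each row, starting from 1.
indexed = [{"Index": i, **row} for i, row in enumerate(rows, start=1)]
```

Because every row receives a distinct value, the new column can later serve as a primary key for relationships.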
ADDING CONDITIONAL COLUMNS

 Conditional Columns allow you to define new fields based on logical rules and conditions (IF/THEN statements)

 In this case we're creating a new conditional column called "QuantityType", which depends on the values in the "OrderQuantity" column, as follows:
• If OrderQuantity = 1, QuantityType = "Single Item"
• If OrderQuantity > 1, QuantityType = "Multiple Items"
• Otherwise QuantityType = "Other"
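The three rules above can be sketched as a plain IF/THEN function. This is only a conceptual parallel; in Power BI you build the same logic through the Conditional Column dialog, which writes the M code for you:

```python
# Conceptual parallel to the "QuantityType" conditional column.
def quantity_type(order_quantity):
    if order_quantity == 1:
        return "Single Item"
    elif order_quantity > 1:
        return "Multiple Items"
    else:
        return "Other"

# Hypothetical sample rows, evaluated the way the rule applies per row.
rows = [{"OrderQuantity": 1}, {"OrderQuantity": 4}, {"OrderQuantity": 0}]
for row in rows:
    row["QuantityType"] = quantity_type(row["OrderQuantity"])
```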
Practice – Conditional Columns

 Connect the Sales 2017 CSV file to Power BI
 Create an Index column starting from 1 and move the column to the beginning of the query
Practice – Conditional Columns

 Create a Conditional column for Quantity types, with OrderQuantity as the condition, as shown
GROUPING & AGGREGATING DATA

 Group By allows you to aggregate your data at a different level (i.e. transform daily data into monthly, roll up transaction-level data by store, etc.)

 In this case we're transforming a daily, transaction-level table into a summary of "TotalQuantity" rolled up by "ProductKey"

 NOTE: Any fields not specified in the Group By settings are lost
GROUPING & AGGREGATING DATA (ADVANCED)

 This time we're transforming the daily, transaction-level table into a summary of "TotalQuantity" aggregated by both "ProductKey" and "CustomerKey" (using the advanced option in the dialog box)

 NOTE: This is similar to creating a PivotTable in Excel and pulling in "Sum of OrderQuantity" with ProductKey and CustomerKey as row labels
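The advanced Group By step can be sketched as a keyed aggregation. This is an illustrative parallel with hypothetical sample rows, not Power BI's internal implementation; note how columns outside the grouping keys and aggregates disappear, just as the slide warns:

```python
# Conceptual parallel to Group By (advanced): sum OrderQuantity
# per (ProductKey, CustomerKey) pair.
from collections import defaultdict

sales = [
    {"ProductKey": 1, "CustomerKey": 10, "OrderQuantity": 2},
    {"ProductKey": 1, "CustomerKey": 10, "OrderQuantity": 3},
    {"ProductKey": 2, "CustomerKey": 11, "OrderQuantity": 1},
]

totals = defaultdict(int)
for row in sales:
    totals[(row["ProductKey"], row["CustomerKey"])] += row["OrderQuantity"]

# Only the grouping keys and the aggregate survive.
grouped = [
    {"ProductKey": p, "CustomerKey": c, "TotalQuantity": q}
    for (p, c), q in totals.items()
]
```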
Practice – Group By

 Group by ProductKey and CustomerKey to create a new Total Quantity column
MERGING QUERIES

 Merging queries allows you to join tables based on a common column (like VLOOKUP)

 In this case we're merging the AW_Sales_Data table with the AW_Product_Lookup table, which share a common "ProductKey" column

 NOTE: Merging adds columns to an existing table

 Just because you can merge tables doesn't mean you should. In general, it's better to keep tables separate and define relationships between them.
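Conceptually, a merge is a keyed join: each sales row looks up its matching product attributes, much like VLOOKUP. A minimal sketch with hypothetical sample data (Power BI performs this through the Merge Queries dialog, not code):

```python
# Conceptual parallel to Merge Queries on ProductKey.
products = {
    1: {"ProductName": "Cream Soda"},
    2: {"ProductName": "Root Beer"},
}
sales = [
    {"ProductKey": 1, "OrderQuantity": 2},
    {"ProductKey": 2, "OrderQuantity": 5},
]

# Each sales row picks up the matching product columns (adds COLUMNS).
merged = [{**row, **products[row["ProductKey"]]} for row in sales]
```

Note how the product name is now duplicated into every matching sales row; that redundancy is exactly why the slide recommends relationships over physical merges.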
Practice – Merge Queries

 Merge the Sales 2017 and Product Lookup queries by ProductKey
APPENDING QUERIES

 Appending queries allows you to combine (or stack) tables that share the exact same column structure and data types

 In this case we're appending the AdventureWorks_Sales_2015 table to the AdventureWorks_Sales_2016 table, which is valid since they share identical table structures

 NOTE: Appending adds rows to an existing table

 Use the "Folder" option (Get Data > More > Folder) to append all files within a folder (assuming they share the same structure); as you add new files, simply refresh the query and they will automatically append.
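Appending is row-wise stacking of identically shaped tables. An illustrative sketch with hypothetical sample rows (in Power BI this is the Append Queries dialog or the Folder connector):

```python
# Conceptual parallel to Append Queries: stack tables with
# identical column structures (adds ROWS, not columns).
sales_2015 = [{"OrderDate": "2015-01-05", "OrderQuantity": 1}]
sales_2016 = [{"OrderDate": "2016-03-09", "OrderQuantity": 2}]
sales_2017 = [{"OrderDate": "2017-06-15", "OrderQuantity": 4}]

# Appending is only valid when every table shares the same columns.
assert all(row.keys() == sales_2015[0].keys()
           for table in (sales_2016, sales_2017) for row in table)

all_sales = sales_2015 + sales_2016 + sales_2017
```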
Practice – Appending Queries from a Folder

 Append Sales 2015, 2016 and 2017 into one query from the AW_Sales folder
DATA SOURCE SETTINGS

 The Data Source Settings in the Query Editor allow you to manage data connections and permissions

 If the file name or location changes, you will need to change the source and browse to the current version

*Copyright 2018, Excel Maven & Maven Analytics, LLC
REFRESHING QUERIES

 By default, ALL queries in the model will refresh when you use the "Refresh" command from the Home tab

 From the Query Editor, uncheck "Include in report refresh" to exclude individual queries from the refresh

 Exclude queries that don't change often, like lookups or static data tables
DEFINING DATA CATEGORIES

 Select a column in the Data view to access Column Tools, where you can edit field properties to define specific categories

 This is commonly used to help Power BI accurately map location-based fields like addresses, countries, cities, latitude/longitude coordinates, zip codes, etc.
DEFINING HIERARCHIES

Hierarchies are groups of nested columns that reflect multiple levels of granularity
 For example, a "Geography" hierarchy might include Country, State, and City columns
 Each hierarchy can be treated as a single item in tables and reports, allowing users to "drill up" and "drill down" through different levels of the hierarchy in a meaningful way

1) From within the Data view, right-click a field (or click the ellipsis) and select "New hierarchy"
2) This creates a hierarchy field containing "Start of Year", which we've renamed "Date Hierarchy"
3) Right-click other fields (like "Start of Month") and select "Add to Hierarchy"
Practice – Creating a Hierarchy

 Create a Territory hierarchy by Continent – Country – Region
BEST PRACTICES: CONNECTING & SHAPING DATA

Get yourself organized before loading the data into Power BI
 Define clear and intuitive table names from the start; updating them later can be a headache, especially if you've referenced them in multiple places
 Establish a file/folder structure that makes sense from the start, to avoid having to modify data source settings if file names or locations change

Disable report refresh for any static sources
 There's no need to constantly refresh sources that don't update frequently (or at all), like lookups or static data tables; only enable refresh for tables that will be changing

When working with large tables, only load the data you need
 Don't include hourly data when you only need daily, or product-level transactions when you only care about store-level performance; extra data will only slow you down
CREATING A DATA MODEL

WHAT'S A "DATA MODEL"?

 This IS NOT a data model
 This is a collection of independent tables, which share no connections or relationships
 If we visualize Orders and Returns by Product, this is the result
WHAT'S A "DATA MODEL"?

 This IS a data model
• The tables are connected via relationships, based on the common ProductKey field
• Now the Sales and Returns tables know how to filter using fields from the Product table
Practice – Compare Unconnected and Connected Data Models

 Understand the differences between the matrix after a data model is created and the one without
 Data model created: relationships between the Products table and the Sales and Returns tables
DATABASE NORMALIZATION

Normalization is the process of organizing the tables and columns in a relational database to reduce redundancy and preserve data integrity. It's commonly used to:
• Eliminate redundant data to decrease table sizes and improve processing speed & efficiency
• Minimize errors and anomalies from data modifications (inserting, updating or deleting records)
• Simplify queries and structure the database for meaningful analysis

TIP: In a normalized database, each table should serve a distinct and specific purpose (product information, dates, transaction records, customer attributes, etc.)

 When you don't normalize, you end up with tables like this; all of the rows with duplicate product info could be eliminated with a lookup table based on product_id

 This may not seem critical now, but minor inefficiencies can become major problems as databases scale in size
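To make the idea concrete, here is a minimal sketch of normalizing a flat table: the duplicated product text is factored into a lookup keyed by product_id, and the fact table keeps only values plus the foreign key. The sample rows are hypothetical:

```python
# Normalization sketch: split a flat, denormalized table into a
# product lookup (one row per product_id) and a lean fact table.
flat = [
    {"date": "2017-01-01", "product_id": 1, "product": "Cream Soda", "qty": 2},
    {"date": "2017-01-02", "product_id": 1, "product": "Cream Soda", "qty": 3},
    {"date": "2017-01-02", "product_id": 2, "product": "Root Beer",  "qty": 1},
]

# Lookup table: the duplicated product names collapse to one entry each.
product_lookup = {r["product_id"]: r["product"] for r in flat}

# Fact table: keeps only the values and the foreign key.
fact = [{"date": r["date"], "product_id": r["product_id"], "qty": r["qty"]}
        for r in flat]
```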
DATA TABLES VS. LOOKUP TABLES

Models generally contain two types of tables: data (or "fact") tables, and lookup (or "dimension") tables
• Data tables contain numbers or values, typically at a detailed level, with ID or Key columns that can be used to create table relationships
• Lookup tables provide descriptive, often text-based attributes about each dimension in a table

 This Calendar Lookup table provides additional attributes about each date (month, year, weekday, quarter, etc.)
 This Product Lookup table provides additional attributes about each product (brand, product name, sku, price, etc.)
 This Data Table contains "quantity" values, and connects to lookup tables via the "date" and "product_id" columns
PRIMARY VS. FOREIGN KEYS

 Primary keys uniquely identify each row of a table, and match the foreign keys in related data tables
 Foreign keys contain multiple instances of each value, and are used to match the primary keys in related lookup tables
Practice – Arrange and Define Data and Lookup Tables; Primary and Foreign Keys

 Arrange the tables in the Relationships view and try to define the primary and foreign keys
RELATIONSHIPS VS. MERGED TABLES

Can I just merge queries or use LOOKUP or RELATED functions to pull those attributes into the fact table itself, so that I have everything in one place?

(Shown: the original fact table fields, plus attributes merged in from the Calendar Lookup and Product Lookup tables)

 Sure you can, but it's inefficient
• Merging data in this way creates redundant data and utilizes significantly more memory and processing power than creating relationships between multiple small tables
CREATING TABLE RELATIONSHIPS

Option 1: Click and drag to connect primary and foreign keys within the Relationships pane
Option 2: Add or detect relationships using the "Manage Relationships" dialog box
Practice – Create Relationships

 Create the relationships between tables in the Relationships view, using both methods (Manage Relationships & drag and drop)
CREATING "SNOWFLAKE" SCHEMAS

 The Sales_Data table can connect to Products using the ProductKey field, but cannot connect directly to the Subcategories or Categories tables

 By creating relationships from Products to Subcategories (using ProductSubcategoryKey) and Subcategories to Categories (using ProductCategoryKey), we have essentially connected Sales_Data to each lookup table; filter context will now flow all the way down the chain

TIP: Models with chains of dimension tables are often called "snowflake" schemas (whereas "star" schemas usually have individual lookup tables surrounding a central data table)
Practice – Create a Snowflake Schema

 Create a snowflake schema from the relationships between Sales and the Product categories in the Relationships view
MANAGING & EDITING RELATIONSHIPS

 The "Manage Relationships" dialog box allows you to add, edit, or delete table relationships
 Editing tools allow you to activate/deactivate relationships, view cardinality, and modify the cross filter direction (stay tuned!)
Practice – Edit a Relationship

 Try editing a relationship from OrderDate to StockDate and back (between the Sales and Date tables)
ACTIVE VS. INACTIVE RELATIONSHIPS

 The Sales_Data table contains two date fields (OrderDate & StockDate), but there can only be one active relationship to the Date field in the Calendar table

 Double-click the relationship line and check the "Make this relationship active" box to toggle (note that you have to deactivate one in order to activate another)
RELATIONSHIP CARDINALITY

 Cardinality refers to the uniqueness of values in a column
• For our purposes, all relationships in the data model should follow a "one-to-many" cardinality: one instance of each primary key, but potentially many instances of each foreign key

 In this case, there is only ONE instance of each ProductKey in the Products table (noted by the "1"), since each row contains attributes of a single product (Name, SKU, Description, Retail Price, etc.)

 There are MANY instances of each ProductKey in the Sales_Data table (noted by the asterisk *), since there are multiple sales associated with each product
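The one-to-many rule can be checked mechanically: the lookup side must hold each key exactly once, while the data side may repeat keys freely. A minimal sketch with a hypothetical helper function (this is a conceptual check, not something Power BI exposes as an API):

```python
# Conceptual check for one-to-many cardinality between a lookup
# table's key column and a data table's foreign-key column.
def is_one_to_many(lookup_keys, data_keys):
    """True if the lookup side is unique (the '1' side) and every
    foreign key on the data side resolves to a lookup row."""
    unique = len(lookup_keys) == len(set(lookup_keys))
    resolvable = set(data_keys) <= set(lookup_keys)
    return unique and resolvable

product_keys_in_lookup = [1, 2, 3]        # one row per product
product_keys_in_sales = [1, 1, 2, 3, 3]   # many sales per product
```

Duplicate keys on the lookup side are exactly the many-to-many situation the next slide warns against.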
CARDINALITY TO AVOID: MANY-TO-MANY

• If we try to connect these tables using product_id, we'll get a "many-to-many relationship" error since there are multiple instances of each ID in both tables
• Even if we could create this relationship, how would you know which product was actually sold on each date – Cream Soda or Diet Cream Soda?
CARDINALITY TO AVOID: ONE-TO-ONE

• Connecting the two tables above using the product_id field creates a one-to-one relationship, since each ID only appears once in each table
• Unlike many-to-many, there is nothing illegal about this relationship; it's just inefficient

 To eliminate the inefficiency, you could simply merge the two tables into a single, valid lookup
 Normalization is still maintained since all rows are unique and capture attributes related to the primary key
CONNECTING MULTIPLE DATA TABLES

 This model contains two data tables: Sales_Data and Returns_Data

 Note that the Returns table connects to Calendar and Product_Lookup just like the Sales table, but without a CustomerKey field it cannot be joined to Customer_Lookup

 This allows us to analyze sales and returns within the same view, but only if we filter or segment the data using shared lookups. In other words, we know which product was returned and on which date, but nothing about which customer made the return
Practice – Adding a Data Table and Observing a Possible Issue

 Add the Returns data table and check the issue caused by the lack of a relationship between the new data table and the Customer table
FILTER FLOW

 Here we have two data tables (Sales_Data and Returns_Data), connected to Territory_Lookup

 Note the filter directions (shown as arrows) in each relationship; by default, these will point from the "one" side of the relationship (lookups) to the "many" side (data)
 When you filter a table, that filter context is passed along to all related "downstream" tables (following the direction of the arrow)
 Filters cannot flow "upstream" (against the direction of the arrow)

TIP: Arrange your lookup tables above your data tables in your model as a visual reminder that filters flow "downstream"
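Downstream filter flow can be sketched as a two-step lookup: filter the "one" side, then keep only the data rows whose foreign key survives. The tables and values below are hypothetical sample data, and this is only a conceptual model of what the Power BI engine does:

```python
# Conceptual sketch of filter flow: a filter on the lookup (the "1"
# side) propagates downstream to the data table via the foreign key.
territory_lookup = [
    {"TerritoryKey": 1, "Continent": "Europe"},
    {"TerritoryKey": 2, "Continent": "Asia"},
]
sales_data = [
    {"TerritoryKey": 1, "Qty": 5},
    {"TerritoryKey": 2, "Qty": 3},
    {"TerritoryKey": 1, "Qty": 1},
]

# Step 1: filter the lookup table.
selected = {r["TerritoryKey"] for r in territory_lookup
            if r["Continent"] == "Europe"}

# Step 2: the filter context flows downstream into the data table.
filtered_sales = [r for r in sales_data if r["TerritoryKey"] in selected]
```

Going the other way, from a filtered data table back up to the lookup, would require a two-way ("Both") cross filter, which is the next slide's topic.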
TWO-WAY FILTERS

 Updating the filter direction between Sales and Territory from "Single" to "Both" allows filter context to flow both ways
 This means that filters applied to the Sales_Data table will pass to the lookup, and then down to the Returns_Data table
Practice – Apply Two-Way Filters and Observe a Possible Issue

 Apply a two-way filter for the Returns TerritoryKey and check the issue with the matrix filtering
BEST PRACTICES: DATA MODELING

Focus on building a normalized model from the start
 Make sure that each table in your model serves a single, distinct purpose
 Use relationships vs. merged tables; long & narrow tables are better than short & wide

Organize lookup tables above data tables in the diagram view
 This serves as a visual reminder that filters flow "downstream"

Avoid complex cross-filtering unless absolutely necessary
 Don't use two-way filters when 1-way filters will get the job done
THANK YOU!
