Azure Data Fundamentals

The document provides an overview of a Microsoft Azure Virtual Training Day on data fundamentals. It outlines the course objectives of describing core data concepts, identifying services for relational and non-relational data, and identifying services for data analytics. The course agenda covers exploring fundamentals of data, relational data in Azure, non-relational data in Azure, large-scale data warehousing, real-time analytics, and data visualization. It also lists upcoming lessons and demos.

Uploaded by

Thandeka Skosana

© Copyright Microsoft Corporation. All rights reserved.

FOR USE ONLY AS PART OF MICROSOFT VIRTUAL TRAINING DAYS PROGRAM. THESE MATERIALS ARE NOT AUTHORIZED
FOR DISTRIBUTION, REPRODUCTION OR OTHER USE BY NON-MICROSOFT PARTIES.
Microsoft Azure Virtual
Training Day: Data
Fundamentals

We will be starting shortly


About this course

Course objectives:
• Describe core data concepts
• Identify services for relational data
• Identify services for non-relational data
• Identify services for data analytics

This course is supplemented by online training at
https://aka.ms/AzureLearn_DataFundamentals
Course Agenda

Module 1: Explore fundamentals of data
• Core data concepts
• Data roles and services

Module 2: Explore fundamentals of relational data in Azure
• Explore relational data concepts
• Explore Azure services for relational data

Module 3: Explore fundamentals of non-relational data in Azure
• Fundamentals of Azure Storage
• Fundamentals of Azure Cosmos DB

Module 4: Explore fundamentals of large-scale data warehousing
• Large-scale data warehousing

Module 5: Explore fundamentals of real-time analytics
• Streaming and real-time analytics

Module 6: Explore fundamentals of data visualization
• Data visualization
Demos

• Demos in this course are based on exercises in Microsoft Learn
Module 1:
Explore Fundamentals of Data

• Lesson 1: Core data concepts


• Lesson 2: Data roles and services
Lesson 1: Core Data Concepts
What is data?
Values used to record information – often representing entities that have one or more attributes

Structured

Customer
ID  FirstName  LastName  Email                Address
1   Joe        Jones     [email protected]    1 Main St.
2   Samir      Nadoy     samir@northwind.com  123 Elm Pl.

Product
ID   Name         Price
123  Hammer       2.99
162  Screwdriver  3.49
201  Wrench       4.25

Semi-structured

{
  "firstName": "Joe",
  "lastName": "Jones",
  "address":
  {
    "streetAddress": "1 Main St.",
    "city": "New York",
    "state": "NY",
    "postalCode": "10099"
  },
  "contact":
  [
    {
      "type": "home",
      "number": "555 123-1234"
    },
    {
      "type": "email",
      "address": "[email protected]"
    }
  ]
}

{
  "firstName": "Samir",
  "lastName": "Nadoy",
  "address":
  {
    "streetAddress": "123 Elm Pl.",
    "unit": "500",
    "city": "Seattle",
    "state": "WA",
    "postalCode": "98999"
  },
  "contact":
  [
    {
      "type": "email",
      "address": "[email protected]"
    }
  ]
}

Unstructured
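The difference between structured and semi-structured data can be sketched in a few lines of Python. The example below is only an illustration: it loads two trimmed customer documents modelled on the ones above and compares the fields each address carries. The variable names are assumptions made for this sketch.

```python
import json

# Two customer documents modelled on the semi-structured examples above
# (trimmed for brevity; the "contact" arrays are omitted).
docs_json = """
[
  {"firstName": "Joe", "lastName": "Jones",
   "address": {"streetAddress": "1 Main St.", "city": "New York",
               "state": "NY", "postalCode": "10099"}},
  {"firstName": "Samir", "lastName": "Nadoy",
   "address": {"streetAddress": "123 Elm Pl.", "unit": "500",
               "city": "Seattle", "state": "WA", "postalCode": "98999"}}
]
"""

customers = json.loads(docs_json)

# A structured table forces every row to have the same columns;
# semi-structured documents may each carry a different set of fields.
address_fields = [sorted(c["address"].keys()) for c in customers]
only_in_second = set(address_fields[1]) - set(address_fields[0])

for customer, fields in zip(customers, address_fields):
    print(customer["firstName"], "->", fields)
print("Fields only in the second document:", only_in_second)
```

Here the second document has a "unit" field that the first lacks, which a fixed-column table could only represent with an extra, mostly empty column.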
How is data stored?
Files Databases

{
"customers":
[
{ "firstName": "Joe", "lastName": "Jones"},
{ "firstName": "Samir", "lastName": "Nadoy"}
]
}

Graph
Transactional data workloads

[Figure: an Order table]

Analytical data workloads

[Figure: a data warehouse (DW) and a data lake (DL)]
Lesson 2: Data Roles and Services
Data professional roles

Database Administrator
• Database provisioning, configuration and management
• Database security and user access
• Database backups and resiliency
• Database performance monitoring and optimization

Data Engineer
• Data integration pipelines and ETL processes
• Data cleansing and transformation
• Analytical data store schemas and data loads

Data Analyst
• Analytical modeling
• Data reporting and summarization
• Data visualization
Microsoft cloud services for data
Data stores Data engineering and analytics
Module 2:
Explore Fundamentals of Relational Data in Azure

• Lesson 1: Explore relational data concepts


• Lesson 2: Explore Azure services for relational data
Lesson 1: Explore Relational Data Concepts
Relational tables

• Data is stored in tables
• Tables consist of rows and columns
• All rows have the same columns
• Each column is assigned a datatype

Customer
ID  FirstName  MiddleName  LastName  Email                Address      City
1   Joe        David       Jones     [email protected]    1 Main St.   Seattle
2   Samir                  Nadoy     samir@northwind.com  123 Elm Pl.  New York

Product
ID   Name         Price
123  Hammer       2.99
162  Screwdriver  3.49
201  Wrench       4.25

Order
OrderNo  OrderDate  Customer
1000     1/1/2022   1
1001     1/1/2022   2

LineItem
OrderNo  ItemNo  ProductID  Quantity
1000     1       123        1
1000     2       201        2
1001     1       123        2

Normalization
• Separate each entity into its own table
• Separate each discrete attribute into its own column
• Uniquely identify each entity instance (row) using a primary key
• Use foreign key columns to link related entities

Sales Data
OrderNo  OrderDate  Customer                             Product              Quantity
1000     1/1/2022   Joe Jones, 1 Main St, Seattle        Hammer ($2.99)       1
1000     1/1/2022   Joe Jones, 1 Main St, Seattle        Screwdriver ($3.49)  2
1001     1/1/2022   Samir Nadoy, 123 Elm Pl, New York    Hammer ($2.99)       2
…        …          …                                    …                    …

Customer
ID  FirstName  LastName  Address      City
1   Joe        Jones     1 Main St.   Seattle
2   Samir      Nadoy     123 Elm Pl.  New York

Order
OrderNo  OrderDate  Customer
1000     1/1/2022   1
1001     1/1/2022   2

LineItem
OrderNo  ItemNo  ProductID  Quantity
1000     1       123        1
1000     2       201        2
1001     1       123        2

Product
ID   Name         Price
123  Hammer       2.99
162  Screwdriver  3.49
201  Wrench       4.25
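As a rough illustration of how the normalized tables fit back together, the sketch below builds them in an in-memory SQLite database (chosen here only because it ships with Python; the slides do not prescribe an engine) and joins them through their primary and foreign keys. The column types are assumptions, since the slide shows only names and values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# One table per entity; "Order" is quoted because ORDER is a reserved word.
conn.executescript("""
CREATE TABLE Customer (ID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT,
                       Address TEXT, City TEXT);
CREATE TABLE Product  (ID INTEGER PRIMARY KEY, Name TEXT, Price REAL);
CREATE TABLE "Order"  (OrderNo INTEGER PRIMARY KEY, OrderDate TEXT,
                       Customer INTEGER REFERENCES Customer(ID));
CREATE TABLE LineItem (OrderNo INTEGER REFERENCES "Order"(OrderNo),
                       ItemNo INTEGER,
                       ProductID INTEGER REFERENCES Product(ID),
                       Quantity INTEGER,
                       PRIMARY KEY (OrderNo, ItemNo));

INSERT INTO Customer VALUES (1, 'Joe', 'Jones', '1 Main St.', 'Seattle'),
                            (2, 'Samir', 'Nadoy', '123 Elm Pl.', 'New York');
INSERT INTO Product  VALUES (123, 'Hammer', 2.99), (162, 'Screwdriver', 3.49),
                            (201, 'Wrench', 4.25);
INSERT INTO "Order"  VALUES (1000, '1/1/2022', 1), (1001, '1/1/2022', 2);
INSERT INTO LineItem VALUES (1000, 1, 123, 1), (1000, 2, 201, 2), (1001, 1, 123, 2);
""")

# Follow the foreign keys to rebuild a flat, sales-style view of the data.
rows = conn.execute("""
SELECT o.OrderNo, c.FirstName || ' ' || c.LastName, p.Name, li.Quantity
FROM LineItem AS li
JOIN "Order"  AS o ON o.OrderNo = li.OrderNo
JOIN Customer AS c ON c.ID = o.Customer
JOIN Product  AS p ON p.ID = li.ProductID
ORDER BY o.OrderNo, li.ItemNo
""").fetchall()

for row in rows:
    print(row)
```

Each customer and product is stored once, and the LineItem rows refer to them by key rather than repeating names, addresses, and prices.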
Structured Query Language (SQL)
• SQL is a standard language for use with relational databases
• Standards are maintained by ANSI and ISO
• Most RDBMS systems support proprietary extensions of standard SQL

Data Definition Language (DDL)
CREATE, ALTER, DROP, RENAME

    CREATE TABLE Product
    (
        ProductID INT PRIMARY KEY,
        Name VARCHAR(20) NOT NULL,
        Price DECIMAL NULL
    );

Product
ID  Name  Price

Data Control Language (DCL)
GRANT, DENY, REVOKE

    GRANT SELECT, INSERT, UPDATE
    ON Product
    TO user1;

Data Manipulation Language (DML)
INSERT, UPDATE, DELETE, SELECT

    SELECT Name, Price
    FROM Product
    WHERE Price > 2.50
    ORDER BY Price;

Product
ID   Name         Price
123  Hammer       2.99
162  Screwdriver  3.49
201  Wrench       4.25

Results
Name         Price
Hammer       2.99
Screwdriver  3.49
Wrench       4.25
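The DDL and DML statements above can be tried out with Python's built-in sqlite3 module, as in the minimal sketch below. SQLite is an assumption made for convenience; it has no user accounts, so the GRANT statement (DCL) is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the table, using the CREATE TABLE statement from the slide.
conn.execute("""
CREATE TABLE Product
(
    ProductID INT PRIMARY KEY,
    Name VARCHAR(20) NOT NULL,
    Price DECIMAL NULL
)
""")

# DML: insert the three sample rows.
conn.executemany(
    "INSERT INTO Product (ProductID, Name, Price) VALUES (?, ?, ?)",
    [(123, "Hammer", 2.99), (162, "Screwdriver", 3.49), (201, "Wrench", 4.25)],
)

# DML: the SELECT statement from the slide.
results = conn.execute("""
SELECT Name, Price
FROM Product
WHERE Price > 2.50
ORDER BY Price
""").fetchall()

for name, price in results:
    print(name, price)
```

All three products cost more than 2.50, so the query returns every row, sorted by price.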
Other common database objects

Views
Pre-defined SQL queries that behave as virtual tables

[Figure: a Deliveries view over the Customer and Order tables]

Deliveries
OrderNo  OrderDate  Address      City
1000     1/1/2022   1 Main St.   Seattle
1001     1/1/2022   123 Elm Pl.  New York

Stored Procedures
Pre-defined SQL statements that can include parameters

[Figure: a row in the Product table being renamed]

Product
ID   Name              Price
201  Wrench → Spanner  4.25

Indexes
Tree-based structures that improve query performance

[Figure: an index over the Product table]

Product
ID   Name         Price
123  Hammer       2.99
162  Screwdriver  3.49
201  Wrench       4.25
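A view and an index can be sketched with Python's sqlite3 module as shown below. This is an illustration under stated assumptions: SQLite is used only because it is bundled with Python, the index name `idx_Product_Name` is made up, and because SQLite has no stored procedures, the hypothetical Python function `rename_product` stands in for one by wrapping a parameterized UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (ID INTEGER PRIMARY KEY, Address TEXT, City TEXT);
CREATE TABLE "Order"  (OrderNo INTEGER PRIMARY KEY, OrderDate TEXT, Customer INTEGER);
CREATE TABLE Product  (ID INTEGER PRIMARY KEY, Name TEXT, Price REAL);

INSERT INTO Customer VALUES (1, '1 Main St.', 'Seattle'), (2, '123 Elm Pl.', 'New York');
INSERT INTO "Order"  VALUES (1000, '1/1/2022', 1), (1001, '1/1/2022', 2);
INSERT INTO Product  VALUES (123, 'Hammer', 2.99), (162, 'Screwdriver', 3.49),
                            (201, 'Wrench', 4.25);

-- View: a stored query that can be read like a table.
CREATE VIEW Deliveries AS
SELECT o.OrderNo, o.OrderDate, c.Address, c.City
FROM "Order" AS o
JOIN Customer AS c ON c.ID = o.Customer;

-- Index: a lookup structure that can speed up searches by product name.
CREATE INDEX idx_Product_Name ON Product (Name);
""")


def rename_product(connection, product_id, new_name):
    """Stand-in for a stored procedure: one pre-defined statement with parameters."""
    connection.execute("UPDATE Product SET Name = ? WHERE ID = ?", (new_name, product_id))


deliveries = conn.execute("SELECT * FROM Deliveries ORDER BY OrderNo").fetchall()
rename_product(conn, 201, "Spanner")
renamed = conn.execute("SELECT Name FROM Product WHERE ID = 201").fetchone()[0]

print(deliveries)
print(renamed)
```

In a server database such as SQL Server, the rename would normally be a real stored procedure created with CREATE PROCEDURE and called by name.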
Lesson 2: Explore Azure Services for Relational Data
Azure SQL
Family of SQL Server based cloud database services
Azure Database services for open-source
Azure managed solutions for common open-source RDBMSs
Demo Provision Azure relational database services
Module 3:
Explore Fundamentals of Non-relational Data in Azure

• Lesson 1: Fundamentals of Azure Storage


• Lesson 2: Fundamentals of Azure Cosmos DB
Lesson 1: Fundamentals of Azure Storage
Azure Blob Storage
Storage for data as binary large objects (BLOBs)
• Block blobs
o Large, discrete, binary objects that change infrequently
o Blobs can be up to 4.7 TB, composed of blocks of up to 100 MB
- A blob can contain up to 50,000 blocks
• Page blobs
o Used as virtual disk storage for VMs
o Blobs can be up to 8 TB, composed of fixed-size 512-byte pages
• Append blobs
o Block blobs that are used to optimize append operations
o Maximum size just over 195 GB - each block can be up to 4 MB
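The size limits quoted above follow from simple arithmetic, which the short sketch below works through. It assumes binary units (1 MB = 1024² bytes) and that the 50,000-block limit applies to append blobs as well as block blobs; the constant names are illustrative.

```python
# Binary units, assumed for this sketch.
MB = 1024 ** 2
GB = 1024 ** 3
TB = 1024 ** 4

MAX_BLOCKS = 50_000  # maximum number of blocks per blob, as stated above

# Block blob: up to 50,000 blocks of up to 100 MB each.
block_blob_max = MAX_BLOCKS * 100 * MB

# Append blob: up to 50,000 blocks of up to 4 MB each.
append_blob_max = MAX_BLOCKS * 4 * MB

# Page blob: up to 8 TB made of fixed-size 512-byte pages.
page_blob_pages = (8 * TB) // 512

print(f"Block blob maximum:  {block_blob_max / TB:.2f} TB")
print(f"Append blob maximum: {append_blob_max / GB:.2f} GB")
print(f"Pages in an 8 TB page blob: {page_blob_pages:,}")
```

The results, roughly 4.77 TB and 195.31 GB, line up with the "4.7 TB" and "just over 195 GB" figures on the slide.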

Per-blob storage tiers
• Hot – Highest cost, lowest latency
• Cool – Lower cost, higher latency
• Archive – Lowest cost, highest latency
Azure Data Lake Storage Gen2
Distributed file system built on Blob Storage
• Combines Azure Data Lake Storage Gen1 with Azure Blob Storage for large-scale file storage and analytics
• Enables file and directory level access control and management
• Compatible with common large-scale analytical systems

Enabled in an Azure Storage account through the Hierarchical Namespace option
• Set during account creation
• Upgrade existing storage account
o One-way upgrade process
Azure Files
File shares in the cloud that can be accessed from anywhere with an internet connection
• Support for common file sharing protocols:
o Server Message Block (SMB)
o Network File System (NFS) – requires premium tier
• Data is replicated for redundancy and encrypted at rest
Azure Table Storage
Key-Value storage for application data
• Tables consist of key and value columns
o Partition and row keys
o Custom property columns for data values
- A Timestamp column is added automatically to log data changes
• Rows are grouped into partitions to improve performance
• Property columns are assigned a data type, and can contain any value of that type
• Rows do not need to include the same property columns

PartitionKey  RowKey  Timestamp  Property1   Property2
1             123     2022/1/1   A value     Another value
1             124     2022/1/1   This value
2             125     2022/1/1   That value
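The key-value model described above can be mimicked with nested dictionaries. The sketch below is a toy in-memory stand-in, not the Azure SDK: the `Table` class and its methods are hypothetical, and which property column each single sample value belongs to is an assumption.

```python
from datetime import datetime, timezone


class Table:
    """Toy model of a key-value table: PartitionKey -> RowKey -> properties."""

    def __init__(self):
        self._partitions = {}

    def upsert(self, partition_key, row_key, **properties):
        # Each row carries its own set of properties; a Timestamp is added
        # automatically to record when the row last changed.
        entity = dict(properties)
        entity["Timestamp"] = datetime.now(timezone.utc)
        self._partitions.setdefault(partition_key, {})[row_key] = entity

    def get(self, partition_key, row_key):
        # A point lookup needs both keys.
        return self._partitions[partition_key][row_key]

    def query_partition(self, partition_key):
        # Rows in one partition are stored together, so this touches one group only.
        return sorted(self._partitions.get(partition_key, {}).items())


table = Table()
table.upsert("1", "123", Property1="A value", Property2="Another value")
table.upsert("1", "124", Property1="This value")   # column placement assumed
table.upsert("2", "125", Property2="That value")   # column placement assumed

for row_key, entity in table.query_partition("1"):
    print(row_key, sorted(entity))
```

Rows 123 and 124 share a partition but not the same property columns, which is what distinguishes this model from a relational table.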


Demo Explore Azure Storage
Lesson 2: Fundamentals of Azure Cosmos DB
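What is Azure Cosmos DB?
A multi-model, global-scale NoSQL database management system

[Figure: supported data models, including JSON documents ({ "x":[…] }), key-value pairs (Key, Value), and column families (Col1, Col2, Col3)]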


Azure Cosmos DB APIs
• Azure Cosmos DB for MongoDB
• Azure Cosmos DB for PostgreSQL

[Figure: sample data as exposed through different APIs]

id  name       dept      manager
1   Sue Smith  Hardware  Joe Jones
2   Ben Chan   Hardware  Sue Smith

PartitionKey  RowKey  Name
1             123     Joe Jones
1             124     Samir Nadoy

id  name       dept      manager
1   Sue Smith  Hardware
2   Ben Chan   Hardware  Sue Smith
Demo Explore Azure Cosmos DB
Module 4:
Explore Fundamentals of Large-scale Data Warehousing
• Lesson 1: Large-scale data warehousing
Lesson 1: Large-scale Data Warehousing
What is large-scale data warehousing?
• Data ingestion and processing
• Analytical data store
• Analytical data model
• Data visualization

Data ingestion and processing pipelines

Activities
Analytical data stores
Choose an analytical data store service

Azure Synapse Analytics

Azure Databricks
Use to leverage Databricks skills and for cloud portability

Azure HDInsight
Use when you need to support multiple open-source platforms
Demo Explore Azure Synapse Analytics
Module 5:
Explore Fundamentals of real-time analytics
• Lesson 1: Streaming and real-time analytics
Lesson 1: Streaming and Real-time Analytics
Batch vs stream processing
Batch processing Stream processing
Real-time data processing with Azure Stream Analytics

• Ingest data from an input, such as:
o Azure Event Hubs
o Azure IoT Hub
o Azure Blob Storage
o …
• Process data with a perpetual query (SELECT …)
• Send results to an output, such as:
o Azure Blob Storage
o Azure SQL Database
o Azure Synapse Analytics
o Azure Functions
o Azure Event Hubs
o Power BI
o …
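The idea of a perpetual query can be approximated in plain Python with a generator that consumes events as they arrive and emits an aggregate each time a time window closes. This is a conceptual sketch only, not Azure Stream Analytics code: the function `tumbling_average`, the event shape, and the 10-second window are all assumptions, and events are assumed to arrive in timestamp order.

```python
from collections import defaultdict
from typing import Iterable, Iterator, Tuple

Event = Tuple[int, str, float]  # (timestamp in seconds, device id, reading)


def tumbling_average(events: Iterable[Event],
                     window_seconds: int) -> Iterator[Tuple[int, str, float]]:
    """Yield (window_start, device, average reading) whenever a window closes."""
    current_window = None
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, device, reading in events:
        window = timestamp - timestamp % window_seconds
        if current_window is not None and window != current_window:
            # A new window has started: emit results for the one that just closed.
            for d in sorted(sums):
                yield current_window, d, sums[d] / counts[d]
            sums.clear()
            counts.clear()
        current_window = window
        sums[device] += reading
        counts[device] += 1
    # A real stream never ends; this final flush exists only so the demo terminates.
    if current_window is not None:
        for d in sorted(sums):
            yield current_window, d, sums[d] / counts[d]


stream = [(0, "a", 10.0), (3, "a", 20.0), (4, "b", 5.0),
          (12, "a", 30.0), (15, "b", 7.0), (21, "a", 40.0)]
results = list(tumbling_average(stream, window_seconds=10))

for row in results:
    print(row)
```

Batch processing would wait for all the events and compute the averages once; here each result is produced as soon as its window is complete.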
Real-time log and telemetry analysis with Azure Data Explorer

• Data is ingested from streaming and batch sources into tables in a database
• Tables can be queried using Kusto Query Language (KQL):
o Intuitive syntax for read-only queries
o Optimized for raw telemetry and time-series data
Demo Explore Azure Stream Analytics
Module 6:
Explore Fundamentals of Data Visualization

• Lesson 1: Data visualization


Lesson 1: Data Visualization
Introduction to data visualization with Power BI
Analytical data modeling

Customer (dimension)
Key  Name   Address      City
1    Joe    1 Main St.   Seattle
2    Samir  123 Elm Pl.  New York
3    Alice  2 High St.   Seattle

Product (dimension)
Key  Name         Category
1    Hammer       Tools
2    Screwdriver  Tools
3    Wrench       Tools
4    Bolts        Hardware

Sales (fact)
Key  TimeKey   ProductKey  CustomerKey  Quantity  Revenue
1    01012022  1           1            1         2.99
2    01012022  2           1            2         6.98
3    02012022  1           2            2         5.98

Time (dimension)
Key       Year  Month  Day  WeekDay
01012022  2022  Jan    1    Sat
02012022  2022  Jan    2    Sun

∑ Total revenue for wrenches sold to Samir in January

Model aggregates measures at each hierarchy level
Year  Month  Day  Revenue
2022              8221.48
      Jan         574.86
             1    9.97
             2    5.98
             …    …
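Aggregating a measure at each level of a hierarchy can be sketched with plain Python dictionaries. The example below uses only the three Sales rows and two Time rows shown above, so its day totals match the slide (9.97 and 5.98), while the slide's month and year totals are larger because they include rows that are elided there. The variable names are illustrative.

```python
from collections import defaultdict

# Time dimension: TimeKey -> (Year, Month, Day)
time_dim = {
    "01012022": (2022, "Jan", 1),
    "02012022": (2022, "Jan", 2),
}

# Sales fact rows: (TimeKey, ProductKey, CustomerKey, Quantity, Revenue)
sales = [
    ("01012022", 1, 1, 1, 2.99),
    ("01012022", 2, 1, 2, 6.98),
    ("02012022", 1, 2, 2, 5.98),
]

by_year = defaultdict(float)
by_month = defaultdict(float)
by_day = defaultdict(float)

# Look up each fact row's position in the time hierarchy and add its
# revenue to the running total at every level.
for time_key, _product, _customer, _quantity, revenue in sales:
    year, month, day = time_dim[time_key]
    by_year[year] += revenue
    by_month[(year, month)] += revenue
    by_day[(year, month, day)] += revenue

for level in (by_year, by_month, by_day):
    for key, total in level.items():
        print(key, round(total, 2))
```

An analytical model pre-computes or caches totals like these so that reports can drill from year to month to day without rescanning the fact table.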
Common data visualizations in reports
Tables and text Bar or column chart Line chart

Pie chart Scatter plot Map


Demo Visualize data with Power BI
Further learning

https://aka.ms/ExploreDataConcepts
https://aka.ms/ExploreRelationalData
https://aka.ms/ExploreNonRelationalData
https://aka.ms/ExploreDataAnalytics
Thank you
