Azure Databricks
Read and write data in Azure Databricks
Join the chat at
https://aka.ms/LearnLiveTV
Learning objectives
• Use Azure Databricks to read multiple file types, both with and without a schema.
• Combine inputs from files and data stores, such as Azure SQL Database.
• Transform and store that data for advanced analytics.
Unit Prerequisites
Microsoft Azure Account: You will need a valid and active
Azure account for the Azure labs.
• If you are a Visual Studio Active Subscriber, you are entitled to Azure credits per month.
You can refer to this link to find out more including how to activate and start using your
monthly Azure credit.
• If you are not a Visual Studio Subscriber, you can sign up for the FREE
Visual Studio Dev Essentials program to create an Azure free account.
Create the required resources
To complete this lab, you will need to deploy an Azure
Databricks workspace in your Azure subscription.
Agenda
• Introduction
• Read data in CSV format
• Read data in JSON format
• Read data in Parquet format
• Read data stored in tables and views
• Write data
• Exercises: Read and write data
• Knowledge check
• Summary
Introduction
Suppose you're working for a data analytics startup that's
now expanding along with its increasing customer base.
Creating your Databricks workspace
Deploy an Azure Databricks workspace
• Click the following button to open the Azure Resource Manager (ARM) template in the Azure portal:
Deploy Databricks from the ARM Template
• Provide the required values to create your Azure Databricks workspace:
• Subscription: Choose the Azure Subscription in which to deploy the workspace.
• Resource Group: Leave at Create new and provide a name for the new resource group.
• Location: Select a location near you for deployment. For the list of regions supported by Azure
Databricks, see Azure services available by region.
• Workspace Name: Provide a name for your workspace.
• Pricing Tier: Ensure Premium is selected.
• Accept the terms and conditions.
• Select Purchase.
• The workspace creation takes a few minutes. During workspace creation, the portal displays the Submitting
deployment for Azure Databricks tile on the right side. You may need to scroll right on your dashboard to see the
tile. There is also a progress bar displayed near the top of the screen. You can watch either area for progress.
Create a cluster
When your Azure Databricks workspace creation is complete,
select the link to go to the resource.
Clone the Databricks archive
If you do not currently have your Azure Databricks workspace
open: in the Azure portal, navigate to your deployed Azure
Databricks workspace and select Launch Workspace.
• Select Import.
• Select the 03-Reading-and-writing-data-in-Azure-Databricks folder that appears.
Read data in CSV format
In this unit, you need to complete the exercises within a
Databricks Notebook.
Complete the following notebook
Open the 1.Reading Data - CSV notebook.
• Start working with the API documentation
• Introduce the class SparkSession and other entry points
• Introduce the class DataFrameReader
• Read data from:
• CSV without a schema
• CSV with a schema
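For reference, the two CSV reads might look like this minimal PySpark sketch; the file path and column names are hypothetical, and spark is the SparkSession that Databricks notebooks preconfigure for you.

from pyspark.sql.types import StructType, StructField, StringType, IntegerType

# Without a schema: read the header for names and, with inferSchema,
# make an extra pass over the data to guess column types.
df_inferred = (spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("/mnt/training/example.csv"))  # hypothetical path

# With a user-defined schema: no inference pass, so the read is faster
# and the resulting column types are explicit and stable.
csv_schema = StructType([
    StructField("id", IntegerType(), False),
    StructField("name", StringType(), True),
])
df_typed = (spark.read
    .option("header", "true")
    .schema(csv_schema)
    .csv("/mnt/training/example.csv"))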
Read data in JSON format
In your Azure Databricks workspace, open the 03-Reading-and-writing-data-in-Azure-Databricks folder that you imported within your user folder.
• Read data from:
• JSON without a schema
• JSON with a schema
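A sketch of both JSON reads, with a hypothetical path and fields:

from pyspark.sql.types import StructType, StructField, StringType, LongType

# Without a schema: spark.read.json samples the file and infers the
# structure, including nested fields.
df = spark.read.json("/mnt/training/example.json")

# With a schema: the inference pass is skipped entirely.
json_schema = StructType([
    StructField("id", LongType(), True),
    StructField("name", StringType(), True),
])
df = spark.read.schema(json_schema).json("/mnt/training/example.json")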
Read data in Parquet format
In your Azure Databricks workspace, open the 03-Reading-and-writing-data-in-Azure-Databricks folder that you imported within your user folder.
• Introduce the Parquet file format
• Read data from:
• Parquet files without a schema
• Parquet files with a schema
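A sketch of the Parquet reads (hypothetical path). Parquet stores its schema in the file footer, so even a plain read returns typed columns:

from pyspark.sql.types import StructType, StructField, StringType

# Without a supplied schema: Spark picks up column names and types
# from the Parquet metadata.
df = spark.read.parquet("/mnt/training/example.parquet")

# Supplying a schema up front lets Spark skip sampling file footers,
# which helps when a directory contains many part-files.
parquet_schema = StructType([StructField("name", StringType(), True)])
df = spark.read.schema(parquet_schema).parquet("/mnt/training/example.parquet")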
Read data stored in tables and views
In your Azure Databricks workspace, open the 03-Reading-and-writing-data-in-Azure-Databricks folder that you imported within your user folder.
• Demonstrate how to pre-register data sources in Azure Databricks
• Introduce temporary views over files
• Read data from tables/views
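A sketch of registering a temporary view over a file and querying it; the path and view name are hypothetical.

# Load a file-backed DataFrame, then register it as a view scoped
# to the current SparkSession.
df = spark.read.parquet("/mnt/training/example.parquet")
df.createOrReplaceTempView("example_view")

# Query the view with SQL; the result is an ordinary DataFrame.
result = spark.sql("SELECT * FROM example_view LIMIT 10")
display(result)  # display() is the Databricks notebook rendering helper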
Write data
In your Azure Databricks workspace, open the 03-Reading-and-writing-data-in-Azure-Databricks folder that you imported within your user folder.
• Write data to a Parquet file
• Read the Parquet file back and display the results
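A sketch of that write-then-read round trip; the output path is hypothetical, and df stands in for any DataFrame produced earlier in the notebook.

# Write the DataFrame out as Parquet, replacing any previous output.
df.write.mode("overwrite").parquet("/mnt/training/output.parquet")

# Read the Parquet data back and display the results.
df_roundtrip = spark.read.parquet("/mnt/training/output.parquet")
display(df_roundtrip)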
Exercises: Read and write data
In your Azure Databricks workspace, open the 03-Reading-and-writing-data-in-Azure-Databricks folder that you imported within your user folder.
Knowledge check
Question 1
How do you list files in DBFS within a notebook?
A. ls /my-file-path
B. %fs dir /my-file-path
C. %fs ls /my-file-path
Question 1: Answer
How do you list files in DBFS within a notebook?
C. %fs ls /my-file-path
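For context, %fs is notebook shorthand for the dbutils.fs utility, so the correct option is equivalent to this Python call:

files = dbutils.fs.ls("/my-file-path")  # returns a list of FileInfo entries
display(files)                          # render the listing as a table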
Question 2
How do you infer the data types and column names when
you read a JSON file?
A. spark.read.option("inferSchema", "true").json(jsonFile)
B. spark.read.inferSchema("true").json(jsonFile)
C. spark.read.option("inferData", "true").json(jsonFile)
Question 2: Answer
How do you infer the data types and column names when you read a JSON file?
A. spark.read.option("inferSchema", "true").json(jsonFile)
Summary
In this module, you learned the basics of reading and writing data in Azure Databricks:
• Read data from CSV files into a Spark DataFrame
• Provide a schema when reading data into a Spark DataFrame
• Read data from JSON files into a Spark DataFrame
• Read data from Parquet files into a Spark DataFrame
• Create tables and views
• Write data from a Spark DataFrame
Clean up
If you plan on completing other Azure Databricks modules,
don't delete your Azure Databricks instance yet.
Otherwise, delete the Azure Databricks instance.