Azure Data Fundamentals Explore Non Relational Data in Azure - Explore Non-Relational Data Offerings in Azure

This document provides an overview of non-relational data storage options in Azure, including Azure Table Storage, Blob Storage, File Storage, and Cosmos DB. It focuses on Azure Table Storage, describing its key-value data model and how entities are organized into partitions and rows. Benefits of Table Storage include flexibility for semi-structured data, easy scaling, and fast data retrieval when specifying partition and row keys. Examples of suitable uses include storing large datasets for web applications, IoT sensor data, and logging/monitoring data.
Azure Data Fundamentals: Explore non-relational data in Azure - Explore non-relational data offerings in Azure


Wednesday, September 29, 2021 1:16 AM

The structure of the data might be too varied to easily model as a set of relational tables

Learning objectives
- Explore use-case and management benefits of using Azure Table storage
- Explore use-case and management benefits of using Azure Blob storage
- Explore use-case and management benefits of using Azure File Storage
- Explore use-case and management benefits of using Azure Cosmos DB

Explore Azure Table storage


- Implements the NoSQL key-value model. In this model, the data for an item is stored as a set of
fields, and the item is identified by a unique key

What is Azure Table Storage?


- Is a scalable key-value store held in the cloud. You create a table using an Azure storage account
- Items are referred to as rows and fields are known as columns
- An Azure table enables you to store semi-structured data. All rows in a table must have a key, but
the columns in each row can vary.
- Azure Table Storage tables have no concept of relationships, stored procedures, secondary
indexes, or foreign keys.
- Data will usually be denormalized, with each row holding the entire data for a logical entity


- To help ensure fast access, Azure Table Storage splits a table into partitions
- Partitioning is a mechanism for grouping related rows, based on a common property or partition
key. Rows that share the same partition key will be stored together. It can improve scalability and
performance:
○ Partitions are independent from each other, and can grow or shrink as rows are added to, or
removed from a partition. A table can contain any number of partitions
○ When you search for data, you can include the partition key in the search criteria.
This helps to narrow down the volume of data to be examined, and improves performance
by reducing the amount of I/O (reads and writes) needed to locate the data.
- The key in an Azure Table Storage table comprises two elements:
○ Partition key - identifies the partition containing the row
○ Row key - unique to each row in the same partition
▪ Items in the same partition are stored in row key order

- This scheme enables an application to quickly perform Point queries that identify a single row and
Range queries that fetch a contiguous block of rows in a partition

Point query
○ When an application retrieves a single row, the partition key enables Azure to quickly hone
in on the correct partition and the row key lets Azure identify the row in that partition
○ The partition key and row key effectively define a clustered index over the data

Range query
○ The application searches for a set of rows in a partition, specifying the start and end points of
the set as row keys.
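
The two query types can be sketched with a small in-memory model. This is not how Azure Table Storage is implemented; it only assumes, as described above, that rows are grouped by partition key and kept in row key order within each partition. The class name `TinyTable`, its methods, and the sample keys are hypothetical.

```python
import bisect
from collections import defaultdict


class TinyTable:
    """Toy key-value table: rows grouped by partition key, sorted by row key."""

    def __init__(self):
        # partition key -> sorted list of row keys
        self._row_keys = defaultdict(list)
        # partition key -> {row key: fields}
        self._rows = defaultdict(dict)

    def insert(self, partition_key, row_key, **fields):
        if row_key in self._rows[partition_key]:
            raise KeyError("row key must be unique within a partition")
        bisect.insort(self._row_keys[partition_key], row_key)
        self._rows[partition_key][row_key] = fields

    def point_query(self, partition_key, row_key):
        """Return the single row identified by both keys, or None."""
        return self._rows.get(partition_key, {}).get(row_key)

    def range_query(self, partition_key, start, end):
        """Return the contiguous block of rows with start <= row key <= end."""
        keys = self._row_keys.get(partition_key, [])
        lo = bisect.bisect_left(keys, start)
        hi = bisect.bisect_right(keys, end)
        return [(k, self._rows[partition_key][k]) for k in keys[lo:hi]]


table = TinyTable()
table.insert("sensor-A", "2021-09-29T01:00", temp=20.5)
table.insert("sensor-A", "2021-09-29T03:00", temp=19.8)
table.insert("sensor-A", "2021-09-29T02:00", temp=20.1)
table.insert("sensor-B", "2021-09-29T01:00", temp=25.0)

print(table.point_query("sensor-A", "2021-09-29T02:00"))
print(table.range_query("sensor-A", "2021-09-29T01:30", "2021-09-29T03:00"))
```

Because the row keys in a partition are already sorted, the range query only needs two binary searches to find the ends of the block; it never looks at rows in other partitions.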

A column in a table can hold numeric, string, or binary data up to 64 KB in size.
A table can have up to 252 columns, apart from the partition and row keys.
The maximum row size is 1 MB.
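
As a rough illustration of these limits, the following sketch checks a row before it is written. The size estimate is an assumption made for the example (the service's real storage encoding differs), and `check_row` is a hypothetical helper, not part of any Azure SDK.

```python
MAX_COLUMN_BYTES = 64 * 1024   # 64 KB per column value
MAX_CUSTOM_COLUMNS = 252       # excluding the partition and row keys
MAX_ROW_BYTES = 1024 * 1024    # 1 MB per row


def value_size(value):
    """Rough size estimate in bytes; real storage encoding differs."""
    if isinstance(value, bytes):
        return len(value)
    if isinstance(value, str):
        return len(value.encode("utf-8"))
    return 8  # assumption: treat every numeric value as 8 bytes


def check_row(columns):
    """Return a list of limit violations for a dict of column name -> value."""
    problems = []
    if len(columns) > MAX_CUSTOM_COLUMNS:
        problems.append(f"{len(columns)} columns exceeds {MAX_CUSTOM_COLUMNS}")
    total = 0
    for name, value in columns.items():
        size = value_size(value)
        total += size
        if size > MAX_COLUMN_BYTES:
            problems.append(f"column {name!r} is {size} bytes, over 64 KB")
    if total > MAX_ROW_BYTES:
        problems.append(f"row is about {total} bytes, over 1 MB")
    return problems


print(check_row({"Name": "Ada", "Age": 36}))
print(check_row({"Photo": b"x" * (64 * 1024 + 1)}))
```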

Use cases and management benefits of using Azure Table Storage


- Azure Table Storage tables are schemaless
- You can use tables to hold flexible datasets
Ex: user data for web applications, address books, device information, or other types of metadata
your service requires
- The important part is to choose the partition and row keys carefully
- The primary advantages of using Azure Table Storage tables:
○ It’s simpler to scale.
It takes the same time to insert data in an empty table, or a table with billions of entries. An
Azure storage account can hold up to 5 PB of data
○ A table can hold semi-structured data
○ There's no need to map and maintain the complex relationships typically required by a
normalized relational database
○ Row insertion is fast
○ Data retrieval is fast, if you specify the partition and row keys as query criteria
- There are also disadvantages to storing data this way, including:
○ Consistency needs to be given consideration, as transactional updates across multiple entities
aren't guaranteed
○ There's no referential integrity; any relationships between rows need to be maintained
externally to the table
○ It’s difficult to filter and sort on non-key data. Queries that search based on non-key fields
could result in full table scans
- Azure Table Storage is an excellent mechanism for:
○ Storing TBs of structured data capable of serving web-scale applications
Ex: product catalogs for eCommerce applications, and customer information, where the data
can be quickly identified and ordered by a composite key
○ Storing datasets that don't require complex joins, foreign keys or stored procedures, and
that can be denormalized for fast access.
In an IoT system, you might use Azure Table Storage to capture device sensor data
○ Capturing event logging and performance monitoring data.
Event log and performance information typically contain data that is structured according to
the type of event or performance measure being recorded.
Ex: partition the data by event type and order it by date and time.
Alternatively, if you need to analyze an ordered series of events and performance measures
chronologically, partition the data by date. To analyze by both type and date, consider
storing the data twice, partitioned first by type and again by date.
Writing data is fast, and the data is static once it has been recorded
- Azure Table Storage is intended to support very large volumes of data, up to several hundred TBs
in size.
- As you add rows to a table, Azure Table Storage automatically manages the partitions in the table and
allocates storage as necessary
- Azure Table Storage provides high-availability guarantees in a single region.
The data for each table is replicated three times within an Azure region.
At additional cost, you can create tables in geo-redundant storage
- Azure Table Storage helps to protect your data. You can configure security and role-based access
control to ensure that only the people or applications that need to see your data can actually
retrieve it
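
The event-logging suggestion in the list above (store the data twice, once partitioned by type and once by date) can be sketched as follows. The two dictionaries stand in for two tables; the key formats and the `log_event` helper are assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime

# Two toy "tables": partition key -> {row key: record}
events_by_type = defaultdict(dict)
events_by_date = defaultdict(dict)


def log_event(event_type, when, message):
    """Write the same record twice, once per access pattern."""
    record = {"type": event_type, "time": when.isoformat(), "message": message}
    # Copy 1: partition by event type, order rows by timestamp.
    events_by_type[event_type][when.isoformat()] = record
    # Copy 2: partition by calendar date, order rows by timestamp, then type.
    row_key = f"{when.isoformat()}|{event_type}"
    events_by_date[when.date().isoformat()][row_key] = record


log_event("error", datetime(2021, 9, 29, 1, 16), "disk full")
log_event("login", datetime(2021, 9, 29, 1, 20), "user signed in")
log_event("error", datetime(2021, 9, 30, 8, 0), "timeout")

# All errors, in time order, read from a single partition:
errors = [events_by_type["error"][k] for k in sorted(events_by_type["error"])]
# Everything that happened on one day, in time order, from a single partition:
day = [events_by_date["2021-09-29"][k] for k in sorted(events_by_date["2021-09-29"])]

print([e["message"] for e in errors])
print([e["message"] for e in day])
```

The cost is duplicated storage and two writes per event. That trade-off is reasonable here because, as noted, writes are fast and log records do not change once written.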

Create and view a table using the Azure portal


The simplest way to create a table in Azure Table Storage is to use the Azure portal. Follow these steps:
- Sign in to the Azure portal using your Azure account
- On the home page of the Azure portal, select +Create resource

- On the New page, select Storage account - blob, file, table, queue


- On the Create storage account page, enter the following details, and then select Review + create

- On the validation page, click Create, and wait while the new storage account is configured
- When the Your deployment is complete page appears, select Go to resource

- On the Overview page, for the new storage account, select Tables

- On the Tables page, select + Table


- In the Add table dialog box, enter testtable for the name of the table, and then select OK

- When the new table has been created, select Storage Explorer

- On the Storage Explorer page, expand Tables, and then select testtable. Select Add to insert a new
entity into the table
Note: in Storage Explorer, rows are also called entities

- In the Add Entity dialog box, enter your own values for the PartitionKey and RowKey properties,
and then select Add Property.
Add a String property called Name and set the value to your name.
Select Add Property again, and add a Double property (this is numeric) named Age, and set the
value to your age
Select Insert to save the entity

- Verify that the new entity has been created.


The entity should contain the values you specified, together with a timestamp that contains the
date and time that the entity was created.

- If time allows, experiment with creating additional entities. Not all entities must have the same
properties. You can use the Edit function to modify the values in an entity, and add or remove
properties. The Query function enables you to find entities that have properties with a specified
set of values
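
To make the idea of entities with differing properties concrete, here is a minimal in-memory sketch. It does not call Azure; `make_entity` and `query` are hypothetical helpers, the sample names and values are made up, and the timestamp only imitates the one the service adds when an entity is created.

```python
from datetime import datetime, timezone


def make_entity(partition_key, row_key, **properties):
    """Build an entity dict; the Timestamp imitates the one the service sets."""
    entity = {"PartitionKey": partition_key, "RowKey": row_key}
    entity.update(properties)
    entity["Timestamp"] = datetime.now(timezone.utc).isoformat()
    return entity


def query(entities, **criteria):
    """Return entities whose properties match every given name=value pair."""
    return [
        e for e in entities
        if all(name in e and e[name] == value for name, value in criteria.items())
    ]


testtable = [
    make_entity("staff", "001", Name="Ada", Age=36.0),
    make_entity("staff", "002", Name="Grace", Age=45.0, Office="B12"),
    make_entity("devices", "001", Model="sensor-x"),  # no Name or Age at all
]

print(query(testtable, PartitionKey="staff", Age=45.0))
```

Note that `query` inspects every entity. That mirrors the earlier warning that filtering on non-key properties can result in a full table scan.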

Explore Azure Blob storage


Microsoft Azure virtual machines use blob storage for holding virtual machine disk images. These objects
can be several hundred GB in size

Blob: Binary Large Object

What is Azure Blob Storage?


- Azure Blob storage is a service that enables you to store massive amounts of unstructured data, or
blobs, in the cloud. You create blobs using an Azure storage account
- Azure supports three different types of blob
○ Block blobs
▪ A block blob is handled as a set of blocks
▪ Each block can vary in size, up to 100 MB
▪ A block blob can contain up to 50,000 blocks, giving a maximum size of over 4.7 TB
▪ The block is the smallest amount of data that can be read or written as an individual
unit
▪ Best used to store discrete, large, binary objects that change infrequently
○ Page blobs
▪ A page blob is organized as a collection of fixed-size 512-byte pages.
▪ Optimized to support random read and write operations
▪ Can hold up to 8 TB of data.
▪ Azure uses page blobs to implement virtual disk storage for virtual machines
○ Append blobs
▪ An append blob is a block blob optimized to support append operations.
▪ You can only add blocks to the end of an append blob; updating or deleting isn't
supported
▪ Each block can vary in size, up to 4 MB. The maximum size of an append blob is just
over 195 GB
- Inside an Azure storage account, you create blobs inside containers.
A container provides a convenient way of grouping related blobs together, and you can organize
blobs in a hierarchy of folders similar to files in a file system on disk.
You control who can read and write blobs inside a container at the container level
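
The maximum sizes quoted for block and append blobs follow from the block limits. The short calculation below reproduces them, assuming binary units (1 MB = 1024 × 1024 bytes) and assuming that the 50,000-block limit also applies to append blobs, which the notes above do not state explicitly.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

MAX_BLOCKS = 50_000

block_blob_max = MAX_BLOCKS * 100 * MIB   # blocks of up to 100 MB
append_blob_max = MAX_BLOCKS * 4 * MIB    # blocks of up to 4 MB (assumed same block count)

print(f"block blob:  {block_blob_max / TIB:.2f} TiB")   # block blob:  4.77 TiB
print(f"append blob: {append_blob_max / GIB:.2f} GiB")  # append blob: 195.31 GiB
```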

- Blob storage provides three access tiers, which help to balance access latency and storage cost
○ The Hot tier (default)
▪ Use this tier for blobs that are accessed frequently
▪ The blob data is stored on high-performance media
○ The Cool tier
▪ Has lower performance and incurs reduced storage charges compared to the Hot tier.
▪ Use this tier for data that is accessed infrequently
▪ You can migrate a blob from the Hot tier to the Cool tier, and back again
○ The Archive tier
▪ Provides the lowest storage cost, but with increased latency

▪ Intended for historical data that mustn't be lost, but is required only rarely
▪ Blobs in the Archive tier are effectively stored in an offline state.
▪ Typical read latency for the Hot and Cool tiers is a few milliseconds, but for the Archive
tier, it can take hours for the data to become available.
▪ To retrieve a blob from the Archive tier, you must change the access tier to Hot or
Cool. The blob will then be rehydrated. You can read the blob only when the
rehydration process is complete.
You can create lifecycle management policies for blobs in a storage account. A lifecycle
management policy can automatically move a blob from Hot to Cool, and then to the Archive tier,
as it ages and is used less frequently (policy is based on the number of days since modification). A
lifecycle management policy can also arrange to delete outdated blobs.
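
The decision a lifecycle management policy makes can be sketched as a function of the number of days since a blob was last modified. The thresholds below are hypothetical values chosen for the example, and the code only models the rule; it is not the policy format that Azure uses.

```python
from datetime import date

# Hypothetical thresholds, in days since the blob was last modified.
COOL_AFTER_DAYS = 30
ARCHIVE_AFTER_DAYS = 90
DELETE_AFTER_DAYS = 365


def lifecycle_action(last_modified, today):
    """Return the tier a blob should be in, or "Delete" if it is outdated."""
    age = (today - last_modified).days
    if age >= DELETE_AFTER_DAYS:
        return "Delete"
    if age >= ARCHIVE_AFTER_DAYS:
        return "Archive"
    if age >= COOL_AFTER_DAYS:
        return "Cool"
    return "Hot"


today = date(2021, 9, 29)
for modified in (date(2021, 9, 20), date(2021, 8, 1), date(2021, 6, 1), date(2020, 9, 1)):
    print(modified, "->", lifecycle_action(modified, today))
```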

Use cases and management benefits of using Azure Blob Storage


Common uses of Azure Blob Storage include
- Serving images or documents directly to a browser, in the form of a static website
- Storing files for distributed access
- Streaming video and audio
- Storing data for backup and restore, disaster recovery, and archiving
- Storing data for analysis by an on-premises or Azure-hosted service

Note:
Azure Blob Storage is also used as the basis for Azure Data Lake Storage.
Azure Data Lake Storage is used for performing big data analytics.

To ensure availability, Azure Blob storage provides redundancy.


Blobs are always replicated three times in the region in which you created your account, but you can
also select geo-redundancy, which replicates your data in a second region (at additional cost).

Other features available with Azure Blob storage include:


- Versioning
You can maintain and restore earlier versions of a blob
- Soft delete
Enables you to recover a blob that has been removed or overwritten, by accident or otherwise
- Snapshots
Read-only version of a blob at a particular point in time
- Change feed
Provides an ordered, read-only record of the updates made to a blob.
You can use the change feed to monitor these changes and perform operations such as
○ Update a secondary index, or synchronize with a cache, search engine, or any other
content-management scenario
○ Extract business analytics insights and metrics, based on changes that occur to your objects,
either in a streaming manner or batched mode.
○ Store, audit, and analyze changes to your objects, over any period of time, for security,
compliance or intelligence for enterprise data management.
○ Build solutions to back up, mirror, or replicate object state in your account for disaster
management or compliance.
○ Build connected application pipelines that react to change events or schedule executions
based on created or changed objects.
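
As a sketch of the first scenario in the list above (keeping a secondary index in step with blob changes), the code below replays an ordered list of change records into a dictionary. The record layout and field names are illustrative assumptions, not the actual change feed schema, and `apply_changes` is a hypothetical helper.

```python
# Hypothetical change records, oldest first. Field names are illustrative.
change_feed = [
    {"seq": 1, "event": "BlobCreated", "blob": "logs/a.txt", "size": 120},
    {"seq": 2, "event": "BlobCreated", "blob": "logs/b.txt", "size": 300},
    {"seq": 3, "event": "BlobCreated", "blob": "logs/a.txt", "size": 150},  # overwrite
    {"seq": 4, "event": "BlobDeleted", "blob": "logs/b.txt"},
]


def apply_changes(index, changes, after_seq=0):
    """Replay changes newer than after_seq into index; return the last seq seen."""
    last = after_seq
    for change in changes:
        if change["seq"] <= after_seq:
            continue  # already processed on an earlier run
        if change["event"] == "BlobDeleted":
            index.pop(change["blob"], None)
        else:
            index[change["blob"]] = change["size"]
        last = change["seq"]
    return last


size_index = {}
checkpoint = apply_changes(size_index, change_feed)
print(size_index, checkpoint)
```

Because the feed is ordered and read-only, a consumer can save the returned checkpoint and resume from it later without reprocessing earlier changes.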

Create and view a block blob using the Azure portal


Blobs are stored in containers, and you create a container using a storage account.
The following steps assume you've created the storage account
- In the Azure portal, on the left-hand navigation menu, select Home

- On the home page, select Storage accounts

- On the Storage accounts page, select the storage account you created
- On the Overview page for your storage account, select Storage Explorer
- On the Storage Explorer page, right-click BLOB CONTAINERS, and then select Create blob container

- In the New Container dialog box, give your container a name, accept the default public access
level and then select Create


- In the Storage Explorer window, expand BLOB CONTAINERS, and then select your new blob
container

- In the blobs window, select Upload

- In the Upload blob dialog box, use the files button to pick a file of your choice on your computer,
and then select Upload


- When the upload has completed, close the Upload blob dialog box. Verify that the block blob
appears in your container

- If you have time, you can experiment with uploading other files as block blobs. You can also download
blobs back to your computer using the Download button

Explore Azure File Storage


What is Azure File Storage?
- Azure File Storage enables you to create file shares in the cloud, and access these file shares from
anywhere with an internet connection.
- Azure File Storage exposes file shares using the Server Message Block 3.0 (SMB) protocol. This is
the same file sharing protocol used by many existing on-premises applications.
- These applications should continue to work unchanged if you migrate your file shares to the cloud.
- The applications can be running on-premises, or in the cloud
- You can control access to shares in Azure File Storage using authentication and authorization
services available through Azure Active Directory Domain Services

- You create Azure File Storage in a storage account.
- Azure File Storage enables you to share up to 100 TB of data in a single storage account. The
maximum size of a single file is 1 TB, but you can set quotas to limit the size of each share below
this figure.
- Azure File Storage supports up to 2000 concurrent connections per shared file
- Once you've created a storage account, you can upload files to Azure File Storage using the Azure
portal, or tools such as the AzCopy utility. You can also use the Azure File Sync service to
synchronize locally cached copies of shared files with the data in Azure File Storage.
- Azure File Storage offers two performance tiers
○ Standard tier - uses hard disk-based hardware in a datacenter
○ Premium tier - uses solid-state disks and offers greater throughput, but is charged at a higher
rate

Use cases and management benefits of using Azure File Storage


Designed to support many scenarios, including:
- Migrate existing applications to the cloud
Azure File Storage enables you to migrate your on-premises file or file share-based applications to
Azure without having to provision or manage highly available file server virtual machines.
- Share server data across on-premises and cloud
With encryption in SMB 3.0, you can securely mount Azure File Storage shares from anywhere.
Applications running in the cloud can share data with on-premises applications using the same
consistency guarantees implemented by on-premises SMB servers.
- Integrate modern applications with Azure File Storage
By leveraging the modern REST API that Azure File Storage implements in addition to SMB 3.0, you
can integrate legacy applications with modern cloud applications, or develop new file or file share-
based applications.
- Simplify hosting High Availability (HA) workload data
Azure File Storage delivers continuous availability so it simplifies the effort to host HA workload
data in the cloud. The persistent handles enabled in SMB 3.0 increase availability of the file share,
which makes it possible to host applications such as SQL Server and IIS in Azure with data stored in
shared file storage.

Note:
Don't use Azure File Storage for files that can be written by multiple concurrent processes
simultaneously. Multiple writers require careful synchronization, otherwise the changes made by
one process can be overwritten by another.
The alternative solution is to lock the file as it is written, and then release the lock when the write
operation is complete. However, this approach can severely impact concurrency and limit
performance.
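
The locking approach described in this note can be illustrated on a local file. The sketch below uses threads and a `threading.Lock` on a temporary file rather than separate processes on an SMB share, so it only shows the pattern: each writer holds the lock for its whole read-modify-write cycle. The function names are made up for the example.

```python
import os
import tempfile
import threading

lock = threading.Lock()


def add_one(path):
    """Read-modify-write a counter file while holding the lock."""
    with lock:
        with open(path) as f:
            value = int(f.read())
        with open(path, "w") as f:
            f.write(str(value + 1))


def run_writers(n_threads=8, n_updates=25):
    """Start several writers against one file and return the final count."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    with open(path, "w") as f:
        f.write("0")

    def worker():
        for _ in range(n_updates):
            add_one(path)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    with open(path) as f:
        result = int(f.read())
    os.remove(path)
    return result


print(run_writers())
```

With the lock, no update is lost, but only one writer makes progress at a time, which is the concurrency cost the note warns about. Without the lock, two writers could read the same value and one increment would overwrite the other.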

Azure File Storage is a fully managed service. Your shared data is replicated locally within a region, but
can also be geo-replicated to a second region.

Azure aims to provide up to 300 MB/second of throughput for a single Standard file share, but you can
increase throughput capacity by creating a Premium file share, for additional cost.
All data is encrypted at rest, and you can enable encryption for data in-transit between Azure File
Storage and your applications.

Create an Azure storage file share using the Azure Portal


The following steps assume you've created the storage account
- In the Azure portal, on the hamburger menu, select Home
- On the home page, select Storage accounts
- On the Storage accounts page, select the storage account you created
- On the Overview page for your storage account, select Storage Explorer
- On the Storage Explorer page, right-click FILE SHARES, and then select Create file share

- In the New file share dialog box, enter a name for your file share, leave Quota empty, and then
select Create


- In the Storage Explorer window, expand FILE SHARES and select your new file share, and then
select Upload

Tip

If your new file share doesn't appear, right-click FILE SHARES, and then select Refresh.

- In the Upload files dialog box, use the files button to pick a file of your choice on your computer, and
then select Upload


- When the upload has completed, close the Upload files dialog box. Verify that the file appears in
the file share

Tip
If the file doesn't appear, right-click FILE SHARES, and then select Refresh.

Explore Azure Cosmos DB


Azure Cosmos DB is a more generalized solution that enables you to store and query data more easily,
without having to worry about the exact mechanism for performing these operations

What is Azure Cosmos DB?


- Azure Cosmos DB is a multi-model NoSQL database management system
- Cosmos DB manages data as a partitioned set of documents. A document is a collection of fields,
identified by a key.
The fields in each document can vary, and a field can contain child documents.
Many document databases use JSON to represent the document structure.
The fields in a document are enclosed between braces {} and each field is prefixed with its name
- A document can hold up to 2 MB of data, including small binary objects
- Cosmos DB provides APIs that enable you to access these documents using a set of well-known
interfaces
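
The document structure described above can be illustrated with Python's `json` module. The two documents are made-up examples: they have different fields, and the first contains a child document. The size check is only a rough estimate against the 2 MB limit.

```python
import json

# Illustrative documents; the field names are made up for this sketch.
customer_1 = {
    "id": "1",
    "name": "Ada",
    "address": {"city": "London", "country": "UK"},  # child document
}
customer_2 = {
    "id": "2",
    "name": "Grace",
    "phones": ["555-0100", "555-0101"],  # a field customer_1 does not have
}

# Fields appear between braces, each prefixed with its name.
text = json.dumps(customer_1, indent=2)
print(text)

# Round trip: parsing the JSON text gives back the same structure.
restored = json.loads(text)
print(restored["address"]["city"])

# A rough check against the 2 MB per-document limit.
size_bytes = len(json.dumps(customer_2).encode("utf-8"))
print(size_bytes <= 2 * 1024 * 1024)
```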

Note:
API (Application Programming Interface): a set of interfaces that developers use to write programs that need to access data.
The APIs will often be different for different database management systems

The APIs that Cosmos DB currently supports:


○ SQL API
The interface provides a SQL-like query language over documents, enabling you to identify and
retrieve documents using SELECT statements

○ Table API
▪ Enables you to use the Azure Table Storage API to store and retrieve documents.
▪ Enables you to switch from Table Storage to Cosmos DB without requiring that you
modify your existing applications
○ MongoDB API
▪ MongoDB is a document database, with its own programmatic interface. Many
organizations run MongoDB on-premises.
▪ You can use the MongoDB API for Cosmos DB to enable a MongoDB application to run
unchanged against a Cosmos DB database.
▪ You can migrate the data in the MongoDB database to Cosmos DB running in the
cloud, but continue to run your existing applications to access this data.
○ Cassandra API
▪ Cassandra is a column family database management system.
▪ Many organizations run it on-premises
▪ The Cassandra API for Cosmos DB provides a Cassandra-like programmatic interface
for Cosmos DB.
▪ It enables you to quickly migrate Cassandra databases and applications to Cosmos DB.
○ Gremlin API
▪ Implements a graph database interface to Cosmos DB.
▪ A graph is a collection of data objects and directed relationships
▪ Data is still held as a set of documents in Cosmos DB, but the Gremlin API enables you
to perform graph queries over data.
▪ Using Gremlin API you can walk through the objects and relationships in the graph to
discover all manner of complex relationships


Note:
The primary purpose of the Table, MongoDB, Cassandra, and Gremlin APIs is to support
existing applications. If you are building a new application and database, you should use the
SQL API.

- Documents in a Cosmos DB database are organized into containers.


The documents in a container are grouped together into partitions.
A partition holds a set of documents that share a common partition key.
The partition key is one of the fields in your documents.
You should select a partition key that collects all related documents together

- Documents in a Cosmos DB partition aren't sorted by ID. Instead, Cosmos DB maintains a separate
index.
This index contains not only the document IDs, but also tracks the value of every other field in
each document. This index is created and maintained automatically.
This index enables you to perform queries that specify criteria referencing any fields in a
container, without incurring the need to scan the entire partition to find that data.
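
The benefit of indexing every field can be sketched with a simple inverted index. This is a toy model built on the description above, not Cosmos DB's actual index structure; the documents and the `find` helper are made up for the example.

```python
from collections import defaultdict

documents = {
    "1": {"name": "Ada", "city": "London"},
    "2": {"name": "Grace", "city": "New York", "team": "compilers"},
    "3": {"name": "Alan", "city": "London"},
}

# (field, value) -> set of document IDs. Built once here; a real system
# would keep it up to date automatically on every write.
index = defaultdict(set)
for doc_id, doc in documents.items():
    for field, value in doc.items():
        index[(field, value)].add(doc_id)


def find(**criteria):
    """Return IDs of documents matching every field=value pair, using only the index."""
    id_sets = [index.get((field, value), set()) for field, value in criteria.items()]
    return sorted(set.intersection(*id_sets)) if id_sets else sorted(documents)


print(find(city="London"))
print(find(city="London", name="Alan"))
```

Each lookup touches only the index entries for the requested fields, so the query never has to read documents that cannot match.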

Use cases and management benefits of using Azure Cosmos DB


- Cosmos DB is a highly scalable database management system
- Each partition can grow up to 10 GB in size
- Indexes are created and maintained automatically
- To ensure availability, all databases are replicated within a single region. This replication is
transparent, and failover from a failed replica is automatic.
- You can choose to replicate data across regions, at additional cost.
- Cosmos DB guarantees less than 10-ms latencies for both reads (indexed) and writes at 99th
percentile, all around the world. This capability enables sustained ingestion of data and fast
queries for highly responsive apps
- Cosmos DB is certified for a wide array of compliance standards. Additionally, all data in Cosmos
DB is encrypted at rest and in motion
- Cosmos DB is a foundational service in Azure.
- Cosmos DB is highly suitable for these scenarios:
○ IoT and telematics
▪ These systems typically ingest large amounts of data in frequent bursts of activity
▪ The data can then be used by analytics services, such as Azure Machine Learning,
Azure HDInsight, and Power BI. Additionally, you can process the data in real time
using Azure Functions
○ Retail and marketing
▪ Ex: Windows Store and Xbox Live
▪ Used in the retail industry for storing catalog data and for event sourcing in order
processing pipelines
○ Gaming
▪ Modern games perform graphical processing on mobile/console clients, but rely on
the cloud to deliver customized content like in-game stats, social media integration
and high-score leaderboards.
▪ A game database needs to be fast and be able to handle massive spikes in request
rates during new game launches and feature updates
○ Web and mobile applications
▪ Well suited for modeling social interactions, integrating with third-party services and
for building rich personalized experiences.
▪ The Cosmos DB SDKs can be used to build rich iOS and Android applications using the
popular Xamarin framework
