3 Snowflake+Architecture

Uploaded by

clouditlab9
Copyright
© © All Rights Reserved

Snowflake Architecture

Database Storage:
• The storage layer holds table data and query results.
• When data is loaded into Snowflake, Snowflake reorganizes it into its
internal optimized, compressed, columnar format and stores it in cloud storage.
Data is stored in columnar format
Data is stored in micro-partitions
• The data objects stored by Snowflake are not directly visible or accessible to customers.
• They are accessible only through SQL query operations run in Snowflake.
• Snowflake manages all aspects of how this data is stored, i.e.
the data organization,
file size,
structure,
compression,
metadata,
statistics
• Clustering keys can be defined on large tables for better query performance.
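
To illustrate why per-partition metadata and clustering help, here is a minimal sketch in Python. It is a toy model, not Snowflake's implementation: the `MicroPartition` class, the `prune` function, and the value ranges are all hypothetical. It assumes only that each micro-partition records the minimum and maximum value of a column, so a query filter can skip partitions whose range cannot match.

```python
from dataclasses import dataclass


@dataclass
class MicroPartition:
    """Toy stand-in for a micro-partition: min/max metadata for one column."""
    name: str
    min_value: int
    max_value: int


def prune(partitions, low, high):
    """Keep only partitions whose [min, max] range overlaps the filter [low, high]."""
    return [p for p in partitions if p.max_value >= low and p.min_value <= high]


# Hypothetical value ranges. In the clustered layout, similar values sit together.
clustered = [
    MicroPartition("p1", 1, 100),
    MicroPartition("p2", 101, 200),
    MicroPartition("p3", 201, 300),
]
# In the unclustered layout, every partition spans almost the whole value range.
unclustered = [
    MicroPartition("p1", 1, 290),
    MicroPartition("p2", 5, 300),
    MicroPartition("p3", 2, 280),
]

scanned_clustered = prune(clustered, 120, 150)
scanned_unclustered = prune(unclustered, 120, 150)
print([p.name for p in scanned_clustered])    # ['p2']
print([p.name for p in scanned_unclustered])  # ['p1', 'p2', 'p3']
```

In this model the same filter touches one partition when the data is well clustered and all three when it is not, which is the effect a clustering key aims for on large tables.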
Query Processing:
• Query execution is performed by the query processing layer.
• Snowflake processes queries using "virtual warehouses".
• Warehouses are required for queries, as well as all DML operations,
including loading data into tables.
• A warehouse is defined by its size.
• Each virtual warehouse is composed of multiple compute nodes allocated by
Snowflake from a cloud provider.
• On AWS they are a group of EC2 instances and on Azure a
group of virtual machines
• Compute cost is calculated from the time a virtual warehouse is running
(billed per second, with a 60-second minimum each time it starts or resumes)
• Virtual warehouses are considered the muscle of the system

• Warehouses can be scaled up and down easily

• Auto-Suspend and Auto-Resume are available
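
The effect of warehouse size and running time on compute cost can be sketched as follows. This is a rough estimate under stated assumptions, not a billing tool: the `CREDITS_PER_HOUR` table assumes standard warehouses consume 1 credit per hour at X-Small and double with each size step, and the function name `credits_used` is made up for illustration. The price of a credit depends on edition and region and is not modeled here.

```python
# Assumed credit rates for standard warehouses: each size step doubles the rate.
CREDITS_PER_HOUR = {"XSMALL": 1, "SMALL": 2, "MEDIUM": 4, "LARGE": 8, "XLARGE": 16}

# Assumed minimum billed duration each time a warehouse starts or resumes.
MIN_BILLED_SECONDS = 60


def credits_used(size, running_seconds):
    """Estimate credits for one continuous run of a warehouse of the given size."""
    billed_seconds = max(running_seconds, MIN_BILLED_SECONDS)
    return CREDITS_PER_HOUR[size] * billed_seconds / 3600


print(credits_used("MEDIUM", 900))  # 1.0 (15 minutes at 4 credits per hour)
print(credits_used("XSMALL", 10))   # billed as 60 seconds, not 10
```

Because a suspended warehouse accrues no running time, Auto-Suspend reduces cost for workloads with idle gaps, while the 60-second minimum makes very frequent suspend and resume cycles less attractive.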


• Increasing the size of a warehouse does not always improve data loading
performance.
• Data loading performance is influenced more by the number of files being
loaded (and the size of each file) than by the size of the warehouse.
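
A toy model makes the point about file count concrete. It assumes, purely for illustration, that a warehouse offers a fixed number of parallel load slots and that each slot loads one file at a time; the slot counts below are hypothetical and do not correspond to actual Snowflake warehouse sizes.

```python
import math


def load_rounds(num_files, parallel_slots):
    """Rounds of loading needed when each slot handles one file at a time."""
    return math.ceil(num_files / parallel_slots)


# Hypothetical slot counts standing in for progressively larger warehouses.
for slots in (8, 16, 32):
    print(slots, load_rounds(1, slots), load_rounds(64, slots))
```

In this model a single large file takes one round no matter how many slots exist, so a bigger warehouse adds nothing, whereas 64 files finish in fewer rounds as slots increase. Splitting the input into many moderately sized files is what lets extra compute be used.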

• What is a Multi-cluster Warehouse?

o By default, a virtual warehouse consists of a single cluster of compute
resources available to the warehouse for executing queries.
o As queries are submitted to a warehouse, the warehouse allocates resources
to each query and begins executing the queries.
o If sufficient resources are not available to execute all the queries submitted to
the warehouse, Snowflake queues the additional queries until the necessary
resources become available.
o With multi-cluster warehouses, Snowflake supports allocating, either
statically or dynamically, additional clusters to make a larger pool of
compute resources available.
o A multi-cluster warehouse is defined by specifying the following properties:
   • Maximum number of clusters, greater than 1 (up to 10).
   • Minimum number of clusters, equal to or less than the
   maximum (up to 10).
o If the minimum and maximum number of clusters are the same, the warehouse
runs in Maximized mode.
o If the minimum and maximum number of clusters are different, the warehouse
runs in Auto-scale mode.
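
The rules above can be sketched in a few lines of Python. The helper names `multi_cluster_mode` and `create_warehouse_sql`, the warehouse name, and the default suspend time are hypothetical; the upper bound of 10 clusters is taken from the text above. The generated statement uses the `MIN_CLUSTER_COUNT`, `MAX_CLUSTER_COUNT`, `AUTO_SUSPEND`, and `AUTO_RESUME` parameters of `CREATE WAREHOUSE`, and is only built as a string here, not executed.

```python
MAX_CLUSTERS_ALLOWED = 10  # upper bound as stated in the text above


def multi_cluster_mode(min_clusters, max_clusters):
    """Classify a warehouse by its minimum and maximum cluster counts."""
    if not 1 <= min_clusters <= max_clusters <= MAX_CLUSTERS_ALLOWED:
        raise ValueError("need 1 <= min_clusters <= max_clusters <= 10")
    if max_clusters == 1:
        return "single-cluster"
    return "maximized" if min_clusters == max_clusters else "auto-scale"


def create_warehouse_sql(name, size, min_clusters, max_clusters, auto_suspend_seconds=300):
    """Build (but do not run) a CREATE WAREHOUSE statement for the given settings."""
    multi_cluster_mode(min_clusters, max_clusters)  # validate the counts first
    return (
        f"CREATE WAREHOUSE {name} WITH "
        f"WAREHOUSE_SIZE = '{size}' "
        f"MIN_CLUSTER_COUNT = {min_clusters} "
        f"MAX_CLUSTER_COUNT = {max_clusters} "
        f"AUTO_SUSPEND = {auto_suspend_seconds} "
        f"AUTO_RESUME = TRUE"
    )


print(multi_cluster_mode(3, 3))  # maximized
print(multi_cluster_mode(1, 3))  # auto-scale
print(create_warehouse_sql("REPORTING_WH", "MEDIUM", 1, 3))
```

Maximized mode suits a steady, predictable number of concurrent queries, since all clusters run whenever the warehouse runs. Auto-scale mode lets Snowflake start and stop clusters between the two bounds as queued load rises and falls.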
Cloud Services Layer:

1. Collection of services that coordinate activities across Snowflake
2. This is the brain of Snowflake
3. Authentication and access control
4. Infrastructure management
5. Metadata Management
6. Security
7. Manages serverless features such as Snowpipe,
tasks, and materialized view maintenance.

Note:
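• Queries answered from metadata alone (for example, SHOW commands) do not
need a running warehouse, so they incur no warehouse compute charge.
• DDL statements are likewise handled by the cloud services layer and do not
require a running warehouse.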
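
The note above concerns warehouse compute. Cloud services usage itself is metered, and as far as Snowflake's published billing rules go, it is charged only for the portion that exceeds 10% of the same day's warehouse compute usage. A minimal sketch of that adjustment follows; the function name `billed_cloud_services` and the sample figures are hypothetical.

```python
def billed_cloud_services(compute_credits, cloud_services_credits, free_percent=10):
    """Cloud services credits billed for one day after the free allowance."""
    allowance = compute_credits * free_percent / 100
    return max(0.0, cloud_services_credits - allowance)


print(billed_cloud_services(100, 8))   # 0.0, usage is within the allowance
print(billed_cloud_services(100, 15))  # 5.0, only the excess is billed
```

For typical workloads the metadata queries and DDL statements mentioned in the note stay well inside this allowance, which is why they usually appear to be free.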
Connecting to Snowflake

Snowflake supports multiple ways of connecting to the service:

• A web-based user interface from which all aspects of managing and using Snowflake can
be accessed.
• Command line clients (e.g. SnowSQL) which can also access all aspects of managing
and using Snowflake.
• ODBC and JDBC drivers that can be used by other applications (e.g. Tableau) to connect
to Snowflake.
• Native connectors available in ETL tools (e.g. DataStage, Informatica).
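
As an example of the JDBC option, the sketch below assembles a connection URL of the form `jdbc:snowflake://<account_identifier>.snowflakecomputing.com/?<params>`. The helper name `snowflake_jdbc_url` and the account, warehouse, database, and schema values are placeholders, not real objects. Credentials are deliberately left out and would normally be supplied separately by the client application.

```python
from urllib.parse import urlencode


def snowflake_jdbc_url(account_identifier, **params):
    """Build a Snowflake JDBC URL with optional connection parameters."""
    base = f"jdbc:snowflake://{account_identifier}.snowflakecomputing.com/"
    return f"{base}?{urlencode(params)}" if params else base


url = snowflake_jdbc_url(
    "myorg-myaccount",          # placeholder account identifier
    warehouse="REPORTING_WH",   # placeholder warehouse
    db="SALES",                 # placeholder database
    schema="PUBLIC",
)
print(url)
# jdbc:snowflake://myorg-myaccount.snowflakecomputing.com/?warehouse=REPORTING_WH&db=SALES&schema=PUBLIC
```

A BI tool such as Tableau would take a URL like this, or the equivalent ODBC data source settings, along with the user's credentials.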
Thank you
