Snowflake Overview
Creating and Managing the Snowflake Architecture
• A decade ago, data platform architectures lacked the scalability necessary to
make it easier for data-driven teams to share the same data simultaneously
regardless of the size of the team or their proximity to the data
• Snowflake is an evolutionary modern data platform that solved the
scalability problem
• Snowflake’s Data Cloud provides users with a unique experience by
combining a new SQL query engine with an innovative architecture
designed and built, from the ground up, specifically for the cloud
Create a new worksheet titled
Chapter2 Creating and Managing
Snowflake Architecture
Cloud Services Layer: Increased usage of the cloud services layer will likely
occur when using several simple queries, especially queries accessing session
information or using session variables
• In a perfect world, you’d pay the same total cost for using an X-Small virtual
warehouse as using a 4X-Large virtual warehouse, because the larger warehouse
would finish the same workload proportionally faster
• The number of concurrent queries, the number of tables being queried, and the
size and composition of the data are a few things that should be considered when
sizing a Snowflake virtual warehouse
• Resizing a Snowflake virtual warehouse is a manual process and can be done
even while queries are running because a virtual warehouse does not have to be
stopped or suspended to be resized
• However, when a Snowflake virtual warehouse is resized, only subsequent
queries will make use of the new size
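The "perfect world" cost claim above can be checked with a little arithmetic. The sketch below assumes the standard credit rates (1 credit per hour for X-Small, doubling with each size up to 128 for 4X-Large) and perfectly linear scaling, which real workloads rarely achieve. It also ignores billing minimums. The function names are hypothetical and exist only for illustration.

```python
# Credits billed per hour for a standard virtual warehouse of each size.
CREDITS_PER_HOUR = {
    "X-Small": 1, "Small": 2, "Medium": 4, "Large": 8,
    "X-Large": 16, "2X-Large": 32, "3X-Large": 64, "4X-Large": 128,
}


def ideal_runtime_hours(size, xsmall_hours):
    """Runtime under the idealized assumption of perfectly linear scaling."""
    return xsmall_hours / CREDITS_PER_HOUR[size]


def total_credits(size, runtime_hours):
    """Total credits consumed by running one warehouse for runtime_hours."""
    return CREDITS_PER_HOUR[size] * runtime_hours


def compare_sizes(xsmall_hours):
    """Map each size to (runtime in hours, total credits) for one workload."""
    result = {}
    for size in CREDITS_PER_HOUR:
        hours = ideal_runtime_hours(size, xsmall_hours)
        result[size] = (hours, total_credits(size, hours))
    return result


if __name__ == "__main__":
    # A workload that would take 128 hours on an X-Small warehouse.
    for size, (hours, credits) in compare_sizes(128.0).items():
        print(f"{size:>8}: {hours:7.2f} h, {credits:6.1f} credits")
```

Under these assumptions every size consumes the same number of credits and only the time to completion changes. In practice the scaling is not linear, which is why the sizing factors listed above matter.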
TIP
• In Snowflake, we can create virtual warehouses through the user interface or with SQL
• When we create a new virtual warehouse using the Snowsight user interface, as shown
in Figure 2-7, we’ll need to know the name of the virtual warehouse and the size we
want the virtual warehouse to be
• A multicluster virtual warehouse allows Snowflake to scale in and out automatically
• A virtual warehouse can be resized, either up or down, at any time, including while it is
running and processing statements
• Resizing a virtual warehouse doesn’t have any impact on statements that are currently
being executed by the virtual warehouse
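As a rough illustration of the SQL route mentioned in the tips above, the sketch below builds CREATE WAREHOUSE and ALTER WAREHOUSE statements as strings that could be pasted into a worksheet. The warehouse name CH2_WH, the helper names, and the default values are assumptions made for this example. The multicluster parameters apply only where the Snowflake edition supports multicluster virtual warehouses.

```python
# Size keywords accepted by WAREHOUSE_SIZE, X-Small through 4X-Large.
VALID_SIZES = {
    "XSMALL", "SMALL", "MEDIUM", "LARGE",
    "XLARGE", "XXLARGE", "XXXLARGE", "X4LARGE",
}


def create_warehouse_sql(name, size="XSMALL", auto_suspend_seconds=300,
                         min_clusters=1, max_clusters=1):
    """Build a CREATE WAREHOUSE statement (hypothetical helper)."""
    if size not in VALID_SIZES:
        raise ValueError(f"unknown warehouse size: {size}")
    if not 1 <= min_clusters <= max_clusters:
        raise ValueError("need 1 <= min_clusters <= max_clusters")
    parts = [
        f"CREATE WAREHOUSE IF NOT EXISTS {name}",
        f"WITH WAREHOUSE_SIZE = '{size}'",
        f"AUTO_SUSPEND = {auto_suspend_seconds}",
        "AUTO_RESUME = TRUE",
        "INITIALLY_SUSPENDED = TRUE",
    ]
    if max_clusters > 1:
        # Multicluster settings let Snowflake scale in and out automatically.
        parts += [
            f"MIN_CLUSTER_COUNT = {min_clusters}",
            f"MAX_CLUSTER_COUNT = {max_clusters}",
            "SCALING_POLICY = 'STANDARD'",
        ]
    return " ".join(parts) + ";"


def resize_warehouse_sql(name, new_size):
    """Build an ALTER WAREHOUSE statement that changes the size."""
    if new_size not in VALID_SIZES:
        raise ValueError(f"unknown warehouse size: {new_size}")
    return f"ALTER WAREHOUSE {name} SET WAREHOUSE_SIZE = '{new_size}';"


if __name__ == "__main__":
    print(create_warehouse_sql("CH2_WH", "MEDIUM", max_clusters=3))
    print(resize_warehouse_sql("CH2_WH", "LARGE"))
```

The resize statement can be issued while the warehouse is running. Statements already executing keep their current resources, and only later statements use the new size.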
The value of the Auto Resume and Auto Suspend times should equal or exceed
any regular gaps in your query workload
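One way to apply the guidance above is to measure the gaps between queries and pick a suspend time no shorter than the largest one. The sketch below assumes a list of query start times in seconds covering a single busy period, so that every gap counts as a regular gap. It ignores query duration. The function names and the 60-second floor are hypothetical choices for this example.

```python
def gaps_between_queries(query_times_seconds):
    """Return the idle gaps, in seconds, between consecutive query starts."""
    ordered = sorted(query_times_seconds)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def suggest_auto_suspend(query_times_seconds, minimum=60):
    """Suggest an auto-suspend time that equals or exceeds every observed gap."""
    gaps = gaps_between_queries(query_times_seconds)
    if not gaps:
        return minimum
    return max(minimum, max(gaps))


if __name__ == "__main__":
    # Queries arriving roughly every four minutes.
    starts = [0, 240, 480, 700]
    print(gaps_between_queries(starts))
    print(suggest_auto_suspend(starts))
```

With regular four-minute gaps, a suspend time shorter than 240 seconds would cause the warehouse to suspend and resume between almost every pair of queries.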
• Query processing tends to slow down when the workload reaches full capacity on
traditional database systems
• In contrast, Snowflake estimates the resources needed for each query, and as the
workload approaches 100% of capacity, each new query is held in a queue until
there are sufficient resources to execute it
• One way to handle concurrent workloads efficiently is to separate them by
assigning different virtual warehouses to different users or groups of users
• The virtual warehouses depicted in Figure 2-15 are all the same size, but in
practice, the virtual warehouses for different workloads will likely be of different
sizes
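The queueing behavior described in the bullets above can be pictured with a toy model. This is not how Snowflake is implemented. It is a minimal sketch that assumes each query arrives with a single estimated resource number, that the warehouse has a fixed capacity, and that the queue is first in, first out. The class and method names are hypothetical.

```python
from collections import deque


class ToyWarehouse:
    """Toy admission model: run a query if it fits, otherwise queue it."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.in_use = 0
        self.running = {}     # query_id -> estimated resources
        self.queue = deque()  # waiting (query_id, estimated resources)

    def _start(self, query_id, estimate):
        self.running[query_id] = estimate
        self.in_use += estimate

    def submit(self, query_id, estimate):
        """Start the query if nothing is waiting and it fits, else queue it."""
        if not self.queue and self.in_use + estimate <= self.capacity:
            self._start(query_id, estimate)
            return "running"
        self.queue.append((query_id, estimate))
        return "queued"

    def finish(self, query_id):
        """Release resources, then start queued queries that now fit."""
        self.in_use -= self.running.pop(query_id)
        started = []
        while self.queue and self.in_use + self.queue[0][1] <= self.capacity:
            next_id, estimate = self.queue.popleft()
            self._start(next_id, estimate)
            started.append(next_id)
        return started


if __name__ == "__main__":
    wh = ToyWarehouse(capacity=100)
    print(wh.submit("a", 60))
    print(wh.submit("b", 50))
    print(wh.submit("c", 30))
    print(wh.finish("a"))
```

Giving each workload its own virtual warehouse, as the bullets above suggest, means a heavy workload fills only its own queue and does not delay the others.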
Separation of Workloads and Workload Management
• In which of the three Snowflake architecture layers will you find the virtual warehouse cache?
• If you are experiencing higher than expected costs for Snowflake cloud services, what kinds of things
might you want to investigate?
• What effect does scaling up or scaling out have on storage used in Snowflake?
• What components do you need to configure specifically for multicluster virtual warehouses?
Knowledge Check
• What are two options to change the virtual warehouse that will
be used to run a SQL command within a specific worksheet?