Data Loading in Snowflake
Agenda
• Load Types
• Bulk Loading vs. Continuous Loading
• Copy Command
• Transforming Data
Load Types
1. Bulk loading using Copy
2. Continuous loading using Snowpipe
Bulk Loading Using the COPY Command
• This option loads batches of data from files already available in cloud storage (external stages).
• A storage integration object must be created so Snowflake can read data from these cloud storage locations.
(or)
• Data files are first copied from a local machine to an internal stage (i.e. inside Snowflake) before being loaded into a table.
• Bulk loading uses virtual warehouses.
• Users must size the warehouse appropriately to accommodate the expected load when running the COPY command.
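The external-stage flow above can be sketched as follows. All object names, the bucket path, and the IAM role ARN are hypothetical placeholders; real values depend on your cloud account setup.

```sql
-- Sketch: integration -> stage -> COPY (assumed names: my_s3_int, my_stage, my_table)
CREATE STORAGE INTEGRATION my_s3_int
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'S3'
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/my_load_role'  -- placeholder ARN
  STORAGE_ALLOWED_LOCATIONS = ('s3://my-bucket/data/');                 -- placeholder bucket

-- External stage that reads through the integration
CREATE STAGE my_stage
  URL = 's3://my-bucket/data/'
  STORAGE_INTEGRATION = my_s3_int;

-- Bulk load the staged files into the target table
COPY INTO my_table
  FROM @my_stage
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);
```

For the internal-stage path, you would instead `PUT` files from the local machine into a table or named internal stage, then run the same `COPY INTO` against that stage.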
Continuous Loading Using Snowpipe
• Designed to load small volumes of data (i.e. micro-batches) and make them available incrementally for analysis.
• Suited for live or near-real-time data.
• Snowpipe loads data within minutes after files are added to a stage and
submitted for ingestion.
• This ensures users have the latest data for business analysis.
• Snowpipe uses compute resources provided by Snowflake; it is serverless, and this serverless compute is billed separately.
• The COPY statement in the pipe definition supports the same COPY
transformation options as when bulk loading data.
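A Snowpipe definition wraps a COPY statement, as sketched below. The pipe, table, and stage names are hypothetical; `AUTO_INGEST = TRUE` assumes the stage's cloud storage is configured to send event notifications to Snowflake.

```sql
-- Sketch: a pipe that auto-ingests JSON files as they land in the stage
CREATE PIPE my_pipe
  AUTO_INGEST = TRUE     -- load automatically on cloud storage event notifications
AS
  COPY INTO my_table
  FROM @my_stage
  FILE_FORMAT = (TYPE = 'JSON');
```

Once created, Snowflake's serverless compute picks up each new file within minutes; no user-managed warehouse is involved.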
COPY Command
COPY INTO <table_name>
FROM @<stage>
FILE_FORMAT = ( ... )
FILES = ( 'filename1', 'filename2', ... )
(or)
PATTERN = '.*filepattern.*'
<other_optional_properties> ;
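A concrete instance of the syntax above might look like this; the table, stage, and file pattern are illustrative placeholders.

```sql
-- Sketch: load all CSV files matching a pattern from a stage (assumed names)
COPY INTO sales
  FROM @sales_stage
  FILE_FORMAT = (TYPE = 'CSV' FIELD_DELIMITER = ',' SKIP_HEADER = 1)
  PATTERN = '.*sales.*[.]csv'
  ON_ERROR = 'CONTINUE';   -- optional property: skip bad rows instead of aborting
```

`FILES` and `PATTERN` are alternatives: `FILES` names specific files explicitly, while `PATTERN` selects by regular expression.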
Other ways to load Data
• By using ETL tools such as:
Matillion
DataStage
Informatica
Hevo
Azure Data Factory
Azure Synapse, etc.
Simple Transformations During Data Load
Snowflake supports transforming data while loading it into a table
using the COPY command. Options include:
• Column reordering
• Column omission
• String operations
• Other functions
• Sequence numbers
• Auto-increment columns
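Several of the options listed above can be combined in one COPY statement by selecting from the stage, as sketched below. The table, stage, and positional columns are hypothetical; `$2`, `$3`, and `$5` refer to columns of the staged file by position.

```sql
-- Sketch: reorder/omit columns, apply a string function, and generate a sequence
COPY INTO target_table (id, full_name, city)
FROM (
  SELECT
    SEQ8(),                   -- generated sequence number for the id column
    UPPER($2 || ' ' || $3),   -- string operation: concatenate and uppercase
    $5                        -- reordered column; other file columns are omitted
  FROM @my_stage
)
FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);
```

Columns not selected from the file are simply omitted, which is how column omission and reordering are expressed in this form of COPY.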
Thank You