Snowflake uses standard SQL queries to import and modify data. You can't use multiple SQL queries on a single flow step. If your flow requires multiple SQL queries, you must create one flow step for each query.
To import data into Snowflake, choose an import type based on your requirements. Each import type is explained in the sections below.
For Use optimized bulk load (recommended for larger exports) and Use batch SQL query, when you specify the destination table, the Snowflake connector automatically creates an SQL query to import the data.
This import type is ideal for large data volumes as it aggregates data before loading, reducing the need for numerous batch inserts and significantly lowering costs by minimizing warehouse compute time. Additionally, integrator.io automatically generates insert or merge queries based on the destination table and primary keys, allowing you to update existing rows seamlessly without the need for a separate deduplication flow.
Note
Snowflake bulk load does not support variant columns; use batch inserts for them.
Destination table: Select the destination table into which the data must be inserted in either of these ways:
- By validated table name: Begin typing the table name in the Destination table field, and any table that already exists in your Snowflake environment will display in a list filtered by the text you enter.
- By referencing a dynamic lookup with a handlebars expression that identifies the destination table: Use a handlebars expression to reference the lookup in the following format: {{lookup.<lookup name>}}, for example: {{lookup.SnowflakeTableLookup}}
Primary keys: Select the primary keys so that your data can be matched based on these fields and then merged in the destination table. If your table already has primary keys, they will be automatically shown in this setting. You can select one or more fields to designate as primary keys. These fields uniquely identify each record in your table.
Tip
- If you leave the Primary keys setting blank, an insert query is created.
- If you provide values in the Primary keys setting, a merge query is created. (After you create your import, map the fields that you've provided in the Primary keys setting.)
With this option, after all data processing is complete, the generated query pushes the data to Snowflake as a single request.
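For reference, the statement that integrator.io generates behaves like a standard Snowflake INSERT (when Primary keys is blank) or MERGE (when primary keys are provided). The sketch below is only illustrative; the table name (CUSTOMERS), its columns, and the staged-values subquery are hypothetical, and the actual query the connector builds may differ.

-- Hypothetical merge produced when ID is chosen as the primary key
MERGE INTO CUSTOMERS AS target
USING (
  -- staged rows collected from the flow; $1, $2, $3 map to the bound values
  SELECT $1 AS ID, $2 AS NAME, $3 AS EMAIL
  FROM VALUES (101, 'Acme Corp', 'ops@acme.example')
) AS source
ON target.ID = source.ID
WHEN MATCHED THEN
  UPDATE SET NAME = source.NAME, EMAIL = source.EMAIL
WHEN NOT MATCHED THEN
  INSERT (ID, NAME, EMAIL) VALUES (source.ID, source.NAME, source.EMAIL);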
The batch insert data option is helpful for moderate data volumes. integrator.io builds the insert query for you automatically for each batch of records.
Destination table: Select the destination table to receive batch insert data in either of these ways:
- By validated table name: Begin typing the table name in the Destination table field, and any table that already exists in your Snowflake environment will display in a list filtered by the text you enter.
- By referencing a dynamic lookup with a handlebars expression that identifies the destination table: Use a handlebars expression to reference the lookup in the following format: {{lookup.<lookup name>}}, for example: {{lookup.SnowflakeTableLookup}}
The default number of records bound in each batch is 100, but you can use the Batch size field in the Advanced section to tune your imports as needed.
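Conceptually, each batch is sent as one parameterized INSERT with arrays of values bound to the placeholders, which is the pattern the Snowflake array-binding documentation linked below describes. The table and column names in this sketch are hypothetical, and the statement the connector builds for your flow may differ.

-- Hypothetical parameterized insert for one batch; arrays of values are bound to the placeholders
INSERT INTO ORDERS (ORDER_ID, CUSTOMER_ID, AMOUNT) VALUES (?, ?, ?);

-- Net effect for a batch of three records
INSERT INTO ORDERS (ORDER_ID, CUSTOMER_ID, AMOUNT)
VALUES (1001, 55, 19.99), (1002, 56, 42.00), (1003, 57, 7.50);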
For more information on:
- Snowflake batch insert functionality, see Binding an array for batch insert.
- Snowflake query size limitations, see Query size limits.
Execute a SQL query once for each record.
You can write your SQL command in the SQL query text field. Click the Edit icon to the right of the text field to open the SQL query builder AFE.
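For example, a hand-written per-record query can reference fields from the source record with handlebars placeholders. The table, columns, and handlebars paths below are assumptions for illustration; confirm the exact field paths available to your flow in the AFE preview pane.

-- Illustrative per-record update; table, columns, and {{record.*}} paths are hypothetical
UPDATE CUSTOMERS
SET NAME = '{{record.name}}', UPDATED_AT = CURRENT_TIMESTAMP()
WHERE ID = '{{record.id}}';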
Execute a SQL query once per page of data.
You can write your SQL command in the SQL query text field. Click the Edit icon to the right of the text field to open the SQL query builder AFE.
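For example, a per-page query can loop over every record in the page with a handlebars {{#each}} block to build a single statement. The context name (data), table, and fields below are assumptions for illustration; confirm the actual context in the AFE preview pane.

-- Illustrative per-page insert built with a handlebars loop; table, fields, and context name are hypothetical
INSERT INTO ORDERS (ORDER_ID, AMOUNT) VALUES
{{#each data}}
  ('{{ORDER_ID}}', {{AMOUNT}}){{#unless @last}},{{/unless}}
{{/each}};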
Use this option ONLY if you want to run a SQL command once per flow run. The SQL will execute as soon as the first page of data reaches the flow step. This option should be used very carefully to avoid race conditions in your flows.
The primary use case for this option is to support flows where source records are first aggregated in a file and then uploaded to Amazon S3, and then finally Use SQL query on first page only is used at the end of the flow to execute a Snowflake COPY command to bulk load the file directly from Amazon S3. See bulk loading from Amazon S3 in Snowflake documentation for additional information.
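A sketch of that final step is shown below. The target table, stage name, path, and file format options are hypothetical; adjust them to match your own S3 stage and files as described in the Snowflake bulk-loading documentation.

-- Illustrative COPY command executed once per flow run; stage, path, and format are hypothetical
COPY INTO ORDERS
FROM @my_s3_stage/exports/orders/
FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY = '"' SKIP_HEADER = 1);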
Important
Do NOT use this option to delete or create tables that are also being loaded or updated in the same flow. To automate those tasks, use separate flows for each individual task, and then link the flows together to run consecutively. Using linked separate flows guarantees that the SQL commands will execute in the correct order without colliding with each other.