Contents
- Use bulk insert SQL query
- Use bulk insert to load raw JSON data
- Use MERGE to update data in Snowflake
- Use SQL query once per record
- Use SQL query once per page of data
- Use SQL query on first page only
Snowflake uses standard SQL queries to import and modify data. You can't run multiple SQL queries in a single flow step; if your flow requires multiple SQL queries, create one flow step for each query.
Use bulk insert SQL query
The bulk insert data option is ideal for large data volumes. integrator.io builds the insert query for you automatically for each batch of records. You can select the destination table to receive bulk insert data by validated table name or by referencing a dynamic lookup with a handlebars expression that identifies the destination table. The default number of records bound in each batch is 100, but you can use the Batch size field in the Advanced section to tune your imports as needed.
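For illustration, with a batch size of three, the statement integrator.io builds is roughly equivalent to the multi-row insert below. The table and column names are hypothetical, and in practice the record values are bound as parameters rather than written inline:

```sql
-- Roughly what one batch of three records looks like as a single insert;
-- my_table and its columns are hypothetical, and integrator.io binds the
-- values as parameters rather than inlining them as shown here.
INSERT INTO my_table (id, name, amount)
VALUES
  (101, 'Alpha', 19.99),
  (102, 'Beta',  24.50),
  (103, 'Gamma',  7.25);
```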
For more information on Snowflake bulk insert functionality, see Binding an array for bulk insert.
For more information on Snowflake query size limitations, see Query size limits.
Identify destination table by name
In the Choose destination table section, select Bulk insert data into a table in the database and begin typing the table name in the Destination table field. Any table that already exists in your Snowflake environment will display in a list filtered by the text you enter. Select the destination table from the list.
Identify destination table by dynamic lookup
To identify the destination table with a dynamic lookup, choose Bulk insert data into a table identified by a dynamic lookup and enter a handlebars expression to reference the lookup using the following format:
{{lookup.<lookup name>}}
For example:
{{lookup.SnowflakeTableLookup}}
Use bulk insert to load raw JSON data
When using the bulk insert option, you can take advantage of Snowflake technology to insert raw JSON payloads (up to 16 MB compressed) into a single column of one row.
Follow these steps to configure a bulk insert SQL query for semi-structured JSON data.
- Click the mappings button on the Snowflake import step.
- Edit the mapping between your source and the Snowflake import, then click the Settings button to the right of the mapped fields.
- Set the Data type to JSON, and select Multi-field as the Field mapping type.
For more information on Snowflake semi-structured data, see Semi-structured concepts.
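As a sketch of what this enables on the Snowflake side, a JSON payload mapped this way lands in a VARIANT column, where nested values can be queried with Snowflake's path syntax. The table and field names below are hypothetical:

```sql
-- Hypothetical destination table with a single VARIANT column for raw JSON
CREATE TABLE raw_events (payload VARIANT);

-- Once loaded, nested JSON values can be queried directly with path syntax
SELECT payload:customer.id::STRING       AS customer_id,
       payload:order.total::NUMBER(10,2) AS order_total
FROM raw_events;
```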
Use MERGE to update data in Snowflake
The best practice for updating existing data in Snowflake is to use two separate flows that are linked together:
- The first flow should simply import all changes to a temporary staging table in Snowflake.
- The second flow should merge the staging table with the production table, and then delete all rows in the staging table so that the flows can be run again (see the sketch after this list).
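A minimal sketch of the second flow's two SQL steps, assuming a hypothetical production table and key column; netsuite_staging matches the staging table used in the example below:

```sql
-- Step 1: merge the staging table into the production table
-- (production, id, name, and amount are hypothetical names)
MERGE INTO production AS p
USING netsuite_staging AS s
  ON p.id = s.id
WHEN MATCHED THEN
  UPDATE SET p.name = s.name, p.amount = s.amount
WHEN NOT MATCHED THEN
  INSERT (id, name, amount) VALUES (s.id, s.name, s.amount);

-- Step 2: clear the staging table so the next run picks up only new changes
DELETE FROM netsuite_staging;
```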
The following two examples illustrate the flows described above. The first flow (DF03 - NetSuite to Snowflake) exports data from NetSuite and imports it into a single staging table in Snowflake.
The second flow (DF03 -- m&d) first checks whether any records exist in the staging table. This is done with a simple dummy query:
SELECT * FROM netsuite_staging LIMIT 1
If any records are returned, integrator.io runs the flow to merge the tables together, then deletes all rows in the staging table so that the next time the flow runs, only future changes will be synchronized.
The following image illustrates the MERGE query used in the above flow. Notice that Use SQL query on first page only is selected since we only want to run the MERGE query once.
The following image illustrates the dummy query that checks whether there are any records in the staging table. If so, integrator.io exports only a single record to trigger the merge. The single record returned is used only to trigger subsequent steps in the flow; its data is not used for anything else.
Use SQL query once per record
Execute a SQL query once for each record.
You can write your SQL command in the SQL query text field. Click Edit to the right of the text field to open the SQL query builder AFE.
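For example, a per-record query might update one row using values from the current record via handlebars placeholders. The table name and the exact placeholder paths below are hypothetical; the paths available depend on how your source data is structured:

```sql
-- Hypothetical per-record update; {{orderId}} and {{status}} are assumed
-- handlebars placeholders resolved from each record before the query runs
UPDATE orders
SET status = '{{status}}'
WHERE order_id = {{orderId}};
```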
Use SQL query once per page of data
Execute a SQL query once per page of data.
You can write your SQL command in the SQL query text field. Click Edit to the right of the text field to open the SQL query builder AFE.
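As a rough sketch, a once-per-page query could iterate over the page of records with a handlebars #each loop to build a single multi-row statement. The context variable (data), table, and field names here are assumptions and may differ in your flow:

```sql
-- Hypothetical once-per-page insert; {{#each data}} is assumed to iterate
-- over the records in the current page
INSERT INTO order_audit (order_id, status)
VALUES
{{#each data}}
  ({{this.orderId}}, '{{this.status}}'){{#unless @last}},{{/unless}}
{{/each}};
```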
Use SQL query on first page only
Use this option ONLY if you want to run a SQL command once per flow run. The SQL executes as soon as the first page of data reaches the flow step. Use this option carefully to avoid race conditions in your flows.
The primary use case for this option is a flow where source records are first aggregated in a file and uploaded to Amazon S3, and Use SQL query on first page only is then used at the end of the flow to execute a Snowflake COPY command that bulk loads the file directly from Amazon S3. See bulk loading from Amazon S3 in Snowflake's documentation for additional information.
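A minimal sketch of the kind of COPY command this step might run; the stage name, table, and file format options are hypothetical:

```sql
-- Hypothetical bulk load from an external stage pointing at Amazon S3
COPY INTO my_table
FROM @my_s3_stage/exports/
FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);
```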
Important: Do NOT use this option to delete or create tables that are also being loaded or updated in the same flow. To automate those tasks, use separate flows for each individual task, and then link the flows together to run consecutively. Using linked separate flows guarantees that the SQL commands will execute in the correct order without colliding with each other.