You can create an import into Databricks in either of these ways:
- as part of a flow while designing or building your flow (Flow builder > Add destination app)
– or –
- as a standalone resource that you can later attach to a flow (Resources > Imports)
Creating an import into Databricks consists of two logical parts: providing general details and configuring your import.
Provide general details
To import data into Databricks, start by creating a destination (Option A) or a standalone import resource (Option B), and provide the general details:
Option A: Create a destination
- From the main menu, select Tools > Flow builder.
- In the Flow builder, click the plus sign () next to Destinations & Lookups.
- In Create destination/lookup, for Application, select Databricks.
- For What would you like to do?, select Import records into destination application.
- For Connection, select a previously created Databricks connection from the drop-down menu. If you've already created your connection, you only have to select it here, which saves time. Otherwise, you can create a connection now.
- Click Next.
– or –
Option B: Create an import resource
- From the main menu, select Resources > Imports.
- Click + New import.
- In Create import, for Application, select Databricks.
- For Connection, select a previously created Databricks connection from the drop-down menu. If you've already created your connection, you only have to select it here, which saves time. Otherwise, you can create a connection now.
- For Name, enter a meaningful name based on the purpose, function, or any other aspect.
- For Description, optionally provide details that will help others understand the purpose of your import.
- Click Next.
You've completed providing the general import details. Next, you must configure the import.
Later, after you've created an import that you want to reuse, you can select the Use existing import check box.
Configure your import
Databricks uses standard SQL queries to import and modify data. If your flow requires multiple SQL queries, you must create one flow step for each query. You can't use multiple SQL queries on a single flow step.
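To illustrate the one-query-per-step rule, here is a minimal sketch (not part of integrator.io) that naively counts top-level SQL statements by splitting on semicolons; a real parser would also handle semicolons inside string literals:

```python
def count_sql_statements(sql: str) -> int:
    """Naively count top-level SQL statements by splitting on semicolons.

    Illustration only: a script that counts as 2 here would need two
    separate flow steps in integrator.io, one per query.
    """
    parts = [p.strip() for p in sql.split(";")]
    return len([p for p in parts if p])

# One statement: fits in a single flow step.
assert count_sql_statements("UPDATE t SET a = 1 WHERE id = 2") == 1

# Two statements: would require two flow steps.
assert count_sql_statements(
    "DELETE FROM staging; INSERT INTO t SELECT * FROM staging_clean;"
) == 2
```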
For Databricks, you must configure the required fields in the General and How would you like the fields to be imported? sections for data to be imported successfully. (You can configure the optional fields as needed for any customization.)
Configure required settings
- In Name your import, enter a meaningful name based on the purpose, function, or any other aspect.
Notes:
- If you've used the Resources > Imports path to create an import, then this value is automatically filled in.
- When importing records, the default for One to many is No. If you have a scenario where you want to use One to many, that is, a single source record internally needs to create multiple records, see Create a one to many import including nested arrays.
- In Choose type, select one of the following options based on your requirements.
- Use bulk insert SQL query: The bulk insert option is ideal for large data volumes. integrator.io builds the insert query for you automatically for each batch of records. You can select the destination table to receive bulk insert data by validated table name or by referencing a dynamic lookup with a handlebars expression that identifies the destination table. The default number of records in each batch is 100, but you can use the Batch size field in the Advanced section to tune your imports as needed.
  For the bulk insert, provide a table name in Destination table.
  When using the bulk insert option, you can map the data:
  - Click the mappings () button on the Databricks import step.
  - Edit the mapping between your source and the Databricks import, then click the Settings () button to the right of the mapped fields.
  - Set the Data type based on your requirement.
  - Click Preview to view the output.
  - Click Save.
- Use SQL query once per record: Executes a SQL query once for each record. Write your SQL command in the SQL query text field, or click Edit () to the right of the text field to open the SQL query builder AFE.
- Use SQL query once per page of data: Executes a SQL query once per page of data. Write your SQL command in the SQL query text field, or click Edit () to the right of the text field to open the SQL query builder AFE.
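The batching behavior described above can be sketched as follows. This is an illustration only, assuming the default batch size of 100; the actual bulk insert query is generated by integrator.io, not by user code, and the table and column names here are hypothetical:

```python
from itertools import islice

def batches(records, batch_size=100):
    # Default Batch size in integrator.io is 100 (adjustable under Advanced).
    it = iter(records)
    while chunk := list(islice(it, batch_size)):
        yield chunk

def bulk_insert_sql(table, columns, batch):
    # Rough sketch of the kind of multi-row INSERT built per batch.
    rows = ", ".join(
        "(" + ", ".join(repr(rec[c]) for c in columns) + ")" for rec in batch
    )
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES {rows}"

records = [{"id": i, "name": f"rec{i}"} for i in range(250)]

# 250 records with the default batch size yields three bulk inserts.
assert [len(b) for b in batches(records)] == [100, 100, 50]

first = next(batches(records, batch_size=2))
print(bulk_insert_sql("my_table", ["id", "name"], first))
# prints: INSERT INTO my_table (id, name) VALUES (0, 'rec0'), (1, 'rec1')
```

By contrast, the once-per-record option would run 250 separate queries for the same data, and once-per-page would run one query per page of records received from the source.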
- Click Preview to check the import.
- Click Save.
When the import is saved:
- If you created it in a flow, the import is added to your flow in Flow builder.
- If you created it as a standalone resource, it is added under Resources > Imports.
| Optional sections/fields | Instructions |
|---|---|
| **Mock response** | |
| Mock response | See Provide mock response data |
| **Advanced** | |
| Batch size | Shown only for a bulk insert. Indicates the number of records that will be imported in one request. The default value is 100. |
| Concurrency ID lock template | See Build concurrency ID lock template |
| Data URI template | See Data URI template |
| Invoke | Copy the URL if you want to invoke this resource via an HTTP request. |