You can create an export from Databricks in either of these ways:
- as part of a flow while designing or building your flow (Flow builder > Add source app)
– or –
- as a standalone resource that you can later attach to a flow (Resources > Exports)
Tip: The navigation paths differ, but the exporting task is the same.
Creating an export from Databricks consists of two logical parts: providing general details and configuring the export.
You can start creating an export from Databricks and provide the general details in either of these ways:
- From the main menu, select Build > Flow builder.
- In Flow builder, click the plus sign (+) next to Source.
- In Create source, for Application, select Databricks.
- For What would you like to do?, the default is Export records from source application.
- For Connection, select a previously created Databricks connection from the drop-down menu. Selecting an existing connection saves time, but if you haven't created one yet, click Create connection to create it now.
- Click Next.
– or –
- From the main menu, select Resources > Exports.
- Click + New export.
- In Create export, for Application, select Databricks.
- For Connection, select a previously created Databricks connection from the drop-down menu. Selecting an existing connection saves time, but you can create a connection now if you haven't done so already.
- For Name, enter a meaningful name that reflects the export's purpose or function.
- For Description, optionally provide details that will help others understand your export.
- Click Next.
You've completed providing the general export details. Next, you must configure the export.
You can use standard SQL queries to export and modify data. If your flow requires multiple SQL queries, you must create one export for each query. You can't use multiple SQL queries on a single export.
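
As a rough illustration of the one-query-per-export rule, the sketch below splits a multi-statement SQL script into individual statements and pairs each with its own export definition. The table names, the `split_statements` helper, and the dictionary layout are hypothetical and are not part of integrator.io. The split is deliberately naive and would mishandle semicolons inside string literals.

```python
def split_statements(script: str) -> list[str]:
    """Naively split a SQL script on semicolons and drop empty fragments.

    Assumption: no statement contains a semicolon inside a string literal.
    """
    return [part.strip() for part in script.split(";") if part.strip()]


# Hypothetical script with two queries against made-up tables.
script = """
SELECT id, email FROM customers;
SELECT id, total FROM orders WHERE total > 100;
"""

# One (hypothetical) export definition per statement, mirroring the rule
# that a single export holds a single SQL query.
exports = [
    {"name": f"databricks-export-{index}", "query": statement}
    for index, statement in enumerate(split_statements(script), start=1)
]

for export in exports:
    print(export["name"], "->", export["query"])
```

Each entry in `exports` corresponds to one export that you would create separately by following the steps in this article.
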
For Databricks, you must configure the required settings in the General, What would you like to export?, and Configure export type sections for data to be exported successfully. (You can configure the optional settings as needed for any customization.)
- In Name your export, enter a meaningful name that reflects the export's purpose or function.
  If you've used the Resources > Exports path to create an export, then this value is automatically filled in.
- For SQL query, click the Edit icon and enter your SQL statement based on what you want to export from Databricks.
- For Export type, select an option based on how you want to export records. See Four different export types to retrieve data.
- Click Preview to run the export and see sample JSON-formatted data. If the SQL query does not execute successfully, revise the SQL query or Export type values.
- Click Save.
When you save an export:
- If you've created it in a flow, the export is added to your flow in Flow builder.
- If you've created it as a resource, it is added under Resources > Exports.
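
To give a feel for what the Preview step returns, the sketch below runs a query and prints the resulting rows as JSON records. It uses Python's built-in `sqlite3` module purely as a local stand-in for Databricks. The `customers` table and its columns are hypothetical, and the exact shape of the integrator.io preview output is an assumption.

```python
import json
import sqlite3

# In-memory SQLite database used only as a stand-in for a Databricks source.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be converted to a dict

# Hypothetical table and sample data.
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "a@example.com"), (2, "b@example.com")],
)

# The single SQL statement you would enter in the SQL query field.
query = "SELECT id, email FROM customers ORDER BY id"

# Each result row becomes one JSON record keyed by column name.
records = [dict(row) for row in conn.execute(query)]
preview = json.dumps(records, indent=2)
print(preview)

conn.close()
```

If a query fails in a check like this, it will most likely fail in Preview too, so it is worth validating the statement against your own data before saving the export.
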
These export settings are optional and typically don't have to be configured. Configure them only if your scenario requires it.
| Optional sections/settings | Instructions |
|---|---|
| Would you like to group records? | |
| Group records by fields | |
| Mock output | |
| Mock output | |
| Advanced | |
| Page size | See Choosing the right page size in Fine-tune integrator.io for optimal performance and data throughput |
| Data URI template | When your flow runs but has data errors, this field can be really helpful in that it allows you to make sure that all the errors in your job dashboard have a link to the original data in the export application. Use a handlebars template to generate the dynamic links based on the data being exported. For example, if you are exporting a customer record from Shopify, you would most likely set this field to the following value |
| Do not store retry data | Check this box if you do NOT want integrator.io to store retry data for records that fail in your flow. Storing retry data can slow down your flow's overall performance if you are processing very large numbers of records that are failing. |
| Override trace key template | If this field is set, you will override the platform default trace key field. integrator.io uses the trace key to identify a unique record. You can use a single field such as |
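
The Data URI template setting builds a link from fields of the exported record using a handlebars template. The sketch below imitates only the simplest part of that idea, replacing `{{field}}` placeholders with record values. The `render` helper, the `example.com` URL, and the record fields are all hypothetical. The URL is not the Shopify value mentioned in the table, and real handlebars templates support far more than this plain substitution.

```python
import re


def render(template: str, record: dict) -> str:
    """Replace {{field}} or {{nested.field}} placeholders with record values.

    A simplified stand-in for handlebars rendering: no helpers, no blocks,
    and a missing field raises KeyError.
    """

    def lookup(match: re.Match) -> str:
        value = record
        for key in match.group(1).split("."):
            value = value[key]
        return str(value)

    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", lookup, template)


# Hypothetical template and record, for illustration only.
template = "https://example.com/customers/{{id}}"
record = {"id": 42, "email": "a@example.com"}

link = render(template, record)
print(link)
```

The same substitution pattern applies to any template that references record fields, so a link in the job dashboard can point back to the specific source record that failed.
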