Most runtime errors result from the quality of the data you're importing. It is extremely important to keep your data clean, accurate, and current to improve the success of your business automations.
“No error left behind”
With a better understanding of the Celigo platform’s error management tools and the principles of effective error resolution, you’ll quickly build your skills to leverage the power of integrator.io, improve data hygiene, and avoid compounding problems. It will always take continuous monitoring and diligence to some degree, but practicing these concepts will get you much closer to the goal of seamless automation.
Contents
- How the Celigo platform streamlines error management
- Take full advantage of industry-leading iPaaS features
- Analyze errors to fix the root cause – not just resolve individually
How the Celigo platform streamlines error management
The Celigo platform helps reduce your efforts by intelligently recovering behind the scenes, where possible, and reducing the number of open errors that you need to review, chiefly through the following features:
- Auto-resolving duplicate errors
- Auto-retrying intermittent errors
- Auto-detecting a connection that's back online and resuming the flow
Auto-resolving duplicate errors
When the Auto-resolve feature is enabled for a flow, the platform monitors how an error is processed in the current run, then finds records from previous runs identified by the same trace key and automatically resolves them for you.
Example 1: When the error data has been corrected in an export
An error was returned by the source application. After you correct the data coming from the source application in the export and rerun the flow, the flow reports no error for the record in the current run, and previous (duplicate) errors for records with the same trace key are auto-resolved (that is, they are moved to the Resolved errors tab).
Example 2: When the error still exists
Your integration reports errors for the same record again and again – in a single run or across multiple scheduled or real-time flow runs. The platform displays only the most recent error for the record so that you can review and fix it, and previous (duplicate) errors for records with the same trace key are auto-resolved. Your view of the open errors is much cleaner and easier to use as a result, but you can still view the historical “duplicate” error resolution in the Resolved errors tab.
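Conceptually, auto-resolve collapses open errors by trace key so that only the latest error per record remains open. The sketch below is an illustration of that idea only – it is not platform code, and the traceKey field name simply mirrors how the error details label it:

function collapseByTraceKey (openErrors) {
  // openErrors is ordered oldest to newest; a later error for the same
  // trace key overwrites the earlier one, so only the most recent stays open.
  const latestByTraceKey = new Map()
  for (const error of openErrors) {
    latestByTraceKey.set(error.traceKey, error)
  }
  // Everything that was overwritten is what the platform auto-resolves.
  return [...latestByTraceKey.values()]
}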
Auto-retrying intermittent errors
When integrator.io classifies an error as Intermittent, up to four auto-retry attempts occur if Auto-resolve is enabled for the flow.
Intermittent errors – responses that are not necessarily repeatable, where the third-party API recovers on a later attempt without any change to the request – are more common than most people assume. In fact, a random audit for the week of Sept. 18, 2022, revealed that the platform auto-resolved and auto-retried 4.3 million errors, saving customers untold time and frustration.
In late August 2022, Magento 2 – NetSuite Integration App customers received numerous errors saying that the “target service might be inactive.”
These intermittent errors were resolved over time automatically for these customers by periodic retries, and the data was successfully synced without any intervention.
Auto-detecting a connection that's back online and resuming the flow
When auto-resolve is enabled, and the platform gets offline errors classified as Connection:
- The platform will pause the flow from running.
- It will periodically check to determine whether the user has fixed the Connection error and the connection is restored. The platform pings automatically with an exponential backoff of 1, 2, 4, 8, 16, and 32 hours between attempts.
- If the connection is restored, the platform will automatically detect it and resume the flow.
When the flow is back up and running, you can retry any errors received before or after the flow was paused.
Take full advantage of industry-leading iPaaS features
The current integrator.io error management system was designed to give anyone monitoring a flow the most meaningful and actionable error information without writing a single line of code, whether you’re interacting with the platform or reviewing email notifications. Simply put, Celigo error management capabilities are vastly superior to those of any other integration/automation platform on the market. They’ve been proven across a massive number of flows, records, teams, applications, and so on.
Furthermore, the error management functionality has benefited from a wealth of input from customers and integration engineers, as well as from a strong commitment to enhancing the customer experience. Let’s look at some of the ways you can easily manage integration errors and examine their causes:
- The account dashboard gives you a quick view of in-progress flows and results of completed flows. It’s especially useful if you have to monitor multiple integrations.
- Flow Builder shows you the current number of open errors for each flow step, and the Run console shows you the result of the latest flow run.
- Flow analytics graphs allow you to track exactly how long a flow step takes to process. At a high level, you can monitor spikes in data and error activity and identify anomalies, such as an unusually high number of records causing a slower runtime.
- Email notifications alert you to errors, resolved errors, and offline connections. You can subscribe yourself and other users of your account.
- Integration monitor roles give coworkers access to monitor, review, and resolve errors – without having to worry about whether these users will have permissions to change the integration setup.
The platform also provides complete diagnostic information for each error, enabling instant recognition and helping you drill down when troubleshooting:
- Complete error retry data
- Contextual request and response history for HTTP-based, NetSuite, and Salesforce connections
- Standard error fields: error message, error code, error source, error ID, trace key, timestamp, and a classification to clarify where the error originates from:
  - Classification values such as “Missing,” “Intermittent,” “Duplicate,” and “Rate limit” indicate that the error originates from an external application (outside integrator.io), and the “Source” value holds the application name
  - The “Source” field can also reveal when the error happens due to a failure in an integrator.io component (such as a transformation, hook, or filter)
  - A “Connection” classification indicates errors caused by a bad connection, and the “Source” field reveals the application name
- In a custom flow (as opposed to an Integration app flow), you can enrich the error ID info by overriding the trace key to include additional attributes, such as internal ID + vendor name. When you examine the error details, the Trace key field shows the custom value you specified.
- You can improve the error message by using the Data URI template setting, which results in a user-friendly link to the record in the source system.
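For illustration, both of these settings accept handlebars templates. The field names and URL below are hypothetical, and the exact handlebars context depends on your export or import, so treat these as sketches rather than copy-paste values:

Override trace key template:  {{record.internalId}}-{{record.vendorName}}
Data URI template:            https://my-erp.example.com/orders/{{record.internalId}}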
Analyze errors to fix the root cause – not just resolve individually
For any outstanding error that the platform cannot auto-resolve, it’s definitely worth your time to perform a root-cause analysis (RCA). That is, don’t just focus on resolving one error. It’s more important to fix the root cause of the error and understand what went wrong between two or more systems and how the logic in your flow may have contributed to the error. The goal is to prevent the same or similar errors in the next flow run and maintain truly synced applications.
Understandably, at times you will need an order record to be imported as soon as possible. However, even if you use the retry data to fix specific urgent errors, you still should analyze the root cause and fix the error to improve your integration.
Important: It is imperative that your company’s IT work together with departmental application admins to manage and monitor the integrations. At a minimum, IT personnel and the admins of the applications being automated must have sufficient access to the integrator.io account. The primary responsibility for the RCA and fixing errors should belong to the departmental application admins, who are familiar with the applications and the data. Then, IT should be an escalation resource when the problem persists.
Tools and tips for analyzing specific types of errors
The Celigo platform contains several features for inspecting data throughout the integration, which are useful for handling errors and viewing the expected and actual outcomes of a flow run.
Tip: Most errors result from changes in the source and destination applications, whether introduced by an update or by someone in your organization. While the platform’s features will help you identify and mitigate errors, implementing a company-wide change management process for these external systems will help you avoid receiving errors and pinpoint the unforeseen causes more efficiently.
Advanced field editors (AFEs)
AFEs – with their easy-to-use input, output, and console panes for testing statements – help you rapidly enhance and troubleshoot data throughout a flow. When performing an error RCA, your first troubleshooting step should be to copy the retry data into the AFE input.
Debug logs
Debug logs contain detailed traces of integration activity for HTTP-based connectors and webhook listeners. Enable debugging to track requests and responses between integrator.io and the flow step application. The logs assist you when troubleshooting by letting you view the errors that caused a particular record to fail, and you can compare the requests and responses of successful and failed records.
Runtime script debugging
You can write statements to script logs in order to inspect the shape and content of the data flowing through any step in a flow. Add these console statements anywhere JavaScript export and import hooks are available.
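For example, a minimal preSavePage export hook like the one below only adds logging so you can inspect the page of data; the logged fields are arbitrary, and this is a sketch rather than a recommended pattern:

function preSavePage (options) {
  // Inspect the shape and content of the page before it is saved.
  console.log('records on page:', options.data.length)
  console.log('first record:', JSON.stringify(options.data[0]))

  // Return the data and errors unchanged; this hook only logs.
  return { data: options.data, errors: options.errors, abort: false }
}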
Import preview panels
Test API calls to HTTP-based connectors, directly in the platform, with mock data in the import Preview panel. The import preview allows you to see the constructed request and a sample record that represents what integrator.io will send to the destination application. You can then send the sample record to generate a response from the destination application and use the response data as a reference when troubleshooting.
Get the file name for error handling (FTP exports)
For FTP exports, you can get the file name in an error message. The file name is helpful if you have multiple exports in a flow with similar data files. You can search for error messages based on the file name and then troubleshoot errors.
Automatically ignore an error (custom script)
You can write a custom postSubmit hook script to ignore an error with specific wording in its message. (Hooks apply only to custom flows; they are unavailable for Integration apps.)
There are scenarios where you might want to insert a record into an application only if the record does not exist. However, the application doesn’t offer an API to perform the search. In this case, if you receive a duplicate record error (the actual message varies by application) when you insert the record, the hook will remove that error from your flow.
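As a sketch only, a postSubmit hook along these lines could suppress such errors. It assumes that changing a failed response’s statusCode to 200 and clearing its errors array causes the platform to treat the record as successful, and that matching on the word “duplicate” is specific enough for your application; adjust both to your situation:

function postSubmit (options) {
  options.responseData.forEach((response) => {
    const isDuplicateError = (response.errors || []).some((error) =>
      /duplicate/i.test(error.message)
    )

    if (response.statusCode === 422 && isDuplicateError) {
      // Treat the duplicate-record error as a success so it never surfaces
      // as an open error in the flow (assumption noted above).
      response.statusCode = 200
      response.errors = []
    }
  })

  // The responseData array must be returned with its length unchanged.
  return options.responseData
}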
Send errors to an integrator.io listener (custom script)
In another rare situation, you might need to extract the information from the error retry data for further research. Assuming you are working with a custom flow, one way to automate the process is to write a postSubmit hook to send the payload info to a listener in a real-time flow.
The following sample code sends the response data to an integrator.io listener (webhook) for further processing, only when the response is not successful:
/*
 * postSubmitFunction stub:
 *
 * The name of the function can be changed to anything you like.
 *
 * The function will be passed one 'options' argument that has the following fields:
 *   'preMapData' - an array of records representing the page of data before it was mapped.
 *                  A record can be an object {} or array [] depending on the data source.
 *   'postMapData' - an array of records representing the page of data after it was mapped.
 *                   A record can be an object {} or array [] depending on the data source.
 *   'responseData' - an array of responses for the page of data that was submitted to the
 *                    import application. An individual response will have the following
 *                    fields:
 *     'statusCode' - 200 is a success. 422 is a data error. 403 means the connection went
 *                    offline.
 *     'errors' - [{code: '', message: '', source: ''}]
 *     'ignored' - true if the record was filtered/skipped, false otherwise.
 *     'id' - the id from the import application response.
 *     '_json' - the complete response data from the import application.
 *     'dataURI' - if possible, a URI for the data in the import application (populated
 *                 only for errored records).
 *   '_importId' - the _importId currently running.
 *   '_connectionId' - the _connectionId currently running.
 *   '_flowId' - the _flowId currently running.
 *   '_integrationId' - the _integrationId currently running.
 *   'settings' - all custom settings in scope for the import currently running.
 *
 * The function needs to return the responseData array provided by options.responseData.
 * The length of the responseData array MUST remain unchanged. Elements within the
 * responseData array can be modified to enhance error messages, modify the complete
 * _json response data, etc.
 * Throwing an exception will fail the entire page of records.
 */
import { exports } from 'integrator-api'

function postSubmit (options) {
  // Collect only the responses that were not successful
  const failedResponses = options.responseData.filter(
    (response) => response.statusCode !== 200
  )

  if (failedResponses.length > 0) {
    // Send the failed response data to a listener in a real-time flow
    exports.run({ _id: '••••••••••••••••', listenerData: failedResponses })
  }

  // Return the responseData array unchanged so the page is processed normally
  return options.responseData
}
(Image: a real-time listener flow with branched destinations to upload the error and send a message to the team channel)
It is important to note that the response data will not contain an error ID that you can use to manage the error automatically when attempting to retry or resolve the error.
Limit requests to comply with rate limit policies
Rate-limit responses are common when an integration exceeds the maximum number of times that it can invoke an application’s endpoints within a certain period.
From the 2023.9.1 release onwards, connections include a new Auto-recover rate limit errors setting that automatically retries rate limit errors with an increasing delay. The setting intelligently scales up or down the concurrency level so that your API requests are successfully received by the endpoint. For new connections, it is enabled by default; for existing connections, you must manually enable this setting.
If you have not enabled the Auto-recover rate limit errors setting in your connection, you can avoid rate limit errors by managing data throughput through the connection’s settings:
- Reduce the Concurrency level to 1.
- Retry the errors.
- If you’re still getting rate limit errors for an HTTP-based connection, increase the Wait time between HTTP requests.
See also Optimize multiple connections in your Integration App flows to improve throughput.
Identify flow changes in the audit logs
When you suspect that an unauthorized or unintentional change introduced an error to a custom flow that had been running without any issues, check the audit logs to identify who modified what, where, and when.
Recover revisions with Integration Lifecycle Management
You can create backups of your working custom integrations. Then, if you run into any issues, you can always compare the changes and revert to a specific revision, if needed.
If you have Sandbox access, you can develop your custom integrations in that environment, test any changes, and then pull those stable changes into your production integration.