
Best practices to manage errors – part 1

Most runtime errors stem from the quality of the data you're importing. Keeping your data clean, accurate, and current is essential to the success of your business automations.

“No error left behind”

With a better understanding of the Celigo platform’s error management tools and the principles of effective error resolution, you’ll quickly build your skills to leverage the power of integrator.io, improve data hygiene, and avoid compounding problems. It will always take continuous monitoring and diligence to some degree, but practicing these concepts will get you much closer to the goal of seamless automation.

How the Celigo platform streamlines error management

The Celigo platform helps reduce your efforts by intelligently recovering behind the scenes, where possible, and reducing the number of open errors that you need to review, chiefly through the following features:

  • Auto-resolving duplicate errors

    When the Auto-resolve feature is enabled for a flow, the platform monitors how an error is processed in the current run, then finds records from previous runs identified by the same trace key and automatically resolves them for you.

    Example 1. When the error data has been corrected in an export

    An error was returned by the source application. After you correct the data coming from the source application in the export and rerun the flow, the flow reports no error for the record in the current run, and previous (duplicate) errors for records with the same trace key are auto-resolved (that is, they are moved to the Resolved errors tab).


    Example 2. When the error still exists

    Your integration reports errors for the same record again and again – in a single run or across multiple scheduled or real-time flow runs. The platform displays only the most recent error for the record so that you can review and fix it, and previous (duplicate) errors for records with the same trace key are auto-resolved. Your view of the open errors is much cleaner and easier to use as a result, but you can still view the historical “duplicate” error resolutions in the Resolved errors tab.

  • Auto-retrying intermittent errors

    When integrator.io classifies an error as Intermittent, up to four auto-retry attempts occur if Auto-resolve is enabled for the flow.

    Intermittent errors – responses that are not necessarily repeatable, where the third-party API recovers successfully without any changes to the request – are more common than most people assume. In fact, a random audit for the week of Sept. 18, 2022, revealed that the platform auto-resolved and auto-retried 4.3 million errors, saving customers untold time and frustration.

  • Auto-detecting a connection that's back online and resuming the flow

    When Auto-resolve is enabled and the platform receives offline errors classified as Connection:

    1. The platform pauses the flow.

    2. It periodically checks whether the Connection error has been fixed and the connection restored. The platform pings the connection automatically at exponentially increasing intervals of 1, 2, 4, 8, 16, and 32 hours.

    3. If the connection is restored, the platform automatically detects it and resumes the flow.

    When the flow is back up and running, you can retry any errors received before or after the flow was paused.
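
As an illustration only, the duplicate handling described under "Auto-resolving duplicate errors" (Example 2) can be modeled as grouping errors by trace key and leaving only the newest one open. The record shape (`traceKey`, `occurredAt`, `message`) and the function name `splitOpenAndResolved` are assumptions made for this sketch; it is not the platform's internal implementation.

```javascript
// Hypothetical model: keep the most recent error per trace key open and
// treat every older error with the same trace key as auto-resolved.
// Field names (traceKey, occurredAt) are assumptions for illustration.
function splitOpenAndResolved(errors) {
  const newestByKey = new Map();
  for (const err of errors) {
    const current = newestByKey.get(err.traceKey);
    if (!current || err.occurredAt > current.occurredAt) {
      newestByKey.set(err.traceKey, err);
    }
  }

  const open = [];
  const resolved = [];
  for (const err of errors) {
    if (newestByKey.get(err.traceKey) === err) {
      open.push(err);
    } else {
      resolved.push(err);
    }
  }
  return { open, resolved };
}

const sampleErrors = [
  { traceKey: "order-1001", occurredAt: 1, message: "Invalid email" },
  { traceKey: "order-1001", occurredAt: 2, message: "Invalid email" },
  { traceKey: "order-2002", occurredAt: 2, message: "Missing SKU" },
];
const split = splitOpenAndResolved(sampleErrors);
console.log("open:", split.open.length, "resolved:", split.resolved.length);
```

In this sample, the older "order-1001" error ends up in the resolved list, while the newest error for each trace key stays open for review.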
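
The connection check schedule of 1, 2, 4, 8, 16, and 32 hours is a doubling sequence. The short sketch below reproduces it and, assuming each value is the wait before the next check (the article does not say so explicitly), adds up the elapsed time. The function names are hypothetical.

```javascript
// Doubling schedule: each wait is twice the previous one.
function backoffHours(checks, firstWaitHours = 1) {
  return Array.from({ length: checks }, (_, i) => firstWaitHours * 2 ** i);
}

// Running total of hours elapsed, ASSUMING each value is the wait that
// precedes the next connection check.
function cumulativeHours(waits) {
  let total = 0;
  return waits.map((w) => (total += w));
}

const waits = backoffHours(6);
console.log(waits.join(", "));                  // 1, 2, 4, 8, 16, 32
console.log(cumulativeHours(waits).join(", ")); // 1, 3, 7, 15, 31, 63
```

Under that assumption, the sixth check would occur 63 hours after the flow was paused.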

Take full advantage of industry-leading iPaaS features

The current integrator.io error management system was designed to give anyone monitoring a flow the most meaningful and actionable error information without writing a single line of code, whether they are interacting with the platform or reviewing email notifications. Simply put, Celigo error management capabilities are vastly superior to any other integration/automation platform on the market. They’ve been proven across a massive number of flows, records, teams, applications, and so on.

Furthermore, the error management functionality has benefited from a wealth of input from customers and integration engineers, as well as from a strong commitment to enhancing the customer experience. Let’s look at some of the ways you can easily manage integration errors and examine their causes:

  • The account dashboard gives you a quick view of in-progress flows and results of completed flows. It’s especially useful if you have to monitor multiple integrations.

  • Flow Builder shows you the current number of open errors for each flow step, and the Run console shows you the result of the latest flow run.

  • Flow analytics graphs allow you to track exactly how long a flow step takes to process. At a high level, you can monitor spikes in data and error activity and identify anomalies, such as an unusually high number of records causing a slower runtime.

  • Email notifications alert you to errors, resolved errors, and offline connections. You can subscribe yourself and other users of your account.

  • Integration monitor roles give coworkers access to monitor, review, and resolve errors – without having to worry about whether these users will have permissions to change the integration setup.

The platform also gives you complete diagnostic information for each error so that you can recognize it at a glance and drill down when troubleshooting:

  • Complete error retry data

  • Contextual request and response history for HTTP-based, NetSuite, and Salesforce connections

  • Standard error fields: error message, error code, error source, error ID, trace key, timestamp, and a classification to clarify where the error originates from:

    • Classification values such as “Missing,” “Intermittent,” “Duplicate,” and “Rate limit” indicate that the error originates from external applications (outside integrator.io), and the “Source” value holds the application name

    • The “Source” field can also reveal when the error happens due to failure in integrator.io components (such as a transformation, hook, or filter)

    • A “Connection” classification indicates errors caused by a bad connection, and the “Source” field reveals the application name
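
To make the classification rules above concrete, here is a minimal sketch that reads an error's classification and source and reports where the error likely originated. The property names (`classification`, `source`), the lowercase comparison, and the component names used as source values are all assumptions for illustration; check the actual error fields in your account before relying on them.

```javascript
// Classification values that point to an external application (assumed spellings).
const EXTERNAL_CLASSIFICATIONS = new Set(["missing", "intermittent", "duplicate", "rate limit"]);
// Source values that point to an integrator.io component (assumed spellings).
const INTERNAL_SOURCES = new Set(["transformation", "hook", "filter"]);

function describeOrigin(error) {
  const classification = String(error.classification || "").toLowerCase();
  const source = String(error.source || "").toLowerCase();

  if (INTERNAL_SOURCES.has(source)) {
    return `integrator.io component: ${source}`;
  }
  if (classification === "connection") {
    return `connection to ${error.source}`;
  }
  if (EXTERNAL_CLASSIFICATIONS.has(classification)) {
    return `external application: ${error.source}`;
  }
  return "unclassified";
}

console.log(describeOrigin({ classification: "Rate limit", source: "Shopify" }));
console.log(describeOrigin({ classification: "Connection", source: "NetSuite" }));
console.log(describeOrigin({ classification: "", source: "hook" }));
```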

Error classifications

Tip

You can add additional, unique information to a record to help you identify it in the case of an error. For example, you might want to produce a list of records that you want to fix in the source system. (There’s a convenient Download errors button in the error view that gives you the option to generate a local file with errors based on a selected date range.)

Tip

  • In a custom flow (as opposed to an Integration app flow), you can enrich the error ID info by overriding the trace key to include additional attributes, such as internal ID + vendor name. When you examine the error details, the Trace key field shows the custom value you specified.

  • You can improve the error message by using the Data URI template setting, which results in a user-friendly link to the record in the source system.
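
The trace key override and the Data URI template are configured in the flow step settings, not in code. The plain JavaScript below only illustrates the kind of values those settings might produce: a composite key from an internal ID and a vendor name, and a link back to the source record. The field names (`internalId`, `vendorName`), the `|` separator, and the example base URL are hypothetical.

```javascript
// Builds a composite trace key such as "4711|Acme Supplies".
// Throws if a required part is missing so that gaps are noticed early.
function buildTraceKey(record) {
  const parts = [record.internalId, record.vendorName];
  if (parts.some((p) => p === undefined || p === null || p === "")) {
    throw new Error("record is missing a field needed for the trace key");
  }
  return parts.join("|");
}

// Builds a link to the record in the source system (base URL is a placeholder).
function buildRecordLink(baseUrl, record) {
  return `${baseUrl}/${encodeURIComponent(record.internalId)}`;
}

const vendorBill = { internalId: 4711, vendorName: "Acme Supplies" };
console.log(buildTraceKey(vendorBill));
console.log(buildRecordLink("https://example.com/records", vendorBill));
```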


Analyze errors to fix the root cause – not just resolve individually

For any outstanding error that the platform cannot auto-resolve, it’s definitely worth your time to perform a root-cause analysis (RCA). That is, don’t just focus on resolving one error. It’s more important to fix the root cause of the error and understand what went wrong between two or more systems and how the logic in your flow may have contributed to the error. The goal is to prevent the same or similar errors in the next flow run and maintain truly synced applications.

Understandably, at times you will need an order record to be imported as soon as possible. However, even if you use the retry data to fix specific urgent errors, you should still analyze the root cause and fix it to improve your integration.

Important

It is imperative that your company’s IT work together with departmental application admins to manage and monitor the integrations. At a minimum, IT personnel and the admins of the applications being automated must have sufficient access to the integrator.io account. The primary responsibility for the RCA and fixing errors should belong to the departmental application admins, who are familiar with the applications and the data. Then, IT should be an escalation resource when the problem persists.

Tools and tips for analyzing specific types of errors

The Celigo platform contains several features for inspecting data throughout the integration, which are useful for handling errors and viewing the expected and actual outcomes of a flow run.

Tip

Most errors result from changes in the source and destination applications, whether introduced by an update or by someone in your organization. While the platform’s features will help you identify and mitigate errors, implementing a company-wide change management process for these external systems will help you avoid receiving errors and pinpoint the unforeseen causes more efficiently.

  • Advanced field editors (AFEs)

    AFEs, with their easy-to-use input, output, and console panes for testing statements, help you quickly troubleshoot data issues throughout a flow. When performing an error RCA, your first step should be to copy the retry data into the AFE input.

  • Debug logs

    Debug logs contain detailed traces of integration activity for HTTP-based connectors and webhook listeners. Enable debugging to track requests and responses between integrator.io and the flow step application. The logs help you troubleshoot by showing the errors that caused a particular record to fail. You can also compare the requests and responses of successful and failed calls.

  • Runtime script debugging

    You can write statements to script logs to inspect the shape and content of the data flowing through any step in a flow. Add these console statements anywhere JavaScript export and import hooks are available.

  • Import preview panels

    Test API calls to HTTP-based connectors, directly in the platform, with mock data in the import Preview panel. The import preview allows you to see the constructed request and a sample record that represents what integrator.io will send to the destination application. You can then send the sample record to generate a response from the destination application and use the response data as a reference when troubleshooting.

  • Get the file name for error handling (FTP exports)

    For FTP exports, you can get the file name in an error message. The file name is helpful if you have multiple exports in a flow with similar data files. You can search for error messages based on the file name and then troubleshoot errors.

  • Automatically ignore an error (custom script)

    There are scenarios where you might want to insert a record into an application only if the record does not exist. However, the application doesn’t offer an API to perform the search. In this case, if you receive a duplicate record error (the actual message varies by application) when you insert the record, a hook can remove that error from your flow.

    You can write a custom postSubmit hook script to ignore an error with specific wording in its message. (Hooks apply only to custom flows; they are unavailable for Integration apps.)

  • Send errors to an integrator.io listener (custom script)

    In rare scenarios, you might want to extract the information from the error retry data for further research. In this case, a hook can get the error response, but note that the response data will not contain an error ID.

    If you're working with a custom flow, you can write a postSubmit hook to send the payload info to a listener in a real-time flow.

  • Limit requests to comply with rate limit policies

    Rate-limit responses are common when an integration exceeds the maximum number of times that it can invoke an application’s endpoints within a certain period.

    From the 2023.9.1 release onwards, connections include a new Auto-recover rate limit errors setting that automatically retries rate limit errors with an increasing delay. The setting intelligently scales the concurrency level up or down so that your API requests are successfully received by the endpoint. For new connections, it is enabled by default; for existing connections, you must manually enable this setting.

    If you have not enabled the Auto-recover rate limit errors setting in your connection, then to avoid rate limit errors, you can manage data throughput by adjusting the connection’s settings.

    1. Reduce the Concurrency level to 1.

    2. Retry the errors.

    3. If you’re still getting rate limit errors for an HTTP-based connection, increase the Wait time between HTTP requests.


    See also Optimize multiple connections in your Integration App flows to improve throughput.

  • Identify flow changes in the audit logs

    When you suspect that an unauthorized or unintentional change introduced an error to a custom flow that had been running without any issues, check the audit logs to identify who modified what, where, and when.

  • Recover revisions with Integration Lifecycle Management

    You can create backups of your working custom integrations. Then, if you run into any issues, you can always compare the changes and revert to a specific revision, if needed. If you have Sandbox access, you can develop your custom integrations in that environment, test any changes, and then pull those stable changes into your production integration.
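
For the runtime script debugging item above, a hook that writes the shape of the incoming data to the script logs might look like the following sketch. It assumes a preSavePage-style hook that receives the exported records in `options.data` and returns an object containing `data`, `errors`, and `abort`; confirm the exact hook signature in the platform's hook documentation before using it.

```javascript
// Sketch of a preSavePage-style hook (assumed signature) that logs the
// number of records and the shape of the first record, then passes the
// page through unchanged.
function preSavePage(options) {
  const records = options.data || [];

  console.log("records in page:", records.length);
  if (records.length > 0) {
    console.log("first record keys:", Object.keys(records[0]).join(", "));
    console.log("first record:", JSON.stringify(records[0]));
  }

  return {
    data: records,
    errors: options.errors || [],
    abort: false,
  };
}
```

Logging only the first record keeps the script log readable on large pages. Remove or reduce the statements once the issue is understood.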
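
For the "Automatically ignore an error" item above, a postSubmit hook along the following lines could mark a failed record as ignored when every error on it contains specific wording. This is a sketch under assumptions: it presumes that `options.responseData` is an array of objects with `statusCode`, `errors` (each with a `message`), and `ignored` properties, and that the hook returns that array. The matched text "duplicate record" is a placeholder, because the actual message varies by application.

```javascript
// Placeholder wording to match; replace with the text your application returns.
const IGNORABLE_TEXT = "duplicate record";

// Sketch of a postSubmit hook (assumed signature and response shape).
function postSubmit(options) {
  const responses = options.responseData || [];

  for (const response of responses) {
    const errors = response.errors || [];
    if (errors.length === 0) continue;

    // Ignore the record only if ALL of its errors match the wording,
    // so that unrelated failures are still reported.
    const allIgnorable = errors.every((e) =>
      String(e.message || "").toLowerCase().includes(IGNORABLE_TEXT)
    );

    if (allIgnorable) {
      response.statusCode = 200;
      response.ignored = true;
      delete response.errors;
    }
  }

  return responses;
}
```

Matching on message wording is fragile if the application changes its error text, so keep the match string as specific as you can and revisit it after application updates.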
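
The rate limit guidance above (a concurrency level of 1, plus retries with an increasing delay) can be pictured with the generic sketch below. It is not the platform's implementation: `sendOne` is a hypothetical caller-supplied function that resolves to an object with a numeric `status`, HTTP 429 is used as the rate-limit signal, and the default retry count and base delay are arbitrary.

```javascript
function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Sends one record and retries on HTTP 429, doubling the wait each time.
async function sendWithBackoff(sendOne, record, options = {}) {
  const { maxRetries = 4, baseDelayMs = 1000, sleep = defaultSleep } = options;
  for (let attempt = 0; ; attempt++) {
    const response = await sendOne(record);
    if (response.status !== 429 || attempt >= maxRetries) {
      return response;
    }
    await sleep(baseDelayMs * 2 ** attempt);
  }
}

// Processes records one at a time, which corresponds to a concurrency level of 1.
async function sendAll(sendOne, records, options) {
  const results = [];
  for (const record of records) {
    results.push(await sendWithBackoff(sendOne, record, options));
  }
  return results;
}
```

Passing a custom `sleep` function makes the sketch easy to exercise without real waiting.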
