Amazon Settlement - 200K lines file

Hey everyone,

I have a customer with a very large Amazon Settlement file (200,000+ transactions, 40 MB file). I have been trying to find ways to optimize the processing of this file (groupings, removing unused fields, etc.), but I always get stuck at the 5 MB limitation.

If you have worked with the Amazon settlement report, you know Amazon provides a bi-weekly CSV (or XML) file, and you simply can't choose a smaller period (daily, for example). So right now I'm stuck with this large file, which Celigo doesn't seem capable of handling.

Does anyone have any ideas on what else I could try in order to process this file? Has anyone faced the same challenge?

I've reached out to support to no avail.

Thanks in advance,


Comments

11 comments
  • Official comment

    Marcelo Borges I was thinking about this some and decided to rework the flow for your use case. It would have been a bit difficult to explain without it being fresh in my mind, so here goes. Here is a fresh flow zip with the updates. What I've done is make 3 flows:

    1. Flow 1 gets the report list and then updates the flow 2 export as usual. However, this time it places an array of report IDs that you need instead of a single one.
    2. Flow 2 is the same, except the handlebars expression referencing the report ID in settings is hard-coded to the first report ID in the list.
    3. Flow 3 runs after flow 2; it updates the flow 2 export settings by removing the used report ID and then triggers flow 2 to run again if there are still report IDs to work through.

    Additionally, I added a few scripts in there so that you don't need to update my hard-coded flow IDs and export IDs; the scripts fetch them dynamically from your own account.
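
    For anyone who wants to see the moving parts without opening the zip, here is roughly what the flow 3 script boils down to. Treat this as a sketch: the setting name reportIds, the placeholder IDs, and the token handling are my illustration rather than exactly what's in the zip, although the integrator.io API endpoints for updating an export and running a flow are real ones.

    ```typescript
    // Sketch of the flow 3 logic, assuming an integrator.io API token in env.
    const IO_API = "https://api.integrator.io/v1";
    const TOKEN = process.env.IO_API_TOKEN!;  // your integrator.io API token
    const FLOW2_ID = "<flow-2 id>";           // hypothetical placeholder
    const EXPORT_ID = "<flow-2 export id>";   // hypothetical placeholder

    async function ioFetch(path: string, init: RequestInit = {}): Promise<any> {
      const res = await fetch(`${IO_API}${path}`, {
        ...init,
        headers: {
          Authorization: `Bearer ${TOKEN}`,
          "Content-Type": "application/json",
          ...(init.headers ?? {}),
        },
      });
      if (!res.ok) throw new Error(`${init.method ?? "GET"} ${path} -> ${res.status}`);
      return res.json();
    }

    // Drop the report ID flow 2 just consumed, save the shortened list back
    // to the export settings, and kick flow 2 off again if any IDs remain.
    async function advanceToNextReport(): Promise<void> {
      const exp = await ioFetch(`/exports/${EXPORT_ID}`);
      const remaining: string[] = (exp.settings?.reportIds ?? []).slice(1);

      await ioFetch(`/exports/${EXPORT_ID}`, {
        method: "PUT",
        body: JSON.stringify({ ...exp, settings: { ...exp.settings, reportIds: remaining } }),
      });

      if (remaining.length > 0) {
        await ioFetch(`/flows/${FLOW2_ID}/run`, { method: "POST", body: "{}" });
      }
    }
    ```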

    Hopefully this helps!


  • I don't have any ideas for how IO could handle such a large file. If I had to work on this project, I would look into importing the file into Google Sheets, or into a database if the customer had one available, and then querying out of there into IO. Another option is manually breaking the file up into smaller chunks, which is an ugly solution, but it would work.
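
    If you do end up chunking, a small script can at least take the manual part out of it. Here's a rough sketch (the chunk size and output naming are arbitrary choices on my part) that splits the CSV while repeating the header row on every chunk:

    ```typescript
    import { createReadStream, createWriteStream } from "node:fs";
    import { createInterface } from "node:readline";

    // Split a large settlement CSV into chunks of `linesPerChunk` data rows,
    // repeating the header line at the top of every chunk.
    async function splitCsv(path: string, linesPerChunk = 25_000): Promise<void> {
      const rl = createInterface({ input: createReadStream(path), crlfDelay: Infinity });
      let header: string | null = null;
      let out: ReturnType<typeof createWriteStream> | null = null;
      let rows = 0;
      let part = 0;

      for await (const line of rl) {
        if (header === null) { header = line; continue; } // first line is the header
        if (out === null || rows >= linesPerChunk) {
          out?.end();
          out = createWriteStream(`${path}.part${++part}.csv`);
          out.write(header + "\n");
          rows = 0;
        }
        out.write(line + "\n");
        rows++;
      }
      out?.end();
    }

    // e.g. splitCsv("settlement.csv").catch(console.error);
    ```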

    I'm interested to see if there is a way to get IO to handle it!

  • If I were talking about a very specific file layout that nobody else uses, I would understand if IO didn't handle it.

    But this is just such a standard file. It's a big customer, with a massive number of transactions, but it's still the very same standard Amazon settlement file that I'm sure hundreds of customers use.

    It would be a pity if IO couldn't handle it. I'm eager to see if anyone else has answers to this. All the manual processing you suggested I could easily replace with functions inside my customer's ERP, saving the customer the hassle of manually manipulating files... but they purchased a Celigo integration, so, you know, expectations are high, lol.

    Tks.

  • Marcelo Borges I'm going to have my team look into this. I see your Support ticket, so no further action needed from you right now.

  • Marcelo Borges I wrote up a pretty good community post on how to do this a couple months ago. Here is a link to it: https://docs.celigo.com/hc/en-us/community/posts/17841276768923-One-to-Many-using-non-standard-api-response-pattern-on-a-CSV.


    In that post, there is a flow zip you can download and then upload into your environment. Once you upload it, you'll need to change the reportType parameter in the export relative URIs, and you'll need to swap out NetSuite for Microsoft BC.
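
    For reference, a settlement-flavored relative URI should look something like the line below. I'm writing the query string from memory against the current SP-API Reports version, so treat it as a starting point and double-check it against what ships in the zip:

    ```
    /reports/2021-06-30/reports?reportTypes=GET_V2_SETTLEMENT_REPORT_DATA_FLAT_FILE_V2
    ```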

  • Marcelo Borges to make the install a bit easier, here is a new zip with Microsoft BC.


    As for the reportType enums to change, here is what I mean:
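
    Assuming you're pulling the standard Amazon settlement reports, the enums are the stock report types (these names come from Amazon's Reports API, not from anything specific to the zip):

    ```
    GET_V2_SETTLEMENT_REPORT_DATA_FLAT_FILE_V2   (CSV / flat file)
    GET_V2_SETTLEMENT_REPORT_DATA_XML            (XML)
    ```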


    Depending on your report type, you may need to change the parser to XML if you're not grabbing a CSV type. This change would be on flow 2.

  • Thanks for all the info on this, Tyler... I have been working on your post for the last 30 minutes, and it looks very, very promising; I'm excited about it. I just need to get the API token to enable an integrator.io connection, so I can try this flow-to-flow integration to get the Amazon Settlement records.

    If I understood the issue correctly, the main problem with how the original poster was setting up their flow (which is similar to how I'm writing mine) is that the page size explodes past the 5 MB limitation, because the lookup step that reads the completed report actually adds ALL records inside the single page returned by the very first export step.

    With the approach you suggest in your post, we are basically updating flow 2 via an API call to hard-code the report ID, and then calling that flow; when it runs, the Amazon Settlement records are split across pages, and we shouldn't hit the 5 MB limitation.
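
    Just to make sure I've got the mechanics straight, I picture flow 2's export relative URI referencing the report ID stored in the settings, something along these lines (the setting name is just my illustration; the zip may name it differently):

    ```
    /reports/2021-06-30/reports/{{settings.reportId}}
    ```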

    This is very tricky, yet a very interesting approach.

    I'll keep you posted on how this all works out, but I'm getting a good feeling about this.

  • Marcelo Borges yeah, that is exactly right. I'd say one caveat with this approach is that it will only handle getting one report at a time, since the report ID is hard-coded within the settings. Because of this, flow 1 is limited to returning one report ID.

    How often do new settlement reports drop for you?

  • I think this customer has 4 bi-weekly files. For some reason, Amazon is generating 2 files for the US Marketplace and 2 files for the Canadian Marketplace.

    Worst-case scenario, could I have 4 "starter" flows? Or could I manually call the "starter" flow 4 times in a row (with different date ranges) and have it queue up the executions...?

  • Tyler, this looks great. I still need to run the tests myself to see how this will run, but so far, I like the idea.

    Now, looking at this from a slightly different angle... Is there ANY way this could possibly be done with async helpers, considering that the reports are already generated? Do you think I could run an async helper in some way to get the data out of the async helper from step 1 and have the reports paginate, just as if we were calling the additional integrations using the "Next Step" feature?

  • Marcelo Borges, an async helper would somewhat work, but it would be limited to one report at a time. The async helper function was built so that IO is the one requesting that a report be made and then polling a result endpoint to see when the report is complete. So you could set it up to get a list of reports and then use the async helper to grab the result, but the result set would have to be hard-coded to just the first report returned in the report-list GET request. Since that's the case, we would still need a flow to grab the list of reports and store the list in settings, and then another flow to remove reports from the list as they are consumed.
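
    For context, this is roughly the request/poll cycle the async helper automates, expressed against the SP-API Reports endpoints. It's a conceptual sketch only, not how IO implements it internally; the poll interval and error handling are arbitrary:

    ```typescript
    // Conceptual sketch of polling one already-generated settlement report.
    const SP_API = "https://sellingpartnerapi-na.amazon.com";

    async function spFetch(path: string, accessToken: string): Promise<any> {
      const res = await fetch(`${SP_API}${path}`, {
        headers: { "x-amz-access-token": accessToken },
      });
      if (!res.ok) throw new Error(`${path} -> ${res.status}`);
      return res.json();
    }

    // Poll a report until Amazon marks it DONE, then return the document ID
    // that actually holds the settlement rows.
    async function waitForReport(
      reportId: string,
      accessToken: string,
      pollIntervalMs = 30_000,
    ): Promise<string> {
      for (;;) {
        const report = await spFetch(`/reports/2021-06-30/reports/${reportId}`, accessToken);
        if (report.processingStatus === "DONE") return report.reportDocumentId as string;
        if (["CANCELLED", "FATAL"].includes(report.processingStatus)) {
          throw new Error(`Report ${reportId} ended in ${report.processingStatus}`);
        }
        await new Promise((resolve) => setTimeout(resolve, pollIntervalMs));
      }
    }
    ```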


    We're working on some things that would make this 5 MB limit a non-issue in cases like this, but I don't have any timeline to provide.

