Reading a CSV From AWS S3
I am trying to dynamically read a CSV that I have on S3 to do a lookup, but I can't seem to get it to work. I'm not sure if I don't have the path quite right or if I'm off somewhere else. I loaded a test CSV, and that seems to be the only thing that's being used. For example, I uploaded a new spreadsheet to S3 this morning with a “Test” row, and it doesn't show up in the preview. Is the preview only using the CSV from the Sample File? How can I ensure that when the flow runs it's actually pulling the CSV from the S3 bucket?


Comments
Dave Guderian the sample CSV is what's used for the preview; we don't fetch actual files from your directory for previews. I would test by just running the flow. The run will not use the sample file uploaded to our UI.
That's good to know. Thanks Tyler Lamparter.
Is my file path in Celigo correct based on the following? I am wondering if I don't have the right path.
Dave Guderian I'm not sure if a full file path works there, but you can try a run and see. I typically go to Advanced settings and toggle “leave file on server” so that I can keep testing without having to put the file back in the bucket. You could also use the file filter options below the prefix, because for S3 the filename includes the prefix along with the actual file name.
Hmm, I didn't know that about the “leave file on server” setting. So, are you saying that if I don't have “leave file on server” checked, then I would need to reload the file for each flow run? I definitely do not want to be doing that.
So, if I check the “leave file on server” box, what other info do I need to define in the lookup (i.e. do I still need to set my bucket name and prefix like I had before)?
Thanks again Tyler Lamparter
Dave Guderian by default, we remove the file from the bucket after pulling it, so it no longer exists there. The reason is that most people don't want to process a file twice. You can either choose “leave file on server” so that we don't delete it after pulling it, or choose a backup path so it gets moved somewhere else, like an archive. I'm not sure of your use case, so it's up to you. When testing, I leave the file on the server so that I can run the flow over and over without having to add the file back to the bucket each time.
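To make the three post-download options concrete, here's a toy sketch with an in-memory “bucket” (just a dict of key → contents). This is only an illustration of the behavior described above, not Celigo's actual implementation, and the `backup_prefix` name is made up:

```python
# Toy illustration of the three post-download behaviors: delete (default),
# leave on server, or move to a backup path. Not Celigo code; the bucket
# is simulated as a dict of key -> file contents.

def post_process(bucket, key, mode, backup_prefix="archive/"):
    """Apply the chosen post-download behavior to one file."""
    if mode == "delete":      # default: file is removed after pulling
        del bucket[key]
    elif mode == "leave":     # "leave file on server": nothing happens
        pass
    elif mode == "backup":    # moved somewhere else, e.g. an archive
        bucket[backup_prefix + key.rsplit("/", 1)[-1]] = bucket.pop(key)

bucket = {"exports/orders.csv": "id,amount\n1,10\n"}
post_process(bucket, "exports/orders.csv", "backup")
print(sorted(bucket))  # ['archive/orders.csv']
```

With `"delete"` the key simply disappears, and with `"leave"` the bucket is untouched, which is why the same file keeps getting picked up on repeated test runs.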
The bucket/prefix/file filter rules are all needed to grab only the files you want us to pick up. The bucket name is required. If you specify no prefix or file filter, we would grab every file in the bucket, so you want to filter down to exactly what you want pulled.
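As a rough sketch of that narrowing behavior (plain Python with glob matching via `fnmatch`, not Celigo's actual matching logic), note how an empty prefix plus no filter selects everything, and how the pattern is tested against the full key, which includes the prefix:

```python
# Rough sketch of bucket/prefix/file-filter narrowing. Illustrative only;
# not Celigo's actual implementation.
from fnmatch import fnmatch

def select_keys(keys, prefix="", pattern=None):
    """Return the keys a run would pick up.

    With no prefix and no pattern, every file in the bucket matches.
    The glob is matched against the full key (which includes the
    prefix), not just the trailing file name.
    """
    selected = [k for k in keys if k.startswith(prefix)]
    if pattern:
        selected = [k for k in selected if fnmatch(k, pattern)]
    return selected

keys = ["inbound/orders.csv", "inbound/orders.json", "archive/old.csv"]

print(select_keys(keys))                                      # all three keys
print(select_keys(keys, prefix="inbound/"))                   # both inbound files
print(select_keys(keys, prefix="inbound/", pattern="*.csv"))  # ['inbound/orders.csv']
```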
Thank you Tyler Lamparter. Leaving the file on the server appears to be the piece that I was missing! Thanks as always for the help on this!