Skip to main content

BigQuery Batch Import

Comments

16 comments

  • Kate Larson Community moderator Principal Technical Writer
    Celigo University Level 3: Master

    Thanks for posting this to the community, Glenn Prince. I'm checking with some of our internal experts and will get back to you with any recommendations they have. 

    0
  • Glenn Prince
    Engaged

    Thanks Kate :-)

    -1
  • Srilekha Tirumala Software Analyst

    Hi Glenn Prince, 
    We are trying to reproduce this from our end. 

    From the data that you've provided above, I can suggest some changes to your request body, 

    If you have "batch_of_records" array in your export data, please give the Resource path as "batch_of_records" in your export. Once you give the resource path, you need to give the request body in the BigQuery import as follows, 

    {
    "skipInvalidRows": true,
    "ignoreUnknownValues": true,
    "rows": [
    {
    "json": {
    "internal_id": "{{record.internal_id}}",
    "display_name": "{{record.display_name}}",
    "description_featured": "{{record.description_featured}}",
    "modified": "{{record.modified}}",
    "location": "{{record.location}}"
    }
    }
    ]
    }

    Also, can you please turn on your debugger while running the flow and send us the debug data? So that we can inspect further. 
    It will be helpful if you can provide some sample request and response data. 

    Thank you! 

    0
  • Glenn Prince
    Engaged

    Hi Srilekha,

    I've tried making these updates and get similar results. I've turned on debugging and got some logs, is there a preferred way to send them to you ?

    Glenn

    0
  • Srilekha Tirumala Software Analyst

    Hi Glenn,

    You can send them here if it is fine for you. You don't have to send the whole debug data though. Also, how does your export side data look like? 
    Maybe a sample request body and response data that you're getting in the debug and the error message. This would help us analyze and debug the issue.


    Thanks! 

    0
  • Glenn Prince
    Engaged

    Debug Log looks like it repeats the following pretty much:

    2021-05-03T16:21:06.085Z 7f8ff1ee76974cc985c4156f7147f063 import 5fdbbc24d611fd2d757e593a
    {"url":"https://bigquery.googleapis.com/bigquery/v2/projects/adswerve-datatest-dev/datasets/netsuitetest/tables/Item/insertAll","method":"POST","body":"{\"skipInvalidRows\":true,\"ignoreUnknownValues\":true,\"rows\":[{\"json\":{\"internal_id\":\"8\",\"display_name\":\"\",\"description_featured\":\"\",\"modified\":\"12/10/2020 4:30 pm\",\"location\":\"\"}},{\"json\":{\"internal_id\":\"8\",\"display_name\":\"\",\"description_featured\":\"\",\"modified\":\"12/10/2020 4:30 pm\",\"location\":\"\"}},{\"json\":{\"internal_id\":\"8\",\"display_name\":\"\",\"description_featured\":\"\",\"modified\":\"12/10/2020 4:30 pm\",\"location\":\"\"}},{\"json\":{\"internal_id\":\"9\",\"display_name\":\"\",\"description_featured\":\"\",\"modified\":\"12/10/2020 4:26 pm\",\"location\":\"\"}},{\"json\":{\"internal_id\":\"9\",\"display_name\":\"\",\"description_featured\":\"\",\"modified\":\"12/10/2020 4:26 pm\",\"location\":\"\"}},{\"json\":{\"internal_id\":\"9\",\"display_name\":\"\",\"description_featured\":\"\",\"modified\":\"12/10/2020 4:26 pm\",\"location\":\"\"}},{\"json\":{\"internal_id\":\"11\",\"display_name\":\"\",\"description_featured\":\"\",\"modified\":\"12/10/2020 4:30 pm\",\"location\":\"\"}},{\"json\":{\"internal_id\":\"11\",\"display_name\":\"\",\"description_featured\":\"\",\"modified\":\"12/10/2020 4:30 pm\",\"location\":\"\"}},{\"json\":{\"internal_id\":\"11\",\"display_name\":\"\",\"description_featured\":\"\",\"modified\":\"12/10/2020 4:30 pm\",\"location\":\"\"}},{\"json\":{\"internal_id\":\"14\",\"display_name\":\"\",\"description_featured\":\"\",\"modified\":\"12/10/2020 4:26 pm\",\"location\":\"\"}},{\"json\":{\"internal_id\":\"14\",\"display_name\":\"\",\"description_featured\":\"\",\"modified\":\"12/10/2020 4:26 pm\",\"location\":\"\"}},{\"json\":{\"internal_id\":\"14\",\"display_name\":\"\",\"description_featured\":\"\",\"modified\":\"12/10/2020 4:26 pm\",\"location\":\"\"}},{\"json\":{\"internal_id\":\"16\",\"display_name\":\"\",\"description_featured\":\"\",\"modified\":\"12/10/2020 4:30 pm\",\"location\":\"\"}},{\"json\":{\"internal_id\":\"16\",\"display_name\":\"\",\"description_featured\":\"\",\"modified\":\"12/10/2020 4:30 pm\",\"location\":\"\"}},{\"json\":{\"internal_id\":\"16\",\"display_name\":\"\",\"description_featured\":\"\",\"modified\":\"12/10/2020 4:30 pm\",\"location\":\"\"}},{\"json\":{\"internal_id\":\"17\",\"display_name\":\"\",\"description_featured\":\"\",\"modified\":\"12/10/2020 4:30 pm\",\"location\":\"\"}},{\"json\":{\"internal_id\":\"17\",\"display_name\":\"\",\"description_featured\":\"\",\"modified\":\"12/10/2020 4:30 pm\",\"location\":\"\"}},{\"json\":{\"internal_id\":\"17\",\"display_name\":\"\",\"description_featured\":\"\",\"modified\":\"12/10/2020 4:30 pm\",\"location\":\"\"}},{\"json\":{\"internal_id\":\"19\",\"display_name\":\"\",\"description_featured\":\"\",\"modified\":\"02/02/2021 5:42 pm\",\"location\":\"\"}},{\"json\":{\"internal_id\":\"19\",\"display_name\":\"\",\"description_featured\":\"\",\"modified\":\"02/02/2021 5:42 pm\",\"location\":\"\"}}]}","headers":{"Authorization":"Bearer ********","content-type":"application/json","accept":"application/json"},"requestIndex":0}


    2021-05-03T16:21:06.234Z 7f8ff1ee76974cc985c4156f7147f063 import 5fdbbc24d611fd2d757e593a
    {"headers":{"content-type":"application/json; charset=UTF-8","vary":"X-Origin, Referer, Origin,Accept-Encoding","date":"Mon, 03 May 2021 16:21:06 GMT","server":"ESF","cache-control":"private","x-xss-protection":"0","x-frame-options":"SAMEORIGIN","alt-svc":"h3-29=\":443\"; ma=2592000,h3-T051=\":443\"; ma=2592000,h3-Q050=\":443\"; ma=2592000,h3-Q046=\":443\"; ma=2592000,h3-Q043=\":443\"; ma=2592000,quic=\":443\"; ma=2592000; v=\"46,43\"","accept-ranges":"none","connection":"close","transfer-encoding":"chunked"},"body":"{\n \"kind\": \"bigquery#tableDataInsertAllResponse\"\n}\n"}

    Preview Input:

    {
    "page_of_records": [
    {
    "record": {
    "internal_id": "8",
    "display_name": "",
    "description_featured": "",
    "modified": "12/10/2020 4:30 pm",
    "location": "",
    }
    }
    ]
    }

    Preview Ouput (Your handlebar code):

    {
    "skipInvalidRows": true,
    "ignoreUnknownValues": true,
    "rows": [
    {
    "json": {
    "internal_id": "",
    "display_name": "",
    "description_featured": "",
    "modified": "",
    "location": ""
    }
    }
    ]
    }

    Preview Output (using the if/else to structure the JSON):

    {
    "skipInvalidRows": true,
    "ignoreUnknownValues": true,
    "rows": [
    {
    "json": {
    "internal_id": "8",
    "display_name": "",
    "description_featured": "",
    "modified": "12/10/2020 4:30 pm",
    "location": ""
    }
    },
    {
    "json": {
    "internal_id": "8",
    "display_name": "",
    "description_featured": "",
    "modified": "12/10/2020 4:30 pm",
    "location": ""
    }
    },
    {
    "json": {
    "internal_id": "8",
    "display_name": "",
    "description_featured": "",
    "modified": "12/10/2020 4:30 pm",
    "location": ""
    }
    },
    {
    "json": {
    "internal_id": "8",
    "display_name": "",
    "description_featured": "",
    "modified": "12/10/2020 4:30 pm",
    "location": ""
    }
    },
    {
    "json": {
    "internal_id": "8",
    "display_name": "",
    "description_featured": "",
    "modified": "12/10/2020 4:30 pm",
    "location": ""
    }
    }
    ]

     

     

    0
  • Srilekha Tirumala Software Analyst

    Hi Glenn,

    Thanks for the details!
    We will look into it. 

    Are you getting any specific error message in the dashboard/debug logs? or is it a generic one?
    Please send the error details as well. 

    Thank you! 

    0
  • Glenn Prince
    Engaged

    Hi Srilekha,

    No, I just get the generic error: Processing of submitResponse did not return same record count as the current batch size.

    Glenn

    0
  • Srilekha Tirumala Software Analyst

    Hi Glenn,

    Can you please try the following:

    1. In your import, set the Batch size to 1.
    2. Change your body template to the following format and provide the resource path in the export if any: 

    {
    "skipInvalidRows": true,
    "ignoreUnknownValues": true,
    "rows": [
    {
    "json": {
    "internal_id": "{{record.internal_id}}",
    "display_name": "{{record.display_name}}",
    "description_featured": "{{record.description_featured}}",
    "modified": "{{record.modified}}",
    "location": "{{record.location}}"
    }
    }
    ]
    }

    Once you try this, let us know if the same issue persists or if you face any other issues.

    Thank you! 

    0
  • Glenn Prince
    Engaged

    Hi Srilekha,

    When you set the batch size to 1 this works perfectly fine, but I am trying to import hundreds of thousands of rows sometimes so being able to batch them would give me a lot better performance. My assumption is that BigQuery probably doesn't support batching at this point, I just wanted to make sure.

    Glenn

    0
  • Srilekha Tirumala Software Analyst

    Hi Glenn,

    In that case, can you send us the response data when:

    1. the records are successfully imported into BigQuery
    2. the records are failing 

    This would help us understand your issue better.
    And are you providing any resource ID path, success path while saving the batch import? 

    Also, as you said, "BigQuery probably doesn't support batching at this point" did you have any luck trying on any other apps like postman? 

    0
  • Glenn Prince
    Engaged

    Hi Srileka,

    When I do one record at a time I get a success and the records are inserted. When I try and do a batch I get the error in the UI "Processing of submitResponse did not return same record count as the current batch size"

    When I use Postman with the output above I get the following 200 OK response and the five records are inserted:

    {
    "kind": "bigquery#tableDataInsertAllResponse"
    }

    This matches the documentation (https://cloud.google.com/bigquery/docs/reference/rest/v2/tabledata/insertAll) response structure:

    {
      "kind": string,
      "insertErrors": [
        {
          "index": integer,
          "errors": [
            {
              object (ErrorProto) 
    }
    ]
    }
    ]
    }

    Glenn

    0
  • Srilekha Tirumala Software Analyst

    Hi Glenn, 

    Sorry for the delayed response. 

    Upon checking internally, we found out that this will be an enhancement to our product. Thank you for bringing this to our notice and helping us enhance our product. It is due to the response format that we are getting from BigQuery. 

    As of now, to insert data into BigQuery table, we need to set batch size as 1. 

    Please let us know if you have any other concerns. 

    Thank you! 


    0
  • Jon Adams

    Hi, is there an update on this enhancement?

    0
  • Bhavik Shah Senior Product Manager
    Answer Pro
    Great Answer

    Hi Jon Adams, we recently updated BigQuery connector which supports bulk insert. Here's a help article for your reference : 

     

    https://docs.celigo.com/hc/en-us/articles/6923178410523-Import-data-into-Google-BigQuery

    0
  • Scott Henderson CTO
    Answer Pro
    Top Contributor

    Hey Jon, to add a little more color to Bhavik's post, we actually launched a brand new BigQuery connector underneath our database connector family. Our original REST API BigQuery connector is still there, but we renamed it to "Google BigQuery (REST API)". The new connector should make it much easier to do traditional database/data warehouse use cases like bulk inserts. One key difference worth highlighting is that you will now use "field mappings" to control how your source records map to your destination table schemas. In case it helps, we use Snowflake internally at Celigo, which follows many of the same patterns, and below are some pics from one of my Snowflake bulk inserts. The last pic is where i map to specific column names in Snowflake. Hope his helps!

     

    0

Please sign in to leave a comment.