Hi all!
I want to load data from a JSON file that I previously uploaded to a MinIO bucket. I added a data source, the file ("bigdata.json") is displayed, and its contents are previewed correctly.
I created the table with this schema:
CREATE TABLE nessie.test_data1 (id INT, firstName VARCHAR, lastName VARCHAR, gender VARCHAR, address VARCHAR, city VARCHAR, phone VARCHAR, email VARCHAR, status VARCHAR, createdDate TIMESTAMP);
But when I run the command:
COPY INTO nessie."test_data1"
FROM '@sampledata/bigdata.json'
FILE_FORMAT 'json' (TIMESTAMP_FORMAT 'YYYY-MM-DD"T"HH24:MI:SS.FFF');
I get an error in the UI:
Dremio attempted to unwrap a toplevel list in your document. However, it appears that there is trailing content after this top level list. Dremio only supports querying a set of distinct maps or a single json array with multiple inner maps.
My JSON data looks like this:
[
  {
    "id": 1,
    "firstName": "Gonzalo",
    "lastName": "Cassin",
    "gender": "male",
    "address": "04355 Grady Summit",
    "city": "Mayertborough",
    "phone": "501-294-290",
    "email": "Freda.Mitchell@yahoo.com",
    "status": "Available",
    "createdDate": "2020-01-01T11:18:51.591"
  },
  {
    "id": 2,
    "firstName": "Alexie",
    "lastName": "Fisher",
    "gender": "female",
    "address": "7363 Murphy Run",
    "city": "East Jonathon",
    "phone": "501-668-320",
    "email": "Harrison55@gmail.com",
    "status": "Offline",
    "createdDate": "2020-01-01T11:19:51.591"
  },
  .... billions of items in one array
]
The total size of the JSON file is 2.7 GB.
The structure of my JSON file meets the requirement stated in the error: "a single json array with multiple inner maps".
As an experiment, I took a JSON file with exactly the same structure but only a few elements in the array, and that data loaded into the table successfully using exactly the same query.
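In case it helps anyone reproduce this, here is a minimal sketch of how a test file with the same structure can be generated at an arbitrary size (the field values are placeholders, not my real data; the file name and record count are just examples). It streams records to disk one at a time, so it can also produce multi-gigabyte files without holding the whole array in memory:

```python
import json

def make_test_file(path, n):
    """Write a single JSON array of n records matching the bigdata.json schema."""
    with open(path, "w") as f:
        f.write("[\n")
        for i in range(1, n + 1):
            record = {
                "id": i,
                "firstName": "First%d" % i,
                "lastName": "Last%d" % i,
                "gender": "male" if i % 2 else "female",
                "address": "%d Test Street" % i,
                "city": "Testville",
                "phone": "501-000-%03d" % (i % 1000),
                "email": "user%d@example.com" % i,
                "status": "Available",
                "createdDate": "2020-01-01T11:18:51.591",
            }
            f.write(json.dumps(record))
            # Comma after every record except the last, to keep the array valid
            f.write(",\n" if i < n else "\n")
        f.write("]\n")

# e.g. a small file like the one that loaded successfully:
make_test_file("smalldata.json", 5)
```

Bumping the count to millions of records gives a file with the same "single top-level array of maps" shape as my bigdata.json.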
BUT why can't I load a large file into the Iceberg table? Can Dremio really not handle big data?!