Hi all!
I want to load data from a JSON file that I previously uploaded to a MinIO bucket. I added a data source, the file ("bigdata.json") is displayed, and its contents are previewed correctly.
I created the table with this schema:
CREATE TABLE nessie.test_data1 (id INT, firstName VARCHAR, lastName VARCHAR, gender VARCHAR, address VARCHAR, city VARCHAR, phone VARCHAR, email VARCHAR, status VARCHAR, createdDate TIMESTAMP);
But when I run the command:
COPY INTO nessie."test_data1"
FROM '@sampledata/bigdata.json'
FILE_FORMAT 'json' (TIMESTAMP_FORMAT 'YYYY-MM-DD"T"HH24:MI:SS.FFF');
I get an error in the UI:
Dremio attempted to unwrap a toplevel list in your document. However, it appears that there is trailing content after this top level list. Dremio only supports querying a set of distinct maps or a single json array with multiple inner maps.
My JSON data looks like this:
[
  {
    "id": 1,
    "firstName": "Gonzalo",
    "lastName": "Cassin",
    "gender": "male",
    "address": "04355 Grady Summit",
    "city": "Mayertborough",
    "phone": "501-294-290",
    "email": "Freda.Mitchell@yahoo.com",
    "status": "Available",
    "createdDate": "2020-01-01T11:18:51.591"
  },
  {
    "id": 2,
    "firstName": "Alexie",
    "lastName": "Fisher",
    "gender": "female",
    "address": "7363 Murphy Run",
    "city": "East Jonathon",
    "phone": "501-668-320",
    "email": "Harrison55@gmail.com",
    "status": "Offline",
    "createdDate": "2020-01-01T11:19:51.591"
  },
  .... billions of items in one array
]
The total size of the JSON file is 2.7 GB.
The structure of my JSON file meets the requirement stated in the error: "a single json array with multiple inner maps".
As an experiment, I took a JSON file with exactly the same structure but only a few elements in the array, and that data loaded into the table successfully using exactly the same query.
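In case it helps anyone reproduce this, here is a minimal sketch of how a test file with the same structure can be generated at an arbitrary size (the field values are placeholders, not my real data; the file name and record count are just examples). It streams records to disk one at a time, so it can also produce multi-gigabyte files without holding the whole array in memory:

```python
import json

def make_test_file(path, n):
    """Write a single JSON array of n records matching the bigdata.json schema."""
    with open(path, "w") as f:
        f.write("[\n")
        for i in range(1, n + 1):
            record = {
                "id": i,
                "firstName": "First%d" % i,
                "lastName": "Last%d" % i,
                "gender": "male" if i % 2 else "female",
                "address": "%d Test Street" % i,
                "city": "Testville",
                "phone": "501-000-%03d" % (i % 1000),
                "email": "user%d@example.com" % i,
                "status": "Available",
                "createdDate": "2020-01-01T11:18:51.591",
            }
            f.write(json.dumps(record))
            # Comma after every record except the last, to keep the array valid
            f.write(",\n" if i < n else "\n")
        f.write("]\n")

# e.g. a small file like the one that loaded successfully:
make_test_file("smalldata.json", 5)
```

Bumping the count to millions of records gives a file with the same "single top-level array of maps" shape as my bigdata.json.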
BUT why can't I load a large file into the Iceberg table? Can Dremio really not handle big data?!