Hi everyone,
Lately I have been trying to change some of our application procedures to force the creation of a physical dataset (a folder on local disk containing homogeneous Parquet files organized in subfolders).
I see two options. Since I have “auto promotion” enabled, I can just wait for the next relevant query to trigger the PDS creation, but this sometimes takes a long time, and I don’t know why.
The other option is the REST API (POST /catalog), but it never works, and I get the message: “Failed to set catalog “None”: Physical Datasets can only be created by promoting other entities.”
The request body is copied from the GET response of the same endpoint, with the fields that Dremio sets itself removed (id, tag, createdAt…).
This is the body:
{
  "entityType": "dataset",
  "type": "PHYSICAL_DATASET",
  "path": [ "Disk Warehouse", "data", "data_month" ],
  "format": {
    "type": "Parquet",
    "ctime": 0,
    "isFolder": true,
    "location": "/var/data/warehouse/data/data_month",
    "autoCorrectCorruptDates": true
  },
  "fields": [
    { "name": "model_id", "type": { "name": "BIGINT" } },
    { "name": "model_variable_code", "type": { "name": "VARCHAR" } },
    { "name": "time_utc", "type": { "name": "BIGINT" } },
    { "name": "time_local", "type": { "name": "BIGINT" } },
    { "name": "value", "type": { "name": "DOUBLE" } },
    { "name": "__index_level_0__", "type": { "name": "BIGINT" } },
    { "name": "dir0", "type": { "name": "VARCHAR" } },
    { "name": "dir1", "type": { "name": "VARCHAR" } }
  ]
}
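Since the error mentions "promoting other entities", here is a minimal sketch of the variant I could try next. It rests on an assumption I have not verified on my version: that promotion is done with POST /api/v3/catalog/{id}, where the id is the URL-encoded string "dremio:/" followed by the folder path, and that the same id is repeated in the body. The base URL and the helper names are placeholders for illustration, and the snippet only builds the request without sending it.

```python
import json
from urllib.parse import quote

# Fields named in the post as set by Dremio and removed from the GET response.
SERVER_MANAGED = ("id", "tag", "createdAt")


def strip_server_fields(entity: dict) -> dict:
    """Return a copy of a catalog entity without the server-managed fields."""
    return {k: v for k, v in entity.items() if k not in SERVER_MANAGED}


def promotion_request(base_url: str, path: list, fmt: dict):
    """Build (url, body) for promoting a source folder to a physical dataset.

    Assumption: the entity id is "dremio:/" + slash-joined path, URL-encoded,
    and the target is POST {base_url}/api/v3/catalog/{id}.
    """
    entity_id = quote("dremio:/" + "/".join(path), safe="")
    body = {
        "entityType": "dataset",
        "id": entity_id,
        "type": "PHYSICAL_DATASET",
        "path": path,
        "format": fmt,
    }
    return f"{base_url}/api/v3/catalog/{entity_id}", body


if __name__ == "__main__":
    # Hypothetical base URL; replace with the real coordinator address.
    url, body = promotion_request(
        "http://localhost:9047",
        ["Disk Warehouse", "data", "data_month"],
        {"type": "Parquet"},
    )
    print("POST", url)
    print(json.dumps(body, indent=2))
```

The request would still need the usual Authorization header, and whether the extra format fields (ctime, isFolder, location) or the fields list are accepted in this call is something I would have to test.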
Thank you