Create dataset for file/folder using JDBC?

harindersb · April 12, 2022, 8:27pm

Hi, is there a way to create a physical dataset (PDS) for a file/folder using JDBC? I have been creating datasets for files on S3 in Dremio using the REST API and the Dremio UI, but wanted to know whether this can be done via JDBC as well. This would help when a user only has a JDBC client to work with Dremio and wants to work with files in, say, S3 or any filesystem. Are there any options for that, or does the PDS need to be created using REST or the UI first?

Example of what I am doing using REST is below:

{
  "entityType": "dataset",
  "id": "dremio%3A%2FSamples%2Fsamples.dremio.com%2FDremio%20University%2Fvendor_lookup.csv",
  "path": [
    "Samples",
    "samples.dremio.com",
    "Dremio University",
    "vendor_lookup.csv"
  ],
  "type": "PHYSICAL_DATASET",
  "format": {
    "type": "Text",
    "fieldDelimiter": ",",
    "lineDelimiter": "\r\n",
    "escape": "\"",
    "quote": "\"",
    "skipFirstLine": false,
    "extractHeader": true,
    "trimHeader": true,
    "autoGenerateColumnNames": false
  }
}
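For anyone scripting this request, here is a minimal Python sketch of how the body above could be assembled. It assumes the id is simply the percent-encoded form of `dremio:/` followed by the path parts joined with `/`, which is what the example suggests but is not confirmed in this thread. The helper names `dataset_id` and `promotion_payload` are hypothetical, and the HTTP call itself is left out.

```python
import json
from urllib.parse import quote


def dataset_id(path):
    """Percent-encode 'dremio:/<part>/<part>/...' into a single id string.

    Assumption: this mirrors the id format used in the REST example; path
    parts that themselves contain '/' are not handled here.
    """
    return quote("dremio:/" + "/".join(path), safe="")


def promotion_payload(path, field_delimiter=",", line_delimiter="\r\n"):
    """Build the JSON body for promoting a text file with explicit format options."""
    return {
        "entityType": "dataset",
        "id": dataset_id(path),
        "path": list(path),
        "type": "PHYSICAL_DATASET",
        "format": {
            "type": "Text",
            "fieldDelimiter": field_delimiter,
            "lineDelimiter": line_delimiter,
            "escape": '"',
            "quote": '"',
            "skipFirstLine": False,
            "extractHeader": True,
            "trimHeader": True,
            "autoGenerateColumnNames": False,
        },
    }


path = ["Samples", "samples.dremio.com", "Dremio University", "vendor_lookup.csv"]
payload = promotion_payload(path)
print(json.dumps(payload, indent=2))
```

Passing a different `field_delimiter` or `line_delimiter` is the only change needed for files that do not use commas or Windows line endings.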

lenoyjacob · April 12, 2022, 10:03pm

@harindersb You should be able to use the “automatically format files” option in the source settings.

Like so:
[Screenshot: Metadata]

harindersb · April 12, 2022, 10:33pm

Thanks @lenoyjacob. I tried it out and it works well for the Parquet file in your example, but is there a way to specify the format for files like CSV, as I am doing in the REST call example above? For such files it puts everything in one column, so one would like to supply format options such as fieldDelimiter, lineDelimiter, etc.

lenoyjacob · April 13, 2022, 4:57am

Interesting. I think one way is to do something like:

SELECT * FROM TABLE(Samples."samples.dremio.com"."Dremio University"."vendor_lookup.csv" (type => 'text', fieldDelimiter => ',', lineDelimiter => '
', extractHeader => true))

(The query is intentionally split across two lines: the lineDelimiter value is a literal newline inside the quotes.)

But I guess that doesn’t really promote the dataset. Others may have a better way?
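A JDBC-only client could generate this kind of query instead of typing it by hand. The sketch below only builds the SQL text; the helper names are hypothetical, option names are assumed to be trusted (they are not escaped), and sending the string over a JDBC connection is left to the client. As noted above, running it reads the file with the given options but does not promote the dataset.

```python
def quote_ident(part):
    """Double-quote one path part, doubling any embedded double quotes."""
    return '"' + part.replace('"', '""') + '"'


def sql_literal(value):
    """Render a Python bool or string as a SQL literal."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return "'" + str(value).replace("'", "''") + "'"


def table_function_query(path, **options):
    """Build SELECT * FROM TABLE(<path> (<name> => <value>, ...))."""
    table = ".".join(quote_ident(p) for p in path)
    args = ", ".join(
        f"{name} => {sql_literal(value)}" for name, value in options.items()
    )
    return f"SELECT * FROM TABLE({table} ({args}))"


query = table_function_query(
    ["Samples", "samples.dremio.com", "Dremio University", "vendor_lookup.csv"],
    type="text",
    fieldDelimiter=",",
    lineDelimiter="\n",  # becomes a literal newline inside the quotes
    extractHeader=True,
)
print(query)
```

Every path part is quoted here, which is a simplification; the hand-written query leaves `Samples` unquoted.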

balaji.ramaswamy · April 13, 2022, 7:14am

@harindersb Auto-promotion of CSV files is not possible when options such as the line/field delimiters differ from the defaults, because auto-promoted CSVs are always promoted with the default settings.
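The "everything in one column" effect is easy to reproduce outside Dremio. The sketch below uses Python's standard `csv` module purely as an analogy, not Dremio's parser, with a made-up pipe-delimited sample: reading it with a comma delimiter yields a single field per row, while the matching delimiter splits it correctly.

```python
import csv
import io

# Hypothetical pipe-delimited content, standing in for a non-default CSV file.
sample = "vendor_id|vendor_name\r\n1|Acme\r\n"


def parse(text, delimiter):
    """Parse delimited text into a list of rows using the given delimiter."""
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))


default_rows = parse(sample, ",")   # comma delimiter does not match the file
explicit_rows = parse(sample, "|")  # delimiter supplied explicitly

print(default_rows[0])
print(explicit_rows[0])
```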

harindersb · April 13, 2022, 6:54pm

Yeah, but this is good to know. Thanks!

harindersb · April 13, 2022, 6:54pm

Got it. Thanks @balaji.ramaswamy
