How to create a view in dremio for set of parquet files

Hi,
I have to create a sql-view in dremio for particular set of parquet files.
This view i have to create using scala program.
Is there any API or something to perform the above mentioned task?

Hi @Gautam

In Dremio we have VDSs(A sql definition for the datasets) VDS .

You can work on VDSs with the API, please refer here to create VDS using REST API
@Venugopal_Menda

Hi Thanks for response.
Could you please help me with , “How to generate authorization token to use rest api”?
I was trying to hit rest api and below are some details related to that
Url:http://localhost:9766/api/v3/sql
Request Body:
{
“sql”:“CREATE VDS FDL.Test.Test1 as SELECT * FROM ‘FDL_DEV’.slp.store.parquet.liquidity.‘M12_PRD’.‘M12_PBALS_0123’.‘20180928_QE11’”
}

But it gave me error “UNAUTHORIZED”

Hi, Thanks for helping.
I was trying to call the GET methods of the RESTAPI but i was getting only Unautorized access error. I tried my every credentials , but it was giving same error.
I tiried calling GetUserByName method.

Could you please help me with this.

@Gautam
Can you share the syntax you are using.
If you are using postman tool, please specify the Authorization value.

Authorization value can be found at Open Dremio UI in Chrome–>Developer Tool–>Click on Network–> Click on the Dremio Home Space–>On the right side under “Stats” you can see the " authorization: _VALUE" .

Please specify the value in the Postman Headers with Key as “authorization” and value from the chrome you got.

Attaching the screenshots for your reference.

We document how to programmatically get a token here.

Hi,
Thanks for the response. It really worked with GET request.
Also, i was trying to create VDS using API and i was getting ERROR 400(INVALID QUERY).
Could you please help me identify the syntax error

URL:http://localhost:9766/api/v3/catalog

Request:
{
“entityType”: “dataset”,
“path”: [
“SPACE”,
“FOLDER”,
“VDS NAME”
],
“type”: “VIRTUAL_DATASET”,
“sql”: “select * from 20180928_QE11”,
“sqlContext”: [“VDS NAME”, “.......20180928_QE11”]
}

Can you successfully run the query with that sqlContext in the Dremio UI?

Hi @Gautam

Can you try with this sample
{code}
curl -X POST
http://localhost:9047/api/v3/catalog/
-H ‘Authorization: _dremio3rgqgubrnvdmemafqt43ih09fh’
-H ‘Content-Type: application/json’
-H ‘Postman-Token: 7facc7f3-f2bc-420a-ad8e-91a0f5e523f5’
-H ‘cache-control: no-cache’
-d ‘{
“entityType”: “dataset”,
“type”: “VIRTUAL_DATASET”,
“path”: [
@Dremio”,
“APIVDS”
],
“createdAt”: “2019-10-09T10:26:55.386Z”,
“sql”: “SELECT * FROM orcal.HR.EMPLOYEES”
}’
{code}

The path @dremio is the place you want to save the VDS APIVDS, and the sql is the command you retrive the data from PDS in this case “orcal.HR.EMPLOYEES”

@Venugopal_Menda

No Because this doesn’t work with file format .parquet.
It supports only dremio parquet format.

Is it possible to execute the query to folders having .parquet format files?

@Gautam

You can promote the folder as dataset and give the folder as the dataset path.

Please refer this : https://docs.dremio.com/data-sources/files-and-directories.html

@Venugopal_Menda

This is i am using via UI but i want to achieve using Dremio Rest Api.
Can you please show me specific example of the same.
Promoting files and folders as dataset.

Hi @Gautam

First, you need to get the dataset details like ID, Path etc with GET please refer this link here

And then using the dataset Id and other details you can promote the folder or files, please refer this link

@Venugopal_Menda

Hi, Thanks for the help.
I am able to promote the folder to PDS(parquet format).
But just one last thing i wanted to do is create VDS from PDS and i tried below query it is not working. Giving Error failed in parsing query

Request Type: POST
Url: http://Host:9766/api/v3/catalog
Request Body:
{
“entityType”: “dataset”,
“path”: [
“SPACE”,
“FOLDER”,
“VDS Name”
],
“type”: “VIRTUAL_DATASET”,
“sql”: “select * from ‪SOURCE.FOLDER.FOLDER.FOLDER.FOLDER.FOLDER.FOLDER.PDS”,
“sqlContext”: [“VDS Name”, “SOURCE”]
}

Error: Failed in parsing Query

Hi @Gautam

Can you try by following with this reference, this works for me.

{code}
curl -X POST
http://localhost:9047/api/v3/catalog
-H ‘Authorization: _dremioatm624tjp5n180shfi3tbgh99k’
-H ‘Content-Type: application/json’
-H ‘Postman-Token: 21eae4e6-4ddc-4af8-9085-b2341ee02dea’
-H ‘cache-control: no-cache’
-d '{
“entityType”: “dataset”,
“type”: “VIRTUAL_DATASET”,
“path”: [
@Dremio”,
“APIVDS1”
],
“createdAt”: “2019-10-09T10:26:55.386Z”,
“sql”: “SELECT * FROM EMPLOYEES”,
“sqlContext”: [
“orc”,
“DREMIO”
]
} ’
{code}

VDSNAME: APIVDS1
VDSLOCATION: Dremio
PDS Table name: EMPLOYEES
PATH TO PDS/SQL CONTEXT: orc–>DREMIO

@Venugopal_Menda

in Path section
@Dremio is DestinationSpaceName>>DestinationFolderName
Right?

Yes @Gautam the path section @Dremio is the destination location where the VDS APIVDS1 should save.

If you want to save the VDS in a space mention the space name

@Venugopal_Menda

I tried your piece of code, it didn’t worked.
Error 400 Invalid Query
Url: http://Host:9766/api/v3/catalog
Request Body
{
“entityType”: “dataset”,
“path”: [
“DESTINATION SPACE”,
“DESTINATION FOLDER”,
“DESTINATION VDS NAME”
],
“type”: “VIRTUAL_DATASET”,
“sql”: “select * from PDS NAME”,
“sqlContext”: [“Comma seperated path to PDS(SOURCE,FOLDER,FOLDER)”]
}

{
“entityType”: “dataset”,
“type”: “VIRTUAL_DATASET”,
“path”: [
“DESTINATION SPACE”,
“DESTINATION FOLDER”,
“DESTINATION VDS NAME”
],
“sql”: “select * from PDS NAME”,
“sqlContext”: [“Comma seperated path to PDS(SOURCE,FOLDER,FOLDER)”]
}

Also tried below one, it gave me Response Code 200 but there was no VDS generated
URL:- http://tdclxd00025:9766/api/v3/sql
Request Type: POST
Request Body:-
{
“sql”: “CREATE OR REPLACE VDS <<<SPACE.FOLDER.VDSName>>> as SELECT * FROM <<>>.<<>>.<<>>.<<>>.<<>>.<<>>.<<>>.<<>>”
}

@Gautam, for “Error 400 Invalid Query”, can you share the exact SQL query string that you sent? I seems like there might just be a syntax error here.

As for the “CREATE OR REPLACE VDS …” request; there should be a job in the “Jobs” page of the Dremio UI the query that you submitted via the API. Can you find that job? Was it successful?