Unable to update query for dataset in REST API


#1

Hi,

I am using the Dremio REST API to create datasets.

I am able to create a dataset in a path successfully. But when I edit the dataset, the edited query is saved while the fields returned by the dataset are not updated.

After creating the dataset, I retrieved it as shown below:

Endpoint used: http://localhost:9047/api/v3/catalog/171d0dc2-ce4a-417b-8cb8-0c2b5a679d6a
{
  "entityType": "dataset",
  "id": "171d0dc2-ce4a-417b-8cb8-0c2b5a679d6a",
  "type": "VIRTUAL_DATASET",
  "path": [
    "5b99012916209a090491569e_5b93594916209a261cd374cd",
    "5b99f18316209a2b2cf8e945"
  ],
  "createdAt": "2018-09-09T03:36:49.688Z",
  "tag": "1",
  "sql": "select * from \"5b99012916209a090491569e\".\"dizer\".\"tenants\"",
  "sqlContext": [
    "5b99012916209a090491569e_5b93594916209a261cd374cd"
  ],
  "fields": [
    {
      "name": "_id",
      "type": {
        "name": "VARBINARY"
      }
    },
    {
      "name": "tenant_id",
      "type": {
        "name": "VARCHAR"
      }
    }
  ]
}

When I try to update the query for the dataset, I do not get the fields that correspond to the changed query.

PUT endpoint: http://localhost:9047/api/v3/catalog/171d0dc2-ce4a-417b-8cb8-0c2b5a679d6a
PUT request:
{
  "entityType": "dataset",
  "id": "171d0dc2-ce4a-417b-8cb8-0c2b5a679d6a",
  "tag": "1",
  "type": "VIRTUAL_DATASET",
  "path": [
    "5b99012916209a090491569e_5b93594916209a261cd374cd",
    "5b99f18316209a2b2cf8e945"
  ],
  "createdAt": "2018-09-09T03:33:55.307Z",
  "sql": "select * from \"5b99012916209a090491569e\".\"dizer\".\"rules\"", ## updated query
  "sqlContext": [
    "5b99012916209a090491569e_5b93594916209a261cd374cd"
  ]
}

PUT response:
{
  "entityType": "dataset",
  "id": "171d0dc2-ce4a-417b-8cb8-0c2b5a679d6a",
  "type": "VIRTUAL_DATASET",
  "path": [
    "5b99012916209a090491569e_5b93594916209a261cd374cd",
    "5b99f18316209a2b2cf8e945"
  ],
  "createdAt": "2018-09-09T03:33:55.307Z",
  "tag": "2",
  "sql": "select * from \"5b99012916209a090491569e\".\"dizer\".\"rules\"", ## query got updated
  "sqlContext": [
    "5b99012916209a090491569e_5b93594916209a261cd374cd"
  ],
  "fields": [ ## fields are not updated
    {
      "name": "_id",
      "type": {
        "name": "VARBINARY"
      }
    },
    {
      "name": "tenant_id",
      "type": {
        "name": "VARCHAR"
      }
    }
  ]
}

The fields are not getting updated.

Can you help me with this? An early reply would help me a lot. Thanks in advance!
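For readers who want to check for the same symptom programmatically, here is a minimal Python sketch. It compares the entity returned by the GET with the entity returned by the PUT and flags the case where the SQL changed but the field list did not. The helper names `field_signature` and `fields_look_stale` are hypothetical, and the check is only a heuristic: two different queries can legitimately return the same columns.

```python
def field_signature(entity: dict) -> list:
    """Return the (name, type name) pairs from a catalog entity's "fields" list."""
    return [(f["name"], f["type"]["name"]) for f in entity.get("fields", [])]


def fields_look_stale(before: dict, after: dict) -> bool:
    """True when the SQL differs between the two entities but the fields are identical.

    Heuristic only: different queries may legitimately yield the same columns.
    """
    sql_changed = before.get("sql") != after.get("sql")
    return sql_changed and field_signature(before) == field_signature(after)
```

With the GET response and the PUT response shown in this post passed as `before` and `after`, the SQL differs while both field lists are `_id` and `tenant_id`, which is the condition this helper reports.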


#2

Hi,

It looks like you have found a bug in the API where updating virtual datasets does not work properly. Sorry about that. I’ve opened an internal ticket to fix this and will update here once we have a release with the fix (and a timeline).


#3

Okay Doron, waiting for your update


#4

Hi,

This is fixed in 3.0.0. VDS editing using the REST API will now work correctly.
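The update flow from the first post (GET the entity, change `sql`, PUT it back to the same id) can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, the `fields` key is dropped because the PUT body in the first post did not include it, and authentication headers are left to the caller because the thread does not show them. The sketch builds the request without sending it.

```python
import json
import urllib.parse
import urllib.request

# The PUT body shown in the first post omits "fields", so this sketch drops it too.
DROPPED_KEYS = ("fields",)


def build_sql_update_body(entity: dict, new_sql: str) -> dict:
    """Copy a catalog entity (as returned by GET) and replace its SQL."""
    body = {k: v for k, v in entity.items() if k not in DROPPED_KEYS}
    body["sql"] = new_sql
    return body


def build_put_request(base_url: str, body: dict, headers: dict) -> urllib.request.Request:
    """Build (but do not send) a PUT to /api/v3/catalog/{id} for the given body.

    `headers` is supplied by the caller, for example whatever authentication
    header your Dremio deployment expects.
    """
    entity_id = urllib.parse.quote(body["id"], safe="")
    url = f"{base_url}/api/v3/catalog/{entity_id}"
    data = json.dumps(body).encode("utf-8")
    all_headers = {"Content-Type": "application/json", **headers}
    return urllib.request.Request(url, data=data, headers=all_headers, method="PUT")
```

To send the request, pass the result to `urllib.request.urlopen`. Note that the body keeps the `tag` value from the GET response, as the PUT request in the first post does; the response there came back with the tag incremented from "1" to "2".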


#5

Thanks Doron, you guys rock!

I have been using the Dremio 2.1.4 REST API. Do I need to change anything with the endpoints, or do the endpoints remain the same in 3.0.0?


#6

Hi @doron, I am facing an issue when updating a dataset. I use the REST API with SQL to create the dataset, and it is created as expected. Then I add another field and send the request to Dremio with the same dataset name, but the response shows an error. How can I update an existing dataset, keeping its name, using the REST API?


#7

@Richard-biz

To change the dataset name, you would fetch it via a GET request first. Then modify the path field, where the name of the dataset is the last part of the path, and update it using PUT (http://docs.dremio.com/rest-api/catalog/put-catalog-id.html).

What kind of error message are you getting?


#8

Hi @doron, thanks for the response. Yes, I am aware of that solution. My question is that I need to update the dataset under the same name, not a new name (an update should keep the same name, right, and not create a new virtual dataset). My query has a join condition, and I want to update the same dataset with the same name.


#9

@Richard-biz

If I understand you correctly, you want to change the name of a virtual dataset.

The id would remain the same. So if you GET api/v3/catalog/datasetid, you can then do a PUT api/v3/catalog/datasetid with the JSON body that has the new path which would rename the virtual dataset you are updating.
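As a rough Python sketch of the rename described above: take the entity returned by the GET, replace the last element of `path`, and use the result as the PUT body for the same id. The function name `build_rename_body` is hypothetical, and the sketch leaves every other key exactly as the GET returned it; whether some keys need to be removed before the PUT is not stated in this thread.

```python
import copy


def build_rename_body(entity: dict, new_name: str) -> dict:
    """Return a copy of a catalog entity whose last path element is new_name.

    The id is untouched, so a PUT to api/v3/catalog/<id> with this body
    targets the same dataset.
    """
    if not entity.get("path"):
        raise ValueError("entity has no path to rename")
    body = copy.deepcopy(entity)  # avoid mutating the caller's GET result
    body["path"][-1] = new_name
    return body
```

The deep copy is a design choice so the original GET result stays available for comparison after the PUT.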


#10

Yes @doron, thanks for the information. It's working well.


#11

Hi @doron, sometimes my dataset update throws an error after I update the SQL query and fields of the dataset. I have attached a screenshot for your reference. We are changing the dataset query as well as the fields in the dataset update.

The update itself succeeds and we get a 200 response, but when reading and previewing the dataset we get those errors. Could you tell me why we are facing this issue with dataset updates?


#12

Which version of Dremio are you using? You might be running into the bug that is mentioned in this thread which was fixed in Dremio 3.0.0.


#13

I am running version 3.0.6.


#14

What is the full exception? You can follow this document on how to share a query profile.


#15

Hi @doron, we are getting a stack overflow exception. For your reference, I have attached the profile of my failed job:
8c90d78f-2092-45d8-b53c-cc574536e004.zip (4.9 KB)

Unexpected exception during fragment initialization: null
com.dremio.exec.work.foreman.AttemptManager.run():340
java.util.concurrent.ThreadPoolExecutor.runWorker():1149
java.util.concurrent.ThreadPoolExecutor$Worker.run():624
java.lang.Thread.run():748
Caused By (java.lang.StackOverflowError) null
java.security.AccessController.doPrivileged():-2


#16

Can you try editing the dataset in the UI using the pencil button?

It should show you the SQL used when saving the VDS. Can you edit the SQL (just add a space somewhere), save, and see if the VDS works now? I want to check if this happens without using the REST API.


#17

Hi @doron, I did as you suggested, but it shows an unexpected error.


#18

If you press save does it fix anything?

And if you copy the SQL and run it as a separate query (using the New Query button on top) does the query work?


#19

If we press the Save button, it shows “Unexpected error occurred”, and if we run the query in a new window, it shows a “Stack overflow error”.


#20

Hi @Richard-biz

Can you please try the below?

Click on Admin, then on the left side under “Cluster” select Support. Scroll down on the right to Support Key, enter “planner.experimental.pclean_logical”, click Show, disable it, and save.

Try the query again.

Thanks
@balaji.ramaswamy