Refresh asset after data load

Dremio has a “refresh” setting at the following levels:

  1. source
  2. reflection
  3. dataset (I can see it in the API, but I was not able to find this setting in the UI. The only setting at the dataset level is to update the reflection. Or am I missing something?)

Now my main question: I don't want to change the source refresh policy, as that might be too expensive. But
I still want to update some datasets, depending on when the upstream system finishes loading the source data. So I thought of using the REST API.

If I refresh a dataset using the REST API, will it supersede the setting at the source level?

I am sure many people must have gone through this use case. Any guidance is much appreciated.

You can refresh a reflection by disabling and then re-enabling the reflection for the dataset using the REST API. A refresh reflection job is triggered automatically as soon as the reflection is enabled again (after the disable).

Try the steps mentioned in this post
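
For reference, the disable/enable approach could be scripted roughly like this. It is only a sketch, not an official client: it assumes the v3 reflection endpoint (GET and PUT on /api/v3/reflection/{id}), a Dremio coordinator on localhost:9047, and that the server accepts the fetched definition back unchanged apart from `enabled`. The names `BASE_URL`, `build_toggle_request`, `send` and `recreate_reflection` are made up for the illustration, and `auth_header` stands for whatever Authorization value your install expects.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9047"  # assumption: local coordinator on the default port


def build_toggle_request(reflection: dict, enabled: bool, auth_header: str) -> urllib.request.Request:
    """Build a PUT request that sends the reflection back with `enabled` changed."""
    body = dict(reflection)  # copy, so the caller's dict (including its `tag`) is untouched
    body["enabled"] = enabled
    return urllib.request.Request(
        f"{BASE_URL}/api/v3/reflection/{reflection['id']}",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Authorization": auth_header, "Content-Type": "application/json"},
    )


def send(req: urllib.request.Request) -> dict:
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def recreate_reflection(reflection_id: str, auth_header: str) -> dict:
    """Disable, then re-enable a reflection; re-enabling triggers the refresh job."""
    current = send(urllib.request.Request(
        f"{BASE_URL}/api/v3/reflection/{reflection_id}",
        headers={"Authorization": auth_header},
    ))
    # Reuse each response for the next call so the latest `tag` is sent back.
    disabled = send(build_toggle_request(current, False, auth_header))
    return send(build_toggle_request(disabled, True, auth_header))
```

Calling `recreate_reflection("<reflection id>", "<authorization value>")` would then do the whole cycle. Keep in mind that this rebuilds the reflection from scratch.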

Hi,
will disabling/enabling reflections via the API use the incremental option set on the source PDS, or will the refresh be full?

thanks

When you disable/enable, the reflection is recreated, so the incremental option is not used.

You can write custom code against the Dremio API to do this. For example, we developed DAGs in Apache Airflow that run when the upstream system finishes loading the source data: they refresh the reflections, check their availability, and notify us of errors.
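
The "check availability" step of such a job might look roughly like the sketch below. It is plain Python rather than an actual Airflow DAG, and it is an assumption, not our production code: it assumes the reflection JSON carries a `status` object whose `availability` field becomes "AVAILABLE" once the reflection is usable. `wait_until_available` and `fetch_reflection` are hypothetical names; `fetch_reflection` is any callable that returns the current reflection JSON (for example a GET on the reflection endpoint).

```python
import time
from typing import Callable


def wait_until_available(
    fetch_reflection: Callable[[], dict],
    timeout_s: float = 1800.0,
    poll_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the reflection reports AVAILABLE; return False on timeout."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_reflection().get("status", {})
        # Assumption: "AVAILABLE" means the reflection can be used by queries.
        if status.get("availability") == "AVAILABLE":
            return True
        if clock() >= deadline:
            return False  # the caller decides how to notify about the failure
        sleep(poll_s)
```

A scheduler task can call this after triggering the refresh and raise or send a notification when it returns False. The `sleep` and `clock` parameters exist only so the loop can be tested without waiting.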

Hi @dacopan
Thanks for your answer. I have built an Apache NiFi flow that uses the Dremio API to govern the refresh of Dremio reflections; maybe it is quite similar to your setup.
In my case the PDS is an HDFS directory and I have developed 18 VDSs on it. Each VDS has at least one reflection.
Is POST /catalog/{id}/refresh the only endpoint I can use to refresh a VDS in incremental mode?

You have to call refresh on the PDS using the catalog API - that will force any reflection using that PDS to refresh.
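
A minimal sketch of that call, assuming the endpoint is POST /api/v3/catalog/{id}/refresh with the PDS's catalog id and a coordinator on localhost:9047. `build_refresh_request` and `refresh_pds` are illustrative names only, and `auth_header` is whatever Authorization value your install expects.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:9047"  # assumption: local coordinator on the default port


def build_refresh_request(dataset_id: str, auth_header: str) -> urllib.request.Request:
    """Build the POST request that tells Dremio the PDS has changed."""
    quoted_id = urllib.parse.quote(dataset_id, safe="")  # catalog ids go into the URL path
    return urllib.request.Request(
        f"{BASE_URL}/api/v3/catalog/{quoted_id}/refresh",
        method="POST",
        headers={"Authorization": auth_header},
    )


def refresh_pds(dataset_id: str, auth_header: str) -> int:
    """Send the refresh request and return the HTTP status code."""
    with urllib.request.urlopen(build_refresh_request(dataset_id, auth_header)) as resp:
        return resp.status
```

Call `refresh_pds` with the catalog id of the PDS, not of a VDS, once the upstream load has finished.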

Hi @doron ,
Is it necessary to define a reflection on the PDS?

thanks

You are just informing Dremio that the PDS has changed. Dremio will rebuild all reflections that use the PDS, including any VDS that derive from the PDS. We have an internal dependency graph that handles this.