I’m a newbie to Dremio. In our existing system, a normal table is created on a daily basis. Once that process is done, we manually change the table to Parquet format, set the reflection to refresh every 1 hour in the settings, and save it in our project folder under a custom name.
Because we do this process manually, we are facing issues in automating the reports.
Kindly suggest a way to do this manual work by scripting or any other suitable method.
Once the data is available in the data lake, you can do the following:
Let Dremio discover the data. This can be done automatically by the background refresh job (see the source settings, under the Metadata tab), or you can run the command “ALTER PDS [dataset path] REFRESH METADATA”. Documentation for both is below.
Next, you will want to refresh the reflection. For this you can use the API provided.
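The metadata refresh step could be scripted roughly as follows. This is a minimal sketch, assuming the SQL endpoint `POST /api/v3/sql` is used to submit the statement, that a login token is already available, and that the coordinator runs at `http://localhost:9047`. The dataset path `datalake.daily.sales` and the helper names are hypothetical.

```python
import json
import urllib.request

DREMIO_URL = "http://localhost:9047"  # assumption: default coordinator address


def refresh_metadata_sql(path_parts):
    """Return an ALTER PDS ... REFRESH METADATA statement for a dataset path.

    Assumes the path components contain no double quotes.
    """
    quoted = ".".join(f'"{part}"' for part in path_parts)
    return f"ALTER PDS {quoted} REFRESH METADATA"


def build_sql_request(base_url, token, sql):
    """Build a POST /api/v3/sql request that submits one SQL statement as a job."""
    body = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v3/sql",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Dremio expects the login token prefixed with "_dremio".
            "Authorization": f"_dremio{token}",
        },
    )


def submit(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a valid token, `submit(build_sql_request(DREMIO_URL, token, refresh_metadata_sql(["datalake", "daily", "sales"])))` should return a JSON object containing the id of the submitted job. The statement runs asynchronously, so a script may need to poll the job status before continuing.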
We have a mandatory manual daily exercise in Dremio that is used to enable the reports. I have documented our activity and request your guidance on automating it with scripts.
WR,
Sathish R
(Attachment Daily Dremio Activity.docx is missing)
To automatically promote a folder, you can check “Automatically format files into physical datasets when users issue queries” under the source’s Metadata tab.
A reflection can be created via SQL (this will also trigger a refresh):
ALTER DATASET BI."p1_vds" CREATE AGGREGATE REFLECTION p1_vds_agg
USING DIMENSIONS (c1, c2) MEASURES (p1 (COUNT, MIN),
p2 (SUM, MAX, APPROXIMATE COUNT DISTINCT))
PARTITION BY (c1) LOCALSORT BY (c2)
Future refreshes can be done via the REST API. Note that the ID is the PDS ID, not the VDS or reflection ID. Even if the reflection is on a VDS, the ID must be that of the PDS the VDS depends on.
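A refresh call along those lines might look like the sketch below. It assumes the catalog endpoints `GET /api/v3/catalog/by-path/...` (to look up the PDS ID) and `POST /api/v3/catalog/{id}/refresh` (to trigger the refresh), and a token obtained from a prior login. The helper names and example paths are hypothetical. The ID is URL-encoded because PDS IDs can contain characters such as `:` and `/`.

```python
import urllib.parse
import urllib.request


def auth_headers(token):
    """Headers for Dremio REST calls; the token comes from a prior login."""
    return {
        "Content-Type": "application/json",
        "Authorization": f"_dremio{token}",
    }


def build_lookup_request(base_url, token, path_parts):
    """GET the catalog entity by path; its JSON response includes the 'id'."""
    encoded = "/".join(urllib.parse.quote(part, safe="") for part in path_parts)
    return urllib.request.Request(
        url=f"{base_url}/api/v3/catalog/by-path/{encoded}",
        method="GET",
        headers=auth_headers(token),
    )


def build_refresh_request(base_url, token, pds_id):
    """POST to the refresh endpoint for the given PDS ID (not the VDS ID)."""
    encoded_id = urllib.parse.quote(pds_id, safe="")
    return urllib.request.Request(
        url=f"{base_url}/api/v3/catalog/{encoded_id}/refresh",
        data=b"",
        method="POST",
        headers=auth_headers(token),
    )
```

Each request can be sent with `urllib.request.urlopen(request)`. A scheduler such as cron could run the lookup once, store the ID, and then call the refresh request after each daily load.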
We have successfully automated the folder format option. Due to some performance issues, we have dropped reflection handling in our case.
Kindly help us automate the process below.
As discussed earlier, once the folder is formatted, our next step is to save the data from the data lake to the available spaces.
For example, our space structure is as below:
Master
Application
Staging
As of now, to move the data from the data lake to the spaces, we manually select the Save As option → give the table a name → select the space path (for example, Master → Staging) → Save.
We need to automate this step so that the full Dremio process is free of manual work.
While trying to automate this with Python, we could not get the examples given in the Dremio forum to work, possibly because of a version problem; we do not know the exact reason.
As we are very new to Dremio and its API, I request you to share sample code or sample steps to automate this.
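The Save As step corresponds to creating a virtual dataset (VDS) in a space. The following is a minimal sketch, assuming the catalog endpoint `POST /api/v3/catalog` with a `VIRTUAL_DATASET` payload and a token from a prior login. The space path `Master` → `Staging` is taken from the structure above. The VDS name `daily_sales` and the source path `datalake.daily.sales` are hypothetical placeholders.

```python
import json
import urllib.request


def build_vds_payload(space_path, vds_name, source_path):
    """Describe a VDS that selects everything from a promoted data lake dataset.

    Assumes the path components contain no double quotes.
    """
    quoted_source = ".".join(f'"{part}"' for part in source_path)
    return {
        "entityType": "dataset",
        "type": "VIRTUAL_DATASET",
        # Full path of the new VDS: space, folders, then the dataset name.
        "path": list(space_path) + [vds_name],
        "sql": f"SELECT * FROM {quoted_source}",
    }


def build_create_vds_request(base_url, token, payload):
    """Build the POST /api/v3/catalog request that creates the VDS."""
    return urllib.request.Request(
        url=f"{base_url}/api/v3/catalog",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"_dremio{token}",
        },
    )
```

For example, `build_vds_payload(["Master", "Staging"], "daily_sales", ["datalake", "daily", "sales"])` produces the body for a VDS saved under Master → Staging, and the request can then be sent with `urllib.request.urlopen(...)`. If a VDS with the same path already exists, the create call is expected to fail, so a daily job may need to update or delete the existing dataset first.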
We went through the documentation available on the Dremio forum and found a REST API method, called from Python code, to create the VDS.
But, as said earlier, we can’t find any examples for API version 2. We got a 404 error when we tried apiv2 and a 401 error with api/v3 (even when authenticated).
Kindly help us meet our requirement, and kindly ignore any mistakes in the above script.
We found that the installed Dremio version is “4.9.1-202010230218060541-2e764ed0”, but when querying select version() through the Dremio API, it shows V2.2. Do we need to upgrade the API version to 3? Kindly advise.
Regarding the last update: we generated the authorization key dynamically and tried again. Now it shows Response: 400. Does the Postman-Token header need to be generated using the Postman application? Kindly share a sample script to achieve this.
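A possible authentication flow is sketched below. It assumes that login goes through `POST /apiv2/login`, while catalog and SQL calls live under `/api/v3`, which could explain a 404 when catalog calls are sent to apiv2. It also assumes the token must be sent as `_dremio` followed by the token value, which is one possible cause of a 401. The Postman-Token header is added by Postman for its own use and should not be needed in a script. A 400 generally points at the request body rather than at authentication. The function names and the default address are hypothetical.

```python
import json
import urllib.request

DREMIO_URL = "http://localhost:9047"  # assumption: default coordinator address


def build_login_request(base_url, user_name, password):
    """Build the POST /apiv2/login request; only login uses the apiv2 prefix."""
    body = json.dumps({"userName": user_name, "password": password}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/apiv2/login",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def auth_headers(token):
    """Headers for subsequent /api/v3 calls; no Postman-Token header is included."""
    return {
        "Content-Type": "application/json",
        "Authorization": f"_dremio{token}",
    }


def login(base_url, user_name, password):
    """Send the login request and return the token from the JSON response."""
    request = build_login_request(base_url, user_name, password)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["token"]
```

A script would call `login(DREMIO_URL, user, password)` once and pass `auth_headers(token)` to every later request. Sending the body as JSON with the `Content-Type: application/json` header is worth checking when the server answers with 400.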