Hi All,
I am new to Dremio. I am trying to find out how I can dynamically change the physical dataset (a file stored in Minio, connected through the S3 config) that my set of queries runs against.
For example:
a) A user uploads a file, File01.xlsx, to the Minio server.
b) A Dremio API request then executes queries that already exist in Dremio and currently use some other dataset.
c) Note: The file structure will be the same.
Thanks,
@asif094 You should be able to set the context in your query, so you can access the desired dataset.
https://docs.dremio.com/rest-api/sql/post-sql/
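As a rough illustration of setting the context, here is a minimal Python sketch that builds (but does not send) a request for the SQL endpoint linked above. It assumes the endpoint is `POST /api/v3/sql` and accepts a JSON body with `sql` and `context` fields. The base URL, the token, the bearer-style `Authorization` header, and the `minio` / `uploads` path are all hypothetical placeholders, so check them against your own deployment and authentication setup.

```python
import json
import urllib.request


def build_sql_request(base_url, token, sql, context):
    """Build a POST request for the SQL endpoint without sending it.

    `context` is a list of path parts (e.g. source and folder) that
    unqualified table names in `sql` are resolved against.
    """
    payload = {"sql": sql, "context": list(context)}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v3/sql",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: token-based auth; the exact scheme depends on your setup.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values: a source named "minio" with a bucket "uploads".
request = build_sql_request(
    "http://localhost:9047",
    "my-token",
    'SELECT * FROM "File01.xlsx"',
    ["minio", "uploads"],
)
# To actually submit it: urllib.request.urlopen(request)
```

With this shape, the SQL text stays the same and only the `context` list changes per request, which is what makes the query point at a different location.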
What exactly do you want to change dynamically: (1) the SQL of a query, so that it points to a different dataset whenever a file is added, or (2) the data returned when a new file is added?
Balaji has suggested a way to change the default SQL context, which works like a relative path for qualifying table names in queries. That partially answers (1).
For (2): when a new file is added to the S3 bucket, do you want it to be included in any future queries, or do you want it to trigger a query run? For the former, you can set up a reflection on the PDS and set its default refresh interval, and Dremio will rebuild the reflection periodically, picking up the new files. Make sure the metadata refresh interval on the PDS's source is about as frequent as you expect files to be added. This would guarantee that even if a reflection existed without the new file, the query planner would know that a new file has been added since the reflection was last built, and it would requery the PDS.
If you use incremental refresh on the PDS, the reflection build will run faster when new files are only ever appended.
At query time, Dremio will use the reflection expiration settings to find the latest data available from both sources.
To trigger a query after the S3 location is updated, you would need to detect the change using something like an EventBridge event and then trigger a query using the Dremio REST API.
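The detection step could look roughly like the Python sketch below. It assumes the change notification arrives in the S3-style bucket event shape (`Records[].s3.bucket.name` and `Records[].s3.object.key`, with URL-encoded keys), which you should verify against what your event source actually delivers. The source name `minio`, the query template, and the helper names are hypothetical, and the sketch also assumes the new file is already queryable in Dremio at that path. It only produces the SQL strings; submitting them would go through the SQL endpoint linked earlier in the thread.

```python
from urllib.parse import unquote_plus


def quote_identifier(part):
    """Double-quote one path part, doubling any embedded double quotes."""
    return '"' + part.replace('"', '""') + '"'


def dataset_paths(event, source_name):
    """Turn each record in an S3-style event into a quoted dataset path."""
    paths = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3-style events are URL-encoded ("+" for spaces).
        key = unquote_plus(record["s3"]["object"]["key"])
        parts = [source_name, bucket] + [p for p in key.split("/") if p]
        paths.append(".".join(quote_identifier(p) for p in parts))
    return paths


def queries_for_event(event, source_name, sql_template):
    """Fill the {table} placeholder of an existing query for each new file."""
    return [sql_template.format(table=path)
            for path in dataset_paths(event, source_name)]


# Hypothetical event for File01.xlsx landing in an "uploads" bucket.
example_event = {
    "Records": [
        {"s3": {"bucket": {"name": "uploads"},
                "object": {"key": "incoming/File01.xlsx"}}}
    ]
}
queries = queries_for_event(example_event, "minio", "SELECT * FROM {table}")
```

Because the file structure stays the same, the query text can be kept as a template and only the table path swapped in for each uploaded file.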