We are using Dremio with S3-compatible storage as the data source. Our datasets in the data lake are updated every 3 hours (new files are appended, or existing files are overwritten). To make the new files visible in the Dremio PDS, we refresh the metadata with the query below after every write to the source:
ALTER PDS <PHYSICAL-DATASET-PATH> REFRESH METADATA
However, the metadata refresh takes around 15 minutes. The dataset is about 100 GB in size.
I am not sure why refreshing only the metadata takes 15 minutes or longer. Is this expected, or is there an alternative to a full metadata refresh?
Note: I also tried forgetting the metadata and recreating the PDS through the API. Recreating the PDS took about the same time (18 minutes).
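
For context, here is a minimal sketch of how the refresh statement above could be issued after each write. It assumes Dremio's REST SQL endpoint (`POST /api/v3/sql`) and a bearer token for authentication. The host, token, and dataset path are placeholders rather than our real values, and the helper names (`quote_path`, `build_refresh_sql`, `submit_sql`) are made up for illustration.

```python
import json
import urllib.request


def quote_path(parts):
    """Join path components into a dotted, double-quoted identifier.

    Embedded double quotes are doubled (standard SQL identifier quoting,
    assumed here to match what Dremio accepts).
    """
    return ".".join('"' + p.replace('"', '""') + '"' for p in parts)


def build_refresh_sql(parts):
    """Build the same ALTER PDS ... REFRESH METADATA statement used in the post."""
    return f"ALTER PDS {quote_path(parts)} REFRESH METADATA"


def submit_sql(base_url, token, sql):
    """Submit a statement to /api/v3/sql and return the job id from the response.

    Assumption: the server accepts 'Authorization: Bearer <token>'; some
    setups use a different token scheme.
    """
    req = urllib.request.Request(
        f"{base_url}/api/v3/sql",
        data=json.dumps({"sql": sql}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["id"]


if __name__ == "__main__":
    # Placeholder source, bucket, and folder names.
    sql = build_refresh_sql(["s3source", "bucket", "events"])
    print(sql)
    # Placeholder host and token; uncomment to submit for real.
    # job_id = submit_sql("http://dremio-host:9047", "<TOKEN>", sql)
```

The submit call only returns a job id, so the 15 minutes would be measured by polling the job status until it completes.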