We’re using Dremio CE 24.0.x to query Parquet files stored on S3, with automatic PDS formatting on query enabled and a 6-hour metadata refresh schedule.
We’ve noticed that the scheduled PDS metadata refresh only works for PDSs that don’t contain metadata.
For example, I have a bucket path
some/path/to/dataset/YYYY/MM/DD/ containing multiple Parquet files that are added throughout the day.
We query both
some/path/to/dataset and
some/path/to/dataset/YYYY, but notice that the automatic refresh only takes place for
some/path/to/dataset. That means that the data under
some/path/to/dataset/YYYY is not being refreshed until we refresh it manually using the
ALTER PDS statement.
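For reference, the manual refresh looks roughly like the sketch below. The source name "s3source", the bucket name "bucket", and the year folder "2023" are placeholders rather than our real names, and the exact dotted path depends on how the S3 source is set up.

```sql
-- Manually refresh metadata for the year-level PDS.
-- "s3source", "bucket", and "2023" are hypothetical placeholders.
ALTER PDS "s3source"."bucket"."some"."path"."to"."dataset"."2023" REFRESH METADATA
```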
The metadata refresh logs indicate that refreshes are taking place on schedule, and again, it looks like the paths queried above the date buckets are refreshed properly.
Do you have any idea why that could be happening?