Dremio shows a long list of dots to the right of the Query Editor, one for each past version of the query for a virtual data set. I would like to know when the most recent dot was last edited. I looked for this in the Catalog API but only see the createdAt key there. I also looked in the sys and information_schema schemas. The closest match, I think, is the reflections modified_at column, but I think this refers to the reflection of the virtual dataset, not the date the query definition was last modified.
My gut says this is a feature request. I would like more of the version-related (and tag-related) data exposed through the API and the information schema views.
I noticed this when I found mismatched data after querying seemingly “new” reflections that had run on datasets downstream from the reflected dataset I changed. I had changed the meaning of a column in the upstream dataset but reused the same column name.
Workarounds I have considered include:
- Add columns when making edits. However, renaming columns is difficult for end users.
- Keep two copies of the physical data source (dev and prod), apply changes only to the dev version, disable and then re-enable reflections, then rewire all downstream data sets to the dev version until the next “global refresh” or “maintenance refresh” runs. I think that would require some coding to identify and swap data sources in the FROM clauses.
- Harvest the last modified date from the Dremio logs, detect all downstream reflections, then script disabling and re-enabling reflections on each, possibly as a background process.
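To make the last workaround more concrete, here is a rough sketch of the "detect all downstream reflections" step only. Everything in it is an assumption for illustration: the `children` lineage map, the dataset names, and the reflection records with `id` and `datasetId` keys are hypothetical stand-ins for whatever the lineage and reflection listings actually return, and the function names are my own. It does not call Dremio at all; it only shows the graph walk that would decide which reflections to disable and re-enable.

```python
from collections import deque


def downstream_datasets(children, changed):
    """Return every dataset reachable from `changed` by following child edges.

    `children` maps a dataset name to the datasets that select from it
    (hypothetical shape, not a Dremio response format).
    """
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for child in children.get(current, ()):
            if child not in seen and child != changed:
                seen.add(child)
                queue.append(child)
    return seen


def reflections_to_cycle(reflections, dataset_ids):
    """Pick the IDs of reflections defined on any of the given datasets."""
    return [r["id"] for r in reflections if r["datasetId"] in dataset_ids]


# Hypothetical lineage: orders_clean is the dataset whose column meaning changed.
children = {
    "raw.orders": ["vds.orders_clean"],
    "vds.orders_clean": ["vds.orders_by_day", "vds.orders_by_region"],
    "vds.orders_by_day": ["vds.dashboard"],
    "vds.orders_by_region": ["vds.dashboard"],
}

# Hypothetical reflection records; real records would come from the server.
reflections = [
    {"id": "r1", "datasetId": "vds.orders_clean"},
    {"id": "r2", "datasetId": "vds.orders_by_day"},
    {"id": "r3", "datasetId": "vds.dashboard"},
    {"id": "r4", "datasetId": "raw.orders"},
]

changed = "vds.orders_clean"
affected = downstream_datasets(children, changed)
# Include the changed dataset itself, since its own reflection is stale too.
stale = reflections_to_cycle(reflections, affected | {changed})
```

With the sample data above, `stale` would hold the reflections on the changed dataset and everything below it, and leave the upstream `raw.orders` reflection alone. The actual disable and re-enable calls would still need to be scripted against the real reflection API.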
FYI, we are reflecting everything during the design phase for speed. The biggest table is in the <25GB range, and we don’t expect many changes to columns down the road.