My use case involves serving near-real-time data updates with very low query latency. I figured out that careful configuration of raw and aggregate reflections could meet the latency requirements. However, reflections cannot be refreshed in near real time.
Therefore, given a timestamp field, I wanted to use a raw reflection for data before timestamp x and an external reflection with manual metadata refresh for data after timestamp x. A union of the two would give the latest and greatest results with the fastest query.
However, it seems that once a raw reflection is created, even on a Dremio view on top of the source Iceberg table, the raw reflection is used by default. Is there a way I could achieve my use case?
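To make the split concrete, here is a minimal Python sketch that only builds the SQL text for the union described above. The view names, the column name, and the `build_union_query` helper are all hypothetical, and the sketch says nothing about which reflection the planner will actually pick for each branch.

```python
from datetime import datetime


def build_union_query(historical_view: str, recent_view: str,
                      ts_column: str, cutoff: datetime) -> str:
    """Build a UNION ALL query split at a timestamp cutoff.

    All identifiers are hypothetical placeholders: the historical branch
    is meant to be served by the raw reflection, the recent branch by
    the frequently refreshed data.
    """
    cutoff_literal = cutoff.strftime("%Y-%m-%d %H:%M:%S")
    return (
        f"SELECT * FROM {historical_view} "
        f"WHERE {ts_column} < TIMESTAMP '{cutoff_literal}'\n"
        "UNION ALL\n"
        f"SELECT * FROM {recent_view} "
        f"WHERE {ts_column} >= TIMESTAMP '{cutoff_literal}'"
    )


# Example with made-up names; rows before the cutoff come from one view,
# rows at or after it from the other, so no row is counted twice.
query = build_union_query("events_history", "events_recent",
                          "event_ts", datetime(2023, 6, 1))
print(query)
```

Using `<` on one side and `>=` on the other keeps the two branches disjoint, which is what makes `UNION ALL` safe here.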
@kyleahn Do you need to tell the planner not to use a reflection even though one exists? Or is a raw reflection being chosen over an aggregate one? Do you have a profile?
If I create a reflection off an Iceberg table (an external reflection), would the raw reflection always be chosen over the external reflection?
@kyleahn It depends on costing. When it is not picking the right reflection, turn on planner.verbose_profile, run the query, and then send us the profile.
There’s a feature coming in the upcoming release that lets you specify reflection hints to the query planner. There are some notes about how “first view with a hint wins” during planning, but I’m unsure whether that applies when doing a union all. You could give that a go?
The release notes are up, but the image usually lags a week or so.
@kyleahn Here is the documentation
Thanks a lot @wundi @balaji.ramaswamy.
Will take a look!
It does not look like Dremio OSS has been updated to 24.2.0 yet? I have been waiting to upgrade to 24.2.0 using the Helm chart. Am I missing something?