Using Iceberg external reflection and raw reflection together

My use case involves serving near-real-time data updates with very low query latency. I found that carefully configured raw and aggregate reflections can meet the latency requirement. However, reflections cannot be refreshed in near real time.

Therefore, given a timestamp field, I wanted to use a raw reflection for data before timestamp x and an external reflection with manual metadata refresh for data after timestamp x. A union of the two would give the latest and greatest results with the fastest queries.
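
To make the idea concrete, here is a rough sketch of the union query, built as a string in Python. The view names `history_raw` and `recent_external`, the column `event_ts`, and the cutoff value are all hypothetical placeholders, and the sketch says nothing about which reflection the planner will actually pick.

```python
from datetime import datetime


def build_split_query(historical_view: str, recent_view: str,
                      ts_column: str, cutoff: datetime) -> str:
    """Build a UNION ALL query that splits reads at a cutoff timestamp.

    All identifiers passed in are illustrative placeholders.
    """
    # Format the cutoff as a SQL timestamp literal.
    cutoff_literal = cutoff.strftime("%Y-%m-%d %H:%M:%S")
    return (
        # Older rows: intended to be served by the raw reflection.
        f"SELECT * FROM {historical_view} "
        f"WHERE {ts_column} < TIMESTAMP '{cutoff_literal}'\n"
        "UNION ALL\n"
        # Newer rows: intended to be served by the external reflection.
        f"SELECT * FROM {recent_view} "
        f"WHERE {ts_column} >= TIMESTAMP '{cutoff_literal}'"
    )


if __name__ == "__main__":
    print(build_split_query("history_raw", "recent_external",
                            "event_ts", datetime(2023, 6, 1)))
```

The two branches use half-open ranges (`<` on one side, `>=` on the other), so a row that falls exactly on the cutoff is neither dropped nor counted twice.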

However, it seems that once a raw reflection is created, even on a Dremio view on top of the source Iceberg table, the raw reflection is used by default. Is there a way to achieve my use case?

Thanks

@kyleahn Do you need to tell the planner not to use a reflection even though one exists? Or is a raw reflection getting used over an aggregate one? Do you have a profile?

If I create a reflection off an Iceberg table (external reflection), would the raw reflection always be chosen over the external reflection?

@kyleahn It depends on costing. When it is not picking the right reflection, turn on planner.verbose_profile, run the query again, and send us the profile.

There’s a feature coming out in the upcoming release, where you can specify reflection hints to the query planner. There are some notes about how “first view with a hint wins” during planning, but I’m unsure whether it applies when doing a union all. You could give that a go?

The release notes are up, but the image usually lags a week or so.

Thanks @wundi

@kyleahn Here is the documentation

Thanks a lot @wundi @balaji.ramaswamy.

Will take a look!

It does not look like Dremio OSS 24.2.0 has been released yet? I have been waiting to upgrade to 24.2.0 using the Helm chart. Am I missing something?

Nah, it seems we’re all waiting. The actual release (git push, tarball, Docker image) used to follow about a week after the release notes. That doesn’t seem to hold true anymore; it now takes several weeks, if not a month or more.

The JDBC driver for 24.2.0 has already been published, a day or so after the release notes, so it seems the build has already been made internally 🤷