Raw reflection not used

I’m sure I’m missing something obvious. I created a VDS, then created a raw reflection that includes all columns as “Display”. However, when I do a “select *” on the VDS, the reflection is not used – it appears in the job status as “Did not cover query”.

I also created an aggregation reflection but it too is not being used on a simple aggregated query.

Profile is attached.Profile2.zip (17.2 KB)

Reflections are associated with the query tree that defines the VDS rather than the VDS itself. This is generally beneficial as it allows the matching process to match on reflections that may be associated with ancestor or sibling datasets, allowing for more frequent cache hits.

In your example, the underlying query for the VDS is fairly complex with several joins, aggregations, and sub-queries. As a result, the matching process to determine whether the reflection covers your query is more complex and sometimes will not match. The more simple the underlying query for a VDS the more likely the match will be successful.

There are some enhancements to the matching algorithm we are planning for a release in the near future that will consider the relationship of the VDS to the reflection. We will report back here to let you know when you can give it a try.

Thank you Kelly. That makes sense, but is disappointing. I was hoping to be able to use Dremio to materialize this complex, long running query and join it with other sources to make it query-able in Tableau.

Is there anything to guide me on how complex the VDS query can be before its reflections won’t be used? Or more detail on why a particular reflection can’t cover a query?

The enhancement you mentioned sounds like it would be beneficial.

@Joe sorry you ran into this. We’re actively working on an improvement here so that the complexity of a VDS is irrelevant for reflection matching purposes when you are directly querying that VDS. As long as it has relevant reflections your query should get accelerated (e.g. raw/agg reflection on VDS, as you mentioned, should match if applicable).

Thank you, Can. That should greatly improve the usefulness.

Incidentally, are there any plans to introduce an explain plan with the query window? It would be great to be able to see which reflections will be used before actually executing the query (as well as seeing which underlying databases will be called, etc.).

Hey @Joe, thanks for the suggestion. You probably tried this already but:

  • Run EXPLAIN PLAN FOR SELECT * FROM VDS
  • Then go to Jobs and view the reflections/plan for that query visually:

We’re also thinking of a few ways to do “what-if” type analysis for reflections/queries in general. However, we don’t have a timeline on this yet – not short term.

1 Like

WOW - no, I wasn’t aware of that. Looks like it’s exactly what I need, but I didn’t notice it in the docs.

1 Like

Noted – created an internal ticket to track docs improv.

Hey @can is there any update on this?

And just for me to understand, is the plan different for creating a reflection on a VDS from the plan for running the query in the underlying VDS? I ask because I have seen the query gets accelerated but the reflection doesn’t when there is either a join or there is a group

Hello @joejk,

We’ve made a number of significant changes to the Dremio query planner since last year. If you are on the newest version of Dremio OSS or Community and you are still not seeing your reflections chosen for a particular query, please provide the profile here and we can take a look.

And just for me to understand, is the plan different for creating a reflection on a VDS from the plan for running the query in the underlying VDS? I ask because I have seen the query gets accelerated but the reflection doesn’t when there is either a join or there is a group

If you build a reflection on you.vds, reflection build job would get accelerated if there are applicable reflections. Can you describe the work flow you are trying to create and include query profiles?

Hey @ben thanks so much for responding.
I am on Dremio
Build: 3.1.6-201903070042400578-fdcd3a8
Edition: Community Edition

I have seen couple of situations where reflections are not used.
I posted two of those here

  1. No acceleration for queries with "SQL WITH" clause (sub-query refactoring)
  2. Reflection refresh is not accelerated if it involves a JOIN

I have been trying to write the queries in all sorts of manner to get it to pick up the reflection :tired_face:
I am used to periscopedata’s cache. For large queries, I create a cache and run queries on that. It fails on dremio because of what @kelly described as Reflections are associated with the query tree that defines the VDS rather than the VDS itself.

The behavior related to reflection matching in complex VDS has been significantly improved in 3.1 and beyond. You may have identified some edge cases.

824b8acb-c367-4e07-a413-928dec761d5c.zip (79,0 KB)
We are experimenting same issues on Dremio 3.3. We have a VDS, and created an Aggregation Reflection with some of the fields of that VDS, plus one measure. A simple query like “SELECT sum(pax) FROM VDS WHERE custom_field=‘XX’” is not using the Aggregation Reflection created for this case. Or are we wrong, and the Aggregation Reflection is only working on fields that do exist in the Anchor dataset?
Attached profile for that query

1 Like

A bit more on this, after 1.5 days building the aggregation reflection, the “Reflection” tab in the VDS has a red icon saying “Reflection cannot accelerate. Multiple attempts to build reflection failed, will not reattempt”. There is no more information, how could we solve this? I’ve attached the profile

cc91180e-900a-45dc-828d-3a71f0888281.zip (3,4 KB)