Raw reflection not used to accelerate select star query

Dremio version: 23.0.1-202210141019030815-c1de8bcc

I have looked through a few similar topics here but can’t seem to find a solution to this issue:

I have a raw reflection on a view that has the status CAN_ACCELERATE. When I run a simple EXPLAIN PLAN FOR SELECT * FROM view, The raw reflection is not used to accelerate the query with the message Did not cover query.

The raw profile shows the message Caught exception during default reflection substitution for <view> as well as the following:

Matching Hints:

    Missing Field starts_at
    Missing Field ends_at

Neither the query nor the view itself contain the mentioned fields. The missing fields are from a table that is joined within the view. starts_at and ends_at are grouped and concatenated in a field runtimes in the view.

The view joins multiple tables from a postgres data source.

Any help with this is appreciated.

profile (98.8 KB)

Hi Darian, do you have any additional logging about the substitution error in server.log? If not, you can try enabling this to see if you get the actual error:

<appender name="acceleration" class="ch.qos.logback.core.rolling.RollingFileAppender">
  <file>${dremio.log.path}/acceleration.log</file>
  <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
    <fileNamePattern>${dremio.log.path}/archive/acceleration.%d{yyyy-MM-dd}.log.gz</fileNamePattern>
    <maxHistory>30</maxHistory>
  </rollingPolicy>
  <encoder>
    <pattern>%date{ISO8601} [%thread] %-5level %logger{36} - %msg%n</pattern>
  </encoder>
</appender>

<logger name="com.dremio.exec.planner.acceleration" additivity="false">
    <level value="debug" />
    <appender-ref ref="acceleration" />
</logger>
<logger name="com.dremio.reflection.bup" additivity="false">
  <level value="debug" />
  <appender-ref ref="acceleration" />
</logger>
<logger name="com.dremio.reflection.ReflectionTracer" additivity="false">
  <level value="debug" />
  <appender-ref ref="acceleration" />
</logger>

The log message will contain the job id in the thread name to help you find the relevant messages.

Hi Benny,

Thank you for your reply. How would we go about configuring this if we deploy Dremio using the Helm chart?

Kind regards
Toby

I had some tests from the 23 version that don’t use raw reflection all most time, except for some SQL operations that can push down

@Benny_Chow We have downgraded to 22.1.1. That seems to fix the problem. Is it possible that 23.1.0 is broken?

We are hesitant to consider exploring Dremio as an option given the fact that this is happening.

@Toby Do you have the 22.x and 23.x job profiles? Would like to compare

Hi @balaji.ramaswamy, thank you for your help. Here are the requested profiles:

Let me know if you need more information.

@darian @Toby I can’t reproduce Caught exception during default reflection substitution

Here is what I am doing:

  1. Install v22
  2. From Sample Source, promote zips.json
  3. Build VDS on PDS
  4. Enable Raw Reflection on VDS
  5. Run “select * on VDS” and verified query is accelerated with default raw reflection
  6. Shutdown server
  7. Upgrade to v23
  8. Re-run “select * on VDS”. Verified that query continues to be accelerated with previous reflection materialization. No errors in profile.

I have couple questions/suggestions:

  • Do the above steps reproduce the issue for you?
  • What happens if you disable and re-enable the reflection and then re-run the user query?
  • Are there any errors in server.log?
  • Can you provide the acceleration.log requested above?
  • When upgrading, make sure you don’t change the absolute path to the dist storage.
  • Enabling the support option planner.verbose_profile gives a lot more information in the acceleration profile tab

Hi @Benny_Chow,

The only thing that is different for the two servers we are testing on is the image tag in our Kubernetes manifest.

We are creating the views and reflections using dbt.

So basically everything is created through code. I can’t think of what might be different between the two installations.

For the points you mentioned maybe Darian can provide some more insight in the coming work week…

Kind regards
Toby

@darian

Are you able to send us the “describe” or create table output for the below tables from Postgres?

pilot_v3.channels
pilot_v3.media
pilot_v3.plan_items
pilot_v3.publishers
pilot_v3.runtimes