Reflection not working

Can you help me understand why reflection fails? Is there a document on reflection? How can I troubleshoot? Thanks.

Hi @wz2020

Sure, let me come back with a detailed explanation of the problem you are facing and how a more recent version can help.

Thanks
@balaji.ramaswamy

@wz2020

You said there were other queries that were accelerated by this reflection. Any chance you have a profile for one of those too? I want to understand the complete picture.

Thanks

@wz2020

There are a couple of phases in the reflection matching process. First, the reflection needs to match: it must contain all the data required to accelerate the query, or a subset of the query. Then a costing algorithm decides whether using the reflection is better than going back to the source.
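As a rough illustration of those two phases, here is a simplified sketch. It is not Dremio's actual planner: the Reflection class, the choose_reflection function and the cost numbers are all made up for the example, and it only handles the case where the reflection covers the whole query, not a subset of it.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Reflection:
    """Hypothetical stand-in for a reflection; not a Dremio type."""
    name: str
    columns: FrozenSet[str]  # columns materialized in the reflection
    cost: float              # made-up estimated cost of scanning it


def choose_reflection(
    query_columns: Iterable[str],
    source_cost: float,
    reflections: List[Reflection],
) -> Optional[Reflection]:
    """Return the cheapest matching reflection, or None to use the source."""
    needed = frozenset(query_columns)

    # Phase 1: matching. Keep reflections that hold every column the query needs.
    matches = [r for r in reflections if needed <= r.columns]
    if not matches:
        return None

    # Phase 2: costing. Use a reflection only if it beats reading the source.
    best = min(matches, key=lambda r: r.cost)
    return best if best.cost < source_cost else None
```

A query that never reaches phase 1 (no candidate reflections found at all) never gets to the costing step, which is the situation described next.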

What I am not able to understand is that your query does not even consider the reflection. If you open the profile and click on the Accelerations tab, it says the following:

Reflection Outcome

Query was NOT accelerated

Time To Find Reflections: 6 ms
Time To Canonicalize: 0 ms
Time To Match: 0 ms

I am wondering whether your reflection refresh hit an exception, even though the job was reported as successful.

Try the following. Run a new query:
select * from sys.reflections where reflection_id = '00fecdf8-b637-4c2c-9a63-e47a8424822e'
Then click on the three dots (…) in the top right corner, click JSON, and send us the file.

We have completely changed the reflection algorithm since Dremio 3.2, which is why I wanted you to upgrade: that way we can be sure we are not hitting an old bug. In any case, let us start here and then decide next steps.

Thanks
Bali

Using this reflection_id, the system returns nothing. It’s probably because I deleted that previous reflection. Here is a new one:

{"reflection_id":"e3712173-f1fc-4c73-af94-61f0ce129960","name":"Raw Reflection","type":"RAW","status":"CANNOT_ACCELERATE_SCHEDULED","num_failures":1,"dataset":"Insight.tc_vessel_shift","sortColumns":"","partitionColumns":"","distributionColumns":"","dimensions":"","measures":"","displayColumns":"ID$, PRIMARY_KEY_VAL$, TERM_ID$, CREATED_DATE$, SOURCE_SCN$, SQL_OPERATION$, BREAK_DURATION, BREAK_TIME, FINISH_DATE_TIME, MOVES_PER_HOUR, NAME, START_DATE_TIME, VESSEL_SHIFT_ID, VESSEL_VISIT_ID, dir0, dir1, dir2, dir3","externalReflection":""}

What I don’t understand is this: a Data Reflection is like an index, right? If so, whether or not it can help accelerate any particular query is irrelevant at creation time. It will be my job later to craft my queries to take advantage of the reflection. So why did the reflection creation fail in the first place?

@wz2020

I see that the reflection creation failed (like an index creation failing). Do you have the profiles of the REFRESH REFLECTION and LOAD MATERIALIZATION jobs? Also send us the server.log from right after the reflection creation job completed; there should be a clue there.
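For reference, this is roughly how the failure can be read off the JSON row exported from sys.reflections. It is only a sketch: the reflection_health function is hypothetical, and treating any status that starts with CANNOT_ACCELERATE as unhealthy is an assumption based on the single status value seen in this thread, not a documented list of statuses.

```python
import json


def reflection_health(row_json: str) -> dict:
    """Summarize one sys.reflections row exported as JSON (hypothetical helper)."""
    row = json.loads(row_json)
    status = row.get("status", "")
    failures = int(row.get("num_failures", 0))
    # Assumption: a CANNOT_ACCELERATE* status or any recorded failure
    # means the reflection is not usable for acceleration.
    healthy = failures == 0 and not status.startswith("CANNOT_ACCELERATE")
    return {
        "name": row.get("name"),
        "dataset": row.get("dataset"),
        "status": status,
        "num_failures": failures,
        "healthy": healthy,
    }


# Row from this thread, trimmed to the fields the helper reads.
SAMPLE_ROW = (
    '{"reflection_id":"e3712173-f1fc-4c73-af94-61f0ce129960",'
    '"name":"Raw Reflection","type":"RAW",'
    '"status":"CANNOT_ACCELERATE_SCHEDULED","num_failures":1,'
    '"dataset":"Insight.tc_vessel_shift"}'
)

if __name__ == "__main__":
    print(reflection_health(SAMPLE_ROW))
```

Note that json.loads only accepts straight double quotes, so a row pasted through the forum editor with curly quotes has to be cleaned up first.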

Thanks
Bali

How do I get the profiles for REFRESH REFLECTION and also the LOAD MATERIALIZATION jobs?

My dremio instance doesn’t have a server.log file, only server.out. How do I enable server.log?

@wz2020

To get the profiles, do the following:

Click on Jobs, open the filter that lists UI, External Tools and so on, and uncheck everything other than “Accelerator”. You should find two jobs next to each other, REFRESH REFLECTION and LOAD MATERIALIZATION. The dataset on which the reflection was created is displayed on the right side. Click on each job and download its profile from the right side.

server.log might be going to standard out. Try using journalctl -u dremio.service

I only found the REFRESH REFLECTION job, and here is the profile: 885c5fbf-c0a3-452b-8b4b-2325defdd6b0.zip (11.0 KB)

Here is the server log. tc_vessel_shift.zip (6.0 KB)

@wz2020

This query ran at “2020-06-25 18:01:29” UTC but the logfile you sent me starts at “2020-06-25 18:55:10,580” UTC
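A quick way to check whether a log file can contain the entries for a given job is to compare the two timestamps. This is a minimal sketch using only the two values quoted above; the log_covers_job function is hypothetical, and the timestamp formats are assumed to be exactly as they appear in this post.

```python
from datetime import datetime, timedelta
from typing import Tuple


def log_covers_job(job_start: str, log_first_line: str) -> Tuple[bool, timedelta]:
    """Return (covers, gap): whether the log begins at or before the job start,
    and how long after the job start the log begins (negative if it begins earlier)."""
    job = datetime.strptime(job_start, "%Y-%m-%d %H:%M:%S")
    # Log timestamps carry milliseconds after a comma, e.g. "18:55:10,580".
    log = datetime.strptime(log_first_line, "%Y-%m-%d %H:%M:%S,%f")
    return log <= job, log - job


if __name__ == "__main__":
    covers, gap = log_covers_job("2020-06-25 18:01:29", "2020-06-25 18:55:10,580")
    print(covers, gap)
```

With the two timestamps above, the log begins almost 54 minutes after the job ran, so it cannot contain that job's entries.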

It’s the same reflection I was trying to enable. I did it several times, so the times may not match, but the content should. Does it matter?

@wz2020

Got it. The only issue is that I am not sure the error in the log belongs to the right query ID. I do see a creation error, but it could be from one of your other reflection creation jobs.

On the Jobs page, can you search for “210b0ce7-71ff-ebe7-03f7-5d86aa273a00” and send me the profile?

Thanks
Bali

I don’t have that reflection id in sys.reflections.

4f1d1aae-cc2c-4d3d-afc1-3077df76f81d.zip (11.0 KB)

OK, I did this again and attached the profile here. It’s the same reflection.

@wz2020

The log file and the profile need to come in pairs. This time you have only sent me the profile.

The ID I gave above was a query ID, not a reflection ID. Please search for that ID on the Jobs page and send me the profile. Alternatively, send me the log file for the reflection you just created.

I’m really confused now. I don’t see that ID on the Jobs page, so I can only search for it as a reflection ID. See the screenshot.

OK, I started a new reflection and attached all relevant files:

The reflection seems to be OK and I have also attached the query profile:

But on the dataset screen, you can clearly see that the reflection is not working, with the infamous error message:

f4eaf422-63de-4e1e-9e28-2a06e2efe3a4.zip (89.2 KB) mylog.zip (51.2 KB)

I see this error in the log: Caused by: com.dremio.connector.metadata.DatasetNotFoundException: null
But I have no idea why that is. The query works fine without acceleration, so all the datasets it needs are there.
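For anyone digging through a similar log, a small script can pull out every "Caused by:" line so the underlying exceptions stand out. This is only a sketch: the root_causes function is hypothetical, and the pattern assumes the usual Java stack trace layout of "Caused by: <exception class>: <message>".

```python
import re
from typing import List, Optional, Tuple

# Matches "Caused by: <fully.qualified.Exception>" with an optional ": <message>".
CAUSED_BY = re.compile(r"Caused by: (?P<exc>[\w.$]+)(?:: (?P<msg>.*))?")


def root_causes(log_text: str) -> List[Tuple[str, Optional[str]]]:
    """Return (exception class, message) for each 'Caused by:' line in the log."""
    causes = []
    for line in log_text.splitlines():
        match = CAUSED_BY.search(line)
        if match:
            causes.append((match.group("exc"), match.group("msg")))
    return causes


if __name__ == "__main__":
    # Reads the log file name given on the command line.
    import sys

    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for exc, msg in root_causes(handle.read()):
            print(exc, "-", msg)
```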

@wz2020

The code base has changed a lot from 3.2 through 3.3, 4.0, 4.1, 4.2, 4.3 and 4.5. I am not saying this problem will definitely go away, but it is extremely difficult to troubleshoot on 3.2.

Drop the reflection
Upgrade Dremio
Recreate the reflection

Does it work?

OK, I will give that a try and report back. Thanks.