Query Status Stuck In "Starting" ~60 min. Before Failing

Hello,

We have been using Dremio 2.0 for a few months and recently copied all our tables/reflections over to Dremio 3.0. We have a basic query that runs in about 30 seconds in the Dremio 2.0 server environment, but in Dremio 3.0 the same query is stuck in “starting” status for over 60 minutes before failing with the details “Invalid Query Exception” and “Failed”.

The queries are exactly the same.
Our Queue control settings are exactly the same.

Any ideas on what to try to better troubleshoot?

@kevthebandit

Can you try disabling “planner.experimental.pclean_logical” via the support key under admin-support, and then retry the query?

Thanks
@balaji.ramaswamy

@balaji.ramaswamy

Thank you for the suggestion.

The “planner.experimental.pclean_logical” key is already unchecked in both environments. Sorry for not calling that out in my first post.

Hi @kevthebandit

Can you please compare the differences in Convert To Rel between the 2.x and 3.x plans? Please also do the same for Logical Planning, as most likely the 1 minute would have been spent on logical planning. Do you have both profiles, the one from 2.x and the one from 3.x? Can you please send them across?

Thanks
@balaji.ramaswamy

@balaji.ramaswamy

Dremio 3.x - Query does NOT work
Convert To Rel (139 ms)
Logical Planning (734 ms)

Dremio 2.x - Query does work
Convert To Rel (79 ms)
Logical Planning (959 ms)
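
For anyone comparing the numbers, here is a minimal Python sketch that puts the two sets of phase timings side by side. The values are copied from the profiles above; the helper name `phase_deltas` is only illustrative and is not part of Dremio.

```python
# Phase timings in milliseconds, copied from the two job profiles above.
timings_ms = {
    "3.x": {"Convert To Rel": 139, "Logical Planning": 734},
    "2.x": {"Convert To Rel": 79, "Logical Planning": 959},
}


def phase_deltas(new, old):
    """Return the new-minus-old duration for each phase, in milliseconds."""
    return {phase: new[phase] - old[phase] for phase in new}


# Per-phase difference between the failing (3.x) and working (2.x) runs.
deltas = phase_deltas(timings_ms["3.x"], timings_ms["2.x"])

# Combined time spent in these two phases for each version.
totals = {version: sum(phases.values()) for version, phases in timings_ms.items()}

for phase, delta in deltas.items():
    print(f"{phase}: {delta:+d} ms (3.x vs 2.x)")
for version, total in totals.items():
    print(f"{version} total for these two phases: {total} ms")
```

Both versions spend roughly one second in total across these two phases, so on their own they do not account for the wait of over 60 minutes.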

What part of the “Job Profile” would be helpful? Query, Planning (what part within planning), Acceleration, or Error?

Thanks for your help!

I think I found the downloadable profile you were talking about. Please let me know if you meant something different. Profile_2.x.zip (237.1 KB)
619c6c5c-ff76-4133-a3ea-1e36315dde60.zip (241.5 KB)

Hi @kevthebandit

A few things:

  1. The query you ran on 3.0.6 is a prepare statement while the one on 2.0.5 is not. We would need to compare the same query. Would it be possible to send the prepare statement profile from 2.0.5?

  2. We need the verbose planner profile. Please turn on planner.verbose_profile via the support key under admin-support.

  3. Are the reflections used on both versions the same?

Thanks
@balaji.ramaswamy

Thanks for all your responses and help!

  1. Attached is the 2.x prepare statement
    prepare_statement_profile_dremio_2.x.zip (78.5 KB)

  2. I turned on the planner.verbose_profile for both Dremio 3.x and 2.x server environments

  3. The prepare statement that completes successfully (Dremio 2.x) shows me the reflections used. The Dremio 3.x prepare statement, which fails, does not show me the names of the reflections chosen within “Job Profile > Acceleration”. I am still looking for that detail so I can compare the two.

Hi @kevthebandit

Can you please send us the r profiles with verbose on? We will look and see what changed. If you are not able to find the missing reflections, that should be OK for now.

Thanks
@balaji.ramaswamy

How do I find the r (reflection) profiles? Do you mean the full downloadable profile for the same query in each environment, like I sent before? Attached are the full profiles for each query run, with the planner.verbose_profile switch turned on.

Let me know if you mean something different.

prepare_statement_dremio_2.x-verbose_on.zip (78.5 KB)
prepare_statement_Dremio_3.x-verbose_on.zip (494.3 KB)

Hi @kevthebandit

The verbose profiles certainly help. Let us not worry for now about whether the same set of reflections is considered in the two profiles. Let me look into what has changed and get back to you.

Thanks
@balaji.ramaswamy