Exceptionally poor response time ( > 700 x slower) compared to underlying database

mfarmer · December 16, 2019, 1:37pm

I have a query that performs three inner joins
2 of the joins are to the same table
on snowflake, no caching, medium warehouse execution time is around 5 secs, just under 1million rows
dremio, all tables accelerated, run time is 1 HOUR and STILL GOING.

More worrying is that it spends all its time in a NESTED_LOOP_JOIN AND it’s only uses ONE core of the entire cluster the whole time

mfarmer · December 16, 2019, 1:53pm

The query ultimately timed out - but that may be aqua data studio client - not too sure - either way, here’s the profile
c044da8c-4351-48b0-97e7-cbc6a00d22f8.zip (35.1 KB)

balaji.ramaswamy · December 16, 2019, 4:17pm

@mfarmer

Since you are operating only on 341,248 records, it is single threaded. I also see 15m blocked on downstream which is waiting on JDBC to consume records

mfarmer · December 16, 2019, 4:21pm

Hmm - so was using aqua data studio as client - not sure why it would be causing a long wait for 15 mins. I have two questions:

how can you see the 15 min wait in the profile
I monitored the server with htop for a long time - there’s still over 45 mins of delay, and that one core was pegged at 100% for a long time

balaji.ramaswamy · December 16, 2019, 4:28pm

@mfarmer

how can you see the 15 min wait in the profile? Expand phase 00-xx-xx . Phase 0 is the last phase and downstream to Phase 0 is external like ODBC or JDBC etc. If you see under Phase 00-xx-xx and under “Blocked Downstream” you will see the time I am talking about

NL joins are expensive and since you are single threaded that core is probably getting used 100%

mfarmer · December 16, 2019, 4:38pm

So - I am re-running the query from within dremio - 8 mins in and it’s still going… Questions:

It’s still very slow compared to snowflake (other queries seem to work same or slightly quicker)
why the decision to only use a single thread ? It pains me to see the cluster at 0% except one core for so long!
What caused the nested loop and what could be changed to avoid it ?

Thanks!

balaji.ramaswamy · December 16, 2019, 6:53pm

#2 We use one thread per 100,000 records but limited by number of splits. In your case you only have one split so hence the single thread. Can you partition by a different key to increase number of splits?

#3 Why does it use a NL join, in the below query, are “lctryiso” and “mctryiso” columns in kldw.reference.kl_iata_all

One thing you can try is to first join only kldw.reference.kl_iata_all and kldw.reference.iata_purchased_db and see how the query performs and look at the plan. Then add “kldw.reference.country”

mfarmer · December 16, 2019, 6:59pm

thanks - will try - just an FYI - query took 1hr 46 mins to complete - snowflake was about 5 seconds

balaji.ramaswamy · December 16, 2019, 7:00pm

@mfarmer

Would you be able to send us the profile of the completed query?

mfarmer · December 16, 2019, 9:49pm

1d8785b7-9ea8-4787-b45b-509befc874a9.zip (719.4 KB)

Topic		Replies	Views
Dremio uses one core for processing	7	1583	January 12, 2020
Query execution time	5	1206	June 6, 2022
Identical SQL Querys - Profile Comparison	5	851	February 22, 2023
Speed of execution of a query for oracle dataset	5	1772	November 11, 2019
JSON Sub Scan extremely slow	11	2036	June 25, 2018

Exceptionally poor response time ( > 700 x slower) compared to underlying database

Related topics