So I have seen this previously too - This query did a pull from snowflake in about 18 mins and the majority of the rest of the time (before it failed with lack of disk space), it chugged away with just one core processing - This has the effect of making it much slower than snowflake.
I don’t really understand why the one core only is used - but it’s causing me real performance issues…
The reason we are executing on only one core is because you have only one reflection Parquet file and in that only one group which decides the number of splits which in turn decides the number of parallel threads. Number of rowgroups upto 70% of the total number of cores on that node will decide the degree of parallelism. I see you query is accelerated so when you create the reflection you need to decide on the right partition column so you get more splits, below is a best practices document for creating reflections
RAW reflection on snowflake.KLDW.PUBLIC.D_DATE has just 1 split so single threaded
RAW reflection on KLDW.LOOKUP.BRAND_COMP_SET_LOOKUP has just 1 split so single threaded
RAW reflection on snowflake.KLDW.PUBLIC.D_PROPERTY has 23 splits so can have parallel threads but unfortunately has only 45K records and we spin up the next thread only at 100K
I just had a closer look on why your query is slow as initially you wanted to know the reason for single threaded. Your query is slow not because of the 3 reflections (#2, #3 and #4) but because of the JDBC scan #1 which is scanning 3,032,397,184 records.
If you run this SQL on the source directly, how much does it take. Preferably run it from a client installed on the Dremio Coordinator
SELECT “OCCUPANCY_DT_ID”, “ROOM_COUNT”, “KL_TA_CODE”, “KL_TA_CATEGORY_ID”, “KL_PROP_ID”, “KL_RATE_ID”, “KL_SOB_ID”, “KL_CHANNEL_COST_AMT”, “KL_TRANSACTION_COST_AMT”, “KL_AMENITY_COST_AMT”, “KL_LOYALTY_COST_AMT”, “KL_GUEST_PAID_REVENUE_AMT”, “KL_HOTEL_COLLECTED_REVENUE_AMT”, “KL_COPE_REVENUE_AMT”, “KL_LOYALTY_FLAG”, “KL_NO_SHOW_FLAG”, “KL_CANCELLATION_FLAG”
FROM “KLDW”.“PUBLIC”.“F_STAY”) AS “F_STAY”
WHERE “KL_SOB_ID” IS NOT NULL