Queries ( QLIK) not being accelerated by Aggregated reflections

Good Day Guys,

I have a set of queries running against Dremio from Qlik (Qlik sends queries against the VDS based on the user’s selection)

VDS: has a Raw reflection and an aggregated reflection
Base Table: Only has a raw and aggregated reflection:

what seems to happen is that all my queries against the VDS seem to ignore both the aggregated reflections and only use the Raw reflection from the base table.

Am I perhaps missing something here?

please see sample query & profile below

Select
fin_year_no as "Financial Year",
business_unit_name as "Business Unit",
fin_year_no as VSort,
BUSINESS_UNIT_NO as HSort,
business_unit_name as HDim,
fin_year_no as VDim,
count(distinct UCID) as UCID
From
"Qlik Sense".Marketing."Customer Landscape".FullDYTableUpdated11June
--Where location_no in (103)
group by fin_year_no,business_unit_name,BUSINESS_UNIT_NO

bf0c803b-3fd4-401c-a40d-535f93d91227.zip (147.0 KB)

@Calvin_Klyn Which one of the below Agg reflections is supposed to get used?

  • bar_cust_basket_dy (“woolworths-its”.“woolworths-its-datalake”.“level-three”.woolworths.analytics)
    • Raw Reflection (raw): considered, matched, chosen.
    • Aggregation Reflection (agg): considered, not matched.
  • FullTblVDSTransformations2020 (“Qlik Sense”.Marketing.“Customer Landscape”)
    • Aggregation Reflection (agg): considered, not matched.
  • FullDYTableUpdated11June (“Qlik Sense”.Marketing.“Customer Landscape”)
    • Raw Reflection (raw): considered, not matched.

Hey Balaji,
Thanks for your response.

I want it to use it the below aggregation reflection.
bar_cust_basket_dy (“woolworths-its”.“woolworths-its-datalake”.“level-three”.woolworths.analytics)

  • Raw Reflection (raw): considered, matched, chosen.
  • Aggregation Reflection (agg): considered, not matched.

I created the other aggregation reflection to see if they might be used or not.

Basically I am trying to see if we can further optimize the queries that Qlik sends to Dremio, hence I thought the aggregation reflection might help in that regard

@Calvin_Klyn Have you made sure you have included all the measures/dimensions used in the query are part of the reflection definition?

Hi Balaji,
All of the fields do exist in the reflections.
Please see the detailed breakdown below:

The below query uses the Raw reflection and ignores the aggregation reflection as it has aliases

Select
fin_year_no as "Financial Year",
business_unit_name as "Business Unit",
SUM(Revenue) as Rev
From
"woolworths-its"."woolworths-its-datalake"."level-three".woolworths.analytics."bar_cust_basket_dy"
--Where location_no in (103)
group by fin_year_no,business_unit_name

The below query uses the Raw reflection and ignores the aggregation reflection, note that we are not aliasing the fields but we are filtering with the field in the where clause that exists in the reflection.

Select
fin_year_no,
business_unit_name,
SUM(revenue)
From
"woolworths-its"."woolworths-its-datalake"."level-three".woolworths.analytics."bar_cust_basket_dy"
Where location_no in (103)
group by fin_year_no,business_unit_name

Based on the above it seems like whenever we are using a where clause and aliases the aggregation reflection is ignored

See below example without a where clause and aliases
The below code will then use the aggregated reflection as we want,
In our case we need to pass the queries that have where clauses from Qlik.

Select
fin_year_no,
business_unit_name,
SUM(revenue)
From
"woolworths-its"."woolworths-its-datalake"."level-three".woolworths.analytics."bar_cust_basket_dy"
--Where location_no in (103)
group by fin_year_no,business_unit_name

I have attached all the three job profiles as well,

d124d044-3137-4a39-bc21-1151785777f0.zip (79.3 KB)
75aeb8ea-6ad5-438b-9565-d26fdb00d43d.zip (82.6 KB)
76ceb89b-bd34-4dbb-95a8-5136bc520819.zip (78.9 KB)

@Calvin_Klyn Thanks for the detailed explanation, will look into this early next week, if you do not see an update by Wednesday please ping here,

Hey @balaji.ramaswamy ,
Just a reminder, not to forget about the above optimisation issue.

@Calvin_Klyn Sorry ! have been busy with few things, will look into this by Friday

1 Like

No worries, let me know when you get the chance to have a look at this.

@Calvin_Klyn You have 3 profiles,

Job ID# 1f34e291-2eaa-65a8-619e-7727cb64c400, I see Agg i matched, not chosen

Query was accelerated

  • bar_cust_basket_dy (“woolworths-its”.“woolworths-its-datalake”.“level-three”.woolworths.analytics)
    • Raw Reflection (raw): considered, matched, not chosen.
    • Aggregation Reflection (agg): considered, matched, chosen.

Query ID# 1f34e1dd-013a-ce70-223a-d97e9070d700

Agg matched but Dremio planner estimated that using raw was less expensive, in fact the previous query acelerated by the Agg ran for 6.7s while this one accelerated by the raw ran for 1.5s

Query was accelerated

  • bar_cust_basket_dy (“woolworths-its”.“woolworths-its-datalake”.“level-three”.woolworths.analytics)
    • Raw Reflection (raw): considered, matched, chosen.
    • Aggregation Reflection (agg): considered, matched, not chosen.

Query ID# 1f34e2f9-d68f-19e6-1eca-051bf11c4300 similar to the above, Agg was matched but planner estimated to choose the raw over the agg and query completed in 6.4s

You will see times when the planner favors raw, as long as the query completes in time we should be good. Reasons could be high cardinality aggs or too many dimensions to group by etc

1 Like

Ahh, thank you so much then this makes sense.
The queries are behaving as they should.

Thanks a lot for this