Hello! I am performing the following SQL-Query on a Postgres database.
select
l_returnflag,
l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(*) as count_order
from
lineitem
where
l_shipdate <= date ‘1998-12-01’ - interval ‘90’ day
group by
l_returnflag,
l_linestatus
order by
l_returnflag,
l_linestatus;
Everything works as expected. But when I look into the postgres server query store, I see that Dremio is forming my initial SQL-Query into a slightly different one. This is the SQL Query that Dremio performs on my postgres server.
SELECT
“l_returnflag”,
“l_linestatus”,
CAST(
CASE WHEN COUNT(“l_quantity”) = $ 1 THEN $ 2 ELSE CAST(SUM(“l_quantity”) AS NUMERIC(38, 6)) END AS NUMERIC(38, 6)
) AS “sum_qty”,
CAST(
CASE WHEN COUNT(“l_extendedprice”) = $ 3 THEN $ 4 ELSE CAST(SUM(“l_extendedprice”) AS NUMERIC(38, 6)) END AS NUMERIC(38, 6)
) AS “sum_base_price”,
CAST(
CASE WHEN COUNT(“$f4”) = $ 5 THEN $ 6 ELSE CAST(SUM(“$f4”) AS NUMERIC(38, 6)) END AS NUMERIC(38, 6)
) AS “sum_disc_price”,
CAST(
CASE WHEN COUNT(“$f5”) = $ 7 THEN $ 8 ELSE CAST(SUM(“$f5”) AS NUMERIC(38, 6)) END AS NUMERIC(38, 6)
) AS “sum_charge”,
(
CAST(
CASE WHEN COUNT(“l_quantity”) = $ 9 THEN $ 10 ELSE CAST(SUM(“l_quantity”) AS NUMERIC(38, 6)) END AS DOUBLE PRECISION
) / COUNT(“l_quantity”)
) AS “avg_qty”,
(
CAST(
CASE WHEN COUNT(“l_extendedprice”) = $ 11 THEN $ 12 ELSE CAST(SUM(“l_extendedprice”) AS NUMERIC(38, 6)) END AS DOUBLE PRECISION
) / COUNT(“l_extendedprice”)
) AS “avg_price”,
(
CAST(
CASE WHEN COUNT(“l_discount”) = $ 13 THEN $ 14 ELSE CAST(SUM(“l_discount”) AS NUMERIC(38, 6)) END AS DOUBLE PRECISION
) / COUNT(“l_discount”)
) AS “avg_disc”,
COUNT(*) AS “count_order”
FROM
(
SELECT
“l_returnflag”,
“l_linestatus”,
“l_quantity”,
“l_extendedprice”,
(“l_extendedprice” * ((1 - “l_discount”))) AS “$f4”,
(
(“l_extendedprice” * ((1 - “l_discount”))) * ((1 + “l_tax”))
) AS “$f5”,
“l_discount”
FROM
(
SELECT
“lineitem”.“l_quantity”,
“lineitem”.“l_extendedprice”,
“lineitem”.“l_discount”,
“lineitem”.“l_tax”,
“lineitem”.“l_returnflag”,
“lineitem”.“l_linestatus”,
“lineitem”.“l_shipdate”
FROM
“public”.“lineitem”
) AS “lineitem”
WHERE
“l_shipdate” <= DATE ‘1998-12-01’ - interval ‘90’ day
) AS “lineitem”
GROUP BY
“l_returnflag”,
“l_linestatus”
ORDER BY
“l_returnflag”,
“l_linestatus”
Is there any documentation on why Dremio does that? And especially why forming the initial statement to this particular one?
Thanks for any help!