I have a query that performs a merge on an iceberg table, there is little data, around 4 million rows. However, every merge execution stops the query stating that it has reached the memory limit. I have already increased the execution machine to 128GB of RAM, but the error persists. What can I do to resolve this case?
I will send the query profile below.
a8bdae1a-7eca-42e5-b19f-7e3df568af67.zip (402,6,KB)
The 2 main memory consumers are
05-04 HashJoin(condition=[AND(=($16, $34), =($17, $35))], joinType=[full])
and
09-03 HashJoin(condition=[=($0, $16)], joinType=[left]) :
Arethese joins expanding as they are Full outer and Left outer joins?
It`s a merge…
merge into
{{this.database}}.{{this.schema}}.{{this.table}} a
using
(
select
OEV_IDENTI_CURATED,
OEV_DATABT,
OEV_DATALT,
OEV_DATALT_PART,
OEV_DATCRI,
OEV_IDENTI,
OEV_OBSERV,
OEV_OCP_IDENTI,
OEV_PRGALT,
OEV_PRGCRI,
OEV_TIPEVE,
OEV_USUALT,
OEV_USUBDD,
OEV_USUCRI,
OWNER,
INGESTION_DATE
from
{{ ref('landing_twf_ocpeve') }}
) b
on
( b.OEV_IDENTI_CURATED = a.OEV_IDENTI_CURATED )
when matched then
update set *
when not matched then
insert *
Any update?
We are struggling with this until now …
I’ve tried to config dremio-env increasing memory, incresed coordinator ec2 config, increased engine ec2, increased number of nodes, enabled join.spill, but nothing works…
Actually it’s a merge with 4mi rows in 6mi rows
@meneizs Any chance you can try the same Merge in 25.x? The reason is that 25.x has Hash Join spilling