Concurrent DML operation on Iceberg table

Hi Team,

We are posting jobs through Dremio's SQL API. We are triggering successive SQL jobs that update an Iceberg table; however, some of the jobs are failing with the below error:

Concurrent DML operation has updated the table, please retry. (Attached screenshot for your reference)

Please let us know if we can issue multiple DML commands on an Iceberg table concurrently.

Thanks
Sumi

Hi @Sumithra

Looks like the commits are happening at the same time. If you introduce a delay between the SQL statements, does it still happen?

Thanks
Bali

Hi @balaji.ramaswamy ,

Yes, we tried introducing a delay between the SQL statements, but some updates still coincide with the commits of other update statements, causing those updates to fail. Please let us know if there is a way to issue multiple update statements on the same Iceberg table simultaneously.

Thanks
Sumi

@Sumithra Currently no. Is this because you have updates from different users?
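In the meantime, one common workaround is to catch the concurrent-DML error and retry the statement with backoff. A minimal sketch below; `submit` is a placeholder for whatever function posts a statement to Dremio's SQL API and raises on failure, and the matched error string is the one from the screenshot above, so adjust both to your setup:

```python
import random
import time

def run_with_retry(submit, sql, max_attempts=5, base_delay=1.0):
    """Retry `submit(sql)` when a concurrent-DML conflict is raised.

    `submit` is any callable that executes one SQL statement and raises
    an exception whose message contains the conflict error on failure.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return submit(sql)
        except Exception as e:
            # Only retry the specific conflict error, and give up after
            # the last attempt.
            if "Concurrent DML operation" not in str(e) or attempt == max_attempts:
                raise
            # Exponential backoff with jitter so retries don't collide again.
            time.sleep(base_delay * (2 ** (attempt - 1)) + random.random())
```

This doesn't make the commits truly concurrent; it just turns the "please retry" error into an automatic retry, which is usually enough when conflicts are occasional.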

@Sumi et al. - did you find a way to do this? We have a similar issue: we want to support multiple processes deleting data in the same Iceberg table. We know there will be no clashes between these delete statements, but ideally they need to be executed concurrently.
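If true concurrency isn't supported, one pattern that avoids the conflict entirely is to funnel all DML for the table through a single worker, so statements from multiple producer processes never commit at the same time. A rough sketch, where `execute` stands in for whatever submits one statement to Dremio:

```python
import queue
import threading

def serialize_statements(execute, statements):
    """Run DML statements one at a time through a single worker thread,
    so no two commits on the same table ever overlap.

    `execute` is a placeholder for the function that submits one SQL
    statement to Dremio and returns its result.
    """
    q = queue.Queue()
    results = []

    def worker():
        while True:
            sql = q.get()
            if sql is None:  # sentinel: no more work
                break
            # Statements are executed strictly one after another here,
            # regardless of how many producers enqueued them.
            results.append(execute(sql))

    t = threading.Thread(target=worker)
    t.start()
    for sql in statements:
        q.put(sql)
    q.put(None)
    t.join()
    return results
```

The trade-off is throughput: deletes run back-to-back instead of in parallel, but none of them fail with the concurrent-DML error.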

@balaji.ramaswamy

I recently got a similar error as well.

I have an iceberg table which logs all the ingestion jobs I have.

I have multiple ingestion jobs, and each one inserts a row into the job run log Iceberg table when it begins.

Does INSERT cause a concurrent update issue?

Or is it caused by another issue?

FYI

I have set additional Table properties

write.metadata.delete-after-commit.enabled → True
write.metadata.previous-versions-max → 3

Would these properties cause concurrent update problems?

Referring to the below, I see that concurrent INSERT does work

=========

According to my Dremio logs, for the job_run_log table, all of the queries BEFORE and AFTER the failing query are just INSERT statements.

So I wonder why it failed.

Attached is the log file of the failed INSERT SQL for job_run_log table

c5ae252b-6703-4119-a9f9-77a7ab77fd76.zip (26.6 KB)

@Ken I see you are running multiple concurrent inserts to the same table. Is that not correct?

However, Apache Iceberg's Serializable isolation level with non-locking table semantics can result in scenarios in which write collisions occur. In these circumstances, the SQL command that finishes second fails with an error. Such failures occur only for a subset of combinations of two SQL commands running concurrently on a single Iceberg table.

Hi @balaji.ramaswamy

Yes. That’s correct.

And I thought concurrent inserts would not result in failure.

The documentation says

(...) Such failures occur only for a subset of combinations of two SQL commands running concurrently on a single Iceberg table

According to the screenshot below, running 2 concurrent inserts should be fine, right?

@Ken Let me check on this and get back to you