Iceberg: Choosing a catalog when using "Dremio Software" with other compute engines

Iceberg currently supports a lot of catalog implementations for common engines like Spark and Flink. I was wondering what the best catalog would be when Iceberg and Dremio are used together in a multi-engine environment.

This thread is inspired by the discussions in:

  1. Can Dremio be co-used with other compute engine for modifying Iceberg table?
  2. Incremental data reflection with Iceberg

I’d like to scope it to the “Dremio Software” edition only, since Arctic is available in Dremio Cloud and an obvious choice there (unless the “Preview” disclaimer is a blocker for some). With regard to thread 1, I suggest at least initially also scoping to Dremio only reading data, in order to avoid discussions about concurrent writes.

The options known to me are:

  1. Glue catalog
  2. Hive catalog
  3. Hadoop catalog
  4. Nessie catalog (using an undocumented services.nessie.remote-uri config in dremio.conf)
  5. Roll your own support for the JDBC catalog, REST catalog or similar
  6. Wait for Arctic to come to the “Software” version

Short comments on the options above:

  1. Glue may be an option, but it isn’t one for non-AWS clouds or on-prem deployments
  2. I would rather avoid all the complexities of Hive (unless you’re deeply invested in Hive already, but not everyone is - my company isn’t either)
  3. The lowest common denominator. For one, engines doing writes need an explicit lock manager to coordinate writes, since the catalog doesn’t provide one
  4. This might not even be officially supported, and as far as I can test, Dremio-wise it seems to be treated very similarly to the Hadoop catalog
  5. This seems like a daunting task. It may also be in vain, since Dremio may add more catalogs in the future (Dremio’s roadmap isn’t open, so no one knows)
  6. One could hope. I haven’t heard of any ambitions of it being added to the “Software” version (again, the roadmap isn’t open, so no one knows)
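To make the trade-offs concrete, here is a rough sketch of how the options above differ from an engine’s point of view, expressed as illustrative catalog configuration maps (loosely modeled on Spark/pyiceberg-style Iceberg catalog properties; the hosts, ports, and warehouse paths are placeholders, not real endpoints):

```python
# Illustrative catalog configs -- property names are in the style of Iceberg
# engine configuration, but treat them as a sketch, not copy-paste values.
CATALOG_CONFIGS = {
    # 1. Glue: AWS-only, catalog state lives in the Glue Data Catalog
    "glue": {"type": "glue", "warehouse": "s3://my-bucket/warehouse"},
    # 2. Hive: needs a running Hive Metastore service (thrift)
    "hive": {"type": "hive", "uri": "thrift://metastore:9083"},
    # 3. Hadoop: no service at all, just a warehouse path -- which is exactly
    #    why concurrent writers need an external lock manager
    "hadoop": {"type": "hadoop", "warehouse": "s3a://my-bucket/warehouse"},
    # 4./5. Nessie or a REST catalog: an HTTP service that owns commits
    "rest": {"type": "rest", "uri": "http://catalog-host:8181"},
}

def needs_external_locking(name: str) -> bool:
    """Only the Hadoop catalog delegates write coordination to the clients."""
    return CATALOG_CONFIGS[name]["type"] == "hadoop"
```

The structural difference is the last one: every option except the Hadoop catalog puts a service in charge of the commit, so writers cannot bypass coordination.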

For context, in my company we’re currently:

  1. Running Dremio OSS on-prem
  2. Using a per-table Hadoop catalog
  3. Using an AWS S3 remote file source (MinIO) and manually configuring folders as Iceberg tables
  4. Using a custom-built lock manager to coordinate commits by writers

I’d like to do better, and there are a lot of good reasons to use a proper catalog. Especially item 4 above is an issue for me, since it makes writing to Iceberg risky. You need to be certain you have nailed the LockManager configuration, because, unlike with other catalogs, if you forget or misconfigure the LockManager, writing to the table still succeeds. We have corrupted more than one table this way.
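To illustrate why a forgotten or misconfigured lock manager is so dangerous: the lock is purely advisory, so a writer that never consults it still succeeds. A minimal sketch, using a hypothetical file-based lock (our real setup is DynamoDB-lock-manager-style over Zookeeper; nothing below is that implementation):

```python
# Sketch: an advisory lock only protects writers that actually use it.
import os
import tempfile

class AdvisoryLock:
    """Hypothetical file-based advisory lock (create-exclusive as the lock)."""
    def __init__(self, path: str):
        self.path = path

    def acquire(self) -> bool:
        try:
            fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            return True
        except FileExistsError:
            return False  # someone else holds the lock

    def release(self) -> None:
        os.remove(self.path)

def careless_writer_can_write(table_dir: str) -> bool:
    # A writer that never asks the lock manager still succeeds -- nothing in
    # the Hadoop catalog itself stops it. This is the corruption window.
    with open(os.path.join(table_dir, "v2.metadata.json"), "w") as f:
        f.write("{}")
    return True

table_dir = tempfile.mkdtemp()
lock = AdvisoryLock(os.path.join(table_dir, ".lock"))
assert lock.acquire()                         # well-behaved writer holds lock
assert careless_writer_can_write(table_dir)   # bad actor writes anyway
lock.release()
```

The point of the sketch: correctness depends entirely on every writer cooperating, which is exactly the property a real catalog service removes.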

TL;DR - can we do better than the Hadoop catalog and still use Dremio for reading Iceberg tables - now or in the near future? @Benny_Chow may have some insights/recommendations, but what are others in the community doing?

Hi @wundi,

Thanks for your great post describing your line of thinking very clearly. I have very similar thoughts and questions. Did you ever receive an answer, or how did you proceed?

Many thanks, Tim

@wundi and Dr Tim,

Let me discuss this internally and get back here

Thanks
Bali

Hi Tim,

Frankly, we are still running in exactly the way I described. However, a lot has changed over the last couple of years, so we are hoping to make the switch soon. We have narrowed our catalog options to a few:

  1. REST catalog
    1.1 The Tabular rest catalog on top of a JDBC catalog
    1.2 Polaris
  2. Nessie

The common thread across all the technologies we are using is that they all support the REST catalog, or soon will, so that is where we’re heading. Based on Dremio announcing support for the REST catalog, we had hoped that support would have landed in the recently released 25.1, but that is not the case. And since the roadmap isn’t open, we’re not sure when that’ll happen.
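For reference, the attraction of the REST catalog is that every engine speaks the same small HTTP surface. A sketch of the main endpoints per the public Iceberg REST catalog OpenAPI spec (the host and any prefix are placeholders):

```python
# Endpoint sketch for the Iceberg REST catalog spec; "catalog-host" is a
# placeholder, and real clients discover a prefix via the /v1/config call.
BASE = "http://catalog-host:8181/v1"

def config_url() -> str:
    # Engines call this first to discover defaults/overrides and any prefix.
    return f"{BASE}/config"

def list_namespaces_url(prefix: str = "") -> str:
    p = f"{prefix}/" if prefix else ""
    return f"{BASE}/{p}namespaces"

def load_table_url(namespace: str, table: str, prefix: str = "") -> str:
    # Multi-level namespaces are joined with the %1F unit separator per spec.
    p = f"{prefix}/" if prefix else ""
    ns = "%1F".join(namespace.split("."))
    return f"{BASE}/{p}namespaces/{ns}/tables/{table}"
```

Any engine that can build and call these paths can share the catalog, which is what makes it the obvious multi-engine convergence point.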

If we want to move now and have Dremio support, the only option is to go for Nessie as the catalog. Dremio does support Nessie, but the REST catalog API support in Nessie is still experimental.
So we’d either have to test Nessie’s REST API support extensively, or do a hybrid of options 1 and 2 above, using Nessie as the underlying catalog for the Tabular REST catalog implementation. However, that would also need to be tested rather well before I’d be comfortable running it in production.

The reason we haven’t adopted Nessie already is that not all of our tech stack supports it, and we are reluctant to take on the additional GC maintenance required to run Nessie properly. It might not be a big deal, but it is additional complexity I’d like to avoid unless strictly necessary.

The last and rather obvious option is to wait until REST catalog support is released by Dremio. Currently, we do have some non-production technologies that don’t support the Hadoop catalog. In order to use those, someone hacked a bit on the Tabular catalog to support exposing a Hadoop catalog as a REST catalog - one REST catalog per Hadoop catalog. This obviously doesn’t scale to many tables and is a pain to manage, but it does serve as a workaround for us for now.
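For anyone curious what such a bridge involves, here is a hedged sketch of just the table-listing part: scanning a Hadoop-catalog warehouse directory and answering in the Iceberg REST ListTablesResponse shape. The directory-layout assumptions are ours, and a real bridge would also have to serve table metadata, config, and commits:

```python
# Sketch: treat a file-based Hadoop catalog warehouse as the backing store
# for a read-only, REST-shaped table listing. Layout assumption: a directory
# is an Iceberg table if it contains a metadata/ subdirectory.
import os

def list_tables(warehouse: str, namespace: str) -> dict:
    """Return an Iceberg-REST-shaped ListTablesResponse for one namespace."""
    ns_dir = os.path.join(warehouse, namespace)
    identifiers = [
        {"namespace": [namespace], "name": d}
        for d in sorted(os.listdir(ns_dir))
        if os.path.isdir(os.path.join(ns_dir, d, "metadata"))
    ]
    return {"identifiers": identifiers}
```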

Hope that helps @tid


@balaji.ramaswamy, were you able to check on this?

@wundi, thanks for your detailed opinion on various options and this helped me explore a lot of options and these are my thoughts.

The Tabular REST catalog doesn’t seem to be actively maintained, has limited docs, and there are high-severity CVEs open against its Docker images, which hampers adoption in an enterprise.

To adopt Nessie, I don’t see a use case right away other than the REST Iceberg backend, and I feel the same as you about the maintenance - not to mention that maintenance of Iceberg tables is complicated with Nessie.

Nessie - Apache Iceberg™.

Polaris seems to be promising, but it is yet to have its first release, and with no UI support it will be cumbersome to manage access control.

A plain JDBC catalog would have been easy if Dremio Software had support for it.

For what it’s worth, we may end up using Iceberg on S3 with a lock manager, but I would like to know your implementation pitfalls or any open source lock manager you can recommend that’s worth trying out!

Many thanks, Steen! @wundi

I lost track of the discussion because of other topics. There doesn’t seem to be much official communication about the catalog topic from Dremio, unfortunately. I heard some rumours at the first “Data Lakehouse Bytes” meetup in Germany recently that something is coming with 25.2, and that Dremio is leaning towards Polaris, with generic support for Iceberg REST later.

Would love to hear something official though. @bali maybe? :wink:

Many thanks, Tim

@tid

Dr Tim, I am discussing this internally and will get back on this thread

Thanks
Bali

I just tested the newly released 25.2.0, and it seems to work with our custom version of the Tabular REST catalog (only custom because we added support for Hadoop catalog tables).

It looks like generic REST catalog support to me, even though the release notes specifically mention Polaris and Unity.

Support for the REST catalog seems to be there now in the new 25.2.0 release (if you set the plugins.restcatalog.enabled support key to true).
We’ll explore that to confirm it works the way we expect. We’re able to list our tables by exposing our file-based Hadoop catalogs via a modified Tabular REST catalog, so initial trials are positive.

Polaris seems to be promising but it is yet to have its first release and with no UI support it will be cumbersome to manage Access Control.
Polaris seems promising, but there are some blockers and concerns for us:

  1. We run on-prem and use MinIO, which Polaris doesn’t currently support

  2. I am a bit concerned about the potential performance overhead of credential vending

  3. Initial bootstrapping seems a bit on the complicated side (principals, catalog/principal roles, RBAC etc.)

  1. is obviously a showstopper for us. 2. we’d have to test and estimate the impact on performance, if any. 3. is only based on skimming the documentation, so it may turn out to be rather trivial in practice.
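On concern 2, my understanding of credential vending (hedged, based on the REST spec’s LoadTableResult shape): the catalog can return short-lived storage credentials in the response’s config map, which means extra work per table load and credential refreshes mid-query. A sketch, with S3 property names taken from Iceberg’s S3 FileIO conventions but best treated as illustrative:

```python
# Sketch: extract vended S3 credentials from a LoadTableResult-shaped dict.
# Key names follow Iceberg S3 FileIO properties; treat them as illustrative.
from typing import Optional

def vended_s3_credentials(load_table_result: dict) -> Optional[dict]:
    cfg = load_table_result.get("config", {})
    keys = ("s3.access-key-id", "s3.secret-access-key", "s3.session-token")
    if all(k in cfg for k in keys):
        return {k: cfg[k] for k in keys}
    return None  # catalog did not vend credentials; client uses its own

# The overhead concern in one sentence: every table load may now involve the
# catalog minting short-lived credentials (e.g. via STS or an equivalent),
# and clients must refresh them if they expire mid-query.
```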

For what it’s worth, we may end up using Iceberg on S3 with a lock manager but would like to know your implementation pitfalls or any open source lock manager you can recommend that’s worth trying out!!

I would advise against it, if at all possible. We implemented our own custom locking, based on the AWS DynamoDB lock manager and backed by Zookeeper (because we’re already using that for other purposes), but it took some time before we were confident that the issues we were having weren’t with our locking implementation. And even when it works properly, every client has to “opt in” and do the right thing for the locking to be correct. One bad actor, willingly or not, is enough to corrupt the table. You can reasonably control that in a super small organization, but it becomes difficult rather quickly.

We decided to do the above when the catalog landscape (and support) was vastly different. I’d go for any catalog if starting from scratch today. One new candidate to the list would be this one:

It still seems like early days, but they do seem to suffer from some of the same issues we have running on-prem. I haven’t tested their catalog in practice, but I thought I’d mention it.