Best Catalog for Apache Iceberg w/ Dremio?

Q: Is there a recommended catalog to use with Dremio and Apache Iceberg?

I’m new to Apache Iceberg and Dremio. I’m using MinIO as an S3-compatible store because we’re on-prem, and Docker to run all our services on a single machine (for now).

I ran the Apache Iceberg quick start demo, which includes a REST catalog: apache/iceberg-rest-fixture.

Do I understand correctly that Dremio can’t talk to these simple REST catalogs? I swapped it for Nessie, but then discovered that some of the Python SQL examples that come with the tabulario/spark-iceberg example don’t work:

%%sql

ALTER TABLE nyc.taxis RENAME COLUMN fare_amount TO fare

...

AnalysisException: [UNSUPPORTED_FEATURE.TABLE_OPERATION] The feature is not supported: Table `spark_catalog`.`nyc`.`taxis` does not support RENAME COLUMN. Please check the current catalog and namespace to make sure the qualified table name is expected, and also check the catalog implementation which is configured by "spark.sql.catalog".

It seems that my Spark instance isn’t picking up my configuration changes, and the error above comes from Spark falling back to its default built-in catalog. I’ve tried both setting spark-defaults.conf and passing the configuration at runtime, but neither works. Since this is a Spark issue and not a Dremio one, I’ll take that question to the Spark community.
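For reference, here’s roughly the runtime configuration I’m passing, as a PySpark sketch (the catalog name nessie, the Nessie URI, and the MinIO endpoint/settings are placeholders from my setup, not anything official):

from pyspark.sql import SparkSession

# Placeholder values from my setup: catalog name "nessie",
# Nessie server URI, MinIO endpoint. Adjust for your environment.
spark = (
    SparkSession.builder
    .appName("iceberg-nessie")
    # Iceberg's SQL extensions (needed for some Iceberg DDL/procedures)
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    # Register an Iceberg catalog named "nessie" backed by Nessie
    .config("spark.sql.catalog.nessie", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.nessie.catalog-impl",
            "org.apache.iceberg.nessie.NessieCatalog")
    .config("spark.sql.catalog.nessie.uri", "http://nessie:19120/api/v1")
    .config("spark.sql.catalog.nessie.ref", "main")
    .config("spark.sql.catalog.nessie.warehouse", "s3a://warehouse/")
    # Point Iceberg's S3 file IO at MinIO
    .config("spark.sql.catalog.nessie.io-impl",
            "org.apache.iceberg.aws.s3.S3FileIO")
    .config("spark.sql.catalog.nessie.s3.endpoint", "http://minio:9000")
    .config("spark.sql.catalog.nessie.s3.path-style-access", "true")
    .getOrCreate()
)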

Hi coldfusioncoder,

It seems from the link you’ve shared that you’ve configured a Hadoop catalog, which stores table metadata directly in the filesystem. While it supports most operations, Spark’s SQL parser and the Iceberg integration do not fully support ALTER TABLE … RENAME COLUMN with it.
iceberg-rest-fixture is really just a mock REST catalog for Iceberg, with a Hadoop catalog underneath. The REST catalog is only an API definition; it doesn’t guarantee that every implementation is feature-complete, and the engine (Spark in this case) still has to implement the SQL features on its side.
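One quick check on the Spark side: your error shows the table resolving through the built-in spark_catalog. If you qualify the table with the name you registered your Iceberg catalog under (I’m assuming nessie here), the DDL is routed to the Iceberg catalog, where RENAME COLUMN is supported:

# Assuming the Iceberg catalog was registered as "nessie"; the
# three-part name routes the DDL to it instead of spark_catalog.
spark.sql("ALTER TABLE nessie.nyc.taxis RENAME COLUMN fare_amount TO fare")

# Or switch the session's current catalog first:
spark.sql("USE nessie")
spark.sql("ALTER TABLE nyc.taxis RENAME COLUMN fare_amount TO fare")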
Regarding your initial question, I would recommend the Nessie catalog, which is natively supported by Dremio and fairly mature for use with Spark, with a broad feature set. Maybe this link helps you with the setup: Nessie + Iceberg + Spark - Project Nessie: Transactional Catalog for Data Lakes with Git-like semantics.
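Once it’s set up, those Git-like semantics are also usable straight from Spark SQL. A small illustrative sketch, assuming a catalog registered as nessie and the Nessie Spark SQL extensions (org.projectnessie.spark.extensions.NessieSparkSessionExtensions) on the classpath; the branch name etl is made up:

# Illustrative only; requires the Nessie Spark SQL extensions.
spark.sql("CREATE BRANCH IF NOT EXISTS etl IN nessie FROM main")
spark.sql("USE REFERENCE etl IN nessie")   # subsequent reads/writes hit the branch
spark.sql("LIST REFERENCES IN nessie").show()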