Dremio Iceberg JDBC catalog

txalaparta · May 16, 2023, 7:11am

In my working scenario I have a client environment written in Java, and I set up a development architecture by deploying the tabular-io/docker-spark-iceberg Docker images (from GitHub) together with S3-compatible local storage (MinIO). This setup uses a REST Iceberg catalog backed by a SQLite database. I am able to create and manipulate the Iceberg tables programmatically in Spark (Java, SQL, Python), but when I try to connect them to Dremio, this is the error I get:
This folder does not contain a filesystem-based Iceberg table. If the table in this folder is managed via a catalog such as Hive, Glue, or Nessie, please use a data source configured for that catalog to connect to this table.

On the other hand, I have successfully created an Iceberg table from Dremio in the same MinIO bucket.
I am wondering if a JDBC catalog (I could use PostgreSQL, for example) would be recognised by Dremio… Or should I install Hive and connect it to MinIO?
I would like to avoid the need for a spark environment if possible.
Thanks
Oskar

YuriyGavrilov · May 16, 2023, 6:41pm

Same message, same problem.

balaji.ramaswamy · May 17, 2023, 1:26am

@txalaparta When you created the Iceberg table, which catalog was set?

txalaparta · May 18, 2023, 5:41am

I created the Iceberg table with the REST catalog built into the tabulario/iceberg-rest Docker image.

I also created another table in Dremio, and this one saves its data in MinIO. However, I cannot connect to it from the Java API or from PyIceberg. What type of catalog does Dremio create? According to the Dremio documentation, MinIO should be a Hadoop Iceberg catalog, right?
Thanks
Oskar

nicolas.guerra · May 18, 2023, 2:21pm

We are using Dremio with MinIO and Apache Iceberg. So far the best results are with a Hadoop-type Iceberg catalog in the same bucket, because in Dremio you must configure the catalog, not the bucket on MinIO.
This is an example of creating the catalog on a MinIO bucket with Spark:
spark.conf.set('spark.sql.catalog.silver_data', 'org.apache.iceberg.spark.SparkCatalog')
spark.conf.set('spark.sql.catalog.silver_data.type', 'hadoop')
spark.conf.set('spark.sql.catalog.silver_data.warehouse', 's3a://warehouse-silver-data')
The table metadata and data are then written to this bucket. Dremio can read that bucket, and you can take every folder as a table and format it as Apache Iceberg.
We are waiting for the Nessie integration…
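
A minimal sketch of how the three settings above could be collected in one place and applied when the Spark session is built, assuming PySpark. The catalog name `silver_data` and the warehouse bucket come from the post. The helper names `hadoop_catalog_conf` and `build_session`, the endpoint `http://minio:9000`, and the two `fs.s3a` settings are assumptions added for illustration; credentials are left out and nothing here is confirmed by the thread.

```python
def hadoop_catalog_conf(catalog, warehouse, s3_endpoint=None):
    """Return Spark settings for a Hadoop-type Iceberg catalog as a dict."""
    prefix = f"spark.sql.catalog.{catalog}"
    conf = {
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "hadoop",
        f"{prefix}.warehouse": warehouse,
    }
    if s3_endpoint is not None:
        # Assumption: a MinIO deployment needs an explicit S3A endpoint
        # and path-style access; adjust or drop these for your setup.
        conf["spark.hadoop.fs.s3a.endpoint"] = s3_endpoint
        conf["spark.hadoop.fs.s3a.path.style.access"] = "true"
    return conf


def build_session(conf):
    """Create a SparkSession with the given settings (requires PySpark)."""
    from pyspark.sql import SparkSession  # imported lazily on purpose

    builder = SparkSession.builder.appName("iceberg-hadoop-catalog")
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


# Hypothetical endpoint; the catalog name and bucket are taken from the post.
settings = hadoop_catalog_conf(
    "silver_data", "s3a://warehouse-silver-data", "http://minio:9000"
)
for key, value in settings.items():
    print(f"{key} = {value}")
```

Calling `build_session(settings)` would then return a session in which `silver_data` is available as a catalog, provided the Iceberg Spark runtime and the S3A libraries are on the classpath.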

txalaparta · May 19, 2023, 5:56am

Thanks for the info Nicolas.
It is very helpful.
I will try to configure a Hadoop-type catalog.

YuriyGavrilov · May 22, 2023, 2:53pm

I used the Trino Iceberg connector to create the catalog, but it does not work. The table just can't be read (same message). See the Iceberg connector page in the Trino 418 documentation.

YuriyGavrilov · May 23, 2023, 8:26am

Thanks @nicolas.guerra

I tried the Tabular guide "Docker, Spark, and Iceberg: The Fastest Way to Try Iceberg!" as a REST catalog for Trino (Iceberg connector, Trino 418 documentation), then created an Iceberg table in an S3 path using Trino.
But Dremio still can't read this folder, and I receive the same message, "This folder does not contain filesystem based Iceberg table…". I also tried a different S3 provider, with the same result.
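
One way to investigate this message is to look at the file names inside the table's `metadata` folder. Iceberg's Hadoop catalog writes a `version-hint.text` file and `vN.metadata.json` files, whereas tables created through other catalogs typically have metadata files named like `00000-<uuid>.metadata.json` and keep the pointer to the current version in the catalog itself. The sketch below checks a list of object keys for the Hadoop-style layout. It assumes that this layout is what Dremio's filesystem-based detection looks for, which this thread does not confirm; the function name and the sample keys are hypothetical.

```python
import re

# Metadata file names as written by Iceberg's Hadoop catalog: v1.metadata.json, ...
HADOOP_METADATA = re.compile(r"^v\d+\.metadata\.json$")


def looks_like_hadoop_table(keys, table_prefix):
    """Return True if the keys under table_prefix follow the Hadoop-catalog layout."""
    meta_prefix = table_prefix.rstrip("/") + "/metadata/"
    names = [k[len(meta_prefix):] for k in keys if k.startswith(meta_prefix)]
    has_hint = "version-hint.text" in names
    has_versioned = any(HADOOP_METADATA.match(n) for n in names)
    return has_hint and has_versioned


# Hypothetical listings of a bucket, e.g. collected with any S3 client.
hadoop_keys = [
    "db/t/metadata/version-hint.text",
    "db/t/metadata/v1.metadata.json",
    "db/t/data/part-0.parquet",
]
rest_keys = [
    "db/t/metadata/00000-3f2a.metadata.json",
    "db/t/data/part-0.parquet",
]

print(looks_like_hadoop_table(hadoop_keys, "db/t"))  # True
print(looks_like_hadoop_table(rest_keys, "db/t"))    # False
```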

txalaparta · May 23, 2023, 8:43am

I've left this issue aside for a while and am not planning to continue with it yet. In any case, I think the solution is to implement a Hadoop or Hive catalog instead of JDBC or REST. I am quite sure the last two will not work in Dremio.
I saw some documentation on how to configure Hadoop to connect to MinIO and also a GitHub repository for creating such a catalog in Trino. (Sorry, but I don't have the links right now.)
But yes, it seems that Spark (or Trino) is needed.

Good luck

nicolas.guerra · May 24, 2023, 7:44pm

I think the best way to work with MinIO and Iceberg is to get the Nessie connector; it's pretty useful.
Right now we are using an Airflow-Spark solution that saves data in raw format in Iceberg tables using a Hadoop catalog. Then, in Dremio, we read all the data by formatting the folder that contains the data and metadata folders. For DDL tasks and data management we are using Jupyter notebooks.
