We are using dbt + Dremio CE 24.3.0 to create materialized Iceberg views of our queries.
We instruct dbt to create the Iceberg table on top of a specific S3 bucket, as Dremio CE doesn't support Glue integration.
We wish to be able to access these iceberg tables externally using PyIceberg.
While static table access is an option, we would prefer to do this through a standalone catalog.
What catalog does Dremio use when Iceberg tables are created on top of S3 buckets? Is it externally accessible? Is it configurable?
@sheinbergon @balaji.ramaswamy So, do we have a solution to this? How can Iceberg tables with data in S3 (and the default Hadoop cataloging) be accessed externally using PyIceberg? I'd also like to know whether the default Hadoop catalog (on Dremio) is exposed via a URL, using Dremio 25.2.x.
Yes. I just use dbt-dremio and create the tables using Glue as the table storage, not S3. Though not clearly documented in the dbt-dremio adapter, it works perfectly out of the box. I can then access these tables either through Dremio (Glue) or directly via PyIceberg's Glue catalog support.
@sheinbergon Thanks for the response. Yes, I used dbt-dremio to create Iceberg tables, but it uses the default Hadoop catalog, not the Glue catalog. You mentioned that after inserting through dbt you were able to access the tables via PyIceberg's Glue catalog; can you help me out here? How did you get the cataloging into Glue when it was in Hadoop (which Dremio used when inserting via dbt-dremio)? From what I understand, the catalog and the data are tied together; only the query engine can differ (Dremio or PyIceberg). I want to access data inserted through dbt-dremio using PyIceberg.