Registering an existing Iceberg table in a Nessie catalog

Hello,

Like many of you, I made the same mistake, and after a reboot my Nessie catalog was gone.

Thanks to @AlexMerced and @tolgaevren, I am now able to configure Nessie correctly so that it persists the catalog after a reboot.

However, I would like to avoid recreating the data in S3 (re-running the CREATE TABLE AS SELECT * command), because it takes a few hours. I have already done this in the past, and the data is already in Iceberg format in S3 with the metadata.json file, so there must be a better way to simply register it in the catalog, no?

Isn’t there a way to do that using the Dremio UI? Or using the Nessie API with a simple curl / Postman request?

There isn’t a register table method in Dremio.

To move the tables in S3 with Dremio:

  • Create another Dremio source pointing to S3.
  • Then do a CTAS between the sources.

Iceberg metadata uses absolute file paths, so you have to rewrite the metadata anyway if the location of the files is going to change.

If you were moving catalogs, not storage locations, then you’d be able to use the register table method in Spark or the catalog migrator tool, which can be found in the Nessie or Polaris GitHub repo. Either one moves the metadata reference without rewriting the data, but this assumes the data location stays the same.
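
For the Spark route, here is a minimal sketch of what the call could look like from PySpark. It assumes a Spark session that already has an Iceberg catalog configured against Nessie; the catalog name "nessie", the table name, and the metadata path in the usage comment are all placeholders, not values from this thread.

```python
def build_register_table_sql(catalog: str, table: str, metadata_file: str) -> str:
    """Build the Spark SQL CALL for Iceberg's register_table procedure.

    catalog       -- name of the Iceberg catalog configured in Spark (hypothetical here)
    table         -- target identifier in that catalog, e.g. "db.my_table"
    metadata_file -- full path to the existing *.metadata.json file
    """
    for value in (catalog, table, metadata_file):
        # Keep the sketch simple: refuse values that would need SQL escaping.
        if "'" in value or "\\" in value:
            raise ValueError(f"unsupported character in {value!r}")
    return (
        f"CALL {catalog}.system.register_table("
        f"table => '{table}', "
        f"metadata_file => '{metadata_file}')"
    )


def register_table(spark, catalog: str, table: str, metadata_file: str):
    """Run the procedure on an existing SparkSession (passed in by the caller)."""
    return spark.sql(build_register_table_sql(catalog, table, metadata_file))


# Usage, with placeholder values:
# register_table(spark, "nessie", "db.my_table",
#                "s3://<bucket>/<table-path>/metadata/<latest>.metadata.json")
```

This only adds a catalog entry pointing at the existing metadata file, so no data files are rewritten, which is why it only works when the files have not moved.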

There is also a change_prefix Spark method for Iceberg, which you can read about in the Iceberg docs at iceberg.apache.org, for rewriting the metadata if you want to move the data.
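
A rough sketch of the prefix rewrite from PySpark, assuming the method meant here is the rewrite_table_path procedure listed in the Iceberg Spark procedures docs for recent releases (check that your Iceberg version has it). The catalog name and both prefixes in the usage comment are placeholders.

```python
def build_rewrite_table_path_sql(
    catalog: str, table: str, source_prefix: str, target_prefix: str
) -> str:
    """Build the Spark SQL CALL for Iceberg's rewrite_table_path procedure.

    Assumption: the procedure takes table, source_prefix and target_prefix
    as named arguments, as described in the Iceberg docs.
    """
    for value in (catalog, table, source_prefix, target_prefix):
        # Keep the sketch simple: refuse values that would need SQL escaping.
        if "'" in value or "\\" in value:
            raise ValueError(f"unsupported character in {value!r}")
    return (
        f"CALL {catalog}.system.rewrite_table_path("
        f"table => '{table}', "
        f"source_prefix => '{source_prefix}', "
        f"target_prefix => '{target_prefix}')"
    )


def rewrite_table_path(spark, catalog, table, source_prefix, target_prefix):
    """Run the procedure on an existing SparkSession (passed in by the caller)."""
    sql = build_rewrite_table_path_sql(catalog, table, source_prefix, target_prefix)
    return spark.sql(sql)


# Usage, with placeholder values:
# rewrite_table_path(spark, "nessie", "db.my_table",
#                    "s3://<old-bucket>/<path>", "s3://<new-bucket>/<path>")
```

As far as I understand the docs, this stages rewritten metadata and reports what needs to be copied; copying the files to the new location and registering the table there are still separate steps.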