Hi,
I am using a local Docker setup with MinIO and Nessie. When I restart the containers, the Nessie catalog no longer sees the tables. This is what I see after the restart:
I tried to add the table folders again, but I can't. How can I solve this?
thanks
tolga
If you are running the Docker container for Nessie, you have to configure where it stores its entries. Otherwise it defaults to the in-memory store, which is cleared when the server shuts down.
You can find the Nessie storage configuration here: Configuration - Project Nessie: Transactional Catalog for Data Lakes with Git-like semantics
Hi Alex, thanks for your help. After setting the database configuration, my metadata is now persisted.
Tolga
shifas
January 29, 2024, 10:12am
4
Hi, can you please elaborate on the config setup?
I am facing the same issue here.
Hi Shifas,
This is my docker-compose.yml file (only the Nessie part).
I am using a local PostgreSQL database for the catalog data. I created a database called nessie in Postgres (you can give it any name).
nessie:
  image: projectnessie/nessie:0.67.0
  container_name: nessie
  extra_hosts:
    - "host.docker.internal:host-gateway"
  environment:
    NESSIE_VERSION_STORE_TYPE: JDBC
    QUARKUS_DATASOURCE_JDBC_URL: "jdbc:postgresql://192.168.0.10:5432/nessie"
    quarkus.datasource.username: "postgres"
    quarkus.datasource.password: "changeme"
  networks:
    network:
      ipv4_address: 192.168.0.19
  ports:
    - 19120:19120
thanks
tolga
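For anyone who would rather run PostgreSQL inside the same Compose file instead of on the host, a minimal sketch might look like the one below. It is not taken from the setup above: the service name, the `postgres:16` tag, the credentials, and the volume name `nessie-pgdata` are all assumptions. `POSTGRES_DB` asks the official image to create the `nessie` database on first initialization, and the named volume keeps the catalog tables across container restarts.

```yaml
services:
  postgres:
    image: postgres:16            # assumed tag; pin whichever version you actually use
    container_name: postgres
    environment:
      POSTGRES_USER: postgres       # assumed; matches the username in the Nessie snippet
      POSTGRES_PASSWORD: changeme   # assumed; replace with your own secret
      POSTGRES_DB: nessie           # database created automatically on first start
    ports:
      - "5432:5432"
    volumes:
      # Named volume so the Nessie catalog tables survive restarts
      - nessie-pgdata:/var/lib/postgresql/data

volumes:
  nessie-pgdata:
```

With this layout, and assuming both services share a Compose network, the JDBC URL in the Nessie service could point at `postgres:5432` instead of a fixed IP address.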
shifas
February 15, 2024, 6:57am
6
Have you checked the changes in PostgreSQL, i.e. how the data is written to the database for persistence?
Yes, I checked it. It creates two tables, and it updates them for each DML operation. Did you? If you have any insights, please share.
@tolgaevren Are you also losing your tables after a Dremio Docker restart?
No, the tables' data (as Parquet files) is in MinIO object storage and the catalog info is in the database. It must be persistent. What happens on your side?
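To illustrate the split described here, with table data in MinIO and catalog info in the database, the object-storage half can be kept on a volume as well. This is only a sketch: the image tag, credentials, and volume name are hypothetical and not taken from anyone's setup in this thread.

```yaml
services:
  minio:
    image: minio/minio              # assumed image; pin a release tag in practice
    container_name: minio
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: admin            # hypothetical credentials
      MINIO_ROOT_PASSWORD: changeme123  # hypothetical; MinIO requires at least 8 characters
    ports:
      - "9000:9000"   # S3 API
      - "9001:9001"   # web console
    volumes:
      # Parquet and Iceberg metadata files live here and survive restarts
      - minio-data:/data

volumes:
  minio-data:
```

If the files are still in MinIO after a restart but the tables are gone, the place to look is the catalog store rather than the object storage.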
@tolgaevren Sorry, the question was directed to @shifas.
@shifas When you say you have the same issue, does that mean after a restart your tables no longer exist?
shifas
February 22, 2024, 6:26am
11
Yes. As mentioned by @tolgaevren, the tables' data files exist in the MinIO storage, but the catalog info (i.e., the tables) does not appear after a restart.
@shifas Got it, we may have to look at the log file to see what is going on. Can you provide the table name and the server.log from when Dremio comes up? Also, after it comes up, add the table again, query it, and send us the profile and the server.log for that time as well.
postgres:
  image: postgres:latest
  container_name: postgres
  env_file:
    - .env/postgres.env
  ports:
    - "5432:5432"
  volumes:
    - ./postgres:/var/lib/postgresql/data
  healthcheck:
    test: ["CMD-SHELL", "pg_isready -U $${POSTGRES_USER} -d $${POSTGRES_DB}"]
    interval: 10s
    timeout: 5s
    retries: 5
    start_period: 30s
nessie:
  image: projectnessie/nessie:latest
  container_name: nessie
  environment:
    NESSIE_VERSION_STORE_TYPE: JDBC2
    NESSIE_VERSION_STORE_PERSIST_JDBC_DATASOURCE: postgresql
    QUARKUS_DATASOURCE_POSTGRESQL_JDBC_URL: "jdbc:postgresql://postgres:5432/nessie"
    QUARKUS_DATASOURCE_POSTGRESQL_USERNAME: "nessie"
    QUARKUS_DATASOURCE_POSTGRESQL_PASSWORD: "nessie"
    NESSIE_SERVER_AUTHENTICATION_ENABLED: "false"
  depends_on:
    postgres:
      condition: service_healthy
    object-storage:
      condition: service_healthy
  ports:
    - "19120:19120"
  stdin_open: true
  tty: true
  restart: always
@tolgaevren My Docker setup won't work with the Postgres metadata backend. Could you please help?
EDIT: I used the latest Docker image, which is compatible with JDBC2.
@armaangohil Postgres in the example below is a source in Dremio that they use to insert data into Nessie. Have you followed the steps in the document below?
Thanks for sharing the link, @balaji.ramaswamy. I was able to achieve my desired outcome.
Thanks for the feedback, @armaangohil. That is good to hear.
abed
December 10, 2024, 9:42am
17
Thank you for this.
I experienced the same thing (when the server is restarted, I can't find the Iceberg tables) even though I followed the steps provided here: dremio-example-environment/readme.md at main · developer-advocacy-dremio/dremio-example-environment · GitHub
The last part of that guide also inserts data into Nessie.
How is that different from the one you shared?
Many thanks!
@abed It is important that the disk is persistent across restarts.
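As a rough illustration of what a persistent disk can mean for the Dremio container itself, the sketch below mounts a named volume for Dremio's local state. It rests on assumptions that should be checked against your image: the image name `dremio/dremio-oss`, the data directory `/opt/dremio/data`, and the volume name `dremio-data` are not confirmed anywhere in this thread.

```yaml
services:
  dremio:
    image: dremio/dremio-oss        # assumed image name
    container_name: dremio
    ports:
      - "9047:9047"     # web UI
      - "31010:31010"   # ODBC/JDBC clients
    volumes:
      # Assumed data directory; keeps Dremio's source definitions and metadata
      - dremio-data:/opt/dremio/data

volumes:
  dremio-data:
```

Note that `docker compose down -v` removes named volumes, so a restart done that way would still start from an empty disk.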