SQL via Dremio onto S3/Minio failing

Hi all,
hope someone can assist.
I’ve created an Iceberg table from Parquet files and written all of it to my S3 object store, which is based on MinIO.
I can query the table with DuckDB by pointing it at version-hint.txt and at v1.metadata.json, which contain only the value 1.

The other *.metadata.json files have contents like the below.
Please see the attached screenshot, taken when trying to execute SQL from within Dremio.

```

{"location":"s3://warehouse/output/asyncout","table-uuid":"6f993aef-f584-4213-940c-ee8e1535f6a6","last-updated-ms":1770132060110,"last-column-id":55,"schemas":[{"type":"struct","fields":[{"id":1,"name":"accountBICFI","type":"string","required":false},{"id":2,"name":"accountAgentId","type":"string","required":false},{"id":3,"name":"accountNumber","type":"string","required":false},{"id":4,"name":"chargeBearer","type":"string","required":false},{"id":5,"name":"accountEntityId","type":"string","required":false},{"id":6,"name":"accountId","type":"string","required":false},{"id":7,"name":"accountIdCode","type":"string","required":false},{"id":8,"name":"accountCustomerEntityId","type":"string","required":false},{"id":9,"name":"accountCustomerId","type":"string","required":false},{"id":10,"name":"fromId","type":"string","required":false},{"id":11,"name":"accountName","type":"string","required":false},{"id":12,"name":"currency","type":"string","required":false},{"id":13,"name":"amount","type":"double","required":false},{"id":14,"name":"baseCurrency","type":"string","required":false},{"id":15,"name":"baseAmount","type":"double","required":false},{"id":16,"name":"counterpartyAgentId","type":"string","required":false},{"id":17,"name":"counterpartyBICFI","type":"string","required":false},{"id":18,"name":"counterpartyEntityId","type":"string","required":false},{"id":19,"name":"counterpartyId","type":"string","required":false},{"id":20,"name":"counterpartyIdCode","type":"string","required":false},{"id":21,"name":"counterpartyNumber","type":"string","required":false},{"id":22,"name":"counterpartyCustomerEntityId","type":"string","required":false},{"id":23,"name":"counterpartyCustomerId","type":"string","required":false},{"id":24,"name":"counterpartyDomain","type":"string","required":false},{"id":25,"name":"toId","type":"string","required":false},{"id":26,"name":"counterpartyName_fullName","type":"string","required":false},{"id":27,"name":"direction","type":"string","required":false},{"id":28,"name":"eventId","type":"string","required":false},{"id":29,"name":"eventTime","type":"string","required":false},{"id":30,"name":"eventType","type":"string","required":false},{"id":31,"name":"transactionId","type":"string","required":false},{"id":32,"name":"transactionType","type":"string","required":false},{"id":33,"name":"instructedAgentId","type":"string","required":false},{"id":34,"name":"instructingAgentId","type":"string","required":false},{"id":35,"name":"paymentClearingSystemReference","type":"string","required":false},{"id":36,"name":"remittanceId","type":"string","required":false},{"id":37,"name":"localInstrument","type":"string","required":false},{"id":38,"name":"msgStatus","type":"string","required":false},{"id":39,"name":"msgType","type":"string","required":false},{"id":40,"name":"paymentMethod","type":"string","required":false},{"id":41,"name":"paymentReference","type":"string","required":false},{"id":42,"name":"settlementClearingSystemCode","type":"string","required":false},{"id":43,"name":"settlementDate","type":"string","required":false},{"id":44,"name":"settlementMethod","type":"string","required":false},{"id":45,"name":"tenantId","type":"string","required":false},{"id":46,"name":"schemaVersion","type":"long","required":false},{"id":47,"name":"creationDate","type":"string","required":false},{"id":48,"name":"requestExecutionDate","type":"string","required":false},{"id":49,"name":"metadataEventId","type":"string","required":false},{"id":50,"name":"metadataEventTime","type":"string","required":false},{"id":51,"name":"receivedTime","type":"string","required":false},{"id":52,"name":"systemEventId","type":"string","required":false},{"id":53,"name":"financialValue","type":"double","required":false},{"id":54,"name":"json_file","type":"string","required":false},{"id":55,"name":"line","type":"long","required":false}],"schema-id":0,"identifier-field-ids":[]}],"current-schema-id":0,"partition-specs":[{"spec-id":0,"fields":[]}],"default-spec-id":0,"last-partition-id":999,"properties":{},"snapshots":[],"snapshot-log":[],"metadata-log":[],"sort-orders":[{"order-id":0,"fields":[]}],"default-sort-order-id":0,"refs":{},"statistics":[],"partition-statistics":[],"format-version":2,"last-sequence-number":0}
```
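
One thing that may be worth checking in the metadata above: `"snapshots"` is `[]` and `"refs"` is `{}`, so this particular metadata file describes a table with no committed snapshot yet. Below is a minimal sketch of a check for that, assuming the metadata file has been copied locally. The function name `has_snapshots` is made up for illustration, and the `grep` test is a rough heuristic for single-line JSON like the above rather than a real parser.

```sh
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if an Iceberg metadata JSON file
# appears to list at least one snapshot, fails (exit 1) otherwise.
has_snapshots() {
  # An empty array, e.g. "snapshots":[] , means no snapshot was committed.
  if grep -Eq '"snapshots"[[:space:]]*:[[:space:]]*\[[[:space:]]*\]' "$1"; then
    return 1
  fi
  # Otherwise the key must at least be present in the file.
  grep -q '"snapshots"' "$1"
}
```

Usage would be something like `has_snapshots v2.metadata.json && echo "has data" || echo "no snapshots"`. If `jq` is available, `jq '.snapshots | length' v2.metadata.json` is the more robust way to get the same answer.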

Welcome to the community!

Add the MinIO S3 source in Dremio first; see the “Amazon S3” page in the Dremio documentation.

The entire environment was stood up using the attached docker-compose file, which includes a dremio-init block.

That block includes a section to add the MinIO source.

###########################################
#
# Dremio + Nessie + MinIO
#
###########################################

#    MINIO_ROOT_USER
services:

  # Init container to fix Nessie volume permissions
  nessie-init:
    image: alpine:latest
    container_name: nessie-init
    command: sh -c "adduser -D -u 185 default || true && chown -R default:root /data && chmod -R 775 /data"
    volumes:
      - nessie-data:/data
    restart: "no"
    labels:
      - "com.${COMPOSE_PROJECT_NAME}.service=init"
      - "com.${COMPOSE_PROJECT_NAME}.description=Initialize Nessie volume permissions"

  # Nessie Catalog Server Using RocksDB (Persistent Storage)
  nessie:
    image: projectnessie/nessie:latest
    container_name: nessie
    depends_on:
      nessie-init:
        condition: service_completed_successfully
    environment:
      - QUARKUS_OIDC_ENABLED=false
      - NESSIE_SERVER_AUTHENTICATION_ENABLED=false
      - NESSIE_VERSION_STORE_TYPE=ROCKSDB
    volumes:
      - nessie-data:/tmp/nessie-rocksdb-store
    ports:
      - "${NESSIE_PORT:-19120}:19120"
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:19120/api/v2/config"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    labels:
      - "com.${COMPOSE_PROJECT_NAME}.service=catalog"
      - "com.${COMPOSE_PROJECT_NAME}.description=Nessie Catalog Server"


  # Dremio Query Engine
  dremio:
    platform: linux/x86_64
    image: dremio/dremio-oss:latest
    container_name: dremio
    ports:
      - "${DREMIO_UI_PORT:-9047}:9047"
      - "${DREMIO_CLIENT_PORT:-31010}:31010"
      - "${DREMIO_FLIGHT_PORT:-32010}:32010"
    environment:
      - DREMIO_JAVA_SERVER_EXTRA_OPTS=-Dpaths.dist=file:///opt/dremio/data/dist
    volumes:
      - dremio-data:/opt/dremio/data
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9047"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 60s
    labels:
      - "com.${COMPOSE_PROJECT_NAME}.service=query-engine"
      - "com.${COMPOSE_PROJECT_NAME}.description=Dremio SQL Query Engine"

  # Dremio Init Container - Create Admin User and Nessie Catalog
  dremio-init:
    image: alpine:latest
    container_name: dremio-init
    depends_on:
      dremio:
        condition: service_healthy
      nessie:
        condition: service_healthy
      minio:
        condition: service_healthy
      minio-init:
        condition: service_completed_successfully
    command: >
      sh -c '
        echo "Installing dependencies...";
        apk add --no-cache curl jq >/dev/null 2>&1;

        echo "Waiting for Dremio to be ready...";
        for i in $$(seq 1 30); do
          if curl -f http://dremio:9047 >/dev/null 2>&1; then
            echo "Dremio is ready!";
            break;
          fi;
          echo "Waiting... ($$i/30)";
          sleep 2;
        done;

        echo "Creating admin user...";
        response=$$(curl -s -w "\n%{http_code}" -X PUT "http://dremio:9047/apiv2/bootstrap/firstuser" \
          -H "Content-Type: application/json" \
          -H "Authorization: null" \
          --data-raw "{\"userName\":\"${DREMIO_ADMIN_USERNAME:-mnadmin}\",\"firstName\":\"George\",\"lastName\":\"Leonard\",\"email\":\"georgelza@gmail.com\",\"createdAt\":$$(date +%s)000,\"password\":\"${DREMIO_ADMIN_PASSWORD:-mnpassword1}\"}");

        http_code=$$(echo "$$response" | tail -n1);
        body=$$(echo "$$response" | head -n-1);

        if [ "$$http_code" = "200" ]; then
          echo "✅ Admin user created successfully!";
          echo "   Username: ${DREMIO_ADMIN_USERNAME:-mnadmin}";
          echo "   Password: ${DREMIO_ADMIN_PASSWORD:-mnpassword1}";
        elif [ "$$http_code" = "400" ]; then
          echo "ℹ️  Admin user already exists (HTTP 400)";
        else
          echo "⚠️  Unexpected response (HTTP $$http_code): $$body";
          exit 1;
        fi;

        echo "";
        echo "Logging in to get token...";
        sleep 2;
        login_response=$$(curl -s -w "\n%{http_code}" -X POST "http://dremio:9047/apiv2/login" \
          -H "Content-Type: application/json" \
          --data-raw "{\"userName\":\"${DREMIO_ADMIN_USERNAME:-mnadmin}\",\"password\":\"${DREMIO_ADMIN_PASSWORD:-mnpassword1}\"}");
        login_code=$$(echo "$$login_response" | tail -n1);
        login_body=$$(echo "$$login_response" | head -n-1);

        if [ "$$login_code" = "200" ]; then
          TOKEN=$$(echo "$$login_body" | jq -r ".token");
          echo "✅ Login successful";
        else
          echo "❌ Login failed (HTTP $$login_code): $$login_body";
          exit 1;
        fi;

        echo "Token extracted: $${TOKEN:0:20}...";
        if [ -z "$$TOKEN" ] || [ "$$TOKEN" = "null" ]; then
          echo "❌ Failed to extract token from login response";
          echo "Response body: $$login_body";
          exit 1;
        fi;

        echo "";
        echo "Creating Nessie catalog source...";
        catalog_response=$$(curl -s -w "\n%{http_code}" -X POST "http://dremio:9047/api/v3/catalog" \
          -H "Content-Type: application/json" \
          -H "Authorization: _dremio$$TOKEN" \
          --data-raw "{
            \"entityType\": \"source\",
            \"type\": \"NESSIE\",
            \"name\": \"nessie\",
            \"config\": {
              \"nessieEndpoint\": \"http://nessie:19120/api/v2\",
              \"nessieAuthType\": \"NONE\",
              \"credentialType\": \"ACCESS_KEY\",
              \"awsAccessKey\": \"${S3_ACCESS_KEY_ID:-mnadmin}\",
              \"awsAccessSecret\": \"${S3_SECRET_ACCESS_KEY:-mnpassword}\",
              \"awsRootPath\": \"/${S3_BUCKET}\",
              \"secure\": false,
              \"propertyList\": [
                {\"name\": \"fs.s3a.path.style.access\", \"value\": \"true\"},
                {\"name\": \"fs.s3a.endpoint\", \"value\": \"minio:9000\"},
                {\"name\": \"dremio.s3.compat\", \"value\": \"true\"}
              ]
            },
            \"metadataPolicy\": {
              \"authTTLMs\": 86400000,
              \"namesRefreshMs\": 3600000,
              \"datasetRefreshAfterMs\": 3600000,
              \"datasetExpireAfterMs\": 10800000,
              \"datasetUpdateMode\": \"PREFETCH_QUERIED\",
              \"deleteUnavailableDatasets\": true,
              \"autoPromoteDatasets\": false
            }
          }");

        catalog_code=$$(echo "$$catalog_response" | tail -n1);
        catalog_body=$$(echo "$$catalog_response" | head -n-1);

        if [ "$$catalog_code" = "200" ]; then
          echo "✅ Nessie catalog created successfully!";
          echo "   Name: nessie";
          echo "   Endpoint: http://nessie:19120/api/v2";
          echo "   Storage: MinIO (s3://warehouse)";
        elif echo "$$catalog_body" | grep -q "already exists"; then
          echo "ℹ️  Nessie catalog already exists";
        else
          echo "⚠️  Failed to create Nessie catalog (HTTP $$catalog_code)";
          echo "Response: $$catalog_body";
        fi;

        echo "";
        echo "🎉 Dremio initialization complete!";
      '
    restart: "no"
    labels:
      - "com.${COMPOSE_PROJECT_NAME}.service=init"
      - "com.${COMPOSE_PROJECT_NAME}.description=Initialize Dremio admin user and Nessie catalog"


  # MinIO Storage Server
  minio:
    image: quay.io/minio/minio:RELEASE.2024-12-18T13-15-44Z
    container_name: minio
    environment:
      - MINIO_ROOT_USER=${S3_ACCESS_KEY_ID:-mnadmin}
      - MINIO_ROOT_PASSWORD=${S3_SECRET_ACCESS_KEY:-mnpassword}
      - MINIO_DOMAIN=storage
      - MINIO_REGION_NAME=${S3_REGION:-af-south-1}
      - MINIO_REGION=${S3_REGION:-af-south-1}
    ports:
      - "${MINIO_API_PORT:-9000}:9000"
      - "${MINIO_CONSOLE_PORT:-9001}:9001"
    volumes:
      - minio-data:/data
    command: ["server", "/data", "--console-address", ":9001"]
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s
    labels:
      - "com.${COMPOSE_PROJECT_NAME}.service=storage"
      - "com.${COMPOSE_PROJECT_NAME}.description=MinIO S3-compatible Object Storage"

  # Init container to create MinIO buckets
  minio-init:
    image: minio/mc:latest
    container_name: minio-init
    depends_on:
      minio:
        condition: service_healthy
    entrypoint: >
      sh -c '
        echo "Configuring MinIO client...";
        mc alias set ${MINIO_ALIAS} http://minio:9000 ${S3_ACCESS_KEY_ID:-mnadmin} ${S3_SECRET_ACCESS_KEY:-mnpassword};

        echo "Creating ${S3_BUCKET} bucket...";
        if mc ls ${MINIO_ALIAS}/${S3_BUCKET} >/dev/null 2>&1; then
          echo "ℹ️  Bucket ${S3_BUCKET} already exists";
        else
          mc mb ${MINIO_ALIAS}/${S3_BUCKET};
          echo "✅ Bucket ${S3_BUCKET} created successfully!";
        fi;

        echo "🎉 MinIO initialization complete!";
      '
    restart: "no"
    labels:
      - "com.${COMPOSE_PROJECT_NAME}.service=init"
      - "com.${COMPOSE_PROJECT_NAME}.description=Initialize MinIO buckets"


networks:
  default:
    name: ${COMPOSE_PROJECT_NAME}


volumes:
  nessie-data:
    name: nessie-data
    labels:
      - "com.${COMPOSE_PROJECT_NAME}.volume=nessie"
      - "com.${COMPOSE_PROJECT_NAME}.description=Nessie RocksDB catalog data"
    driver: local
    driver_opts:
      type: none
      o: bind
      device: ./data/nessie-data

  dremio-data:
    name: dremio-data
    labels:
      - "com.${COMPOSE_PROJECT_NAME}.volume=dremio"
      - "com.${COMPOSE_PROJECT_NAME}.description=Dremio metadata and configuration"
    driver: local
    driver_opts:
      type: none
      o: bind
      device: ./data/dremio-data

  minio-data:
    name: minio-data
    labels:
      - "com.${COMPOSE_PROJECT_NAME}.volume=minio"
      - "com.${COMPOSE_PROJECT_NAME}.description=MinIO S3 object storage"
    driver: local
    driver_opts:
      type: none
      o: bind
      device: ./data/minio

No. Add Minio as a “source” in Dremio.

Watch this tutorial → https://youtu.be/99Tr5_xsdbQ?si=TTIcXxYOHeJd5V2u&t=879 specifically from 14m 40s onward.

Will look. I’ll follow it to diagnose the problem and figure out what it is not, but in the end I need this stack to come up from nothing with a single docker-compose up command, as it’s part of a lab.
So I need dremio-init to do this.
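
For reference, a rough sketch of what that extra dremio-init step could look like as a standalone script, modelled on the Nessie call already in the compose file. These are assumptions rather than anything confirmed in this thread: the S3 source config field names (`accessKey`, `accessSecret`, `compatibilityMode`, `rootPath`) are taken from my reading of Dremio's source API and should be checked against the Dremio version in use; `DREMIO_S3_SOURCE_NAME` and both function names are made up for illustration; and `TOKEN` is expected to be set by the existing login step.

```sh
#!/bin/sh
# Hypothetical helper: prints the JSON body for an S3 source pointing at MinIO.
# Credentials fall back to the same defaults the compose file uses.
build_s3_source_payload() {
  cat <<EOF
{
  "entityType": "source",
  "type": "S3",
  "name": "${DREMIO_S3_SOURCE_NAME:-minio}",
  "config": {
    "credentialType": "ACCESS_KEY",
    "accessKey": "${S3_ACCESS_KEY_ID:-mnadmin}",
    "accessSecret": "${S3_SECRET_ACCESS_KEY:-mnpassword}",
    "secure": false,
    "compatibilityMode": true,
    "rootPath": "/",
    "propertyList": [
      {"name": "fs.s3a.path.style.access", "value": "true"},
      {"name": "fs.s3a.endpoint", "value": "minio:9000"},
      {"name": "dremio.s3.compat", "value": "true"}
    ]
  }
}
EOF
}

# Hypothetical helper: posts the payload to the same catalog endpoint the
# Nessie step uses. Prints the response body followed by the HTTP status code.
create_s3_source() {
  build_s3_source_payload | curl -s -w "\n%{http_code}" -X POST \
    "http://dremio:9047/api/v3/catalog" \
    -H "Content-Type: application/json" \
    -H "Authorization: _dremio$TOKEN" \
    --data-binary @-
}
```

Calling `create_s3_source` after the login step would attempt the registration. If this is pasted into the compose `command:` block instead of being mounted as a script, every shell `$` has to become `$$`, and the heredoc will likely need replacing with an escaped `--data-raw` string like the one the Nessie step uses.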

G

So I wonder if the error here points at the same problem that the automated deploy is hitting.

G

More info.

I’ve made some progress, I think.
When I define an S3 source, it now returns the buckets in that source, but it fails as per the screenshot the second I click on any bucket.

Is anyone able to advise what’s causing the error?
G