Dremio master fails to start on Kubernetes

Hello everyone.

I am trying to restart a Dremio deployment on Kubernetes (only 1 master and 1 executor).

This deployment was running fine until someone deleted the master pod. Kubernetes recreated the deleted pod, as expected, but the new pod fails during startup while trying to reach some S3 folders (currently hosted on MinIO in another cluster).

It complains about “java.io.FileNotFoundException: Path not found: /dremio-homo/accelerator”, even though the bucket is fully accessible and the folder exists and already holds content written by the previously working deployment.

Here are the logs of the master pod and a trace from the MinIO instance captured during the master startup:
logs.zip (175.8 KB)

We are currently running Dremio 25.0.0. Here is the configuration for the S3 storage:

aws:
    bucketName: "dremio-homo"
    path: "/"
    authentication: "accessKeySecret"
    credentials:
      accessKey: "dremio"
      secret: "mysecretkey"
    extraProperties: |
      <property>
          <name>fs.dremioS3.impl</name>
          <description>The FileSystem implementation. Must be set to com.dremio.plugins.s3.store.S3FileSystem</description>
          <value>com.dremio.plugins.s3.store.S3FileSystem</value>
      </property>
      <property>
          <name>fs.s3a.access.key</name>
          <description>Minio server access key ID.</description>
          <value>dremio</value>
      </property>
      <property>
          <name>fs.s3a.secret.key</name>
          <description>Minio server secret key.</description>
          <value>mysecretkey</value>
      </property>
      <property>
          <name>fs.s3a.aws.credentials.provider</name>
          <description>The credential provider type.</description>
          <value>org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider</value>
      </property>
      <property>
          <name>fs.s3a.endpoint</name>
          <description>Endpoint can either be an IP or a hostname, where Minio server is running . However the endpoint value cannot contain the http(s) prefix. E.g. 175.1.2.3:9000 is a valid endpoint. </description>
          <value>mys3endpoint:30000</value>
      </property>
      <property>
          <name>fs.s3a.path.style.access</name>
          <description>Value has to be set to true.</description>
          <value>true</value>
      </property>
      <property>
          <name>dremio.s3.compat</name>
          <description>Value has to be set to true.</description>
          <value>true</value>
      </property>
      <property>
          <name>fs.s3a.connection.ssl.enabled</name>
          <description>Value can either be true or false, set to true to use SSL with a secure Minio server.</description>
          <value>false</value>
      </property>

Posting the solution in case it helps someone:

It turns out the problem was the fs.s3a.endpoint parameter. We had been using a DNS name there without trouble so far, but this time, for some reason, it did not work. Changing it to the IP address solved the problem.
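For anyone who wants to check this before editing the config, here is a minimal sketch of a DNS and TCP check for the endpoint value. It is not part of Dremio; the function names are my own, and it assumes you can run Python 3.9+ from somewhere with the same DNS setup as the master pod (for example a debug pod in the same namespace). It only tells you whether the name resolves and the port accepts connections, not whether Dremio itself will be happy with it.

```python
import socket


def split_endpoint(endpoint: str) -> tuple[str, int]:
    """Split a 'host:port' string (the fs.s3a.endpoint format) into its parts."""
    host, sep, port = endpoint.rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected host:port, got {endpoint!r}")
    return host, int(port)


def resolve(endpoint: str) -> list[str]:
    """Return the distinct addresses the endpoint host resolves to.

    Raises socket.gaierror if the name cannot be resolved.
    """
    host, port = split_endpoint(endpoint)
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def can_connect(endpoint: str, timeout: float = 3.0) -> bool:
    """Try a plain TCP connection to the endpoint; True if it is accepted."""
    host, port = split_endpoint(endpoint)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False
```

Calling resolve("mys3endpoint:30000") with your own endpoint value shows which addresses the name maps to (or raises if the lookup fails), and can_connect("mys3endpoint:30000") shows whether the port is reachable. If the name fails here but the IP works, that matches what we saw.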

This is also a known issue in older versions, but it should have been fixed in 22.1, as can be seen in the thread "Dremio fails to start after upgrading to 22.1".