`UnknownHostException` when accessing S3 source, even with "Path Style Access" enabled

Hi all,

I’ve recently encountered an issue when trying to connect to an S3-compatible storage source in Dremio. I’m using the “Advanced Options” and have enabled fs.s3a.path.style.access, which previously worked just fine.

However, now I’m getting this error:

The error appears when I query the lakehouse catalog. Dremio automatically prepends the bucket name to the endpoint hostname, and the resulting name does not resolve:

UnknownHostException: <bucket_lakehouse_name>.example-endpoint.internal

When I query the object storage source, I get the same error. Dremio again prepends a bucket name to the endpoint, but this time it is the bucket used for distributed storage:

UnknownHostException: <bucket_distributed_name>.example-endpoint.internal
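
For readers less familiar with the two S3 addressing styles, here is a minimal Python sketch of how a client might build an object URL in each. It is illustrative only and is not Dremio's or the AWS SDK's actual code; the function name `object_url`, the bucket name `lakehouse`, and the object key are assumptions made up for the example.

```python
def object_url(endpoint: str, bucket: str, key: str, path_style: bool,
               scheme: str = "https") -> str:
    """Build an object URL in path style or virtual-hosted style.

    Hypothetical helper for illustration; not taken from any library.
    """
    if path_style:
        # Path style: the bucket is the first path segment, so only the
        # endpoint hostname itself has to resolve in DNS.
        return f"{scheme}://{endpoint}/{bucket}/{key}"
    # Virtual-hosted style: the bucket becomes a hostname label, so DNS
    # must know a record for "<bucket>.<endpoint host>".
    return f"{scheme}://{bucket}.{endpoint}/{key}"


endpoint = "s3-lakehouse-lb.example-endpoint.internal:9000"
print(object_url(endpoint, "lakehouse", "tbl/data.parquet", path_style=True))
print(object_url(endpoint, "lakehouse", "tbl/data.parquet", path_style=False))
```

With path-style access only the endpoint hostname needs a DNS record, whereas a bucket-prefixed hostname like the ones in the errors above suggests the client fell back to virtual-hosted addressing. The hostnames in the errors do not contain the `s3-lakehouse-lb` label, and the sketch does not try to reproduce that detail.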

The error started after I replaced the SSL certificates on MinIO and Dremio with ones issued by a valid CA (a wildcard certificate for *.example-endpoint.internal). Before that, both used self-signed certificates and everything worked fine.

Here is the configuration:

  • Lakehouse Catalog
Nessie Endpoint URL: https://catalog-lakehouse.example-endpoint.internal:19120/api/v2
AWS Root Path: lakehouse
Other
- fs.s3a.endpoint: s3-lakehouse-lb.example-endpoint.internal:9000
- dremio.s3.compat: true
- fs.s3a.path.style.access: true
  • Object Storage

Same settings as the Lakehouse Catalog source, but without the Nessie endpoint.

  • Dremio Configuration
  1. dremio.conf
paths: {
  # the local path for dremio to store data.
  local: "/data"

  # the distributed path Dremio data including job results, downloads, uploads, etc
  #dist: "pdfs://"${paths.local}"/pdfs"
  dist: "dremioS3:///dremio"
}

services: {
  web-admin: {
    host: "0.0.0.0",
    port: 9191
  },

  coordinator {
    enabled: true,
    master.enabled: true,
    web.ssl {
      enabled: true,
      auto-certificate.enabled: false,
      keyStore: "/opt/lakehouse/certs/dremio-keystore.jks",
      keyStorePassword: "XXX",
      trustStore: "/opt/lakehouse/certs/dremio-truststore.jks",
      trustStorePassword: "YYY"
    }
  },

  executor {
    enabled: true,
    cache: {
      enabled: true,
      path: {
        # db: ${paths.local},
        # fs: [${services.executor.cache.path.db}]
        # db: "/mnt/dremio/cachemanagerdisk",
        # fs: [
        #   "/mnt/dremio/cachemanagerdisk"
        # ]
        db: "/mnt/dremio/cachemanagerdisk/db",
        fs: [
          "/mnt/dremio/cachemanagerdisk/dir1"
        ]
      },
      # pctquota: {
      #   db: 70,
      #   fs: [${services.executor.cache.pctquota.db}]
      # },
      # ensurefreespace: {
      #   fs: [10]
      # }
    }
  },

  flight {
    use_session_service: true
  }
}

registration.publish-host: "lakehouse.example-endpoint.internal"
  2. core-site.xml
<?xml version="1.0"?>
<configuration>
<property>
    <name>fs.dremioS3.impl</name>
    <description>The FileSystem implementation. Must be set to com.dremio.plugins.s3.store.S3FileSystem</description>
    <value>com.dremio.plugins.s3.store.S3FileSystem</value>
</property>
<property>
    <name>fs.s3a.access.key</name>
    <description>Minio server access key ID.</description>
    <value>XXX</value>
</property>
<property>
    <name>fs.s3a.secret.key</name>
    <description>Minio server secret key.</description>
    <value>YYY</value>
</property>
<property>
    <name>fs.s3a.aws.credentials.provider</name>
    <description>The credential provider type.</description>
    <value>org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider</value>
</property>
<property>
    <name>fs.s3a.endpoint</name>
    <description>Endpoint can either be an IP or a hostname, where Minio server is running. The endpoint value cannot contain the `http(s)://` prefix nor can it start with the string `s3`. For example, if the endpoint is `http://175.1.2.3:9000`, the value is `175.1.2.3:9000`.</description>
    <value>s3-lakehouse-lb.example-endpoint.internal:9000</value>
</property>
<property>
    <name>fs.s3a.path.style.access</name>
    <description>Value has to be set to true.</description>
    <value>true</value>
</property>
<property>
    <name>dremio.s3.compat</name>
    <description>Value has to be set to true.</description>
    <value>true</value>
</property>
<property>
    <name>fs.s3a.connection.ssl.enabled</name>
    <description>Value can either be true or false, set to true to use SSL with a secure Minio server.</description>
    <value>true</value>
</property>
</configuration>

I’ve double-checked the endpoint URL, credentials, and also ensured there are no DNS issues from the server side (I can resolve the domain via ping/curl).
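
Because ping/curl against the endpoint name succeeds, it can be more telling to test the exact hostname reported in the exception. The following is a small sketch, assuming Python 3 is available on the Dremio host; `failing_host` and `resolves` are hypothetical helper names, and the regular expression only assumes the `UnknownHostException: <host>` shape shown in the errors above.

```python
import re
import socket
from typing import Optional


def failing_host(message: str) -> Optional[str]:
    """Extract the hostname from an 'UnknownHostException: <host>' line."""
    match = re.search(r"UnknownHostException:\s*(\S+)", message)
    return match.group(1) if match else None


def resolves(hostname: str) -> bool:
    """Return True if the system resolver finds an address for hostname."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except (socket.gaierror, UnicodeError):
        # gaierror: no such name; UnicodeError: malformed hostname label.
        return False
```

Calling `resolves(failing_host(line))` on the coordinator or executor machine, where `line` is the error line copied from the Dremio log, shows whether the bucket-prefixed name is known to the system resolver on that machine, as opposed to the plain endpoint name that ping/curl already confirmed.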

Has anyone else experienced this or have suggestions on what to check next?

Thanks in advance!

Dear @armandwipangestu,

When configuring S3-compatible storage, please note that, due to limitations of the AWS SDK, Dremio does not support endpoints (fs.s3a.endpoint) that begin with the string s3. Would you be able to adjust the endpoint accordingly and try again?
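
As a quick illustration of the two constraints stated in the `fs.s3a.endpoint` description (no `http(s)://` prefix, and the value must not start with the string `s3`), here is a minimal Python sketch of a pre-flight check. The function name `endpoint_problems` is an assumption for this example and is not part of Dremio or Hadoop; it only encodes the two rules quoted in the thread.

```python
def endpoint_problems(value: str) -> list[str]:
    """List rule violations for a candidate fs.s3a.endpoint value.

    Hypothetical helper: checks only the two rules quoted in the
    core-site.xml description, nothing else.
    """
    problems = []
    candidate = value.strip().lower()
    if candidate.startswith(("http://", "https://")):
        problems.append("must not include the http(s):// prefix")
        # Keep checking the part after the scheme for the second rule.
        candidate = candidate.split("://", 1)[1]
    if candidate.startswith("s3"):
        problems.append("must not start with the string 's3'")
    return problems


print(endpoint_problems("s3-lakehouse-lb.example-endpoint.internal:9000"))
print(endpoint_problems("175.1.2.3:9000"))
```

The first call flags the endpoint from the configuration above because of its `s3` prefix, while the `175.1.2.3:9000` example from the property description passes both checks.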

For reference, please note the following links:

Best Regards,
Francisco


I had not read the description in core-site.xml carefully. I first tested a new DNS name through /etc/hosts before changing the DNS server, then pointed the endpoint at the new name, obj-lakehouse.example-endpoint.internal. Queries work now.

Thank you,
Arman