Using S3 as a Distributed Storage for Reflections

I have configured my dremio.conf file so that my distributed storage goes to an S3 bucket, as follows:

paths: {
  # the local path for dremio to store data.
  local: "/var/lib/dremio"

  # the distributed path Dremio data including job results, downloads, uploads, etc
  dist: "s3://datasprints-dremio-test-dist-storage/"

And configured the core-site.xml file with the access key and secret key of the bucket owner, as follows:

    <description>AWS access key ID.</description>
    <value>I HAVE PUT MY ACCESS KEY HERE</value>
    <description>AWS secret key.</description>

Despite that, Dremio stores the reflections locally, in a folder called pdfs (which is the default storage directory).
Here you can see the terminal output:

[ec2-user@ip-**-*-***-*** dremio]$ sudo updatedb
[ec2-user@ip-**-*-***-*** dremio]$ sudo locate dremio | grep pdfs

I have already restarted the machine after changing the config files.
So, what exactly am I doing wrong here?
Why is it storing locally instead of storing on the S3 bucket specified?

edit: this is the job profile of one of those reflection queries that was stored locally (11.7 KB)

P.S.: Just to give a definitive answer to this thread, the path syntax that I tested and that worked is:

dremioS3:///path/to/folder (notice the three slashes - ///, and the capital S in S3 - S)
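
On the three slashes: under generic URI syntax, whatever sits between `//` and the next `/` is the authority, so a third slash right after `//` means the authority is empty and everything else is the path. The sketch below is plain Python, not Dremio code, and `describe_dist_path` is a hypothetical helper named here for illustration only. It shows how the three dist values mentioned in this thread split under those rules. How Dremio itself interprets each part is an assumption and is not confirmed by anything in this thread.

```python
from urllib.parse import urlsplit


def describe_dist_path(dist: str) -> dict:
    """Split a dist URI into scheme, authority and path using generic URI rules."""
    parts = urlsplit(dist)
    # urlsplit lower-cases the scheme, so take it from the raw string
    # to keep the capital S in "dremioS3" visible.
    scheme = dist.split(":", 1)[0]
    return {"scheme": scheme, "authority": parts.netloc, "path": parts.path}


# The three dist values that appear in this thread.
for dist in (
    "s3://datasprints-dremio-test-dist-storage/",
    "dremioS3:///path/to/folder",
    "s3a://xxx/yyy",
):
    print(describe_dist_path(dist))
```

With `s3://` and `s3a://` the bucket name lands in the authority, while with `dremioS3:///` the authority is empty and the whole location is carried in the path.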


Hi, is there any explanation for the three slashes at the start? I thought it was a typo in the documentation.

Did you ever solve this? We are stuck in the same spot as you. Everything is set up just as documented and we are currently getting:

Caused by: Unable to find bucket named xxx-xxx-xxx-xxx.
at com.dremio.plugins.util.ContainerFileSystem.getFileSystemForPath( ~[dremio-s3-plugin-3.0.8-201812270118560286-801500d.jar:3.0.8-201812270118560286-801500d]

In my dremio.conf, I used

dist: s3a://xxx/yyy

where xxx is our S3 bucket name and yyy is the path to where our Dremio folders (accelerator, etc.) are stored.

It works as expected.

Which version of Dremio are you using? (You can go to Help -> About Dremio in the UI.)

In dremio.conf, what does your dist path have for a value?

Is your S3 policy as described in our docs?