Store data in HDFS

Hello,
How can I store Dremio's data in HDFS? I changed the path in dremio.conf, but it is still not working.

Regards,
Aym

Hi @aym

To store Dremio's data (reflections, uploads, downloads, job results, etc.) on HDFS, you have to configure the dist parameter in the paths section of dremio.conf; see the documentation below:

dremio.conf to configure dist

Either you can just set dist, which will create sub-folders for each type (accelerator, uploads, etc.) like below:

# the distributed path for Dremio data, including job results, downloads, uploads, etc.
  dist: "pdfs://"${paths.local}"/pdfs"

or you can define each of them separately, like below (a full paths block sketch follows this snippet):

# storage area for the accelerator cache.
  accelerator: ${paths.dist}/accelerator

  # staging area for json and csv ui downloads
  downloads: ${paths.dist}/downloads

  # stores uploaded data associated with user home directories
  uploads: ${paths.dist}/uploads

  # stores data associated with the job results cache.
  results: ${paths.dist}/results

  # shared scratch space for creation of tables.
  scratch: ${paths.dist}/scratch
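
Putting it together, the paths block in dremio.conf would look like the sketch below. This is a minimal example assuming the default local path /var/lib/dremio; adjust both paths to your installation.

  paths: {
    # local metadata storage on the node's own disk
    local: "/var/lib/dremio"

    # distributed storage; sub-folders are created per type
    dist: "pdfs://"${paths.local}"/pdfs"

    # optional: override individual locations
    accelerator: ${paths.dist}/accelerator
    downloads: ${paths.dist}/downloads
    uploads: ${paths.dist}/uploads
    results: ${paths.dist}/results
    scratch: ${paths.dist}/scratch
  }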

If you still have issues, kindly send us your dremio.conf.

Note: These changes require a restart of the coordinators and executors.
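
For example, on a tarball install you can use the bundled service script (assuming Dremio is installed under /opt/dremio; that path is an assumption, so adjust it to your setup):

  # restart the Dremio daemon on each node after editing dremio.conf
  /opt/dremio/bin/dremio restart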

Thanks
@balaji.ramaswamy

Hello balaji,

I still have the problem and can't reach Dremio at http://localhost:9047 once I set the path in my dremio.conf as mentioned. Please find my dremio.conf attached.

Thanks,

Aym

dremio.conf.zip (1.9 KB)

Hi @aym

Sorry, my bad. For dist, below is the correct example:

Configuring distributed storage on HDFS

So replace

dist: "pdfs://"loclhost:9000/usr/local/dremio-store"/pdfs"

with

dist: "hdfs://localhost:9000/path"}

The path in this case will be a directory on HDFS.
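
Also, before restarting, make sure the target directory exists on HDFS and is writable by the user the Dremio process runs as. A minimal sketch, assuming the NameNode is at localhost:9000 and the service user is named dremio (both are assumptions; adjust to your setup):

  # create the distributed storage directory on HDFS
  hdfs dfs -mkdir -p hdfs://localhost:9000/path

  # give the Dremio service user ownership so it can write
  hdfs dfs -chown -R dremio:dremio hdfs://localhost:9000/path

  # verify the directory and its owner
  hdfs dfs -ls hdfs://localhost:9000/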

Thanks
@balaji.ramaswamy

Hi balaji,
It still didn't work; I had tried that before and I tried it again just now, with the same result. I am running Hadoop in single-node mode on VMware, and Dremio is in the same VM. I created a directory in HDFS that is supposed to be the location where I store the data, results, downloads, and the rest from Dremio.

Thanks,
Aym