Avoiding results storage under dremio location

How can I avoid storing query results under the $DREMIO_HOME location when querying in Dremio?

Hi,

If I understand you correctly, you want to change the location where we store the job results cache?

Go to your Dremio installation dir, open conf/dremio.conf and add the following to the paths section:

  results: /location/of/resultscache

For future reference, our configuration options are documented here - you can configure the individual paths of everything we store.

You can make adjustments to dremio.conf to point to a different set of directories:

  paths: {
    # the local path for dremio to store data.
    local: "/data/dremio",  # <---- directory you want to use to store internal data

    # the distributed path Dremio data including job results, downloads, uploads, etc
    dist: "pdfs://data/dremio/pdfs"  # <---- this is the directory you are interested in changing

    # You don't need to specify the directories below if you keep them co-located

    # storage area for the accelerator cache.
    accelerator: ${paths.dist}/accelerator

    # staging area for json and csv ui downloads
    downloads: ${paths.dist}/downloads

    # stores uploaded data associated with user home directories
    uploads: ${paths.dist}/uploads

    # stores data associated with the job results cache.
    results: ${paths.dist}/results
  }
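
To make the co-location rule concrete, here is a minimal Python sketch of how the four sub-paths follow from paths.dist unless you override one of them. It only models the substitution shown in the config above; it is not Dremio's own code, and the function name resolve_paths is made up for illustration.

```python
# Illustrative model of the ${paths.dist}/<name> defaults shown above.
# resolve_paths is a hypothetical helper, not part of Dremio.
SUBDIRS = ("accelerator", "downloads", "uploads", "results")


def resolve_paths(dist, overrides=None):
    """Return each sub-path, defaulting to <dist>/<name> unless overridden."""
    resolved = {name: f"{dist}/{name}" for name in SUBDIRS}
    resolved.update(overrides or {})
    return resolved


# Changing only `results` leaves the other three under paths.dist.
paths = resolve_paths(
    "pdfs://data/dremio/pdfs",
    overrides={"results": "/location/of/resultscache"},
)
for name, location in paths.items():
    print(f"{name}: {location}")
```

So if the results cache is the only thing you want to move, overriding results alone should be enough; changing dist moves everything that is co-located with it.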

Hi,

How can I keep results from being stored in the results cache once there are more than 10 cached results? I am dealing with larger data, and every query I execute consumes more space.

Is there any way to avoid the results cache? Or do I need to allocate a separate location that has enough free space?

Hi,

Sounds like you want to reduce the number of job results Dremio keeps around. By default, Dremio cleans up job results that are older than 30 days. We have a system option called results.max.age_in_days which controls the cleanup behavior.

To set a system option, as an Administrator click on Admin in the top right. Then click on Advanced Settings at the bottom of the left-hand sidebar. You should see an input field at the bottom of the page.

Type results.max.age_in_days into the input field and press Show. After you edit the value, a Save button will appear.

Once you press Save, you will need to restart Dremio for this specific setting to take effect.

I tried this but still see old data under the results folder. Can I simply delete the directories under data/pdfs/results?

Can you confirm that you restarted Dremio after changing the setting? If you did, that could be an issue in Dremio.

You can delete the results; they are mainly used when viewing datasets in the UI. Dremio will recreate the data if needed.
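
If you end up cleaning up by hand, a small script can make that less error-prone. The sketch below assumes that each job's output is its own directory directly under the results folder and that the directory's modification time is a reasonable stand-in for the job's age; both are assumptions on my part, and the function names are made up. It defaults to a dry run, so nothing is removed until you ask for it.

```python
import shutil
import time
from pathlib import Path


def find_stale_result_dirs(results_dir, max_age_days, now=None):
    """Return sub-directories whose mtime is older than max_age_days."""
    root = Path(results_dir)
    if not root.is_dir():
        return []
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    return sorted(
        p for p in root.iterdir()
        if p.is_dir() and p.stat().st_mtime < cutoff
    )


def purge_stale_results(results_dir, max_age_days, dry_run=True):
    """Report stale directories; delete them only when dry_run is False."""
    stale = find_stale_result_dirs(results_dir, max_age_days)
    for path in stale:
        if dry_run:
            print(f"would delete {path}")
        else:
            shutil.rmtree(path)
            print(f"deleted {path}")
    return stale
```

For example, purge_stale_results("data/pdfs/results", max_age_days=7) lists what would go, and adding dry_run=False actually removes it. Point it at whatever directory your paths.results setting resolves to on disk.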

Thanks Doron. I did restart several times. Let me know if there are any informative log messages (in server.out) that I should be looking out for to indicate that “purging” is taking place.

I will take your advice and just delete them manually for now.

Let me know if there are any informative log messages (in server.out) that I should be looking out for to indicate that “purging” is taking place.

Please check log/server.log. If the directory is cleaned up, there will be log entries of the following pattern:

INFO c.d.service.jobs.JobResultsStore - Deleted job output directory - /tmp/dremio/data/pdfs/results/266d658d-3a6f-9aa5-fb41-4c3c19996f00

If there is an error while deleting, there will be log entries like this:

WARN c.d.service.jobs.JobResultsStore - Could not delete job output directory : /tmp/dremio/data/pdfs/results/266d65a7-1115-1380-63f6-50fd617ebb00
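
To check a large log for these entries, something like the following Python sketch may help. The two patterns are taken directly from the sample lines above; the function name summarize_purge is made up, and the script assumes nothing about the rest of the log line (timestamps, thread names) beyond the message text.

```python
import re

# Message fragments copied from the two sample log entries above.
DELETED = re.compile(
    r"JobResultsStore - Deleted job output directory - (\S+)"
)
FAILED = re.compile(
    r"JobResultsStore - Could not delete job output directory : (\S+)"
)


def summarize_purge(lines):
    """Split log lines into deleted and failed job output directories."""
    deleted, failed = [], []
    for line in lines:
        match = DELETED.search(line)
        if match:
            deleted.append(match.group(1))
            continue
        match = FAILED.search(line)
        if match:
            failed.append(match.group(1))
    return deleted, failed


def summarize_log_file(path):
    """Convenience wrapper that reads a log file from disk."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return summarize_purge(handle)
```

Calling summarize_log_file("log/server.log") returns two lists of paths. If both are empty after a restart, the cleanup has probably not run at all, which is different from it running and failing.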