Distributed Storage Questions

What are the following folders used for:
accelerator - location of reflection files?
downloads - ?
results - ?
scratch - ?
uploads - location of uploaded files?

How can I tell which reflection file in the accelerators folder belongs to which VDS? Is there a table or file where I can reference this information? If I wanted to open a specific reflection parquet file in say, python, how would I know which file to open?

Hi @summersmd,

  1. accelerator/ - this stores the reflection materialization files. You’ll see a directory for each reflection id, with subdirectories for each materialization id. Within those are the parquet files.
  2. downloads/ - if you click the download button when viewing a dataset, the query is run and the results, in the chosen format (CSV, JSON or Parquet), are written to this directory.
  3. results/ - the results of executing queries, in Arrow buffer format, are written here. You should see directories associated with each job id. These eventually expire and get cleaned up.
  4. scratch/ - the out-of-the-box directory we provide so you can run CREATE TABLE $scratch.<your new table> AS ... queries to… create tables. This is where the materialized results are stored as parquet files. See https://docs.dremio.com/sql-reference/sql-commands/tables.html.
  5. uploads/ - where user uploaded files go.

How can I tell which reflection file in the accelerators folder belongs to which VDS? Is there a table or file where I can reference this information?

sys.reflections will give you a record for each reflection ID, including a field for the dataset to which it corresponds.

If I wanted to open a specific reflection parquet file in say, python, how would I know which file to open?

As mentioned above, accelerator/ contains directories with the following structure:

<reflection id>/<materialization id>/<partition1>/<partition2>..../<file>.parquet
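If you already have a reflection id and a materialization id, that layout can be walked with a few lines of Python. This is only a sketch: the function name `find_reflection_files` is made up for illustration, and it assumes your distributed storage is reachable as a local filesystem path (for S3 or HDFS you would need the matching client instead) and that the data files end in `.parquet`.

```python
from pathlib import Path


def find_reflection_files(accelerator_root, reflection_id, materialization_id):
    """List parquet files under <root>/<reflection id>/<materialization id>/.

    Hypothetical helper: assumes accelerator/ is reachable as a local path
    and that the data files end in .parquet.
    """
    base = Path(accelerator_root) / reflection_id / materialization_id
    if not base.is_dir():
        return []
    # rglob descends through any partition subdirectories.
    return sorted(p for p in base.rglob("*.parquet") if p.is_file())
```

Each returned path can then be opened with something like `pandas.read_parquet(path)` or `pyarrow.parquet.read_table(path)`.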

Join sys.reflections with sys.materializations tables to locate the appropriate directories in accelerator/ where your reflection parquet files will be found.
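A rough sketch of that join, kept as a Python string so it can be pasted into the Dremio SQL editor or sent through whatever client you use. The column names `reflection_id` and `materialization_id` are assumptions on my part, so check them against what `SELECT * FROM sys.reflections` and `SELECT * FROM sys.materializations` return in your version. The helper `accelerator_subdir` is hypothetical and just applies the directory structure shown above.

```python
# Assumed column names; verify them against your Dremio version.
REFLECTION_DIRS_SQL = """
SELECT r.*, m.materialization_id
FROM sys.reflections AS r
JOIN sys.materializations AS m
  ON r.reflection_id = m.reflection_id
"""


def accelerator_subdir(reflection_id, materialization_id):
    """Relative directory under accelerator/ for one materialization."""
    return f"{reflection_id}/{materialization_id}"
```

Each row of the result gives one reflection id and materialization id pair, which `accelerator_subdir` turns into the directory to look in.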

You can also query a materialization directly in Dremio by selecting from a path like this:

"__accelerator"."<reflection id>"."<materialization id>"

… and download the results (though this will be limited to 1000 records)
