Configure multiple subdirectories as a dataset

I have a root folder with multiple subfolders, all containing CSVs with the same data structure. How do I configure the root folder so that all of the CSVs in each subfolder are combined into one dataset? I don’t want to have to configure each subfolder individually.

“testdata2” in the screenshot is a folder with multiple levels of subfolders that contain CSV files.
Click the icon circled in red to create a dataset from your root folder.
You will be presented with a screen to select the data format. You can select “Text (delimited)” there.
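
Since the root folder only works as a single dataset when every file has the same structure, it can be worth checking that before creating the dataset. Below is a minimal sketch in plain Python (not a Dremio feature) that groups the CSV files under a root folder by their header row. The function name `group_csv_by_header`, the `.csv` extension, and the UTF-8 encoding are assumptions made for illustration.

```python
import csv
from collections import defaultdict
from pathlib import Path


def group_csv_by_header(root):
    """Map each distinct header row to the CSV files under root that use it."""
    groups = defaultdict(list)
    # rglob walks every level of subfolders below root.
    for path in sorted(Path(root).rglob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            # Read only the first row; an empty file yields an empty header.
            header = tuple(next(csv.reader(handle), []))
        groups[header].append(path)
    return dict(groups)
```

Calling `group_csv_by_header("testdata2")` and finding more than one key in the result would suggest that some subfolders do not share the same columns.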

Hi, I’m trying to follow the above instructions to apply a format to a very large number of subfolders. However, after I select the format (Parquet), I get the error ‘cannot connect to Dremio server’. When I drill down into the lowest-level folder, I can open the Parquet file OK. Any suggestions, please?

You could be hitting the maximum number of splits: 60K for file-system sources, or 300K in recent Dremio versions (see the Limits page in the Dremio documentation).
What does the master’s log say?

Best, Tim

Cheers - when I look in /var/log/dremio I only see server.out, so I’m not sure where else to find the logs. (Sorry, I only started using Dremio yesterday!)

We’re probably hitting the max limits - vast number of sub directories… which can be fixed at source…
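
A rough way to confirm this before restructuring anything is to count the directories and files under the root folder and compare the file count with the limits quoted above. This is only a sketch: it assumes the data is reachable as a local or mounted path, and it assumes every file contributes at least one split, so the file count is at best a lower bound on the real number of splits. The names `count_tree` and `exceeds_limit` are made up for this example.

```python
import os

# Limits as quoted in this thread; check the limits page for your Dremio version.
SPLIT_LIMIT_FILE_SYSTEM = 60_000
SPLIT_LIMIT_RECENT = 300_000


def count_tree(root, suffix=".parquet"):
    """Return (directory_count, matching_file_count) for everything under root."""
    dirs = files = 0
    for _current, dirnames, filenames in os.walk(root):
        dirs += len(dirnames)
        files += sum(1 for name in filenames if name.endswith(suffix))
    return dirs, files


def exceeds_limit(file_count, limit=SPLIT_LIMIT_FILE_SYSTEM):
    """True when the file count alone is already above the given split limit."""
    return file_count > limit
```

If `exceeds_limit(count_tree(root)[1])` is true, the dataset cannot fit under the lower limit, whatever else is going on. If it is false, the limit may still be the cause, because one file can produce more than one split.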


You can see the error in 3 places

  • server.log
  • query profile, Error tab: open the UI, click Jobs, click the failed job, open the profile on the right, then select Error
  • queries.json (in newer versions, look for column outcomereason)
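
For the queries.json route, a small script can pull out the failed entries rather than reading the file by eye. The sketch below assumes the file holds one JSON object per line, and that the relevant keys are spelled `outcome`, `outcomeReason`, and `queryId`, with `FAILED` as the failure value. Those names are assumptions and may differ in your version, so the two main keys are parameters. `failed_queries` is a hypothetical helper, not part of Dremio.

```python
import json


def failed_queries(path, outcome_key="outcome", reason_key="outcomeReason"):
    """Yield (query id, reason) for each failed entry in a queries.json file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip partially written or non-JSON lines
            if not isinstance(record, dict):
                continue  # skip JSON values that are not objects
            # "FAILED" and "queryId" are assumed names; adjust to your log.
            if record.get(outcome_key) == "FAILED":
                yield record.get("queryId"), record.get(reason_key)
```

Looping over `failed_queries(path_to_queries_json)` and printing each pair gives a quick list of failures to match against the job shown in the UI.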