I have a directory with many subdirectories, each containing a parquet file, and I’m trying to set this up as a single dataset, as per the Dremio instructions.
Unfortunately, each subdirectory also contains the same information as a JSON file.
When configuring the dataset, I tell Dremio that the format is “parquet”, and it (understandably) rejects the dataset because it finds files it can’t read. Similarly, if I configure the dataset as containing JSON files, it complains that there are invalid (parquet) files.
Is there any way to configure Dremio to read files from multiple subdirectories but only read files with a specific extension?
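To make the behaviour I'm after concrete, here is a minimal sketch in plain Python of the filtering I'd like to happen. This is not Dremio's API; the function name `files_with_extension` and the example root path are hypothetical and only illustrate the layout and the selection rule (walk every subdirectory, keep only files with one extension).

```python
from pathlib import Path


def files_with_extension(root: str, extension: str) -> list[Path]:
    """Recursively collect files under `root` whose suffix matches `extension`.

    `extension` includes the leading dot, e.g. ".parquet".
    The comparison is case-insensitive.
    """
    wanted = extension.lower()
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() == wanted
    )


if __name__ == "__main__":
    # Hypothetical root directory; each subdirectory holds one .parquet
    # file and one .json file with the same content.
    for path in files_with_extension("/data/my_dataset", ".parquet"):
        print(path)
```

In other words, I'd like the dataset to be built from the files this kind of filter returns, with the JSON copies ignored.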
Note: version 24.1 hosted on Azure.