We are running Dremio 4.8.1 AWS Edition and we get the following error when trying to run a query:
“Number of splits in the query (360456) exceeds the query split limit of 300000”
This dataset is built on several VDSs and is the only one that shows this error.
Could you please help me fix this issue?
We currently support at most 300,000 splits per query for Hive and Glue. The error means that, after pruning, the total number of splits across all the datasets used in the query is greater than 300,000. A split corresponds to one DFS block, or to the whole file if the file is smaller than the block size. Do you have too many small files? Can you generate bigger files from the source ETL?
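To get a rough feel for how file sizes drive the split count, here is a minimal Python sketch of the rule described above (one split per block, or one per file when the file is smaller than a block). It is not Dremio's actual planner logic. The 128 MiB block size, the function name `estimate_splits`, and the file layouts are assumptions for illustration only, so substitute the real sizes from your source.

```python
SPLIT_LIMIT = 300_000           # limit quoted in the error message
MIB = 1024 * 1024
BLOCK_SIZE = 128 * MIB          # assumed block size; check the real value for your source


def estimate_splits(file_sizes, block_size=BLOCK_SIZE):
    """Rough estimate: one split per block, or one per file smaller than a block.

    Zero-byte files are counted as contributing no splits (an assumption).
    """
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return sum((size + block_size - 1) // block_size for size in file_sizes if size > 0)


# Hypothetical layout: 360,456 files of 1 MiB each (many small files).
small_files = [1 * MIB] * 360_456

# The same volume of data rewritten by the ETL into 128 MiB files
# (2,816 full files plus one 8 MiB remainder).
merged_files = [128 * MIB] * 2_816 + [8 * MIB]

for name, sizes in (("small files", small_files), ("merged files", merged_files)):
    splits = estimate_splits(sizes)
    status = "over" if splits > SPLIT_LIMIT else "within"
    print(f"{name}: {splits} splits ({status} the limit)")
```

In this made-up example the same amount of data goes from more splits than the limit allows to a few thousand once the small files are compacted, which is why generating bigger files in the ETL is the usual fix.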
Is this Hive or Glue?