I’ve added a datasource that is an Azure/S3 bucket.
The bucket hierarchy is structured as:
Data (this is selected as the ‘root’ of the datasource)
with a folder for each year/month/day/hh below it. The files sit only at the HH level, with multiple files in each HH folder.
The files are of type .json.
With this setup, Dremio gives me this error:
Number of splits (1044559) in dataset exceeds dataset split limit of 300000
I realise that 1044559 is the number of json files.
How can I work around this?
Is it simply a case of creating a dataset per year-month and then combining them with a ‘union’ into a VDS?
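
To make the idea concrete, here is a rough sketch of how the query behind such a VDS could be generated, one SELECT per year-month folder joined with UNION ALL. The source name "mybucket", the zero-padded folder names, and the date range are placeholders I made up for illustration, not my real layout, and I have not checked that each month stays under the 300000 split limit.

```python
from datetime import date


def month_range(start: date, end: date):
    """Yield (year, month) pairs from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1


def build_union_sql(source: str, root: str, start: date, end: date) -> str:
    """Build one SELECT per year-month folder and join them with UNION ALL.

    Assumes each year-month folder has already been promoted to its own
    dataset, and that folders are named with zero-padded numbers (hypothetical).
    """
    selects = [
        f'SELECT * FROM "{source}"."{root}"."{y:04d}"."{m:02d}"'
        for y, m in month_range(start, end)
    ]
    return "\nUNION ALL\n".join(selects)


# Placeholder source name and date range, for illustration only.
sql = build_union_sql("mybucket", "Data", date(2019, 11, 1), date(2020, 2, 1))
print(sql)
```

The generated text would then be saved as the definition of the VDS. UNION ALL is used rather than UNION so that no deduplication pass is run across the months.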