Query was cancelled because planning time exceeded 60 seconds
The data consists of Parquet files partitioned by Date/Hour, stored in an S3 bucket.
The problem occurs when I try to limit the data to a specific period (date between). Attachment: 5e92d1c0-a0a7-4813-8d37-3439ac8533e1.zip (4.4 KB)
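Here is a simplified sketch of the query shape (the source and column names are placeholders, and I'm assuming the Date/Hour partition directories show up as Dremio's dir0/dir1 pseudo-columns):

```sql
-- Hypothetical sketch of the query pattern that hits the planning timeout.
-- "s3source"."events" and the column names are made-up placeholders.
SELECT *
FROM "s3source"."events"
WHERE dir0 BETWEEN '2017-10-01' AND '2017-10-07'  -- dir0 = Date partition directory
  AND dir1 BETWEEN '00' AND '23';                 -- dir1 = Hour partition directory
```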
We at Fyber.com are really interested in the product, so I've removed the default timeout.
However, I'm running into a few issues:
Keeping the dataset in memory: I have a pretty big cluster of 20 R4.4xlarge nodes, each with about 122 GB of memory, yet no matter what dataset I use, only a tiny fraction of the data ends up in memory. With such a cluster I expected the whole dataset to fit in memory (I've configured Dremio with a 16 GB heap and 80 GB of direct memory).
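For reference, the memory settings were applied in dremio-env on each executor, along these lines (a sketch assuming the standard DREMIO_MAX_HEAP_MEMORY_SIZE_MB / DREMIO_MAX_DIRECT_MEMORY_SIZE_MB variables):

```sh
# dremio-env on each executor (r4.4xlarge, ~122 GB RAM)
DREMIO_MAX_HEAP_MEMORY_SIZE_MB=16384    # 16 GB JVM heap
DREMIO_MAX_DIRECT_MEMORY_SIZE_MB=81920  # 80 GB direct memory for execution buffers
```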
Assuming all the machines read data from the S3 bucket in parallel, how can I increase the read rate so that it reaches the network bandwidth limit?
How can I optimize CPU usage to speed up queries? After a lot of testing with reflections, the majority of our queries are simply never accelerated because they are not covered by any reflection.
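For example (a sketch with made-up dataset and column names, assuming Dremio's ALTER DATASET ... CREATE AGGREGATE REFLECTION syntax), a reflection only accelerates queries whose dimensions and measures it covers:

```sql
-- Hypothetical aggregate reflection; dataset and column names are placeholders.
ALTER DATASET "s3source"."events"
CREATE AGGREGATE REFLECTION daily_totals
USING DIMENSIONS ("Date") MEASURES (revenue);

-- Covered: grouped by a reflection dimension, aggregating a reflection measure.
SELECT "Date", SUM(revenue) FROM "s3source"."events" GROUP BY "Date";

-- Not covered: country is not a dimension of the reflection, so this query
-- is not accelerated and falls back to scanning the raw Parquet files.
SELECT country, SUM(revenue) FROM "s3source"."events" GROUP BY country;
```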