I get this error:
Query was cancelled because planning time exceeded 60 seconds
The data is Parquet files partitioned by Date/Hour, stored in an S3 bucket.
The problem occurs when I try to limit the data to a specific period (a date BETWEEN filter).
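As a rough illustration of how many Date/Hour partitions a date range can span, here is a minimal sketch. The `Date=.../Hour=...` naming, the `hour_partitions` helper, and the example dates are all hypothetical; the actual bucket layout is not shown in this thread.

```python
from datetime import date, timedelta


def hour_partitions(start: date, end: date):
    """Yield one hypothetical 'Date=YYYY-MM-DD/Hour=HH' prefix per hour.

    Both start and end dates are included in the range.
    """
    day = start
    while day <= end:
        for hour in range(24):
            yield f"Date={day.isoformat()}/Hour={hour:02d}"
        day += timedelta(days=1)


# Example dates only; they are not taken from the original query.
prefixes = list(hour_partitions(date(2018, 1, 1), date(2018, 1, 7)))
print(len(prefixes))   # 7 days * 24 hours = 168
print(prefixes[0])     # Date=2018-01-01/Hour=00
print(prefixes[-1])    # Date=2018-01-07/Hour=23
```

Even a one-week filter touches 168 hourly partitions under this assumed layout, so the number of partitions grows quickly with the length of the period.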
5e92d1c0-a0a7-4813-8d37-3439ac8533e1.zip (4.4 KB)
So I can't save the dataset.
What can I do?
We should be releasing a new version of Dremio soon; hopefully it will fix this issue.
I’ll keep you posted!
We at Fyber.com are really interested in the product, so I've removed the default timeout.
However, I still have some issues:
- Actually running queries in memory: I have a fairly big cluster of 20 r4.4xlarge machines, each with about 122 GB of memory, but no matter what kind of dataset I use, only a tiny fraction of the data goes into memory. With such a cluster I expected the whole dataset to be in memory (I've configured Dremio with a 16 GB heap and 80 GB of direct memory).
- When all the machines read data in parallel from the S3 bucket, how can I increase the read rate to reach the network bandwidth limit?
- How can I optimize CPU usage to speed up queries? After a lot of testing with reflections, the majority of queries are simply never accelerated because they are not covered by a reflection.
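
For reference, the cluster totals implied by the figures above can be worked out as follows. This is only arithmetic on the numbers quoted in the post, and it assumes the same heap and direct memory settings are applied on all 20 machines.

```python
# Figures quoted in the post; identical settings on every node are an assumption.
NODES = 20
RAM_PER_NODE_GB = 122
HEAP_PER_NODE_GB = 16
DIRECT_PER_NODE_GB = 80

total_ram_gb = NODES * RAM_PER_NODE_GB        # physical memory across the cluster
total_heap_gb = NODES * HEAP_PER_NODE_GB      # JVM heap across the cluster
total_direct_gb = NODES * DIRECT_PER_NODE_GB  # direct memory across the cluster

# Memory on each node that is assigned to neither heap nor direct memory.
unassigned_per_node_gb = RAM_PER_NODE_GB - HEAP_PER_NODE_GB - DIRECT_PER_NODE_GB

print(total_ram_gb)            # 2440
print(total_heap_gb)           # 320
print(total_direct_gb)         # 1600
print(unassigned_per_node_gb)  # 26
```

So about 1600 GB of direct memory is configured cluster-wide out of roughly 2440 GB of physical memory.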
Could you help me?
Could you provide a query plan so we can take a closer look?
A few other questions:
- How did you deploy Dremio? How many coordinators and executors?
- What is the source data on S3?
- How are you issuing queries? If through the SQL console, are you using Preview or Run?