I keep running into this error when querying Parquet files from AWS S3:
Unable to execute HTTP request: Timeout waiting for connection from pool
It doesn’t happen with every data source, but often enough to cause pain.
From the research I’ve done, it looks like I need to make sure that the fs.s3a.connection.maximum connection property is set to a high number. We currently have it set to 100000. I could easily raise the value to a million, but before I reset all our S3 data sources, I feel like there may be something else going on.
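To make the "something else going on" suspicion concrete, here is a toy Python sketch of a bounded connection pool. It is only an illustration under my own assumptions, not the actual S3A or AWS SDK code: `ToyPool`, `run_requests`, and the `leak` flag are hypothetical names. It models the case where connections are taken from the pool and never returned.

```python
import queue


class ToyPool:
    """Toy bounded connection pool (hypothetical, not the S3A implementation)."""

    def __init__(self, maximum):
        # Pre-fill the pool with `maximum` placeholder connections.
        self._free = queue.Queue()
        for conn_id in range(maximum):
            self._free.put(conn_id)

    def acquire(self, timeout=0.01):
        # Wait briefly for a free connection; fail like the error in this post.
        try:
            return self._free.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError("Timeout waiting for connection from pool")

    def release(self, conn):
        # Hand the connection back so another request can reuse it.
        self._free.put(conn)


def run_requests(pool, count, leak):
    """Issue up to `count` requests; return how many succeeded before a timeout."""
    succeeded = 0
    for _ in range(count):
        try:
            conn = pool.acquire()
        except TimeoutError:
            break
        if not leak:
            pool.release(conn)  # well-behaved caller returns the connection
        succeeded += 1
    return succeeded


if __name__ == "__main__":
    print(run_requests(ToyPool(5), 100, leak=False))
    print(run_requests(ToyPool(5), 100, leak=True))
    print(run_requests(ToyPool(50), 100, leak=True))
```

In this toy model, a caller that releases its connections never times out even with a tiny pool, while a caller that leaks them times out as soon as the pool is used up, whatever the maximum is. If something like that is happening here, a bigger number would only postpone the error.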
Is anyone else seeing this kind of behavior? Do I need to raise the connection property to a million? If I do change the value, is there a better way to do it than through the GUI? Rebuilding all the data sources is a pain.
Any help is appreciated. Thank you!