I have a dataset pointing at a MySQL database, with a basic reflection set up against it. In a second dataset I'm querying that first dataset and filtering it down further. The preview happens to return no results, so when I try to run the query itself I get this error:
SYSTEM ERROR: ConnectionPoolTimeoutException: Timeout waiting for connection from pool
SqlOperatorImpl ARROW_WRITER
Location 0:0:3
Fragment 0:0
[Error Id: 5f5f2224-496e-4f82-8b88-0b0b74831bb6 on ip-172-31-53-224.ec2.internal:-1]
(java.io.InterruptedIOException) saving output on storage_cache/results/.257b7fac-0509-d0c1-49d3-d011de88e700-1518633043201/0_0_0.dremarrow1: com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
org.apache.hadoop.fs.s3a.S3AUtils.translateException():125
org.apache.hadoop.fs.s3a.S3AOutputStream.close():121
org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close():72
org.apache.hadoop.fs.FSDataOutputStream.close():106
com.dremio.exec.store.dfs.FSDataOutputStreamWrapper.close():96
com.dremio.exec.store.easy.arrow.ArrowRecordWriter.closeCurrentFile():165
com.dremio.exec.store.easy.arrow.ArrowRecordWriter.close():181
com.dremio.sabot.op.writer.WriterOperator.noMoreToConsume():182
com.dremio.sabot.driver.SmartOp$SmartSingleInput.noMoreToConsume():216
com.dremio.sabot.driver.StraightPipe.pump():63
com.dremio.sabot.driver.Pipeline.doPump():82
com.dremio.sabot.driver.Pipeline.pumpOnce():72
com.dremio.sabot.exec.fragment.FragmentExecutor$DoAsPumper.run():288
com.dremio.sabot.exec.fragment.FragmentExecutor$DoAsPumper.run():284
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():422
org.apache.hadoop.security.UserGroupInformation.doAs():1807
com.dremio.sabot.exec.fragment.FragmentExecutor.run():243
com.dremio.sabot.exec.fragment.FragmentExecutor.access$800():83
com.dremio.sabot.exec.fragment.FragmentExecutor$AsyncTaskImpl.run():577
com.dremio.sabot.task.AsyncTaskWrapper.run():92
com.dremio.sabot.task.slicing.SlicingThread.run():71
Caused By (com.amazonaws.SdkClientException) Unable to execute HTTP request: Timeout waiting for connection from pool
com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException():1069
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper():1035
com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute():742
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer():716
com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute():699
com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500():667
com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute():649
com.amazonaws.http.AmazonHttpClient.execute():513
com.amazonaws.services.s3.AmazonS3Client.invoke():4221
com.amazonaws.services.s3.AmazonS3Client.invoke():4168
com.amazonaws.services.s3.AmazonS3Client.putObject():1718
com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInOneChunk():133
com.amazonaws.services.s3.transfer.internal.UploadCallable.call():125
com.amazonaws.services.s3.transfer.internal.UploadMonitor.call():143
com.amazonaws.services.s3.transfer.internal.UploadMonitor.call():48
java.util.concurrent.FutureTask.run():266
java.util.concurrent.ThreadPoolExecutor.runWorker():1149
java.util.concurrent.ThreadPoolExecutor$Worker.run():624
java.lang.Thread.run():748
Caused By (org.apache.http.conn.ConnectionPoolTimeoutException) Timeout waiting for connection from pool
org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection():286
org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get():263
sun.reflect.GeneratedMethodAccessor13.invoke():-1
sun.reflect.DelegatingMethodAccessorImpl.invoke():43
java.lang.reflect.Method.invoke():498
com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke():70
com.amazonaws.http.conn.$Proxy39.get():-1
org.apache.http.impl.execchain.MainClientExec.execute():190
org.apache.http.impl.execchain.ProtocolExec.execute():184
org.apache.http.impl.client.InternalHttpClient.doExecute():184
org.apache.http.impl.client.CloseableHttpClient.execute():82
org.apache.http.impl.client.CloseableHttpClient.execute():55
com.amazonaws.http.apache.client.impl.SdkHttpClient.execute():72
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest():1190
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper():1030
com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute():742
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer():716
com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute():699
com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500():667
com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute():649
com.amazonaws.http.AmazonHttpClient.execute():513
com.amazonaws.services.s3.AmazonS3Client.invoke():4221
com.amazonaws.services.s3.AmazonS3Client.invoke():4168
com.amazonaws.services.s3.AmazonS3Client.putObject():1718
com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInOneChunk():133
com.amazonaws.services.s3.transfer.internal.UploadCallable.call():125
com.amazonaws.services.s3.transfer.internal.UploadMonitor.call():143
com.amazonaws.services.s3.transfer.internal.UploadMonitor.call():48
java.util.concurrent.FutureTask.run():266
java.util.concurrent.ThreadPoolExecutor.runWorker():1149
java.util.concurrent.ThreadPoolExecutor$Worker.run():624
java.lang.Thread.run():748
I can see files on S3, so the distributed store itself seems to be working.
So far the tool works well for small datasets, but I'd really like it to work for larger ones too.
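In case it's relevant: since the root cause is an httpclient ConnectionPoolTimeoutException inside the AWS SDK, I've been wondering whether raising the S3A connection pool limit would help. Something like this in core-site.xml is what I had in mind (fs.s3a.connection.maximum is a standard Hadoop S3A property; whether Dremio picks it up from its conf directory is an assumption on my part, and 100 is just a guess at a value):

```xml
<!-- core-site.xml fragment: raise the S3A HTTP connection pool cap.
     Assumes Dremio reads Hadoop S3A settings from its conf directory;
     the value 100 is arbitrary, chosen well above the old default of 15. -->
<property>
  <name>fs.s3a.connection.maximum</name>
  <value>100</value>
</property>
```

Is that the right knob here, or is the pool exhaustion a symptom of something else (e.g. result streams not being closed)?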