Call to cancel query on RDBMs source failed with error: [Amazon](600000) Error setting/closing connection: Operation timed out

Is anyone else noticing this error? I am running a simple query that joins two different data sources (limiting the results to 10). For now I am just testing on a Mac.

The query runs for a few minutes and then I get this error:
2018-01-30 15:46:16,835 [e9 - 258efd16-17ca-99b8-4cda-44854a158900:frag:5:0] INFO c.d.exec.store.jdbc.JdbcRecordReader - Call to cancel query on RDBMs source failed with error: [Amazon](600000) Error setting/closing connection: Operation timed out.
2018-01-30 15:46:16,843 [FABRIC-rpc-event-queue] INFO c.d.exec.work.foreman.QueryManager - Fragment 258efd16-17ca-99b8-4cda-44854a158900:5:0 failed, cancelling remaining fragments.

2018-01-30 15:46:16,946 [FABRIC-rpc-event-queue] INFO c.d.e.w.protector.ForemenWorkManager - A fragment status message arrived post query termination, dropping. Fragment [0:0] reported a state of CANCELLED.
2018-01-30 15:46:17,070 [e9 - 258efd16-17ca-99b8-4cda-44854a158900:frag:4:0] ERROR com.dremio.sabot.driver.SmartOp - EOFException: unexpected end of exception, read 19 bytes from 244
com.dremio.common.exceptions.UserException: EOFException: unexpected end of exception, read 19 bytes from 244
at com.dremio.common.exceptions.UserException$Builder.build(UserException.java:648) ~[dremio-common-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
at com.dremio.sabot.driver.SmartOp.contextualize(SmartOp.java:125) [dremio-sabot-kernel-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
at com.dremio.sabot.driver.SmartOp$SmartProducer.close(SmartOp.java:535) [dremio-sabot-kernel-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
at com.dremio.common.AutoCloseables.close(AutoCloseables.java:89) [dremio-common-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
at com.dremio.sabot.driver.Pipeline.close(Pipeline.java:158) [dremio-sabot-kernel-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
at com.dremio.common.DeferredException.suppressingClose(DeferredException.java:181) [dremio-common-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
at com.dremio.sabot.exec.fragment.FragmentExecutor.retire(FragmentExecutor.java:382) [dremio-sabot-kernel-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
at com.dremio.sabot.exec.fragment.FragmentExecutor.finishRun(FragmentExecutor.java:349) [dremio-sabot-kernel-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
at com.dremio.sabot.exec.fragment.FragmentExecutor.run(FragmentExecutor.java:263) [dremio-sabot-kernel-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
at com.dremio.sabot.exec.fragment.FragmentExecutor.access$800(FragmentExecutor.java:83) [dremio-sabot-kernel-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
at com.dremio.sabot.exec.fragment.FragmentExecutor$AsyncTaskImpl.run(FragmentExecutor.java:577) [dremio-sabot-kernel-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
at com.dremio.sabot.task.AsyncTaskWrapper.run(AsyncTaskWrapper.java:92) [dremio-sabot-kernel-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
at com.dremio.sabot.task.slicing.SlicingThread.run(SlicingThread.java:71) [dremio-extra-sabot-scheduler-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
Caused by: java.sql.SQLNonTransientConnectionException: (conn:37583530) Could not close resultSet : unexpected end of exception, read 19 bytes from 244
at org.mariadb.jdbc.internal.util.exceptions.ExceptionMapper.get(ExceptionMapper.java:156) ~[mariadb-java-client-1.6.2.jar:na]
at org.mariadb.jdbc.internal.util.exceptions.ExceptionMapper.getException(ExceptionMapper.java:118) ~[mariadb-java-client-1.6.2.jar:na]
at org.mariadb.jdbc.internal.util.exceptions.ExceptionMapper.throwException(ExceptionMapper.java:92) ~[mariadb-java-client-1.6.2.jar:na]
at org.mariadb.jdbc.internal.com.read.resultset.SelectResultSet.close(SelectResultSet.java:524) ~[mariadb-java-client-1.6.2.jar:na]
at com.dremio.common.AutoCloseables.close(AutoCloseables.java:89) [dremio-common-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
at com.dremio.common.AutoCloseables.close(AutoCloseables.java:68) [dremio-common-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
at com.dremio.exec.store.jdbc.JdbcRecordReader.close(JdbcRecordReader.java:280) ~[dremio-extra-plugin-jdbc-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
at com.dremio.common.AutoCloseables.close(AutoCloseables.java:89) [dremio-common-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
at com.dremio.common.AutoCloseables.close(AutoCloseables.java:68) [dremio-common-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
at com.dremio.exec.store.CoercionReader.close(CoercionReader.java:235) ~[dremio-sabot-kernel-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
at com.dremio.common.AutoCloseables.close(AutoCloseables.java:89) [dremio-common-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
at com.dremio.common.AutoCloseables.close(AutoCloseables.java:68) [dremio-common-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
at com.dremio.sabot.op.scan.ScanOperator.close(ScanOperator.java:350) ~[dremio-sabot-kernel-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
at com.dremio.sabot.driver.SmartOp$SmartProducer.close(SmartOp.java:533) [dremio-sabot-kernel-1.4.4-201801230630490666-6d69d32.jar:1.4.4-201801230630490666-6d69d32]
… 10 common frames omitted
Caused by: java.sql.SQLException: Could not close resultSet : unexpected end of exception, read 19 bytes from 244
at org.mariadb.jdbc.internal.com.read.resultset.SelectResultSet.close(SelectResultSet.java:526) ~[mariadb-java-client-1.6.2.jar:na]
… 20 common frames omitted
Caused by: java.io.EOFException: unexpected end of exception, read 19 bytes from 244
at org.mariadb.jdbc.internal.io.input.StandardPacketInputStream.getPacketArray(StandardPacketInputStream.java:269) ~[mariadb-java-client-1.6.2.jar:na]
at org.mariadb.jdbc.internal.com.read.resultset.SelectResultSet.readNextValue(SelectResultSet.java:405) ~[mariadb-java-client-1.6.2.jar:na]
at org.mariadb.jdbc.internal.com.read.resultset.SelectResultSet.close(SelectResultSet.java:520) ~[mariadb-java-client-1.6.2.jar:na]
… 20 common frames omitted

Hi @HLNA,

Is your query aborting during execution and logging this error message, or succeeding while still showing this error message in the log?

Hi @jduong, it’s aborting. It’s strange that it works for a few minutes (after a fresh server start) and then, after some time, fails with this error. If it had not worked in the first place, I would check my IAM/Redshift inbound settings, but this is different.

Just to confirm: if I restart Dremio (quit and start; not “stop” and start), the query works fine again for some time.

Are you able to query your MySQL and Redshift sources individually successfully?

What is the query you are running? What are the sizes of the tables being joined?

Hi @jduong, yes, I am able to query both individually. And this error is not just related to joins (those have not succeeded yet, perhaps because of the size of the tables); it also happens for single-table queries.

Also, I see the MariaDB JAR being used for MySQL; that should be okay… but should I change it to the MySQL JAR?

It should be OK to use the MariaDB driver for MySQL. Is your MySQL actually Amazon Aurora, or is it a normal MySQL instance? Is it MySQL on RDS?
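
As a quick sanity check, you can also exercise the same mariadb-java-client JAR outside Dremio with a minimal JDBC program. This is only a sketch: the host, database, credentials and query below are placeholders (not taken from your setup), and the MariaDB driver JAR has to be on the classpath.

// Minimal sketch: verify the MariaDB driver can reach the MySQL source outside Dremio.
// Endpoint, database, credentials and query are placeholders.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class MySqlDriverCheck {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:mariadb://mysql-host.example.com:3306/mydb"; // placeholder endpoint
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT 1")) {
            while (rs.next()) {
                System.out.println("MariaDB driver round trip OK: " + rs.getInt(1));
            }
        }
    }
}

If that keeps working even after Dremio starts failing, the driver itself is probably not the problem.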

For Redshift, you may need to adjust your TCP keepalive settings:
https://docs.aws.amazon.com/redshift/latest/mgmt/connecting-firewall-guidance.html
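
If you want to confirm whether something in between (firewall/NAT) is silently dropping idle connections, which is the problem those keepalive settings address, a rough standalone test is to hold a Redshift JDBC connection idle and then reuse it. The cluster endpoint, credentials and idle duration below are placeholders, and the Amazon Redshift JDBC driver needs to be on the classpath.

// Rough sketch: open a connection, leave it idle longer than the suspected timeout, then reuse it.
// If the second query fails, idle connections are being dropped and the keepalive tuning above should help.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class RedshiftIdleCheck {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:redshift://my-cluster.abc123.us-east-1.redshift.amazonaws.com:5439/dev"; // placeholder
        try (Connection conn = DriverManager.getConnection(url, "user", "password")) {
            try (Statement stmt = conn.createStatement()) {
                stmt.executeQuery("SELECT 1");   // works on a fresh connection
            }
            Thread.sleep(10 * 60 * 1000L);       // stay idle ~10 minutes (arbitrary duration)
            try (Statement stmt = conn.createStatement()) {
                stmt.executeQuery("SELECT 1");   // fails if the idle connection was dropped
            }
            System.out.println("Idle connection survived.");
        }
    }
}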

Do you see errors with both MySQL and Redshift when querying them individually?

@jduong Thanks… so does that mean Dremio opens one connection when the server starts? Is it not on a per-query basis?

Dremio will use connection pooling with relational databases. If you have several queries running, you’ll have several JDBC connections. If there are already connections available in the pool, they’ll be reused rather than a new physical connection being opened.
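
To illustrate the general idea (this is not Dremio’s internal implementation, just a generic JDBC pooling sketch using HikariCP with placeholder connection details; HikariCP and the MariaDB driver are assumed to be on the classpath):

// Generic pooling sketch: physical connections are created by the pool and handed out per query,
// then returned for reuse instead of being closed.
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import java.sql.Connection;
import java.sql.Statement;

public class PoolingSketch {
    public static void main(String[] args) throws Exception {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:mariadb://mysql-host.example.com:3306/mydb"); // placeholder
        config.setUsername("user");
        config.setPassword("password");
        config.setMaximumPoolSize(5);          // several concurrent queries share these connections
        config.setMaxLifetime(5 * 60 * 1000L); // retire pooled connections before an idle timeout can drop them

        try (HikariDataSource pool = new HikariDataSource(config)) {
            for (int i = 0; i < 3; i++) {
                // After warm-up, getConnection() returns an already-open connection from the pool.
                try (Connection conn = pool.getConnection();
                     Statement stmt = conn.createStatement()) {
                    stmt.executeQuery("SELECT 1");
                }
            }
        }
    }
}

One common design choice with any pool is to cap the connection lifetime below whatever idle timeout sits between you and the database, so a query never gets handed a connection that has already been silently dropped.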

My link to the Redshift docs lists some configuration settings Amazon recommends for avoiding timeout issues with long-running Redshift queries. It might help with some of the issues you’re seeing.

Thanks @jduong! The connection pooling makes sense. Yes, I saw the docs, and the TTL/keepalive settings on my local machine have to be adjusted; will do. Right now it’s in test mode. When deploying to production, should I apply the same settings on all nodes in the cluster or just the master node (if I prefer to handle it with client-side settings)? Or on Redshift, should I just allow the cluster IP range?

@HLNA, you should set the keepalive settings on all nodes in your Dremio cluster. You’ll also want to grant each node in your Dremio cluster access to Redshift.

@jduong Thank you… I will update you on the outcome.