Big MySQL virtual dataset creation fails after about 1:15:00

Hi, I’m trying to load a big MySQL table (~10 GB. It is actually a UNION ALL of several tables) into Dremio (running with default settings on a Ubuntu desktop) for testing purposes but it fails after about 1:15:00 with the error to follow. The query itself run in Toad for MySQL only takes about half an hour to run. What is Dremio actually doing here? Is it loading the data into Arrow format or something similar that makes it take so long (and fail)? Is the issue something todo with configuration of the MySQL box? Thanks in advance.

DATA_READ ERROR: The JDBC storage plugin failed while trying setup the SQL query.

sql [My MySQL query here. It is just a case of SELECT (bunch of columns) with a few UNION ALLs]
plugin MySQL
SqlOperatorImpl JDBC_SUB_SCAN
Location 0:0:5
SqlOperatorImpl JDBC_SUB_SCAN
Location 0:0:5
Fragment 0:0

[Error Id: 301ebbb1-1c22-4349-a2b8-3779484803fa on PC987:31010]

(java.sql.SQLNonTransientConnectionException) (conn:1713) Could not send query: Connection reset
org.mariadb.jdbc.internal.util.exceptions.ExceptionMapper.get():156
org.mariadb.jdbc.internal.util.exceptions.ExceptionMapper.getException():118
org.mariadb.jdbc.MariaDbStatement.executeExceptionEpilogue():247
org.mariadb.jdbc.MariaDbStatement.executeInternal():322
org.mariadb.jdbc.MariaDbStatement.executeQuery():492
com.dremio.exec.store.jdbc.JdbcRecordReader.setup():125
com.dremio.exec.store.CoercionReader.setup():111
com.dremio.sabot.op.scan.ScanOperator$1.run():179
com.dremio.sabot.op.scan.ScanOperator$1.run():175
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():422
org.apache.hadoop.security.UserGroupInformation.doAs():1807
com.dremio.sabot.op.scan.ScanOperator.setupReaderAsCorrectUser():175
com.dremio.sabot.op.scan.ScanOperator.setupReader():167
com.dremio.sabot.op.scan.ScanOperator.setup():148
com.dremio.sabot.driver.SmartOp$SmartProducer.setup():535
com.dremio.sabot.driver.Pipe$SetupVisitor.visitProducer():79
com.dremio.sabot.driver.Pipe$SetupVisitor.visitProducer():63
com.dremio.sabot.driver.SmartOp$SmartProducer.accept():505
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.Pipeline.setup():58
com.dremio.sabot.exec.fragment.FragmentExecutor.setupExecution():322
com.dremio.sabot.exec.fragment.FragmentExecutor.run():222
com.dremio.sabot.exec.fragment.FragmentExecutor.access$800():81
com.dremio.sabot.exec.fragment.FragmentExecutor$AsyncTaskImpl.run():567
com.dremio.sabot.task.AsyncTaskWrapper.run():91
com.dremio.sabot.task.slicing.SlicingThread.run():71
Caused By (java.sql.SQLException) Could not send query: Connection reset
Query is:[My MySQL query here. It is just a case of SELECT (bunch of columns) with a few UNION ALLs]
org.mariadb.jdbc.internal.util.LogQueryTool.exceptionWithQuery():113
org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.executeQuery():167
org.mariadb.jdbc.MariaDbStatement.executeInternal():316
org.mariadb.jdbc.MariaDbStatement.executeQuery():492
com.dremio.exec.store.jdbc.JdbcRecordReader.setup():125
com.dremio.exec.store.CoercionReader.setup():111
com.dremio.sabot.op.scan.ScanOperator$1.run():179
com.dremio.sabot.op.scan.ScanOperator$1.run():175
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():422
org.apache.hadoop.security.UserGroupInformation.doAs():1807
com.dremio.sabot.op.scan.ScanOperator.setupReaderAsCorrectUser():175
com.dremio.sabot.op.scan.ScanOperator.setupReader():167
com.dremio.sabot.op.scan.ScanOperator.setup():148
com.dremio.sabot.driver.SmartOp$SmartProducer.setup():535
com.dremio.sabot.driver.Pipe$SetupVisitor.visitProducer():79
com.dremio.sabot.driver.Pipe$SetupVisitor.visitProducer():63
com.dremio.sabot.driver.SmartOp$SmartProducer.accept():505
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.Pipeline.setup():58
com.dremio.sabot.exec.fragment.FragmentExecutor.setupExecution():322
com.dremio.sabot.exec.fragment.FragmentExecutor.run():222
com.dremio.sabot.exec.fragment.FragmentExecutor.access$800():81
com.dremio.sabot.exec.fragment.FragmentExecutor$AsyncTaskImpl.run():567
com.dremio.sabot.task.AsyncTaskWrapper.run():91
com.dremio.sabot.task.slicing.SlicingThread.run():71
Caused By (java.net.SocketException) Connection reset
java.net.SocketInputStream.read():210
java.net.SocketInputStream.read():141
java.io.BufferedInputStream.fill():246
java.io.BufferedInputStream.read1():286
java.io.BufferedInputStream.read():345
org.mariadb.jdbc.internal.io.input.StandardPacketInputStream.getPacketArray():267
org.mariadb.jdbc.internal.com.read.resultset.SelectResultSet.readNextValue():405
org.mariadb.jdbc.internal.com.read.resultset.SelectResultSet.fetchAllResults():309
org.mariadb.jdbc.internal.com.read.resultset.SelectResultSet.():182
org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.readResultSet():1315
org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.readPacket():1081
org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.getResult():1031
org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.executeQuery():164
org.mariadb.jdbc.MariaDbStatement.executeInternal():316
org.mariadb.jdbc.MariaDbStatement.executeQuery():492
com.dremio.exec.store.jdbc.JdbcRecordReader.setup():125
com.dremio.exec.store.CoercionReader.setup():111
com.dremio.sabot.op.scan.ScanOperator$1.run():179
com.dremio.sabot.op.scan.ScanOperator$1.run():175
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():422
org.apache.hadoop.security.UserGroupInformation.doAs():1807
com.dremio.sabot.op.scan.ScanOperator.setupReaderAsCorrectUser():175
com.dremio.sabot.op.scan.ScanOperator.setupReader():167
com.dremio.sabot.op.scan.ScanOperator.setup():148
com.dremio.sabot.driver.SmartOp$SmartProducer.setup():535
com.dremio.sabot.driver.Pipe$SetupVisitor.visitProducer():79
com.dremio.sabot.driver.Pipe$SetupVisitor.visitProducer():63
com.dremio.sabot.driver.SmartOp$SmartProducer.accept():505
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.Pipeline.setup():58
com.dremio.sabot.exec.fragment.FragmentExecutor.setupExecution():322
com.dremio.sabot.exec.fragment.FragmentExecutor.run():222
com.dremio.sabot.exec.fragment.FragmentExecutor.access$800():81
com.dremio.sabot.exec.fragment.FragmentExecutor$AsyncTaskImpl.run():567
com.dremio.sabot.task.AsyncTaskWrapper.run():91
com.dremio.sabot.task.slicing.SlicingThread.run():71

From the stack trace itself, it looks like Dremio didn’t actually fetch any data, but failed sending the query to your MySQL server (it might be also that Dremio actually sent the query but the server never replied back to Dremio).

If you could check your MySQL server logs, you might gain useful informations about what happened to your query. I would also advise checking the processlist during query “execution” as it should show what the current query status is.

When trying to reproduce the issue, I noticed that the MySQL/MariaDB client tries to read all data in memory first before handing it over to Dremio. I open an internal issue to get this behavior fixed.

Thanks for reporting!