ClickHouse ARP - 100x slower than direct JDBC query

Hi, we are testing the Dremio (v20.1.0) connection to ClickHouse via
mmsmdali’s ARP connector.
The ClickHouse source table has around 3.5 billion rows.

A simple query like 'select count() from table1' works like a charm - 50 ms.
But whenever we add a WHERE clause or a simple CAST function, it hangs for ages.
'SELECT count() from table1 where field1=1'
runs for several hours and then fails.
The same query sent via DBeaver over JDBC returns in just milliseconds, using the same JDBC driver.

Q1. It seems obvious that no pushdown is happening and that Dremio tries to fetch all the data and then filter it. What might be wrong with the ARP? Which setting might be missing? The ARP's code looks very similar to the other ARP connectors we've tested.
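
For reference, in Dremio's public ARP examples a comparison is only eligible for pushdown when the connector's YAML declares the operator with a signature matching the Dremio types the columns are mapped to. Below is a rough sketch of the kind of entry that would be relevant for 'field1=1'. It follows the layout of those public examples and is not copied from the ClickHouse connector; the "integer" type name is an assumption about how field1 is mapped.

```yaml
# Sketch only: layout follows Dremio's public ARP example connectors.
# The type names are assumptions; they must match the Dremio types that
# the data_types mappings section assigns to the ClickHouse column types.
expressions:
  operators:
    - names:
        - "="
      signatures:
        - args:
            - "integer"
            - "integer"
          return: "boolean"
```

If the ClickHouse type of field1 has no entry under data_types mappings, or "=" has no signature for its mapped type, that could explain the filter being evaluated in Dremio rather than in ClickHouse.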

Q2. We tried an external query as a workaround, but every query we submit fails with the Calcite error 'Invalid External Query statement on source CH'. Even a simple query like 'SELECT COUNT(*) FROM table1' fails.
In the ARP code, externalQuerySupported is enabled:

@SourceType(value = "CLICKHOUSE", label = "ClickHouse", uiConfig = "clickhouse-layout.json", externalQuerySupported = true)

Other ARP connectors that we’ve tested also work well with external queries.
What might be wrong with the ARP's external query configuration? Is there a hidden option?

Invalid External Query statement on source
VALIDATION ERROR: Invalid External Query statement on source

SQL Query SELECT * FROM table(CH.external_query('select count(*) from db.table1'))
startLine 0
startColumn 0
endLine 0
endColumn 0

(org.apache.calcite.runtime.CalciteContextException) At line 0, column 0: Invalid External Query statement on source
jdk.internal.reflect.GeneratedConstructorAccessor156.newInstance():-1
jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance():45
java.lang.reflect.Constructor.newInstance():490
org.apache.calcite.runtime.Resources$ExInstWithCause.ex():463
org.apache.calcite.sql.SqlUtil.newContextException():789
org.apache.calcite.sql.SqlUtil.newContextException():774
com.dremio.exec.store.jdbc.JdbcExternalQueryMetadataUtility.newValidationError():146
com.dremio.exec.store.jdbc.JdbcExternalQueryMetadataUtility.getExternalQueryMetadataFromSource():121
com.dremio.exec.store.jdbc.JdbcExternalQueryMetadataUtility.lambda$getBatchSchema$0():67
java.util.concurrent.FutureTask.run():264
java.util.concurrent.ThreadPoolExecutor.runWorker():1128
java.util.concurrent.ThreadPoolExecutor$Worker.run():628
java.lang.Thread.run():834
Caused By (org.apache.calcite.sql.validate.SqlValidatorException) Invalid External Query statement on source
jdk.internal.reflect.GeneratedConstructorAccessor155.newInstance():-1
jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance():45
java.lang.reflect.Constructor.newInstance():490
org.apache.calcite.runtime.Resources$ExInstWithCause.ex():463
org.apache.calcite.runtime.Resources$ExInst.ex():572
org.apache.calcite.sql.SqlUtil.newContextException():789
org.apache.calcite.sql.SqlUtil.newContextException():774
com.dremio.exec.store.jdbc.JdbcExternalQueryMetadataUtility.newValidationError():146
com.dremio.exec.store.jdbc.JdbcExternalQueryMetadataUtility.getExternalQueryMetadataFromSource():121
com.dremio.exec.store.jdbc.JdbcExternalQueryMetadataUtility.lambda$getBatchSchema$0():67
java.util.concurrent.FutureTask.run():264
java.util.concurrent.ThreadPoolExecutor.runWorker():1128
java.util.concurrent.ThreadPoolExecutor$Worker.run():628
java.lang.Thread.run():834
