JDBC PreparedStatement fails. Cannot convert RexNode to equivalent Dremio expression

The online documentation provide no details on what methods etc the JDBC driver is supposed to support or not.

Server 4.9.3-202010281843560195-edc49b6d
Driver 4.9.1-202010230218060541-2e764ed0

Attempting to use the PreparedStatement interface fails.

Connection.prepareStatement(select * from dbcert.tsint where csint = ?)

SQLException.getSQLState returned a null
SQLException.sqlState(null) SQLException.Message(Failed to create prepared statement: error_id: “e8ad9501-1c43-40db-ada5-e44499e6f4b5”
endpoint {

roles {
sql_query: true
java_executor: false
master: true
}
start_time: 1604771734645
max_direct_memory: 2837446656
available_cores: 8
node_tag: “”
conduit_port: 38827
dremio_version: “4.9.3-202010281843560195-edc49b6d”
}
error_type: PLAN
message: “PLAN ERROR: Cannot convert RexNode to equivalent Dremio expression. RexNode Class: org.apache.calcite.rex.RexDynamicParam, RexNode Digest: ?0\n\n\n[Error Id: e8ad9501-1c43-40db-ada5-e44499e6f4b5 on 172.31.22.93:31010]\n\n”
original_message: “Cannot convert RexNode to equivalent Dremio expression. RexNode Class: org.apache.calcite.rex.RexDynamicParam, RexNode Digest: ?0”
)

@dqmdev

Can you please provide us the job profile?

Just create a simple Java/JDBC test program using any VDS.

String s = "select * from dbcert.tint where cint = ? ";
PreparedStatement stmt = dbConnection.prepareStatement(s);

24a220e9-58d5-42b4-b1d5-565d1e93890d.zip (6.5 KB)

@dqmdev

‘?’ is a character, what should the expected resulted be for this query?

Did you mean to give single quotes around the filter value?

Example

select * from dbcert.tint where cint = ‘?’

The statement is using a JDBC parameter marker (the question mark).

Does Dremio JDBC driver claim to support
(a) Prepared Statements
(b) Parameter markers in Prepared Statements

@dqmdev,

Currently we do not support this

Where is Dremio capturing enhancement requests or showing roadmap etc?

Still not supported in Dremio 14, is it on the Dremio roadmap ?

@dqmdev

Currently this is not in the roadmap. The only way currently is to follow release notes, let me check internally if there is a plan to expose roadmap and enhancement tickets to the community

Hi @balaji.ramaswamy ,

Does PreparedStatement is supported in Dremio?

Thanks,
Saurabh

@sauku Yes prepared statements via ODBC/JDBC should work

Hi @balaji.ramaswamy,

But when i try to use “?” for dynamic variable replacement it does not work and throw error.

Thanks,
Saurabh

@sauku Dynamic variables are not supported but prepared statements are

Greetings. Googling on the error dremio 22.1 handed back after my ETL tool tried to create a prepared statement brought me here. What is the proper way to diagnose the issue?

For context, this is a Junk Dimension transformation step in Apache Hop 2.1 which should insert new records if a match on all fields is not found. It has an option to lookup on a hash field instead of just comparing across fields. The transform is generating prepared SQL statements.

For example:

SELECT DUDD_SK
FROM "Empower.data-staging".Drug_Utilization_By_Drug_per_Date
WHERE row_fingerprint = ? 
 AND ( ( Filled_Date = ? ) OR ( Filled_Date IS NULL AND ? IS NULL ) )
 AND ( ( Deactivation_Date = ? ) OR ( Deactivation_Date IS NULL AND ? IS NULL ) )
 AND ( ( Drug_Name = ? ) OR ( Drug_Name IS NULL AND ? IS NULL ) )
 AND ( ( Drug_Form = ? ) OR ( Drug_Form IS NULL AND ? IS NULL ) )
 AND ( ( Drug_Strength = ? ) OR ( Drug_Strength IS NULL AND ? IS NULL ) )
 AND ( ( Drug_ID = ? ) OR ( Drug_ID IS NULL AND ? IS NULL ) )
 AND ( ( PMS_ID = ? ) OR ( PMS_ID IS NULL AND ? IS NULL ) )
 AND ( ( Qty = ? ) OR ( Qty IS NULL AND ? IS NULL ) )
 AND ( ( Number_of_Prescriptions = ? ) OR ( Number_of_Prescriptions IS NULL AND ? IS NULL ) )
 AND ( ( Total_Price = ? ) OR ( Total_Price IS NULL AND ? IS NULL ) )
 AND ( ( Compound = ? ) OR ( Compound IS NULL AND ? IS NULL ) )

ba42477c-873a-498a-9f8b-4b338b27a956.zip (10.8 KB)

The experiment is to use Iceberg tables and see how to update them with an ETL tool.
Thanks for whatever guidance you can provide.

2022/10/17 17:47:55 - LifeFile - Drug_Utilization_By_Drug_per_Date Report Data.0 - Opening file: file:///C:/opt/ETL/LoadDremio/DrugUtilizationReports/Drug_Utilization_By_Drug_per_Date.csv
org.apache.hop.core.exception.HopDatabaseException:
Unable to prepare combi-lookup statement
Failed to create prepared statement: error_id: "e36a6928-fec8-446a-8c7e-4002cb03c126"
endpoint {
  address: "915f43849174"
  user_port: 31010
  fabric_port: 45678
  roles {
    sql_query: true
    java_executor: true
    master: true
  }
  start_time: 1665511781005
  max_direct_memory: 8589934592
  available_cores: 16
  node_tag: ""
  conduit_port: 40107
  dremio_version: "22.1.1-202208230402290397-a7010f28"
}
error_type: PLAN
message: "PLAN ERROR: Cannot convert RexNode to equivalent Dremio expression. RexNode Class: org.apache.calcite.rex.RexDynamicParam, RexNode Digest: ?0\n\n\n[Error Id: e36a6928-fec8-446a-8c7e-4002cb03c126 on 915f43849174:31010]\n\n"
original_message: "Cannot convert RexNode to equivalent Dremio expression. RexNode Class: org.apache.calcite.rex.RexDynamicParam, RexNode Digest: ?0"


        at org.apache.hop.pipeline.transforms.combinationlookup.CombinationLookup.setCombiLookup(CombinationLookup.java:479)
        at org.apache.hop.pipeline.transforms.combinationlookup.CombinationLookup.processRow(CombinationLookup.java:380)
        at org.apache.hop.pipeline.transform.RunThread.run(RunThread.java:51)
        at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.sql.SQLException: Failed to create prepared statement: error_id: "e36a6928-fec8-446a-8c7e-4002cb03c126"
endpoint {
  address: "915f43849174"
  user_port: 31010
  fabric_port: 45678
  roles {
    sql_query: true
    java_executor: true
    master: true
  }
  start_time: 1665511781005
  max_direct_memory: 8589934592
  available_cores: 16
  node_tag: ""
  conduit_port: 40107
  dremio_version: "22.1.1-202208230402290397-a7010f28"
}
error_type: PLAN
message: "PLAN ERROR: Cannot convert RexNode to equivalent Dremio expression. RexNode Class: org.apache.calcite.rex.RexDynamicParam, RexNode Digest: ?0\n\n\n[Error Id: e36a6928-fec8-446a-8c7e-4002cb03c126 on 915f43849174:31010]\n\n"
original_message: "Cannot convert RexNode to equivalent Dremio expression. RexNode Class: org.apache.calcite.rex.RexDynamicParam, RexNode Digest: ?0"

        at com.dremio.jdbc.impl.DremioJdbc41Factory.newServerPreparedStatement(DremioJdbc41Factory.java:147)
        at com.dremio.jdbc.impl.DremioJdbc41Factory.newPreparedStatement(DremioJdbc41Factory.java:108)
        at com.dremio.jdbc.impl.DremioJdbc41Factory.newPreparedStatement(DremioJdbc41Factory.java:50)
        at cdjd.org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:355)
        at com.dremio.jdbc.impl.DremioConnectionImpl.prepareStatement(DremioConnectionImpl.java:298)
        at cdjd.org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:175)
        at com.dremio.jdbc.impl.DremioConnectionImpl.prepareStatement(DremioConnectionImpl.java:330)
        at org.apache.hop.pipeline.transforms.combinationlookup.CombinationLookup.setCombiLookup(CombinationLookup.java:474)
        ... 3 more
2022/10/17 17:47:55 - Junk Dimension - Empower.data-staging.Drug_Utilization_By_Drug_per_Date.0 - ERROR: Unexpected error
2022/10/17 17:47:55 - Junk Dimension - Empower.data-staging.Drug_Utilization_By_Drug_per_Date.0 - ERROR: org.apache.hop.core.exception.HopDatabaseException:
2022/10/17 17:47:55 - Junk Dimension - Empower.data-staging.Drug_Utilization_By_Drug_per_Date.0 - Unable to prepare combi-lookup statement
2022/10/17 17:47:55 - Junk Dimension - Empower.data-staging.Drug_Utilization_By_Drug_per_Date.0 - Failed to create prepared statement: error_id: "e36a6928-fec8-446a-8c7e-4002cb03c126"
endpoint {
  address: "915f43849174"
  user_port: 31010
  fabric_port: 45678
  roles {
    sql_query: true
    java_executor: true
    master: true
  }
  start_time: 1665511781005
  max_direct_memory: 8589934592
  available_cores: 16
  node_tag: ""
  conduit_port: 40107
  dremio_version: "22.1.1-202208230402290397-a7010f28"
}
error_type: PLAN
message: "PLAN ERROR: Cannot convert RexNode to equivalent Dremio expression. RexNode Class: org.apache.calcite.rex.RexDynamicParam, RexNode Digest: ?0\n\n\n[Error Id: e36a6928-fec8-446a-8c7e-4002cb03c126 on 915f43849174:31010]\n\n"
original_message: "Cannot convert RexNode to equivalent Dremio expression. RexNode Class: org.apache.calcite.rex.RexDynamicParam, RexNode Digest: ?0"

@Brandon_Jackson I am able to reproduce the error with a similar SQL, What is the expected output from this SQL?

SELECT DUDD_SK
FROM "Empower.data-staging".Drug_Utilization_By_Drug_per_Date
WHERE row_fingerprint = ? 
 AND ( ( Filled_Date = ? ) OR ( Filled_Date IS NULL AND ? IS NULL ) )
 AND ( ( Deactivation_Date = ? ) OR ( Deactivation_Date IS NULL AND ? IS NULL ) )
 AND ( ( Drug_Name = ? ) OR ( Drug_Name IS NULL AND ? IS NULL ) )
 AND ( ( Drug_Form = ? ) OR ( Drug_Form IS NULL AND ? IS NULL ) )
 AND ( ( Drug_Strength = ? ) OR ( Drug_Strength IS NULL AND ? IS NULL ) )
 AND ( ( Drug_ID = ? ) OR ( Drug_ID IS NULL AND ? IS NULL ) )
 AND ( ( PMS_ID = ? ) OR ( PMS_ID IS NULL AND ? IS NULL ) )
 AND ( ( Qty = ? ) OR ( Qty IS NULL AND ? IS NULL ) )
 AND ( ( Number_of_Prescriptions = ? ) OR ( Number_of_Prescriptions IS NULL AND ? IS NULL ) )
 AND ( ( Total_Price = ? ) OR ( Total_Price IS NULL AND ? IS NULL ) )
 AND ( ( Compound = ? ) OR ( Compound IS NULL AND ? IS NULL ) )

A single integer value would be returned that represents a row that could be updated or an empty result which would indicate the need for a newly inserted row.

@Brandon_Jackson Sorry for the dumb question, you are giving something like total_price=?, is total_price a number and equating to a ‘?’ should yield what?

Hi Balaji, No dumb questions at all. It is a specific price inserted replacing the question mark. First a bit of background. I am using a Java based ETL tool called Apache Hop. All Java based tools used JDBC drivers. When dealing with any data manipulation, like inserting, updating, and even selecting, java applications will craft prepared statements that include parameters like filter conditions. You end up with SQL queries that look like mine with a bunch of question marks, each one takes on a specific value per executed statement. Prepared statements exist in the Windows world, on MS SQL and every other database I know of. The particular SQL query in question asks the database if the destination table has precisely each value for each column listed there for a given fingerprinted row (hashcoded aka all columns values asking like a huge password resulting in a unique code identifying the row).

What I found after browsing the Dremio Java code and experimentation is that Dremio only supports sending a fully filled out query with concrete values as text, no matter what kind of query it is. If you use a Java based tool that programmatically tries to interact with Dremio and separate out the parameter values from the query and communicate over a query pattern for Dremio to chew on, that fails. That is what was meant by “Dynamic Parameters” or “Parameters” generally. That means that Dremio cannot be treated like a true database yet compared to all the others like Trino/Presto/Starburst, MySQL, Postgres, MS SQL. Dremio only processes a query without any variability at all. I venture to say only one statement at a time.

I wish Dremio supported prepared statements with parameters so it would be easier to use a wider variety of tools to interact with data it could query. I found this disappointing.

I imagine the best help you could provide is to communicate the desire to use prepared statements with parameters to the development team. I can see they are using Calcite to process queries and that it should provide a good JDBC driver, so it is really a feature implementation beyond Calcite itself in whatever the engine is that Dremio has.

Does Dremio, the company, publish features they are working on anywhere?
Is the forum here the only way to communicate back with the company in a meaningful way?

Hey @Brandon_Jackson ! I hope you are well. Dremio doesn’t support parameterized SQL statements yet. For large complex queries on huge datasets, it really doesn’t make sense to use parameterized statements as you need the actual values to read partition stats and estimate row counts for join ordering and other query optimizations. These stats are also needed to determine degrees of parallelism for various phases of execution. A plan for one set of values could be totally different than a plan for a different set of values. Reflections can be used to accelerate queries and reflections need to consider actual filter values when trying to match.

Parameterized SQLs make more sense in the context of simple SQL statements that are executed 10-100s of times like small update/insert/delete statements. Dremio just released DML support for Apache Iceberg in June 2022: Announcing DML Support for Apache Iceberg | Dremio However, issuing 1,000 small insert statements is going to result in a ton of small Parquet files and table maintenance will become a real concern: Maintenance

Sorry, this statement doesn’t make a lot of sense. One use case for parameterized queries is repeated inserts, sure, and I’m not sure that has to be supported, here. However, one of the biggest use cases for parameterized queries is to prevent SQL injection on selects. Parameters, in this case, are so useful for large, complex queries on huge datasets. Brandon has provided ample examples. This is extremely standard stuff in dealing with data.

Having users sanitize their own input leaves the Dremio connection methods very open to SQL injection attacks. Most database engines will not actually create a named plan for the query unless it’s run repeatedly in a short amount of time, so just supporting parameterization without fully preparing the statements on the backend would be a huge boon to the security of Dremio drivers.

It is very disheartening to see the flippant attitude of Dremio toward basic SQL injection prevention.