No suitable driver error

arjun · July 11, 2018, 7:01am

Hi,
I am trying to connect Dremio to Spark using Python and create a DataFrame, but this error pops up every time.
I ran the script with spark-submit, passing the JDBC jar file in the --jars option.
I'm pretty sure there is some basic mistake in the code. Here it is:

import pyodbc, pandas
from pyspark import SparkContext
sc = SparkContext()
from pyspark.sql import SQLContext

sqlContext = SQLContext(sc)

host = '127.0.0.1'
port = 31010
uid = 'username'
pwd = 'password'
driver = '/opt/dremio-odbc/lib64/libdrillodbc_sb64.so'  # I couldn't get the DSN to work, but this works.

# ODBC connection through pyodbc (not used by the Spark read below)
con = pyodbc.connect("Driver={};ConnectionType=Direct;HOST={};PORT={};AuthenticationType=Plain;UID={};PWD={}".format(driver, host, port, uid, pwd), autocommit=True)

# JDBC read through Spark; this is the call that raises the error
df0 = (sqlContext.read.format("jdbc")
    .option("url", "jdbc:dremio:direct=127.0.0.1:31010")
    .option("dbtable", """'@username'.'spacename.datasetname'""")
    .option("user", "username")
    .option("password", "password")
    .load())

ERROR MESSAGE:
Traceback (most recent call last):
File "dremio_test.py", line 29, in <module>
.option("password", "password")
File "/usr/local/spark-2.3.1-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 172, in load
File "/usr/local/spark-2.3.1-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
File "/usr/local/spark-2.3.1-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
File "/usr/local/spark-2.3.1-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o36.load.
: java.sql.SQLException: No suitable driver
at java.sql.DriverManager.getDriver(DriverManager.java:315)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$7.apply(JDBCOptions.scala:85)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$7.apply(JDBCOptions.scala:85)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:84)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:35)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:34)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:340)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)

I tried similar code in Scala too, but got the same error.

Please help.

anthony · July 11, 2018, 3:38pm

It seems the error is coming from the df0 declaration. At a glance, there is no need to make a JDBC call when you are already setting up an ODBC connection. Maybe this documentation will help you with a sample connection via Python: https://docs.dremio.com/client-applications/python.html
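
For reference, the ODBC route suggested here can be sketched as follows. This is only an illustration built from the connection string in the first post; the helper name dremio_odbc_connection_string is made up for this example, and the driver path, host, port and credentials are the placeholder values from that post.

```python
def dremio_odbc_connection_string(driver, host, port, uid, pwd):
    """Assemble the ODBC connection string used in the first post.

    Hypothetical helper: the key names (ConnectionType, AuthenticationType, ...)
    are copied from the original post, not checked against the driver docs.
    """
    return (
        "Driver={};ConnectionType=Direct;HOST={};PORT={};"
        "AuthenticationType=Plain;UID={};PWD={}"
    ).format(driver, host, port, uid, pwd)


# Placeholder values taken from the first post.
conn_str = dremio_odbc_connection_string(
    "/opt/dremio-odbc/lib64/libdrillodbc_sb64.so",
    "127.0.0.1",
    31010,
    "username",
    "password",
)

# Usage, assuming pyodbc, pandas and the Dremio ODBC driver are installed:
#   import pyodbc, pandas
#   con = pyodbc.connect(conn_str, autocommit=True)
#   dataframe = pandas.read_sql("SELECT * FROM spacename.datasetname", con)
```

Note that this path returns a pandas DataFrame on the machine running the script, not a Spark DataFrame.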

arjun · July 12, 2018, 5:25am

Thanks for the quick response.

The program in the above link reads the dataset using pandas, but what I'm trying to do is the same thing using PySpark. Is there any way to do it with PySpark?

Anyway, I also tried using pandas, which gives another error:

Traceback (most recent call last):
File "/home/arjun/anaconda3/envs/test-env/lib/python3.6/site-packages/pandas/io/sql.py", line 1378, in execute
cur.execute(*args)
pyodbc.Error: ('HY000', '[HY000] [Dremio][Connector] (1040) Dremio failed to execute the query: SELECT * FROM product_budget.budget18\n[30034]Query execution error. Details:[ \nSYSTEM ERROR: CompileException: Line 64, Column 30: No applicable constructor/method found for actual parameters "org.apache.arrow.vector.holders.UnionHolder"; candidates are: "public void com.dremio.exec.vector.complex.fn.JsonWriter.write(org.apache.arrow.vector.complex.reader.FieldReader) t…[see log] (1040) (SQLExecDirectW)')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/arjun/PycharmProjects/test_stuff/dremio_test.py", line 22, in <module>
dataframe = pandas.read_sql(sql, con)
File "/home/arjun/anaconda3/envs/test-env/lib/python3.6/site-packages/pandas/io/sql.py", line 381, in read_sql
chunksize=chunksize)
File "/home/arjun/anaconda3/envs/test-env/lib/python3.6/site-packages/pandas/io/sql.py", line 1413, in read_query
cursor = self.execute(*args)
File "/home/arjun/anaconda3/envs/test-env/lib/python3.6/site-packages/pandas/io/sql.py", line 1390, in execute
raise_with_traceback(ex)
File "/home/arjun/anaconda3/envs/test-env/lib/python3.6/site-packages/pandas/compat/__init__.py", line 403, in raise_with_traceback
raise exc.with_traceback(traceback)
File "/home/arjun/anaconda3/envs/test-env/lib/python3.6/site-packages/pandas/io/sql.py", line 1378, in execute
cur.execute(*args)
pandas.io.sql.DatabaseError: Execution failed on sql 'SELECT * FROM product_budget.budget18': ('HY000', '[HY000] [Dremio][Connector] (1040) Dremio failed to execute the query: SELECT * FROM spacename.datasetname\n[30034]Query execution error. Details:[ \nSYSTEM ERROR: CompileException: Line 64, Column 30: No applicable constructor/method found for actual parameters "org.apache.arrow.vector.holders.UnionHolder"; candidates are: "public void com.dremio.exec.vector.complex.fn.JsonWriter.write(org.apache.arrow.vector.complex.reader.FieldReader) t…[see log] (1040) (SQLExecDirectW)')
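
On the PySpark question: "No suitable driver" is raised by java.sql.DriverManager when no registered JDBC driver accepts the URL. A common remedy with Spark's JDBC source is to name the driver class explicitly through the "driver" option. The sketch below is untested against this setup and rests on assumptions: the class name com.dremio.jdbc.Driver should be checked against the documentation for the jar passed to --jars, and the helper names are made up for this example.

```python
def dremio_jdbc_options(host, port, table, user, password,
                        driver_class="com.dremio.jdbc.Driver"):
    """Build the options for spark.read.format("jdbc").

    Hypothetical helper. driver_class is an assumption: verify the class
    name against the Dremio JDBC jar you are actually using.
    """
    return {
        "url": "jdbc:dremio:direct={}:{}".format(host, port),
        "driver": driver_class,  # tells Spark which JDBC driver class to load
        "dbtable": table,
        "user": user,
        "password": password,
    }


def read_dremio_table(spark, **connection):
    """Load a table as a Spark DataFrame; `spark` is an existing SparkSession."""
    return spark.read.format("jdbc").options(**dremio_jdbc_options(**connection)).load()


# Placeholder values taken from the first post.
example_options = dremio_jdbc_options(
    host="127.0.0.1",
    port=31010,
    table="spacename.datasetname",
    user="username",
    password="password",
)

# Usage, assuming a running Spark session with the Dremio JDBC jar on its classpath:
#   df0 = read_dremio_table(spark, host="127.0.0.1", port=31010,
#                           table="spacename.datasetname",
#                           user="username", password="password")
```

If the error persists, the Spark documentation's JDBC example passes the jar with --driver-class-path in addition to --jars, so that the driver is visible on the driver JVM's classpath as well.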
