Issue with executing queries in Dremio from Python

Hi Team,

When I execute a CTAS query from the Dremio UI, it works as expected and the Parquet output is properly distributed across multiple chunks.

When I execute the same query from Python, it fails. The one workaround I found is to add a LIMIT to the SELECT part of the CTAS. With the LIMIT the query succeeds, but it generates only a single Parquet file.

I have tried both pyodbc and pyarrow, and I get the same results when running the query from Python.

Challenges

  1. I need a proper Python package/library that can execute the query from Python the same way the Dremio UI does.
  2. pyarrow fails if the query runs for more than 10 minutes.

Note: I can't provide the SQL query.

Thanks in advance.

Hi,

I’m not part of the Dremio support team, but we saw a very similar issue that turned out to be a timeout on our load balancer.

It might also be helpful to check flight.client.readiness.timeout.millis, and to check whether DBeaver has the same timeout issue.
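
If it does turn out to be a timeout somewhere between the client and Dremio, here is a minimal sketch of setting an explicit per-call timeout and a gRPC keepalive from pyarrow. It is not a confirmed fix. The host, credentials, port 32010 (assumed to be the Flight port on your install), the one-hour timeout, and the 60-second keepalive are all assumptions to adjust, and `flight_location` and `run_sql` are hypothetical helper names.

```python
DEFAULT_FLIGHT_PORT = 32010  # assumption: Dremio's usual Arrow Flight port


def flight_location(host, port=DEFAULT_FLIGHT_PORT, tls=False):
    """Build the Flight endpoint URI for a host (hypothetical helper)."""
    scheme = "grpc+tls" if tls else "grpc+tcp"
    return f"{scheme}://{host}:{port}"


def run_sql(host, user, password, sql, timeout_s=3600.0, keepalive_ms=60_000):
    """Run one SQL statement over Arrow Flight and return the result table."""
    # Imported here so flight_location() can be used without pyarrow installed.
    from pyarrow import flight

    client = flight.FlightClient(
        flight_location(host),
        # gRPC channel argument: send keepalive pings so that an idle
        # connection is less likely to be dropped by a proxy or load balancer.
        generic_options=[("grpc.keepalive_time_ms", keepalive_ms)],
    )
    # Basic auth returns a bearer-token header pair to reuse on later calls.
    token = client.authenticate_basic_token(user, password)
    # The timeout (in seconds) applies to each call made with these options.
    options = flight.FlightCallOptions(timeout=timeout_s, headers=[token])

    descriptor = flight.FlightDescriptor.for_command(sql)
    info = client.get_flight_info(descriptor, options)
    reader = client.do_get(info.endpoints[0].ticket, options)
    return reader.read_all()
```

You would call it as `run_sql("dremio.example.com", "user", "password", your_ctas_sql)`. If the failure still shows up at around 10 minutes with a much longer client timeout, that points away from the Python client and toward something in between, such as the load balancer.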

@benw Are you able to send us the job profile from when the query is run from Python? I'm wondering if the query is getting cancelled by the client.
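
Alongside the job profile, it can help to record on the Python side how long the call ran and which exception ended it, so the two can be compared. A rough sketch, where `run_and_report` is a hypothetical helper and `run_query` stands for whatever zero-argument function you use to submit the query (pyodbc or pyarrow):

```python
import time


def run_and_report(run_query):
    """Call run_query() and return (outcome, elapsed_seconds, result)."""
    start = time.monotonic()
    try:
        result = run_query()
        outcome = "ok"
    except Exception as exc:
        # Keep the exception class name: a client-side timeout and a
        # server-side failure usually surface as different error types.
        result = None
        outcome = f"{type(exc).__name__}: {exc}"
    elapsed = time.monotonic() - start
    return outcome, elapsed, result
```

If the elapsed time is consistently close to the same value (for example about 600 seconds), that suggests a fixed timeout rather than a problem with the query itself.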