Hi there,
I am evaluating Dremio and trying to write a generic JDBC connector that will let me query relational databases that support ANSI SQL. The Data Source creation form will take the JDBC URL and the driver (which is available in the 3rdParty folder in the Dremio jars). To start, I will use user name and password authentication to see if this works. My questions are:
What should the ARP YAML file contain since I want the connector to be generic? (data types, relational algebra, etc.)
Is it enough to construct the JDBC connection URL from the input field?
What is the drawback of having a generic connector?
Since I only want to access relational databases that support ANSI SQL, will predicate pushdown still work in Dremio?
Any insights on achieving this will be highly appreciated.
@ben, thanks for the reply. I have already done that tutorial and managed to get the sqlite connector running. However, I am trying to make the connector generic. If I specify only the JDBC driver to be used, can Dremio get the data types and functions from the provided driver?
If the answer to the question above is no, is there a way to write a generic ARP file?
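For context on the question above: plain JDBC does let a client ask a driver what it supports, through `java.sql.DatabaseMetaData`. The sketch below (Java, with a hypothetical helper class `DriverCapabilities`) shows roughly what is available: native type names from `getTypeInfo()` and comma-separated function lists from `getStringFunctions()` and `getNumericFunctions()`. Whether Dremio's ARP layer consults any of this is not established in this thread; this only illustrates what the driver itself can report.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Dremio or ARP.
class DriverCapabilities {

    // JDBC returns function names as a single comma-separated string.
    static List<String> splitFunctionList(String csv) {
        List<String> names = new ArrayList<>();
        if (csv == null) return names;
        for (String part : csv.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) names.add(name);
        }
        return names;
    }

    // Collects the native type names the driver reports.
    static List<String> typeNames(DatabaseMetaData meta) throws SQLException {
        List<String> names = new ArrayList<>();
        try (ResultSet rs = meta.getTypeInfo()) {
            while (rs.next()) {
                names.add(rs.getString("TYPE_NAME"));
            }
        }
        return names;
    }

    // Prints a capability summary for whatever database the URL points at.
    static void describe(String url, String user, String password) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url, user, password)) {
            DatabaseMetaData meta = conn.getMetaData();
            System.out.println("Types: " + typeNames(meta));
            System.out.println("String functions: " + splitFunctionList(meta.getStringFunctions()));
            System.out.println("Numeric functions: " + splitFunctionList(meta.getNumericFunctions()));
            System.out.println("ANSI-92 entry level: " + meta.supportsANSI92EntryLevelSQL());
        }
    }
}
```

Note that this metadata says which types and functions exist, not how each operation should be written in the target dialect, so on its own it may not be enough to replace a hand-written mapping.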
@olumo I don’t have a definitive answer to your question, but I have spent a fair amount of time digging into the innards of the closed-source ARP JDBC adapter.
Go grab a program like JD-GUI and start digging around the custom ARP adapter that Dremio implemented for MS SQL (which is how they are able to set all the custom character ENCODING stuff), which they defined through Java code instead of through a YAML file.
Of course, you can only use the code you see as a guide; license-wise, copying, pasting, or modifying it is off limits.
The code does not pass many unused classes around. I wanted to run some pre/post query commands, but was locked out because of how it’s coded; and since Dremio will not open source the ARP JDBC driver, there isn’t any path forward for many use cases.
I really like the concept behind the ARP adapter; I just wish it were a lot more flexible (for example, if it were open source…). Some of my previous posts on this topic:
Apologies for the late reply, @patricker. Thanks for your information! I did make some progress on writing a generic connector with input fields such as JDBC class name, JDBC driver to use (must be present in the 3rdParty folder), user name, and password. It worked for some databases and didn’t work for others. It was a lot of hassle, and I came to the conclusion that it’s not a sustainable or scalable way to solve our problem.
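A minimal sketch of the kind of input handling described above, written in plain Java rather than against Dremio's plugin API. The class and method names (`GenericJdbcConfig`, `isDriverAvailable`, `toProperties`, `connect`) are hypothetical; only `Class.forName`, `java.util.Properties`, and `DriverManager.getConnection` are standard JDK/JDBC calls.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

// Hypothetical container for the form fields; not Dremio's actual source configuration class.
class GenericJdbcConfig {
    final String driverClass;
    final String url;
    final String user;
    final String password;

    GenericJdbcConfig(String driverClass, String url, String user, String password) {
        this.driverClass = driverClass;
        this.url = url;
        this.user = user;
        this.password = password;
    }

    // Checks whether the named class can be loaded from the classpath.
    static boolean isDriverAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    // "user" and "password" are the conventional JDBC property keys.
    Properties toProperties() {
        Properties props = new Properties();
        if (user != null) props.setProperty("user", user);
        if (password != null) props.setProperty("password", password);
        return props;
    }

    // Fails early with a readable message if the driver jar is missing.
    Connection connect() throws SQLException {
        if (!isDriverAvailable(driverClass)) {
            throw new SQLException("JDBC driver class not found: " + driverClass);
        }
        return DriverManager.getConnection(url, toProperties());
    }
}
```

A successful `connect()` only shows that the driver loads and accepts the credentials. It says nothing about dialect differences between databases, which is consistent with a generic connector working for some databases and not others.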
In the end, we went with Apache Drill, which does what we really needed. We just needed a library that can treat our data sources as relational tables and let us do joins across them. Although Apache Drill configuration can be cumbersome, it fits our needs. Dremio comes with a lot of features that we don’t really need (i.e., it is bloated for our simple use case).