Hi all, I studied this article, https://www.dremio.com/tutorials/how-to-create-an-arp-connector/, and tested the Dremio connector for SQLite, and the connector works. But the conclusion section says: “Starting in Dremio 3.0 we have developed an all-new declarative framework (ARP) for developing relational connectors.” Does that mean ARP only supports relational connectors? If so, what about Apache Drill? Since Drill is not a relational database, does that mean we cannot write an ARP connector for Drill? How can we use Dremio to query data in Drill?
Thanks
Drill has a JDBC driver, so technically it should be possible to create an ARP connector. But why?
OK, thanks. My task is to investigate whether we can create a connector for Drill, so that we can use Dremio as the entry point to all our other data sources, including Drill.
What is the difference between the connectors of SQLite and Drill?
I built the Drill connector jar based on the SQLite connector example, but there are two issues with it:
- After I copied drill-jdbc-all-1.17.0.jar to dremio/jars/3rdparty and the Drill connector jar to the dremio/jars/ folder, I ran ./dremio start and got a startup exception in the server.out log file. It appears to be caused by conflicting jars, and I believe it is related to drill-jdbc-all-1.17.0.jar, since the exception goes away if I remove this jar:
Dremio is exiting. Failure while starting services.
MultiException[javax.servlet.ServletException: org.glassfish.jersey.servlet.ServletContainer-2aa3ac73@d6b4192d==org.glassfish.jersey.servlet.ServletContainer,jsp=null,order=2,inst=true,async=true, javax.servlet.ServletException: org.glassfish.jersey.servlet.ServletContainer-7700c19b@dc8cb165==org.glassfish.jersey.servlet.ServletContainer,jsp=null,order=3,inst=true,async=true]
at org.eclipse.jetty.util.MultiException.ifExceptionThrow(MultiException.java:122)
at org.eclipse.jetty.server.Server.doStart(Server.java:397)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
at com.dremio.dac.server.WebServer.start(WebServer.java:255)
at com.dremio.service.SingletonRegistry$AbstractServiceReference.start(SingletonRegistry.java:137)
at com.dremio.service.ServiceRegistry.start(ServiceRegistry.java:88)
at com.dremio.service.SingletonRegistry.start(SingletonRegistry.java:33)
at com.dremio.dac.daemon.DACDaemon.startServices(DACDaemon.java:184)
at com.dremio.dac.daemon.DACDaemon.init(DACDaemon.java:190)
at com.dremio.dac.daemon.DremioDaemon.main(DremioDaemon.java:104)
Suppressed: javax.servlet.ServletException: org.glassfish.jersey.servlet.ServletContainer-2aa3ac73@d6b4192d==org.glassfish.jersey.servlet.ServletContainer,jsp=null,order=2,inst=true,async=true
at org.eclipse.jetty.servlet.ServletHolder.initServlet(ServletHolder.java:617)
at org.eclipse.jetty.servlet.ServletHolder.initialize(ServletHolder.java:425)
at org.eclipse.jetty.servlet.ServletHandler.lambda$initialize$0(ServletHandler.java:751)
at java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:352)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.StreamSpliterators$WrappingSpliterator.forEachRemaining(StreamSpliterators.java:312)
at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:743)
at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:744)
at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:361)
at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:821)
at org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:276)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110)
at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:106)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
at org.eclipse.jetty.server.Server.start(Server.java:407)
at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110)
at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:106)
at org.eclipse.jetty.server.Server.doStart(Server.java:371)
… 8 more
Caused by: A MultiException has 2 exceptions. They are:
- java.lang.NoSuchMethodError: javax.validation.BootstrapConfiguration.getClockProviderClassName()Ljava/lang/String;
- java.lang.IllegalStateException: Unable to perform operation: create on com.dremio.dac.server.DACJacksonJaxbJsonFeature$DACJacksonJaxbJsonProvider
- After I removed drill-jdbc-all-1.17.0.jar, Dremio started successfully. But at http://localhost:9047/, when I tried to add a source with the plus sign, Drill was not in the list of supported data sources, so there was no way to add a Drill source.
Could anybody help?
Thanks very much in advance.
I finally resolved this exception. It was caused by a validation-api version mismatch involving drill-jdbc-all-1.17.0.jar: Dremio 4.1.7 uses validation-api 2.0.1.Final, which has getClockProviderClassName(), but drill-jdbc-all-1.17.0.jar uses 1.1.0.Final. Drill 1.18 is not released yet, so I built drill-jdbc-all-1.18.0.jar from the master branch of the Drill GitHub repository, https://github.com/apache/drill, and used this jar in Dremio. That allowed me to start Dremio successfully.
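For anyone chasing a similar NoSuchMethodError, a small diagnostic like the one below can show which jar a class was actually loaded from and whether it has the method in question. This is only a minimal sketch: the class name JarConflictCheck and its helpers are made up for illustration, and it needs to run with the same classpath as Dremio to say anything useful about the conflict described above.

```java
import java.security.CodeSource;
import java.util.Arrays;

// Hypothetical diagnostic helper (not part of Dremio or Drill).
final class JarConflictCheck {

    // Returns the jar or directory a class was loaded from, or "bootstrap"
    // when the class has no code source (typical for core JDK classes).
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "bootstrap";
        }
        return source.getLocation().toString();
    }

    // True if the class exposes a public method with the given name.
    static boolean hasPublicMethod(Class<?> type, String methodName) {
        return Arrays.stream(type.getMethods())
                .anyMatch(method -> method.getName().equals(methodName));
    }

    public static void main(String[] args) {
        String className = "javax.validation.BootstrapConfiguration";
        try {
            Class<?> type = Class.forName(className);
            System.out.println("loaded from: " + locationOf(type));
            System.out.println("has getClockProviderClassName: "
                    + hasPublicMethod(type, "getClockProviderClassName"));
        } catch (ClassNotFoundException e) {
            // validation-api is simply not on this classpath.
            System.out.println(className + " is not on the classpath");
        }
    }
}
```

If the reported location is the Drill JDBC jar rather than Dremio's own validation-api jar, and the method check prints false, that would match the mismatch described above.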
But I am having another issue. After I created the source for Drill in the Dremio browser UI, at this URL:
http://localhost:9047/source/dremiodrill/folder/DRILL
it only shows two names, sys and information_schema, and none of the other Drill schemas show up. The full list of schemas returned by show databases in Drill is:
cp.default
dfs.default
dfs.root
dfs.tmp
information_schema
postgreslocal.admin
postgreslocal.eadb
postgreslocal.information_schema
postgreslocal.pg_catalog
postgreslocal.public
postgreslocal
sys
So does anybody have a clue what I did wrong?
Thanks
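One way to narrow down a problem like this is to check what the Drill JDBC driver itself reports through standard JDBC metadata, and compare that with the show databases output. The following is a minimal sketch, assuming the Drill JDBC jar is on the classpath; the class name DrillSchemaLister is hypothetical and the URL is passed in as an argument.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the schemas a JDBC driver reports.
final class DrillSchemaLister {

    // Formats one metadata row as "catalog.schema"; the catalog may be absent.
    static String qualify(String catalog, String schema) {
        return (catalog == null || catalog.isEmpty()) ? schema : catalog + "." + schema;
    }

    // Reads every row of DatabaseMetaData.getSchemas() for this connection.
    static List<String> listSchemas(Connection connection) throws SQLException {
        List<String> names = new ArrayList<>();
        DatabaseMetaData metaData = connection.getMetaData();
        try (ResultSet rows = metaData.getSchemas()) {
            while (rows.next()) {
                // Column names are defined by the JDBC specification.
                names.add(qualify(rows.getString("TABLE_CATALOG"),
                        rows.getString("TABLE_SCHEM")));
            }
        }
        return names;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length == 0) {
            System.out.println("usage: DrillSchemaLister <jdbc-url>");
            return;
        }
        try (Connection connection = DriverManager.getConnection(args[0])) {
            listSchemas(connection).forEach(System.out::println);
        }
    }
}
```

Run with a URL such as jdbc:drill:drillbit=localhost:31010, this prints one line per schema, including the catalog the driver attaches to each one.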
@idoor88 What is the connection string you are using in the ARP code?
jdbc:drill:drillbit=localhost:31010
Here is the DrillConf.java:
@SourceType(value = "Drill", label = "Drill")
public class DrillConf extends AbstractArpConf {
  private static final String ARP_FILENAME = "arp/implementation/drill-arp.yaml";
  private static final ArpDialect ARP_DIALECT =
      AbstractArpConf.loadArpFile(ARP_FILENAME, (ArpDialect::new));
  //private static final String DRIVER = "org.sqlite.JDBC";
  private static final String DRIVER = "org.apache.drill.jdbc.Driver";//org.apache.drill.jdbc.Driver

  @Tag(1)
  @DisplayMetadata(label = "JDBC URL (jdbc:drill:drillbit=localhost:31010)")
  public String jdbcURL;

  @Tag(2)
  @DisplayMetadata(label = "Username")
  public String username;

  @Tag(3)
  @Secret
  @DisplayMetadata(label = "Password")
  public String password;

  @Tag(4)
  @DisplayMetadata(label = "Record fetch size")
  @NotMetadataImpacting
  public int fetchSize = 200;

  @VisibleForTesting
  public String toJdbcConnectionString() {
    System.out.println("toJdbcConnectionString jdbcURL: "+ jdbcURL);
    checkNotNull(this.jdbcURL, "JDBC URL is required");
    return jdbcURL;
  }

  @Override
  @VisibleForTesting
  public Config toPluginConfig(SabotContext context) {
    System.out.println("toPluginConfig context: "+ context);
    return JdbcStoragePlugin.Config.newBuilder()
        .withDialect(getDialect())
        .withFetchSize(fetchSize)
        .withDatasourceFactory(this::newDataSource)
        .clearHiddenSchemas()
        .addHiddenSchema("SYSTEM")
        .build();
  }

  private CloseableDataSource newDataSource() {
    System.out.println("newDataSource username: "+username+" password: "+password);
    return DataSources.newGenericConnectionPoolDataSource(DRIVER,
        toJdbcConnectionString(), username, password, null,
        DataSources.CommitMode.DRIVER_SPECIFIED_COMMIT_MODE);
  }

  @Override
  public ArpDialect getDialect() {
    System.out.println("getDialect ARP_DIALECT: "+ ARP_DIALECT);
    return ARP_DIALECT;
  }

  @VisibleForTesting
  public static ArpDialect getDialectSingleton() {
    System.out.println("getDialectSingleton ARP_DIALECT: "+ ARP_DIALECT);
    return ARP_DIALECT;
  }
}
I created a connector here: https://github.com/dremio-brock/dremio-drill-connector
If you check the drillbit box, add the host IP and port, and then the schema you want to query, you should see the schema show up.
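For reference, the host, port, and schema fields map naturally onto a direct-drillbit Drill JDBC URL with an optional schema property. The sketch below only illustrates that URL shape. It is not taken from the connector's source, and the class name DrillJdbcUrl is made up for the example.

```java
// Hypothetical helper that assembles a direct-drillbit Drill JDBC URL.
final class DrillJdbcUrl {

    // Builds "jdbc:drill:drillbit=<host>:<port>" and appends ";schema=<name>"
    // only when a schema is given.
    static String build(String host, int port, String schema) {
        StringBuilder url = new StringBuilder("jdbc:drill:drillbit=")
                .append(host).append(':').append(port);
        if (schema != null && !schema.isEmpty()) {
            url.append(";schema=").append(schema);
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // prints jdbc:drill:drillbit=localhost:31010
        System.out.println(build("localhost", 31010, null));
        // prints jdbc:drill:drillbit=localhost:31010;schema=dfs.tmp
        System.out.println(build("localhost", 31010, "dfs.tmp"));
    }
}
```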
There does appear to be an issue with the database schema when running queries. When retrieving metadata from Drill, Drill returns “DRILL” as the base schema. Dremio then submits this back to Drill, which causes the queries to fail.
i.e. select * from DRILL.postgres.db.employees
instead of select * from postgres.db.employees
I’ll have to debug this to figure out the issue or workaround.
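To make the mismatch concrete, the rewrite that would be needed is to drop the leading “DRILL” segment from the table path before the query goes back to Drill. The sketch below shows that transformation on a plain list of path segments. It is purely illustrative: the class name CatalogPrefix is hypothetical, and it does not show where such a step would hook into Dremio or the connector.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of stripping a leading catalog segment.
final class CatalogPrefix {

    // Drops the first segment when it matches the catalog name
    // (case-insensitive) and something remains after it.
    static List<String> stripCatalog(List<String> path, String catalog) {
        if (path.size() > 1 && path.get(0).equalsIgnoreCase(catalog)) {
            return path.subList(1, path.size());
        }
        return path;
    }

    // Joins path segments back into a dotted table reference.
    static String toSql(List<String> path) {
        return String.join(".", path);
    }

    public static void main(String[] args) {
        List<String> path = Arrays.asList("DRILL", "postgres", "db", "employees");
        // prints select * from postgres.db.employees
        System.out.println("select * from " + toSql(stripCatalog(path, "DRILL")));
    }
}
```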
@b-rock Hi, did you ever find a solution or a workaround to the “DRILL” base schema issue? I tried your connector and noticed it as well.