If Dremio supported an OData source, that would be a wonderful thing for all of us system integrators who would love to deploy Dremio on client sites. With the current data source support, there will always be one or two data sources crucial to a BI project that Dremio can’t talk to: weird and wonderful web APIs, industrial databases, obscure legacy systems, etc.
Sure, we could do ETL, but that defeats the point of keeping data live, and having a single source of truth.
If Dremio supported an OData source then we could all get busy writing custom OData interfaces to expose our dodgy data sources, and there would be no more datasource-based blockers to adoption.
Hey @MikeWiese, thank you for the feedback. Agreed, we’ll never cover all the sources for all use cases. Our initial plan here is to introduce a Data Source SDK (timing TBD) to enable various types of custom high-performance connectors. At the same time, we are exploring various frameworks/options to simplify this process and enable faster turnarounds for some types of sources (e.g. web APIs). So far, OData hasn’t come up frequently, but we’re still investigating.
Out of curiosity, what would be the top sources you’d like to see in Dremio next?
Hi @can, ODBC and OLEDB are the datasources that are blocking me from adopting Dremio, or recommending it to my clients. My industrial clients all have a variety of pretty standard Oracle and SQL Server systems, plus some kind of industrial historian (a specialised type of database optimised for time-series data). The various historians have a bunch of proprietary APIs, but in addition to that they almost always support ODBC and/or OLEDB.
One of the reasons I like the idea of OData is that it opens up a bunch of existing datasources via a nice REST API, but also because if it comes to a crunch, I can always build a custom OData facade over whatever hideously-ugly data repository I happen to be battling with at the time. I suppose I could also write facades with ODBC or OLEDB, but the idea of going back to C++ and COM is about as exciting as returning to black-and-white TV.
I’m not wedded to OData, but the ease with which I can write OData glue code is a big attraction (plus I tend to use BI tools that can consume OData). If you go in some other direction, that will be quite influential. As long as it works over port 80, and we can all write custom facades, then it should be a good thing.
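To make the "custom facade" idea concrete, here is a minimal sketch in Python of the kind of glue code meant above. It is an illustration only, not anything Dremio provides: `LEGACY_ROWS`, `odata_response`, the `Readings` entity set, and the service root URL are all made-up names, and only the `$select` and `$top` query options are handled. A real OData service would also need a `$metadata` document, `$filter` parsing, and an HTTP server in front of it.

```python
import json
from urllib.parse import parse_qs

# Hypothetical stand-in for an awkward backend (e.g. rows pulled from a historian).
LEGACY_ROWS = [
    {"tag": "PUMP_01", "ts": "2018-01-01T00:00:00Z", "value": 3.2},
    {"tag": "PUMP_01", "ts": "2018-01-01T00:01:00Z", "value": 3.4},
    {"tag": "PUMP_02", "ts": "2018-01-01T00:00:00Z", "value": 7.9},
]


def odata_response(service_root, entity_set, rows, query_string=""):
    """Render rows as an OData v4-style JSON payload.

    Only $select and $top are honoured; every other option is ignored.
    """
    params = parse_qs(query_string)
    if "$select" in params:
        # Keep only the requested columns, in the requested order.
        columns = [c.strip() for c in params["$select"][0].split(",")]
        rows = [{c: row[c] for c in columns} for row in rows]
    if "$top" in params:
        # Limit the number of rows returned.
        rows = rows[: int(params["$top"][0])]
    return json.dumps({
        "@odata.context": f"{service_root}/$metadata#{entity_set}",
        "value": rows,
    })


if __name__ == "__main__":
    print(odata_response("http://localhost/odata", "Readings",
                         LEGACY_ROWS, "$select=tag,value&$top=2"))
```

Keeping the rendering in a plain function means the same code can sit behind whatever HTTP framework is at hand, which is most of the appeal compared with writing an ODBC or OLEDB provider.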
Hi @MikeWiese, appreciate the details – makes sense. I’ve added your feedback to our internal tracking. We’ll reach out when there are developments on this front.
On a side note: at a high level, we’re currently focusing on making Apache Arrow a common interchange format between different systems, regardless of whether one of those systems is Dremio. This way, once you have, say, an X-to-Arrow integration, it benefits not only Dremio but many other systems.
Are you referring to connecting to Dremio over OData, or having Dremio connect to OData sources?
The issue with OData is that it’s not SQL-based. You can’t push down the processing the way you can with JDBC or ODBC. It’s fine for pulling small amounts of data.
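For context on the pushdown point: OData does define URL query options such as `$filter`, `$select`, and `$top` that ask the server to do some of the work, but a client has to translate each operation into those options rather than hand over a SQL statement. The sketch below, in Python, shows roughly what that translation looks like for a simple filter and projection. The helper name `odata_query_url`, the `Readings` entity set, and the URL are assumptions for illustration, and whether a given OData service honours each option depends on that service.

```python
from urllib.parse import urlencode, quote


def odata_query_url(service_root, entity_set, filter_expr=None, select=None, top=None):
    """Build an OData query URL from a filter, a column list and a row limit."""
    options = {}
    if filter_expr:
        options["$filter"] = filter_expr       # e.g. "value gt 3"
    if select:
        options["$select"] = ",".join(select)  # projection
    if top is not None:
        options["$top"] = str(top)             # row limit
    # Percent-encode spaces but keep '$', ',' and quotes readable.
    query = urlencode(options, quote_via=quote, safe="$,'")
    return f"{service_root}/{entity_set}" + (f"?{query}" if query else "")


if __name__ == "__main__":
    # Rough equivalent of:
    #   SELECT tag, value FROM Readings
    #   WHERE tag = 'PUMP_01' AND value > 3 LIMIT 100
    print(odata_query_url(
        "http://localhost/odata", "Readings",
        filter_expr="tag eq 'PUMP_01' and value gt 3",
        select=["tag", "value"],
        top=100,
    ))
```

Anything that cannot be expressed in the options the service supports would still have to be pulled across the wire and processed by the client, which is the concern raised above.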