Is Dremio only beneficial for applications with multiple data sources, or is it also a good fit for applications that query a single relational data source such as Oracle DB?

I am a beginner learning about the advantages of Dremio. I understand how Dremio works wonders in simplifying queries across multiple data sources by creating virtual datasets on top of them.

However, my application primarily queries an Oracle database and returns responses for other clients to consume. We optimize the database to improve performance for services that need to be highly available while also reading a huge amount of inventory data.

Is Dremio a good choice here if I am looking for performance improvements when querying Oracle DB?

Hi @Namrata766

As far as Oracle goes, Dremio will push the work down to Oracle. If you join two tables and query via Dremio, the join, along with any additional filters, should get pushed down to Oracle. This will get you the same speed as executing in Oracle. However, to make it super fast, you can create reflections in Dremio. A combination of raw and aggregate reflections would help: probably create raw reflections on some of the bigger tables, and then create aggregate reflections on the VDS that has the joins and aggregations.
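To make the raw-plus-aggregate combination a bit more concrete, here is a minimal sketch that only builds the reflection DDL strings. It assumes Dremio's `ALTER TABLE ... CREATE RAW REFLECTION ... USING DISPLAY (...)` and `ALTER TABLE ... CREATE AGGREGATE REFLECTION ... USING DIMENSIONS (...) MEASURES (...)` statement shapes; the exact keyword (`TABLE`, `VIEW`, or `DATASET`) differs between Dremio versions, so check the SQL reference for yours. All dataset, reflection, and column names are hypothetical.

```python
# Sketch only: builds Dremio reflection DDL as strings; it does not connect to Dremio.
# The statement shapes are assumed from Dremio's SQL reference and may vary by version.
# Every dataset, reflection, and column name below is hypothetical.

def raw_reflection_sql(dataset, name, display_columns):
    """DDL for a raw reflection that materializes the listed columns."""
    cols = ", ".join(display_columns)
    return f"ALTER TABLE {dataset} CREATE RAW REFLECTION {name} USING DISPLAY ({cols})"


def aggregate_reflection_sql(dataset, name, dimensions, measures):
    """DDL for an aggregate reflection.

    `measures` maps a column name to the aggregate functions to precompute.
    """
    dims = ", ".join(dimensions)
    meas = ", ".join(
        f"{col} ({', '.join(funcs)})" for col, funcs in measures.items()
    )
    return (
        f"ALTER TABLE {dataset} CREATE AGGREGATE REFLECTION {name} "
        f"USING DIMENSIONS ({dims}) MEASURES ({meas})"
    )


# Raw reflection on a large (hypothetical) Oracle table.
raw_sql = raw_reflection_sql(
    "oracle_src.INVENTORY", "inventory_raw", ["ITEM_ID", "WAREHOUSE_ID", "QTY"]
)

# Aggregate reflection on a (hypothetical) VDS that already contains the join.
agg_sql = aggregate_reflection_sql(
    "reporting.inventory_by_warehouse",
    "inventory_agg",
    ["WAREHOUSE_ID"],
    {"QTY": ["SUM", "COUNT"]},
)

print(raw_sql)
print(agg_sql)
```

The same reflections can also be defined from the Dremio UI; generating the statements is just a convenient way to keep them under version control.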

@Namrata766

I interpreted your question slightly differently.

My use case is being able to join datasets from, say, MySQL, an Excel spreadsheet, and a CSV.

In your case, though, it sounds like an incremental reflection might be useful, as it could potentially take a lot of the performance load off Oracle.

You do have to balance this against whether you need the data to be up to date. A reflection is refreshed once an hour, so our latency is effectively an hour. You could of course write a cron job to force an update at more regular intervals.
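As a rough sketch of the cron-job idea, the snippet below builds (but does not send) a request to refresh the reflections that depend on a source table. It assumes the Dremio REST API's `POST /api/v3/catalog/{id}/refresh` endpoint and the `_dremio<token>` authorization header used with login tokens; both should be verified against the REST documentation for your Dremio version. The host, dataset ID, and token are placeholders.

```python
# Sketch only: constructs the refresh request without sending it.
# Assumptions to verify against your Dremio version's REST docs:
#   - POST /api/v3/catalog/{id}/refresh refreshes reflections on a physical dataset
#   - login tokens are passed as "Authorization: _dremio<token>"
# The base URL, dataset ID, and token below are placeholders.
import urllib.request


def build_refresh_request(base_url, dataset_id, token):
    """Return a POST request that asks Dremio to refresh a dataset's reflections."""
    url = f"{base_url.rstrip('/')}/api/v3/catalog/{dataset_id}/refresh"
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the dataset ID is part of the URL
        method="POST",
        headers={
            "Authorization": f"_dremio{token}",
            "Content-Type": "application/json",
        },
    )


req = build_refresh_request(
    "http://dremio.example.com:9047/", "hypothetical-dataset-id", "hypothetical-token"
)
print(req.get_method(), req.full_url)

# To actually trigger the refresh, the request would be sent with:
#   urllib.request.urlopen(req)
```

Saved as a script (for example a hypothetical `refresh_reflection.py`), this could be scheduled with a crontab entry such as `*/15 * * * * python3 refresh_reflection.py` to refresh every 15 minutes. More frequent refreshes mean more load on Oracle, so the interval is a trade-off.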

To coin a phrase - your mileage may vary
