Dremio Use Case Question

I wanted to ask whether Dremio is the right tool for what we’re trying to achieve. We’re interested in using Dremio as a database accelerator. All of our data lives in a single database (e.g. Redshift or SQL Server), so we don’t really need Dremio’s distributed data-integration (federation) features.

What we’re very interested in is boosting query performance against large tables. Our queries are mainly aggregations of metrics on fact tables, with some joins to generally small-to-medium-sized dimension tables. Performance is manageable on tables of 1 million to 10 million rows, but once we get up to 100 million or 1 billion rows, queries slow down considerably, especially when the database server is handling a couple of dozen concurrent queries.

Can Dremio noticeably accelerate these queries using reflections, provided the server configuration has enough memory and nodes?


Hi Larry, this can be a good use case, and you can scale Dremio easily by adding more executor nodes.

However, you might also consider keeping the data in S3; it doesn’t necessarily need to be in a database at all.

I would suggest reading the Data Reflections documentation and white paper, and taking the course on Data Reflections at university.dremio.com.

Data Reflections can “invisibly” optimize data so that far less work is performed for each query, allowing for lower latency and greater concurrency.
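For illustration, an aggregation reflection matching this kind of workload can be defined in SQL. This is only a sketch: the table and column names are hypothetical, and the exact `ALTER TABLE ... CREATE AGGREGATE REFLECTION` syntax may vary by Dremio version, so check the SQL reference for your release.

```sql
-- Hypothetical fact table: warehouse.sales(order_date, region_id, amount).
-- An aggregation reflection pre-aggregates the chosen measures by the
-- chosen dimensions, so a query such as
--   SELECT region_id, SUM(amount) FROM warehouse.sales GROUP BY region_id
-- can be satisfied from the reflection rather than scanning the
-- billion-row fact table.
ALTER TABLE warehouse.sales
  CREATE AGGREGATE REFLECTION sales_by_region_day
  USING DIMENSIONS (region_id, order_date)
  MEASURES (amount (SUM, COUNT));
```

Because the query planner substitutes the reflection transparently, the BI queries themselves don’t change; concurrency improves as a side effect, since each query does far less I/O.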

Good info, thank you.