Dremio on YARN execution engine

filanovskiy · May 6, 2019, 9:06pm

Hi everyone!

I’m curious which execution engine Dremio uses when I deploy it on YARN.
Is it something like Spark or MapReduce, or is it something completely different?
And if it is different, what are the performance differences between Spark and the Dremio execution engine?

Thank you!

kelly · May 6, 2019, 10:14pm

You can read more about Dremio and YARN here: https://docs.dremio.com/deployment/yarn-hadoop.html

I would suggest reading the Architecture Guide as well: https://www.dremio.com/lp/architecture-guide

In short, Dremio provides its own execution engine based on Apache Arrow. It only executes Dremio jobs, so it isn’t really comparable to Spark or MapReduce, which are general-purpose engines. A closer comparison would be SparkSQL or Hive, for example.

filanovskiy · May 6, 2019, 10:25pm

Hi Kelly! Thank you. And what about the performance difference between the Dremio engine and SparkSQL? Is it roughly the same, or something like 10 times faster or slower?
Thank you!

kelly · May 6, 2019, 11:44pm

I think you should try it with your own data, workloads, and operational environment. In general, Dremio tends to be somewhat faster than other engines (2x-5x), and hundreds of times faster with Data Reflections.
