Apache Drill vs. Dremio

How is Dremio better than Apache Drill, and how do they differ?

There are lots of differences, here are a few:

  • Dremio is based on Apache Arrow
  • Dremio has far more sophisticated push-down capabilities
  • Dremio supports Data Reflections which can dramatically accelerate queries by up to 1000x
  • Dremio provides data curation, data lineage, row & column-level access control, and data masking abilities
  • Dremio is self-service for data consumers
  • Dremio has a very different philosophy about schema. Drill is effectively schema-less, whereas with Dremio you declare schema and it discovers and adapts to schema during query execution. This approach provides better and more predictable performance.

Both are open source so try them out. I think you’ll see that Dremio has a much larger functional scope, and that even without using Data Reflections it is typically 5x-10x faster than Drill.


From my understanding, both have the same roots but have gone in different directions.
Dremio does not cover the full Apache Drill functionality, and probably shouldn’t.
Personally, I use both for different purposes.

Drill has a nice idea in schema on the fly.
It can query with file paths and masks.
It has APIs for creating UDFs and new data sources.

Dremio: everything from the previous post makes it very powerful, but it has to discover the schema first.


Thank you, Dimtry.
Can you list the use cases where you prefer Drill over Dremio?

Drill: mainly to land data dumps in an appropriate format (primarily Parquet, sometimes JSON).

  1. Querying files by mask, e.g. SELECT * FROM logs.`/2019/*/events_*`;
  2. When I need folder and file info in the query
  3. When I need to query files with slightly different structures
  4. Sources based on a regex pattern (logs, some kinds of configs)
  5. Custom UDFs (crypto, geo, text, etc.)
  6. JDBC connector
  7. JDBC connector with integrated security to access SQL Server (for dumps), etc.
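
For items 1 and 2, here is a rough sketch of how the "land it as Parquet" step could look when driven from Python through Drill's REST endpoint. It assumes a Drill instance on localhost with the default web port, a storage plugin named `logs` as in the query above, and a writable `dfs.tmp` workspace; the target table name and the helper function names are made up for illustration. `filename` and `fqn` are Drill's implicit file columns, which `SELECT *` alone does not return.

```python
import json
import urllib.request

# Assumption: a Drill instance on localhost with the default web port (8047).
DRILL_URL = "http://localhost:8047/query.json"


def masked_select(source: str) -> str:
    """SELECT over a path mask, adding Drill's implicit file columns."""
    # filename and fqn are implicit columns; they must be named explicitly.
    return f"SELECT filename AS src_file, fqn AS src_path, t.* FROM {source} AS t"


def ctas(target: str, select_sql: str) -> str:
    """Wrap a SELECT in CREATE TABLE AS; output format follows store.format."""
    return f"CREATE TABLE {target} AS {select_sql}"


def drill_payload(sql: str) -> bytes:
    """Encode one statement for the REST endpoint, without a trailing semicolon."""
    body = {"queryType": "SQL", "query": sql.rstrip().rstrip(";")}
    return json.dumps(body).encode("utf-8")


def run(sql: str, url: str = DRILL_URL) -> dict:
    """POST the statement to Drill and return the decoded JSON response."""
    req = urllib.request.Request(
        url, data=drill_payload(sql), headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# `logs` is the plugin name from the post; the target table name is hypothetical.
select_sql = masked_select("logs.`/2019/*/events_*`")
ctas_sql = ctas("dfs.tmp.`events_2019`", select_sql)
# run(ctas_sql)  # uncomment when a Drill instance is actually reachable
```

Drill's `store.format` option defaults to Parquet, so the CTAS should write Parquet files unless the session has changed it. The resulting folder is what then gets registered in Dremio.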

Once the data has landed: register it in Dremio, create reflections / refresh metadata, do all the fantastic Dremio magic mentioned in Kelly’s post, and share the results.

Personally I just prefer “drill with dremio” instead of “drill vs dremio”

Thank you so much. A perfect combination; I would like to explore what you said.
Is there a good time to connect with you and discuss this further?