Complete data pipeline

Can anyone share a list of references for implementing a complete data pipeline using Dremio?

What are the best ways to ingest data into Dremio and process it? I would like a comparison of the different ingestion options and guidance on which scenarios call for which tool.

I have seen references stating that data can be processed using Spark, and others stating it can be processed using dbt. When should I use Spark and when should I use dbt?

@sksankar2006 What format is your source data in? Is it CSV/TXT/PARQUET?

Hi, it will be in multiple formats. We are building a new data lakehouse setup that could grow exponentially in size. The file formats will include CSV, JSON, and .dat, and we plan to add Kafka as a streaming service to ingest streaming data. What I am looking for is reference architecture documentation that covers:

  • The best way to implement it.
  • Whether to use Spark, dbt, or the native SQL in Dremio itself.
  • If I have to mix all of these, where each of them fits.
  • How to maintain CI/CD for a Dremio project. I am not finding a reference apart from one document that mentions dbt.
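For context on the dbt option: with the dbt-dremio adapter, a transformation is just a SQL file that dbt compiles and runs against Dremio, so CI/CD largely reduces to running dbt commands in a pipeline. A minimal sketch of what such a model might look like (the model name, source names, and column names here are hypothetical examples, not from any official reference):

```sql
-- models/staging/stg_orders.sql  (hypothetical model and source names)
-- dbt renders the Jinja below and submits the resulting SQL to Dremio.
{{ config(materialized='table') }}

select
    order_id,
    cast(order_ts as timestamp) as order_ts,
    amount
from {{ source('raw', 'orders_csv') }}  -- source pointing at raw CSV files in the lake
where order_id is not null
```

A CI job would then typically run `dbt deps` and `dbt build` against a test target, and promote to production only when the build and its tests pass.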