Oracle/ SQL Server DW replaced by Iceberg

Hi,
May I know if anyone has experience replacing traditional DW on Oracle/ SQL Server to Iceberg + Parquet in S3?

Great if you could share advantages and disadvantages on doing this?

@amin Oracle and SQL server are very good as an OLTP database, example when you place order on an E-commerce web site, it is going into one of these databases. They store data in rows When it comes to running dashboards, BI reports, there are 3 bottle necks

  • Dashboards usually query very few columns and since above databases store row wise, they need to read several files
  • These are highly normalized 3NF tables best suited for OLTP databases, where the most common DML is insert into a single table whereas in Decision Support Systems, multiple tables need to be joined and the joins/aggs are very expensive operators. When it comes to data lake on Iceberg table format, they are denormalized to a certain degree so less joins during runtime. A tool like Dremio can add Agg reflections and avoid aggregations during runtime, making it super fast
  • OLTP databases can scale horizontally in terms of CPU/Memory/Node but the storage layer is common and often the bottle neck, tools like Dremio can send parallel threads and read object store in fragments making the execution time of the query as the time taken by one thread

To move data of Oracle to S3 (Iceberg format) is something you should look into as there are several ETL tools.

Kindly, let me know if you have any questions

Thanks
Bali

1 Like

Thanks for the insights on RDBMS vs Iceberg.
I have done several testing using 800M+ NY Taxi dataset. Amazed to see how fast Dremio can return query results from Iceberg+S3-compatible dataset.

Welcome

Thanks for the feedback @amin , please keep us posted on your testing and do not hesitate to reach out to us on any questions or issues you encunter