Why not Apache Hudi?

Some recent activity on Hudi:

  • AWS automatically includes the Hudi JARs in the latest EMR setups
  • Hudi has graduated from the Apache Incubator
  • Hudi's GitHub repo shows roughly double the activity of Iceberg's
  • Hudi has been in Apache about a year longer than Iceberg, and existed even longer before that as Hoodie
  • Just yesterday, an AWS architect published a blog post on using Hudi in Glue 2.0, which leads me to believe Glue may support it soon
  • In September, AWS announced support for reading Hudi tables from Redshift Spectrum

I’ll need to read up on Iceberg support in AWS, but if AWS is adopting Hudi, then Hudi support in Dremio would make integration easier.