CDC support / full and delta files


We receive full files (CSV) every month and delta files twice a day. The delta files contain an additional “amd” (add, modify, delete) column indicating what happened in the source database. Unfortunately, the delta files come in different formats: some are JSON and contain only the fields that have changed, while others contain the complete record. To analyse the data, I need to “reconstruct” the source DB, which can be done in two ways: “reconstruct” (an add inserts the record, a modify updates it, a delete removes it) and “DWH” (add, modify, and delete delta records are all appended to the DB as new rows).
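To make the two modes concrete, here is a minimal Python sketch of what I mean. It is only an illustration under assumptions: the “amd” values are taken to be "a", "m" and "d", every delta is assumed to carry a primary key column called "id", and the function names `apply_reconstruct` and `apply_dwh` are made up for this example. None of this is an existing Dremio feature.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def apply_reconstruct(state: Dict[Any, Record], delta: Record, key: str = "id") -> None:
    """Apply one delta so that `state` mirrors the source table.

    Assumes the "amd" column holds "a", "m" or "d" and that every delta
    carries the primary key column (hypothetically named "id").
    """
    op = delta["amd"]
    fields = {k: v for k, v in delta.items() if k != "amd"}
    pk = fields[key]
    if op == "a":
        state[pk] = fields
    elif op == "m":
        # Merging covers both delta flavours: partial deltas overwrite only
        # the changed fields, complete records overwrite every field.
        state.setdefault(pk, {key: pk}).update(fields)
    elif op == "d":
        state.pop(pk, None)
    else:
        raise ValueError(f"unknown amd value: {op!r}")


def apply_dwh(history: List[Record], delta: Record) -> None:
    """DWH mode: keep every delta record, including its "amd" marker."""
    history.append(dict(delta))


# State loaded from the monthly full file (made-up sample rows).
state: Dict[Any, Record] = {
    1: {"id": 1, "name": "Ann", "city": "Oslo"},
    2: {"id": 2, "name": "Bob", "city": "Bonn"},
}
history: List[Record] = []

deltas = [
    {"amd": "m", "id": 1, "city": "Bergen"},              # partial (JSON-style) delta
    {"amd": "d", "id": 2},                                # delete
    {"amd": "a", "id": 3, "name": "Cy", "city": "Rome"},  # complete record
]

for delta in deltas:
    apply_reconstruct(state, delta)
    apply_dwh(history, delta)
```

In “reconstruct” mode `state` ends up as a copy of the current source table, while in “DWH” mode `history` keeps one row per delta, so earlier versions of a record remain available for analysis.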

This is pretty much like CDC (Change Data Capture), which allows efficient replication of DB content.

Is any of this supported, or can it be achieved in some other way? Dremio reflections store their data in Parquet files (I believe). The delta reader could maintain the combined state in Parquet files, which would then be used for analysis.

Thanks a lot