I read that Dremio doesn't support a CDC process, so I was trying a workaround using unions and this post from @can, overwriting and sourcing tables in the $scratch space.
But when I union my historical data with the daily file, duplicates are retained.
Is there a way I can remove them?
I tried DISTINCT, but that didn't help.
Here is the query I ran:
“SELECT DISTINCT ATMID, ZIP FROM
(SELECT * FROM ATMDATA
UNION
SELECT * FROM ATMDATA_CDC)
ORDER BY ATMID ASC”
You are doing a SELECT * FROM. Are you saying every column in one row is an exact duplicate of the other row? Only then will UNION suppress it. If even one column has a different value, both rows will come back.
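
To make that concrete, here is a small sketch you can run locally. It uses Python's built-in sqlite3 rather than Dremio, so treat it as an illustration of the SQL semantics, not as tested Dremio syntax. The sample rows are made up, and it assumes ATMID is the key and that the ATMDATA_CDC row should win when both tables have the same ATMID. The first query is the one from the question. The second keeps every CDC row and adds only the historical rows whose ATMID has no match in the CDC table.

```python
import sqlite3

# In-memory database with hypothetical sample rows (not real data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ATMDATA (ATMID INTEGER, ZIP TEXT);
    CREATE TABLE ATMDATA_CDC (ATMID INTEGER, ZIP TEXT);
    INSERT INTO ATMDATA VALUES (1, '10001'), (2, '20002'), (3, '30003');
    -- ATM 2 is unchanged, ATM 3 has a new ZIP, ATM 4 is new.
    INSERT INTO ATMDATA_CDC VALUES (2, '20002'), (3, '39999'), (4, '40004');
""")

# Query from the question: UNION and DISTINCT compare whole rows,
# so ATM 3 comes back twice because its ZIP differs between the tables.
union_rows = conn.execute("""
    SELECT DISTINCT ATMID, ZIP FROM
    (SELECT * FROM ATMDATA
     UNION
     SELECT * FROM ATMDATA_CDC)
    ORDER BY ATMID ASC, ZIP ASC
""").fetchall()

# Key-based merge (assumes ATMID is the key and the CDC row wins):
# take all CDC rows, then only the historical rows with no CDC match.
merged_rows = conn.execute("""
    SELECT ATMID, ZIP FROM ATMDATA_CDC
    UNION ALL
    SELECT h.ATMID, h.ZIP
    FROM ATMDATA h
    LEFT JOIN ATMDATA_CDC c ON h.ATMID = c.ATMID
    WHERE c.ATMID IS NULL
    ORDER BY ATMID ASC
""").fetchall()

print(union_rows)
print(merged_rows)
```

With this sample data the first query returns two rows for ATM 3 and the second returns one row per ATMID. If your tables have more columns, list them explicitly in both halves of the second query so the column order lines up.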