I imported a CSV file from HDFS into Dremio; initially it had 4 columns. To test metadata management, I deleted one column from the HDFS file and tried to refresh the physical dataset with the command below:
ALTER TABLE <dataset_path> REFRESH METADATA
The command executed successfully, but I can still see the dropped column, with null values, when querying the dataset.
Thanks for your response. I executed the commands below:
ALTER TABLE <dataset_path> FORGET METADATA
ALTER TABLE <dataset_path> REFRESH METADATA
SELECT * FROM <dataset_path>
But now the columns are no longer split on the delimiter.
@Monika_Goel did the configuration for the HDFS source change between metadata refreshes? In some instances, when a “metadata impacting” change is made to the source (e.g. changing connection settings such as the NameNode URI), Dremio will reset the existing metadata, on the expectation that you’re now working with a new set of files/datasets.
Otherwise, we’d expect existing formats to always be preserved between metadata refreshes.
Has this issue been fixed in the latest version? As mentioned on another page (see below), ‘forget metadata’ will lose the reflections and their definitions. I don’t want to re-format the dataset, as that would be costly. Is there any other approach we can use to refresh metadata when columns change?
If I execute ‘forget metadata’ on a PDS and there are no changes in the Parquet file (no columns added/deleted/updated), does it impact existing VDSs? This PDS will be auto-promoted, as we enabled auto-promotion in the data source level config.