Dremio can't read files in an HDFS folder after upgrade to 1.2.1

This is an odd bug… During the upgrade I got this message:

Forcing all home file or folder based acceleration to replan during next server startup [1.0.7, 1.2.0)
Error : one or more elements on the path are not found in namespace:
"HDFS".proj."analytics"."year=2017"."month=06"."USA"

Here are some sample HDFS folders:
drwxrwxr-x - mylogin mygroup 0 2017-07-14 18:30 /proj/analytics/year=2017/month=06/USA
drwxrwxr-x - mylogin mygroup 0 2017-07-14 18:30 /proj/analytics/year=2017/month=06/CAN
drwxrwxr-x - mylogin mygroup 0 2017-07-14 18:30 /proj/analytics/year=2017/month=06/MEX

When I choose the USA folder in the Dremio UI, it displays “No Items” even though there is a 0_0_0.parquet file in it.
The Dremio UI displays the parquet files in the CAN and MEX folders without any problems after the upgrade.

If I rename the “USA” folder to “US”, the Dremio UI is able to find and read the 0_0_0.parquet file, so the problem isn’t with the parquet file.

I’ve tried deleting that USA folder, restarting the server, recreating the folder, etc… without any success.

Is there some sort of folder cache that needs to be reset?

The above installation runs on Linux…

I installed a fresh copy of Dremio on my Windows PC and it is able to see the parquet file in that USA HDFS directory.

Hi,

It looks like something is out of sync, possibly related to our metadata store.

Did you convert the USA folder into a physical dataset? That seems likely, given that you apparently had an acceleration on it at some point. If so, you can try refreshing the metadata for the USA folder dataset. For now this has to be run manually as a query:

ALTER PDS "HDFS".proj."analytics"."year=2017"."month=06"."USA" REFRESH METADATA

(more info here)
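
If you need to run this for several folders, here is a rough Python sketch that builds the statement from a source name and an HDFS path. The function names are made up for illustration and are not part of Dremio. It quotes every path component, which I am assuming Dremio accepts in the same way as the partly quoted form above, and it only prints the statement; you still have to run it as a query yourself.

```python
def quote_identifier(part: str) -> str:
    """Wrap one path component in straight double quotes."""
    if '"' in part:
        # Escaping rules for embedded quotes are not covered here, so refuse.
        raise ValueError(f"unsupported character in path component: {part!r}")
    return '"' + part + '"'


def refresh_metadata_sql(source: str, hdfs_path: str) -> str:
    """Build an ALTER PDS ... REFRESH METADATA statement (hypothetical helper)."""
    # Drop empty pieces produced by leading or trailing slashes.
    parts = [p for p in hdfs_path.split("/") if p]
    dataset = ".".join(quote_identifier(p) for p in [source, *parts])
    return f"ALTER PDS {dataset} REFRESH METADATA"


print(refresh_metadata_sql("HDFS", "/proj/analytics/year=2017/month=06/USA"))
# prints: ALTER PDS "HDFS"."proj"."analytics"."year=2017"."month=06"."USA" REFRESH METADATA
```

Note that the quotes have to be plain straight double quotes; the curly quotes that forum software sometimes substitutes will not work in a query.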

If you don’t have it as a physical dataset, you could edit the HDFS source, make any minor change (for example, change the refresh policy), and save it. That should trigger a full refresh of the source and all of its folders.

Hopefully these steps help. We are aware that we need an easier way to manually trigger these updates :)

thanks,
Doron

I probably set acceleration on at some point for that one directory and then removed it.

Running the ALTER PDS with REFRESH METADATA fixed the problem.
