Job automatic retry on failure


We have been experiencing a few job (reflection or query) failures due to external causes (e.g. the error "Remote backend is unreachable (502 Bad Gateway)").
The problem in such cases is that the job, such as a long-running reflection, is not retried automatically.
Note: the physical datasets (PDS) are set to “Never refresh” and “Never expire”, and we launch the PDS-dependent reflections using the API catalog refresh functionality.

Is there a “retry” functionality in Dremio for such job failures?


Unfortunately, there is no retry in this case. There’s also a related outstanding issue: if you query this reflection in the sys.reflections table, it’ll show a status of CANNOT_ACCELERATE_SCHEDULED.

This would make a good improvement, since Dremio does retry reflections whose physical datasets have refresh periods configured.
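
Until Dremio retries these jobs itself, one possible workaround is to retry on the client side, around whatever call triggers the refresh. Below is a minimal sketch of a generic retry helper with exponential backoff. It is not a Dremio feature or API: `retry_with_backoff`, `action`, and `is_transient` are hypothetical names for illustration, and the attempt count and delays are arbitrary assumptions.

```python
import time


def retry_with_backoff(action, is_transient, max_attempts=4,
                       base_delay=5.0, sleep=time.sleep):
    """Call action() until it succeeds or max_attempts is reached.

    action       -- zero-argument callable (assumed to raise on failure)
    is_transient -- callable(exc) -> bool; True if the error is worth retrying
    base_delay   -- seconds before the first retry; doubled on each retry
    sleep        -- injectable so the helper can be tested without waiting
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception as exc:
            # Give up on the last attempt or on non-transient errors.
            if attempt == max_attempts or not is_transient(exc):
                raise
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

In practice, `action` would be your own function that issues the catalog refresh request and raises when the response is an error such as 502 Bad Gateway, and `is_transient` would return True only for those external failures, so that genuine errors are not retried.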

Thanks for your feedback, @Benny_Chow.