We are running a containerised Dremio connected to several sources, including MySQL and HDFS. The data in the Dremio datasets does not refresh; it appears to be locked to a previous version of the same data. We double-checked that the new data is actually present in the MySQL tables and in HDFS. Our issue seems similar to this one, but what we read there doesn’t really solve our problem. We tried the ALTER TABLE commands, both REFRESH and FORGET. Both execute without errors, but the source tables in Dremio still show no update. We also wanted to try the metadata clean commands but, since we are working in a containerised setup, we can’t stop the master coordinator without shutting down the container itself. We also tried deleting the source and adding it back again (keeping the same name for the sake of the VDSs), but it just picks up the same metadata again (jobs and all), so we still cannot see any updated data. Last but not least, if we add the very same source under a different name, the new data is shown. We would be grateful for any help. Thanks!
Is Dremio discovering new datasets? For example, if another table is added to MySQL does that table eventually appear in that source in Dremio?
It may be helpful to add additional logging to Dremio’s server.log to find out what’s going on. Stop the Dremio service (coordinator and executors). Go to the Dremio configuration directory and find logback.xml. Add the following logger alongside the others toward the end of that file:
<logger name="com.dremio.exec.catalog.SourceMetadataManager" additivity="false">
  <level value="debug" />
  <appender-ref ref="text" />
</logger>
Restart the Dremio service and monitor server.log for debug information on that source and its tables. Do you see anything like the following (but with your source and tables instead of MonogDB etc.):
2019-03-25 13:23:53,236 [metadata-refresh-local MonogDB] DEBUG c.d.e.catalog.SourceMetadataManager - Metadata refresh for dataset : "local MonogDB".testDB2.myCollection took 11 milliseconds.
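If it helps, here is a minimal Python sketch for picking those refresh entries out of server.log once the debug logger is active. The regular expression is modelled only on the single sample line above, so treat the exact format as an assumption; other Dremio versions may word the message differently. The names `RefreshEvent` and `find_refreshes` are made up for this example and are not part of Dremio.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class RefreshEvent(NamedTuple):
    """One 'Metadata refresh for dataset' entry (hypothetical helper type)."""
    timestamp: str
    dataset: str
    millis: int


# Assumption: the message layout matches the sample line shown above.
REFRESH_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*"
    r"Metadata refresh for dataset : (?P<dataset>.+?) "
    r"took (?P<ms>\d+) milliseconds"
)


def find_refreshes(lines: Iterable[str], needle: str = "") -> Iterator[RefreshEvent]:
    """Yield refresh entries whose dataset path contains `needle`."""
    for line in lines:
        match = REFRESH_RE.search(line)
        if match and needle in match.group("dataset"):
            yield RefreshEvent(
                match.group("ts"),
                match.group("dataset"),
                int(match.group("ms")),
            )


# Example use (the log path depends on your installation):
# with open("server.log", encoding="utf-8") as log:
#     for event in find_refreshes(log, needle="myCollection"):
#         print(event.timestamp, event.dataset, event.millis)
```

If a table you expect to be refreshed never shows up in the output, that would suggest the metadata refresh is not running for it at all, rather than running and returning stale data.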
First of all, thanks for your prompt answer. I do not have the means to add a table to MySQL right now, but the fetching seemed to work fine after forgetting the metadata for a given table and then discovering it again. As I stated in the opening post, I’m afraid I can’t really stop the Dremio service without stopping the container itself, but I’ll try to work something out to add the debug level to the server log. In the meantime we more or less solved the issue on HDFS by removing the format and adding it back again, removing and then re-creating the source reflections, and then doing the same for the VDS reflections. For the MySQL source we had to add it again under a different name and modify the virtual datasets accordingly. Thanks.