We are using the Dremio community edition (MapR release) with MapR on Azure.
Dremio is able to connect to MapRfs and show a preview of files, but when we try to save a file in Parquet format, we see the error ‘Failed to Fetch’.
After throwing this error, Dremio crashes, and we have to restart it manually.
We do not have this issue with any other data sources (SQL, etc.).
Also, nothing about the event is logged in server.log or server.out.
We’d greatly appreciate any help in this matter.
Hi @ruppala
Kindly send us the server.log and server.out from the time of the error.
Thanks
@balaji.ramaswamy
logs.zip (6.9 KB)
Thank you for the immediate response. Please find the logs attached.
Hi @ruppala
Kindly send me the dremio-env file and the output of “free -g” from the Unix prompt.
Can you also send us /var/log/messages and the dmesg output from the time this issue happened?
Also, can you please send us the server.log from before and after the event? I see only the startup in the attached log.
Thanks
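For anyone collecting the "before and after" portion of server.log, here is a minimal Python sketch that keeps only the lines within a few minutes of the crash. It assumes the timestamp format seen in the server.log excerpts in this thread (for example 2019-04-30 13:34:34,078). The function name lines_in_window and the default window sizes are hypothetical and are not part of Dremio.

```python
from datetime import datetime, timedelta

# Timestamp layout assumed from the server.log excerpts in this thread,
# e.g. "2019-04-30 13:34:34,078" (23 characters).
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TS_LENGTH = 23


def lines_in_window(lines, event_time, minutes_before=5, minutes_after=5):
    """Return the log lines whose timestamp falls around event_time.

    Lines without a leading timestamp (such as stack trace lines) are
    kept or dropped together with the timestamped line before them.
    """
    start = event_time - timedelta(minutes=minutes_before)
    end = event_time + timedelta(minutes=minutes_after)
    keep = False
    selected = []
    for line in lines:
        try:
            ts = datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
            keep = start <= ts <= end
        except ValueError:
            # Continuation line: inherit the decision of the previous line.
            pass
        if keep:
            selected.append(line)
    return selected
```

To use it, read server.log into a list of lines and pass the approximate time of the crash as a datetime; the result can be written to a smaller file and attached.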
@balaji.ramaswamy
Output of free -g:
              total        used        free      shared  buff/cache   available
Mem:             62          30          24           0           7          31
Swap:             1           0           1
Please find the logs, which include everything printed after startup until the next restart.
log.zip (3.7 KB)
Hi @balaji.ramaswamy could you please take a look.
Hi @ruppala
I do not see anything in the logs other than a metadata refresh for the SQL Server query:
2019-04-30 13:34:34,078 [main] INFO com.dremio.dac.server.WebServer - Started on http://localhost:9047
2019-04-30 13:34:34,215 [main] INFO c.dremio.dac.server.LivenessService - Started liveness service on port 46201
2019-04-30 13:34:39,543 [metadata-refresh-Finance DataMart] WARN c.d.e.store.jdbc.JdbcSchemaFetcher - Took longer than 5 seconds to query row count for [FDM].[dbo].[FCT_BAL], Using default value of 1000000000.
com.microsoft.sqlserver.jdbc.SQLServerException: The query has timed out.
at com.microsoft.sqlserver.jdbc.TDSCommand.checkForInterrupt(IOBuffer.java:6498) ~[microsoft-sqljdbc41-4.2.6420.100.jar:na]
Can you send me the server.gc? Please try this again and send both server.gc and server.gc.1.
Thanks
@balaji.ramaswamy
Thank you @balaji.ramaswamy
There is just one message that was printed in the server.gc when I repeated the save:
2019-04-30T19:08:13.712+0000: 22.187: [GC (Allocation Failure) [PSYoungGen: 868352K->37502K(1341952K)] 929037K->98211K(1923584K), 0.0292115 secs] [Times: user=0.09 sys=0.03, real=0.03 secs]
And in server.gc.1:
2019-04-30T19:00:49.928+0000: 209.822: [GC (Allocation Failure) [PSYoungGen: 1227737K->25415K(1338880K)] 1258083K->194728K(1819136K), 0.0922304 secs] [Times: user=0.38 sys=0.03, real=0.09 secs]
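To make GC lines like the two above easier to read, here is a minimal Python sketch that pulls out the cause, the overall heap occupancy before and after the collection, the heap capacity, and the pause time. It only handles the young-collection line shape shown in this thread; lines of any other shape return None. The names GC_LINE and parse_gc_line are hypothetical helpers and are not part of Dremio or the JVM.

```python
import re

# Matches lines shaped like the ones quoted in this thread:
# [GC (cause) [PSYoungGen: ...] <before>K-><after>K(<total>K), <pause> secs]
GC_LINE = re.compile(
    r"\[GC \((?P<cause>[^)]*)\)"
    r".*?\] "
    r"(?P<before>\d+)K->(?P<after>\d+)K\((?P<total>\d+)K\), "
    r"(?P<pause>[\d.]+) secs\]"
)


def parse_gc_line(line):
    """Parse one GC log line; return a dict, or None if it does not match."""
    match = GC_LINE.search(line)
    if match is None:
        return None
    return {
        "cause": match.group("cause"),
        "heap_before_kb": int(match.group("before")),
        "heap_after_kb": int(match.group("after")),
        "heap_total_kb": int(match.group("total")),
        "pause_secs": float(match.group("pause")),
    }
```

Run against the server.gc line above, this reports a heap of 98211K in use after the collection out of 1923584K, with a pause of roughly 0.03 seconds.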
Hi @balaji.ramaswamy. Please take a look at this when you can.
Hi @balaji.ramaswamy, please let me know if you need any other information.
Thank you!
Hi @balaji.ramaswamy,
The data is on MapR stood up on Azure VMs.
@ruppala
I think it would be good to enable debug logging and review the logs. It will be a bit noisy, but you will have to look through it.
Under the conf folder, edit logback.xml (vi logback.xml) and change the below to debug
Then restart Dremio and check for errors during startup or just before it crashes.
If this does not reveal anything, then change the above from debug back to info and try the below
Restart again
Check log again