Dremio Crashing

Long shot but has anyone come across an error similar to this one? Dremio has been crashing consistently for us (new cluster setup).

Environment details:
CentOS 7.5.1804 (Core)
MapR Cluster (5 nodes)
Latest Dremio 2.1

From /var/log/dremio/server.out
*** Error in `/usr/java/jdk1.8.0_191-amd64/jre/bin/java': free(): invalid next size (normal): 0x00007f0a4f7218a0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81499)[0x7f0aae69d499]
/tmp/mapr-mapr-libMapRClient.5.2.2.201801021050-mapr.so(Java_com_mapr_fs_jni_Page_releaseMemory+0x100)[0x7f0a76bb3a60]
[0x7f0a9b212342]
======= Memory map: ========

Dremio Daemon Started as master
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
java: fs/client/fileclient/cc/client.cc:5297: int mapr::fs::MapClient::ReadRPC(mapr::fs::Inode*, uint64_t, uint32_t, uint64_t, int, iovec*, bool*, mapr::fs::SecondaryFid*, uint64_t*, mapr::fs::FidMsg*, const char*, uint64_t, mapr::fs::Inode**): Assertion `0' failed.

Can you email me (you have my email from our conversation earlier) both server.log & server.out?

Also, out of curiosity, if you are unable to start Dremio, how did you run into the IndexOutOfBoundsException? Different cluster/setup?

Hi Anthony, I have the same error on the 3 executor nodes, which are also running on CentOS 7.

They start OK and run for a while, and then they suddenly go down (just Dremio). After rebooting the AWS instance, they connect with no problem.

Have you found out anything more about this error?

Thanks!

@jmillet what version of MapR and Dremio are you on? IIRC, the particular issue the original poster faced came from running against a version of MapR 6 that is not fully supported.

Hi @jmillet

For the Dremio node that is crashing, can you please look in your server.log and server.out for an entry mentioning a heap dump file? It will be named java_pid_<pid_no>.hprof. Also check whether there are any hprof files under your log folder.

You can also look at the server.gc file under the log folder for consecutive "Full GC" entries.
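
In case it helps, here is a rough Python sketch of those two checks. It makes several assumptions: the log folder is /var/log/dremio (the path from the original post), the GC log uses JDK 8 style lines where a full collection contains "[Full GC" and a young collection contains "[GC", and the function names are just illustrative. Adjust it to your own setup.

```python
from pathlib import Path


def find_heap_dumps(log_dir):
    """Return sorted paths of any *.hprof files directly under log_dir."""
    return sorted(Path(log_dir).glob("*.hprof"))


def hprof_mentions(lines):
    """Return the log lines that mention a heap dump (.hprof) file."""
    return [line.rstrip("\n") for line in lines if ".hprof" in line]


def longest_full_gc_run(gc_lines):
    """Return the longest run of back-to-back Full GC events.

    Assumption: a full collection line contains "[Full GC" and a young
    collection line contains "[GC". Any other line (heap detail output,
    for example) is ignored and does not break a run.
    """
    longest = current = 0
    for line in gc_lines:
        if "[Full GC" in line:
            current += 1
            longest = max(longest, current)
        elif "[GC" in line:
            current = 0
    return longest


if __name__ == "__main__":
    log_dir = Path("/var/log/dremio")  # assumed location, change as needed

    for dump in find_heap_dumps(log_dir):
        print("heap dump file:", dump)

    for name in ("server.log", "server.out"):
        path = log_dir / name
        if path.exists():
            with path.open(errors="replace") as fh:
                for hit in hprof_mentions(fh):
                    print(name, "mentions:", hit)

    gc_path = log_dir / "server.gc"
    if gc_path.exists():
        with gc_path.open(errors="replace") as fh:
            print("longest run of consecutive Full GCs:", longest_full_gc_run(fh))
```

A long run of Full GCs with no young collections in between usually suggests the heap is under pressure, but what counts as "long" depends on your heap size and workload.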

Kindly let me know what you find and I will suggest next steps

Thanks
@balaji.ramaswamy

Hi Anthony, thanks for your quick response.

We are not using MapR. We have a standalone installation and we use HDFS as our main data source.
The Dremio version is 2.0.5-201806021755080191-767cfb5
We will add 3 additional executor nodes to check whether the current ones are being exhausted.

Regards

Jose

@jmillet it seems like your issue may be completely different from the original one. May I recommend upgrading to the latest version, 3.0.6? If the issue still persists, please start a new thread (since the issues are probably different) and attach server.out and server.log.

Just in case this helps anyone: I agree with Anthony, the issue originally raised here was only relevant to MapR. In the end, we found that the version of Dremio we were running was not compatible with MapR 6.1. Dremio was using outdated jars to interact with MapR.