Dremio CE parquet reader causes native memory leak

Continuing the discussion from Support parquet files compressed via ZSTD:

The Dremio CE parquet reader does not clean up decompressors after use. Depending on the codec, this can result in native memory leaks for codecs backed by native libraries, and it also circumvents decompressor pooling. The snappy decompressor luckily does not exhibit this behavior, since its release() method is a no-op, but other decompressors do - the supported GZip codec being one of them.
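
For context, explicit release is what returns a decompressor to the pool and, for natively backed codecs, frees its state outside the JVM. A rough sketch of the intended acquire/use/return cycle against Hadoop's CodecPool (the GZip codec here is just for illustration, not the Dremio code):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CodecPool;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.Decompressor;
import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.util.ReflectionUtils;

public class DecompressorPoolingSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    CompressionCodec codec = ReflectionUtils.newInstance(GzipCodec.class, conf);

    // Acquire a (possibly pooled) decompressor instance for the codec.
    Decompressor decompressor = CodecPool.getDecompressor(codec);
    try {
      // ... decompress page bytes with the decompressor ...
    } finally {
      // This is effectively what release() does for pooled codecs; skipping it
      // defeats pooling and, for natively backed codecs, can leak native memory.
      CodecPool.returnDecompressor(decompressor);
    }
  }
}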

We spent a rather large effort getting to the bottom of this. Would fixing this issue be something the Dremio team would pick up, since we can’t do PRs on source that is not open? I’ve outlined the fix below if so.

Thanks in advance,
Steen


Specifically, com.dremio.parquet.pages.BaseReaderIterator does not call release() on the BytesDecompressor created in the constructor:

public BaseReaderIterator(ColumnChunkMetaData metadata, FullColumnDescriptor descriptor, CompressionCodecFactory codecFactory, BufferAllocator allocator, BulkInputStream in, boolean useSingleStream) {
    ...
    this.decompressor = (metadata.getCodec() == CompressionCodecName.UNCOMPRESSED) ? null : codecFactory.getDecompressor(metadata.getCodec());
    ...
  }

The decompressor needs to be released in the close() method, like so:

public void close() throws Exception {
  try {
    if (!this.useSingleStream && this.in != null)
      this.in.close();
  } finally {
    // Remember to release the decompressor after use
    if (decompressor != null) {
      decompressor.release();
    }
  }
}
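
The null check is needed because the constructor leaves decompressor as null for UNCOMPRESSED column chunks, so close() has to tolerate that.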

Hi @wundi, I will pass this on to the right team. What are the symptoms of the issue? Is this a heap or direct memory leak?

Hi @balaji.ramaswamy, sounds great, much appreciated. We read parquet files written with ZStandard, so in our particular case it is native memory. Byte buffers with direct memory temporarily take up memory, but GC cleans them up sooner or later. The snappy codec doesn’t require any cleanup, so for most users it won’t be an issue. Leaking only happens for codecs that use native code and acquire resources outside the JVM; those resources are only released if the codec instances are “closed” properly.

The Zlib codec releases its native resources in a finalizer as a safeguard, so in that case there shouldn’t be a leak as such, but performance might be affected (since codec pooling is circumvented).
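
To illustrate the point, here is a simplified sketch of that finalizer-safeguard pattern (not the actual Hadoop ZlibDecompressor source; the native calls are stand-ins): native state is freed in end(), and the finalizer only calls end() as a last resort, so cleanup depends on GC timing and the pooled instance is lost either way.

public class NativeDecompressorSketch {
  // Hypothetical handle to state allocated by a native library via JNI.
  private long nativeContext = allocateNativeContext();

  // Explicit cleanup: frees the native state immediately.
  public void end() {
    if (nativeContext != 0) {
      freeNativeContext(nativeContext);
      nativeContext = 0;
    }
  }

  @Override
  protected void finalize() {
    // Safeguard only: runs whenever GC gets around to it, which can be
    // long after the decompressor was last used.
    end();
  }

  // Stand-ins for JNI calls so the sketch compiles on its own.
  private static long allocateNativeContext() { return 1L; }
  private static void freeNativeContext(long ctx) { }
}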

If it’s any help, I can share a flamegraph of the allocation path - it’s attached at the bottom. The allocation itself isn’t the issue, though; the problem is that the release never happens. In our particular case, the JVM uses ~650GB of virtual memory on an executor configured with 200GB direct memory and 30GB heap, before the OS OOM killer terminates the JVM for using almost all of the 384GB the machine has.

I just discovered that there has been work in the parquet-mr community (issue PARQUET-2160, “Close decompression stream to free off-heap memory in time”, and PR 982), where changes were made specifically to release the native memory used by ZStandard as early as possible instead of relying on GC to do the work. This would remove the native memory leak in our particular case as well as alleviate some of the potential performance impact.
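
For reference, the gist of that change is to close the decompression stream as soon as the page has been drained, rather than leaving cleanup to GC/finalization. A minimal sketch of the pattern (the class and method here are illustrative, not the actual parquet-mr code):

import java.io.IOException;
import java.io.InputStream;
import org.apache.hadoop.io.compress.CompressionCodec;

public class EagerStreamCloseSketch {
  static byte[] readPage(CompressionCodec codec, InputStream compressedPage) throws IOException {
    // try-with-resources closes the stream as soon as the page has been
    // drained, so any native/direct buffers (e.g. a zstd context) are freed
    // immediately instead of waiting for GC or a finalizer.
    try (InputStream decompressed = codec.createInputStream(compressedPage)) {
      return decompressed.readAllBytes();
    }
  }
}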

However, for codec pooling and better performance, calling release() on the decompressor after use is still very much relevant (for any codec, not just ZStandard).

Love this.
I’ll get this in front of our devs.

Thank you, Steen!


Much appreciated, dch! :clap:

We already did (most of) the work, so there’s no reason others in the community have to. You taking it in means less maintenance for us going forward, so it’s a win-win :smiling_face:


@dch is there a timeline for resolving this?

I’ve come across a similar issue in Azure and am trying to determine whether it should be a new thread. I routed through from Support parquet files compressed via ZSTD - Dremio, but I can read small amounts of data fine - it’s only a problem when reading a folder with many parquet files.

Error message:

Are you running Dremio 24.1.0? If not, upgrading should solve it. Native support for reading and writing Zstandard-compressed parquet files is listed in the release notes.

Hi @wundi,

I’ve just built a cluster on Azure 7 days ago pulling from marketplace, but version looks old:
20.1.0-202202061055110045-36733c65

I suppose the next question is why the Azure image is behind. One for the Dremio team…

Yeah, 20.1 is from January 2022, so that’s quite out of date. Not sure if you’re using Parquet or Iceberg, but if it’s the latter there’s a new release coming out shortly with some great improvements to their Iceberg support, which may be relevant for you (if you’re doing anything on your own without using the marketplace).