FlightServerError: Flight RPC failed with message: Stream removed

Hey there! I posted about this under another topic I started last week, but want to make sure it gets noticed.

I’m running into an issue where I’m unable to retrieve the results of a query that returns a relatively large number of records (~1.9 million) from our Dremio Cloud instance when using Arrow Flight in Python.

I’ve tried the available functions (read_all(), read_pandas(), read_chunk()), but whichever one I use, it eventually fails with the same basic error message. The output below is from my last attempt to retrieve results using read_chunk() in a loop; it failed 170 chunks in. It has sometimes made it nearly 500 chunks in before failing, and I was once able to get read_pandas() to return the entire dataset.

---------------------------------------------------------------------------
FlightServerError                         Traceback (most recent call last)
Input In [11], in <cell line: 4>()
      4 try:
      5     print("Reading chunk #",chunk_num)
----> 6     batch, _ = reader.read_chunk()
      7     batches.append(batch)
      8     chunk_num += 1

File /lib/python3.8/site-packages/pyarrow/_flight.pyx:903, in pyarrow._flight._MetadataRecordBatchReader.read_chunk()

File /lib/python3.8/site-packages/pyarrow/_flight.pyx:60, in pyarrow._flight.check_flight_status()

FlightServerError: Flight RPC failed with message: Stream removed. gRPC client debug context: {"created":"@1658325230.254288460","description":"Error received from peer ipv4:34.149.92.66:443","file":"/opt/vcpkg/buildtrees/grpc/src/85a295989c-6cf7bf442d.clean/src/core/lib/surface/call.cc","file_line":903,"grpc_message":"Stream removed","grpc_status":2}. Client context: OK

In our Dremio Cloud instance, I can see the jobs that were kicked off to handle the query, and they are largely successful. I see a couple of jobs that failed on “Connection reset by peer” errors, but the successes on the Dremio side far outnumber the successes I’ve seen from the Python script.

The code I’m running is more or less example.py but tweaked to accommodate whichever function I’m using to retrieve records.
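For reference, here is a minimal sketch of the kind of read_chunk() loop described above. It is not the code from example.py: the helper name drain_chunks is hypothetical, and it only assumes a reader object whose read_chunk() returns a (batch, metadata) pair and raises StopIteration at the end of the stream, as in the traceback. It keeps whatever batches arrived before a failure, so you can see how far the stream got.

```python
def drain_chunks(reader):
    """Read chunks until the stream ends or fails.

    Returns (batches, error). `error` is None on a clean end of stream,
    otherwise the exception that interrupted the read (for example a
    FlightServerError with "Stream removed").
    """
    batches = []
    error = None
    while True:
        try:
            # Same unpacking as in the traceback: (record batch, app metadata)
            batch, _ = reader.read_chunk()
        except StopIteration:
            break  # normal end of stream
        except Exception as exc:
            # Keep the partial result instead of losing it with the exception
            error = exc
            break
        batches.append(batch)
    return batches, error
```

With pyarrow, the reader would typically come from client.do_get(...), and a complete list of batches can be combined with pyarrow.Table.from_batches(batches). If error is not None, len(batches) tells you how many chunks arrived before the stream was dropped.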

So what’s happening here? Can I do something to prevent this “Stream removed” error and retrieve my full dataset?

There was a similar post with this error here: “Flight connector on helm chart after enabled tls flight gettting error”. The fix noted there was to add the root certificate (.crt) file and pass it as an argument during authentication.

I saw that already, and confirmed that I am correctly finding and passing my certificate before posting.

The connection is being opened and data is being transferred, and then apparently that connection is closed prematurely.

More importantly, if I run a smaller query, or a simple test query (‘SELECT 1’), without changing anything else, I can retrieve results pretty consistently.

Minor update here: I edited my query to pull back fewer rows (~895,000 instead of ~1.9 million), and without changing anything else in my setup, I’m able to submit the query and retrieve full results just fine. That other post really doesn’t have anything to do with the issue I encountered.

New question: does Dremio enforce a data limit for non-enterprise customers? We are close to signing up, pending some internal activities, but for now are only using the community-level edition of Dremio Cloud. I mean I’m getting my query results now, so I’m mostly asking out of curiosity, but this would explain the behavior I encountered.

Great to hear that reducing the size of the query fixed the issue.

There is no difference between the Standard and Enterprise editions in terms of data limits. The differences are outlined here.

There may be some other limit on the quantity of data or a timeout that caused this behavior.
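Since reducing the size of the query worked, one possible workaround is to split a large result into several smaller queries and fetch them one at a time. The sketch below only builds the SQL strings. The function name paged_queries, the page size, and the ordering column are all hypothetical, and it assumes the query can be wrapped as a subquery and sorted by a column that gives a stable order.

```python
def paged_queries(base_sql, order_by, total_rows, page_size):
    """Yield SQL strings that each return at most `page_size` rows.

    Assumes `order_by` names a column with a stable, unique ordering;
    without one, LIMIT/OFFSET pages may overlap or skip rows.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for offset in range(0, total_rows, page_size):
        yield (
            f"SELECT * FROM ({base_sql}) AS q "
            f"ORDER BY {order_by} LIMIT {page_size} OFFSET {offset}"
        )


# Example: ~1.9 million rows in pages of 800,000 gives three queries
for sql in paged_queries("SELECT * FROM my_table", "id", 1_900_000, 800_000):
    print(sql)
```

Each generated query can then be submitted and read separately. Sorting with large offsets can be slow, so filtering on a key range (for example a date or ID range) may be a better way to split the data if such a column exists.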