Limit List field exceeded the maximum number of elements 128

Hi,

Since moving to Community Edition v14.0, I’ve been getting the error
"List field exceeded the maximum number of elements 128", because some of my parquet files have lists with more than 128 items.

I did not get this error message in the past (with v4.9).
Can you confirm it’s a new limit? I do not see it in the documentation about limits: Dremio
Is there any way to change this limit ?

Thanks

Here is the error log :

2021-03-29 09:50:56,145 [grpc-default-executor-16139] INFO c.d.service.jobs.JobResultsStore - User Error Occurred [ErrorId: 1e156f0f-2185-4028-b241-1e8b016b8327]
com.dremio.common.exceptions.UserException: List field ‘data_cart_items’ exceeded the maximum number of elements 128
at com.dremio.common.exceptions.UserException$Builder.build(UserException.java:804)
at com.dremio.service.jobs.JobResultsStore.loadJobData(JobResultsStore.java:145)
at com.dremio.service.jobs.JobResultsStore$LateJobLoader.load(JobResultsStore.java:294)
at com.dremio.service.jobs.JobDataImpl.range(JobDataImpl.java:46)
at com.dremio.service.jobs.LocalJobsService.getJobData(LocalJobsService.java:906)
at com.dremio.service.jobs.JobsFlightProducer.getStream(JobsFlightProducer.java:76)
at org.apache.arrow.flight.FlightService.doGetCustom(FlightService.java:111)
at org.apache.arrow.flight.FlightBindingService$DoGetMethod.invoke(FlightBindingService.java:144)
at org.apache.arrow.flight.FlightBindingService$DoGetMethod.invoke(FlightBindingService.java:134)
at io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:172)
at io.grpc.PartialForwardingServerCallListener.onHalfClose(PartialForwardingServerCallListener.java:35)
at io.grpc.ForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:23)
at io.grpc.ForwardingServerCallListener$SimpleForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:40)
at io.grpc.util.TransmitStatusRuntimeExceptionInterceptor$1.onHalfClose(TransmitStatusRuntimeExceptionInterceptor.java:74)
at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:331)
at io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:820)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

@dfleckinger

Dremio did not support complex type resolution during planning in v4.9 and treated the datatype as a mixed type. In the current version, the planner detects the correct data type, ARRAY, and hits the limit on the maximum number of elements per LIST.

How many such datasets are affected?

Thanks
Bali

I think only one dataset is affected, but not the least important unfortunately.
More than 128 cart items is frequent in FMCG baskets.

If there is no way to use this dataset anymore, I will need to process all the data beforehand by flattening it.

@dfleckinger Here you go “store.parquet.list_items.threshold”

Thanks @balaji.ramaswamy , works great ! That fixed my issue

Hi

Using the community edition (v15) to complete the dremio university training - Lesson “Handling Nested Fields | Curating data | D101 Courseware …”

Importing the restaurant_reviews.parquet file as provided by the university in that specific lesson - the following message is returned

List field ‘friends’ exceeded the maximum number of elements 128.

Where in the community edition should I adjust the configuration parameter store.parquet.list_items.threshold - please?

Found it:

```
ALTER SYSTEM SET store.parquet.block-size = 1073741824;
```

Thanks

@irnerd

You can use SQL or go via the Admin > Support > Support Keys option

Thanks -
I found this did not fix the issue with the Dremio University lesson. I increased store.parquet.block-size incrementally to 8GB, but the same message was reported:
"List field ‘friends’ exceeded the maximum number of elements 128."

Uploading this file from the university fails. Does anybody have any ideas how to get rid of this response, please?

restaurant_reviews.parquet

@irnerd The parameter is “store.parquet.list_items.threshold”

Thank you - had to set to 2048 in the end.
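
For anyone landing here, a minimal sketch of applying the key named above through SQL. The double quotes around the key and the `sys.options` lookup are assumptions about how Dremio handles support keys, so check them against your version. The value 2048 is simply what worked for this file.

```sql
-- Raise the per-list element limit (128 according to the error message).
-- Assumption: the key must be double-quoted because it contains dots.
ALTER SYSTEM SET "store.parquet.list_items.threshold" = 2048;

-- Assumption: sys.options exposes system options; use it to check the new value.
SELECT *
FROM sys.options
WHERE name = 'store.parquet.list_items.threshold';
```

Pick a value at or above the longest list in the affected column rather than an arbitrarily large number.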

  • Is there a reference page of all these settings - the main page does not contain that one
    Dremio

@irnerd Currently no. These settings have to be modified carefully and under guidance, as some can cause cluster instability. The specific one you have changed is generally safe. These settings can also change across versions.

We ran into this issue too. I would expect any valid parquet file to be readable by Dremio. As it stands, the user runs into each issue unexpectedly, upon its first occurrence. Maybe you send a parquet file with 1k cols, fail… send a parquet file using the legacy list structure, fail… send a parquet file with a list col over 128 entries, fail… This list of limitations on top of the parquet format leaves the end user questioning when the next generated parquet file will break things.

This is super confusing for new users - people doing a tutorial from the Vendor definitely expect it to work fully. Perhaps remove the friends field from the parquet file?

I disagree - I think it is an excellent use case and people should be aware - Given the complexity of data structures - this was a really useful exercise in being able to manipulate large parquets - vote for leaving it in, but maybe supplement the training session with a note about the configuration parameter?

That would work for me - use it as a learning/training point :)

@balaji.ramaswamy Is there any hard limit for the property ‘store.parquet.list_items.threshold’? We need to set it much higher (~20000) due to our business use case. What kind of issues could we face in this case?

Thanks
Amit

@Amit_Tyagi Not really, please monitor for direct memory usage

Hello,

is there a way to set this parameter also in Dremio Cloud? If I try to run an SQL query

ALTER SYSTEM SET store.parquet.list_items.threshold = 1000

I get the error that “Changing advanced system settings is not supported on Dremio Cloud”

Would be a shame if the list limitation cannot be overcome in Dremio Cloud.

@radu.popa How many elements do you have?

I can have up to 1300 elements in the array