ARRAY_AGG: IndexOutOfBoundsException

Hello everyone,

I created the following view

CREATE OR REPLACE VIEW “VIEW”
AS
SELECT
TOKEN,
TYPE,
ARRAY_AGG(TEXT) AS SYMPTOMS_TEXT,
ARRAY_AGG(CODE) AS SYMPTOMS_CODE
FROM “TABLE”
GROUP BY TOKEN, TYPE;

The “TABLE” is an Iceberg file in an “Object Storage” (MinIO (S3)) and we create the view inside a schema in the “Spaces”. I can create the view without errors but when I run a simple SELECT * FROM “VIEW”, I receive the follwing error-message:

SQL-Fehler: SYSTEM ERROR: IndexOutOfBoundsException: index: 32760, length: 36 (expected: range(0, 32768))

SqlOperatorImpl HASH_AGGREGATE
Location 0:0:2
Fragment 0:0

[Error Id: 16e6c645-b686-476c-94e6-9170ca95c743 on dremio.zi.de:0]

(java.lang.IndexOutOfBoundsException) index: 32760, length: 36 (expected: range(0, 32768))
org.apache.arrow.memory.ArrowBuf.checkIndex():701
org.apache.arrow.memory.ArrowBuf.setBytes():765
org.apache.arrow.vector.BaseVariableWidthVector.setBytes():1264
org.apache.arrow.vector.BaseVariableWidthVector.set():1079
org.apache.arrow.vector.VarCharVector.set():249
com.dremio.sabot.op.aggregate.vectorized.arrayagg.VarcharArrayAggAccumulatorHolder.addItemToVector():48
com.dremio.sabot.op.aggregate.vectorized.arrayagg.VarcharArrayAggAccumulatorHolder.addItemToVector():23
com.dremio.sabot.op.aggregate.vectorized.arrayagg.BaseArrayAggAccumulatorHolder.addItem():80
com.dremio.sabot.op.aggregate.vectorized.arrayagg.BaseArrayAggAccumulator.accumulate():115
com.dremio.sabot.op.aggregate.vectorized.AccumulatorSet.accumulate():248
com.dremio.sabot.op.aggregate.vectorized.VectorizedHashAggOperator.accumulateForAllPartitions():1728
com.dremio.sabot.op.aggregate.vectorized.VectorizedHashAggOperator.consumeDataHelper():1088
com.dremio.sabot.op.aggregate.vectorized.VectorizedHashAggOperator.consumeData():1003
com.dremio.sabot.driver.SmartOp$SmartSingleInput.consumeData():267
com.dremio.sabot.driver.StraightPipe.pump():59
com.dremio.sabot.driver.Pipeline.doPump():124
com.dremio.sabot.driver.Pipeline.pumpOnce():114
com.dremio.sabot.exec.fragment.FragmentExecutor$DoAsPumper.run():565
com.dremio.sabot.exec.fragment.FragmentExecutor.run():480
com.dremio.sabot.exec.fragment.FragmentExecutor.access$1700():109
com.dremio.sabot.exec.fragment.FragmentExecutor$AsyncTaskImpl.run():1016
com.dremio.sabot.task.AsyncTaskWrapper.run():122
com.dremio.sabot.task.slicing.SlicingThread.mainExecutionLoop():249
com.dremio.sabot.task.slicing.SlicingThread.run():171

The TEXT variable is a VARCHAR field, no entry longer then 24 bytes and the CODE variable is a VARCHAR field aswell, with a fixed length of 36 bytes. An example would look like this

TOKEN TYPE TEXT CODE
A1 MAIN Schwindel 8cda9a1a-a46f-4510-8872-d49bd4bc596e
A1 MAIN Fingerbeschwerden 704dc839-0de0-4797-b78b-1e6e488a47c7
A1 MAIN Mund-/Zungen-/Lippenbeschwerden 83431d7a-3555-444e-a693-24a7111e782d
A1 MAIN Durchfall f028a1f4-214d-4f15-afda-8d2691912d5f
A1 RELATED Fieber 1b5cc668-d357-43af-a863-799ebe288ffc
A1 RELATED Schwarzer Stuhl/Blut im Stuhl 0884bccb-6582-4622-8581-5e970ede1ecb

What I expected from the ARRAY_AGG function was the following result:

TOKEN TYPE SYMPTOM_TEXT SYMPTOM_CODE
A1 MAIN [Schwindel,Fingerbeschwerden,Mund-/Zungen-/Lippenbeschwerden,Durchfall] [8cda9a1a-a46f-4510-8872-d49bd4bc596e,704dc839-0de0-4797-b78b-1e6e488a47c7,83431d7a-3555-444e-a693-24a7111e782d,f028a1f4-214d-4f15-afda-8d2691912d5f]
A1 RELATED [Fieber,Schwarzer Stuhl/Blut im Stuhl] [1b5cc668-d357-43af-a863-799ebe288ffc,0884bccb-6582-4622-8581-5e970ede1ecb]

So my question would be two-fold:
(1) Did I missunderstood the function and if not
(2) how do I get rid of the error-message?

I already checked the topics towards setting the parameter limits.single_field_size_bytes: to a higher value, but even setting it to 99.000 does not change the error-message.

We are running Dremio 24.3.2 on-premise.

Thanks in advance, Daniel

Hello @daniel.schreiber

Can you share what exact Dremio version did you encounter this issue? And if possible, the query profile as well?

Hello @Rafay,

I opened a ticket in the meantime and will post the solution to my problem here, after the support helped me out.