Hello everyone,
I created the following view
CREATE OR REPLACE VIEW “VIEW”
AS
SELECT
TOKEN,
TYPE,
ARRAY_AGG(TEXT) AS SYMPTOMS_TEXT,
ARRAY_AGG(CODE) AS SYMPTOMS_CODE
FROM “TABLE”
GROUP BY TOKEN, TYPE;
The “TABLE” is an Iceberg file in an “Object Storage” (MinIO (S3)) and we create the view inside a schema in the “Spaces”. I can create the view without errors but when I run a simple SELECT * FROM “VIEW”, I receive the follwing error-message:
SQL-Fehler: SYSTEM ERROR: IndexOutOfBoundsException: index: 32760, length: 36 (expected: range(0, 32768))
SqlOperatorImpl HASH_AGGREGATE
Location 0:0:2
Fragment 0:0
[Error Id: 16e6c645-b686-476c-94e6-9170ca95c743 on dremio.zi.de:0]
(java.lang.IndexOutOfBoundsException) index: 32760, length: 36 (expected: range(0, 32768))
org.apache.arrow.memory.ArrowBuf.checkIndex():701
org.apache.arrow.memory.ArrowBuf.setBytes():765
org.apache.arrow.vector.BaseVariableWidthVector.setBytes():1264
org.apache.arrow.vector.BaseVariableWidthVector.set():1079
org.apache.arrow.vector.VarCharVector.set():249
com.dremio.sabot.op.aggregate.vectorized.arrayagg.VarcharArrayAggAccumulatorHolder.addItemToVector():48
com.dremio.sabot.op.aggregate.vectorized.arrayagg.VarcharArrayAggAccumulatorHolder.addItemToVector():23
com.dremio.sabot.op.aggregate.vectorized.arrayagg.BaseArrayAggAccumulatorHolder.addItem():80
com.dremio.sabot.op.aggregate.vectorized.arrayagg.BaseArrayAggAccumulator.accumulate():115
com.dremio.sabot.op.aggregate.vectorized.AccumulatorSet.accumulate():248
com.dremio.sabot.op.aggregate.vectorized.VectorizedHashAggOperator.accumulateForAllPartitions():1728
com.dremio.sabot.op.aggregate.vectorized.VectorizedHashAggOperator.consumeDataHelper():1088
com.dremio.sabot.op.aggregate.vectorized.VectorizedHashAggOperator.consumeData():1003
com.dremio.sabot.driver.SmartOp$SmartSingleInput.consumeData():267
com.dremio.sabot.driver.StraightPipe.pump():59
com.dremio.sabot.driver.Pipeline.doPump():124
com.dremio.sabot.driver.Pipeline.pumpOnce():114
com.dremio.sabot.exec.fragment.FragmentExecutor$DoAsPumper.run():565
com.dremio.sabot.exec.fragment.FragmentExecutor.run():480
com.dremio.sabot.exec.fragment.FragmentExecutor.access$1700():109
com.dremio.sabot.exec.fragment.FragmentExecutor$AsyncTaskImpl.run():1016
com.dremio.sabot.task.AsyncTaskWrapper.run():122
com.dremio.sabot.task.slicing.SlicingThread.mainExecutionLoop():249
com.dremio.sabot.task.slicing.SlicingThread.run():171
The TEXT variable is a VARCHAR field, no entry longer then 24 bytes and the CODE variable is a VARCHAR field aswell, with a fixed length of 36 bytes. An example would look like this
TOKEN | TYPE | TEXT | CODE | |
---|---|---|---|---|
A1 | MAIN | Schwindel | 8cda9a1a-a46f-4510-8872-d49bd4bc596e | |
A1 | MAIN | Fingerbeschwerden | 704dc839-0de0-4797-b78b-1e6e488a47c7 | |
A1 | MAIN | Mund-/Zungen-/Lippenbeschwerden | 83431d7a-3555-444e-a693-24a7111e782d | |
A1 | MAIN | Durchfall | f028a1f4-214d-4f15-afda-8d2691912d5f | |
A1 | RELATED | Fieber | 1b5cc668-d357-43af-a863-799ebe288ffc | |
A1 | RELATED | Schwarzer Stuhl/Blut im Stuhl | 0884bccb-6582-4622-8581-5e970ede1ecb |
What I expected from the ARRAY_AGG function was the following result:
TOKEN | TYPE | SYMPTOM_TEXT | SYMPTOM_CODE | |
---|---|---|---|---|
A1 | MAIN | [Schwindel,Fingerbeschwerden,Mund-/Zungen-/Lippenbeschwerden,Durchfall] | [8cda9a1a-a46f-4510-8872-d49bd4bc596e,704dc839-0de0-4797-b78b-1e6e488a47c7,83431d7a-3555-444e-a693-24a7111e782d,f028a1f4-214d-4f15-afda-8d2691912d5f] | |
A1 | RELATED | [Fieber,Schwarzer Stuhl/Blut im Stuhl] | [1b5cc668-d357-43af-a863-799ebe288ffc,0884bccb-6582-4622-8581-5e970ede1ecb] |
So my question would be two-fold:
(1) Did I missunderstood the function and if not
(2) how do I get rid of the error-message?
I already checked the topics towards setting the parameter limits.single_field_size_bytes: to a higher value, but even setting it to 99.000 does not change the error-message.
We are running Dremio 24.3.2 on-premise.
Thanks in advance, Daniel