Attempting to write a too large value for field with index

nick.latocha · August 6, 2019, 6:56am

I’m getting the following error message when running a query on a CSV file in Blob storage:

Attempting to write a too large value for field with index 4. Size was 37468 but limit was 32000.

Any ideas?

tomxor · September 20, 2019, 11:02am

Hi there!

I have the same error, but with different limit values. I’m testing the Windows version of Dremio on my notebook. Does anybody have a solution? Thanks!

“Attempting to write a too large value for field with index 45. Size was 65536 but limit was 65536.”

kprifogle · December 7, 2019, 1:11am

I have the same issue, and it’s reading from the raw text file with no casting or anything. It seems to break in a really odd place in the file, and I verified the file is valid UTF-8.

kprifogle · December 7, 2019, 1:12am

The really strange thing is that if I take a sample of the file around the byte region where it complains and import it, there is no problem.

kprifogle · December 7, 2019, 12:52pm

I finally solved this issue. Here is what the cause was:

The error from Dremio returned a particular byte offset, OFFSET.
I tried to find the line causing the issue by running:
head -c+OFFSET FILE | wc -l
This returned a line count, LINE_NO, which I then used to print the line:
cat FILE | sed -n '1p;LINE_NOp;'
It turns out that this LINE_NO was off by 460 lines in my case.
The actual line causing the issue contained a character that was defined as my quote character (in this case a "). It was very hard to find because the byte offset appeared to be wrong; I basically had to filter and shift the file and reimport it into Dremio several times.
Disabling the quote character by setting it to a control character solved the issue.

Can someone verify whether my method of finding the errant line is flawed, or whether the byte offset returned by Dremio in the error is incorrect?
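For anyone retracing these steps, here is a minimal Python sketch of the same idea. The helper names are hypothetical and nothing here is a Dremio API. The first function maps a byte offset to a 1-based line number; note that counting newlines before the offset, as `wc -l` does, gives the number of complete lines, so the line containing the offset is that count plus one. The second function lists lines with an odd number of quote characters, on the assumption (not confirmed in this thread) that a stray quote makes the reader treat many following lines as one field, so the reported offset can be far from the actual culprit.

```python
def line_number_at_offset(path, offset):
    """Return the 1-based number of the line that contains byte `offset`."""
    with open(path, "rb") as f:
        head = f.read(offset)  # bytes 0 .. offset-1
    # Each newline before the offset ends one complete line.
    return head.count(b"\n") + 1


def lines_with_unbalanced_quotes(path, quote=b'"'):
    """Yield (line_number, line) for lines with an odd number of quote bytes."""
    with open(path, "rb") as f:
        for number, line in enumerate(f, start=1):
            if line.count(quote) % 2 == 1:
                yield number, line
```

A line flagged by the second function is only a candidate: a quoted field that legitimately spans several lines would be flagged as well.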

stanley · May 21, 2020, 9:35pm

I have the same issue with a SQL Server database source.

Attempting to read a too large value for field with index 2. Size was 32219 but limit was 32000.

  UNSUPPORTED_OPERATION ERROR: Attempting to read a too large value for field with index 2. Size was 32219 but limit was 32000.

fieldIndex 2
size 32219
limit 32000
SqlOperatorImpl JDBC_SUB_SCAN
Location 1:0:6
Fragment 1:0

[Error Id: 5d63a69e-7db9-4602-b95b-b2ab71377163 on localhost:31010

Any suggestions for working around that limitation?

jalandhar · June 11, 2020, 1:57pm

Hi,
I’m also facing a similar issue while creating raw reflections on a virtual dataset in Dremio.
Can anyone help me with this?

  UNSUPPORTED_OPERATION ERROR: Attempting to read a too large value for field with name description. Size was 42990 but limit was 32000.

Record 659054
Line 659054
Column 44878
Field description
fieldName description
size 42990
limit 32000
SqlOperatorImpl JSON_SUB_SCAN
Location 1:8:3
Fragment 1:0

ben · June 11, 2020, 5:17pm

@jalandhar,

You have a single field in one or more of your JSON objects that is larger than 32 KB. Determine which column it is and try filtering it out in your query.
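One way to determine which column it is, sketched in Python outside of Dremio. This assumes the file is newline-delimited JSON with one object per line, and that the limit applies to the UTF-8 encoded length of a string value; both are assumptions, and `oversized_fields` is a hypothetical helper name.

```python
import json

LIMIT_BYTES = 32000  # the limit quoted in the error messages in this thread


def oversized_fields(path, limit=LIMIT_BYTES):
    """Yield (line_number, field_name, size_in_bytes) for string values over `limit`."""
    with open(path, encoding="utf-8") as f:
        for number, line in enumerate(f, start=1):
            if not line.strip():
                continue  # skip blank lines
            record = json.loads(line)
            for name, value in record.items():
                if isinstance(value, str):
                    size = len(value.encode("utf-8"))
                    if size > limit:
                        yield number, name, size
```

Only top-level string fields are checked here; nested objects would need a recursive walk.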

jalandhar · June 15, 2020, 6:48am

Thanks @ben,
Filtering is one solution, but how can I still use that field, create reflections, and run queries on it?
Fields of widely varying sizes are very common in our datasets, and filtering them each time is definitely a problem for us. Is there a setting that lets me increase this size limit so that I can use these datasets without any external changes?

Thanks,
Jalandhar

balaji.ramaswamy · June 22, 2020, 7:51am

@jalandhar

Are you able to break the field into multiple fields?
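If the files can be pre-processed before Dremio reads them, breaking a field up could look roughly like the Python sketch below. The function names and the `name_0`, `name_1`, ... naming scheme are made up for illustration, and the sketch assumes the limit is counted in UTF-8 bytes. It cuts only on character boundaries so that each piece is still valid text.

```python
def split_utf8(text, limit=32000):
    """Split `text` into pieces whose UTF-8 encoding is at most `limit` bytes each."""
    if limit < 4:
        raise ValueError("limit must be at least 4 bytes")
    data = text.encode("utf-8")
    pieces = []
    start = 0
    while start < len(data):
        end = min(start + limit, len(data))
        # If the byte at `end` is a continuation byte (0b10xxxxxx), the cut
        # would land inside a multi-byte character, so back up to its start.
        while end < len(data) and (data[end] & 0xC0) == 0x80:
            end -= 1
        pieces.append(data[start:end].decode("utf-8"))
        start = end
    return pieces or [""]


def split_field(record, name, limit=32000):
    """Return a copy of `record` with `name` replaced by name_0, name_1, ..."""
    result = {k: v for k, v in record.items() if k != name}
    for i, piece in enumerate(split_utf8(record[name], limit)):
        result[f"{name}_{i}"] = piece
    return result
```

The pieces can be concatenated back in order to recover the original value, provided the combined result is not itself subject to the same limit.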

dmm · June 25, 2020, 9:36am

I have the same issue with JSON files:
Attempting to read a too large value for field with name batch_text. Size was 78121 but limit was 32000.
I cannot register those files as a source, so filtering is not an option.
I have no control over the file generation, so I cannot break up the field without an additional process.
Is there any option to overcome the limitation?

dmm · June 25, 2020, 10:58am

Looks like the limit can be overridden by setting name globally
or
ALTER SESSION SET name = 64000
The consequences of this change are not clear.

gchampion · July 3, 2020, 3:37pm

We are facing the same problem.
It is very difficult to find which line(s) raise the error, since we have a lot of files with many lines in JSON format.
Because some lines reach the limit, the whole dataset can’t be used when we need this field: it is impossible to select the field or to build a reflection with it.
A solution or workaround would be much appreciated, either to filter out the lines that reach the limit or to override the limitation.
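Until something like that exists inside Dremio, the filtering could be done as a pre-processing step. The Python sketch below is one possible shape for it, assuming newline-delimited JSON and a limit counted in UTF-8 bytes; `fits` and `partition_jsonl` are hypothetical names. It copies each line either to a "kept" file or to a "rejected" file, so the oversized records are set aside rather than lost.

```python
import json


def fits(value, limit):
    """True if every string in `value` (including nested ones) is at most `limit` bytes."""
    if isinstance(value, str):
        return len(value.encode("utf-8")) <= limit
    if isinstance(value, dict):
        return all(fits(v, limit) for v in value.values())
    if isinstance(value, list):
        return all(fits(v, limit) for v in value)
    return True  # numbers, booleans and null have no size issue here


def partition_jsonl(src, kept, rejected, limit=32000):
    """Copy each line of `src` to `kept` or `rejected`; return (kept_count, rejected_count)."""
    counts = [0, 0]
    with open(src, encoding="utf-8") as fin, \
            open(kept, "w", encoding="utf-8") as fkeep, \
            open(rejected, "w", encoding="utf-8") as frej:
        for line in fin:
            if not line.strip():
                continue  # skip blank lines
            ok = fits(json.loads(line), limit)
            (fkeep if ok else frej).write(line)
            counts[0 if ok else 1] += 1
    return tuple(counts)
```

The "kept" file can then be used as the dataset, while the "rejected" file shows exactly which records exceeded the limit.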

jalandhar · July 30, 2020, 6:56am

Hey guys,

I am still facing the field size limit problem in my datasets, even after filtering out the column and creating a new VDS.

So here are my steps:

I have a VDS with a field (say, field_1) that exceeds the size limit of 32K bytes.
I created a new VDS by filtering out that field and selecting only a few other fields.
I am trying to join the new VDS with other VDSs (which don’t have any field over the size limit).

Now I’m getting the below error again:
Field ‘field_1’ exceeds the size limit of 32000 bytes.

I’m surprised that, even after filtering out the field that exceeds the limit, I’m getting the same error again.

Can anyone help me with this?

Thanks,
Jalandhar

balaji.ramaswamy · August 23, 2020, 5:43am

@jalandhar

The limit is on the PDS; we have a roadmap item to increase these limits.

Thanks
Bali

jalandhar · August 23, 2020, 8:26am

Thanks @balaji.ramaswamy for the update

Muffex · November 2, 2020, 10:14pm

I’m querying data from Hive and get the same error: “com.dremio.common.exceptions.UserException: Field exceeds the size limit of 32000 bytes.”

Is there any way to increase the limit?

balaji.ramaswamy · November 3, 2020, 7:44am

@Muffex

That’s the current limit Dremio supports; below is the complete set of limits.

Can you trim the offending field to < 32K?

Thanks
Bali

Muffex · November 3, 2020, 10:39am

Hello @balaji.ramaswamy, thanks for your answer.

We need to preserve all the information. Is there any way to increase the limit?

regards

Muffex · November 4, 2020, 9:21am

Hello @balaji.ramaswamy, thanks for your answer.

We need to preserve all the information. Is there any way to increase the limit?

regards
