Attempting to write a too large value for field with index

I’m getting the following error message when running a query on a CSV file in Blob storage:

Attempting to write a too large value for field with index 4. Size was 37468 but limit was 32000.

Any ideas?

Hi there!

I have the same error, but with different limit values. I’m testing the Windows version of Dremio on my notebook. Does anybody have a solution? Thanks!

“Attempting to write a too large value for field with index 45. Size was 65536 but limit was 65536.”

I have the same issue, and it’s reading from the raw text file with no casting or anything. It seems to break in a really odd place in the file, and I verified the file is valid UTF-8.

The really strange thing is that if I take a sample of the file around the byte region where it complains and import it, there is no problem.

I finally solved this issue, and here is what the cause was:

  1. The error from Dremio reported a particular byte offset (OFFSET below).
  2. I tried to find the line causing the issue by running:
    head -c OFFSET FILE | wc -l
  3. This returned a line count (LINE_NO), and I then looked up that line with:
    cat FILE | sed -n '1p;LINE_NOp;'
  4. It turns out that the LINE_NO this gave me was off by 460 lines in my case.
  5. The actual line causing the issue contained a character that turned out to be defined as my quote character (in this case a "). That was the real problem, but it was very hard to find because the byte offset appeared to be wrong; I basically had to filter and shift the file and re-import it into Dremio several times.
  6. Disabling the quote character by setting it to a control character solved the issue (see the sketch after this list for a scripted way to spot such fields).
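
For reference, here is a minimal sketch of how such a field could be found with a script instead of byte-offset arithmetic. This is not Dremio's actual parser; the file path, delimiter, quote character, and the 32000-byte limit are placeholders to adjust. Note that a stray quote character makes a CSV reader swallow delimiters (and even newlines) into a single field, which is exactly how one field can blow past the limit.

    # scan_fields.py -- minimal sketch, not Dremio's parser; adjust the constants.
    import csv

    PATH = "FILE.csv"     # hypothetical file name
    DELIMITER = ","
    QUOTECHAR = '"'       # set to an unused character to mimic "quoting disabled"
    LIMIT = 32000         # bytes, as reported in the error

    csv.field_size_limit(2 ** 31 - 1)  # let Python itself accept oversized fields
    with open(PATH, newline="", encoding="utf-8") as f:
        reader = csv.reader(f, delimiter=DELIMITER, quotechar=QUOTECHAR)
        for row in reader:
            for index, value in enumerate(row):
                size = len(value.encode("utf-8"))
                if size > LIMIT:
                    # reader.line_num is the physical line on which the row ended
                    print(f"line ~{reader.line_num}: field index {index} is {size} bytes")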

Can someone verify whether my method of finding the errant line is flawed, or whether the byte offset returned by Dremio in the error is incorrect?


I have the same issue with a SQL Server database source.

  UNSUPPORTED_OPERATION ERROR: Attempting to read a too large value for field with index 2. Size was 32219 but limit was 32000.

fieldIndex 2
size 32219
limit 32000
SqlOperatorImpl JDBC_SUB_SCAN
Location 1:0:6
Fragment 1:0

[Error Id: 5d63a69e-7db9-4602-b95b-b2ab71377163 on localhost:31010]

Any suggestion to work around this limitation?

Hi,
I’m also facing a similar kind of issue while creating raw reflections on a virtual dataset in Dremio.
Can anyone help me with this?

  UNSUPPORTED_OPERATION ERROR: Attempting to read a too large value for field with name description. Size was 42990 but limit was 32000.

Record 659054
Line 659054
Column 44878
Field description
fieldName description
size 42990
limit 32000
SqlOperatorImpl JSON_SUB_SCAN
Location 1:8:3
Fragment 1:0

@jalandhar,

You have a single field in one or more of your JSON objects that is larger than 32 KB. Determine which column it is and try filtering it out in your query.
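
If it helps to double-check which field and record are responsible, here is a minimal sketch that scans a newline-delimited JSON file and reports every string field larger than the limit. It assumes the source really is one JSON object per line; the path and the limit are placeholders.

    # find_big_json_fields.py -- minimal sketch, assuming newline-delimited JSON.
    import json

    PATH = "data.json"   # hypothetical file name
    LIMIT = 32000        # bytes, as reported in the error

    with open(PATH, encoding="utf-8") as f:
        for record_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue
            obj = json.loads(line)
            for name, value in obj.items():
                if isinstance(value, str):
                    size = len(value.encode("utf-8"))
                    if size > LIMIT:
                        print(f"record {record_no}: field '{name}' is {size} bytes")

The record numbers it prints should roughly line up with the Record/Line values shown in the error above.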

Thanks @ben,
Filtering is one solution, but how can I still use that field, create reflections, and run queries on it?
Having fields of widely varying sizes is very common in our datasets, and filtering them out each time is definitely a problem for us. Is there any setting where I can increase this size limit and use these datasets without any external changes?

Thanks,
Jalandhar

@jalandhar

Are you able to break the field into multiple fields?

I have the same issue with JSON files:
Attempting to read a too large value for field with name batch_text. Size was 78121 but limit was 32000.
I cannot register those files as a source, so filtering is not an option :frowning:
I have no control over the file generation, so I cannot break the field up without an additional process.
Is there any option to overcome the limitation?
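
In case an extra preprocessing step is acceptable after all, here is a minimal sketch of what one could look like: it rewrites newline-delimited JSON so that any oversized batch_text value is split into numbered chunk fields, each safely under the limit. The file names, the chunk size, and the assumption that the files are one object per line are mine; the chunks would then need to be concatenated back together in a VDS.

    # split_big_fields.py -- minimal preprocessing sketch (assumptions noted above).
    import json

    SRC = "batches.json"        # hypothetical input file
    DST = "batches_split.json"  # hypothetical output file
    FIELD = "batch_text"
    CHUNK = 30000               # bytes of UTF-8, safely under the 32000-byte limit

    def chunk_utf8(text, limit):
        """Split text into pieces whose UTF-8 encoding stays within `limit` bytes."""
        pieces, current, size = [], [], 0
        for ch in text:
            width = len(ch.encode("utf-8"))
            if size + width > limit and current:
                pieces.append("".join(current))
                current, size = [], 0
            current.append(ch)
            size += width
        if current:
            pieces.append("".join(current))
        return pieces

    with open(SRC, encoding="utf-8") as fin, open(DST, "w", encoding="utf-8") as fout:
        for line in fin:
            obj = json.loads(line)
            value = obj.get(FIELD)
            if isinstance(value, str) and len(value.encode("utf-8")) > CHUNK:
                for i, piece in enumerate(chunk_utf8(value, CHUNK), start=1):
                    obj[f"{FIELD}_{i}"] = piece   # batch_text_1, batch_text_2, ...
                del obj[FIELD]
            fout.write(json.dumps(obj, ensure_ascii=False) + "\n")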

It looks like it can be overridden by setting name globally, or per session with:
    ALTER SESSION SET name = 64000
It’s not clear what the consequences of this change are :slight_smile:

We are facing the same problem.
It is very difficult to find which line(s) raise the error; we have a lot of JSON files with many lines.
Because some lines reach the limit, the whole dataset can’t be used whenever we need this field: it is impossible to select it and impossible to build a reflection with it.
A solution or workaround would be very much appreciated: either a way to filter out the lines that exceed the limit, or a way to override the limitation.

Hey guys,

I am still hitting the field size limit in my datasets even after filtering out the column and creating a new VDS.

So here are my steps:

  1. I have a VDS with a field (say field_1) which exceeds the size limit of 32K bytes
  2. I created a new VDS by filtering out that field and selecting only a few of the other fields
  3. I am trying to join the new VDS with another VDS (which doesn’t have any field exceeding the size limit)

Now I’m getting the below error again:
Field ‘field_1’ exceeds the size limit of 32000 bytes.

I’m surprised that I’m getting the same error even after filtering out the field that exceeds the limit.

Can anyone help me with this?

Thanks,
Jalandhar

@jalandhar

The limit applies to the PDS. We have a roadmap item to increase these limits.

Thanks
Bali


Thanks @balaji.ramaswamy for the update

I’m querying data from Hive and get the same error: “com.dremio.common.exceptions.UserException: Field exceeds the size limit of 32000 bytes.”

Is there any way to increase the limit?

@Muffex

That’s the current limit Dremio supports; it is one of a documented set of limits.

Can you trim the offending field to < 32K?

Thanks
Bali
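
One detail worth noting for anyone trimming upstream of Dremio: the sizes in these errors look like UTF-8 byte counts rather than character counts (an assumption on my part), so trimming by character length alone may still leave a multi-byte text value over 32000 bytes. A minimal sketch of a byte-aware trim:

    # trim_to_bytes.py -- minimal sketch; the byte-count assumption is noted above.
    def trim_to_bytes(text: str, limit: int = 32000) -> str:
        encoded = text.encode("utf-8")
        if len(encoded) <= limit:
            return text
        # cut at the byte limit, then drop any partial multi-byte character at the end
        return encoded[:limit].decode("utf-8", errors="ignore")

    # e.g. 20000 two-byte characters = 40000 bytes, trimmed back under the limit
    print(len(trim_to_bytes("é" * 20000).encode("utf-8")))  # <= 32000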

Hello @balaji.ramaswamy, thanks for your answer.

We need to preserve all the information. Is there any way to increase the limit?

Regards
