While accessing a collection (XYZ) in MongoDB, Dremio shows the error “Field ‘description’ exceeds the size limit of 32000 bytes”. As per the Dremio documentation, this is a limit in Dremio.
So I created a copy of the collection (XYZ_Copy) and dropped the “description” field; it is confirmed that the “description” field is deleted from the collection XYZ_Copy. However, when the XYZ_Copy collection is accessed from Dremio, it still shows the same error “Field ‘description’ exceeds the size limit of 32000 bytes”, even after the field has been dropped.
Dropping a field this way does not literally delete it in MongoDB; the field is only $unset. So internally the field still exists, and so does its metadata. Since Dremio reads that metadata, it still finds the field that was removed with $unset.
After troubleshooting, we are creating a new collection with only the needed fields, instead of copying the whole collection and $unset-ing a field.
The new collection with only the needed fields is now accessible from Dremio. We are working with it and will post updates.
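For reference, here is a rough mongosh sketch of both approaches. The projected fields (field1, field2) and the new collection name XYZ_Needed are placeholders I made up; only XYZ, XYZ_Copy and description come from the actual case:

```javascript
// Approach that did NOT clear the Dremio error for us:
// copy the collection and then $unset the oversized field.
db.XYZ.aggregate([{ $out: "XYZ_Copy" }]);
db.XYZ_Copy.updateMany({}, { $unset: { description: "" } });

// Approach that worked: write a new collection containing only the needed fields.
db.XYZ.aggregate([
  { $project: { field1: 1, field2: 1 } },  // keep only the fields you actually need (placeholders)
  { $out: "XYZ_Needed" }                   // materialize them into a new collection (placeholder name)
]);
```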
I’d like to share a solution for handling the 32000-byte limit in Dremio.
We had an issue accessing a MongoDB collection that had a field value of more than 32000 bytes, even though that field was not selected in Dremio.
The fix is to create a view in MongoDB with only the fields you need, leaving out the 32000-byte field that causes the error. Then access the view from Dremio; it is working well.
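A minimal mongosh sketch of that kind of view, assuming the oversized field is called description and using a placeholder view name:

```javascript
// Create a read-only view that excludes the oversized field,
// then point Dremio at the view instead of the base collection.
db.createView(
  "XYZ_view",                            // placeholder view name
  "XYZ",                                 // source collection
  [ { $project: { description: 0 } } ]   // drop the field that exceeds 32000 bytes
);
```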
@datalifenyc ALTER SESSION is only for the life of that session, so if you run it and then navigate to Admin > Support in the UI to look at the support keys, it will not be reflected there. The ALTER SESSION use case is when you want a certain value of a support key for just one query, or for a batch of queries fired within one session.
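For example, the difference in scope looks roughly like this. I’m assuming the support key in question is limits.single_field_size_bytes, so please confirm the exact key name and value for your Dremio version:

```sql
-- Session scope: applies only to queries run in this same session
-- and will not show up under the Admin > Support keys.
ALTER SESSION SET "limits.single_field_size_bytes" = 64000;

-- System scope: applies to all queries and is visible as a support key in the UI.
ALTER SYSTEM SET "limits.single_field_size_bytes" = 64000;
```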
I think what may have been happening for ALTER SESSION is that I was using separate scripts for ALTER SESSION and processing the data.
If I understand the documentation correctly, it appears that I would have to run ALTER SESSION in the same script as the COPY INTO or INSERT query that loads the data.
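If that’s right, something along these lines should work, with the ALTER SESSION and the load running in the same script. The table name, source path, and support key name below are placeholders/assumptions on my part:

```sql
-- Raise the per-field limit for this session only ...
ALTER SESSION SET "limits.single_field_size_bytes" = 64000;

-- ... then run the load in the same session/script so the setting applies to it.
COPY INTO my_target_table
  FROM '@my_source/path/to/files'
  FILE_FORMAT 'csv';
```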
Being able to tune this limit is fine, but is there another setting to retrieve the text only up to the limit when a field exceeds it? I think it is better to extract a truncated value than to fail.
Sorry Baladji, but the applied schema produces a column with more than 32k bytes. When the query runs, it fails …
My PDS is made of a CSV file (French style, with ; separators …); it seems one of the lines contains a column with a lot of characters (more than 32k) … If I run select * from pds limit 100 it is OK, while if I run select * from pds (without the limit) it fails, and I can’t identify which line failed.
Even if I run select substr(column,4) from PDS, it fails with the same error.
My dream would be for Dremio to apply the cast (left(), substr()) by itself.
Being able to change the limit is nice, but when you analyze logs or JSON they contain wild values, and you don’t want Dremio to fail; it should skip or cast.