DATA_READ_ERROR while trying to accelerate reflections

Ritik · August 29, 2022, 11:43am

Hi,

I have a physical dataset (purple icon) in Dremio, backed by a connection to an S3 bucket. The data files are tab-separated .tab files. When I try to create a reflection on this dataset, I get this error:

DATA_READ ERROR: Error processing input: , line=263, char=105146. Content parsed:
…
Failure while reading file FILE Happened at or shortly before byte position 107970.
…
Caused By (org.apache.hadoop.fs.s3a.AWSClientIOException) read on FILE com.amazonaws.SdkClientException: Data read has a different length than the expected: dataLength=105146; expectedLength=152111;

I have included only the relevant parts of the error above. I did not notice anything unusual about those lines in the files mentioned in the error.
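For anyone who wants to repeat that check on a downloaded copy of one of the files, here is a minimal sketch. It assumes the first line of the file has the expected number of columns and that fields are not quoted. The function name `mismatched_rows` is hypothetical and not part of Dremio.

```python
import csv


def mismatched_rows(path, delimiter="\t"):
    """Map line number -> field count for rows that differ from the first line.

    Assumption: the first line of the file defines the expected column count.
    """
    odd = {}
    with open(path, newline="", encoding="utf-8") as fh:
        # QUOTE_NONE: treat quote characters as ordinary data, so every
        # physical line is exactly one record.
        reader = csv.reader(fh, delimiter=delimiter, quoting=csv.QUOTE_NONE)
        expected = len(next(reader))
        for row in reader:
            if len(row) != expected:
                odd[reader.line_num] = len(row)
    return odd
```

An empty result for a file named in the error (for example, around line 263) would suggest that the row layout itself is consistent, at least as far as column counts go.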

What’s strange is that I have run the reflection job multiple times on the same dataset, and each time the error names a different file (the physical dataset is built on top of a folder containing several files), with a different line and character position.
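The last line of the trace carries two numbers that are worth comparing. A rough sketch that pulls them out of the message is below. Assumptions: `dataLength` is the number of bytes actually received and `expectedLength` is the object size reported for the file. The regular expression matches only the message format pasted above, and `read_shortfall` is a hypothetical helper name.

```python
import re

# Matches the "dataLength=...; expectedLength=...;" tail of the SdkClientException text.
LENGTH_RE = re.compile(r"dataLength=(\d+);\s*expectedLength=(\d+)")


def read_shortfall(message):
    """Return expectedLength - dataLength, or None if the fields are absent."""
    match = LENGTH_RE.search(message)
    if match is None:
        return None
    data_length, expected_length = int(match.group(1)), int(match.group(2))
    return expected_length - data_length


msg = ("Data read has a different length than the expected: "
       "dataLength=105146; expectedLength=152111;")
print(read_shortfall(msg))  # 46965
```

Under those assumptions, a positive result means the read stopped 46965 bytes short of the end of the file. Note also that `dataLength` (105146) equals the `char` position in the first error line, so the parser may simply have run out of input at that point.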

Is there any reason for this behavior? And what can be done to solve this issue?

Thanks

Benny_Chow · September 2, 2022, 6:04am

Hi Ritik, so you can query your dataset successfully without a reflection, but you get an error when you create a reflection? Can you post both job profiles? Thanks.
