It seems that Dremio immediately stops processing local files that contain things like invalid UTF-8 sequences (DATA_READ ERROR: Error parsing JSON - Invalid UTF-8 middle byte).
It would be convenient to have an option to skip invalid rows like this, instead of having to pre-process the files before they can be used in Dremio.
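In the meantime, the pre-processing step could look roughly like the sketch below. It assumes the file is newline-delimited JSON (one record per line), which may not match your data, and the function name `filter_valid_json_lines` is made up for illustration. It runs outside Dremio and does not use any Dremio API.

```python
import json


def filter_valid_json_lines(raw_lines):
    """Keep only lines that are valid UTF-8 and valid JSON.

    raw_lines: an iterable of bytes objects, one record per line
               (assumption: newline-delimited JSON).
    Returns (good_lines, skipped_count).
    """
    good_lines = []
    skipped = 0
    for raw in raw_lines:
        if not raw.strip():
            # Blank lines carry no record, so drop them without counting.
            continue
        try:
            text = raw.decode("utf-8")  # strict decoding rejects bad byte sequences
            json.loads(text)            # rejects rows that are not well-formed JSON
        except (UnicodeDecodeError, json.JSONDecodeError):
            skipped += 1
            continue
        good_lines.append(raw)
    return good_lines, skipped
```

To use it on a file, open the input in binary mode (`open(path, "rb")`), pass the file object as `raw_lines`, and write the returned lines to a new file that Dremio then reads.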
@gerald, that makes sense, and it's actually something we've been discussing internally. We don't have a timeline for this yet, but we'll announce it in the community when it's available.
Numeric value (9223372036854776000) out of range of long (-9223372036854775808 - 9223372036854775807) at [Source: com.dremio.exec.store.dfs.FSDataInputStreamWrapper@6e312551: com.dremio.exec.store.dfs.FSDataInputStreamWrapper$WrappedInputStream@7c8729e5; line: 2, column: 4530]
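For anyone hitting the same error, a rough way to locate the offending values before loading the file is sketched below. It assumes the record is available as a JSON string and only checks integer literals against the signed 64-bit range quoted in the message. The helper name `find_out_of_range_ints` is hypothetical, and this runs outside Dremio.

```python
import json

INT64_MIN = -2**63       # -9223372036854775808
INT64_MAX = 2**63 - 1    #  9223372036854775807


def find_out_of_range_ints(text):
    """Return the integer literals in a JSON document that do not fit in a signed 64-bit long."""
    found = []

    def check(token):
        # json.loads calls this hook with the raw string of every integer literal.
        value = int(token)
        if not INT64_MIN <= value <= INT64_MAX:
            found.append(value)
        return value

    json.loads(text, parse_int=check)
    return found
```

Calling it on each line of the file and printing the line number whenever the result is non-empty should point at the bad record.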
@Sukku What is the file format? If it is CSV or JSON, you can try the COPY INTO feature in 24.0 and see whether any of its options help. Are you able to send a sample CSV or JSON file that contains the bad record?