It seems that Dremio stops processing a local file as soon as it hits bad input such as an invalid UTF-8 sequence (DATA_READ ERROR: Error parsing JSON - Invalid UTF-8 middle byte).
It would be convenient to have an option to skip invalid rows like this, rather than having to pre-process the files before they can be used in Dremio.
@gerald, makes sense – actually something we’ve been internally discussing. We don’t have a timeline for this yet, but we’ll announce on the community when it’s available.
Here is one more case I just ran into:
Numeric value (9223372036854776000) out of range of long (-9223372036854775808 - 9223372036854775807) at [Source: com.dremio.exec.store.dfs.FSDataInputStreamWrapper@6e312551: com.dremio.exec.store.dfs.FSDataInputStreamWrapper$WrappedInputStream@7c8729e5; line: 2, column: 4530]
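Until a skip option exists, a pre-processing pass can drop the offending rows before the file reaches Dremio. Here is a minimal sketch in Python, assuming the files are newline-delimited JSON (one record per line). The function names are made up for illustration, and none of this is a Dremio feature or API. It skips lines that are not valid UTF-8, are not valid JSON, or contain an integer outside the signed 64-bit range quoted in the error above.

```python
import json

# Signed 64-bit bounds, matching the "long" range in the error message.
INT64_MIN = -2**63
INT64_MAX = 2**63 - 1


def has_out_of_range_int(value):
    """Return True if any integer nested inside value does not fit in a signed 64-bit long."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; true/false are never out of range.
        return False
    if isinstance(value, int):
        return not (INT64_MIN <= value <= INT64_MAX)
    if isinstance(value, dict):
        return any(has_out_of_range_int(v) for v in value.values())
    if isinstance(value, list):
        return any(has_out_of_range_int(v) for v in value)
    return False


def is_clean_record(raw_line: bytes) -> bool:
    """Return True if the line is valid UTF-8, valid JSON, and has no oversized integers."""
    try:
        text = raw_line.decode("utf-8")  # strict decoding rejects invalid byte sequences
        record = json.loads(text)
    except ValueError:
        # UnicodeDecodeError and json.JSONDecodeError are both ValueError subclasses.
        return False
    return not has_out_of_range_int(record)


def filter_file(src_path, dst_path):
    """Copy clean lines from src_path to dst_path; return (kept, skipped) counts."""
    kept = skipped = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for raw_line in src:
            if not raw_line.strip():
                continue  # ignore blank lines without counting them
            if is_clean_record(raw_line):
                dst.write(raw_line)
                kept += 1
            else:
                skipped += 1
    return kept, skipped
```

Calling something like `kept, skipped = filter_file("events.json", "events.clean.json")` (hypothetical file names) writes a cleaned copy and reports how many rows were dropped. Reading in binary mode and decoding each line separately is what lets one bad byte sequence cost a single row instead of the whole file.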