Option to skip invalid rows

Hello,

It seems that Dremio immediately stops processing local files that contain things like invalid UTF-8 sequences (DATA_READ ERROR: Error parsing JSON - Invalid UTF-8 middle byte).

It would be convenient to have an option to skip invalid rows like this, instead of having to pre-process the files before they can be used in Dremio.

best,
Gerald

@gerald, makes sense. This is actually something we’ve been discussing internally. We don’t have a timeline for it yet, but we’ll announce it in the community when it’s available.

Here is one more case I just ran into:

Numeric value (9223372036854776000) out of range of long (-9223372036854775808 - 9223372036854775807) at [Source: com.dremio.exec.store.dfs.FSDataInputStreamWrapper@6e312551: com.dremio.exec.store.dfs.FSDataInputStreamWrapper$WrappedInputStream@7c8729e5; line: 2, column: 4530]
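
Until a skip option exists, both failures reported so far (invalid UTF-8 bytes and integers outside the 64-bit long range) can be filtered out in a pre-processing pass. The following is only a minimal sketch, not anything Dremio provides: it assumes the files are newline-delimited JSON (one record per line), and the names `in_long_range`, `is_valid_row`, and `clean_file` are made up for illustration.

```python
import json

# Bounds of a signed 64-bit long, matching the range quoted in the error message.
LONG_MIN = -(2**63)
LONG_MAX = 2**63 - 1


def in_long_range(value):
    """Return True if every integer nested inside value fits a signed 64-bit long."""
    if isinstance(value, bool):  # bool is a subclass of int in Python; always fine
        return True
    if isinstance(value, int):
        return LONG_MIN <= value <= LONG_MAX
    if isinstance(value, dict):
        return all(in_long_range(v) for v in value.values())
    if isinstance(value, list):
        return all(in_long_range(v) for v in value)
    return True  # strings, floats, and null are not checked here


def is_valid_row(raw_line):
    """Return True if raw_line (bytes) is valid UTF-8, valid JSON, and within long range."""
    try:
        record = json.loads(raw_line.decode("utf-8"))  # strict decoding
    except ValueError:  # covers UnicodeDecodeError and json.JSONDecodeError
        return False
    return in_long_range(record)


def clean_file(src_path, dst_path):
    """Copy the valid rows of src_path to dst_path; return (kept, skipped) counts."""
    kept = skipped = 0
    # Read as bytes so a bad UTF-8 sequence only invalidates its own line.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for raw_line in src:
            if not raw_line.strip():
                continue  # blank lines are dropped without being counted
            if is_valid_row(raw_line):
                dst.write(raw_line if raw_line.endswith(b"\n") else raw_line + b"\n")
                kept += 1
            else:
                skipped += 1
    return kept, skipped
```

Calling `clean_file("events.json", "events.clean.json")` (hypothetical file names) writes a copy that contains only the rows passing all three checks, and the returned counts show how many rows were dropped. Integers written in floating-point notation are parsed as floats and are not range-checked by this sketch.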

Do we have any option to skip invalid rows in the latest version of Dremio?

@Sukku What is the file format? If it is CSV or JSON, you can try the COPY INTO feature in 24.0 and see whether any of its options help. In that case, are you able to send a sample file that contains the bad record?

https://docs.dremio.com/cloud/sql/commands/copy-into-table/