Pointing Dremio to tab delimited text - ignore field > 65535?

I’m looking to use Dremio to point to an ADLS-hosted file which is tab-delimited text. One of the columns is a Remarks column, and its values sometimes run to more than 65535 characters.

Is there any way I can tell Dremio to ignore this column when creating the format?

I’ve taken a look and found the issue.

For my file format, I had the following settings:

Format: Text (delimited)
Field Delimiter: Tab
Quote: DoubleQuote (this was the root of my problem)
Comment: NumberSign
Escape: DoubleQuote
Extract Fieldnames: Ticked
Trim fieldnames: Ticked

After painfully looking through my large text file, I finally tracked it down to a field value containing double quotes, e.g.

Johnny "Pirate" Depp

Because Quote was set to DoubleQuote, this meant that the remainder of the file was treated as the rest of that line.
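
For anyone hunting for the same kind of row, a short script can do the search instead of reading the file by eye. This is only a minimal sketch in Python and has nothing to do with Dremio itself; the function name find_quoted_fields and the sample rows are made up for illustration, and it assumes the file is plain tab-delimited text.

```python
import io


def find_quoted_fields(lines, delimiter="\t", quote='"'):
    """Yield (line_number, column_index, field) for each field containing the quote character."""
    for line_number, line in enumerate(lines, start=1):
        # Split on the raw delimiter only; no quote handling on purpose,
        # so embedded quotes cannot hide anything from the scan.
        fields = line.rstrip("\r\n").split(delimiter)
        for column_index, field in enumerate(fields):
            if quote in field:
                yield line_number, column_index, field


# Made-up sample standing in for the real file.
sample = io.StringIO(
    "id\tname\tremarks\n"
    "1\tJohnny \"Pirate\" Depp\tnone\n"
    "2\tJane Doe\tplain remark\n"
)

hits = list(find_quoted_fields(sample))
for line_number, column_index, field in hits:
    print(f"line {line_number}, column {column_index}: {field}")
```

For a real file, the same function can be given an open file object, e.g. open(path, encoding="utf-8"), where path and the encoding are assumptions about your data. Line numbers are physical lines and column indexes start at 0.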

I then changed Quote from DoubleQuote to a custom value, ‘~’.

Is this the right way to go about it? My workaround feels hacky.

Thanks

@surreynorthern

Looks like the field delimiter was the issue, and what you did was the right method. Unfortunately, Dremio today does not automatically suggest a delimiter.

Thanks
Bali

As I work with my data and set the formats, I’m finding that most of the characters I could use as delimiters actually appear in the remarks, even the tilde in some of them.

Other ETL tools let you set the Quote equivalent to None, since the content of a tab-delimited field, or indeed a CSV, may not contain encapsulated strings.
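
To illustrate what a "Quote: None" setting means in practice, here is how Python's standard csv module behaves with quoting switched off. This is a sketch of the general idea in another tool, not a Dremio option, and the sample rows are made up.

```python
import csv
import io

# Made-up sample: the name contains double quotes, the remark contains tildes.
sample = io.StringIO(
    "id\tname\tremarks\n"
    "1\tJohnny \"Pirate\" Depp\tsaid ~hello~\n"
)

# QUOTE_NONE tells the reader to treat quote characters as ordinary data,
# so only the tab delimiter and the line ending carry any meaning.
reader = csv.reader(sample, delimiter="\t", quoting=csv.QUOTE_NONE)
rows = list(reader)

for row in rows:
    print(row)
```

With quoting disabled, no surrogate quote character has to be chosen at all, so nothing a user types into a remark can be mistaken for one. The trade-off is that a field can then never contain a literal tab or line break.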

I’m concerned that any surrogate character I pick, e.g. a caret or a curly bracket, may end up being entered by users in the data.
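
One way to at least check the data as it stands today is to scan the file for candidate characters that never occur in it. The following is a minimal sketch in Python; the function name unused_characters, the candidate list and the sample rows are all made up for illustration. It says nothing about what users may type tomorrow, so it narrows the choice without removing the concern.

```python
import io


def unused_characters(lines, candidates="~^|{}`\u00a7"):
    """Return the candidate characters that never occur in the data, in their original order."""
    remaining = set(candidates)
    for line in lines:
        remaining -= set(line)
        if not remaining:
            break  # every candidate already appears somewhere; stop early
    return [c for c in candidates if c in remaining]


# Made-up sample standing in for the real file.
sample = io.StringIO(
    "id\tremarks\n"
    "1\tapprox ~5 units\n"
    "2\tsee {note} ^ above\n"
)

free = unused_characters(sample)
print(free)
```

An empty result means every candidate is already present in the data, which is the situation described above for the tilde.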