The null value is shown as "empty text"

We have a custom plugin, and I found a weird issue. For null values, we set the value to blank. Generally it is shown as null, but sometimes the value is shown as “empty text”. In our testing, we get different results when we run the same query many times. I am wondering whether this is a Dremio bug? At the very least, the result should be consistent for one query.

Our Dremio version: 3.2.4-201906051751050278-1bcce62

@popejune, what is the source here? Is this a table in MySQL for example, or Parquet files in HDFS?

@ben, it is a custom plugin that reads from a REST API. We are using dremio-kdb-plugin (https://github.com/rymurr/dremio-kdb-plugin) as a reference.

Perhaps @rymurr can speak to how “empty text” vs “null” is handled for KDB datasets in Dremio.

Hey @ben and @popejune

Thanks for taking a look at the kdb connector! It's not the ideal template for a plugin, but it's usable.

If the data is being shown as an empty string rather than null, I would take a look at how the Arrow buffer is being constructed. The StringNullCheck class in the kdb plugin makes assumptions about what null is, based on what kdb considers null. Arrow maintains a data buffer and a null check buffer, and if the null check buffer says a value is not null, then it will show whatever data lives in the data buffer. Chances are your null check is not marking those empty strings as null (I would guess because your version of null differs from kdb's). Just a guess though… I would need more details of your plugin to say for sure.
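To make the distinction between the two buffers concrete, here is a minimal sketch in plain Java. It is a simplified model of a variable-width string column (a validity bitmap, an offsets array, and a data array), not the Arrow API and not code from the kdb plugin; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Simplified model of an Arrow-style string column.
// NOT the Arrow API: StringColumnSketch and its members are hypothetical.
final class StringColumnSketch {
    final byte[] validity; // one bit per slot: 1 = value present, 0 = null
    final int[] offsets;   // slot i spans data[offsets[i] .. offsets[i + 1])
    final byte[] data;     // concatenated UTF-8 bytes of all values

    StringColumnSketch(byte[] validity, int[] offsets, byte[] data) {
        this.validity = validity;
        this.offsets = offsets;
        this.data = data;
    }

    // A slot is null when its validity bit is 0 (least-significant bit first).
    boolean isNull(int i) {
        return (validity[i >> 3] & (1 << (i & 7))) == 0;
    }

    // The data buffer is only consulted when the validity bit is set.
    String get(int i) {
        if (isNull(i)) {
            return null;
        }
        int start = offsets[i];
        int length = offsets[i + 1] - start;
        return new String(data, start, length, StandardCharsets.UTF_8);
    }

    // Three slots: zero-length and null, "value", zero-length and valid.
    static StringColumnSketch example() {
        byte[] data = "value".getBytes(StandardCharsets.UTF_8);
        int[] offsets = {0, 0, 5, 5};
        byte[] validity = {0b0000_0110}; // slot 0 null, slots 1 and 2 valid
        return new StringColumnSketch(validity, offsets, data);
    }

    public static void main(String[] args) {
        StringColumnSketch col = example();
        for (int i = 0; i < 3; i++) {
            String v = col.get(i);
            System.out.println(i + ": " + (v == null ? "null" : "\"" + v + "\""));
        }
    }
}
```

Slots 0 and 2 hold identical (zero-length) data, yet only slot 0 reads back as null, because only its validity bit is cleared.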

Best,
Ryan

Thanks @ben for your help.
Thanks @rymurr for your response.

Actually, I did NOT write any nulls. All the values are “” (that is, when I call ArrowWriter to write the data, val is an array containing only strings or “”, like ["", "", "value", "", ""]). But it’s strange that some “” are shown as null, while others are shown as “empty text”.

ArrowWriter writer = WriterBuilder.build(fields.get(name), vectors.get(name), name, val);
dataSize = writer.write(allocator);

All the values are written via the logic below (in StringAllocator). I did not see any issue here when s = "". I still think it is a Dremio bug, since all the values are “”, but they are displayed differently in the Dremio portal.

            for (String s : ((String[]) o)) {
                byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
                int wordLength = bytes.length;
                offsets.writeInt(wordLength + offset);
                offset += wordLength;
                data.writeBytes(bytes);
            }

Hey @popejune

Dremio will display exactly what the Arrow buffer is giving it. I would double check that your Arrow buffer is being constructed correctly, and especially that the null check portion of the buffer construction is marking the correct values as null.

If something is being marked as null it is because the nullability portion of the arrow buffer is telling Dremio to treat the value as null… regardless of the underlying data values. This is all done via generated code in the nullcheck package for the kdb plugin. Note there could be a bug in the kdb plugin! It doesn’t have extensive unit tests!
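One way to check this in a unit test is to build the nullability bits for a known input and inspect them slot by slot. The following is a minimal sketch in plain Java under stated assumptions: it does not use the Arrow, Dremio, or kdb plugin classes, the names `ValiditySketch`, `buildValidity`, and `describe` are hypothetical, and the rule for what counts as null is passed in explicitly because that rule is plugin-specific. The bitmap here is a fresh Java array, which is zero-filled; whether the buffer used by a real plugin starts out zeroed is something to verify, not something this sketch shows.

```java
import java.util.function.Predicate;

// Hypothetical helper for inspecting nullability bits; not part of
// Arrow, Dremio, or the kdb plugin.
final class ValiditySketch {

    // Builds a bitmap with one bit per value: 1 = present, 0 = null.
    // Every slot is decided explicitly by the supplied rule.
    static byte[] buildValidity(String[] values, Predicate<String> treatAsNull) {
        byte[] bitmap = new byte[(values.length + 7) / 8]; // zero-filled by Java
        for (int i = 0; i < values.length; i++) {
            if (!treatAsNull.test(values[i])) {
                bitmap[i >> 3] |= (byte) (1 << (i & 7));
            }
        }
        return bitmap;
    }

    static boolean isNull(byte[] bitmap, int i) {
        return (bitmap[i >> 3] & (1 << (i & 7))) == 0;
    }

    // Renders the first `count` slots as 'N' (null) or 'V' (valid).
    static String describe(byte[] bitmap, int count) {
        StringBuilder sb = new StringBuilder(count);
        for (int i = 0; i < count; i++) {
            sb.append(isNull(bitmap, i) ? 'N' : 'V');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] vals = {"", "", "value", "", ""};

        // Rule A: empty strings count as null.
        byte[] a = buildValidity(vals, s -> s == null || s.isEmpty());
        System.out.println(describe(a, vals.length)); // NNVNN

        // Rule B: only Java null counts as null, so "" stays a valid empty string.
        byte[] b = buildValidity(vals, s -> s == null);
        System.out.println(describe(b, vals.length)); // VVVVV
    }
}
```

The same input yields all-null-except-one or all-valid depending purely on the rule, which is why the rule should be applied to every slot rather than left to whatever bits the buffer already contains.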

Best,

Ryan