Error with CTAS and STRUCT columns

Hey there!
On an Iceberg source, whenever I try to use a CTAS statement where one of the columns is a STRUCT, it fails with the following error:

         UNSUPPORTED_OPERATION ERROR: Type conversion error for column struct_column

  (java.lang.UnsupportedOperationException) Unsupported arrow type : Null
    com.dremio.exec.store.iceberg.SchemaConverter$1.visit():288
    com.dremio.exec.store.iceberg.SchemaConverter$1.visit():285
    org.apache.arrow.vector.types.pojo.ArrowType$Null.accept():286
    com.dremio.exec.store.iceberg.SchemaConverter.toIcebergType():284
    com.dremio.exec.store.iceberg.SchemaConverter.toIcebergColumn():271
    com.dremio.exec.store.iceberg.SchemaConverter.toIcebergColumn():259
    com.dremio.exec.store.iceberg.SchemaConverter.lambda$toIcebergSchema$3():238
    java.util.stream.ReferencePipeline$3$1.accept():195
    java.util.stream.ReferencePipeline$2$1.accept():177
    java.util.ArrayList$ArrayListSpliterator.forEachRemaining():1655
    java.util.stream.AbstractPipeline.copyInto():484
    java.util.stream.AbstractPipeline.wrapAndCopyInto():474
    java.util.stream.ReduceOps$ReduceOp.evaluateSequential():913
    java.util.stream.AbstractPipeline.evaluate():234
    java.util.stream.ReferencePipeline.collect():578
    com.dremio.exec.store.iceberg.SchemaConverter.toIcebergSchema():239
    com.dremio.exec.store.iceberg.SchemaConverter.toIcebergSchema():230
    com.dremio.exec.store.iceberg.IcebergUtils.getIcebergPartitionSpecFromTransforms():622
    com.dremio.exec.planner.sql.handlers.query.DataAdditionCmdHandler.convertToDrel():397
    com.dremio.exec.planner.sql.handlers.query.DataAdditionCmdHandler.getPlan():272
    com.dremio.exec.planner.sql.handlers.query.CreateTableHandler.doCtas():104
    com.dremio.exec.planner.sql.handlers.query.CreateTableHandler.getPlan():84
    com.dremio.exec.planner.sql.handlers.commands.HandlerToExec.plan():56
    com.dremio.exec.work.foreman.AttemptManager.plan():638
    com.dremio.exec.work.foreman.AttemptManager.lambda$run$4():519
    com.dremio.service.commandpool.ReleasableBoundCommandPool.lambda$getWrappedCommand$3():156
    com.dremio.service.commandpool.CommandWrapper.run():73
    com.dremio.context.RequestContext.run():103
    com.dremio.common.concurrent.ContextMigratingExecutorService.lambda$decorate$4():246
    com.dremio.common.concurrent.ContextMigratingExecutorService$ComparableRunnable.run():222
    java.util.concurrent.Executors$RunnableAdapter.call():515
    java.util.concurrent.FutureTask.run():264
    java.util.concurrent.ThreadPoolExecutor.runWorker():1128
    java.util.concurrent.ThreadPoolExecutor$Worker.run():628
    java.lang.Thread.run():829

To reproduce, I’ve simplified the query to its most basic form:

CREATE TABLE iceberg.my_schema.test_table
AS SELECT CONVERT_FROM('{"name": "Rafael"}', 'JSON') AS struct_column;

I’m using Dremio 25.0 CE on Docker.

Upon inspecting Dremio’s source code for v25.0, I found that STRUCT types are supported, so my hypothesis was that the problem resided in the CTAS statement itself.
To test the hypothesis, I split the above code into separate CREATE TABLE and INSERT INTO statements, and it worked.
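
For reference, this is roughly what the two-statement version looks like. It is only a sketch: the STRUCT definition assumes the JSON carries a single name field, so adjust the schema to your actual data and double-check the STRUCT<...> DDL syntax against the Dremio docs for your version:

CREATE TABLE iceberg.my_schema.test_table (
  struct_column STRUCT<name: VARCHAR>
);

INSERT INTO iceberg.my_schema.test_table
SELECT CONVERT_FROM('{"name": "Rafael"}', 'JSON') AS struct_column;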

@rcosta.esapiens The other workaround you can try is to create a VDS with the CONVERT_FROM and then CTAS from the VDS.
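
For example, something along these lines (the space and view names are just placeholders; CREATE VIEW creates a VDS in Dremio):

CREATE VIEW my_space.struct_vds AS
SELECT CONVERT_FROM('{"name": "Rafael"}', 'JSON') AS struct_column;

CREATE TABLE iceberg.my_schema.test_table
AS SELECT * FROM my_space.struct_vds;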


CTAS with CONVERT_FROM does not work because we have trouble passing the struct data type to Iceberg directly from the CONVERT_FROM output. I would recommend using two statements, as you found out. That is also better because you can be explicit about the struct schema. (Hopefully, you know this ahead of time.)
