NullPointerException when Creating Reflection

Hey,

I am running Dremio 23.0.1 in a Docker environment, using the following docker-compose file:

...
  
  dremio-app:
    hostname: dremio-app
    container_name: dremio_app
    user: root
    image: ${EXT_REGISTRY}/dremio/dremio-oss:23.0.1
    ports:
      - 9047:9047 # UI
      - 31010:31010 # ODBC clients
      - 32010:32010 # Arrow Flight clients
      - 2181:2181   # ZooKeeper
      - 45678:45678 # internode communication
    volumes:
      - dremio_app:/opt/dremio/data
      - ./dremio/backup:/tmp/backup:z
    restart: unless-stopped
    depends_on:
      - dremio-app-init

Currently there is one MySQL database connected. The query fetches data from various datasets created in Dremio. I am having trouble creating reflections for some datasets because a NullPointerException is thrown. Running the query (or a preview) works fine and returns a valid result set, but as soon as I try to create a reflection, the error occurs.

2022-11-08 09:48:50,120 [1c95d9ff-5bc4-e26d-0ece-71f2a9a87200/0:foreman-planning] ERROR c.d.s.commandpool.CommandWrapper - command 1c95d9ff-5bc4-e26d-0ece-71f2a9a87200/0:foreman-planning failed
com.dremio.common.exceptions.UserException: NullPointerException
        at com.dremio.common.exceptions.UserException$Builder.build(UserException.java:907)
        at com.dremio.exec.planner.sql.SqlExceptionHelper.coerceException(SqlExceptionHelper.java:126)
        at com.dremio.service.reflection.refresh.RefreshHandler.getPlan(RefreshHandler.java:275)
        at com.dremio.exec.planner.sql.handlers.commands.HandlerToExec.plan(HandlerToExec.java:59)
        at com.dremio.exec.work.foreman.AttemptManager.plan(AttemptManager.java:502)
        at com.dremio.exec.work.foreman.AttemptManager.lambda$run$4(AttemptManager.java:400)
        at com.dremio.service.commandpool.ReleasableBoundCommandPool.lambda$getWrappedCommand$3(ReleasableBoundCommandPool.java:137)
        at com.dremio.service.commandpool.CommandWrapper.run(CommandWrapper.java:62)
        at com.dremio.context.RequestContext.run(RequestContext.java:96)
        at com.dremio.common.concurrent.ContextMigratingExecutorService.lambda$decorate$3(ContextMigratingExecutorService.java:199)
        at com.dremio.common.concurrent.ContextMigratingExecutorService$ComparableRunnable.run(ContextMigratingExecutorService.java:180)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.NullPointerException: null
        at com.dremio.exec.planner.cost.RelMdPercentageOriginalRows.getPercentageOriginalRows(RelMdPercentageOriginalRows.java:73)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows_$(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows_$(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows(Unknown Source)
        at org.apache.calcite.rel.metadata.RelMetadataQuery.getPercentageOriginalRows(RelMetadataQuery.java:333)
        at org.apache.calcite.rel.metadata.RelMdPercentageOriginalRows.getPercentageOriginalRows(RelMdPercentageOriginalRows.java:120)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows_$(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows_$(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows(Unknown Source)
        at org.apache.calcite.rel.metadata.RelMetadataQuery.getPercentageOriginalRows(RelMetadataQuery.java:333)
        at com.dremio.exec.planner.cost.RelMdPercentageOriginalRows.getPercentageOriginalRows(RelMdPercentageOriginalRows.java:46)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows_$(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows_$(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows(Unknown Source)
        at org.apache.calcite.rel.metadata.RelMetadataQuery.getPercentageOriginalRows(RelMetadataQuery.java:333)
        at org.apache.calcite.rel.metadata.RelMdPercentageOriginalRows.getPercentageOriginalRows(RelMdPercentageOriginalRows.java:120)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows_$(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows_$(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows(Unknown Source)
        at org.apache.calcite.rel.metadata.RelMetadataQuery.getPercentageOriginalRows(RelMetadataQuery.java:333)
        at org.apache.calcite.rel.metadata.RelMdPercentageOriginalRows.getPercentageOriginalRows(RelMdPercentageOriginalRows.java:120)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows_$(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows_$(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows(Unknown Source)
        at org.apache.calcite.rel.metadata.RelMetadataQuery.getPercentageOriginalRows(RelMetadataQuery.java:333)
        at com.dremio.exec.planner.cost.RelMdPercentageOriginalRows.getPercentageOriginalRows(RelMdPercentageOriginalRows.java:58)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows_$(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows_$(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows(Unknown Source)
        at org.apache.calcite.rel.metadata.RelMetadataQuery.getPercentageOriginalRows(RelMetadataQuery.java:333)
        at com.dremio.exec.planner.cost.RelMdPercentageOriginalRows.getPercentageOriginalRows(RelMdPercentageOriginalRows.java:46)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows_$(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows_$(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows(Unknown Source)
        at org.apache.calcite.rel.metadata.RelMetadataQuery.getPercentageOriginalRows(RelMetadataQuery.java:333)
        at com.dremio.exec.planner.cost.RelMdPercentageOriginalRows.getPercentageOriginalRows(RelMdPercentageOriginalRows.java:74)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows_$(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows_$(Unknown Source)
        at GeneratedMetadata_PercentageOriginalRowsHandler.getPercentageOriginalRows(Unknown Source)
        at org.apache.calcite.rel.metadata.RelMetadataQuery.getPercentageOriginalRows(RelMetadataQuery.java:333)
        at org.apache.calcite.rel.rules.MultiJoinOptimizeBushyRule.onMatch(MultiJoinOptimizeBushyRule.java:131)
        at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:321)
        at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:557)
        at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:416)
        at org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:281)
        at org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74)
        at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:212)
        at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:199)
        at com.dremio.exec.planner.DremioHepPlanner.findBestExp(DremioHepPlanner.java:74)
        at com.dremio.exec.planner.sql.handlers.PrelTransformer.lambda$transform$0(PrelTransformer.java:530)
        at com.dremio.exec.planner.sql.handlers.PrelTransformer.doTransform(PrelTransformer.java:594)
        at com.dremio.exec.planner.sql.handlers.PrelTransformer.transform(PrelTransformer.java:573)
        at com.dremio.exec.planner.sql.handlers.PrelTransformer.convertToDrel(PrelTransformer.java:261)
        at com.dremio.exec.planner.sql.handlers.PrelTransformer.convertToDrelMaintainingNames(PrelTransformer.java:374)
        at com.dremio.service.reflection.refresh.RefreshHandler.getPlan(RefreshHandler.java:174)
        ... 13 common frames omitted
2022-11-08 09:48:50,123 [1c95d9ff-5bc4-e26d-0ece-71f2a9a87200:foreman] INFO  c.d.exec.work.foreman.AttemptManager - 1c95d9ff-5bc4-e26d-0ece-71f2a9a87200: State change requested ENQUEUED --> FAILED, Exception com.dremio.exec.work.foreman.ForemanException: Unexpected exception during fragment initialization: com.dremio.common.exceptions.UserException: NullPointerException
2022-11-08 09:48:50,142 [async-query-logger5] INFO  query.logger - Query: 1c95d9ff-5bc4-e26d-0ece-71f2a9a87200; outcome: FAILED

I have attached the profiles for both the query and reflection:

query.zip (69,5 KB)

raw reflection.zip (64,3 KB)

Hope someone can help me out!

@tha The NPE is certainly not good. I see that you are creating the reflection on a VDS that sits on top of data.hypervisor, namely Dashboards.virtual_machines.vm_stats_per_host_cluster. Is that right?

Yes, this VDS queries other VDSs, which also have their own reflections. That should not be a problem, right?

@tha

What happens if you run a query on Dashboards.virtual_machines.vm_stats_per_host_cluster?

It’s odd that this NPE would happen with a REFRESH REFLECTION job but not with a normal SQL query. The NPE occurs while the planner is estimating row counts during join ordering (the MultiJoinOptimizeBushyRule frame in the stack trace). We would need a full repro to debug further.
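
For anyone wondering how a row-count estimate can end in an NPE, here is a minimal Java sketch of one common cause. Whether it applies to this case is an assumption that has not been checked against the Dremio source: Calcite's RelMetadataQuery.getPercentageOriginalRows returns a Double that may be null when no estimate is available, and arithmetic on such a value auto-unboxes it. The class and method names below are hypothetical and are not Dremio or Calcite code.

```java
/** Hypothetical illustration only; not Dremio or Calcite source code. */
final class NullUnboxingSketch {

    /** Stand-in for a metadata lookup that may have no estimate (returns null then). */
    static Double percentageOriginalRows(boolean estimateAvailable) {
        if (estimateAvailable) {
            return 0.5;
        }
        return null;
    }

    /** Unsafe: the multiplication auto-unboxes both values, so a null input throws NullPointerException. */
    static double combineUnsafe(Double left, Double right) {
        return left * right;
    }

    /** Safe: treat a missing estimate as "unknown" and propagate null instead of unboxing. */
    static Double combineSafe(Double left, Double right) {
        if (left == null || right == null) {
            return null;
        }
        return left * right;
    }

    public static void main(String[] args) {
        Double known = percentageOriginalRows(true);
        Double unknown = percentageOriginalRows(false);

        // The null-checked variant reports "unknown" as null.
        System.out.println("safe result: " + combineSafe(known, unknown));

        // The unchecked variant fails as soon as one side has no estimate.
        try {
            combineUnsafe(known, unknown);
        } catch (NullPointerException e) {
            System.out.println("NPE from unboxing a null estimate");
        }
    }
}
```

Running main shows combineSafe returning null for the missing estimate, while combineUnsafe throws. If something similar is happening here, it would be a planner-side issue and not something wrong with the query itself.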

@tha
Can you please try building the same reflection on 23.1 and let us know the result?