RAW reflections causing duplicates

I have a direct connection to an Oracle database and created a RAW reflection for a table. When I query the table for one specific primary key value, Dremio returns 4 rows when there should be only one. Running exactly the same query in DBeaver returns only 1 row.

Query profile: defd2c96-6e1e-4329-bafd-182ff9c99d02.zip (35.3 KB)

@martinocando

It seems like you are getting one row per executor, since you have 4 executors. It is possible that you are storing your reflections on a NAS but still have the distributed path configured as pdfs:///

  • Where are your reflections stored?
  • Send us your dremio.conf
paths: {
  local: ${DREMIO_HOME}"/data"

  dist: "pdfs://"${DREMIO_HOME}"/data/pdfs"
}
services: {
  coordinator.enabled: true,
  coordinator.master.embedded-zookeeper.enabled: false,
  executor.enabled: false
}
zookeeper: "zk_zoo1:2181,zk_zoo2:2181,zk_zoo3:2181"
registration.publish-host:"${LOCAL_CONTAINER_IP}"

This is the master, and the executors have the same pdfs:// config. I believe we should change it to file://, am I right?

@martinocando

If you are writing reflections to an NFS volume, then yes, you need file:///
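
As a rough sketch of that change, the paths block in dremio.conf might look like the following. The mount point /mnt/dremio-dist is a hypothetical placeholder, not something taken from your setup; it assumes the NFS share is mounted at the same path on the coordinator and on every executor.

```
paths: {
  # local metadata and spill location, unchanged from your current config
  local: ${DREMIO_HOME}"/data"

  # shared NFS mount (hypothetical path), addressed with file:/// instead of pdfs://
  dist: "file:///mnt/dremio-dist"
}
```

The same dist value would need to go on all nodes, and each node restarted, before the change takes effect.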

Well, that change seems to have fixed a lot of things in our cluster. Not only is the duplication gone, but reflections have also been quite stable for the past week. So thank you very much for the tip.

Keep safe, and have a Merry Christmas.

@martinocando Glad you are back on track