Hi,
I get an error reading Parquet files with a custom build.
I have managed to build Dremio OSS from https://github.com/dremio/dremio-oss/ (with the flag -Ddremio.oss-only=true).
To get it to build, I followed the hints in the threads "BUILD FAILURE with -Ddremio.oss-only" (removed dremio-tpch-sample-data from the POM files) and "Dremio-oss build error" (manually add the jar to the local repo, or use a free repository).
After that I used the official Dockerfile from Dockerhub (https://github.com/dremio/dremio-cloud-tools/blob/master/images/dremio-oss/Dockerfile) and changed the download step to copy the local archive instead. I started the Docker container and tried to read a single Parquet file and an S3 storage with multiple files.
All of the 3 variants above give me an error reading the files, and in the queried tables most of the columns are missing data.
The log contains this error:
SqlOperatorImpl PARQUET_ROW_GROUP_SCAN
Location 0:0:8
Fragment 0:0
[Error Id: 2b04a6d4-01b4-41c4-a88c-b252a9be7a75 on 0076d817e848:0]
(org.apache.arrow.vector.util.SchemaChangeRuntimeException) Schema change error
com.dremio.common.exceptions.UserException.schemaChangeError():91
com.dremio.sabot.op.scan.ScanOperator.checkAndLearnSchema():394
com.dremio.sabot.op.scan.ScanOperator.setupReader():265
com.dremio.sabot.op.scan.ScanOperator.setup():249
com.dremio.sabot.driver.SmartOp$SmartProducer.setup():563
com.dremio.sabot.driver.Pipe$SetupVisitor.visitProducer():79
com.dremio.sabot.driver.Pipe$SetupVisitor.visitProducer():63
com.dremio.sabot.driver.SmartOp$SmartProducer.accept():533
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.StraightPipe.setup():102
com.dremio.sabot.driver.Pipeline.setup():68
com.dremio.sabot.exec.fragment.FragmentExecutor.setupExecution():391
com.dremio.sabot.exec.fragment.FragmentExecutor.run():273
com.dremio.sabot.exec.fragment.FragmentExecutor.access$1400():94
com.dremio.sabot.exec.fragment.FragmentExecutor$AsyncTaskImpl.run():711
com.dremio.sabot.task.AsyncTaskWrapper.run():112
com.dremio.sabot.task.single.DedicatedFragmentRunnable.run():47
java.util.concurrent.Executors$RunnableAdapter.call():511
java.util.concurrent.FutureTask.run():266
java.util.concurrent.ThreadPoolExecutor.runWorker():1149
java.util.concurrent.ThreadPoolExecutor$Worker.run():624
java.lang.Thread.run():748
If I pull the official Docker image from Dockerhub, the container can read the same files without issues.
Why does the default OSS build fail to compile without modifications?
What causes the custom build to fail reading the same parquet files?