Dremio remote worker issue

I have set up Dremio on 2 VMs to create a cluster.

VM1: Has Hadoop, and Dremio configured as a master with no execution
VM2: Has only Dremio, configured as a worker

When I ran Dremio as an execution node on VM1, it worked fine with Hive and HDFS. But when I switched to a cluster where execution runs only on the worker, it started to fail.

I don’t have Kerberos or LDAP on either machine, and for simplicity I have also disabled the firewalls on both VMs. The core-site.xml and hdfs-site.xml are in the Dremio conf directory.
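For reference, the master/worker split in dremio.conf is roughly as below (a sketch with placeholder hostnames, not my exact files):

  # VM1 (master, no execution)
  services: {
    coordinator.enabled: true,
    coordinator.master.enabled: true,
    executor.enabled: false
  }

  # VM2 (worker only)
  services: {
    coordinator.enabled: false,
    executor.enabled: true
  }
  zookeeper: "vm1-hostname:2181"   # placeholder for the master VM's hostname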

I was able to make Presto work on the same setup (VM2 had Presto, VM1 had Hadoop).

Can someone please help me figure out what I am doing wrong? I have attached my core-site.xml and screenshots of the servers.

Error 1: HDFS
Failure reading JSON file - Could not obtain block: BP-82876869-127.0.0.1-1527287107078:blk_1073744756_3946 file=/user/dremio/simplified.json
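(For reference, the block it complains about can be inspected from the Hadoop VM with something like the command below; I am noting it as the check to run rather than output I captured.)

  hdfs fsck /user/dremio/simplified.json -files -blocks -locations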

Error 2: Hive
DATA_READ ERROR: Failed to initialize Hive record reader

Dataset split key: hdfs://sandbox.kylo.io/user/hive/warehouse/test.db/t_customer__0

Table properties:
  columns.types -> timestamp:varchar(30):timestamp:timestamp:varchar(3):varchar(2):varchar(30):int:timestamp:varchar(255)
  location -> hdfs://sandbox.kylo.io/user/hive/warehouse/test.db/t_customer
  columns -> reportdate,customer_nr,start_validity_date,end_validity_date,domicile,intercompany,customer_attribute1,is_spe,last_modified,modified_by
  COLUMN_STATS_ACCURATE -> false
  numRows -> -1
  numFiles -> 1
  serialization.ddl -> struct t_customer { timestamp reportdate, varchar(30) customer_nr, timestamp start_validity_date, timestamp end_validity_date, varchar(3) domicile, varchar(2) intercompany, varchar(30) customer_attribute1, i32 is_spe, timestamp last_modified, varchar(255) modified_by}
  transient_lastDdlTime -> 1537361897
  rawDataSize -> -1
  columns.comments ->
  totalSize -> 1163016
  bucket_count -> 0
  file.outputformat -> org.apache.hadoop.hive.ql.io.RCFileOutputFormat
  serialization.lib -> org.apache.hadoop.hive.serde2.columnar.LazyBinaryColumnarSerDe
  presto_version -> 0.208
  file.inputformat -> org.apache.hadoop.hive.ql.io.RCFileInputFormat
  presto_query_id -> 20180919_125816_00012_uetq9
  name -> test.t_customer

SqlOperatorImpl HIVE_SUB_SCAN Location 0:0:7
SqlOperatorImpl HIVE_SUB_SCAN Location 0:0:7
Fragment 0:0
[Error Id: 2df0ce62-cba8-4339-ae9f-d43da058ccfc on localhost:31010]

(org.apache.hadoop.hdfs.BlockMissingException) Could not obtain block: BP-82876869-127.0.0.1-1527287107078:blk_1073742969_2157 file=/user/hive/warehouse/test.db/t_customer/20180925_162010_00014_utnbv_090cff90-1bf9-47df-bad7-65a2c9c52a1c
  org.apache.hadoop.hdfs.DFSInputStream.refetchLocations():1052
  org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode():1036
  org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode():1015
  org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo():647
  org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy():926
  org.apache.hadoop.hdfs.DFSInputStream.read():982
  java.io.DataInputStream.readFully():195
  java.io.DataInputStream.readFully():169
  org.apache.hadoop.hive.ql.io.RCFile$Reader.init():1462
  org.apache.hadoop.hive.ql.io.RCFile$Reader.():1363
  org.apache.hadoop.hive.ql.io.RCFile$Reader.():1343
  org.apache.hadoop.hive.ql.io.RCFileRecordReader.():100
  org.apache.hadoop.hive.ql.io.RCFileInputFormat.getRecordReader():58
  com.dremio.exec.store.hive.exec.HiveRCFileReader.internalInit():52
  com.dremio.exec.store.hive.exec.HiveAbstractReader.setup():198
  com.dremio.sabot.op.scan.ScanOperator$1.run():189
  com.dremio.sabot.op.scan.ScanOperator$1.run():185
  java.security.AccessController.doPrivileged():-2
  javax.security.auth.Subject.doAs():422
  org.apache.hadoop.security.UserGroupInformation.doAs():1836
  com.dremio.sabot.op.scan.ScanOperator.setupReaderAsCorrectUser():185
  com.dremio.sabot.op.scan.ScanOperator.setupReader():177
  com.dremio.sabot.op.scan.ScanOperator.setup():163
  com.dremio.sabot.driver.SmartOp$SmartProducer.setup():560
  com.dremio.sabot.driver.Pipe$SetupVisitor.visitProducer():79
  com.dremio.sabot.driver.Pipe$SetupVisitor.visitProducer():63
  com.dremio.sabot.driver.SmartOp$SmartProducer.accept():530
  com.dremio.sabot.driver.StraightPipe.setup():102
  com.dremio.sabot.driver.StraightPipe.setup():102
  com.dremio.sabot.driver.StraightPipe.setup():102
  com.dremio.sabot.driver.StraightPipe.setup():102
  com.dremio.sabot.driver.StraightPipe.setup():102
  com.dremio.sabot.driver.StraightPipe.setup():102
  com.dremio.sabot.driver.StraightPipe.setup():102
  com.dremio.sabot.driver.Pipeline.setup():58
  com.dremio.sabot.exec.fragment.FragmentExecutor.setupExecution():344
  com.dremio.sabot.exec.fragment.FragmentExecutor.run():234
  com.dremio.sabot.exec.fragment.FragmentExecutor.access$800():86
  com.dremio.sabot.exec.fragment.FragmentExecutor$AsyncTaskImpl.run():591
  com.dremio.sabot.task.AsyncTaskWrapper.run():107
  com.dremio.sabot.task.slicing.SlicingThread.run():102

Dremio Issue.zip (36.1 KB)

@cbhatnagar7101,
For both errors, are there query/job profiles you can share?

Error 1: Can you share the simplified.json file?

Error 2: Can you access t_customer__0 from the Hive shell? Can you attach the output of
hive> DESCRIBE FORMATTED t_customer__0;
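If DESCRIBE works, a quick sanity check that the data itself is readable from Hive would be something like the query below (a sketch only; adjust the database name if it is not test):

  hive> SELECT * FROM test.t_customer LIMIT 5;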

Hi Ben,

I did not find a job profile for the HDFS query.

I have attached the profile for the Hive query, the file simplified.json, and the DESCRIBE output for the table t_customer.

Thanks again for your help.

issue.zip (10.1 KB)

Hi Ben,

I tried another configuration: I changed the worker to be a master and ran it again.

The errors I got were the same. I think it has to do with Dremio’s remote connectivity to my Hadoop VM.
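If it really is an addressing problem (the block pool ID in both errors contains 127.0.0.1, which makes me suspect the DataNode registered with the loopback address, though that is just a guess), one thing I may try is telling the HDFS client to reach DataNodes by hostname instead of the reported IP, via the hdfs-site.xml that Dremio reads:

  <property>
    <name>dfs.client.use.datanode.hostname</name>
    <value>true</value>
  </property>

I have not confirmed whether this helps on my setup.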

@cbhatnagar7101, are all queries against datasets in these sources failing, or just the particular datasets you mentioned above?

Hey Ben… it’s for all the datasets.

Thanks, guys. I figured it out. It was the way MySQL was behaving when it was being targeted. I added a new hosts entry to tackle this, and brought the Hive services up with another one. It worked.
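For anyone who hits the same thing, the hosts entries are just /etc/hosts mappings along these lines (the IP is a placeholder for the Hadoop VM's address; the second entry for the Hive services followed the same pattern):

  192.168.56.101   sandbox.kylo.io

The point is that the hostname used in the Hive/HDFS paths (sandbox.kylo.io in my case) resolves consistently from the Dremio node instead of only on the Hadoop VM itself.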