I'm new to the Dremio community. I need to connect the Dremio instance on my laptop to my Hive server, but it gives an error. If Dremio and Hive are both on the same server, it works fine.
Hive Server IP : 192.168.0.72
Dremio installed on : 192.168.0.192
We definitely allow Hive to be on a different server. In fact, most (if not all) of our deployments working with Hive have it on another box. What exactly does your error say? I can’t see the full error from the screenshot (I’d need to scroll further to the right). I’m guessing you may be running into a networking issue where you have to open/allow ports.
Could you try including core-site.xml on the Dremio classpath, i.e. linking it into the Dremio conf directory?
Or maybe try adding the NameNode as an advanced property when setting up the Hive source.
It feels like Dremio can’t connect to HDFS because it does not know the NameNode host/port.
It looks like there’s a reference to localhost somewhere in the Hive or HDFS configuration that gets passed to Dremio, and Dremio tries to use that hostname to connect to the NameNode.
Could you post a screenshot of the source configuration within Dremio and a copy of core-site.xml and hdfs-site.xml?
By clicking on “Add Property” you should be able to add a property defining the NameNode location.
As far as having core-site.xml on the Dremio classpath goes, it depends on your installation: you either copy core-site.xml into the “conf” directory of your Dremio installation or link core-site.xml to that location.
Not sure if a screenshot is going to help here (it is all on the command line).
It looks like the issue is that fs.defaultFS is set to hdfs://localhost:9000, so localhost is the host that Dremio gets from the configuration when it tries to access HDFS. Change this property in Ambari/Cloudera Manager to hdfs://192.168.0.72:9000, restart Hadoop (or the affected services), then remove the source from Dremio and re-add it.
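If it helps, here is a minimal sketch of how that property could look in core-site.xml once corrected (host and port are the values mentioned in this thread; double-check them against your cluster). The same name/value pair can also be supplied through “Add Property” on the Hive source if you’d rather not touch the file on the Dremio side:

```xml
<!-- core-site.xml: point clients at the actual NameNode host instead of localhost -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://192.168.0.72:9000</value>
</property>
```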
I tried all the steps and removed all hardcoded references to localhost. I have configured everything in core-site.xml as described above.
Now I get a different kind of error:
DATA_READ ERROR: Failed to initialize Hive record reader
Dataset split key hdfs://sandbox.kylo.io/user/hive/warehouse/test.db/t_customer__0
HIVE_SUB_SCAN Location 0:0:7
SqlOperatorImpl HIVE_SUB_SCAN Location 0:0:7
Fragment 0:0
[Error Id: 2df0ce62-cba8-4339-ae9f-d43da058ccfc on localhost:31010]
(org.apache.hadoop.hdfs.BlockMissingException) Could not obtain block: BP-82876869-127.0.0.1-1527287107078:blk_1073742969_2157 file=/user/hive/warehouse/test.db/t_customer/20180925_162010_00014_utnbv_090cff90-1bf9-47df-bad7-65a2c9c52a1c
  org.apache.hadoop.hdfs.DFSInputStream.refetchLocations():1052
  org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode():1036
  org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode():1015
  org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo():647
  org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy():926
  org.apache.hadoop.hdfs.DFSInputStream.read():982
  java.io.DataInputStream.readFully():195
  java.io.DataInputStream.readFully():169
  org.apache.hadoop.hive.ql.io.RCFile$Reader.init():1462
  org.apache.hadoop.hive.ql.io.RCFile$Reader.():1363
  org.apache.hadoop.hive.ql.io.RCFile$Reader.():1343
  org.apache.hadoop.hive.ql.io.RCFileRecordReader.():100
  org.apache.hadoop.hive.ql.io.RCFileInputFormat.getRecordReader():58
  com.dremio.exec.store.hive.exec.HiveRCFileReader.internalInit():52
  com.dremio.exec.store.hive.exec.HiveAbstractReader.setup():198
  com.dremio.sabot.op.scan.ScanOperator$1.run():189
  com.dremio.sabot.op.scan.ScanOperator$1.run():185
  java.security.AccessController.doPrivileged():-2
  javax.security.auth.Subject.doAs():422
  org.apache.hadoop.security.UserGroupInformation.doAs():1836
  com.dremio.sabot.op.scan.ScanOperator.setupReaderAsCorrectUser():185
  com.dremio.sabot.op.scan.ScanOperator.setupReader():177
  com.dremio.sabot.op.scan.ScanOperator.setup():163
  com.dremio.sabot.driver.SmartOp$SmartProducer.setup():560
  com.dremio.sabot.driver.Pipe$SetupVisitor.visitProducer():79
  com.dremio.sabot.driver.Pipe$SetupVisitor.visitProducer():63
  com.dremio.sabot.driver.SmartOp$SmartProducer.accept():530
  com.dremio.sabot.driver.StraightPipe.setup():102
  com.dremio.sabot.driver.StraightPipe.setup():102
  com.dremio.sabot.driver.StraightPipe.setup():102
  com.dremio.sabot.driver.StraightPipe.setup():102
  com.dremio.sabot.driver.StraightPipe.setup():102
  com.dremio.sabot.driver.StraightPipe.setup():102
  com.dremio.sabot.driver.StraightPipe.setup():102
  com.dremio.sabot.driver.Pipeline.setup():58
  com.dremio.sabot.exec.fragment.FragmentExecutor.setupExecution():344
  com.dremio.sabot.exec.fragment.FragmentExecutor.run():234
  com.dremio.sabot.exec.fragment.FragmentExecutor.access$800():86
  com.dremio.sabot.exec.fragment.FragmentExecutor$AsyncTaskImpl.run():591
  com.dremio.sabot.task.AsyncTaskWrapper.run():107
  com.dremio.sabot.task.slicing.SlicingThread.run():102
Note: the above works if I have the worker disabled and only the master executing. The master is running on the VM with Hadoop, and the worker is on another VM with just vanilla Linux.
Could you provide core-site.xml, hdfs-site.xml, and server.log from the executor that is on the other VM (not the Hadoop VM)? And also the profile from the failed query.
“Failed to initialize Hive record reader / Dataset split key” errors can be caused by security issues. Can you talk a little bit about your cluster’s security setup? hive-site.xml would help too.
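One more thing worth checking, given that the block pool ID in the error contains 127.0.0.1: if the DataNodes registered with the NameNode using a loopback or internal address, an executor on another VM will not be able to fetch blocks from them, which shows up as BlockMissingException. Below is a hedged sketch of a client-side setting (hdfs-site.xml on the executor’s classpath) that sometimes helps in that situation; whether it applies depends entirely on how your DataNodes are configured and whether their hostnames resolve from the executor VM:

```xml
<!-- hdfs-site.xml (client side, illustrative): have the HDFS client connect to
     DataNodes by hostname rather than the IP they registered with. This only
     helps if those hostnames resolve to reachable addresses from the executor. -->
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>
```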
Thanks guys, I figured it out. It was related to how MySQL was behaving when Hive tried to log in to it. I added a new hosts entry to tackle that, brought the Hive services up with another one, and it worked.
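For anyone landing here later: the fix above comes down to host-name resolution, so a hosts entry on the machine that needs to reach the service can do the trick. Purely as a hypothetical illustration (using the hostname from the error above and the Hive server IP from this thread; your names and the exact mapping will differ):

```
# /etc/hosts (illustrative only; adjust host and IP to your environment)
192.168.0.72   sandbox.kylo.io
```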