Reading data from S3 failing after upgrade to 1.2.2

allan.sene · November 8, 2017, 9:10pm

Hi, guys!

Just upgraded my Dremio cluster from 1.1.0 to 1.2.2 and now I’m getting this error for every data that I try to list, even when I go to the Node Activity screen:

VALIDATION ERROR: Failed to create workspaces for buckets owned by the account.

I’m persisting data on AWS S3. I’ve double checked the core-site.xml and my access key looks fine (is active).

Edit: Just tried to update the key and nothing

Where goes the profile:

418de095-df84-4bb6-b90a-928c05e9d4d7.zip (3.4 KB)

allan.sene · November 8, 2017, 10:08pm

I just tried something: Removed some S3 source and updated it with those new credentials that I generated before. Now I can query data from S3 buckets, but from other sources, like MySQL and ElasticSearch, I’m getting other error:

java.nio.file.AccessDeniedException: /var/lib/dremio/db/search/dac-namespace/core/_2s9.cfe

Should I erase all the reflections from the S3 reflection layer after an upgrade?

b8a26626-fa9b-482a-b06f-57d4a1815a02.zip (6.5 KB)

balaji.ramaswamy · November 9, 2017, 2:09am

Hi @allan.sene

We have had a similar issue at one of our customer’s and it was regarding the way the software was installed/upgraded.

Can you please share how you upgraded from 1.1.0 to 1.2.2?

Can you do a ps -ef | grep dremio on your co-ordinator box too?

Thanks,
@balaji.ramaswamy

allan.sene · November 9, 2017, 4:09pm

Hi @balaji.ramaswamy

I followed exactly what this doc says: https://docs.dremio.com/advanced-administration/upgrade/rpm.html

Then:

Generated a new AIM access/secret for S3
Updated the core-site.xml with it and restarted the cluster (didn’t work)
Updated one data source, that consumes from S3 (error changed)

Funny thing is that I can query my files on S3. Just the others sources don’t work anymore.

ps -ef from master/executor:

[ec2-user@dremio-master-01 ~]$ ps -ef | grep dremio
dremio    68237      1  0 Nov08 ?        00:00:00 bash /opt/dremio/bin/dremio internal_start dremio
dremio    68311  68237  0 Nov08 ?        00:03:55 /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.151.x86_64/jre/bin/java -Djava.util.logging.config.class=org.slf4j.bridge.SLF4JBridgeHandler -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/dremio/server.gc -Ddremio.log.path=/var/log/dremio -Xmx4096m -XX:MaxDirectMemorySize=8192m -XX:MaxPermSize=512m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dremio -cp /etc/dremio:/opt/dremio/jars/*:/opt/dremio/jars/ext/*:/opt/dremio/jars/3rdparty/* com.dremio.dac.daemon.DremioDaemon dremio start
ec2-user  72093  72064  0 16:06 pts/0    00:00:00 grep --color=auto dremio

from executor-only:

[ec2-user@dremio-executor-01 ~]$ ps -ef | grep dremio
dremio    64322      1  0 Nov08 ?        00:00:00 bash /opt/dremio/bin/dremio internal_start dremio
dremio    64396  64322  0 Nov08 ?        00:01:25 /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.151.x86_64/jre/bin/java -Djava.util.logging.config.class=org.slf4j.bridge.SLF4JBridgeHandler -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/dremio/server.gc -Ddremio.log.path=/var/log/dremio -Xmx4096m -XX:MaxDirectMemorySize=8192m -XX:MaxPermSize=512m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dremio -cp /etc/dremio:/opt/dremio/jars/*:/opt/dremio/jars/ext/*:/opt/dremio/jars/3rdparty/* com.dremio.dac.daemon.DremioDaemon dremio start
ec2-user  66500  66477  0 16:08 pts/0    00:00:00 grep --color=auto dremio

balaji.ramaswamy · November 9, 2017, 5:03pm

Hi @allan.sene

Do you get the same error on other sources now? Can you please send me a profile from a failed job?

Thanks,
@balaji.ramaswamy

allan.sene · November 9, 2017, 8:55pm

I suppose that my problem is with the executor node. For some reason, when I turn off it, everything returns back to normal.

Just checked all the configs and everything seems ok: dremio.conf and core-site.xml are the same on both nodes - except the service.coordinator.enable variable, of course.

I rebooted both 2 machines and when the executor-only is up, or the query hangs and keeps on running forever or it fails. When it hangs, even when I try to cancel their by the UI, nothing happens:

master-mysql-failure.zip (6.5 KB)

balaji.ramaswamy · December 5, 2017, 6:47pm

Hi @allan.sene

Apologies for the delay in responding,

Just wanted to find out if everything is working as expected now? Are you able to use Dremio with a separate node for the co-ordinator and a separate node for the executor?

Thanks,
@balaji.ramaswamy

allan.sene · December 14, 2017, 7:45pm

Hi @balaji.ramaswamy

Unfortunately, we are dropping Dremio for now and looking for another alternative to our scenario. We need to put something stable in production really fast and the experience with Dremio on AWS is not so great for now.

I really appreciate your assistance and hope that we can try the platform later when it becomes more stable.

Thanks man

can · December 14, 2017, 11:19pm

Hey @allan.sene, we’re very sorry to hear that. We’re actively working on improving our users’ experience working with Dremio and would love to incorporate your feedback and concerns into the process. If you are up for it, we’d like to do a deep dive session and talk through your experience — I’ll reach out via DM. It would be really valuable to understand what went wrong and what we could do better to support you and others who might ran into similar issues going forward.

Thanks,
Can

Topic		Replies	Views
Problem adding an s3 source	8	2720	December 5, 2017
Can not read data from reflection [S3 compatibility]	1	2029	May 20, 2020
S3 (Compat Mode) errors while getting data from reflection	10	2862	January 24, 2023
S3 (minio) errors while getting data from reflection	3	1824	April 8, 2020
Error fetching objects stored in private S3 bucket	10	2704	November 15, 2018

Reading data from S3 failing after upgrade to 1.2.2

Related topics