Error creating s3 reflection

jhaynie · December 3, 2018, 10:58pm

Looks like the acceleration failed to create with this message:

"Failure while attempting to read metadata for "__accelerator"."591c069b-f248-4443-bffb-73d14a6e3532"."4d840196-9145-4e58-8c86-3a2713fa17e3"."

ca118f02-b684-4355-8088-321a1ca8729c.zip (18.5 KB)

Is this some sort of S3 permission issue or something? I feel like maybe this is a config issue…

balaji.ramaswamy · December 4, 2018, 6:30am

Hi @jhaynie

Can you please try the below? In the core-site.xml under the Dremio conf folder, on all your executors, add the following entry:

<property>
  <name>fs.s3a.connection.maximum</name>
  <value>5000</value>
</property>
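
For reference, here is a minimal sketch of how the whole file might look with that entry in place. It assumes core-site.xml holds no other properties; if yours already has a `<configuration>` element, add only the `<property>` block inside it. The wrapper is the standard Hadoop configuration layout, and 5000 is the value suggested above, not a tuned recommendation.

```xml
<?xml version="1.0"?>
<!-- core-site.xml in the Dremio conf folder on each executor.
     Sketch only: assumes no other properties are configured. -->
<configuration>
  <!-- Upper limit on simultaneous S3A connections to S3 -->
  <property>
    <name>fs.s3a.connection.maximum</name>
    <value>5000</value>
  </property>
</configuration>
```

The executors will most likely need a restart to pick up the change; that is an assumption on my part, so check your deployment.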

jhaynie · December 5, 2018, 3:24pm

OK that seems to now allow it to work correctly.

jhaynie · December 5, 2018, 3:34pm

677f57f7-b890-4bbe-bf86-f7043ade39a0.zip (11.3 KB)

The reflection now works but the reflected query is a lot slower than the original query. It’s a really small set of files. It takes like 6s to run this really small query.

Any ideas?

balaji.ramaswamy · December 6, 2018, 3:28am

Hi @jhaynie

As you can see, your query takes ~7.7s to complete, of which 6.8s is wait time in the Parquet row group scan:

Thread	Setup Time	Process Time	Wait Time	Max Batches	Max Records	Peak Memory
00-00-07	0.039s	0.047s	6.841s	2	26	8MB

Are your reflections on S3? Are the reflection files small? This is a known issue on cloud sources like ADLS and S3. Just to check the speed of the reflections, would it be possible to store them locally?

Thanks
@balaji.ramaswamy

jhaynie · December 6, 2018, 3:33am

Yes, this is a small test data set, so they are super tiny, like a 2M file, since there aren’t many records at all in the current data set.

If we store them locally, is there anything special we need to do in a 6-node cluster? In other words, do we need to mount EFS, or will the coordinator somehow distribute the reflections? How can we store them locally and still use the cluster?

kelly · December 6, 2018, 10:56pm

Dremio will distribute the Data Reflections across the local storage of your nodes, see: https://docs.dremio.com/deployment/dremio-config.html#distributed-storage
