Reflections on S3 - no results when hitting "run"

Hi!
I installed Dremio CE (v 22.0.0-202206221430090603-1fa4049f) on an EC2 instance.

Configured an IAM role with an instance profile that grants read and write permissions on an S3 bucket (as described in the Dremio docs).

Configured dremio.conf with the param:
dist: "dremioS3:///my-private-bucket/accel"

Configured the core-site.xml file with this content:

<?xml version="1.0"?>
<configuration>
<property>
   <name>fs.dremioS3.impl</name>
   <description>The FileSystem implementation. Must be set to com.dremio.plugins.s3.store.S3FileSystem</description>
   <value>com.dremio.plugins.s3.store.S3FileSystem</value>
</property>
<property>
   <name>fs.s3a.aws.credentials.provider</name>
   <description>The credential provider type.</description>
   <value>com.amazonaws.auth.InstanceProfileCredentialsProvider</value>
</property>
</configuration>

Connected a MongoDB source and wrote some queries…
After creating the reflections (the files show up in S3), I stopped getting results from my queries!
When I click the “Preview” button, I get data. Nice…
For the same query, when I click “Run”, the result panel “flashes” briefly and then shows the “No results” message.
I noticed this only happens when using the FLATTEN function… here is my query:

SELECT code, date_start, nested_0.group_platform.transactions AS platform_transactions, nested_0.group_platform.sessions AS platform_sessions, nested_0.group_platform."_id" AS platform_name
    FROM (
      SELECT code, date_start, FLATTEN(group_platform) AS group_platform
      FROM MySource.summaries AS summaries
    ) nested_0

When using the default reflection storage (in the data folder), this behavior doesn’t happen. Reflections are created and the data is shown correctly (more than 600k rows).

I’m new to Dremio and this is driving me nuts!
What may be causing this behavior? Maybe I could limit myself to storing reflections on EBS…

Thanks!

@almirb

Are you able to send us the profile of both preview and run?

Hello!
Here are the files.
5978a6de-df57-495e-b9ea-a7037518ae3c_preview.zip (17.0 KB)
cb049d6d-657f-412f-b46c-da5fd5276247_run.zip (15.8 KB)

I also noticed that disabling Iceberg in “Support Settings”:
dremio.iceberg.enabled: false
… and recreating the reflections fixes the problem.

What would be the drawbacks of disabling Iceberg?

Thanks!

You might be hitting a bug where if there are nulls in the “group_platform” array column then zero records are returned. This bug has been fixed in version 23.

In the query profile, if you compare the FLATTEN and PROJECT operators, you’ll see that the PROJECT sees no records in the “Run” profile. The “Preview” profile is limited to 10,000 records, so the query may simply not have encountered any null values in the “group_platform” column.

You can try a workaround by re-writing the SQL using a left outer join similar to:

SELECT x.c1, x.c2, x.c3, y.flat_c4
FROM "flatten_result" x
LEFT OUTER JOIN (SELECT c1, c2, c3, FLATTEN(c4) AS flat_c4 FROM "flatten_result") y ON x.c1 = y.c1
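
To show why the left outer join helps, here is a rough Python sketch of the logic, not Dremio code. It assumes that a correct FLATTEN emits no row for a null or empty array, and that the join key is unique per source row. The helper names (flatten, left_outer_join) and the sample data are made up for illustration.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def flatten(rows: List[Row], column: str) -> List[Row]:
    """Emit one row per array element.

    Assumption: a null (None) or empty array emits no row at all.
    """
    out: List[Row] = []
    for row in rows:
        for element in row[column] or []:
            flat = dict(row)
            flat[column] = element
            out.append(flat)
    return out


def left_outer_join(left: List[Row], right: List[Row], key: str,
                    right_column: str, alias: str) -> List[Row]:
    """Keep every left row; attach the flattened value, or None if no match."""
    by_key: Dict[Any, List[Row]] = {}
    for r in right:
        by_key.setdefault(r[key], []).append(r)

    result: List[Row] = []
    for l in left:
        # Drop the original array column; the alias column replaces it.
        base = {k: v for k, v in l.items() if k != right_column}
        matches = by_key.get(l[key])
        if matches:
            for m in matches:
                result.append({**base, alias: m[right_column]})
        else:
            result.append({**base, alias: None})
    return result


# Hypothetical sample data: row "B" has a null array.
summaries = [
    {"code": "A", "group_platform": [{"_id": "web", "sessions": 3},
                                     {"_id": "ios", "sessions": 1}]},
    {"code": "B", "group_platform": None},
]

flat = flatten(summaries, "group_platform")
joined = left_outer_join(summaries, flat, "code",
                         "group_platform", "flat_platform")
```

Here flat holds only the two rows from "A", while joined also keeps "B" with a null flat_platform. If the join key is not unique in your data, the join will multiply rows, so you would need to join on a column (or set of columns) that identifies each source row.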

I don’t think this problem has anything to do with reflections or with whether dremio.iceberg.enabled is on or off. (Both of your profiles show use of the same reflection materialization stored in Iceberg table format.)

You definitely don’t want to disable dremio.iceberg.enabled or else the unlimited splits feature will be disabled. See Dremio
