Dremio 3.1 Postgres View and Acceleration bug?

Using the latest community build (3.1.1-201901281837360699-30c9d74), I’m running into an issue where the query is failing without an error message (just “400 - Bad Request”).

The set up is as follows:

  1. Create a dataset using a query from Table A, and activate a “raw reflection” on the dataset (I’ll call Reflection A)
  2. Create a view (in postgres, the main driver was to expose attributes of a postgres jsonb column, but contents of the view end up not being important) of Table B (I’ll call View B)
  3. Join View B to Reflection A, “Group By” a column in View B and have a measure from Reflection A

Running will fail, with 10 profiles in the profiles tab of the job. Drilling into the profile, on the Error tab it complains:

  INVALID_DATASET_METADATA ERROR: 

Unexpected columns redacted_column_a,redacted_column_b
Expected skipped columns 
SqlOperatorImpl JDBC_SUB_SCAN
Location 2:0:6
SqlOperatorImpl JDBC_SUB_SCAN
Location 2:0:6
Fragment 2:0

“redacted_column_a” and “redacted_column_b” are the other columns that are in View B that were not being used for the “Group By”. It appears that this only happens when the job is being accelerated by a reflection, because other test cases that are not accelerated seem to work properly.

I thought maybe it was happening when the full query couldn’t be pushed down into the database, but I tested a similar query with a postgres view and a dataset from an uploaded spreadsheet and it worked as expected.

Additionally, it appears to only happen with a reflection from the same database? Having a reflection on the uploaded spreadsheet that is used for a similar query does not exhibit the same error.

Hi @lsowen

There should be a download profile on the right pane of the window when you click on jobs. Wondering if you can send us the downloaded zip file so we can look at the 9 attempts and then why the reflection was not picked up. Also please turn on verbose planning and then retry thquery and then send us the profile

To turn on verbose planning-click on admin-support on the left side-on the right side scroll down and in the support key enter “planner.verbbose_profile” and click show and then enable save. To download a profile, follow below tutorial

How To Share A Query Profile

Thanks
@balaji.ramaswamy

Hi @balaji.ramaswamy is there a way I can send it to you directly? I’m concerned it could potentially have information I don’t want to share publicly.

I’m having the same error with SQLServer as underlying database.

@balaji.ramaswamy is it possible to share the profile with you directly? We are very interested in using dremio in our org, but this is a showstopper bug.

hi @balaji.ramaswamy, have you had a chance to review the profile I provided? I appreciate your help!

Hi @lsowen, Sorry about the delay, not yet got a chance to look at this. Will get to this within the next two days

Hi @lsowen

It looks like we do not know about three columns ext_loan_id, dti and dti_joint. Were these 3 columns added to the postgresql at a later point of time. If yes, kindly try the below

New Query:

alter pds “playground”.“lc_loan” refresh metadata

Try the query now

Thanks
@balaji.ramaswamy

Hi @balaji.ramaswamy

Unfortunately, that didn’t help. I also constructed a similar view from scratch to make sure that it wasn’t some sort of caching.

I think the issue is somehow related to the fact that it is pushing a “select star” query down to the database, then doesn’t know what to do with the other columns returned from query:

Jdbc(sql=[SELECT *
FROM “playground”.“lc_loan”])

All three of the columns it is complaining about are the columns which exist on the view but are not actually used in the dremio query.

Hi @lsowen

Is this a postgres view?

Can we try couple of other options?

  1. alter pds “playground”.“lc_loan” forget metadata

Try the query again

  1. Add a new postgres source (with a different name). Try on the new source?

Also what happens when you you do “New Query” and rub below query

select * from “playground”.“lc_loan”

Thanks
@balaji.ramaswamy

hi @balaji.ramaswamy,

Finally had a chance to re-run this.

After the forget metadata, the table had disappeared from the list of datasets, so I forced a refresh of the metadata to allow queries against that dataset again. Re-running the query had the exact same issue.

Adding a new postgres datasource with a different name ran into the same issue. I removed and recreated the “Reflection A” dataset, and the first time the query ran correctly (because it was not accelerated because there wasn’t a reflection yet for “Reflection A”). However, running the query again (and all subsequent attempts) failed with the exact same error.