I often run into this error when trying to JOIN datasets with a GROUP BY statement: “New schema found and recorded. Please reattempt the query. Multiple attempts may be necessary to fully learn the schema.” Preview works fine, but Run query gives this error. These are datasets with fewer than 10K rows and 3 columns each, and the JOIN + GROUP BY query would return fewer than 20 rows. I’ve run the query at least 20 times, and after 30 minutes I still get this message. Not sure if I’m just expecting too much here or if it really does take more than 30 minutes to learn the schema. I see less than 1% CPU activity in the Node Activity screen.
Not sure if this is relevant but I see this strange error when I go into the settings for one of my datasets in the query:
It seems like I can temporarily get this problem to go away by toggling the manual/automatic reflection switches on my datasets. It seems to come back eventually though and never goes away on its own.
Hi Neil,
We’d expect you to see that message if your dataset contains different types of values (e.g. integer and string) for the same column, or if there were different columns (across multiple files/datasets). Is this the case?
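If it helps to check for the mixed-type case, here is a rough Python sketch that scans a CSV export of a dataset and lists the columns whose values fall into more than one inferred type. It is only an illustration and not part of Dremio: the function names (`infer_type`, `mixed_type_columns`) and the sample data are made up for this example, and the type inference is deliberately simple.

```python
import csv
import io
from collections import defaultdict


def infer_type(value):
    """Classify a raw CSV value as 'empty', 'int', 'float' or 'str'."""
    text = (value or "").strip()
    if text == "":
        return "empty"
    for caster, name in ((int, "int"), (float, "float")):
        try:
            caster(text)
            return name
        except ValueError:
            pass
    return "str"


def mixed_type_columns(csv_file):
    """Return {column: set of inferred types} for columns holding more than one type."""
    seen = defaultdict(set)
    for row in csv.DictReader(csv_file):
        for column, value in row.items():
            kind = infer_type(value)
            if kind != "empty":  # blanks are ignored rather than counted as a type
                seen[column].add(kind)
    return {column: kinds for column, kinds in seen.items() if len(kinds) > 1}


# Hypothetical sample: 'amount' holds both integers and a string.
sample = io.StringIO("id,amount,region\n1,10,east\n2,ten,west\n3,30,east\n")
print(mixed_type_columns(sample))
```

Any column this reports is a candidate for the kind of type mismatch described above.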
Could you share the profile for the latest query you ran here? To do this, navigate to the Jobs page, click on the query of interest and then click “Download Profile” from the lower right-hand side. Could you also share your server log (default location: <dremio_home>/log/server.log) if possible?
I couldn’t see how to share a file in my reply, but your response gave me a hint as to what was happening. I had a couple of queries that were doing nested SELECTs (i.e. SELECT * FROM (SELECT * FROM …) …), where the inner SELECTs had the same number of columns but different types. I was not aliasing the inner SELECTs. Once I gave each inner SELECT a unique alias, the problem went away.
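For anyone hitting the same thing, here is a rough sketch of the shape of the fix: each inner SELECT gets its own alias before the JOIN + GROUP BY. It is not the actual query from this thread. The table and column names (`purchases`, `credits`, and so on) are made up, the explicit CAST is an extra assumption about how to line up the column types, and it runs against an in-memory SQLite database purely so the SQL can be executed. SQLite will not reproduce the Dremio message, and Dremio’s SQL dialect may differ in details.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in to run the SQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchases (customer TEXT, amount INTEGER);
    CREATE TABLE credits (customer TEXT, credit TEXT);  -- same idea, different type
    INSERT INTO purchases VALUES ('neil', 10), ('neil', 5), ('ana', 7);
    INSERT INTO credits VALUES ('neil', '3'), ('ana', '4');
""")

# Each inner SELECT has a unique alias (p, c), and the text column is cast
# so both sides expose integer values.
query = """
    SELECT p.customer, SUM(p.amount) AS spent, MAX(c.credit) AS credit
    FROM (SELECT customer, amount FROM purchases) AS p
    JOIN (SELECT customer, CAST(credit AS INTEGER) AS credit FROM credits) AS c
      ON p.customer = c.customer
    GROUP BY p.customer
    ORDER BY p.customer
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The unique aliases keep every column reference in the outer query unambiguous, which is the part that matched what fixed it for me.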
Sounds good! Glad you were able to figure it out. Let us know if you have any other questions.