MONGO_SUB_SCAN takes more time to complete

Hello,

While querying a MongoDB source (a simple select * query) without reflections, the operation takes around 1 minute. As shown in the screenshot below, the phases other than Running seem to be within expected ranges. There are ~200k rows and 40 columns.

With reflections, this time is reduced to <2s, which is good. The footprint of the reflection is only ~26 MB. I am curious why the normal pushdown query execution takes more than a minute to complete. Could you please give me some pointers on what to check to determine whether this is on the DB end or a misconfiguration in Dremio? Thanks.

Please note, it won’t be possible for me to share the query profile, apologies.

@arunprasadk Let us see where the time is spent. Any chance you can send the profile over? You can also click on Visual Profile and see which operator takes the most time. Also, run the pushdown query directly on MongoDB and see how long it takes. That will help narrow down the issue.

Hello @balaji.ramaswamy, the MONGO_SUB_SCAN operator takes the most time to complete. Also, I tested the same query from a notebook (using pymongo) deployed on the same cluster, and it takes ~20 seconds to complete. In addition, I observed in the query profile that idealNumFragments and the number of splits are both set to 1. Is this expected behavior for a MongoDB data source? Unfortunately, I cannot upload the query profile here.
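
For anyone who wants to reproduce the pymongo comparison, here is a minimal sketch of how a full, unfiltered scan could be timed. It is not the exact notebook code used above. The connection URI, database name, and collection name are placeholders, and `time_full_scan` and `scan_collection` are hypothetical helper names.

```python
import time
from typing import Any, Iterable, Tuple


def time_full_scan(cursor: Iterable[Any]) -> Tuple[int, float]:
    """Drain a cursor (or any iterable) and return (row_count, elapsed_seconds)."""
    start = time.perf_counter()
    count = 0
    for _ in cursor:
        # Iterating forces every document to be fetched from the server.
        count += 1
    return count, time.perf_counter() - start


def scan_collection(uri: str, db_name: str, coll_name: str) -> Tuple[int, float]:
    """Time an unfiltered find() on one collection (the 'select *' equivalent)."""
    # Imported here so the timing helper above works without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(uri)
    try:
        return time_full_scan(client[db_name][coll_name].find({}))
    finally:
        client.close()
```

Usage, with placeholder values: `rows, seconds = scan_collection("mongodb://localhost:27017", "mydb", "mycollection")`. The timer stops only after the cursor is fully drained, so the figure includes network transfer and document decoding, which makes it a fairer comparison with the MONGO_SUB_SCAN time than timing the `find()` call alone.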