After running a query, clicking the JSON or CSV download generates a dump of the result in a file, but for some reason the output file does not match the query result shown in the browser. Aren’t they supposed to match? To reproduce this problem, try adding “ORDER BY” to your SQL statement.
Can you describe a bit more about how they are different (perhaps with screenshots)?
When we get questions like this it often has to do with the difference between Preview and Run in the web app.
To make it easier to quickly iterate on a query, Dremio works with a sampling of the data by default.
To see the results from the complete set of data, you can choose to Run your query.
Downloads always do a full Run (though they are limited to the first million results).
This happens nondeterministically, so let me try to find a public dataset so that you can reproduce it.
I cannot share our own data, but this happens when using SQL Server as the data source and executing a query with ORDER BY … e.g.
ORDER BY col1
If you add “limit” then CSV does seem to produce the same result set as what’s shown in the browser.
Uploading the screenshot below, where the left side shows the “Run” query result in Dremio and the right side is the CSV output in Excel. Results in the CSV are not ordered.
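For anyone trying to confirm the same symptom without comparing screenshots, here is a minimal Python sketch that checks whether a downloaded CSV is actually in order on a given column. The function name `is_sorted_by` and the sample data are made up for illustration, and values are compared as plain strings, which may not match the collation SQL Server uses for ORDER BY.

```python
import csv
import io

def is_sorted_by(csv_text, column):
    """Return True if the rows are in non-decreasing order of `column`.

    Assumption: values are compared as plain Python strings, which can
    differ from the ordering rules of the underlying database.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    values = [row[column] for row in reader]
    # Every value must be <= the one that follows it.
    return all(a <= b for a, b in zip(values, values[1:]))

# Hypothetical stand-in for the contents of a downloaded CSV file.
sample = "col1,col2\nb,2\na,1\nc,3\n"
print(is_sorted_by(sample, "col1"))
```

For a real download, read the file contents into a string first and pass the name of the ORDER BY column.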
Hi Tim, thanks for the additional info - I’m now able to reproduce this.
I’ve created an internal ticket to track this issue and will get back to you once it’s resolved.
I would like to add my vote for this functionality. Our Dremio deployment is growing and our users are becoming better at SQL. While not a show stopper, asking our users to re-sort the data once it opens in Excel takes time they could be spending on data analysis, especially if there is a complicated ORDER BY statement.
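Until this is fixed, one possible stopgap is to re-sort the downloaded file with a small script instead of by hand in Excel. This is only a sketch under assumptions: the function name `resort_csv` is hypothetical, it handles a single sort column, and it sorts either as strings or as numbers, so it will not reproduce a complicated ORDER BY or the database's collation.

```python
import csv
import io

def resort_csv(csv_text, column, numeric=False):
    """Return the CSV text with its rows sorted by `column`.

    Assumption: a single sort key, compared as strings by default or
    as floats when numeric=True.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    if numeric:
        rows.sort(key=lambda r: float(r[column]))
    else:
        rows.sort(key=lambda r: r[column])
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

# Hypothetical stand-in for an unordered download.
unordered = "col1,col2\nb,2\na,1\nc,3\n"
print(resort_csv(unordered, "col1"))
```

Python's sort is stable, so calling the function once per column, least significant column first, gives a multi-column ascending sort.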