Ctas write in order

Hello…is it possible to write out records in order using ctas…order by? For example:

create table $scratch.bob as
select * from jack
order by time

So once the table bob is created, and when I go in to do a select * from bob, I would expect all records to come back sorted by time.

hi @trang, why not ? (the order by is part of the sql query)
didn’t it work ?

Hi @dfleckinger, yeah, it didn’t work. regardless of what I do–whether i stick in the order by clause at the end like what I’ve shown above, or create the table first, then order it and then create another table based on that order, it just doesn’t store it in the order I want. Regardless of whether i’m pulling back from the UI or via the API, it still doesn’t back in the order I specify.

Does dremio store records based on the sql command, or does it override it to efficiently pack it in the parquets based on whatever rules it comes up with?

@trang, what is the data type of the time column?

Can you Run (not Preview) the query without the CREATE TABLE $scratch.bob AS part and attach the profile here?

@trang, it is possible to do a CTAS with order by. The data will be written in the correct order. But the order is not preserved when that data is read back by Dremio. This is because the data is written to multiple files, and when it is read by Dremio, the files are read in parallel without regard to the order they were written in.

1 Like

@trang you also have the LOCALSORT BY (columns) feature which enables to sort the partitions.
It makes the “order by” useless in query

On my side, it works well

create table $scratch.bob as
LOCALSORT BY (time)
select * from jack

then select * from $scratch.bob is correctly sorted by time.

thanks for this…very helpful