Cannot start Dremio -- RocksDB problem -- missing sst files

Hi,

We are facing the following issue when we try to restart Dremio with

dremio service restart

Dremio starts for a second and then stops.

The log (server.out) says:

2018-12-18 12:36:29,200 [main] INFO c.d.datastore.LocalKVStoreProvider - Starting LocalKVStoreProvider
2018-12-18 12:36:31,209 [main] INFO c.d.datastore.LocalKVStoreProvider - Stopping LocalKVStoreProvider
2018-12-18 12:36:31,210 [main] INFO c.d.datastore.LocalKVStoreProvider - Stopped LocalKVStoreProvider
Failed to complete cleanup.
org.rocksdb.RocksDBException: Can’t access /009551.sst: IO error: while stat a file for size: /dremio/db/catalog/009551.sst: No such file or directory
Can’t access /009552.sst: IO error: while stat a file for size: /dremio/db/catalog/009552.sst: No such file or directory
Can’t access /009537.sst: IO error: while stat a file for size: /dremio/db/catalog/009537.sst: No such file or directory
Can’t access /009538.sst: IO error: while stat a file for size: /dremio/db/catalog/009538.sst: No such file or directory
Can’t access /009539.sst: IO error: while stat a file for size: /dremio/db/catalog/009539.sst: No such file or directory
Can’t access /009540.sst: IO error: while stat a file for size: /dremio/db/catalog/009540.sst: No such file or directory
Can’t access /009541.sst: IO error: while stat a file for size: /dremio/db/catalog/009541.sst: No such file or directory
Can’t access /009542.sst: IO error: while stat a file for size: /dremio/db/catalog/009542.sst: No such file or directory
Can’t access /009543.sst: IO error: while stat a file for size: /dremio/db/catalog/009543.sst: No such file or directory
Can’t access /009544.sst: IO error: while stat a file for size: /dremio/db/catalog/009544.sst: No such file or directory
Can’t access /009545.sst: IO error: while stat a file for size: /dremio/db/catalog/009545.sst: No such file or directory
Can’t access /009546.sst: IO error: while stat a file for size: /dremio/db/catalog/009546.sst: No such file or directory
Can’t access /009547.sst: IO error: while stat a file for size: /dremio/db/catalog/009547.sst: No such file or directory

Is there any way to force a restart of dremio?

I already tried “dremio-admin clean”, but it gave me the same error.

Thanks

We are using Dremio 2.0.5 on CentOS 6.10.
Thanks

@seraus

Any chance you deleted files from the db/catalog folder? Or is that mount unavailable?
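
Both questions can be checked quickly from the host. Below is a minimal Python sketch, not an official Dremio tool. The function name `describe_catalog_dir` is made up for illustration, and the path `/dremio/db/catalog` is simply the one shown in the log above. It reports whether the folder exists, which mount point it sits on, and how many .sst files it currently holds.

```python
import os


def describe_catalog_dir(path):
    """Report whether a directory exists, its mount point, and its .sst file count."""
    info = {"path": path, "exists": os.path.isdir(path)}
    if not info["exists"]:
        # A missing directory often means the volume was not mounted.
        info["mount_point"] = None
        info["sst_count"] = 0
        return info

    # Walk upwards until we reach the mount point that contains the directory.
    probe = os.path.realpath(path)
    while not os.path.ismount(probe):
        probe = os.path.dirname(probe)
    info["mount_point"] = probe

    # Count the RocksDB table files currently present.
    info["sst_count"] = sum(
        1 for name in os.listdir(path) if name.endswith(".sst")
    )
    return info


if __name__ == "__main__":
    # Path taken from the error message; adjust to your own installation.
    print(describe_catalog_dir("/dremio/db/catalog"))
```

If the reported mount point is `/` when the db folder is supposed to live on a separate volume, the volume is probably not mounted.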

Thanks
@balaji.ramaswamy

Thanks for your answer balaji.ramaswamy.

We overcame the problem by recovering a recent backup.

But I’m almost sure that nobody deleted the files manually.
Have you ever experienced such an inconsistency due to Dremio malfunction?

Thanks

Hi, I got the same problem. I am trying to solve it without a backup; if anyone knows how, please tell us.

In my case, no one deleted files directly, but my AWS EC2 instance was restarted while the Dremio process was running. I think the files were corrupted, or deleted by the system because they were sitting in a tmp directory. I just don’t know how the files were deleted, hehehe.

Well, any help will be appreciated.

Thanks.

Hi @Bruno_Mata

On EC2 instances, does the mount on which the db folder resides get reset by a server restart?

Thanks
@balaji.ramaswamy

Hi @balaji.ramaswamy, thanks for the answer.

No, the instance has no such config. I think something like uncommitted information was lost on reboot.

The dremio-admin clean command did not work, by the way.

Any idea how to reset the metadata state?

thanks

Maybe this information is relevant: other .sst files are still in the directory. A lot of them.

That is why I believe in the hypothesis of ‘uncommitted files’.
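
To see exactly which files the log complains about and whether any of them are on disk after all, something like the following Python sketch could help. It is only an illustration: the regular expression is written against the error lines quoted at the top of this thread, and the function names are made up for this example.

```python
import os
import re

# Captures the full path in lines such as:
# "... while stat a file for size: /dremio/db/catalog/009551.sst: No such file or directory"
MISSING_SST = re.compile(
    r"while stat a file for size: (\S+\.sst): No such file or directory"
)


def missing_sst_paths(log_text):
    """Return the sorted, de-duplicated .sst paths the log reports as missing."""
    return sorted(set(MISSING_SST.findall(log_text)))


def compare_with_disk(log_text):
    """Split the reported paths into (still absent, present now)."""
    absent, present = [], []
    for path in missing_sst_paths(log_text):
        if os.path.exists(path):
            present.append(path)
        else:
            absent.append(path)
    return absent, present
```

Calling `compare_with_disk(open("server.out").read())` on the host returns the two lists. If every reported file is absent while many other .sst files remain, that fits the picture of a RocksDB manifest that references files which never made it to disk. It does not prove it.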

Thanks.

Hi @Bruno_Mata,

From 2.1.8 we are reindexing on startup when the previous shutdown is not clean.

Can you please upgrade to the latest version on our website and see if the behavior repeats?

Thanks
@balaji.ramaswamy

Hi @balaji.ramaswamy,

Thanks for your help. I did the update and it did not solve the problem, so I decided to restore a backup (fortunately I found one that someone had made).

I’ll now set up a periodic automated backup.

Thanks.

Hi @Bruno_Mata

On your EC2 machine, where is the RocksDB now?

Thanks
@balaji.ramaswamy

Same problem here.

Is there any way to solve this problem without reinstalling Dremio or recovering a backup?

@Mandrade

Restoring from a backup, or from a cold copy of the db folder if you have one, would work.
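
For anyone who wants a cold copy to fall back on, here is a minimal Python sketch of one way to take it. It assumes Dremio is stopped while the copy runs, otherwise the copy may be inconsistent. The function name and the archive naming scheme are made up for illustration, and `backup_dir` must be outside the db folder.

```python
import os
import tarfile
import time


def cold_copy(db_dir, backup_dir):
    """Archive db_dir into backup_dir as a timestamped .tar.gz; return its path.

    Assumes the process that owns db_dir is stopped while this runs.
    """
    os.makedirs(backup_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = os.path.join(backup_dir, "dremio-db-" + stamp + ".tar.gz")

    # Store entries under the folder's own name (e.g. "db/...") so the
    # archive can be unpacked next to a broken copy without overwriting it.
    top_level = os.path.basename(os.path.normpath(db_dir))
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(db_dir, arcname=top_level)
    return archive
```

A call such as `cold_copy("/dremio/db", "/backups/dremio")` (both paths are examples) could be run from a scheduled job that stops the service first and starts it again afterwards.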

thanks