Too many open files

Hi dremio,

We are using a legacy version of Dremio Community Edition, build 3.0.6-201812082352540436-1f684f9, and are running into a “Too many open files” issue.

java.lang.RuntimeException: java.nio.file.FileSystemException: /opt/dremio/data/db/search/dac-namespace/core/_3giz_Lucene54_0.dvd: Too many open files

I think it’s related to the topic “Error - Too many open files” and to the fact that our Dremio instance has been running for a long time (more than a year).

I didn’t find any other related results on Google. For now, we raised the max NOFILE limit (roughly as sketched below) and restarted Dremio.
My question is: why are so many files open in Dremio, more than 65535? Is this expected?
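
For reference, this is roughly how we raised the limit before the restart. It is only a minimal sketch; the systemd unit name dremio.service, the drop-in path, and the 262144 value are assumptions from our own setup.

# Sketch: raise the open-files limit for a systemd-managed Dremio service.
# Unit name "dremio.service" and the new limit are assumptions, adjust as needed.
sudo mkdir -p /etc/systemd/system/dremio.service.d
printf '[Service]\nLimitNOFILE=262144\n' | sudo tee /etc/systemd/system/dremio.service.d/limits.conf
sudo systemctl daemon-reload
sudo systemctl restart dremio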

We found that a very large number of pipe/eventpoll descriptors are open (nearly 60000):

  19930 [eventpoll]
  39860 pipe
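
(The counts above were gathered with something like the following; this is only a sketch, and the pgrep pattern used to find the Dremio pid is an assumption.)

# Sketch: count open fds by type for the Dremio process.
DREMIO_PID=$(pgrep -f dremio | head -n 1)   # pid lookup is an assumption
ls /proc/${DREMIO_PID}/fd | wc -l           # total open fds
# note: lsof also lists non-fd entries (memory maps, cwd), so these counts are approximate
lsof -p ${DREMIO_PID} | awk '{print $NF}' | sort | uniq -c | sort -rn | head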

Could you tell us which operations would cause so many fds to be opened, and is there any fix for this in a later version?

Appreciate your help.

@qingdaojunzuo

Can you please send us server.out and also the /proc/<pid>/limits output?
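
Something along these lines should capture both (just a sketch; the pid lookup and the server.out location are assumptions about your install):

DREMIO_PID=$(pgrep -f dremio | head -n 1)   # pid lookup is an assumption
cat /proc/${DREMIO_PID}/limits
tail -n 200 /opt/dremio/log/server.out      # log directory is an assumption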

Yes, here is the ulimit output in server.out:

Tue Sep 14 08:45:29 UTC 2021 Starting dremio on ip-10-2-41-53
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 61636
max locked memory       (kbytes, -l) 65536
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65536
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 61636
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Thanks

@qingdaojunzuo Send me the output of the /proc/<pid>/limits file. Do you also have a setting called store.accurate.partition_stats set to true?

Since the issue blocked our service, we have already restarted Dremio. Here is the limits file from the newly started Dremio:

  File: limits
  Size: 0               Blocks: 0          IO Block: 1024   regular empty file
Device: 5h/5d   Inode: 869842238   Links: 1
Access: (0444/-r--r--r--)  Uid: (  999/  dremio)   Gid: (  999/  dremio)
Access: 2022-12-09 05:23:39.475916619 +0000
Modify: 2022-12-09 05:23:39.475916619 +0000
Change: 2022-12-09 05:23:39.475916619 +0000
 Birth: -

BTW, we have a cron job that calls Dremio once per hour through ODBC. I found many log entries showing that retrieving the S3 bucket failed (IOException), and also a few entries (once per hour) for a JDBC query failure caused by an unsupported column data type (the JDBC query failure is probably not related to the fd issue).

The earliest log entry I can find is from 12/01. For example:

grep IOException syslog.7 | grep dremio | wc -l
5466

grep IOException syslog.7 | grep dremio | head -n 1
Dec  1 06:34:11 dremio[26910]: java.io.IOException: Unable to retrieve bucket named xxxx.

grep IOException syslog.7 | grep dremio | tail -n 1
Dec  2 05:54:44 dremio[26910]: Caused by: java.io.IOException: Unable to find bucket named xxxx.
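
To see whether these failures line up with the hourly cron job, they can be bucketed per hour with something like this (a sketch based on the syslog timestamp format above):

# Count "Unable to retrieve bucket" failures per hour.
grep IOException syslog.7 | grep dremio | grep "Unable to retrieve bucket" \
  | awk '{print $1, $2, substr($3, 1, 2) ":00"}' | sort | uniq -c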

The full stack trace is:

 java.io.IOException: Unable to retrieve bucket named xxxx.
  com.dremio.plugins.util.ContainerFileSystem.getFileSystemForPath(ContainerFileSystem.java:305) 
  com.dremio.plugins.util.ContainerFileSystem.getFileStatus(ContainerFileSystem.java:479) 
  com.dremio.exec.store.dfs.FileSystemWrapper.getFileStatusSafe(FileSystemWrapper.java:304)
  com.dremio.exec.store.dfs.FileSystemPlugin.checkMultifileStatus(FileSystemPlugin.java:734) 
  com.dremio.exec.store.dfs.FileSystemPlugin.checkReadSignature(FileSystemPlugin.java:694) 
  com.dremio.exec.catalog.SourceMetadataManager.refreshDataset(SourceMetadataManager.java:514) 
  com.dremio.exec.catalog.ManagedStoragePlugin.refreshDataset(ManagedStoragePlugin.java:395) 
  com.dremio.exec.catalog.CatalogImpl.refreshDataset(CatalogImpl.java:455) 
  com.dremio.exec.catalog.DelegatingCatalog.refreshDataset(DelegatingCatalog.java:164) 
  com.dremio.exec.planner.sql.handlers.direct.RefreshTableHandler.toResult(RefreshTableHandler.java:58) 
  com.dremio.exec.planner.sql.handlers.commands.DirectCommand.plan(DirectCommand.java:61) 
  com.dremio.exec.work.foreman.AttemptManager.run(AttemptManager.java:303) 
  java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_292]
  java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_292]
  java.lang.Thread.run(Thread.java:748) [na:1.8.0_292]
 Caused by: java.io.IOException: Unable to find bucket named xxxx.
  • At the same time, the S3 bucket can be accessed from the VM successfully.

  • We changed the S3 source from the old one to a new one about a month ago via the GUI, without restarting Dremio. I’m not sure whether this is related to the issue.

  • I don’t think we have set “store.accurate.partition_stats”, since the support keys list is empty on the Support page in the GUI.

Thanks for the quick response :)

@qingdaojunzuo Just saw that you are on an older version and that it is complaining about a Lucene file. How big is your “db” folder?

@balaji.ramaswamy, the folder sizes are:

/opt/dremio/data/db# du -sh *
4.7G    catalog
204K    metadata
193M    search

Thanks

The open files limit you sent me, was it from the coordinator?

Yes, we set up only one node for Dremio; the coordinator and the worker are the same instance.

@qingdaojunzuo Do you know if this happens when a job is run? If yes, are you able to send us the profile?

I’m not sure about this, since:

  • The fd leak (let’s call the underlying issue that) is the reason Dremio stopped working today. After the restart, the number of fds has grown only slightly (it does seem to be growing) over the past 2-3 hours, from about 1400 to 1500 (maybe 30-50 more), even though the IOExceptions above have already occurred 2-3 times.

  • I think the cron job is one way to trigger the IOException. The actual operation Dremio performs is an internal refresh of the S3/Postgres metadata, on top of the periodic refresh Dremio does by itself (interval metadata refresh).

  • The fds that increased the most are “pipe” and “eventpoll”, at a 2:1 ratio:

     48 [eventpoll]
     96 pipe

I’m not sure whether there is any relationship between the two issues; they just happened at the same time.

The code behind the stack trace suggests the IOException may be expected, and I’m not sure whether any real I/O is involved.

I simply decompiled the classes from the stack trace:

private ContainerHolder getFileSystemForPath(Path path) throws IOException {
    String name = getContainer(path);
    ContainerHolder container = (ContainerHolder)this.containerMap.get(name);
    if (container == null)
      try {
        synchronized (this.refreshLock) {
          container = (ContainerHolder)this.containerMap.get(name);
          if (container == null) {
            container = getUnknownContainer(name);
            if (container == null)
              throw new IOException(String.format("Unable to find %s named %s.", new Object[] { this.containerName, name })); 
            ImmutableMap.Builder<String, ContainerHolder> newMap = ImmutableMap.builder();
            newMap.putAll((Map)this.containerMap);
            newMap.put(name, container);
            this.containerMap = newMap.build();
          } 
        } 
      } catch (Exception ex) {
        throw new IOException(String.format("Unable to retrieve %s named %s.", new Object[] { this.containerName, name }), ex);
      }  
    return container;
  }
protected ContainerFileSystem.ContainerHolder getUnknownContainer(String name) {
    return null;
  }

The container map is built in refreshFileSystems:

public void refreshFileSystems() throws IOException {
    synchronized (this.refreshLock) {
      Map<String, ContainerHolder> oldMap = new HashMap<>((Map<? extends String, ? extends ContainerHolder>)this.containerMap);
      ImmutableList.Builder<ContainerFailure> failures = ImmutableList.builder();
      ImmutableMap.Builder<String, ContainerHolder> newMap = ImmutableMap.builder();
      for (ContainerCreator creator : getContainerCreators()) {
        ContainerHolder fs = oldMap.remove(creator.getName());
        if (fs != null) {
          newMap.put(creator.getName(), fs);
          continue;
        } 
        try {
          newMap.put(creator.getName(), creator.toContainerHolder());
        } catch (Exception ex) {
          logger.warn("Failure while attempting to connect to {} named [{}].", new Object[] { this.containerName, creator.getName(), ex });
          failures.add(new ContainerFailure(creator.getName(), ex));
        } 
      } 
      this.containerMap = newMap.build();
      for (ContainerHolder old : oldMap.values()) {
        try {
          old.close();
        } catch (IOException ex) {
          logger.warn("Failure while closing {} named [{}].", new Object[] { this.containerName, old.getName(), ex });
        } 
      } 
      this.failures = (List<ContainerFailure>)failures.build();
    } 
  }

protected Iterable<ContainerFileSystem.ContainerCreator> getContainerCreators() throws IOException {
    String externalBucketList = getConf().get("dremio.s3.external.buckets");
    FluentIterable<String> buckets = (externalBucketList == null) ? FluentIterable.of((Object[])new String[0]) : FluentIterable.of((Object[])externalBucketList.split(",")).transform(input -> input.trim()).filter(input -> !Strings.isNullOrEmpty(input));
    if (getConf().get("fs.s3a.aws.credentials.provider") == "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider" || 
      getConf().get("fs.s3a.aws.credentials.provider") == "org.apache.hadoop.fs.s3a.SharedInstanceProfileCredentialsProvider")
      buckets = buckets.append((Iterable)FluentIterable.from(this.s3.listBuckets()).transform(input -> input.getName())); 
    return (Iterable<ContainerFileSystem.ContainerCreator>)FluentIterable.from((Iterable)buckets.toSet()).transform(input -> new BucketCreator(getConf(), input));
  }

Thanks

@qingdaojunzuo The problem is that you are on such an old version of Dremio that the code base has entirely changed. Unrelated to this issue, is there any reason you are still on version 3.x?

@balaji.ramaswamy Yes, the code base has changed a lot. Since we have built so many PDS/VDS on this version, moving to a new version and testing it would currently be quite expensive for us, so…

We may plan an upgrade in the long term, but not for now :sweat_smile:

Thanks

@balaji.ramaswamy ,

We have been monitoring the leak over the past few days and found that the pipe/eventpoll increase seems to accompany the following error when refreshing Redshift data: a socket timeout during metadata refresh.

date, time, total, epoll, pipe
2022-12-12,05:24:46,1978,201,402
2022-12-12,05:25:18,1987,202,404
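
(Counts in this format can be sampled with a small loop along the following lines; this is only a sketch of our monitoring, and the pid lookup and lsof parsing are assumptions. The error it coincides with is shown right after.)

# Sketch: periodically sample total/eventpoll/pipe fd counts into a CSV.
DREMIO_PID=$(pgrep -f dremio | head -n 1)                 # pid lookup is an assumption
while true; do
  TOTAL=$(ls /proc/${DREMIO_PID}/fd | wc -l)
  EPOLL=$(lsof -p ${DREMIO_PID} | grep -c '\[eventpoll\]')
  PIPE=$(lsof -p ${DREMIO_PID} | awk '$NF == "pipe"' | wc -l)
  echo "$(date +%F,%T),${TOTAL},${EPOLL},${PIPE}" >> fd_counts.csv
  sleep 30
done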

Dec 12 13:25:15 ip-10-2-41-53 dremio[7009]: 2022-12-12 13:25:15,593 [metadata-refresh-Redshift xxx] WARN  c.d.e.store.jdbc.JdbcSchemaFetcher - Took longer than 5 seconds to query row count for xxx.xxx, Using default value of 1000000000.
java.sql.SQLTimeoutException: [Amazon](500053) The TCP Socket has timed out while waiting for response

It seems the row-count statistics query on the table failed because it timed out. I found the following parameter that may be related:

public static final String JDBC_ROW_COUNT_QUERY_TIMEOUT = "store.jdbc.row_count_query_timeout_seconds";
  
public static final TypeValidators.LongValidator JDBC_ROW_COUNT_QUERY_TIMEOUT_VALIDATOR = (TypeValidators.LongValidator)new TypeValidators.PositiveLongValidator("store.jdbc.row_count_query_timeout_seconds", 2147483647L, 5L);

Is there a way to configure this value from configuration files or some other mechanism, without a code change, so that we can do some testing to see whether it is related to the issue?

I know the codebase has changed a lot; I just want to try whether the value can be set without a code change.
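
One thing we may try, if it applies to 3.0.6, is setting the support key through SQL over the REST API instead of a code change. A rough sketch (the endpoint paths, the _dremio token header format, and the example value of 30 seconds are assumptions based on newer Dremio documentation):

# Sketch: set the support key via ALTER SYSTEM through the REST API (may not exist on 3.0.6).
TOKEN=$(curl -s -X POST http://localhost:9047/apiv2/login \
  -H 'Content-Type: application/json' \
  -d '{"userName":"admin","password":"xxxx"}' | sed 's/.*"token":"\([^"]*\)".*/\1/')
curl -s -X POST http://localhost:9047/api/v3/sql \
  -H "Authorization: _dremio${TOKEN}" -H 'Content-Type: application/json' \
  -d '{"sql": "ALTER SYSTEM SET \"store.jdbc.row_count_query_timeout_seconds\" = 30"}'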

Appreciate your help :grinning: