Mono runtime does not raise open file limit on Linux #82428

@uweigand

We've been running into problems on s390x where running dotnet restore would under certain circumstances abort with a "Too many open files" error message. The details are documented in NuGet/Home#12410.

Initially, I thought this was not related to the Mono runtime, but simply caused by timing differences in the I/O stack on s390x compared to other platforms. Therefore, I created a patch to fix the issue in NuGet by restricting the number of network connections it attempts to open simultaneously.

However, it turned out that this issue is related to the Mono runtime as well. Specifically, when running dotnet using the CoreCLR runtime, the runtime startup code will call this routine:

/*++
Function:
  INIT_IncreaseDescriptorLimit [internal]

Abstract:
  Calls setrlimit(2) to increase the maximum number of file descriptors
  this process can open.

Return value:
  TRUE if the call to setrlimit succeeded; FALSE otherwise.
--*/
static BOOL INIT_IncreaseDescriptorLimit(void)

which raises the (soft) limit on the number of open files (typically 1K on Linux) to the corresponding hard limit (typically 1M!).
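For illustration, the core of that soft-to-hard bump can be sketched with plain getrlimit/setrlimit. This is a minimal sketch under Linux assumptions, not the CoreCLR source; the helper name raise_fd_soft_limit_to_hard is hypothetical, and platform-specific caps (for example on macOS) are deliberately left out.

```c
#include <stdbool.h>
#include <sys/resource.h>

/* Hypothetical helper (not the CoreCLR code): raise the soft
 * RLIMIT_NOFILE limit to the current hard limit.
 * Returns true if the soft limit equals the hard limit afterwards. */
bool raise_fd_soft_limit_to_hard(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return false;

    /* Already at the ceiling: nothing to change. */
    if (lim.rlim_cur == lim.rlim_max)
        return true;

    /* An unprivileged process may raise its soft limit up to,
     * but not beyond, the hard limit. */
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_NOFILE, &lim) == 0;
}
```

A runtime would call something like this once during startup, before opening sockets or files, and could safely ignore a false return since the process then simply keeps its original limit.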

This is currently not done by the Mono runtime, at least not on Linux. There is a similar routine here:

/*
 * tries to increase the minimum number of files, if the number is below 1024
 */
static void
darwin_change_default_file_handles ()
{
    struct rlimit limit;

    if (getrlimit (RLIMIT_NOFILE, &limit) == 0){
        if (limit.rlim_cur < 1024){
            limit.rlim_cur = MAX(1024,limit.rlim_cur);
            setrlimit (RLIMIT_NOFILE, &limit);
        }
    }
}

but this is not invoked on Linux, only on Darwin, and raises the limit only up to 1K.

I believe that Mono should behave the same as CoreCLR here and raise the soft limit as far as possible. This would be preferable to the NuGet fix (which does affect NuGet performance somewhat), and would be useful even otherwise - e.g. an ASP.NET Core based Web server might expect to be able to handle a large number of simultaneous connections, and because it works under CoreCLR it really ought to work under Mono too.

The one question is whether it is safe to raise the limit in the Mono runtime. The reason for the default of 1K on Linux is that the select library routine and its associated data types like fd_set cannot handle file descriptors numbered 1024 (FD_SETSIZE) or higher, so programs that can run under a higher rlimit must avoid this routine in favor of more modern solutions like (at least) poll. This is already done by the native code backing the C# library implementation. However, there are two places remaining in Mono code itself that do use select:

  • the Mono log profiler src/mono/mono/profiler/log.c
  • the Mono debugger agent src/mono/mono/component/debugger-agent.c

There's some code in both of these places that already attempts to handle file descriptors larger than 1K, but this will lead to some (graceful) degradation in service. Ideally, I guess those places should be rewritten to use poll where available, e.g. by using the mono_poll facility. However, I'm not sure whether this is a precondition for accepting a patch to raise the rlimit ...
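To make the select concern concrete, here is a rough sketch of the two pieces involved: a range check for whether a descriptor can legally go into an fd_set, and a poll-based wait that has no such range restriction. Both function names are hypothetical, and the sketch uses plain POSIX poll as a stand-in rather than Mono's mono_poll facility, whose exact interface is not shown here.

```c
#include <poll.h>
#include <stdbool.h>
#include <sys/select.h>

/* fd_set is a fixed-size bitmap, so only descriptors in the range
 * 0 .. FD_SETSIZE-1 may be passed to FD_SET(). */
bool fd_fits_in_fd_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}

/* Wait up to timeout_ms milliseconds for fd to become readable.
 * Uses poll(2), which takes the descriptor as a plain int and so
 * works for any valid descriptor value.
 * Returns 1 if an event was reported, 0 on timeout, -1 on error
 * (including an invalid descriptor). */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc <= 0)
        return rc;              /* 0 = timeout, -1 = poll failed */
    if (pfd.revents & POLLNVAL)
        return -1;              /* fd was not an open descriptor */
    return 1;                   /* readable, hung up, or error pending */
}
```

A select-based caller could use the first helper to decide when to fall back (the graceful degradation mentioned above), while a poll-based rewrite would not need the check at all.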

I have implemented a patch that raises the rlimit using the very same logic as the CoreCLR implementation, and this does indeed also fix the dotnet restore problem (without any change to NuGet). I'll be posting a PR shortly.

I'd appreciate feedback as to whether this would be acceptable as-is or whether the places using select need to be eliminated first. In the latter case, I'd be happy to work on this - but I'd appreciate some pointers as to how to exercise / test these code paths.

CC @omajid @tmds @steveisok @directhex @vargaz @akoeplinger @nealef
