sunfishcode's blog
A blog by sunfishcode

Fork versus Elegance


The research paper A Fork in The Road contains a good summary of the problems with the fork call in Unix.

As the paper points out, posix_spawn provides an alternative to fork which solves some of the performance problems. However, if we set aside performance for the moment, posix_spawn's API gives the impression that, if this is the alternative, perhaps we should reconsider giving up on fork.

posix_spawn needs a whole flock of posix_spawn_* helper functions, which provide ways to configure various aspects of the child process:

[Figure: the posix_spawn helper function API]

This is inelegant, because POSIX already has functions for doing all these things, except that they only work on the calling process. So posix_spawn has alternate versions of all of them that operate on the child instead of the parent. And even with all the features made available this way, it still doesn't cover everything that people want to do.
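To make the duplication concrete, here is a minimal sketch of redirecting a child's stdout with posix_spawn. The helper name spawn_with_stdout is made up for illustration; the posix_spawn_file_actions_* calls are the standard ones. Note how posix_spawn_file_actions_adddup2 is just dup2 again, but recorded for the child rather than performed on the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run argv[0] (looked up on PATH) with its stdout
 * redirected to out_fd. Returns the child's exit status, or -1 on failure. */
int spawn_with_stdout(char *const argv[], int out_fd)
{
    posix_spawn_file_actions_t actions;
    pid_t pid;
    int status;

    if (posix_spawn_file_actions_init(&actions) != 0)
        return -1;

    /* The child-side twin of dup2(out_fd, STDOUT_FILENO). */
    int err = posix_spawn_file_actions_adddup2(&actions, out_fd, STDOUT_FILENO);
    if (err == 0)
        err = posix_spawnp(&pid, argv[0], &actions, NULL, argv, environ);

    posix_spawn_file_actions_destroy(&actions);
    if (err != 0)
        return -1;

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Every other aspect of the child that needs configuring means another call from the same family, each one shadowing a function that already exists for the current process.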

In all, posix_spawn just doesn't feel "Unixy".

In contrast, fork allows the child to be configured with the same APIs as the parent. It can do everything, with no API duplication. Does this mean that maybe fork is an elegant way to design systems after all?
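For comparison, here is a rough sketch of the fork style. The helper name fork_in_dir is hypothetical; everything it calls is ordinary POSIX. Between fork and exec, the child configures itself with plain chdir and dup2, the very same calls the parent would use on itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] (looked up on PATH) in directory dir,
 * with stdout redirected to out_fd. Returns the child's exit status,
 * or -1 on failure. */
int fork_in_dir(const char *dir, char *const argv[], int out_fd)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: the regular APIs now act on the child itself. */
        if (chdir(dir) != 0 || dup2(out_fd, STDOUT_FILENO) < 0)
            _exit(127);
        execvp(argv[0], argv);
        _exit(127); /* Only reached if exec failed. */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```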

Not necessarily.

A different perspective

One reason why posix_spawn is simultaneously complex and insufficient is that Unix implicitly associates a lot of resources with each process, and each of those resources then needs its own way to be configured.

One such resource is the current directory. It acts a lot like a file descriptor for a directory, except that the OS implicitly holds onto it on behalf of the process.

There isn't any fundamental reason why userspace couldn't hold onto this file descriptor itself. If a current directory handle were passed into a process alongside stdin, stdout, and stderr, userspace could use it and manage it explicitly, and there'd be no need for a dedicated chdir system call. And in a posix_spawn situation, the parent could pass any file descriptor to be the child's current directory, using the same mechanism as passing other file descriptors.
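POSIX already offers a taste of this with the *at family of calls. The sketch below is only an illustration, and the helper names (open_dir_handle, write_in_dir, read_in_dir) are made up. It treats a directory file descriptor as an explicit "current directory" and resolves relative names against it with openat, never calling chdir.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open a directory to use as an explicit handle. */
int open_dir_handle(const char *path)
{
    return open(path, O_RDONLY | O_DIRECTORY);
}

/* Create or truncate `name` relative to dir_fd and write len bytes to it.
 * Returns the number of bytes written, or -1 on failure. */
ssize_t write_in_dir(int dir_fd, const char *name, const void *data, size_t len)
{
    int fd = openat(dir_fd, name, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, data, len);
    close(fd);
    return n;
}

/* Read up to len bytes from `name` relative to dir_fd.
 * Returns the number of bytes read, or -1 on failure. */
ssize_t read_in_dir(int dir_fd, const char *name, void *buf, size_t len)
{
    int fd = openat(dir_fd, name, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, len);
    close(fd);
    return n;
}
```

Because the directory is just a file descriptor here, handing a child a different "current directory" would be no different from handing it a different stdout.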

So, what if we had an OS that took this even further? What if we moved even more state out of the process, and into resources that would be explicitly managed via file descriptors?

That would let parent processes easily configure resources for their child processes without needing special APIs. All they'd have to do is use the regular system calls to set up the file descriptors they want their child processes to have.

But wait, there's more

If we managed Linux namespaces with file descriptors, we could avoid having a bunch of flags on clone to say what parts of the parent get copied to the child. And we could give userspace more control over how it wants to share things.

To be sure, at this point we're talking about a fair number of file descriptors. A design like this would likely also want better ways to manage and pass around file descriptors. As the paper above points out, the fork+exec way of having children inherit file descriptors from the parent (with O_CLOEXEC as an awkward opt-out) already isn't great. Perhaps what we'd want is a way to pass file descriptors to the child explicitly, rather than relying on children implicitly inheriting them. At this point though, I'll leave that for possible future blog posts.
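As a small illustration of that opt-out, here is a sketch using the standard FD_CLOEXEC descriptor flag through fcntl. The helper names set_inheritable and is_inheritable are hypothetical. Inheritance across exec is the default, and each descriptor has to be marked individually to avoid it.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd would survive an exec,
 * 0 if it is marked close-on-exec, or -1 on error. */
int is_inheritable(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return (flags & FD_CLOEXEC) == 0;
}

/* Hypothetical helper: choose whether fd is passed on to exec'd programs.
 * Returns 0 on success, or -1 on error. */
int set_inheritable(int fd, int inheritable)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    if (inheritable)
        flags &= ~FD_CLOEXEC;
    else
        flags |= FD_CLOEXEC;
    return fcntl(fd, F_SETFD, flags) < 0 ? -1 : 0;
}
```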

What does it all mean?

fork's apparent simplicity is built on an underlying complex system, which attaches a lot of implicit resources to processes.

A new OS that doesn't attach as many implicit resources to processes wouldn't necessarily need either fork or posix_spawn's flock of helpers as primitives.