Why FreeBSD should not adopt launchd
Firstly, because of the visceral nature of these discussions, I would like to add a disclaimer that this article is not born of Luddism (quite the contrary, as you shall see) and, more importantly, that I do not oppose having Mach/libSystem/launchd integration inside derivative products of FreeBSD, such as FreeNAS or PC-BSD. This is strictly about FreeBSD and why I am skeptical of the goals of the NextBSD project led by iXsystems, which is pushing the OS X-ification effort.
I have been keeping an eye on NextBSD for some time, since back when it was just openlaunchd, an effort started by R. Tyler as a GSoC student in 2005 to, unsurprisingly, port the launchd system and service manager to FreeBSD. It stalled for a long time until its revival in late 2013, and even then it moved very slowly.
Around November of 2014 at the MeetBSD conference, Jordan Hubbard delivered a talk entitled “FreeBSD: The Next 10 Years,” which outlined a general desire for a more “event-driven” and unified configuration approach to FreeBSD, strongly implying the use of launchd as system bootstrap and service daemon, as well as other parts of the OS X low-level userspace.
At the time, a lot of people (including myself) considered this more of a hypothetical than a true ongoing effort. The people involved certainly kept a low profile. However, with my growing research interest in GNU Hurd, I was quickly reminded of the openlaunchd effort and went to ask Kip Macy about the status of a Mach IPC port for FreeBSD.
As it turned out, not only had he ported a partial OSF Mach implementation, but also Apple System Logger, XPC, libdispatch, libnotify, libaudit and launchd with a JSON parser.
At that point, I realized that things had gotten real. Soon after, the project was officially unveiled at a Bay Area FreeBSD User Group meeting in August 2015, with a talk featuring Jordan Hubbard and Kip Macy.
Which brings me to this article, and why I believe this is misguided. Though there is nothing wrong with experimentation, I will attempt to show the reader why launchd and Mach are not good fits for FreeBSD.
launchd: a chimera architecture
The gist of it is that the launchd architecture is so specific to OS X’s particular use cases that trying to port it verbatim to another system leads to a conceptual mismatch.
In the context of OS X, launchd is, among other things, the Mach bootstrap server, managing a namespace of name-port bindings for discovery purposes among Mach services. Since Mach port namespaces are internal to tasks, a central arbiter was chosen to handle lookups and service registration.
It’s very important to note that this is by no means a necessity. For instance, GNU Hurd does not have an equivalent of a bootstrap server. Instead, each process is assigned a bootstrap port for rendezvous with the server library that starts it. Bootstrapping is therefore a decentralized activity. The NextBSD developers, however, largely think of Mach semantics in terms of how OS X specifically implements them rather than in general, and proceed to infer and extrapolate a specific architecture as if it were fiat.
launchd was probably the first system to popularize the buzzword idea of “socket activation”, really a rebranding of the age-old ideas of fd-passing and inetd-style service launching. It is not a generic scheme like UCSPI, but an ad-hoc, centrally managed protocol. To exploit it fully, daemon writers are encouraged to link to liblaunch, which creates an undesirable lock-in.
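To make the "age-old idea" concrete, here is a minimal sketch of inetd-style launching, written in Python for brevity. It is not how launchd or inetd is implemented. The handler program, the function names and the use of an AF_UNIX socket are assumptions chosen to keep the example self-contained. The point is that the service itself contains no socket code and links against nothing: it simply inherits the connection as stdin and stdout.

```python
import os
import socket
import subprocess
import sys
import tempfile

# Hypothetical service: like an inetd "nowait" program, it only knows about
# stdin and stdout and contains no socket code at all.
HANDLER = "import sys; sys.stdout.write(sys.stdin.readline().upper())"


def serve_one(listener: socket.socket) -> None:
    """Accept one connection and launch the handler on demand.

    The accepted socket becomes the child's stdin and stdout, so the
    descriptor is handed over by plain inheritance across exec.
    """
    conn, _ = listener.accept()
    with conn:
        subprocess.run(
            [sys.executable, "-c", HANDLER],
            stdin=conn.fileno(),
            stdout=conn.fileno(),
            check=True,
        )


def demo_inetd() -> bytes:
    """Queue one request, let the launcher start the service, return the reply."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "echo.sock")  # AF_UNIX keeps the demo off the network
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as listener:
            listener.bind(path)
            listener.listen(1)
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(path)        # succeeds once queued in the backlog
                client.sendall(b"hello\n")  # sent before any service process exists
                serve_one(listener)
                return client.recv(64)
```

The client connects and sends its request while no service process is running; the service is only started once there is work for it, which is all that "socket activation" amounts to.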
However, the core blot of launchd is its absolutely muddy and confused semantics for categorizing services. launchd is designed to start and supervise daemons, so-called agents (really per-user/session daemons), Mach services and XPC services. Right off the bat, the latter two are OS X-specific and bound to an IPC mechanism, which is an inconsistent way to categorize a service. An agent can then be one of four “session types”: pre-login, per-user, GUI session and non-GUI session. Notice how the idea of GUI and pre-login session types is quite clearly designed with the Aqua interface and the loginwindow application (OS X-specific display manager) in mind. They are irrelevant and an impedance mismatch on other platforms. The idea of a non-GUI session agent, then, is tightly coupled to how launchd starts sshd from its plist, which is a peculiarly limited domain to devote an entire agent type to.
Further complicating things are the actual process types, as documented in launchd.plist(5). Where even something like systemd treats them as information for what startup routines should be performed, here they’re tightly coupled to what scheduling policy and resource limits should be applied! There are four types, ambiguously named Background, Standard, Adaptive and Interactive. Background is straightforward: it’s a regular service daemonized by launchd… but with automatic resource limiting. Standard is the implicit default. Adaptive switches between Background and Interactive using XPC, so it’s a process type coupled to a particular way of using an IPC mechanism. And then Interactive is, quite confusingly, one without any resource limits. You might be baffled at this quagmire. Well, it does make decent sense for a desktop where you want good priority scheduling and resource accounting so that applications can be kept responsive even as system events queue up. But for a server and for embedded systems, which is what FreeBSD is primarily used for? It is absolutely nonsensical. Nor is this behavior even well observable.
Moreover, launchd is largely limited to one startup discipline: lazy loading of services on demand. Now, for example, something like daemontools is based on the philosophy of “let it crash” and restarting services continuously until their dependencies are met. Because of its low overhead and versatility, it can be used as a standard eager service manager that boots everything up at once, but it can also be made lazy through the use of simple socket toolkit interfaces (similar to inetd, but much more modular) like UCSPI. systemd, too, can be used in both lazy and eager fashions, achieving this via its complicated dependency system.
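The “let it crash” discipline is simple enough to sketch in a few lines. The following is a toy illustration in Python, not daemontools itself: the flaky service, the state file and the function names are all hypothetical, and unlike the real supervise, which restarts unconditionally and forever, this loop stops on a clean exit so that the demo terminates.

```python
import subprocess
import sys
import tempfile
import time
from pathlib import Path

# Hypothetical flaky service: it counts its own launches in a file and only
# succeeds on the third one, standing in for a dependency that comes up late.
SERVICE = """
import sys, pathlib
state = pathlib.Path(sys.argv[1])
launches = int(state.read_text()) + 1 if state.exists() else 1
state.write_text(str(launches))
sys.exit(0 if launches >= 3 else 1)
"""


def supervise(argv, max_launches=10, delay=0.0):
    """Relaunch argv every time it exits non-zero; return the launch count.

    Stopping on a clean exit or after max_launches is a simplification so
    that the demo ends; a real supervisor keeps the service up indefinitely.
    """
    launches = 0
    while launches < max_launches:
        launches += 1
        if subprocess.run(argv).returncode == 0:
            break
        time.sleep(delay)  # a real supervisor rate-limits restarts here
    return launches


def demo_supervise() -> int:
    """Run the flaky service under supervision and report how often it ran."""
    with tempfile.TemporaryDirectory() as tmp:
        state = Path(tmp) / "launches"
        return supervise([sys.executable, "-c", SERVICE, str(state)])
```

Note that the supervisor knows nothing about dependencies or ordering. The service fails until whatever it needs is there, and the restart loop does the rest.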
launchd, though, can’t handle eagerness particularly well. It has no formal dependency or ordering system, and it doesn’t fit well with the “let it crash” paradigm, either. In fact, quoting from “Creating Launch Daemons and Agents”, it can’t play well with non-launchd daemons in general:
If your daemon has a dependency on a non-launchd daemon, you must take additional care to ensure that your daemon works correctly if that non-launchd daemon has not started when your daemon is started. The best way to do this is to include a loop at start time that checks to see if the non-launchd daemon is running, and if not, sleeps for several seconds before checking again.
Be sure to set up handlers for SIGTERM prior to this loop to ensure that you are able to properly shut down if the daemon you rely on never becomes available.
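In other words, every daemon author is asked to hand-roll something like the following. This is only a sketch of the quoted advice in Python, assuming a caller-supplied probe; the names wait_for_dependency and is_running are hypothetical and not part of any launchd API.

```python
import signal
import time


def wait_for_dependency(is_running, interval=3.0):
    """Poll is_running() until it returns true or SIGTERM arrives.

    Returns True once the dependency is up, False if we were asked to
    terminate first. is_running is a hypothetical caller-supplied probe.
    """
    stop = False

    def on_term(signum, frame):
        nonlocal stop
        stop = True

    # Install the handler before looping, as the documentation advises.
    previous = signal.signal(signal.SIGTERM, on_term)
    try:
        while not stop:
            if is_running():
                return True
            time.sleep(interval)  # "sleeps for several seconds"
        return False
    finally:
        signal.signal(signal.SIGTERM, previous)  # restore the old handler
```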
To help get around this, launchd provides a hack in the form of the KeepAlive key to bind a service’s uptime with a particular resource condition. Just about all of its options suffer from intrinsic race conditions. There’s NetworkState, which binds to a network interface being up, here being defined as “at least one non-loopback interface being up and having IPv4 or IPv6 addresses assigned to them,” and PathState, for keeping a job alive as long as an inode at a given path exists, with all the raciness that comes with file system event notifications. But, most laughably, there’s OtherJobEnabled. That’s right. You can couple a job to another job being enabled by its label. A poor man’s dependency system that’s ripe for abuse and completely subverts the whole supposed point for launchd’s existence to begin with.
Then, of course, on top of all this, you have one-shot jobs with a key named LaunchOnlyOnce (as opposed to being some sort of type, which would make more sense), crond-like functionality with StartCalendarInterval that reportedly doesn’t play well with an actual crond, more ambiguous resource balancing keys like LimitLoadTo and LowPriorityIO and the ability to yield to a debugger first via WaitForDebugger (Does this mean launchd breaks ptrace(2) semantics? I hope not.)
Let’s not neglect that all of this service state is being meshed with the global system state, instead of being isolated into different processes with different responsibilities.
And by the way. Everything I described? All of it is in PID 1. That’s just a reliability concern no matter what way you slice it. It might be fine for desktop systems, but it is not generic in the slightest.
Another reliability concern is that configuration parsing is done in PID 1, too. With Apple’s launchd, the format is XML. With NextBSD’s launchd, it’s a bit saner. They chose JSON, instead. But in general, parsing should probably be handled outside of the critical base of PID 1, since parsers have long been notorious sources of bugs and are particularly susceptible to input fuzzing.
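As a rough illustration of what “outside of PID 1” could look like, here is a sketch in Python where a throwaway child process does the parsing and hands back only two validated strings in a trivial line format. The key names Label and Program are borrowed from launchd.plist(5); whether NextBSD’s JSON format uses the same names is an assumption, as are the helper and function names.

```python
import subprocess
import sys

# Hypothetical helper, run as a separate short-lived process. It does the
# risky parsing and emits only two validated strings, one per line.
PARSER = """
import json, sys
job = json.load(sys.stdin)
fields = [job["Label"], job["Program"]]
if not all(isinstance(f, str) and "\\n" not in f for f in fields):
    sys.exit(1)
sys.stdout.write("\\n".join(fields) + "\\n")
"""


def load_job(text: str):
    """Return (label, program) parsed out of process, or None on any failure."""
    result = subprocess.run(
        [sys.executable, "-c", PARSER],
        input=text,
        capture_output=True,
        text=True,
    )
    # Expect exactly "label\nprogram\n"; anything else is treated as a failure.
    parts = result.stdout.split("\n")
    if result.returncode != 0 or len(parts) != 3:
        return None
    return parts[0], parts[1]
```

If the parser chokes on malformed input, only the helper dies. The supervising process sees a non-zero exit status and carries on.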
Ultimately, the launchd architecture is a mess of layering violations, semantic inconsistencies and odd ambiguous interfaces that make it really awkward to use. This emerges from the fact that it tries to be the manager of just about every unit of CPU time on the whole damned system. It’s clearly a tool tailored very specifically to OS X’s constraints. It’s designed mostly to be developer-facing and for shaping the desktop experience, rather than something a sysadmin would heavily use. That’s fine for OS X. It’s clumsy and not fine for FreeBSD, or really much of anything else.
Mach: An anachronism
Jordan Hubbard claims that there isn’t such a grand impedance mismatch between Mach and Unix concepts, but I’d contest that.
In nextbsd.org’s Welcome to NextBSD post, they make the following statement:
We hope to be more progressive and willing to try new things. We do not fear changing the startup system or dragging Unix kicking and screaming into the 21st century by its hair, if necessary.
I couldn’t help but laugh.
They’re going to drag Unix kicking and screaming into the 21st century… by integrating a mid-80s first-generation research microkernel for its infamous IPC.
Now, keep in mind I’m not bashing Mach per se. The GNU Hurd does fine with its GNU Mach and has even done some interesting refactoring and updates on it throughout the years. I’m sure OS X does fine with Mach in XNU, as well (its IPC is used extensively).
Consciously adopting its interfaces in 2015, however… it’s such a baffling anachronism, especially for a project so priding itself on its futurism and vision. Now, of course, the OSF Mach interfaces from MkLinux they’ve shimmed in are mostly just a means to getting the OS X systems stuff, but they try to justify it on its own merits, too.
Now, with such a decision, it would have been best to have a userland Mach server. This is however impractical, because Mach semantics are in fact really tricky to pin down on a monolithic Unix. And even as a kernel module, they don’t support memory objects, because those are backed by a store from an external pager (i.e. a userspace page fault handler). Monolithic Unixes have never had good mechanisms for achieving this. An old hack is trapping SIGSEGV (e.g. with GNU libsigsegv), but that’s very limited. A userfaultfd(2) mechanism was proposed for Linux, inspired by work on KVM, but it has yet to be merged.
Mach enforces a very strict task-thread dichotomy, i.e. a dichotomy between the virtual address space and the unit of CPU time running under it. Together, they’re combined to form a process. In FreeBSD’s case, they had to shortcut it in an ugly fashion by assuming task = process and thread = kthread, which violates proper Mach semantics. The same would probably need to be done on Linux, by the way.
The Mach VM system appears to be well ported. Mach’s vm_read() and vm_write() functions work by mapping to and from discrete task address spaces, so that was a likely motivator for making it a kernel module. It is still a reliability concern.
The NextBSD slides praise Mach IPC as facilitating separate service namespaces that aren’t file systems, as if being a file system were a huge disadvantage. Using Plan 9 and Inferno has taught me that 9P with per-process namespaces is much more elegant and versatile than a table of integer descriptors, which is what a port namespace is.
They mention a “pre-existing well-defined RPC interface,” which I presume is in reference to MIG. MIG is a compiler that translates RPC definitions from a language called Matchmaker into elaborate C code for packing Mach messages. It’s not particularly pleasant to use, mostly a side effect of Mach IPC itself being grueling to use (hence why Apple made XPC), and to the best of my knowledge it has never been used to implement location-transparent communication.
Neither Darwin nor BSD
This is what it boils down to. The NextBSD effort is barely even trying to think about solving problems. They just have people who are or were associated with Apple, are enamored with OS X as a technology and simply ritualistically port the OS X infrastructure as if it were an inherently correct fiat that needs no justification. We want dynamic systems, so of course we’re gonna port launchd. We want nice indexed logging, so of course we’re gonna port ASL. Hey, all of this needs Mach and XPC, so let’s put a partial OSF Mach as a kernel module in the same address space.
The big question: Why wouldn’t I just use OS X instead? I can’t think of any reason. The NextBSD effort as it stands has no intention of doing anything interesting with the OS X-isms but using them directly as upstream intended, thus they’re in a position where they’ll play eternal catch-up with Darwin. If I want to use Darwin directly, I’d run something like PureDarwin. If I want the whole shebang, I’ll just use OS X - it’s already reasonably interoperable with Unix (a licensed Unix, in fact). NextBSD is a profound example of the age-old joke about “Something must be done. This is something, therefore we must do it.”
NextBSD is a project with a severe identity crisis.
What should FreeBSD adopt instead?
There are three practical options I can name.
One is to use nosh, a highly flexible and modular system and service management toolkit made by Jonathan de Boyne Pollard which implements a lot of systemd and launchd-like features, yet has an architecture that is the polar opposite of each of them. It runs on FreeBSD and most rc.d scripts have even been ported.
The second is to follow DragonFly BSD’s example. Matthew Dillon wrote a small yet powerful service manager for it called svc(8), which uses jails for service tracking and isolation, plus the procctl(2) system call for becoming a reaper without being init (similar to the PR_SET_CHILD_SUBREAPER flag in Linux >=3.4’s prctl(2)). Though svc(8) is otherwise rudimentary, the goal would be to write a simple yet complete and versatile service manager that exploits native FreeBSD features.
The third is to do nothing and stick with rc.d. I won’t comment further on that; it’s pretty self-explanatory.
I have identified structural technical issues with the launchd architecture, shown how Mach concepts do not map well to monolithic Unix, and argued the general folly of being a second-fiddle OS X through hacky verbatim porting of foreign system software, instead of exploiting native features or building on top of existing structures.
I hope this post was interesting reading for you.