On Hurd, Linux and the (mis)adventures of cross-compiling a GNU Hurd toolchain

by V.R.

This article is at once a tutorial, a war story and a conceptual introduction to GNU Hurd, in which I set up a cross-toolchain and give a colorful tour through some rough edges of the GNU build system. My host system is Slackware Linux 14.1 (running on -current), i686, which I find preferable due to its highly vanilla nature: it runs software almost entirely without distro-specific patching.

Recently, I have found myself more interested in the Hurd, a project that is well known by name yet surprisingly little understood, and that has had very little attention given to it. In fact, I ran GitStats on the Savannah Hurd repo and found a total lifetime contributor count of 51. Only 51 people have touched the code in the 25 years of the Hurd.

The Hurd is not Linux. It’s not even Unix, though it does impart a Unix-like personality. Its lower level semantics are completely unrelated, however, as I shall elaborate later.

My interest in cross-compiling the GNU Hurd on GNU/Linux wasn’t to build software for it from the relative convenience of my GNU/Linux system. It was, in fact, to study the interactions between the various Hurd libraries (including Mach-specific and Hurd-specific glibc interfaces), the ELF contents of the resulting binaries, the generated RPC definitions and headers, all so as to observe the nature of the Hurd’s runtime from a foreign system. This could potentially allow me to begin mapping a framework to write and load stub interfaces for running the Hurd servers and libraries on a monolithic POSIX system. The forensics of the Hurd, if you will.

It is worth noting that the Hurd developers in fact discourage the practice of cross-compilation and largely insist on using a native Hurd distribution, like Debian GNU/Hurd (which is surprisingly stable these days). However, this did not fit my aforementioned motive, and I also enjoy prying into the depths of the lesser known.

I hope this document will be useful to people embarking on a similar enterprise, given that all other references on setting up a cross-Hurd are now outdated. I also hope the madness of a cross-compiled GNU toolchain will entertain you.

Cross-compilation is indeed known to be a dark art. You’ll be seeing all sorts of spooky system interactions from targeting foreign machine ABIs. Because you’ll be testing corner cases in your compilation and build suites, you’ll also see undocumented macros and you’ll be wiring in the proper paths, watching versioned symbols fail, munging headers, modifying build recipes and so on. Furthermore, it’s a matter of bootstrapping. You’ll be building the same things in multiple passes to handle mutual dependencies. Get the sequence wrong and you might have to redo steps, or even manually invoke the tools to get the intended result.

But the educational experience is definitely worthwhile. I’m certainly more appreciative of the Plan 9 toolchain now.

Let’s dive into an intro, and then we can begin the action. Feel free to skip as desired.

Introduction to the Hurd, Mach and RPC


The GNU kernel goes back to the GNU Project’s conception in 1983. For a few years, the initially proposed target was an MIT research kernel dating from 1980 onwards called TRIX. It had an in-kernel RPC mechanism and ran certain services like file systems in userspace, so it was a bit of a proto-microkernel. In most other aspects, it implemented a conventional V7 Unix interface. Serious work in trying to adopt it for the GNU system began in late 1986, and it was ultimately scrapped in 1987 for architectural portability reasons, and because of the growing interest in procuring a license for CMU’s Mach kernel, which was quickly becoming all the rage at that time.

The chief architect of the Hurd for a long time was Thomas (then Michael) Bushnell, who ended up leaving the project around 2001. He presently works for Google and is also an ordained Gregorian friar, along with remaining a Debian package maintainer.

Bushnell initially intended to adapt a monolithic 4.4BSD-Lite kernel, but was overruled by RMS' aspirations for using Mach, which meant waiting for the licensing situation behind it to clear up. Whether or not this was a mistake depends on your point of view: does one value shipping fast, or shipping properly?

The work on what we know as GNU Hurd today began in 1990, as Bushnell’s brainchild. To this day many of the core architectural principles survive from his design.

The misunderstood intentions of the GNU Hurd

The Hurd is a prime example of a technology that’s famous for being famous. Everyone has a vague knowledge of it, but very few have any understanding of it. Moreover, there are some profound misconceptions that frequently float around it (mostly relating to political topics), which I will now debunk.

First of all the question of microkernels. Microkernels aren’t about code size. They’re about breadth of responsibilities and separation of concerns. V6 Unix had a very small kernel, but it was not a microkernel. Memory management, IPC, units of CPU time, units of resource usage (tasks), a representation of the machine host and even some device I/O handling can all constitute a microkernel, even if more recent designs are further “micro”.

The Hurd is not dead. It has never been a thriving community, but it has always managed to persevere by capturing the interest of a few devoted developers who have periodically come and gone. With Debian GNU/Hurd having >80% of the package base building and DDE drivers from Linux 2.6.x, it is also surprisingly usable. One can run Firefox and Xfce on it, for example. Because of the very low (at least relative to Linux) developer count, it has a remarkably different set of priorities, however.

The Hurd is not some symbolic thing that is kept alive as a political statement by GNU or the FSF, contrary to popular belief. In fact, the FSF shifted away from promoting the Hurd a long time ago and has accepted Linux as the canonical GNU target. The people who work on the Hurd are mostly hobbyists who do not have direct relations to GNU or FSF. Actually, the FSF stopped being an interventionist steward of its projects ages ago, these days occasionally stopping by to raise some points in some of its really visible crown jewels, mostly GCC and Emacs at this point. For most other projects, being part of GNU is largely a ceremonial aspect. The real significance is in the use of a GNU toolchain.

The Hurd’s benefits are not just theoretical. You can experience them for yourself by running a Debian GNU/Hurd distro, though it does require some lower level technical appreciation. They include Unix process semantics implemented as a userspace server; a permission model that goes beyond Unix, with any process able to relinquish capabilities; Berkeley sockets, FIFOs, pipes and TCP/IP in userspace servers; clean separation of disk, storage and VFS; a generic object server mechanism for translating one data representation to another, which may or may not be accessed from a file system node (it uses the FS node as a point of registration and discovery, but that’s it); and so forth. Of further interest is that page fault handling and page replacement are done by userspace servers as well. Individual applications and libraries can set default memory managers implemented through a common interface distinct from the system-wide one, and this is done e.g. by libdiskfs and ext2fs. Thus there is a separation between managing memory as a resource and memory as application content.

The Hurd is not a kernel. Mach is.

The Hurd developers haven’t really had the intention of replacing or “defeating” Linux for a long while, nor do they hold any grudge against it. People work on the Hurd for its own sake, because they see it as an interesting platform. That’s all.

So what is the Hurd?

The Hurd is a set of userspace servers, libraries and daemons that in combination with a microkernel, libc and binutils form a complete, POSIX-y (but still very distinct from Unix) multiserver operating system. The actual logic that gives a Hurd system its Unix personality exists purely as an abstraction in libc on top of more basic primitives like IPC and tasks.

Strictly, the Hurd has no necessity to run on top of any one given microkernel or even run as a particularly Unix-like personality at all. Several attempts to port the Hurd away from Mach into kernels like L4 or Viengoos have fizzled out due to conceptual non-fits or lack of interest/time/resources.

Key to the Hurd is the idea of the translator. This is a server that registers itself as a node in the file system (but does not have to export a VFS itself - this is important) which through standard library interfaces converts one representation of data to another. Since the basis of programming is input-process-output, translators are used to implement near arbitrary system logic.

It’s important to note that my last paragraph wasn’t exactly precise. Unlike Mach, the Hurd makes a distinction between “translator” and “server”. Translators do export virtual file systems, but servers do not. All translators are servers in Mach parlance, but not all servers are translators. Most services in the Hurd are translators because of the flexibility of a VFS namespace, but there is no obligation of any sort to use it for the actual data you’ll be delivering to a client, beyond registration in the FS namespace for discovery purposes.

There are three general types of translators with corresponding libraries - virtual translators (libnetfs), single-file [“trivial”] translators (libtrivfs) and physical store-backed translators (libdiskfs).

Other services include authentication, Unix process semantics, a VT console, a termios-compatible mode setting subsystem, /dev/random, crash handling, /proc/mtab, binary format registration and execution and so on. Extras outside the main Hurd repo also exist, packaged as hurd-recommended under Debian GNU/Hurd.

The Hurd also ships with some standard userland tools like an RPC tracer, login, su, mount/umount, vmstat, ps, getty, swapon/swapoff and so forth.

What is Mach?

Mach is a first-generation microkernel that originated as a research project at Carnegie Mellon University (CMU), but later sprouted at other places including the Open Software Foundation (OSF – its variant of Mach later becoming part of the XNU kernel used in Darwin/OS X) and the University of Utah (its variant called Mach4 being largely backwards compatible patches to the CMU Mach 3.0 codebase).

The flavor of Mach that GNU Hurd targets is unsurprisingly named GNU Mach. It was based on the CMU Mach 3.0 code, later integrated the Utah Mach4 extensions and is now an established variant on its own.

Key differences between GNU/CMU Mach as used in the Hurd and OSF Mach as used in OS X are that OSF Mach also implements semaphores, lock sets, a resource ledger and extensions to the Mach clock. GNU Mach does not, but on the other hand has a device interface and has since gained extensions specific to itself, like notification on IPC ports and some round-trip optimizations of message transport.

The chief concepts of Mach are ports (IPC channels), messages, tasks, threads, virtual memory and external memory management.

A task is a unit of resources running its own virtual address space with its own port name space (which is a collection of port names – integer descriptors, each of which is associated with some capability [send, send-once, receive…] called a port right).

Tasks by themselves do not do work unless they are backed by threads, which is the actual unit of CPU time. A thread may belong to only one task.

The virtual memory interface is implemented as data structures called memory objects which supply VM regions in a virtual address space. Memory objects are controlled by memory managers, called pagers (as I alluded to earlier by mentioning user-level page fault handling). This means the Mach kernel leaves VMM policy up to userspace, though it does supply a default pager from which anonymous memory is paged out whenever other pagers exhaust or time out freeing their memory cache – the actual data structure that maps to physical memory when a thread accesses a page in the controlling task’s virtual address space.

Anonymous memory and paged memory are usually distinct. The former has standard interfaces that more-or-less are equivalent to standard POSIX semantics (wiring => locking, alloc/dealloc, etc.) However, where Mach and Unix significantly differ is that tasks can access each other’s address spaces with protection boundaries still being enforced.

Ports and RPC

Mach ports are the nucleus of the Mach kernel (Yes.) Ports are the unit of communication. They are unidirectional asynchronous channels, each holding a single fixed-length message queue. They are unnamed and with a single receiver but multiple senders. They are accessible only via capabilities called port rights, which are represented as 32-bit positive integers and usually sent as part of the message body. Just about all Mach resources have ports implicitly associated to them, sans anonymous virtual memory.

A message is a typed collection of data objects. Messages are sent and received through the mach_msg() system call (of which Mach has only about 7 to 11, depending on variant; the rest are library calls exposed by libc, in libmachuser in the Hurd’s case). Messages may be simple or non-simple, i.e. containing only inline data or containing OOL (out-of-line) data. Inline data is directly copied by the receiver from the message structure, whereas OOL data is paged with kernel assistance (OOL data is usually used for sending regions of virtual memory or variable payloads).

Further are port sets which group ports with receive rights under a single unit for multiplexed I/O. They can’t be sent in messages, but must be recreated by a receiving task. Members of a port set indicate themselves whenever a receive operation is performed on the set, this being done in random order or FIFO if only one port in the set has a queued message.

There are also so-called special ports, which are implicitly created as part of a thread’s state. These include the bootstrap port for accessing system services like Mach devices and the exception port where the kernel sends software-based interrupts, not unlike Unix signals.

Ports have a reference count which is incremented, decremented or left in stasis depending on port operation – whether a send right is received or deallocated. When refcnt hits 0, the port name is freed. Ports die when their receive right is deallocated, leading to send and send-once rights becoming dead names, triggering a message queue sweep. Furthermore, receive rights are tracked with a make-send count, which holds the number of times a send right has been generated from a receive right, reset to 0 upon creation of a new port or when a receive right is transferred.

The semantics of Mach IPC are complex, but in practice they are abstracted either behind RPC or in the Hurd specifically, through libports for lower level port operations beyond send/receive.

RPC is made through a port held by a Hurd translator. All POSIX calls are internally implemented as RPC in glibc, so their duties are shared by glibc, Mach and the Hurd. The RPC interfaces are defined in .defs files, written in a simple header-like configuration language called Matchmaker and compiled by the GNU MIG (Mach Interface Generator). MIG reads the RPC definition and builds a functional C source/header combination that packs the proper message arguments for the data and the port. This is done at build-time.

For instance, open(2)-ing a file on the rootfs actually translates to calling a dir_lookup RPC from the compiled MIG definition. The message loop of the rootfs driver (e.g. ext2fs) keeps a demultiplexer function which dispatches the proper interface action by matching the RPC ID. Since the rootfs is backed by a physical store, it implements a libdiskfs stub, e.g. diskfs_S_dir_lookup, that checks file consistency, creates a port to store the file handle and structure and crafts a reply buffer to send to the user program.

The benefits of this approach are the separation between interface and implementation, generic interfaces for standard operations that each translator can hook in, and location transparency with dynamic binding (remember, it’s just a message send from the caller’s perspective, but the receiver is an object that may perform complex processing irrespective of how it actually chooses to represent its data or where it even resides). Downsides are round-trip calls, potentially three-way, though in practice the Hurd has some optimizations.

For more details on RPC, see the Hurd wiki article on it.

The unfortunate reputation of Mach (microkernel FUD)

Mach clearly had good ideas, but it took a while for it to take shape and shed its initially heavyweight kernel interfaces. The semantics of IPC are complicated, if general. A variety of reasons, including elaborate port right and type checks on messages, led to disappointing performance results that were later addressed by slimmer microkernel designs like L4, or QNX (though the latter is strictly synchronous and heavily coupled to CPU scheduling).

That said, Mach and microkernels in general, despite the latter actually dominating in real-time, mission-critical and certain embedded (like baseband processors) industries, as well as hypervisors like Xen, have never lived down their reputation as slow or inefficient. The infamous Torvalds-Tanenbaum debate did not help matters. With the popularity of Linux and Linus Torvalds having his various opinions elevated to deity-esque wisdom, there is probably a non-negligible contingent of people whose knowledge of microkernels boils down solely to echoing Linus' dislike of them without even really knowing why. Which is, well:

[Image: Jake Blues taking it away]

No matter the communication overhead, the fault tolerance and reliability gains of a microkernel are undeniable. See the MINIX 3 reliability overview for examples.

The Linux microkernel?

It’s also worth noting that several more recent developments in Linux like kdbus (which was called “neutered Mach IPC” by Neal Walfield), the “tinification” effort [http://tiny.wiki.kernel.org], FUSE, NSUSE, the MADV_USERFAULT flag in madvise(2), kmscon and other userland VT subsystems, have demonstrated that there is in fact user demand for microkernel-like features in Linux, whether people realize this directly or not. It is not unthinkable to conceive of Linux (particularly with systemd) growing into a hybrid kernel approach with certain low-level subsystems adequately usable from a user context.

Cross-compiling the Hurd, Part I: Prerequisites

Well, now that our (unexpectedly long) intro is out of the damned way, let’s dive right in!

The architecture we will be targeting is i686-pc-gnu. We will be using Thomas Schwinge’s cross-gnu and cross-gnu-env scripts, but with some modifications to handle more recent developments in the Hurd’s toolchain.

The cross-gnu process is (outdatedly) documented at the Hurd wiki. I will be elaborating on how we will be deviating from its exact steps.

You can either clone the scripts from the Hurd incubator Git repository (which is where cutting-edge developments are stored outside main repos), or just download cross-gnu and cross-gnu-env directly as raw files. They’re just shell scripts.

Component overview

A cross-toolchain for the Hurd consists of six components: binutils, gcc, gnumach, mig, hurd and glibc.

binutils handle assembly, linking, object file analysis and management, library archiving and so on.

gcc is self-explanatory.

gnumach is the GNU Mach kernel with the relevant Mach headers needed for glibc and hurd.

mig is the Mach Interface Generator, used for generating RPC interfaces from .defs files in glibc and hurd.

hurd is all the servers, daemons and libraries that power the system services themselves, along with RPC definitions for all those. We will only be importing the headers in our cross-build root for glibc to compile. We will not be building the Hurd binaries themselves within cross-gnu, as that is beyond its scope. This will be done manually in the end.

glibc is the GNU C Library, implementing the C standard library, POSIX, low-level name resolution and in our specific case the libhurduser (the low-level RPCs to servers – basically hurd/.defs, including abstractions over ioctl()s, signal handling, file descriptors, path/file lookups, etc. – along with the few Hurd-specific glibc APIs) and the libmachuser (the user interface for the Mach kernel API - the low-level traps, thread and message code which are the base of the MIG-generated interfaces like mach_port_allocate() and vm_wire() – basically mach/.defs).

Wait, what about libpthread?

You might notice the cross-gnu wiki article mentions libpthread, which I’m omitting.

The Hurd uses its own implementation of POSIX Threads (libpthread) which for a while used to be maintained and packaged as a separate library.

More recently, however, despite still being treated as such for development purposes, it can be used directly as a glibc add-on. This is, in fact, how Debian GNU/Hurd does it (libpthread is inside glibc), so it’s best for us to follow its lead.

Setting up the environment with the proper packages

The cross-gnu page mentions exact versions, but many of these are now no longer relevant.

For instance, it recommends gcc-4.5 with a configure file patch. We will instead be using an unmodified gcc-4.8.0 from GNU’s FTP servers. Older versions of gcc appear to fail on some of the Mach spinlock code in more recent versions of glibc.

We will be using binutils-2.25. cross-gnu recommends binutils-2.20, but does hint 2.22 or later should be fine. It evidently is.

I ended up fetching GNU Mach, GNU Hurd and GNU MIG from tarballs as opposed to cloning from Git. These are, respectively: 1.5, 0.6, 1.5.

The trickiest part by far (as you shall see) is glibc. After much mucking around, I pulled a glibc_source-2.19-.deb package from Richard Braun’s Debian GNU/Hurd ports mirror. This is a build that comes with libpthread integrated as an add-on and various patches applied by the Debian GNU/Hurd maintainers as they keep tracking changes in Hurd development, so it’s the path of least resistance. A .deb is just an ar archive wrapping a couple of tarballs, so extract the data.tar.xz inside.

You can get the direct links as follows, though I encourage experimentation on your part: hurd, gnumach, mig, gcc, binutils, glibc.

After placing cross-gnu, cross-gnu-env and the six GNU toolchain packages somewhere, create a new top-level directory, and inside it another directory called src. In src, symlink each package under its plain, unversioned name. The packages are then built and installed into the new top-level directory, which is effectively your cross-build root.

Like so:

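mkdir -p hurd-cross-build/src && cd hurd-cross-build/src
# Link names take no trailing slash. The ../ targets assume the unpacked
# trees sit directly inside hurd-cross-build/; adjust them otherwise.
ln -s ../binutils-2.25 binutils
ln -s ../gcc-4.8.0 gcc
ln -s ../glibc-2.19 glibc
ln -s ../gnumach-1.5 gnumach
ln -s ../hurd-0.6 hurd
ln -s ../mig-1.5 mig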
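If you would rather not type the links by hand, the same layout can be produced by a small helper. This is only a sketch: link_sources is a made-up name, it expects an absolute path for the directory holding the unpacked trees, and it assumes every tree is named NAME-VERSION as in the list above.

```shell
# link_sources TOP SRC NAME-VERSION...
# For each unpacked tree TOP/NAME-VERSION, create the symlink SRC/NAME.
# TOP must be an absolute path, otherwise the links will dangle.
link_sources() {
    top=$1
    src=$2
    shift 2
    mkdir -p "$src" || return 1
    for tree in "$@"; do
        # Strip the "-<version>" suffix: gcc-4.8.0 -> gcc
        name=${tree%%-[0-9]*}
        if [ ! -d "$top/$tree" ]; then
            echo "missing source tree: $top/$tree" >&2
            return 1
        fi
        # -n replaces an existing link instead of descending into it.
        ln -sfn "$top/$tree" "$src/$name" || return 1
    done
}

# Example call (paths are placeholders):
# link_sources "$HOME/hurd" "$HOME/hurd/hurd-cross-build/src" \
#     binutils-2.25 gcc-4.8.0 glibc-2.19 gnumach-1.5 hurd-0.6 mig-1.5
```

Failing on a missing tree, rather than creating a dangling link, saves you from discovering the typo halfway through a build pass.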

Though not documented in the wiki, you can do the same for a seventh package: gdb. I have not tested this.

Cross-compiling the Hurd, Part II: Modifying cross-gnu

The cross-gnu and cross-gnu-env scripts are highly useful in automating the multi-pass nature of the cross-compilation process, but the former makes old assumptions that we must revise. cross-gnu-env requires no modifications; leave it untouched.

Disable C++ in GCC

We do not require it, and it may complicate the process.

gcc is compiled in two passes. The first pass builds it without threading, shared library support and NLS (Native Language Support), sufficient for a cross-MIG and a first-pass cross-glibc. The second is a full compiler suite.

In both passes, make sure --enable-build-with-cxx is removed and --enable-languages only has a value of c.
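For those who prefer scripting the edit, here is a rough sketch using sed. restrict_gcc_languages is a hypothetical helper, and it assumes each flag sits on its own backslash-continued line in cross-gnu, which may not match your copy of the script. Diff the result before trusting it: if the deleted flag was the last line of a configure invocation, the line before it is left with a dangling backslash.

```shell
# restrict_gcc_languages SCRIPT
# Rewrites SCRIPT in place: lines carrying --enable-build-with-cxx are
# dropped and every --enable-languages=... value is reduced to "c".
restrict_gcc_languages() {
    script=$1
    tmp=$script.tmp.$$
    sed -e '/--enable-build-with-cxx/d' \
        -e 's/--enable-languages=[^[:space:]\\]*/--enable-languages=c/' \
        "$script" > "$tmp" || return 1
    # Overwrite rather than rename, so the script keeps its executable bit.
    cat "$tmp" > "$script" || return 1
    rm -f "$tmp"
}

# Example call (path is a placeholder):
# restrict_gcc_languages ./cross-gnu
```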

Disable Texinfo for GCC

Pass the MAKEINFO=missing flag on a line of its own, joined to the rest of the configure invocation with a backslash line continuation, in both passes. GCC is quite strict about Texinfo and it has been known to break. We do not strictly need docs here, anyway.

Disable nscd for glibc

GNU Hurd seemingly cannot build nscd (name service cache daemon), and Debian GNU/Hurd is not known to ship with it. As of glibc-2.17, there is a --disable-nscd flag you can pass, so add it to both build passes.

Remove the libpthread pass

It is redundant, since libpthread is a glibc add-on. It will also not build right away due to a circular dependency on a Hurd library, which I will mention later.

I am referring to this code block:

mkdir -p "$LIBPTHREAD_OBJ" &&
if ./config.status --version > /dev/null 2>&1; then :; else
  # `$TARGET-gcc' doesn't work yet (to satisfy the Autoconf checks), but isn't
  # needed either.
  CC=gcc \
  "$LIBPTHREAD_SRC"/configure \
    --host="$TARGET" \
    --prefix="$CROSS_GNU_USR" \
fi &&
"$MAKE" \
  install-data-local-headers &&
# Below, we will reconfigure for allowing to build libpthread.
if grep -q '^CC = gcc$' Makefile
then rm config.status
else :
fi &&

Remove the Hurd library pass

Though commented as “GNU Hurd’s core libraries”, this only builds and installs libihash - the Hurd’s generic hash table library.

libpthread uses libihash for maintaining thread-local storage.

From my experience, this pass fails due to a lack of glibc by this point, regardless of whether a relative or absolute compiler name is used. We will later get around this in a rather crafty (read: unremittingly horrifying) way.

The issue is that the Hurd roadblocks us by invoking the undocumented AC_NO_EXECUTABLES autoconf macro when it detects a cross-compilation, which turns off the standard autoconf link tests under the assumption that our linker is not yet bootstrapped in the absence of libc.

Autotools are a landmine as a matter of principle.

The code block is this:

# Install the GNU Hurd's core libraries.

cd "$HURD_OBJ"/ &&
if ./config.status --version > /dev/null 2>&1; then :; else
  "$HURD_SRC"/configure \
    --host="$TARGET" \
    --prefix="$CROSS_GNU_USR" \
    --disable-profile \
fi &&
"$MAKE" \
  libihash &&
"$MAKE" \
  prefix="$SYS_ROOT""$CROSS_GNU_USR" \
  libihash-install &&

This should be it. On to the first-pass!

Cross-compiling the GNU Hurd, Part III: First pass and the great glibc swindle

This is where you put the round in the chamber and fire.

Total disk space taken by the end of the whole process should be ~2GB.

Head into your cross-build directory (e.g. hurd-cross-build) and designate it as ROOT, then invoke cross-gnu-env:

. cross-gnu-env # or wherever it's located

cross-gnu-env will set various build-time variables like $SYS_ROOT, $TARGET, $PROGNAME_OBJ and $PROGNAME_SRC so that the build steps can run with minimal manual configuration on your part (or none for most of the process, as we’ll simply be running cross-gnu and trying to resolve SNAFUs).
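Before launching a long build it is worth confirming that the environment really was populated. The helper below is a minimal sketch, not part of cross-gnu: require_vars is a made-up name, and setting ROOT by plain assignment before sourcing the script is my assumption about what cross-gnu-env expects.

```shell
# require_vars NAME...
# Succeeds only if every named shell variable is set and non-empty;
# reports each missing one on stderr.
require_vars() {
    rc=0
    for name in "$@"; do
        eval "value=\${$name-}"
        if [ -z "$value" ]; then
            echo "\$$name is not set" >&2
            rc=1
        fi
    done
    return $rc
}

# Typical use from the cross-build directory (script path is a placeholder):
#   ROOT=$PWD
#   . ./cross-gnu-env
#   require_vars ROOT SYS_ROOT TARGET CROSS_GNU_USR
```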



Now fire!

cross-gnu # assuming it's in $PATH, otherwise use absolute directory name

You will be rerunning this a lot after fixing failures. It’s safe.


binutils

The easiest part. It passes without issue.

gcc, first-pass

This should also pass smoothly. It may also complain of a missing $SYS_ROOT/include, in which case just touch it.

GNU Mach headers

Needed for glibc later on. This is a simple operation. We never actually build a gzipped Mach kernel, though if you want to, see here.


MIG

For generating the RPC definitions. Again, no qualms.

GNU Hurd headers

Are really just RPC defs that are compiled by MIG, hence the previous step. They will go in $SYS_ROOT/usr/include, along with Mach ones and all others.

glibc, first-pass


“Strange, I never saw your name on my paycheck. Since if that’s not the case you cannot order me around.” – Ulrich Drepper

glibc breaks spectacularly and we will be doing plenty of ugly dissection to get it up and building. This is all in a libpthread-integrated Debian build with patches, so you can only imagine what the actual maintainers must fight against.

Missing RPC definitions

We observe that hurd/Makefile references exec_experimental and fs_experimental in the user-interfaces directive. Except, they’re not actually included with the tarball. Or really in any public release I could find. The former is a more recent API used by the exec server to run #!-scripts, and the latter has a comment explaining its intended use.

I thus had to copy them from mailing list patches and create them, minus licensing info due to laziness.
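A quick way to see which interface definitions are absent before glibc trips over them is to check for the files directly. This is a sketch under assumptions: missing_defs is a made-up helper, and the directory you pass (somewhere under your $SYS_ROOT where the Hurd headers were installed) depends on your setup.

```shell
# missing_defs DIR NAME...
# Prints each NAME for which DIR/NAME.defs does not exist.
missing_defs() {
    dir=$1
    shift
    for name in "$@"; do
        [ -f "$dir/$name.defs" ] || echo "$name"
    done
}

# Example call (the directory is a placeholder for wherever the Hurd
# headers landed in your sysroot):
# missing_defs "$SYS_ROOT/usr/include/hurd" exec exec_experimental fs fs_experimental
```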

Remember, this software comes with no warranty. For details, please read ‘WARRANTY’.


exec_experimental.defs:

subsystem exec_experimental 434242;




routine exec_exec_file_name (
      execserver: file_t;
      file: mach_port_send_t;
      oldtask: task_t;
      flags: int;
      filename: string_t;
      argv: data_t SCP;
      envp: data_t SCP;
      dtable: portarray_t SCP;
      portarray: portarray_t SCP;
      intarray: intarray_t SCP;
      deallocnames: mach_port_name_array_t;
      destroynames: mach_port_name_array_t);


fs_experimental.defs:

subsystem fs_experimental 444242;



/* Operations supported on all files */


/* Overlay a task with a file.  Necessary initialization, including
   authentication changes associated with set[ug]id execution must be
   handled by the filesystem.  Filesystems normally implement this by
   using exec_newtask or exec_loadtask as appropriate.  */
routine file_exec_file_name (
   exec_file: file_t;
   exec_task: task_t;
   flags: int;
   filename: string_t;
   argv: data_t SCP;
   envp: data_t SCP;
   fdarray: portarray_t SCP;
   portarray: portarray_t SCP;
   intarray: intarray_t SCP;
   deallocnames: mach_port_name_array_t SCP;
   destroynames: mach_port_name_array_t SCP);
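The number after each subsystem name is what a server's demultiplexer ends up matching on. By MIG convention, routines are numbered consecutively from the subsystem base, and a reply carries the request ID plus 100. The arithmetic below is a worked sketch that assumes each routine shown is the first (index 0) in its subsystem, as it appears here; the function names are made up.

```shell
# mig_request_id BASE INDEX
# Message ID of the INDEX-th routine (0-based) of a subsystem.
mig_request_id() { echo $(( $1 + $2 )); }

# mig_reply_id BASE INDEX
# ID of the reply message matching that routine.
mig_reply_id() { echo $(( $1 + $2 + 100 )); }

mig_request_id 434242 0   # exec_exec_file_name request: 434242
mig_reply_id   434242 0   # its reply:                   434342
mig_request_id 444242 0   # file_exec_file_name request: 444242
```

These are the IDs you would expect to see next to the routine names when tracing a call with the Hurd's RPC tracer.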

Disable Texinfo from Makefile targets

glibc is even more anal in its insistence on installing docs, which we can’t actually build in a cross-compilation, so the install target fails. As there is no switch to turn this off, we need to edit manual/Makefile and reduce the install, install-data, subdir_install and catchall rules to this:

install-data subdir_install:
ifneq ($(strip $(MAKEINFO)),:)
endif
# Catchall implicit rule for other installation targets from the parent.
install-%: ;

pthread surgery, Mk I

libpthread has a private, libc-internal mutex lock header libc-lockP.h, as well as pthread-functions.h for condition variables and thread attributes.

For some reason, glibc will also be expecting them in sysdeps/mach/hurd/bits, so you should copy them over from libpthread/sysdeps/pthread/bits and libpthread/sysdeps/pthread/, respectively.
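Spelled out as commands, the copy looks roughly like this. copy_lock_headers is a hypothetical wrapper; it takes the root of the glibc tree as its argument and uses only the paths named above.

```shell
# copy_lock_headers GLIBC_SRC
# Copies libc-lockP.h and pthread-functions.h from the libpthread add-on
# into sysdeps/mach/hurd/bits inside the same glibc tree.
copy_lock_headers() {
    glibc=$1
    dest=$glibc/sysdeps/mach/hurd/bits
    mkdir -p "$dest" &&
    cp "$glibc/libpthread/sysdeps/pthread/bits/libc-lockP.h" "$dest/" &&
    cp "$glibc/libpthread/sysdeps/pthread/pthread-functions.h" "$dest/"
}

# Example call (path is a placeholder):
# copy_lock_headers "$ROOT/src/glibc"
```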

pthread surgery, Mk II: Mangling our libhurduser

Again, we come back to the chicken-egg problem of libihash I mentioned earlier. glibc needs libpthread, which needs libihash. libihash is part of Hurd, which needs glibc. We don’t have it, and so link tests are turned off for Hurd, stopping us from configuring a build.

I ultimately settled on a horrible hack, but oddly one that the Hurd developers themselves have considered doing: making libihash a part of libhurduser. By doing this, I could also keep intact the header definitions set in the thread-local storage handling code.

Three things need to be done.

Firstly, take the ihash source file and the ihash header file from hurd-0.6/libihash and paste them into glibc, respectively at hurd/ and hurd/hurd/.
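As a sketch, the copy amounts to two cp calls. import_ihash is a made-up name, and the file names ihash.c and ihash.h are my assumption about what the libihash directory contains; check your tree before running it.

```shell
# import_ihash HURD_SRC GLIBC_SRC
# Copies the libihash source into glibc's hurd/ directory and its header
# into hurd/hurd/. File names are assumed, not taken from the article.
import_ihash() {
    hurd=$1
    glibc=$2
    cp "$hurd/libihash/ihash.c" "$glibc/hurd/" &&
    cp "$hurd/libihash/ihash.h" "$glibc/hurd/hurd/"
}

# Example call (paths are placeholders):
# import_ihash "$ROOT/src/hurd" "$ROOT/src/glibc"
```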

Second, edit hurd/Makefile and add definitions for ihash.h in the headers and inline-headers directives, as well as ihash in the routines directive so they’re registered in the build system:

headers = hurd.h $(interface-headers) \
      $(addprefix hurd/,fd.h id.h port.h signal.h sigpreempt.h ioctl.h\
                userlink.h resource.h threadvar.h lookup.h ihash.h)

inline-headers = hurd.h $(addprefix hurd/,fd.h signal.h \
                      userlink.h threadvar.h port.h ihash.h)
[...] # cut off
routines = hurdstartup hurdinit \
       hurdid hurdpid hurdrlimit hurdprio hurdexec hurdselect \
       ihash \
       [...] # cut off

Finally, the most unsavory part: actually register the hurd_ihash functions as versioned symbols for linker visibility in the hurd/Versions file. Yep, it’s “officially” now part of glibc.

   GLIBC_2.13_DEBIAN_19 {
    # functions used by libpthread and 

Stop and take a shower, realizing you will never be able to engage with normies. REEEEEEEEEEEEEEEE

Forcing nscd through, if necessary

If for some reason --disable-nscd doesn’t work (I can’t imagine wh-

This is fun

The behavior is correct and wanted. Now stop wasting people’s time.

Alright, actually I can.

Anyway, such an event can hopefully be mitigated by:

a) Commenting out the thread_info_t overloading in nscd/nscd.c, which conflicts with <mach/thread_info.h>:

 /* Structure used by main() thread to keep track of the number of
   active threads.  Used to limit how many threads it will create
   and under a shutdown condition to wait till all in-progress
   requests have finished before "turning off the lights".  */

typedef struct
{
  int             num_active;
  pthread_cond_t  thread_exit_cv;
  pthread_mutex_t mutex;
} thread_info_t;

thread_info_t thread_info;

b) Redefining the connection database initialization locks in nscd/connections.c from PTHREAD_RWLOCK_WRITER_NONRECURSIVE_INITIALIZER_NP, which the Hurd does not provide, to the closely equivalent __PTHREAD_RWLOCK_INITIALIZER.

In general, since nscd is useless in Hurd, anything you do to mangle your way out is likely permissible.

This should hopefully be it.

Cross-compiling the GNU Hurd, Part IV: Second pass

Damn, I hope we’re glad to get that out of the way.

Second pass is mostly smooth sailing.

gcc will build itself with full shared library and threading support modulo what we edited out from cross-gnu. As I mentioned earlier, it might complain of a missing $SYS_ROOT/lib that you might need to touch, either in the first or second pass.

glibc, too, after being smacked with a trout in the first pass, should be fine. You should now have a complete dynamic loader (ld.so.1).

And you’re done!


We do have a cross-toolchain targeting i686-pc-gnu, and thus the Hurd, but we haven’t built the Hurd itself. Doing so isn’t recommended practice either, since the ABI, kernel and libraries all belong to a different platform. But it is what we’re after for research purposes.

Cross-compiling the GNU Hurd, Part V: The actual fucking Hurd

This process is pretty much bound to be fragile, since we’re pushing a cross-toolchain beyond its usual purpose (cross-compiling packages) right into building the Hurd servers themselves, which belong to a totally alien platform. Nonetheless, I managed to get most of them built: more than enough to observe ELF headers, symbols, library interactions and generated RPC definitions to see how Mach RPC is structured.

This one is manual, but it still reuses the variables set by cross-gnu-env.

cd over to the Hurd directory.


./configure --host="$TARGET" --prefix="$CROSS_GNU_USR" --disable-profile --without-parted

Take it for a spin:


Several things to note.

The Hurd uses a recursive, per-directory Makefile layout, with each Makefile including the root Makeconf. As most servers and libraries use Mach threads, nowadays abstracted behind pthreads, they add the -lpthread linker flag to the HURDLIBS and LDLIBS variables.

However, this actually has the effect of linking to our system-wide /lib/libpthread.so.0, when we really want to link to our cross-compiled, Hurd-specific $SYS_ROOT/lib/libpthread.so.3. To do this, you must comment out or remove all instances of HURDLIBS and LDLIBS that reference -lpthread. Simply grep for it and apply a sed patch.
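As a sketch of the grep-and-sed pass, assuming GNU grep and a sed that accepts -E: the function below finds every Makefile mentioning -lpthread and strips that one token from HURDLIBS and LDLIBS assignments, leaving the rest of the line alone. The name strip_lpthread is hypothetical. I chose to delete the token rather than comment out the whole line so that other libraries on the same line survive. Continuation lines ending in a backslash are not handled.

```shell
#!/bin/sh
# strip_lpthread DIR
# Removes -lpthread from HURDLIBS/LDLIBS assignments in every Makefile
# under DIR. The untouched original is kept next to it as Makefile.orig.
strip_lpthread() {
    root=$1
    # --include is a GNU grep option; it limits the search to files
    # named exactly "Makefile".
    grep -rl --include=Makefile -e '-lpthread' "$root" |
    while IFS= read -r mk; do
        cp "$mk" "$mk.orig"
        sed -E '/^[[:space:]]*(HURDLIBS|LDLIBS)[[:space:]]*[+:?]?=/ s/[[:space:]]*-lpthread//g' \
            "$mk.orig" > "$mk"
    done
}

# Usage, from the top of the Hurd source tree:
#   strip_lpthread .
```

Diffing each Makefile against its .orig copy afterwards is a cheap way to check that nothing else was touched.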

Secondly, the proc server in proc/mgt.c uses a relatively recent RPC added to the Mach kernel called mach_notify_new_task, which may or may not have been compiled by MIG in the Mach header pass. Either head to gnumach/include/mach and invoke i686-pc-gnu-mig on the task_notify.defs file manually and copy it over to $SYS_ROOT/usr/include, or since it’s a low-level server that may not be of major relevance to this, just define the RPC function signature manually on top of proc/mgt.c, like this:

/* No-op stub standing in for the MIG-generated RPC.  */
kern_return_t
mach_notify_new_task (mach_port_t notify, mach_port_t task, mach_port_t parent)
{
  return KERN_SUCCESS;
}

The Hurd mount(8) binary uses libblkid, which we don’t have. Either compile it or simply comment out or remove it from utils/Makefile in the targets, special-targets, mountlibs-LDFLAGS and mountlibs-CPPFLAGS variables.

I was unable to build the ext2fs, storeio, pflocal, hello-mt and fatfs translators. There were issues with pthread read-write locks that I put off debugging, since the rest of the Hurd was quite enough for analysis. I do intend to revisit them later down the road. If you figure it out in the meantime, do contact V.R. at Dark n' Edgy forums.

mach-defpager also failed, but that one is understandable and of little relevance to us. Applications and libraries set their own memory managers/pagers (the Hurd usually basing them on libpager interfaces), with mach-defpager being inherently unportable and intrinsic to Mach. In monolithic kernels, you will be using the page replacement algorithms of your kernel.

I had to comment out the aforementioned programs from hurd/Makefile’s prog-subdirs. hello-mt is just a trivial single-file translator, so that one should be taken out from trans/Makefile.

If you want to install:

gmake DESTDIR="$SYS_ROOT" install

Quickly analyzing the Hurd

Obviously you won’t be able to run the Hurd because you don’t actually have anything remotely resembling a Hurd-compatible runtime on your GNU/Linux box. ldd isn’t of much use, either, but other tools (particularly from binutils) are quite handy, besides looking at the outputs of our $SYS_ROOT.

Displaying the ELF file and program headers of the hello translator:

foo@bar:~/gnuhurd/hurd-cross-build/src$ readelf -h -l hurd/trans/hello
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x8048dfe
  Start of program headers:          52 (bytes into file)
  Start of section headers:          32368 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         8
  Size of section headers:           40 (bytes)
  Number of section headers:         36
  Section header string table index: 33

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR           0x000034 0x08048034 0x08048034 0x00100 0x00100 R E 0x4
  INTERP         0x000134 0x08048134 0x08048134 0x0000b 0x0000b R   0x1
      [Requesting program interpreter: /lib/ld.so]
  LOAD           0x000000 0x08048000 0x08048000 0x01754 0x01754 R E 0x1000
  LOAD           0x001754 0x0804a754 0x0804a754 0x001c0 0x00290 RW  0x1000
  DYNAMIC        0x001768 0x0804a768 0x0804a768 0x00100 0x00100 RW  0x4
  NOTE           0x000140 0x08048140 0x08048140 0x00020 0x00020 R   0x4
  GNU_EH_FRAME   0x0014e0 0x080494e0 0x080494e0 0x0006c 0x0006c R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x10

 Section to Segment mapping:
  Segment Sections...
   01     .interp 
   02     .interp .note.ABI-tag .hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame 
   03     .ctors .dtors .jcr .dynamic .got .got.plt .data .bss 
   04     .dynamic 
   05     .note.ABI-tag 
   06     .eh_frame_hdr 

Dumping the symbol table for the isofs/main object file:

foo@bar:~/gnuhurd/hurd-cross-build/src$ objdump -t hurd/isofs/main.o

hurd/isofs/main.o:     file format elf32-i386

00000000 l    df *ABS*  00000000 main.c
00000000 l    d  .text  00000000 .text
00000000 l    d  .data  00000000 .data
00000000 l    d  .bss   00000000 .bss
00000000 l    d  .rodata.str1.1 00000000 .rodata.str1.1
00000000 l    d  .rodata.str1.4 00000000 .rodata.str1.4
00000000 l     F .text  0000016d read_sblock
00000000 l    d  .text.startup  00000000 .text.startup
00000000 l     O .rodata    0000000b __PRETTY_FUNCTION__.11124
00000000 l    d  .rodata    00000000 .rodata
00000000 l    d  .debug_info    00000000 .debug_info
00000000 l    d  .debug_abbrev  00000000 .debug_abbrev
00000000 l    d  .debug_loc 00000000 .debug_loc
00000000 l    d  .debug_aranges 00000000 .debug_aranges
00000000 l    d  .debug_ranges  00000000 .debug_ranges
00000000 l    d  .debug_line    00000000 .debug_line
00000000 l    d  .debug_str 00000000 .debug_str
00000000 l    d  .note.GNU-stack    00000000 .note.GNU-stack
00000000 l    d  .eh_frame  00000000 .eh_frame
00000000 l    d  .comment   00000000 .comment
00000000         *UND*  00000000 _GLOBAL_OFFSET_TABLE_
00000000         *UND*  00000000 diskfs_exception_diu
00000000         *UND*  00000000 _setjmp
00000004       O *COM*  00000004 disk_image
00000000         *UND*  00000000 malloc
00000004       O *COM*  00000004 sblock
00000004       O *COM*  00000004 logical_block_size
00000000         *UND*  00000000 memcmp
00000000         *UND*  00000000 error
00000000         *UND*  00000000 __errno_location
00000170 g     F .text  00000038 diskfs_append_args
00000000         *UND*  00000000 diskfs_append_std_options
00000008 g     O .bss   00000004 store_parsed
00000000         *UND*  00000000 store_parsed_append_args
00000000 g     F .text.startup  000000f0 main
00000000         *UND*  00000000 diskfs_readonly
00000000         *UND*  00000000 diskfs_hard_readonly
00000000         *UND*  00000000 diskfs_init_main
0000000c g     O .bss   00000004 store
00000000         *UND*  00000000 create_disk_pager
00000000         *UND*  00000000 rrip_initialize
00000000         *UND*  00000000 rrip_lookup
00000004       O *COM*  00000004 diskfs_root_node
00000000         *UND*  00000000 load_inode
00000000         *UND*  00000000 pthread_mutex_unlock
00000000         *UND*  00000000 diskfs_startup_diskfs
00000000         *UND*  00000000 pthread_exit
00000000         *UND*  00000000 __assert_perror_fail
000001b0 g     F .text  00000003 diskfs_reload_global_state
000001c0 g     F .text  00000003 diskfs_set_hypermetadata
000001d0 g     F .text  00000008 diskfs_readonly_changed
00000000         *UND*  00000000 abort
00000000 g     O .data  00000004 diskfs_maxsymlinks
00000004 g     O .data  00000004 diskfs_name_max
00000008 g     O .data  00000004 diskfs_link_max
00000000 g     O .bss   00000004 diskfs_synchronous
0000000c g     O .data  00000004 diskfs_extra_version
00000010 g     O .data  00000004 diskfs_server_version
00000014 g     O .data  00000004 diskfs_server_name
00000004 g     O .bss   00000004 diskfs_disk_name
00000004       O *COM*  00000004 mounted_on
00000004       O *COM*  00000004 host_name
00000004       O *COM*  00000004 diskfs_read_symlink_hook
00000004       O *COM*  00000004 diskfs_create_symlink_hook
00000004       O *COM*  00000004 diskfs_shortcut_ifsock
00000004       O *COM*  00000004 diskfs_shortcut_fifo
00000004       O *COM*  00000004 diskfs_shortcut_blkdev
00000004       O *COM*  00000004 diskfs_shortcut_chrdev
00000004       O *COM*  00000004 diskfs_shortcut_symlink

Displaying notes, unwind info, relocations and dynamic section for ftpfs:

foo@bar:~/gnuhurd/hurd-cross-build/src$ readelf -nrud hurd/ftpfs/ftpfs

Dynamic section at offset 0x690c contains 30 entries:
  Tag        Type                         Name/Value
 0x00000001 (NEEDED)                     Shared library: [libhurdbugaddr.so.0.3]
 0x00000001 (NEEDED)                     Shared library: [libnetfs.so.0.3]
 0x00000001 (NEEDED)                     Shared library: [libfshelp.so.0.3]
 0x00000001 (NEEDED)                     Shared library: [libiohelp.so.0.3]
 0x00000001 (NEEDED)                     Shared library: [libports.so.0.3]
 0x00000001 (NEEDED)                     Shared library: [libihash.so.0.3]
 0x00000001 (NEEDED)                     Shared library: [libftpconn.so.0.3]
 0x00000001 (NEEDED)                     Shared library: [libshouldbeinlibc.so.0.3]
 0x00000001 (NEEDED)                     Shared library: [libc.so.0.3]
 0x00000001 (NEEDED)                     Shared library: [libmachuser.so.1]
 0x00000001 (NEEDED)                     Shared library: [libhurduser.so.0.3]
 0x0000000c (INIT)                       0x804989c
 0x0000000d (FINI)                       0x804d45c
 0x00000004 (HASH)                       0x8048160
 0x00000005 (STRTAB)                     0x8048cd8
 0x00000006 (SYMTAB)                     0x80484e8
 0x0000000a (STRSZ)                      2084 (bytes)
 0x0000000b (SYMENT)                     16 (bytes)
 0x00000015 (DEBUG)                      0x0
 0x00000003 (PLTGOT)                     0x804fa28
 0x00000002 (PLTRELSZ)                   576 (bytes)
 0x00000014 (PLTREL)                     REL
 0x00000017 (JMPREL)                     0x804965c
 0x00000011 (REL)                        0x804961c
 0x00000012 (RELSZ)                      64 (bytes)
 0x00000013 (RELENT)                     8 (bytes)
 0x6ffffffe (VERNEED)                    0x80495fc
 0x6fffffff (VERNEEDNUM)                 1
 0x6ffffff0 (VERSYM)                     0x80494fc
 0x00000000 (NULL)                       0x0

Relocation section '.rel.dyn' at offset 0x161c contains 8 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
0804fa24  00003506 R_386_GLOB_DAT    00000000   __gmon_start__
0804fc00  00005305 R_386_COPY        0804fc00   __vm_page_size
0804fc04  00001e05 R_386_COPY        0804fc04   netfs_node_refcnt_lock
0804fc08  00002005 R_386_COPY        0804fc08   __mach_task_self_
0804fc0c  00002705 R_386_COPY        0804fc0c   netfs_std_runtime_argp
0804fc28  00003005 R_386_COPY        0804fc28   netfs_root_node
0804fc2c  00003805 R_386_COPY        0804fc2c   stderr
0804fc30  00005905 R_386_COPY        0804fc30   netfs_std_startup_argp

Relocation section '.rel.plt' at offset 0x165c contains 72 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
0804fa34  00000307 R_386_JUMP_SLOT   00000000   netfs_startup
0804fa38  00000407 R_386_JUMP_SLOT   00000000   pthread_mutex_lock
0804fa3c  00000607 R_386_JUMP_SLOT   00000000   mmap64
0804fa40  00000707 R_386_JUMP_SLOT   00000000   netfs_init
0804fa44  00000807 R_386_JUMP_SLOT   00000000   ftp_conn_start_retriev
0804fa48  00000a07 R_386_JUMP_SLOT   00000000   _pthread_spin_lock
0804fa4c  00000b07 R_386_JUMP_SLOT   00000000   fflush
0804fa50  00000c07 R_386_JUMP_SLOT   00000000   malloc
0804fa54  00000f07 R_386_JUMP_SLOT   00000000   netfs_nput
0804fa58  00001007 R_386_JUMP_SLOT   00000000   fclose
0804fa5c  00001107 R_386_JUMP_SLOT   00000000   pthread_hurd_cond_wait
0804fa60  00001207 R_386_JUMP_SLOT   00000000   asprintf
0804fa64  00001407 R_386_JUMP_SLOT   00000000   hurd_ihash_destroy
0804fa68  00001507 R_386_JUMP_SLOT   00000000   ftp_conn_set_type
0804fa6c  00001607 R_386_JUMP_SLOT   00000000   munmap
0804fa70  00001707 R_386_JUMP_SLOT   00000000   hurd_ihash_init
0804fa74  00001807 R_386_JUMP_SLOT   00000000   bcopy
0804fa78  00001907 R_386_JUMP_SLOT   00000000   netfs_make_node
0804fa7c  00001c07 R_386_JUMP_SLOT   00000000   hstrerror
0804fa80  00001f07 R_386_JUMP_SLOT   00000000   error
0804fa84  00002407 R_386_JUMP_SLOT   00000000   ftp_conn_create
0804fa88  00002507 R_386_JUMP_SLOT   00000000   pthread_mutex_unlock
0804fa8c  00002807 R_386_JUMP_SLOT   00000000   ftp_conn_free
0804fa90  00002907 R_386_JUMP_SLOT   00000000   fshelp_access
0804fa94  00002b07 R_386_JUMP_SLOT   00000000   argp_failure
0804fa98  00003507 R_386_JUMP_SLOT   00000000   __gmon_start__
0804fa9c  00003a07 R_386_JUMP_SLOT   00000000   free
0804faa0  00003c07 R_386_JUMP_SLOT   00000000   __errno_location
0804faa4  00003e07 R_386_JUMP_SLOT   00000000   ftp_conn_get_names
0804faa8  00003f07 R_386_JUMP_SLOT   00000000   fshelp_isowner
0804faac  00004007 R_386_JUMP_SLOT   00000000   strcpy
0804fab0  00004107 R_386_JUMP_SLOT   00000000   pthread_cond_broadcast
0804fab4  00004207 R_386_JUMP_SLOT   00000000   fshelp_touch
0804fab8  00004607 R_386_JUMP_SLOT   00000000   netfs_nrele
0804fabc  00004907 R_386_JUMP_SLOT   00000000   getpid
0804fac0  00004b07 R_386_JUMP_SLOT   00000000   ftp_conn_append_name
0804fac4  00004c07 R_386_JUMP_SLOT   00000000   strchr
0804fac8  00004e07 R_386_JUMP_SLOT   00000000   argp_state_help
0804facc  00005007 R_386_JUMP_SLOT   00000000   hurd_ihash_locp_remove
0804fad0  00005207 R_386_JUMP_SLOT   00000000   netfs_server_loop
0804fad4  00005407 R_386_JUMP_SLOT   00000000   gethostbyname_r
0804fad8  00005507 R_386_JUMP_SLOT   00000000   strlen
0804fadc  00005607 R_386_JUMP_SLOT   00000000   strrchr
0804fae0  00005707 R_386_JUMP_SLOT   00000000   ftp_conn_finish_transf
0804fae4  00005807 R_386_JUMP_SLOT   00000000   stpcpy
0804fae8  00005c07 R_386_JUMP_SLOT   00000000   argp_error
0804faec  00005d07 R_386_JUMP_SLOT   00000000   snprintf
0804faf0  00005e07 R_386_JUMP_SLOT   00000000   maptime_map
0804faf4  00005f07 R_386_JUMP_SLOT   00000000   pthread_mutex_init
0804faf8  00006007 R_386_JUMP_SLOT   00000000   memset
0804fafc  00006307 R_386_JUMP_SLOT   00000000   __assert_fail
0804fb00  00006407 R_386_JUMP_SLOT   00000000   __strdup
0804fb04  00006507 R_386_JUMP_SLOT   00000000   pthread_cond_init
0804fb08  00006707 R_386_JUMP_SLOT   00000000   __libc_start_main
0804fb0c  00006807 R_386_JUMP_SLOT   00000000   strcmp
0804fb10  00006907 R_386_JUMP_SLOT   00000000   netfs_nref
0804fb14  00006a07 R_386_JUMP_SLOT   00000000   vm_allocate
0804fb18  00006b07 R_386_JUMP_SLOT   00000000   close
0804fb1c  00006c07 R_386_JUMP_SLOT   00000000   io_stat
0804fb20  00006e07 R_386_JUMP_SLOT   00000000   argz_add
0804fb24  00006f07 R_386_JUMP_SLOT   00000000   memchr
0804fb28  00007007 R_386_JUMP_SLOT   00000000   argp_parse
0804fb2c  00007207 R_386_JUMP_SLOT   00000000   read
0804fb30  00007407 R_386_JUMP_SLOT   00000000   ftp_conn_get_stats
0804fb34  00007507 R_386_JUMP_SLOT   00000000   calloc
0804fb38  00007607 R_386_JUMP_SLOT   00000000   hurd_ihash_add
0804fb3c  00007807 R_386_JUMP_SLOT   00000000   __strndup
0804fb40  00007907 R_386_JUMP_SLOT   00000000   fopen64
0804fb44  00007b07 R_386_JUMP_SLOT   00000000   task_get_special_port
0804fb48  00007c07 R_386_JUMP_SLOT   00000000   fprintf
0804fb4c  00007d07 R_386_JUMP_SLOT   00000000   strtol
0804fb50  00007e07 R_386_JUMP_SLOT   08049d50   ports_self_interrupted

The decoding of unwind sections for machine type Intel 80386 is not currently supported.

Notes at offset 0x00000140 with length 0x00000020:
  Owner                 Data size   Description
  GNU                  0x00000010   NT_GNU_ABI_TAG (ABI version tag)
    OS: Hurd, ABI: 0.0.0

So on and so forth. A possible next step is to load stubs for Hurd libraries and gauge the results.
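When comparing many binaries, the readelf dumps above get long fast. Here is a small sketch of two filters that boil them down to the bits I cared about most: the requested interpreter and the NEEDED libraries. The function names elf_needed and elf_interp are my own. They only parse readelf's text output on stdin, so they make no assumptions about the binaries beyond what readelf already prints.

```shell
#!/bin/sh
# elf_needed: reads `readelf -d` output on stdin and prints the name
# of each NEEDED shared library, one per line.
elf_needed() {
    sed -n 's/.*(NEEDED).*Shared library: \[\([^]]*\)\].*/\1/p'
}

# elf_interp: reads `readelf -l` output on stdin and prints the
# program interpreter the binary requests.
elf_interp() {
    sed -n 's/.*\[Requesting program interpreter: \([^]]*\)\].*/\1/p'
}

# Usage, with the binaries built above:
#   readelf -d hurd/ftpfs/ftpfs | elf_needed
#   readelf -l hurd/trans/hello | elf_interp
```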

That’s all from me. Here are my final words.

Closing remarks

The Hurd is severely underrated. This wouldn’t be a problem if people didn’t propagate misconceptions and falsehoods about it, though it is evidently an emotionally charged issue for whatever reason.

I recommend people try out Debian GNU/Hurd on QEMU or VirtualBox and browse the wiki to get a taste of the Hurd’s offerings. Educating yourself about other operating systems never hurts.

The cross-toolchain I documented building here was motivated by my own interest in peeking at a live, though non-functional, Hurd system that I can excavate and analyze from a foreign platform (GNU/Linux). I don’t quite know yet how I will continue with this, but I hope this is of help to anyone embarking on a similar endeavor, or at least that you found the article interesting.