QUIC and Linux capabilities - あどけない話

For security reasons, the typical boot process of HTTPS servers is as follows:

Executed by a root.
Reading a TLS private key and open a listen socket on TCP port 443.
Switching the root user to nobody (or something).

Since accept() can create connected sockets bound to TCP port 443 even with non-root privilege, servers can accept connections.

Let's consider the case of QUIC servers which uses UDP. Chrome does not allow Alt-Svc to go across the privileged boundary (i.e. 1024). For instance, `Alt-Svc: h3=":4433" provided on TCP port 443 does not work. QUIC servers should provide QUIC on UDP port 443.

Some QUIC servers make use of connected UDP socket. As I described in Implementation status of QUIC in Haskell, the following procedure can be used to create a connected UDP socket when a packet is received on a kind of listen socket of 192.0.2.1:443:

Create a new UDP socket with SO_REUSEADDR
Bind it to *:443
Connect it to 203.0.113.0:3456. This also binds the local address to 192.0.2.1.

For 2), the root privilege or the CAP_NET_BIND_SERVICE capability is necessary on Linux. The easiest way to implement secure QUIC servers is to use the setcap command:

% sudo setcap "CAP_DAC_READ_SEARCH,CAP_NET_BIND_SERVICE+epi" quic-server
% sudo -u nobody -g nobody ./quic-server

CAP_DAC_READ_SEARCH is necessary to read a TLS private key. Since the capability is not dropped, this server can read any files. Yes, still insecure.

To keep only CAP_NET_BIND_SERVICE, the following code should be run after reading the private key:

  /* root */
  /* inherits all capabilities */
  prctl(PR_SET_SECUREBITS, SECBIT_KEEP_CAPS, 0L, 0L, 0L);

  setuid(99);
  /* nobody */

  /* drop capabilities except CAP_NET_BIND_SERVICE */
  cap_t caps = cap_from_text("cap_net_bind_service=ipe");
  cap_set_proc(caps);
  cap_free(caps);

This probably works for most QUIC servers. However, this is not the case for Haskell. The Linux capability is per-thread. GHC threaded RTS spawns some native threads then runs Haskell programs. If I understand correctly, there is no way to set SECBIT_KEEP_CAPS for all native threads.

The manual page of capabilities says:

Neither glibc, nor the Linux kernel honors POSIX semantics for setting capabilities and securebits in the presence of pthreads. That is, changing capability sets, by default, only affect the running thread. To be meaningfully secure, however, the capability sets should be mirrored by all threads within a common program because threads are not memory isolated. As a workaround for this, libcap is packaged with a separate POSIX semantics system call library: libpsx. If your program uses POSIX threads, to achieve meaningful POSIX semantics capability manipulation, you should link your program with:

ld ... -lcap -lpsx -lpthread --wrap=pthread_create

or,

gcc ... -lcap -lpsx -lpthread -Wl,-wrap,pthread_create

This workaround cannot apply to Haskell. In my opinion, the securebits capability of Linux should be per-process.