あどけない話

Internet technologies

Accepting UDP connections

When we implement UDP servers, a pair of recvfrom() and sendto() is typically used. Received UDP packets are dispatched, if necessary, to each connection by the server itself. We might want to delegate this job to the OS kernel for performance reasons.
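To make "dispatching by the server itself" concrete, here is a minimal Python sketch. It assumes a connection is identified by the peer's (address, port) pair only. The names Dispatcher and serve_once are hypothetical and exist just for this illustration.

```python
import socket
from collections import defaultdict


class Dispatcher:
    """Routes each datagram to a per-peer queue keyed by (address, port)."""

    def __init__(self):
        # One list of pending payloads per peer. A real server would keep
        # per-connection state here instead of a plain list.
        self.connections = defaultdict(list)

    def dispatch(self, payload, peer):
        """Append the payload to the peer's queue and return that queue."""
        queue = self.connections[peer]
        queue.append(payload)
        return queue


def serve_once(sock, dispatcher):
    """Receive one datagram with recvfrom(), dispatch it, reply with sendto()."""
    payload, peer = sock.recvfrom(2048)
    queue = dispatcher.dispatch(payload, peer)
    # Reply with the number of datagrams seen so far from this peer.
    sock.sendto(str(len(queue)).encode(), peer)
    return peer
```

Every datagram goes through one socket and one table lookup in user space. This is the work that connected sockets would hand over to the kernel.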

This article discusses how to create connected sockets from listening sockets on UDP. Unlike TCP, the accept() system call cannot be used for this purpose. Linux behaves differently from BSD variants. macOS (Monterey) and Windows (10) are buggy. We need to find a reasonable method which can work on all of these.

Terminology

  • Wildcard listening socket: {UDP, *:<service-port>, *:*}
  • Interface-specific listening socket: {UDP, <interface-address>:<service-port>, *:* }
  • Connected socket: {UDP, <interface-address>:<service-port>, <peer-address>:<peer-port>}

First approach based on interface-specific listening sockets

I don't remember why I first chose interface-specific listening sockets instead of wildcard listening sockets. But let's get started with this.

This method was first described in Implementation status of QUIC in Haskell. Suppose we have an interface-specific listening socket, say {UDP, 192.0.2.1:443, *:*}, and the peer's address:port is 203.0.113.1:5000.

The following method does not work.

  1. Create a new UDP socket with SO_REUSEADDR ({UDP, *:*, *:*})
  2. Bind it to 192.0.2.1:443 ({UDP, 192.0.2.1:443, *:*})
  3. Connect it to 203.0.113.1:5000 ({UDP, 192.0.2.1:443, 203.0.113.1:5000})

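Unfortunately, BSD variants reject (2) because an identical entry already exists. Linux accepts (2), but a race condition can occur: between (2) and (3), two sockets share the same entry, so packets from other peers may be delivered to the new socket instead of the listening socket. The improved process is as follows: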

  1. Create a new UDP socket with SO_REUSEADDR ({UDP, *:*, *:*})
  2. Bind it to *:443 ({UDP, *:443, *:*})
  3. Connect it to 203.0.113.1:5000. ({UDP, 192.0.2.1:443, 203.0.113.1:5000})

This process succeeds even on BSD variants because there are no duplicate entries at any time, and there are no race conditions on any platform.
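The improved process can be sketched in Python as follows. This is only an illustration under some assumptions: IPv4 only, and the listening socket was also created with SO_REUSEADDR. The function name connect_via_wildcard_bind is hypothetical.

```python
import socket


def connect_via_wildcard_bind(port, peer):
    """Create a connected UDP socket: bind to *:port, then connect to peer."""
    # 1. New UDP socket with SO_REUSEADDR: {UDP, *:*, *:*}
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        # 2. Bind to the wildcard address: {UDP, *:port, *:*}
        sock.bind(("0.0.0.0", port))
        # 3. Connect to the peer; the kernel fills in the local address.
        sock.connect(peer)
    except OSError:
        sock.close()
        raise
    return sock
```

After connect() returns, getsockname() on the new socket shows which local address the kernel selected.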

A bug of the first approach

If a server has multiple interface addresses of the same protocol family, connect() may select the wrong address. Suppose we have another interface-specific address, say 192.0.2.2.

  1. Create a new UDP socket with SO_REUSEADDR ({UDP, *:*, *:*})
  2. Bind it to *:443 ({UDP, *:443, *:*})
  3. Connect it to 203.0.113.1:5000. ({UDP, 192.0.2.2:443, 203.0.113.1:5000})

Since we cannot specify a local address in (2) or (3), the kernel chooses it, and 192.0.2.2 is selected in this example.

Another approach based on wildcard listening sockets

This bug does not exist if we use wildcard listening sockets. Suppose we have ({UDP, *:443, *:*}).

  1. Create a new UDP socket with SO_REUSEADDR ({UDP, *:*, *:*})
  2. Bind it to 192.0.2.1:443 ({UDP, 192.0.2.1:443, *:*})
  3. Connect it to 203.0.113.1:5000 ({UDP, 192.0.2.1:443, 203.0.113.1:5000})

This process succeeds because there are no duplicate entries at any time, and the local address is specified explicitly.

When the first packet of a connection arrives, recvfrom() only tells you <peer-address>:<peer-port>. To learn the interface-specific address on which the packet arrived, recvmsg() should be used. struct in_pktinfo and struct in6_pktinfo contain the interface-specific address for IPv4 and IPv6, respectively.
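As a rough sketch of this step, the following Python code enables the packet-info option on a wildcard listening socket and extracts the destination address from the ancillary data returned by recvmsg(). It makes several assumptions: a Unix-like system where Python provides recvmsg(), the Linux value 8 as a fallback for IP_PKTINFO when the socket module does not export it, and the struct layouts noted in the comments. The helper names are hypothetical.

```python
import socket
import struct

# socket.IP_PKTINFO is missing from some Python builds. 8 is the Linux value
# (an assumption of this sketch; other systems use different numbers).
IP_PKTINFO = getattr(socket, "IP_PKTINFO", 8)


def parse_in_pktinfo(data):
    """Return the destination address from a struct in_pktinfo (IPv4).

    Assumed layout: ipi_ifindex (4 bytes), ipi_spec_dst (4 bytes),
    ipi_addr (4 bytes).
    """
    _ifindex, _spec_dst, addr = struct.unpack("=I4s4s", data[:12])
    return socket.inet_ntop(socket.AF_INET, addr)


def parse_in6_pktinfo(data):
    """Return the destination address from a struct in6_pktinfo (IPv6).

    Assumed layout: ipi6_addr (16 bytes), ipi6_ifindex (4 bytes).
    """
    addr, _ifindex = struct.unpack("=16sI", data[:20])
    return socket.inet_ntop(socket.AF_INET6, addr)


def enable_pktinfo(sock):
    """Ask the kernel to attach packet info to every received datagram."""
    if sock.family == socket.AF_INET:
        sock.setsockopt(socket.IPPROTO_IP, IP_PKTINFO, 1)
    else:
        sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_RECVPKTINFO, 1)


def recv_with_local_address(sock):
    """Receive one datagram; return (payload, peer, local_address or None)."""
    payload, ancdata, _flags, peer = sock.recvmsg(2048, socket.CMSG_SPACE(20))
    local = None
    for level, ctype, data in ancdata:
        if level == socket.IPPROTO_IP and ctype == IP_PKTINFO:
            local = parse_in_pktinfo(data)
        elif level == socket.IPPROTO_IPV6 and ctype == socket.IPV6_PKTINFO:
            local = parse_in6_pktinfo(data)
    return payload, peer, local
```

The returned local address is what step (2) binds to, and the returned peer is what step (3) connects to.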

Integrating two approaches

The two approaches can coexist. Step (2) binds to anyaddr if an interface-specific listening socket is used, and to an interface-specific address if a wildcard listening socket is used. Note that the bug described above remains when an interface-specific listening socket is used.
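One way to express this integration is a single function whose bind address depends on the kind of listening socket. The Python sketch below assumes the caller already knows the local address (for example from recvmsg() on a wildcard socket) and that the listening socket was created with SO_REUSEADDR. The names bind_address_for and accept_udp are hypothetical.

```python
import socket


def bind_address_for(listener_is_wildcard, local_address, family=socket.AF_INET):
    """Pick the address for step (2) depending on the listening socket type."""
    if listener_is_wildcard:
        # Wildcard listener: bind the interface-specific address explicitly.
        return local_address
    # Interface-specific listener: bind anyaddr and let connect() choose.
    return "::" if family == socket.AF_INET6 else "0.0.0.0"


def accept_udp(listener_is_wildcard, local_address, port, peer,
               family=socket.AF_INET):
    """Create a connected UDP socket for the given peer."""
    sock = socket.socket(family, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        address = bind_address_for(listener_is_wildcard, local_address, family)
        sock.bind((address, port))
        sock.connect(peer)
    except OSError:
        sock.close()
        raise
    return sock
```

With a wildcard listener the local address of the result is exactly the one passed in. With an interface-specific listener it is whatever connect() selects.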

recvmsg() on macOS is buggy. If it is used on an IPv4 interface-specific listening socket, it changes the local address to anyaddr. IPv6 is not affected. So recvmsg() should be used only for wildcard sockets, and recvfrom() should be used for interface-specific sockets.

Bug, bug and bug

On OpenBSD, IPV6_V6ONLY is always enabled and cannot be changed. So, for cross-platform code, we should not use IPv4-IPv6 integrated sockets. We should prepare separate IPv6-only sockets and IPv4-only sockets for listening.
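A minimal Python sketch of preparing the two listening sockets might look like this. The function name open_listeners is hypothetical, and silently skipping IPv6 on hosts that cannot provide it is a choice made for this sketch, not something the approach requires.

```python
import socket


def open_listeners(port):
    """Open an IPv4-only and an IPv6-only wildcard UDP socket on one port."""
    listeners = []

    v4 = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    v4.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    v4.bind(("0.0.0.0", port))
    listeners.append(v4)

    try:
        v6 = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    except OSError:
        return listeners  # the host has no IPv6 support
    try:
        # Set IPV6_V6ONLY explicitly so that every platform behaves like
        # OpenBSD, where it is always enabled.
        v6.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
        v6.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        v6.bind(("::", port))
    except OSError:
        v6.close()
        return listeners
    listeners.append(v6)
    return listeners
```

Because the IPv6 socket never receives IPv4 packets, IPv4-mapped IPv6 addresses do not need to be handled.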

Although this configuration is not used here, recvmsg() on Windows is also buggy. Suppose that recvmsg() is used on an IPv4-IPv6 integrated socket and an IPv4 packet is received. struct in6_pktinfo (for the IPv4-mapped IPv6 address) is not returned.

Windows

On Windows, connected sockets can be used for sending. But for receiving, dispatching in the kernel works very poorly because matching is based on the local address and local port only (destination only).

  1. Interface-specific listening sockets and connected sockets have the same priority
  2. They win over wildcard listening sockets
  3. To break a tie in 1), the first-created socket wins.
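The three rules can be written down as a small model in Python. This is not a Windows API. It is only a sketch of the matching behavior described above, with hypothetical names (Sock, windows_receiver). The peer of a connected socket is deliberately absent from the model because it plays no role in the matching.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sock:
    name: str
    local_address: Optional[str]  # None means wildcard
    local_port: int
    created: int  # creation order; smaller means created earlier


def windows_receiver(sockets, dst_address, dst_port):
    """Return the socket that receives a packet under the rules above."""
    specific = [s for s in sockets
                if s.local_port == dst_port and s.local_address == dst_address]
    if specific:
        # Rules 1 and 3: interface-specific listening sockets and connected
        # sockets tie, and the first-created one wins.
        return min(specific, key=lambda s: s.created)
    wildcard = [s for s in sockets
                if s.local_port == dst_port and s.local_address is None]
    # Rule 2: a wildcard socket matches only when nothing specific does.
    # The text does not cover ties among wildcard sockets, so this model
    # simply takes the first one in the list.
    return wildcard[0] if wildcard else None
```

For example, with an interface-specific listening socket created first and a connected socket created later on the same local address and port, the model always returns the listening socket, whatever the source of the packet.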

Suppose we don't give up connected sockets for sending.

Since packet dispatching to connected sockets in the kernel is impossible, a listening socket should catch all received packets and the server should dispatch them by itself. But wildcard listening sockets cannot be used for this purpose because of 2). On Windows, interface-specific listening sockets should be used. They win against connected sockets because they are created at the beginning.