If I understand correctly, most QUIC client and server implementations use unconnected UDP sockets with sendto()/sendmsg() and recvfrom()/recvmsg(). For the server side, this is probably because they adopt event-driven programming. The event loop calls recvfrom()/recvmsg() and dispatches a received packet according to the peer address.
As I explained in "Implementation status of QUIC in Haskell", the quic library in Haskell uses connected sockets on both the client and server sides. This is probably a good thing for the server side: first, because lightweight thread programming is common in Haskell, and second, because dispatching is done in the kernel.
But what about the client side? Now that the RFCs relating to QUIC have been published, I'm trying to fix the API of the quic library for the first official release. What kind of migration API should be provided for clients?
Let's consider this typical scenario:
A QUIC client is using a 5G network.
The client moves to a place where WiFi is available.
The client migrates the connection from the 5G network to the WiFi network.
If the client uses a connected socket, this migration can be implemented as follows:
1. The client needs to detect the event that the WiFi network interface is available.
2. The client creates a new connected socket. When connect() is called, the kernel sets the remote address/port according to the argument. Then it looks up the routing table with the server's IP address. Since the WiFi network interface is resolved, the kernel sets the local address of the socket to the IP address of the network interface. A local port number is chosen randomly.
3. The client starts sending packets through the new socket with send().
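The socket-related steps above (creating a new connected socket and sending through it) can be sketched with the network package. This is only a minimal illustration, not the quic library's code: connectedUDP and migrateDemo are hypothetical names, the peer is a loopback receiver created inside the demo, and IPv4 is assumed. On loopback both sockets get 127.0.0.1 as the local address; on a real device the address would follow the interface chosen by the routing table.

```haskell
import qualified Data.ByteString.Char8 as C8
import Network.Socket
import qualified Network.Socket.ByteString as NSB

loopback :: HostAddress
loopback = tupleToHostAddress (127, 0, 0, 1)

-- Hypothetical helper: create a UDP socket and connect() it to the peer.
-- The kernel picks the local address (via the routing table) and a local port.
connectedUDP :: SockAddr -> IO Socket
connectedUDP peer = do
    s <- socket AF_INET Datagram defaultProtocol
    connect s peer
    return s

-- Returns the local address of the new socket, the source address seen by
-- the receiver, and the received payload.
migrateDemo :: IO (SockAddr, SockAddr, C8.ByteString)
migrateDemo = do
    -- A stand-in for the QUIC server, bound to an ephemeral loopback port.
    server <- socket AF_INET Datagram defaultProtocol
    bind server (SockAddrInet 0 loopback)
    serverAddr <- getSocketName server
    old <- connectedUDP serverAddr  -- socket used before the migration
    new <- connectedUDP serverAddr  -- socket created after the interface event
    newLocal <- getSocketName new
    _ <- NSB.send new (C8.pack "ping")  -- send() needs no destination argument
    (payload, from) <- NSB.recvFrom server 1500
    mapM_ close [old, new, server]
    return (newLocal, from, payload)

main :: IO ()
main = do
    (newLocal, from, payload) <- migrateDemo
    putStrLn $ "new local address: " ++ show newLocal
    putStrLn $ "receiver saw " ++ show payload ++ " from " ++ show from
```

The sketch says nothing about how to detect the interface event, which is exactly the open question.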
It's easy for the quic library to provide the API for item 2. But how should item 1 be implemented? Do major OSes provide such an API for network interfaces? Should we prepare a watch-dog thread for network interface events?
I had been wondering why other implementors do not talk about this issue. I finally realized that I had gone the wrong way. That is, an unconnected socket should be used on the client side. If sendto()/sendmsg() is used with an unconnected socket, connection migration is done automatically:
When sendto()/sendmsg() is called, the kernel sets the destination address/port of the packet according to the argument. Then it looks up the routing table with the server's IP address. Since a proper network interface is resolved, the kernel sets the source address of the packet to the IP address of the network interface. A local port number is chosen randomly at the first sendto()/sendmsg().
Conclusion: connection migration can be done without any specific API.
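As a rough illustration of the unconnected case, the following sketch (again using the network package, with a loopback receiver standing in for the server) sends two datagrams with sendTo on a socket that is neither bound nor connected. The name unconnectedDemo is hypothetical and IPv4 is assumed. The receiver should see the same source port for both datagrams, since the port is fixed at the first sendTo, while the source address is chosen per packet by the kernel.

```haskell
import qualified Data.ByteString.Char8 as C8
import Network.Socket
import qualified Network.Socket.ByteString as NSB

-- Returns the source addresses the receiver observed for two datagrams.
unconnectedDemo :: IO (SockAddr, SockAddr)
unconnectedDemo = do
    -- A stand-in for the QUIC server, bound to an ephemeral loopback port.
    server <- socket AF_INET Datagram defaultProtocol
    bind server (SockAddrInet 0 (tupleToHostAddress (127, 0, 0, 1)))
    serverAddr <- getSocketName server
    -- The client socket: no bind(), no connect().
    client <- socket AF_INET Datagram defaultProtocol
    -- The destination is given per packet; the kernel resolves the route
    -- and fills in the source address each time.
    _ <- NSB.sendTo client (C8.pack "first") serverAddr
    (_, from1) <- NSB.recvFrom server 1500
    _ <- NSB.sendTo client (C8.pack "second") serverAddr
    (_, from2) <- NSB.recvFrom server 1500
    mapM_ close [client, server]
    return (from1, from2)

main :: IO ()
main = do
    (from1, from2) <- unconnectedDemo
    putStrLn $ "first datagram came from  " ++ show from1
    putStrLn $ "second datagram came from " ++ show from2
```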
My open server of Haskell QUIC on Linux sometimes crashed with segfaults. I saw two types of segfaults. One is a simple segfault caused by accessing an invalid address:
mighty: segmentation fault
The other is related to free():
*** Error in `mighty': corrupted double-linked list: 0x00007fcdf0008f90 ***
I guessed that a buffer overrun had occurred in a buffer allocated by malloc() and that this segfault happened when the buffer was freed.
Many Haskellers would be surprised at these kinds of segfaults because it is hard to cause segfaults in normal Haskell programming. However, if you manipulate pointers or use unsafe functions, segfaults are as common as in other programming languages.
The first type of segfault can be caused, for example, with Foreign.Storable.peek, which reads from a raw pointer:
% ghci
> import Foreign.Ptr
> import Foreign.Storable
> :type peek
peek :: Storable a => Ptr a -> IO a
Like other compilers, GHC provides the -g option to add debug information to a compiled program. We can run the program in gdb and get a backtrace if a segfault happens. To compile all dependent libraries with the -g option, I modified my Cabal wrapper, called cab, to provide a command line option (also named -g) for this. I also used the sandbox feature of Cabal-v1:
% sudo gdb --args mighty conf route
(gdb) handle SIGPIPE nostop noprint pass
Signal Stop Print Pass to program Description
SIGPIPE No No Yes Broken pipe
(gdb) handle SIGUSR1 nostop noprint pass
Signal Stop Print Pass to program Description
SIGUSR1 No No Yes User defined signal 1
(gdb) run
As you can see, I needed to change how gdb handles two signals, SIGPIPE and SIGUSR1, so that gdb passes them to the program without stopping.
When I added some test cases of QPACK to h3spec and tested the open server, gdb finally caught a segfault and showed a backtrace. The cause was Data.Array.Base.unsafeAt. I did not check the boundary of an array! (My QPACK code is derived from my HPACK code where this boundary check is not necessary.)
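To make the missing check concrete, here is a minimal sketch of a bounds-checked wrapper around unsafeAt. The name safeAt and the sample table are hypothetical and not taken from the QPACK code. Note that unsafeAt takes a zero-based offset, not an index, and performs no check at all.

```haskell
import Data.Array (Array, bounds, listArray)
import Data.Array.Base (unsafeAt)
import Data.Ix (inRange)

-- Hypothetical wrapper: check the index against the array bounds first,
-- then convert it to the zero-based offset that unsafeAt expects.
safeAt :: Array Int e -> Int -> Maybe e
safeAt arr i
    | inRange (lo, hi) i = Just (arr `unsafeAt` (i - lo))
    | otherwise = Nothing
  where
    (lo, hi) = bounds arr

-- A toy table standing in for a header table.
table :: Array Int Char
table = listArray (0, 2) "abc"

main :: IO ()
main = do
    print (safeAt table 1) -- in range
    print (safeAt table 5) -- out of range: Nothing instead of a wild read
```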
Segfault 2
The segfault relating to free() was really mysterious because the buffer boundary is always checked when poke is used. The error message of free() on Linux is not so kind. But when I got the same segfault on macOS, the following message was displayed:
mighty(75755,0x700009519000) malloc: Incorrect checksum for freed object 0x7fb8de80ea00: probably modified after being freed.
Eureka! Even if the boundary is checked every time, this segfault happens because a freed buffer is used.
But why is a freed buffer used? This is one of the difficulties of multi-threaded programming. Suppose thread A and thread B share a buffer. The following is an example clean-up procedure:
Thread A sends a kill signal to thread B
Thread A frees the buffer
Thread A exits
This looks perfect. However, the timing of the termination of thread B depends on the scheduler. Even after thread A has freed the buffer, thread B may still be alive and can manipulate the buffer.
To prevent this use of a freed buffer, I gave up the approach of Foreign.Marshal.Alloc.mallocBytes and Foreign.Marshal.Alloc.free. Instead, I started using GHC.ForeignPtr.mallocPlainForeignPtrBytes. Buffers allocated by this function are GCed like ByteString.
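A minimal sketch of this approach follows. It is not the server's code: newBuffer and writeThenRead are hypothetical helpers. The point is that no thread calls free(). The buffer stays alive as long as any thread still holds the ForeignPtr, and the garbage collector reclaims it afterwards.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Data.Word (Word8)
import Foreign.ForeignPtr (ForeignPtr, withForeignPtr)
import Foreign.Storable (peekByteOff, pokeByteOff)
import GHC.ForeignPtr (mallocPlainForeignPtrBytes)

-- Allocate a buffer managed by the GC; there is no explicit free.
newBuffer :: Int -> IO (ForeignPtr Word8)
newBuffer = mallocPlainForeignPtrBytes

-- Write one byte at the given offset and read it back.
-- withForeignPtr keeps the buffer alive while the raw pointer is in use.
-- The caller is still responsible for keeping the offset within the buffer.
writeThenRead :: ForeignPtr Word8 -> Int -> Word8 -> IO Word8
writeThenRead buf off v = withForeignPtr buf $ \p -> do
    pokeByteOff p off v
    peekByteOff p off

main :: IO ()
main = do
    buf <- newBuffer 16
    done <- newEmptyMVar
    -- "Thread B" shares the buffer simply by capturing the ForeignPtr.
    _ <- forkIO $ writeThenRead buf 0 42 >>= putMVar done
    v <- takeMVar done
    -- "Thread A" does not free anything; it just drops its reference.
    print v
```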
Now I believe that my QUIC server is much more stable than before.
I found an elegant solution for the problem of Haskell vs Linux capabilities explained in "QUIC and Linux capabilities". To know why the CAP_NET_BIND_SERVICE capability is necessary, please read this article in advance.
On Linux, the following is the procedure to boot a secure multi-threaded server with CAP_NET_BIND_SERVICE:
Setting SECBIT_KEEP_CAPS by prctl(2)
-- Without this, all capabilities are lost after setuid(2).
Switching the root user to nobody (or something) by setuid(2).
Dropping capabilities except CAP_NET_BIND_SERVICE by capset(2).
Spawning native threads. CAP_NET_BIND_SERVICE is inherited by all native threads.
The GHC RTS executes Haskell code after spawning native threads. So, there are two problems in implementing a secure multi-threaded server with CAP_NET_BIND_SERVICE in Haskell:
1. How to set SECBIT_KEEP_CAPS to all native threads?
2. How to drop capabilities except CAP_NET_BIND_SERVICE of all native threads?
For 1), by reading the source code of the GHC RTS, I finally found a C-level hook called FlagDefaultsHook(). The user manual has a section on hooks to change RTS behaviour, but this hook is not documented there, sigh. The GHC RTS executes this hook before spawning native threads. So, if the following code is linked into your Haskell program, all native threads keep all capabilities after setuid(2), yay!
For 2), I considered that signals can be used. On Linux, we can get the thread IDs of all native threads in a process by scanning /proc/<process id>/task/. And Linux provides tgkill(2) to send a signal to the native thread specified by a thread ID.
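The scanning part can be sketched as follows, assuming Linux and the directory package. parseThreadIds and nativeThreadIds are hypothetical names, and the sketch uses /proc/self/task, which refers to the calling process. Sending the signal with tgkill(2) would need an FFI binding and is not shown.

```haskell
import Data.Maybe (mapMaybe)
import System.Directory (listDirectory)
import Text.Read (readMaybe)

-- Each entry under the task directory is named after a thread ID.
-- Entries that are not plain numbers are skipped.
parseThreadIds :: [FilePath] -> [Int]
parseThreadIds = mapMaybe readMaybe

-- Thread IDs of all native threads of the current process (Linux only).
nativeThreadIds :: IO [Int]
nativeThreadIds = parseThreadIds <$> listDirectory "/proc/self/task"

main :: IO ()
main = nativeThreadIds >>= print
```

With the threaded RTS, the list normally contains more than one ID even before the program forks any Haskell thread.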
I first tried to use installHandler of Haskell to install a signal handler. But it turned out that an improper native thread catches the signal from tgkill(2), sigh. So, I used sigaction(2) again in FlagDefaultsHook().
The following is the procedure to implement a secure multi-threaded server with CAP_NET_BIND_SERVICE in Haskell:
Switching the root user to nobody (or something) by setuid(2).
Sending signals to all native threads to drop capabilities except CAP_NET_BIND_SERVICE by tgkill(2).
You can see a concrete implementation in this commit.
One awkward thing is that the capabilities of the process itself remain at a wrong value. It seems to me that capset(2) for a process is not permitted if it is multi-threaded. However, if I understand correctly, there is no way to access or inherit the capabilities of the process in the GHC RTS. So, I don't worry about it much.
For security reasons, the typical boot process of HTTPS servers is as follows:
Executed by root.
Reading a TLS private key and opening a listen socket on TCP port 443.
Switching the root user to nobody (or something).
Since accept() can create connected sockets bound to TCP port 443 even with non-root privilege, servers can accept connections.
Let's consider the case of QUIC servers, which use UDP. Chrome does not allow Alt-Svc to go across the privileged boundary (i.e. 1024). For instance, Alt-Svc: h3=":4433" provided on TCP port 443 does not work. QUIC servers should provide QUIC on UDP port 443.
Some QUIC servers make use of connected UDP sockets. As I described in "Implementation status of QUIC in Haskell", the following procedure can be used to create a connected UDP socket when a packet is received on a kind of listen socket of 192.0.2.1:443:
Connect it to 203.0.113.0:3456. This also binds the local address to 192.0.2.1.
For 2), the root privilege or the CAP_NET_BIND_SERVICE capability is necessary on Linux. The easiest way to implement secure QUIC servers is to use the setcap command:
This probably works for most QUIC servers. However, this is not the case for Haskell. Linux capabilities are per-thread. The GHC threaded RTS spawns some native threads and then runs Haskell programs. If I understand correctly, there is no way to set SECBIT_KEEP_CAPS for all native threads.
The manual page of capabilities says:
Neither glibc, nor the Linux kernel honors POSIX semantics for
setting capabilities and securebits in the presence of pthreads. That
is, changing capability sets, by default, only affect the running
thread. To be meaningfully secure, however, the capability sets
should be mirrored by all threads within a common program because
threads are not memory isolated. As a workaround for this, libcap is
packaged with a separate POSIX semantics system call library: libpsx.
If your program uses POSIX threads, to achieve meaningful POSIX
semantics capability manipulation, you should link your program with:
h2spec is an excellent test tool to check if HTTP/2 servers can handle error cases correctly. When I was developing an HTTP/2 server library in Haskell, I used Firefox and Chrome for normal cases and h2spec for error cases. h2spec helped me a lot to improve the quality of the library. What is surprising is that the author, Moto Ishizawa, is still enhancing h2spec!
Since the QUIC library in Haskell supports both the client side and the server side, normal cases are tested by itself. I wanted to test error cases, too. After thinking about it for two weeks, it turned out that QUIC error cases can be made easily. The key idea is providing hooks to convert data structures:
Hooks are passed to the client/server runners via configuration. I used our lovely hspec to run the test:
transportSpec :: ClientConfig -> SpecWith a
transportSpec cc0 = do
    describe "QUIC servers" $ do
        it "MUST send FRAME_ENCODING_ERROR if a frame of unknown type is received [Transport 12.4]" $ \_ -> do
            let cc = addHook cc0 $ setOnPlainCreated unknownFrame
            runC cc waitEstablished `shouldThrow` transportError FrameEncodingError

transportError :: TransportError -> QUICError -> Bool
transportError te (TransportErrorOccurs te' _) = te == te'
transportError _ _ = False
The notation of `shouldThrow` transportError FrameEncodingError is really cool, isn't it?
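For readers who want to see the idea in isolation, here is a toy model of the approach that needs only base. Everything in it is a hypothetical stand-in: Frame, Hooks, receiveFrames, run and throwsMatching are invented for this sketch and are not the quic library's or hspec's definitions. A hook rewrites the frames before they are "sent", the toy peer throws a transport error, and a selector function checks which error was thrown.

```haskell
import Control.Exception (Exception, throwIO, try)

-- Hypothetical stand-ins for the real types.
data Frame = Ping | UnknownFrame Int deriving (Eq, Show)
data TransportError = FrameEncodingError | ProtocolViolation deriving (Eq, Show)
newtype QUICError = TransportErrorOccurs TransportError deriving (Eq, Show)
instance Exception QUICError

-- A hook converts the frames just before they are sent.
newtype Hooks = Hooks { onPlainCreated :: [Frame] -> [Frame] }

defaultHooks :: Hooks
defaultHooks = Hooks id

-- A toy peer that rejects empty packets and unknown frames.
receiveFrames :: [Frame] -> IO ()
receiveFrames fs
    | null fs = throwIO (TransportErrorOccurs ProtocolViolation)
    | any isUnknown fs = throwIO (TransportErrorOccurs FrameEncodingError)
    | otherwise = pure ()
  where
    isUnknown (UnknownFrame _) = True
    isUnknown _ = False

-- A toy runner: builds a normal packet, applies the hook, delivers it.
run :: Hooks -> IO ()
run hooks = receiveFrames (onPlainCreated hooks [Ping])

-- A simplified analogue of a selector-based check: True only if the
-- action throws an exception accepted by the selector.
throwsMatching :: Exception e => IO a -> (e -> Bool) -> IO Bool
throwsMatching action selector = do
    r <- try action
    pure $ case r of
        Left e -> selector e
        Right _ -> False

transportError :: TransportError -> QUICError -> Bool
transportError te (TransportErrorOccurs te') = te == te'

main :: IO ()
main = do
    ok <- run (Hooks (UnknownFrame 0x20 :)) `throwsMatching` transportError FrameEncodingError
    putStrLn $ "unknown frame rejected with FRAME_ENCODING_ERROR: " ++ show ok
```

Because the hook is just a function in the configuration, each error case only differs in which conversion is plugged in.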
After adding some error cases, I hit upon the idea of a command line tool that reuses these error cases by adding another main function. Moto agreed that I could take the name h3spec. Here is an example of execution of h3spec:
% h3spec -v
h3spec 0.0.4
% h3spec mew.org 443
QUIC servers
MUST send TRANSPORT_PARAMETER_ERROR if initial_source_connection_id is missing [Transport 7.3]
MUST send TRANSPORT_PARAMETER_ERROR if original_destination_connection_id is received [Transport 18.2]
MUST send TRANSPORT_PARAMETER_ERROR if preferred_address, is received [Transport 18.2]
MUST send TRANSPORT_PARAMETER_ERROR if retry_source_connection_id is received [Transport 18.2]
MUST send TRANSPORT_PARAMETER_ERROR if stateless_reset_token is received [Transport 18.2]
MUST send TRANSPORT_PARAMETER_ERROR if max_udp_payload_size is invalid [Transport 7.4 and 18.2]
MUST send FRAME_ENCODING_ERROR if a frame of unknown type is received [Transport 12.4]
MUST send PROTOCOL_VIOLATION on no frames [Transport 12.4]
MUST send PROTOCOL_VIOLATION if reserved bits in Handshake are non-zero [Transport 17.2]
MUST send PROTOCOL_VIOLATION if reserved bits in Short are non-zero [Transport 17.2]
MUST send PROTOCOL_VIOLATION if NEW_TOKEN is received [Transport 19.7]
MUST send STREAM_STATE_ERROR if MAX_STREAM_DATA is received for a non-existing stream [Transport 19.9] FAILED [1]
MUST send PROTOCOL_VIOLATION if HANDSHAKE_DONE is received [Transport 19.20]
MUST send no_application_protocol TLS alert if no application protocols are supported [TLS 8.1]
MUST the send missing_extension TLS alert if the quic_transport_parameters extension does not included [TLS 8.2]
Failures:
Transport.hs:60:13:
1) QUIC servers MUST send STREAM_STATE_ERROR if MAX_STREAM_DATA is received for a non-existing stream [Transport 19.9]
did not get expected exception: QUICError
To rerun use: --match "/QUIC servers/MUST send STREAM_STATE_ERROR if MAX_STREAM_DATA is received for a non-existing stream [Transport 19.9]/"
Randomized with seed 1914918977
Finished in 0.7035 seconds
15 examples, 1 failure
QUIC has been a hot topic lately. As the following blog posts describe, the standardization of QUIC is now in the final stage and h3-29 (HTTP/3 over QUIC of draft 29) is out there:
I expect that this would happen in April 2021. For the 2nd half of the Japanese fiscal year 2020 (Oct 2020 - Mar 2021), I'm planning to tackle the following items (in no particular order):
Error handling -- the current error handling is really poor
QPACK encoding with the dynamic table -- the current implementation only makes use of the static table
In "Implementation status of QUIC in Haskell", I briefly described the QUIC APIs of the TLS library in Haskell. I first invented APIs based on static functions but switched to the thread-based approach following Olivier Chéron's recommendation. The current APIs have since gone two steps further. This article describes how Olivier and I improved the thread-based APIs.
Users of the controller can receive key information, TLS extension for QUIC transport parameters, etc through ClientStatus. The following is an example usage of the controller:
handshakeClient :: ClientConfig -> Connection -> IO ()
handshakeClient conf conn = do
    ver <- getVersion conn
    let sendEarlyData = isJust $ ccEarlyData conf
    control <- clientController conf ver (setResumptionSession conn) sendEarlyData
    setClientController conn control
    sendClientHelloAndRecvServerHello control conn $ ccEarlyData conf
    recvServerFinishedSendClientFinished control conn

sendClientHelloAndRecvServerHello :: ClientController -> Connection -> Maybe (StreamId,ByteString) -> IO ()
sendClientHelloAndRecvServerHello control conn mEarlyData = do
    SendClientHello ch0 mEarlySecInf <- control GetClientHello
    setEarlySecretInfo conn mEarlySecInf
    sendCryptoData conn $ OutHndClientHello ch0 mEarlyData
    (InitialLevel, sh0) <- recvCryptoData conn
    state0 <- control $ PutServerHello sh0
    case state0 of
      RecvServerHello hndSecInf -> do
          setHandshakeSecretInfo conn hndSecInf
          setEncryptionLevel conn HandshakeLevel
      SendClientHello ch1 mEarlySecInf1 -> do
          setEarlySecretInfo conn mEarlySecInf1
          sendCryptoData conn $ OutHndClientHello ch1 Nothing
          (InitialLevel, sh1) <- recvCryptoData conn
          state1 <- control $ PutServerHello sh1
          case state1 of
            RecvServerHello hndSecInf -> do
                setHandshakeSecretInfo conn hndSecInf
                setEncryptionLevel conn HandshakeLevel
            _ -> E.throwIO $ HandshakeFailed "sendClientHelloAndRecvServerHello"
      _ -> E.throwIO $ HandshakeFailed "sendClientHelloAndRecvServerHello"

recvServerFinishedSendClientFinished :: ClientController -> Connection -> IO ()
recvServerFinishedSendClientFinished control conn = loop (1 :: Int)
  where
    loop n = do
        (HandshakeLevel, eesf) <- recvCryptoData conn
        state <- control $ PutServerFinished eesf
        case state of
          ClientNeedsMore -> do
              -- Sending ACKs for three times rule
              when ((n `mod` 3) == 2) $
                  sendCryptoData conn $ OutControl HandshakeLevel []
              loop (n + 1)
          SendClientFinished cf exts appSecInf -> do
              setApplicationSecretInfo conn appSecInf
              setEncryptionLevel conn RTT1Level
              setPeerParams conn exts
              sendCryptoData conn $ OutHndClientFinished cf
          _ -> E.throwIO $ HandshakeFailed "putServerFinished"
NewSessionTicket in a CRYPTO frame is passed to the controller by a receiver thread.
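The general shape of such a controller can be sketched with two MVars: a designated thread runs the handshake sequentially, and the controller is just a function that hands it a request and waits for the resulting status. The types and names below (Control, Status, newController, handshake) are simplified, hypothetical stand-ins and not the actual definitions in the tls or quic libraries.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Hypothetical, much simplified request and status types.
data Control = GetHello | PutReply String
data Status = SendHello String | Done String | Failed String
    deriving (Eq, Show)

-- The controller is a function: one request in, one status out.
type Controller = Control -> IO Status

newController :: IO Controller
newController = do
    reqVar <- newEmptyMVar
    resVar <- newEmptyMVar
    -- The designated thread owns the handshake state.
    _ <- forkIO $ handshake (takeMVar reqVar) (putMVar resVar)
    return $ \req -> putMVar reqVar req >> takeMVar resVar

-- A toy two-step handshake written as straight-line code.
handshake :: IO Control -> (Status -> IO ()) -> IO ()
handshake recvReq sendRes = do
    r1 <- recvReq
    case r1 of
        GetHello -> do
            sendRes (SendHello "client-hello")
            r2 <- recvReq
            case r2 of
                PutReply sh -> sendRes (Done ("keys for " ++ sh))
                GetHello -> sendRes (Failed "unexpected GetHello")
        PutReply _ -> sendRes (Failed "unexpected PutReply")

main :: IO ()
main = do
    control <- newController
    control GetHello >>= print
    control (PutReply "server-hello") >>= print
```

In this toy version the designated thread exits after the second step, so a third request would block forever. It also assumes a single user of the controller.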
Introducing callbacks
Olivier noticed that both the designated thread and the user of the controller manage the TLS state. To reduce the amount of state on the user side, he defined callbacks:
The following are the definitions of handshakeClient and handshakeClientAsync:
handshakeClient :: ClientConfig -> Connection -> IO ()
handshakeClient conf conn = do
    ver <- getVersion conn
    hsr <- newHndStateRef
    let sendEarlyData = isJust $ ccEarlyData conf
        qc = QUICCallbacks { quicSend = sendTLS conn hsr
                           , quicRecv = recvTLS conn hsr
                           , quicInstallKeys = installKeysClient
                           , quicNotifyExtensions = setPeerParams conn
                           }
    control <- clientController qc conf ver (setResumptionSession conn) sendEarlyData
    setClientController conn control
    state <- control EnterClient
    case state of
      ClientHandshakeComplete -> return ()
      ClientHandshakeFailed e -> notifyPeer conn e >>= E.throwIO
      _ -> E.throwIO $ HandshakeFailed $ "handshakeClient: unexpected " ++ show state
  where
    installKeysClient (InstallEarlyKeys mEarlySecInf) = do
        setEarlySecretInfo conn mEarlySecInf
        sendCryptoData conn $ OutEarlyData (ccEarlyData conf)
    installKeysClient (InstallHandshakeKeys hndSecInf) = do
        setHandshakeSecretInfo conn hndSecInf
        setEncryptionLevel conn HandshakeLevel
    installKeysClient (InstallApplicationKeys appSecInf) = do
        setApplicationSecretInfo conn appSecInf
        setEncryptionLevel conn RTT1Level

-- second half of the TLS handshake, executed out of the main thread
handshakeClientAsync :: Connection -> ClientController -> IO ()
handshakeClientAsync conn control = handleLog logAction $ forever $ do
    state <- control RecvSessionTickets
    case state of
      ClientRecvSessionTicket -> return ()
      ClientHandshakeFailed e -> notifyPeerAsync conn e >>= E.throwIO
      _ -> E.throwIO $ HandshakeFailed $ "unexpected " ++ show state
  where
    logAction msg = connDebugLog conn ("client handshake: " ++ msg)
Stateless APIs
When I tried to implement HTTP/3, I noticed a disadvantage of these APIs. handshakeClient returns when 1-RTT gets ready. This means that dynamically-created early data of 0-RTT cannot be sent.
So, I removed both ClientControl and ClientStatus completely from users. newQUICClient was renamed to tlsQUICClient and it does not return ClientController anymore: