Best practices say you should follow the latest Internet Standards defined in IETF RFCs. You should, but that does not mean you must follow them to the letter, especially if you are designing a new protocol with very specific and controlled constraints (no, I’m not saying you should forget the standards!). Drawing on my work on a faster mobile content delivery service, I will explain the challenges and limitations we faced, why we needed to abstract the socket implementation from the security protocols, and how we establish a secure channel faster with pre-negotiated ciphers in our specific environment.
At this stage you are probably thinking we are reinventing the wheel, which is something you should never do in security. However, we do not implement any new security solution, cryptographic primitive or encryption algorithm. Instead, we ensure secure communication through two well-known and tested open source libraries: OpenSSL for server authentication, and Libsodium for the key exchange protocol and data encryption. The algorithms are used as described in TLS 1.3, but since we control both the client and server sides, we choose the most efficient algorithms and ciphers up front instead of negotiating them. Any Internet application (e.g., web browsing, email, or instant messaging) can do the same, as long as both client and server sides are controlled by the same entity.
HTTPS (Hypertext Transfer Protocol Secure) is a protocol used for secure content delivery over communication networks, and is widely used over the Internet. In HTTPS, the communication data is encrypted using Transport Layer Security (TLS), the successor of Secure Sockets Layer (SSL), and travels from the origin to the destination over an underlying transport that provides reliable and ordered delivery of a stream of bytes. On the Internet, this is ensured by the Transmission Control Protocol (TCP), since the User Datagram Protocol (UDP) offers neither reliability nor in-order packet delivery. There is also a variant of TLS that runs over UDP, called Datagram Transport Layer Security (DTLS).
Codavel’s mission is to speed up mobile content delivery, for any user, device, content or network. Our software, Bolina, is an SDK consisting of two main components: Bolina Client (integrated into the mobile application, and responsible for handling application requests) and Bolina Server (responsible for the interaction with the Content Server). Communication between Bolina Client and Bolina Server is handled by Bolina Protocol, a fast, reliable and secure transport that copes with network instability, providing significant speed improvements in all kinds of network conditions while ensuring a state-of-the-art level of security.
Depending on network conditions and restrictions, Bolina Protocol chooses the best underlying protocol (TCP, UDP, or simultaneously both) to deliver the best performance. As a consequence, security-wise, Bolina abstracts the transport layer and creates a common security layer for both TCP and UDP, called Bolina Layer Security (BLS), which is an instance of standard Transport Layer Security (TLS) employed on top of the reliable and ordered Bolina transport, ensuring that messages exchanged between client and server cannot be eavesdropped, modified or forged. To understand the need for a new layer, how BLS works and why it is secure, we need to take a step back and look at how TLS ensures security in common web applications.
Transport Layer Security
The security properties provided by a TLS secure channel are authentication, confidentiality and integrity. Authentication ensures that the origin of a message can be verified. Confidentiality prevents unauthorized users from reading secret information. Data integrity prevents unauthorized users from altering the data without being detected. These guarantees rest on the TLS handshake protocol, which ensures that application-layer traffic sent after this phase is encrypted and authenticated.
The TLS handshake can be split into three different mechanisms: parameters negotiation (negotiating a protocol version, selecting cryptographic algorithms, and specifying server parameters such as whether the client is authenticated, application-layer protocol support, etc.), authentication (authenticating the server and, optionally, the client, and providing key confirmation and handshake integrity) and key exchange (establishing keying material to be used to protect application-layer traffic).
The primary goal of TLS parameters negotiation is to allow the interoperability between different client and server capabilities (e.g., devices possibly made by different manufacturers, running different operating systems and software libraries, and hence supporting a different set of TLS versions, cipher suites and extensions). A given parameter or extension can only be used if supported by both peers.
The objective of the authentication is to ensure that the origin of a message can be verified. In TLS, the server is always authenticated by the client, but the client is optionally authenticated by the server. Authentication can be performed by resorting to asymmetric cryptography (e.g., RSA, Elliptic Curve Digital Signature Algorithm (ECDSA), or Edwards-Curve Digital Signature Algorithm (EdDSA)), or via a symmetric pre-shared key (PSK).
The main purpose of key exchange is for client and server to compute a shared secret key to be used to encrypt application-layer traffic. For key exchange to occur, either a pre-shared key (PSK) is used (e.g., a key provisioned out of band or established in a previous connection), or both client and server agree on a shared secret using an ephemeral Diffie-Hellman algorithm, over either finite fields (DHE) or elliptic curves (ECDHE). This shared secret is then used by both peers to compute a shared key to secure the established connection via the symmetric encryption algorithms and parameters negotiated between client and server.
Bolina Layer Security
Recall that Bolina's objective is to speed up transport and, to that end, it uses Transmission Control Protocol (TCP) or User Datagram Protocol (UDP) (or both simultaneously), depending on network conditions and restrictions. In other words, Bolina chooses the best underlying protocol to deliver the best performance. As a consequence, security-wise, Bolina abstracts the transport layer and creates a common secure session for both TCP and UDP.
As we have seen before, TLS implementations require an underlying transport that provides reliable and ordered delivery of a stream of bytes, e.g., TCP. Based on TLS, Datagram Transport Layer Security (DTLS) allows client/server applications to communicate over an unreliable transport protocol (e.g., UDP) with equivalent security guarantees. Since TLS and DTLS each depend on the transport layer in use, applying them to Bolina (TLS on top of the TCP link and DTLS on top of the UDP link) would mean duplicating effort to provide two different secure channels, probably maintaining two different libraries for multiple platforms, and potentially rolling out different versions of the security protocols, which would be a very inefficient use of resources.
In contrast with this simple but inefficient solution, BLS abstracts the transport layer and establishes a common secure session for both TCP and UDP, dissociating the security layer from the socket implementation without losing TLS properties. In particular, TLS data (e.g., handshake, alerts and application data), which is usually carried by TLS records and sent over TCP, is carried by Bolina packets and sent over TCP or UDP after being cryptographically protected.
BLS is thus an instance of TLS providing the authentication, confidentiality and integrity properties of TLS, but decoupled from the socket implementation. In particular, application-layer data that usually would be protected by TLS before being sent over TCP is carried by Bolina packets and sent over TCP or UDP after being cryptographically protected by BLS. As in standard TLS, BLS security guarantees are ensured due to the initial handshake protocol, which we overview next.
Recall that the TLS handshake is split into three different mechanisms (parameters negotiation, authentication, and key exchange) and that the primary goal of parameters negotiation is to allow interoperability between different client and server capabilities. This includes protocol version, cryptographic algorithms, server parameters such as whether the client is authenticated, application-layer protocol support, etc. Since we control both Bolina Client and Bolina Server capabilities through Bolina SDK, the parameters negotiation mechanism of TLS becomes unnecessary in BLS. In particular, BLS pre-selects those parameters, using the most efficient, fastest and most secure algorithms provided by the latest version of TLS, TLS 1.3. In this context, the BLS handshake is an instance of the standard TLS handshake that uses TLS authentication and key exchange mechanisms for a fixed (instead of negotiated) set of parameters. We will now turn our attention to BLS authentication and key exchange mechanisms.
In BLS, Bolina Server is authenticated, but the Bolina Client remains anonymous. In order to ensure that Bolina Client is talking to the right Bolina Server, BLS follows the standard steps of TLS handshake protocol for Server Authentication, including Hostname Verification and Certificate Chain Validation using a TLS certificate.
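To make these server authentication checks concrete, here is a minimal sketch using Python's standard `ssl` module, which wraps OpenSSL. It is not Bolina code: BLS carries handshake messages in Bolina packets rather than over a TLS socket, and the helper name `make_server_auth_context` is hypothetical. It only shows the kind of checks a client enables: chain validation against trusted authorities, hostname verification, and a pinned protocol version.

```python
import ssl


def make_server_auth_context() -> ssl.SSLContext:
    """Client-side context that authenticates the server (hypothetical helper)."""
    # Load the trusted certificate authorities configured on this system.
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    # Certificate chain validation: refuse servers without a valid chain.
    ctx.verify_mode = ssl.CERT_REQUIRED
    # Hostname verification: the certificate must be valid for the domain.
    ctx.check_hostname = True
    # Pin the protocol version instead of accepting older ones.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx


ctx = make_server_auth_context()
```

With such a context, a connection attempt fails during the handshake if the certificate is expired, not signed by a trusted authority, or not valid for the requested hostname.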
Bolina uses Elliptic Curve Diffie-Hellman (ECDH) as the key exchange protocol, allowing Bolina Client and Bolina Server, each having an Elliptic Curve public-private key pair, to agree on a shared secret over an untrusted channel. This shared secret is used to derive a secret key, which in turn is used to encrypt subsequent communications using symmetric cryptography. Key exchange is protected against tampering attacks, as it produces shared secrets that cannot be controlled by either participating peer.
Similar to TLS 1.3, BLS provides two handshake modes:
- A full 1-RTT handshake in which the client is able to send protected application data after one round trip time;
- A 0-RTT handshake in which the client is able to send protected application data immediately, using information learned from a prior connection established with a full handshake.
In 1-RTT handshake mode, the client starts by sending a ClientHello message to the server, containing key shares. The server replies with a ServerHello message, containing key shares and a certificate chain, signed with the server's RSA/ECDSA Private Key. After this message exchange, the client verifies the server's identity, ensuring that:
- Certificates are not expired;
- Chain of trust is valid and signed by a trusted certificate authority;
- Certificates are valid for the domain;
- Signature on the ServerHello message matches server RSA/ECDSA Public Key.
Using their own X25519 Private Key and the other peer’s X25519 Public Key, both client and server compute a shared secret, aborting the negotiation if it is the all-zero value. From the shared secret, and from X25519 Public Keys, both peers derive a symmetric key to send and receive encrypted application data over the connection. In 1-RTT mode, client and server can start sending protected application data after 1-RTT and 1.5-RTT, respectively.
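The key agreement described above can be sketched in plain Python. The `x25519` function follows the Montgomery ladder from RFC 7748; it is written for readability, is not constant-time, and should not replace Libsodium in real code. The key derivation step is an assumption: the text only says the key comes from the shared secret and both public keys, so this sketch hashes them with BLAKE2b, loosely modeled on the construction documented for Libsodium's `crypto_kx` API. The name `derive_session_keys` and the split into two directional keys are hypothetical.

```python
import hashlib
import secrets

P = 2**255 - 19                          # field prime of Curve25519
A24 = 121665                             # ladder constant from RFC 7748
BASE_POINT = (9).to_bytes(32, "little")  # standard X25519 base point


def x25519(scalar: bytes, u_coordinate: bytes) -> bytes:
    """X25519 scalar multiplication (RFC 7748). Educational, not constant-time."""
    k = bytearray(scalar)
    k[0] &= 248                          # clamp the private scalar
    k[31] &= 127
    k[31] |= 64
    k_int = int.from_bytes(k, "little")
    u = bytearray(u_coordinate)
    u[31] &= 127                         # ignore the unused top bit
    x1 = int.from_bytes(u, "little") % P
    x2, z2, x3, z3, swap = 1, 0, x1, 1, 0
    for t in reversed(range(255)):
        bit = (k_int >> t) & 1
        swap ^= bit
        if swap:
            x2, x3, z2, z3 = x3, x2, z3, z2
        swap = bit
        a, b = (x2 + z2) % P, (x2 - z2) % P
        aa, bb = a * a % P, b * b % P
        e = (aa - bb) % P
        da = (x3 - z3) * a % P
        cb = (x3 + z3) * b % P
        x3 = (da + cb) ** 2 % P
        z3 = x1 * (da - cb) ** 2 % P
        x2 = aa * bb % P
        z2 = e * (aa + A24 * e) % P
    if swap:
        x2, x3, z2, z3 = x3, x2, z3, z2
    return (x2 * pow(z2, P - 2, P) % P).to_bytes(32, "little")


def derive_session_keys(shared: bytes, client_pk: bytes, server_pk: bytes):
    """Hypothetical KDF: hash the shared secret with both public keys."""
    if shared == bytes(32):
        # Abort the negotiation on the all-zero value, as the text requires.
        raise ValueError("all-zero shared secret, aborting handshake")
    okm = hashlib.blake2b(shared + client_pk + server_pk, digest_size=64).digest()
    return okm[:32], okm[32:]            # one key per flow direction (assumed)


# Each peer generates a key pair and sends its public key in its Hello message.
client_sk, server_sk = secrets.token_bytes(32), secrets.token_bytes(32)
client_pk = x25519(client_sk, BASE_POINT)
server_pk = x25519(server_sk, BASE_POINT)

# Own private key plus the other peer's public key gives the same secret.
client_shared = x25519(client_sk, server_pk)
server_shared = x25519(server_sk, client_pk)
client_keys = derive_session_keys(client_shared, client_pk, server_pk)
server_keys = derive_session_keys(server_shared, client_pk, server_pk)
```

Binding both public keys into the derivation means the resulting keys depend on the exact key shares exchanged, not just on the raw Diffie-Hellman output.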
When a client is establishing a connection to a known server (i.e., a server to which the client has already connected in the past), it can establish a secure connection with a 0-RTT handshake. To achieve this, the client adds 0-RTT data (data that allows the server to look up a symmetric key) to the ClientHello message and uses a shared secret, obtained in a previous handshake, to authenticate the server and to encrypt the application data, which is sent at the same time as the ClientHello message. The server can decrypt the received data using the symmetric key identified in the 0-RTT data, and encrypt messages to the client with the same key. Client and server also send each other a new key share to use in a future 0-RTT handshake. In contrast with TLS 1.3, which allows zero round trips in the TLS handshake itself but only after the TCP three-way handshake is completed, BLS goes further by allowing a true 0-RTT experience. This is achieved by allowing clients to send application data at the very first moment of the "UDP connection", before completing any other handshake.
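The bookkeeping behind this 0-RTT mode can be sketched as two small tables. This is an illustration under assumptions, not the BLS wire format: the class names, the 16-byte random key identifier, and the fallback to a full handshake when the identifier is unknown are all hypothetical, and the actual encryption of early data is left out.

```python
import secrets
from typing import Dict, Optional, Tuple


class ServerSessionStore:
    """Server side: maps a key identifier to a previously agreed symmetric key."""

    def __init__(self) -> None:
        self._keys: Dict[bytes, bytes] = {}

    def remember(self, key: bytes) -> bytes:
        """Store a key after a full handshake and return its identifier."""
        key_id = secrets.token_bytes(16)  # identifier format is an assumption
        self._keys[key_id] = key
        return key_id

    def lookup(self, key_id: bytes) -> Optional[bytes]:
        """Return the key named in the 0-RTT data, or None if unknown."""
        return self._keys.get(key_id)


class ClientSessionCache:
    """Client side: remembers (key_id, key) for servers already contacted."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Tuple[bytes, bytes]] = {}

    def store(self, host: str, key_id: bytes, key: bytes) -> None:
        self._sessions[host] = (key_id, key)

    def handshake_mode(self, host: str) -> str:
        """Pick 0-RTT for known servers, otherwise the full 1-RTT handshake."""
        return "0-RTT" if host in self._sessions else "1-RTT"

    def session_for(self, host: str) -> Tuple[bytes, bytes]:
        return self._sessions[host]


# After a first, full handshake both sides remember the agreed key.
server_store = ServerSessionStore()
client_cache = ClientSessionCache()
agreed_key = secrets.token_bytes(32)
key_id = server_store.remember(agreed_key)
client_cache.store("example.com", key_id, agreed_key)

# On reconnection the client sends key_id inside its ClientHello.
mode = client_cache.handshake_mode("example.com")
sent_id, client_key = client_cache.session_for("example.com")
server_key = server_store.lookup(sent_id)
```

If `lookup` returns `None`, for example because the server has discarded the key, the natural reaction in this sketch is to fall back to the full 1-RTT handshake.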
Bolina uses an Authenticated Encryption with Associated Data (AEAD) algorithm, ensuring that all messages are encrypted with the secret key agreed in the key exchange step and that forged messages are rejected. AEAD encryption provides data confidentiality and integrity, ensuring that an attacker is unable to read or modify an existing plaintext message. Together with a nonce that is never reused, it also offers non-replayability, denying an attacker the ability to send the receiver a message it has already received. Using a 256-bit shared secret key, a 96-bit nonce, and 64 bits of additional data, BLS uses the stream cipher ChaCha20 for encryption and Poly1305 for message authentication, for all communications between Bolina Client and Bolina Server. The nonce used as input to AEAD encryption must never be repeated for the same symmetric key. This is ensured by partitioning the nonce space so that the first bit is unique per sender (one value for each flow direction), with the remaining bits coming from a counter, as recommended in RFC 5116.
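The nonce partitioning can be illustrated with a small generator. This is a sketch under assumptions: the text does not specify byte order or which value each direction uses, so the big-endian layout, the `CLIENT`/`SERVER` constants and the class name `NonceGenerator` are hypothetical. Only the 96-bit size, the per-sender first bit and the counter come from the description above.

```python
CLIENT, SERVER = 0, 1            # assumed assignment of the sender bit
NONCE_BITS = 96                  # nonce size stated in the text
COUNTER_BITS = NONCE_BITS - 1    # everything after the sender bit


class NonceGenerator:
    """Produces nonces that never repeat for one key and one flow direction."""

    def __init__(self, sender_bit: int) -> None:
        if sender_bit not in (CLIENT, SERVER):
            raise ValueError("sender bit must be 0 or 1")
        self._sender_bit = sender_bit
        self._counter = 0

    def next(self) -> bytes:
        if self._counter >= 1 << COUNTER_BITS:
            # Reusing a nonce would break the AEAD guarantees, so stop here.
            raise OverflowError("nonce space exhausted, a new key is required")
        # First bit identifies the sender, remaining 95 bits are the counter.
        value = (self._sender_bit << COUNTER_BITS) | self._counter
        self._counter += 1
        return value.to_bytes(NONCE_BITS // 8, "big")


client_nonces = NonceGenerator(CLIENT)
server_nonces = NonceGenerator(SERVER)
first_client = client_nonces.next()
second_client = client_nonces.next()
first_server = server_nonces.next()
```

Because the first bit differs per direction, both peers can use the same symmetric key and start their counters at zero without ever producing the same nonce.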
Protection Against DoS Attacks
Network and application services, especially those that use connectionless transports, are particularly vulnerable to Denial-of-Service (DoS) attacks, two of which are of special concern:
- An attacker transmitting a series of handshake initiation requests, causing the server to consume an excessive amount of resources (e.g., allocate state, perform expensive cryptographic operations);
- An attacker sending connection initiation messages with the spoofed IP address of the victim. This kind of attack (reflection attack) can be very effective when the response sent by the server is larger than the request it received (amplification attack), thus flooding the victim machine.
To mitigate DoS attacks, BLS:
- Borrows the techniques described in QUIC-TLS and in QUIC-TRANSPORT that limit the number of messages the server can send to the client until it has confirmation that its messages are reaching the source address of the client that initiated the connection. In BLS this confirmation occurs when the server receives a message authenticated and encrypted with the symmetric key agreed with the client during the handshake. This mitigates the packet reflection attack, limits the scale of an amplification attack, and prevents an attacker from exhausting server resources during the handshake process.
- Minimizes the impact of an amplification attack in handshake messages by increasing the size of the ClientHello message and minimizing the size of the server’s response (the handshake is usually very asymmetrical, i.e., the client sends a few bytes whereas the server responds with its complete certificate chain, which can be quite large). BLS uses the padding technique from QUIC-TLS to increase the size of the client’s message, and follows the Cloudflare suggestion to reduce the size of the server’s response with techniques such as using ECDSA certificates (when possible), which are smaller than their RSA counterparts, or using compression algorithms like zlib. This procedure yields a more balanced handshake, making the amplification attack less effective.
- Ensures that an attacker cannot forge the acknowledgments of handshake packets, which are authenticated, similarly to what QUIC-TLS does.
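The first two mitigations can be sketched as follows. The constants are the ones QUIC uses (a server may send at most three times the bytes it has received from an unvalidated address, and client Initial datagrams are padded to at least 1200 bytes); whether BLS uses the same values is an assumption, and the names `pad_client_hello` and `AmplificationLimiter`, as well as zero-byte padding, are hypothetical.

```python
AMPLIFICATION_FACTOR = 3       # QUIC's limit; assumed here for illustration
MIN_CLIENT_HELLO_SIZE = 1200   # QUIC's minimum Initial size; assumed here


def pad_client_hello(message: bytes, minimum: int = MIN_CLIENT_HELLO_SIZE) -> bytes:
    """Pad the client's first message so the handshake is less asymmetrical."""
    return message + bytes(max(0, minimum - len(message)))


class AmplificationLimiter:
    """Caps what a server sends to an address it has not yet validated."""

    def __init__(self) -> None:
        self.bytes_received = 0
        self.bytes_sent = 0
        self.address_validated = False

    def on_receive(self, size: int, authenticated: bool = False) -> None:
        self.bytes_received += size
        if authenticated:
            # A message protected with the agreed symmetric key proves that
            # the client really receives traffic at its claimed address.
            self.address_validated = True

    def can_send(self, size: int) -> bool:
        if self.address_validated:
            return True
        return self.bytes_sent + size <= AMPLIFICATION_FACTOR * self.bytes_received

    def on_send(self, size: int) -> None:
        if not self.can_send(size):
            raise RuntimeError("anti-amplification limit reached")
        self.bytes_sent += size


hello = pad_client_hello(b"ClientHello")
limiter = AmplificationLimiter()
limiter.on_receive(len(hello))
within_budget = limiter.can_send(3 * len(hello))
over_budget = limiter.can_send(3 * len(hello) + 1)
limiter.on_receive(64, authenticated=True)
after_validation = limiter.can_send(10**6)
```

Padding the ClientHello raises the server's sending budget for legitimate clients while forcing an attacker with a spoofed address to spend more bandwidth per byte reflected at the victim.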
In contrast with BLS, both TLS 1.3 and DTLS 1.3 support a cookie extension to prevent both attacks, allowing the server to force the client to prove its reachability at its apparent network address. However, this cookie technique adds 1 extra RTT to the handshake, and it does not provide any defence against DoS attacks mounted from valid IP addresses.
One of the best practices to prevent security breaches in networked systems is to use the latest Internet Standards described in IETF RFCs. Although you should follow them to the letter in most cases, if you are designing a new protocol and sticking to the standard is not enough to fulfill your protocol requirements, my advice is to follow the latest RFC guidelines while keeping in mind the system setup and the use case in which your implementation will be used. For instance, if you control both client and server sides (which is not what usually happens across most Internet applications), you can skip the parameters negotiation phase of TLS and choose the best, most secure and most efficient ciphers. Also, if you know that TLS and DTLS were designed to secure TCP and UDP, respectively, and you use both transports, you should decouple the security layer from the socket implementation following RFC guidelines, without compromising the security of your protocol. That is exactly what we did here at Codavel, but of course these are only two examples of what can be done.