Together with Lenka Mareková, Kenny Paterson, Eyal Ronen and Igors Stepanovs, we have finally completed our (first, formal, in-depth, computational) analysis of the Telegram key exchange. This work is going to be presented at Eurocrypt 2025 in Madrid.
Abstract. We describe, formally model, and prove the security of Telegram’s key exchange protocols for client-server communications. To achieve this, we develop a suitable multi-stage key exchange security model along with pseudocode descriptions of the Telegram protocols that are based on analysis of Telegram’s specifications and client source code. We carefully document how our descriptions differ from reality and justify our modelling choices. Our security proofs reduce the security of the protocols to that of their cryptographic building blocks, but the subsequent analysis of those building blocks requires the introduction of a number of novel security assumptions, reflecting many design decisions made by Telegram that are suboptimal from the perspective of formal analysis. Along the way, we provide a proof of IND-CCA security for the variant of RSA-OEAP+ used in Telegram and identify a hypothetical attack exploiting current Telegram server behaviour (which is not captured in our protocol descriptions). Finally, we reflect on the broader lessons about protocol design that can be taken from our work.
Let me expand a bit on what “the Telegram key exchange” means, here. Telegram uses its bespoke MTProto protocol to secure its client-server communications. The cryptographic core of MTProto consists of a key exchange protocol and an encryption protocol. A few years back we had already analysed the encryption protocol.
Although that prior work focused on the encryption protocol, we also uncovered a vulnerability in Telegram’s key exchange protocol which Telegram fixed in response. We now completed a formal analysis of Telegram’s key exchange protocol and, in a sense, established that this fix works – but with many caveats.
Broadly, we establish that Telegram’s key exchange protocol provides some standard security guarantees. These guarantees, however, rely on several “non-standard” assumptions that appear to be necessary because of the brittle and ad-hoc nature of how Telegram’s protocol was designed.
Below, I reproduce a section from our paper which discusses this. I have edited it to make it somewhat work without the context of the entire paper. The reason why I pulled out this section for this blog post is because we are also trying to convince practitioners to design their protocols to be – at least – “analysis friendly” (ideally, they’d come with such an analysis directly). Friends don’t let friends deploy a cryptographic protocol without a formal cryptographic analysis.
In theory, the design of a cryptographic protocol has the sole purpose of achieving the protocol’s security goals efficiently. In actuality, however, to achieve this goal it must also achieve the goal of allowing at least a sufficiently motivated expert to convince themselves that the protocol achieves these goals. In other words, the central insight of what is commonly referred to as “modern cryptography” is that a cryptographic design is also tasked with being easy to reason about.
A fundamental paradigm of achieving this goal is modularity, where different components of the design can be reasoned about in isolation and then (generically) composed to establish overall security guarantees. This modularity is typically achieved by relying on building blocks that provide strong security guarantees on their own (as opposed to only and potentially in specific compositions) and by breaking the dependency between different components of a protocol by avoiding re-use of secret material.
Telegram’s failure to achieve this design goal is the root cause for the limitations and complexity of our proofs and our seeming need to reach for unstudied assumptions on cryptographic building blocks than would otherwise be necessary.
Below, we discuss these issues and highlight several of the main Telegram design choices and their effect on our proofs of security. We begin with mere complications, then move on to limitations and seemingly necessary ad-hoc assumptions. We finish by briefly recapping our hypothetical attack. We also discuss design choices that led to these issues and note that the same design choice often lead to several different difficulties for arguing for the security of Telegram, leading to necessary repetitions in what follows.
Proof complications
Several design choices made by Telegram introduced many otherwise avoidable complications in our proofs.
Lack of a suitable key schedule. Some value
– referred to as a nonce but also used as a key – is passed to a custom key derivation function (KDF), to a function computing a key confirmation hash h and is partially XORed with the server’s nonce to form the server’s salt. These three uses of
are across three different Send calls, rendering it impossible to replace values one-by-one with random values and appealing to some pseudorandom function (PRF) notion to justify the changes. If instead
had been used solely as an input to the KDF to produce pseudorandom values, with these values replacing the three uses of
, then a significantly simpler proof would have been obtainable.
Similarly, two values called ax and aid are both the result of a single SHA-1 call, which prevents the proof from manipulating them independently.
Use of a (truncated) weak hash function. Although more efficient and secure alternatives such as SHA-256 and SHA3 exist, Telegram also uses the now mostly deprecated SHA-1 algorithm. SHA-1 has been shown not to be collision resistant via practical attacks. The use of SHA-1 to compute the key confirmation hash h complicates our proof. If a collision-resistant hash function had been used, we could have relied on this property in the first step of the proof to establish public session matching.
Further, the output of the SHA-1 hash is truncated to only 64 bits. This prevents us from using a simple PRF notion due to easy attacks even in the one-time PRF setting.
Short session identifiers. The 64-bit value output of the above-mentioned truncated hash function is aid. This value is used by the Telegram servers to identify sessions. On the one hand, this imposes a hard bound of
on the number of sessions each responder can accept. On the other hand, the shortness of the value suggests that collisions between session state identifiers are likely, which complicates the proof. A longer value, even of 128 bits, would have allowed for a simpler proof.
Lack of ciphertext integrity. Telegram’s MTProto relies on a custom mode of operation composing IGE-mode and SHA-1. The composition achieves neither INT-CTXT nor IND-CCA. Had an established authenticated encryption scheme or an unforgable MAC been used, this would have simplified the proof in allowing us to declare the Diffie-Hellman shares authenticated and using the ciphertext/mac tag as part of our session identifiers. This in turn would have enabled public session matching based on transcripts.
Reliance on plaintext checking. Our proof relies on the correctness of a complex parsing behaviour and the checking of various plaintext headers and nonce values. That is, we also could not achieve modularity separating cryptographic operations and higher-level protocol operations.
In particular, to prove soundness we require that all message headers are different, so there cannot be confusion about which state the protocol is in and role confusion is also ruled out.
Limitations of our proof
The main limitation of our proof is that we do not model the actual connection between the initial run of the key exchange and subsequent runs of it (see paper for details). Moreover, our model does not allow for generic composition of our theorems about the key exchange and existing results about the encryption protocol. This is due to several design choices made by Telegram that prevent simple composition of the security proofs.
Key dependence. While being composed of multiple stages, the key exchange protocol does not derive the keys in the different stages independently. This prevents us from using general composition results on key exchanges and encryption protocols to argue about the security of the key exchange when used in conjunction with the encryption protocol.
Another example is the fact that the Diffie-Hellman value in a sub-protocol is used to internally derive ax and aid, and is used afterwards as an encryption key ak. Instead, if the DH value had been used as an input to a KDF to derive ax, aid and authkey as (computationally) independent keys, a composition result would be more feasible to achieve.
Public key reuse. We do not model the fact that the public key pk of the server is used in several sub-protocols. To model this, a proof would have to consistently update it across two different games simultaneously. Using different independent keys would have allowed us to treat the two protocols separately without essentially assuming the co-dependence away.
Lack of key confirmation. We were unable to prove key confirmation for one sub-protocol and only proved key confirmation for the server for the full protocol. Key confirmation would have been possible if h was produced using a secure MAC.
Direct use of non-uniform key material. MTProto uses bits of the agreed DH values directly as key material instead of using them as an input to a key derivation function. However, the existing proof for MTProto assumes a uniform key distribution. This prevents us from composing our results with those prior results. Moreover, this forces us to use a session key distribution for some stage which is not the uniform distribution on strings of a given size.
Retry handling. In general, it is difficult to reason about the security of a protocol without knowing the total number of exchanged messages. For example, the security bound for INT-PTXT depends on the number of encryption and decryption queries, which in turn depends on the number of retries. Two aspects of the protocol design prevent us from making an argument that the number of retries would be bounded in practice. First, there is a question of preventing adversarially-triggered retries: this would necessitate showing that some custom hash function outputs are unforgeable, which is not possible due to its short input length. Second, even if the adversary was not able to directly manipulate the flow of the protocol, it remains in control of creating new sessions, which in turn influences the size of each server’s set of known sessions that determines the likelihood of an honest retry. Thus, we were forced to assume a maximum retry number.
Reliance on unstudied assumptions
In our paper, we describe several unstudied ad-hoc and new assumptions that we used in our proofs. These assumptions could have been avoided if collision-resistant hash functions (e.g. SHA-256 or SHA3) had been used instead of SHA-1 and if proper key derivation functions had been used.
We can view these assumptions as part of two groups, based on their plausibility and impliciations if they were invalidated. The first two (4PRF, 3TPRF) are lower-level, expressing a pseudorandomness property of SHACAL-1 (the block cipher inside SHA-1): they appear plausible due to the large key length of SHACAL-1, but symmetric cryptanalysis would be needed to determine the concrete reduction in advantage compared to the known results on SHACAL-1 without leakage.
The remaining three (SPR, UPCR, IND-KEY) are higher-level, expressing properties of SHA-1 that are variants of standard assumptions or more novel. However, it appears that breaking either of these would not be sufficient to break the key exchange protocol; there exist versions of these assumptions which if broken would be sufficient to break the protocol, but they place even stricter constraints on the adversary.
A hypothetical attack
Weak channel binding. We also describe an attack on client authentication that is based on the way that a new temporary key is bound to the long-term authentication key ak. The attack exploits the fact that the Telegram server used to not verify the expiration time sent in the binding message. Although Telegram has addressed this specific issue by enforcing the check, the design choice to rely on such checks for session binding is brittle, and its security depends on nuanced details related to the way session key management and expiration are implemented. Instead, more robust cryptographic approaches can be used to bind between the sessions that generate the new temporary key and ak. For example, one approach is to calculate a MAC over the transcript of the current session’s handshake using a key derived from ak as the MAC key.