Open access peer-reviewed compact

Partition-Based Trapdoor Ciphers

By Arnaud Bannier and Eric Filiol

Submitted: December 5th 2016Reviewed: July 17th 2017Published: September 7th 2017

DOI: 10.5772/intechopen.70420

Downloaded: 1485


Trapdoors are a two-face key concept in modern cryptography. They are primarily related to the concept of trapdoor function used in asymmetric cryptography. A trapdoor function is a one-to-one mapping that is easy to compute, but for which its inverse function is difficult to compute without special information, called the trapdoor. It is a necessary condition to get reversibility between the sender and the receiver for encryption or between the signer and the verifier for digital signature. The trapdoor mechanism is always fully public and detailed. The second concept of trapdoor relates to the more subtle and perverse concept of mathematical backdoor, which is a key issue in symmetric cryptography. In this case, the aim is to insert hidden mathematical weaknesses, which enable one who knows them to break the cipher. Therefore, the existence of a backdoor is a strongly undesirable property. This book deals with this second concept and is focused on block ciphers or, more specifically, on substitution-permutation networks (SPN). Inserting a backdoor in an encryption algorithm gives an effective cryptanalysis of the cipher to the designer.


  • cryptography
  • block ciphers
  • backdoor
  • trapdoor
  • substitution-permutation network
  • cryptanalysis


1. Introduction

Despite the fact that in the late 90s/early 2000s, citizens have partially obtained the freedom for using cryptography, the recent years have shown that more than ever, governments and intelligence agencies still try to control and bypass the cryptographic means used for the protection of data and of private life. Snowden leaks have been a first upheaval. A tremendous number of secret projects conducted by NSA and GCHQ have been revealed to the public opinion. They have shed a new light on the permanent attempt to control the use of cryptography by a growing number of governments.

The recurring approaches and attempts consist in making the implementation of backdoors mandatory. The simplest and naive approach consists in enforcing key escrowing at the operators’ level. But point-to-point encryption solutions like telegram, signal or proton mail enable to prevent it. A number of different backdoor techniques are regularly mentioned or proposed.

The most critical aspect in embedding backdoors lies on the fact that hackers or analysts may find them more or less easily and worse may exploit them. This is the reason why operators or developers are very reluctant to accept backdoors until now. In case of leak, they inevitably lose users’ confidence and favor the development of trusted services abroad. In fact, the backdoor issue arises due to the fact that only implementation backdoors (at the protocol/implementation/management level) are generally considered.

In this book, we address the most critical issue of backdoors: mathematical or by-design backdoors. In other words, the backdoor is put directly in the mathematical design of the encryption algorithm. While the algorithm is totally public, proving that there is a backdoor, identifying it and exploiting it, is generally an intractable problem, unless you know the backdoor [1]. To some extent, the RSA’sDual_EC_DRBGstandard case falls within this category [2]. Other nonpublic examples are known within the military cryptanalysis community and partially revealed to the public, thanks to the 1995 Hans Buehler case [3]. This kind of backdoor is the most difficult one to address and there is quite no public work on that topic. It is generally the technical realm of a few among the most eminent intelligence agencies, namely NSA and GCHQ, which moreover have the ability and power to step in and to influence the international standardization processes. Our objective is to explain that it is probably possible to design and put such backdoors. In this book, we consider a particular case among many other possibilities of trapdoors.

This book is organized as follows. In the next section, we explore the concept of backdoors and trapdoors and we identify two main categories. We also present the state-of-the-art, history and previous work regarding backdoors, mostly in symmetric cryptography. The rest of this book focuses on substitution-permutation networks (or SPN for short) which are a special class of block encryption systems, mapping a partition of the plaintexts to a partition of the ciphertexts, independently of the round keys used.

Chapter 2 explores the concept of linear partitions and their relationships with substitution-permutation networks. We show in Section 2 that in our case, the study of the full cipher can be restricted to the substitution layer without loss of generality. Then in Section 3, we explore this latter primitive and show that the problem can be restricted further to the study of a single S-box.

In Chapter 3, we discuss how to design a suitable S-box which preserves a linear partition and, at the same time, which resists linear and differential cryptanalysis. From those theoretical results, we have designed a full AES-like encryption system, called BEA-1, presented in Chapter 4. Section 1 gives the full specifications of this cipher. Then Section 2 deals with the design of its backdoor. In Section 3, we sketch the basic ideas underlying the BEA-1 cryptanalysis while in Section 4, we present our cryptanalysis of BEA-1 under the assumption we have the full knowledge of the backdoor.

Chapter 5 concludes this book and explore new ideas and trends in encryption backdoors. The full description of cryptographic primitives used in BEA-1 is given in Appendix.

2. The concept of backdoor

2.1. Definition and classification proposal

Trapdoors are a two-face key concept in modern cryptography. They are primarily related to the concept of trapdoor functionused in asymmetric cryptography. A trapdoor function is a one-to-one mapping that is easy to compute, but for which its inverse function is difficult to compute without special information, called the trapdoor. It is a necessary condition to get reversibility between the sender and the receiver for encryption or between the signer and the verifier for digital signature. The trapdoor mechanism is always fully public and detailed. The security and the core principle are based on the existence of a secret information, the private key, which is essentially part of the trapdoor. In other words, the private key can be seen as thetrapdoor.

The second concept of trapdoor relates to the more subtle and perverse concept of mathematical backdoor, which is a key issue in symmetric cryptography. In this case, the aim is to insert hidden mathematical weaknesses which enable one who knows them to break the cipher. Nonetheless, mathematical backdoors may be extended to asymmetric cryptography, see for example the case of theDUAL EC_DRBG[2], or the case of trapdoor primes addresses recently in [4]. Therefore, the existence of a backdoor is a strongly undesirable property.

In the rest of this section, we will oppose the term of trapdoor, the desirable property, to that of backdoor, the undesirable one. While the term of trapdoor has been already used in the very few literature covering the second face of this problem, we suggest however to use the term of backdoor to describe the issue of hidden mathematical weaknesses. This would avoid ambiguity and maybe would favor the research work around a topic which is nowadays mostly addressed by governmental entities in the context of cryptography control and regulations.

Inserting backdoors in encryption algorithms underlies quite systematically the choice of cryptographic standards (DES, AES…). The reason is that the testing, validation and selection processes are always conducted by governmental entities (NIST or equivalent) with the technical support of secret entities (NSA or equivalent). So an interesting and critical research area is: “how easy and feasible is it to design and to insert backdoors in encryption algorithms?”. In this book, we intend to address one very particular case of this question. It is important to keep in mind that a backdoor may be itself defined in the following two ways.

  • As a “natural weakness” known, but none disclosed, only by the tester, validator or final decision-maker. The best historic example is that of the differential cryptanalysis. Following Biham and Shamir’s seminal work in 1991 [5], NSA acknowledged that it was aware of that cryptanalysis years ago [6]. Most of experts estimate that it was nearly 20 years ahead. However a number of non public, commercial block ciphers in the early 90s might have been be weak with respect to differential cryptanalysis.

  • As an intended design weakness put by the author of the algorithm. To the authors knowledge, there is no known case for public algorithms yet.

As far as symmetric cryptography is concerned, there are two major families of cipher systems for which the issue of backdoor must be considered differently.

  • Stream ciphers. Their design complexity is rather low since they mostly rely on algebraic primitives: LFSRs and Boolean functions which have intensely been studied in the open literature Until the late 70s, backdoors relied on the fact that quite all algorithms were proprietary and hence secret. It was then easy to hide nonprimitive polynomials, weak-combining Boolean functions… The Hans Buehler case in 1995 [3] shed light on that particular case.

  • Block ciphers. This class of encryption algorithms is rather recent (end of the 70s for the public part). They exhibit so a huge combinatorial complexity that it is reasonable to think to backdoors. As described in [7] for a κ-bit secret key and an m-bit input/output block cipher there are ((2m)!)2κpossible such block ciphers. For such an algorithm, the number of possible internal states is so huge that we are condemned to have only a local view of the system, that is, the round function or the basic cryptographic primitives. We cannot be sure that there is no degeneration effect at a higher level. This point has been addressed in [7] when considering linear cryptanalysis. Therefore, it seems reasonable to think that this combinatorial richness of block ciphers may be used to hide backdoors.

Since block ciphers are now the most widely used encryption algorithms by the general public and the industry, we will focus on them in the rest of this book. Backdoors in stream ciphers have quite never been exposed to the public.

2.2. Previous work

Regarding the previous work, we can consider two aspects. The first one relates to authors who have considered structures on the input and output spaces of round functions to build key distinguishing or key recovery attacks. In this case, it is possible to suppose that those structures are “natural” structures. The second case is directly linked to the topic covered in this book. It relates to the design of backdoors based on such structures. Exploiting these hidden structures then leads to a tractable cryptanalysis. In this respect, we can see those structures as “intended” and no longer “natural”.

2.2.1. Attacks using space structures

Among the very first previous works that have considered structures in the plaintext and ciphertext spaces is the contribution of Evertse [8]. This paper introduced the linear structures for block ciphers, which map a subspace of F2m×F2κ(the product of the plaintext and ciphertext spaces) onto a subspace of F2m(the ciphertext space). Then, the author showed that if such a linear structure exists, then known-plaintext and chosen-plaintext attacks faster than exhaustive search are possible.

Later, Leander et al. [9] developed a new cryptanalysis, called invariant subspace attack, breaking the PRINTCipher [10] for a significant fraction of its keys. The general idea of this attack can be outlined as follows. Let Fdenote the SP-layer of a substitution-permutation network, that is, the round function without the key addition. Then, assume that Fmaps a coset of a given subspace Vto another coset of V. In other words, there exist aand bsuch that F(a+V)=b+V. Here, the addition is made in F2nand hence corresponds with the XOR operation. The round function associated with the round key kis then defined by Fk:xF(x+k). If the round key kbelongs to the coset a+ b+ V, then it holds that


hence the name of invariant subspace. Therefore, if every round key lies in this particular coset, the affine subspace b+ Vis preserved by the full encryption process. Such a property enables a very efficient distinguisher. As additional results, they also showed that the invariant subspace attack

  • implies a truncated differential attack to be possible (the probability of the truncated differential characteristic is however highly key-dependent);

  • implies the existence of strongly biased linear approximations for weak keys (independently of the number of rounds).

This attack has been generalized in 2015 by Leander et al. [11]. They proposed a generic algorithm that is able to detect invariant subspaces. Indeed, their initial invariant subspaces on PRINTCipher were found empirically.

Following the idea of the invariant subspace attack, Grassi et al. [12] introduced the subspace trail cryptanalysis. Given r+ 1 subspaces V[0],,V[r], it is assumed that the image of any coset of V[i] under the SP-network is included in a coset of V[i+1]. That is to say, for each a[i], there exists a[i+1] such the following inclusion holds


In this case, it is easy to see the all round functions Fkinherit such a property. The family of subspaces (V[i])iris said to be a subspace trail. Naturally, the dimension of V[i] must be lower than or equal to the dimension of V[i+1]. In contrast to the invariant subspace attack, Grassi et al. relaxed the assumption that the coset has to be invariant. Here, the considered subset becomes the coset of possibly different increasingly dimensional subspaces throughout the encryption. However, the authors also required this property to hold for each coset of V[0] instead of one. Therefore, this cryptanalysis is not a generalization but a variation of the invariant subspace attack. As will become clear in Section 2 of Chapter 2, the family of backdoors covered in this book is closely related to constant-dimensional subspace trails.

Let us mention that in [13], the authors introduced nonlinear invariant subspaces by considering a general Boolean function gsuch that g(F(x))g(x)is constant. Finally, Table 1.1 summarized the structures considered by the attacks presented in this section and compared it with our work.

WorkStructureKey dependence
Evertse [8]Linear structure (if any)Key independent
Leander et al. [9, 11]Exact cosetRound key dependent
Grassi et al. [12]Coset independentRound key independent
Our approachCoset independentRound key independent

Table 1.1.

Comparison of existing work with respect to input and output space structures.

2.2.2. Backdoor design and structures

One of the first trapdoor ciphers was created in 1997 by Rijmen and Preneel [14]. Their S-boxes are constructed to have one high correlation between the zero mapping and a sum of certain output bits. The knowledge of this correlation yields a high potential linear trail which is used to recover a part of the key with linear cryptanalysis. Such a weakness is generally pointed out by the first line of the S-boxes’ correlation matrices. Yet, if the output size of the S-boxes is large enough, their computation is too expensive. Relying on this fact, the authors claimed that their trapdoor is undetectable, even if one knows its global design. Nevertheless, Wu et al. [15] disproved this by discovering a way to recover the trapdoor. It is worthwhile to mention that in practice, if a real cipher containing a trapdoor is given, the presence of the trapdoor will certainly not be revealed.

More recently in [16], the authors created non-surjective S-boxes embedding a parity check to create a trapdoor cipher. The message space is thus divided into cosets and leads to create an attack on this DES-like cipher in less than 223 operations. The security of the whole algorithm, particularly against linear and differential cryptanalysis is not given and the authors admit that their attack is dependent on the first and last permutation of the cipher. Finally, the non-surjective S-boxes may lead to detect easily the trapdoor by simply calculating the image of each input vector. This problem is naturally avoided in a substitution-permutation network in which S-boxes are bijective by definition.

Our approach is mainly a generalization of the ideas presented by Paterson in [17]. In this article, a DES-like trapdoor cipher exploiting a weakness induced by the round functions is presented. The group generated by the round functions acts imprimitively on the message space. In other words, the round function preserves a partition of the message space no matter the round key used, and hence, the same applies to the full cipher. This partition forms the trapdoor. Paterson then introduced a trapdoor cipher composed of 32 rounds and using an 80-bit key. The trapdoor enables recovery of the key using 241 operations and 232 chosen plaintexts. Even if the mathematical material to build the trapdoor is given, no general algorithm details the S-boxes’ construction. Furthermore, as the author says, S-boxes using these principles are incomplete: half of the ciphertext bits are independent of half of the plaintext bits. Finally, the security against a differential attack is said to be not as high as one might expect. Moreover, the author wondered whether the partition of the message space had to be linear, that is to say, made up with every coset of a linear subspace. Caranti et al. [18] provided a first answer to Paterson’s question, by proving that if the group generated by the round functions is imprimitive, then the partition of the message space must be linear. In his thesis [19], Harpes considered trapdoor ciphers mapping a partition of the plaintexts to a partition of the ciphertexts. As these partitions are not necessarily equal, this family generalizes Paterson’s one. Harpes suggested using this trapdoor with its partitioning cryptanalysis.


Partition-Based Trapdoor Cipher

This chapter intends to study Substitution-Permutation Networks mapping a partition of the plaintexts to a partition of the ciphertexts, independently of the round keys used. All the results of this and the following chapters comes from [20].

1. Linear partitions

Let us begin with some notations and conventions.

Notation 2.1. Let mand ndenote positive integers. For two maps fand g, the composition gf(or simply gf) denotes the evaluation of ffollowed by g. For any set E, let #Edenotes its cardinality. If Fis a subset of E, Fcdenotes its complement.

Let us denote the Galois field of order two by F2and 0n=(0,,0)the zero vector of F2n. All the vector spaces considered in this chapter are over the finite field F2. It is worthwhile to mention that (F2n)mwill be often identified with F2nm. The concatenation of two vectors xand yis denoted by (x|| y).

An n-bit S-box is any permutation of F2n. If xand yare two elements of F2n, then x,y=i=0n1xiyi. If L:F2nF2mis a linear map, define L:F2mF2nby L(x),y=x,L(y)for every (x,y)F2n×F2m. In other words, Lis the transpose of Lfor the bilinear form ,.

Finally, we will denote the elements of F2nusing the hexadecimal notation. For instance, the element (1,0,1,1,1)of F25is denoted by 17.

Since we are concerned with ciphers that associate a partition of the ciphertext space to another partition of the plaintext space, let us introduce the following definition.

Definition 2.2. Let fbe a permutation of Eand A, Bbe two partitions of E. Let f(A)denote the set {f(A)|AA}. We say that fmaps Ato Bif f(A)=B. If A=B, we says that f preservesthe partition A.

The two partitions {{x}|xE}and {E} are called the trivial partitionsof E. Observe that, for any permutation fof E,


That is, every permutation preserves the two trivial partitions. Moreover it should be highlighted that if fmaps Ato Band if Ais nontrivial, then so is B.

Example 2.3. Let Edenote the set 0,8and consider the two partitions A, Bof Edefined by A={{0,1,4},{2,6},{3,7},{5}}and B={{0,2,7},{1},{3,5},{4,6}}. Let fbe the permutation of Edefined by


By definition,


The equality f(A)=Bholds, and thus fmaps the partition Ato B.▴

Lemma 2.4. Let fbe a permutation of Eand A, Bbe two partitions of E. If for any part Aof A, f(A) is a part of B, then fmaps Ato B.

In this chapter, we will consider a special kind of partitions that is composed of all the cosets of a linear subspace. Such partitions have already been introduced by [19, Definition 4.4] and are recalled below.

Definition 2.5 (linear partition). Let Abe a partition of F2n. Let Vdenote its part containing 0n. The partition Ais said to be linearif Vis a subspace of F2nand if every part of Ais a coset of Vin F2n, in other words, if


We denote L(V)such a partition.

Remark 2.6. It turns out that the linear partitions associated with the two trivial subspaces of F2n, that is {0n} and F2n, correspond with the two trivial partitions of F2n. Moreover, if Vis a nontrivial subspace of F2n, then the linear partition L(V)is also nontrivial.

Example 2.7. Consider the subspaces Vand Wof F25defined by


Since both Vand Ware two-dimensional subspaces of F25, the quotient spaces L(V)=F25/Vand L(W)=F25/Ware three-dimensional. In other words, the two linear partitions L(V)and L(W)have 23 = 8 parts. It can be verified that


For instance, the part0B+ Vof the linear partition L(V)is the coset of Vwith respect to0B. Explicitly, it is equal to


Now, consider the permutation fof F25given in Figure 2.1. The image of0B+ Vunder fis


Figure 2.1.

The permutationfof Example 2.7.

Observe that f(0B+V)is a coset of Wso a part of L(W). The images of all cosets of Vunder fare displayed in Figure 2.2. Since any of them is a part of L(W), the permutation fmaps L(V)to L(W). It is worthwhile to observe that a permutation mapping a linear partition to another one does not need to be itself linear or even affine. Indeed, fis certainly not linear as f(00)=1E00. By contradiction, suppose that fis an affine transformation. Then, there exist a linear mapping L:F25F25and an element cof F25such that f(x)=L(x)+cholds for all xin F25. Therefore,


Figure 2.2.

The permutationfmappingL(V)toL(W)whereV=span(07,1A)andW=span(0E,12).

for all x, yand zin F25. Observe that


Thus, fis not an affine transformation.▴

Lemma 2.8. Let V, Wbe two subspaces of F2nand fbe a permutation of F2n, which maps L(V)to L(W). For any xin F2n, fmaps x+ Vto f(x) + W.

Example 2.9. In Example 2.7, we have seen that f(0B+V)=03+W. Since fmaps L(V)to L(W), the previous lemma states that f(0B+V)=f(0B)+W=0D+W. There is however no contradiction here because0Dbelongs to03+ W. Consequently, the cosets03+ Wand0D+ Ware equal.▴

The following two propositions are interesting properties of linear partitions, which will be used in the rest of this chapter.

Proposition 2.10. Let V1,V2,W1,W2be four subspaces of F2nand fbe a permutation of F2n, which maps L(V1)to L(W1)and L(V2)to L(W2). Then fmaps L(V1V2)to L(W1W2).

Proposition 2.11. Let V, Wbe two subspaces of F2nand fbe a permutation of F2n, which maps L(V)to L(W). There exists an automorphism Lof F2nsuch that L(V)=W. In particular, Vand Ware isomorphic.

Example 2.12. Consider again the permutation fof F25defined in Figure 2.8. As seen in the previous example, the permutation maps the linear partition L(V)to L(W). Then, Proposition 2.11 ensures that there exists a linear permutation Lof F25such that L(V) = W. Consider the bases (07,1A) and (0E,12) of Vand Wrespectively and complete them into the following bases of F25


Then, the mapping Lcan be defined by L(vi) = wifor each i< 5. This linear transformation will be used in the next chapter.▴

2. Substitution-permutation networks and partitions

This section aims at studying an SPN, which maps a partition of the plaintexts to a partition of the ciphertexts. When the cipher key Kis fixed, the encryption function EKis just a permutation of the message space. Therefore, any partition Aof the plaintexts is mapped to the partition EK(A)of the ciphertexts. Nonetheless, to exploit the trapdoor, the designer needs to know the pair of partitions (A,EK(A)). The problem is that the output partition EK(A)depends a priorion the cipher key K, which is unknown to the attacker. The simplest way to solve this problem is to require the partition EK(A)to be independent of the cipher key K. In other words, we want all the partitions EK(A)to be equal to a fixed partition B.

As with differential and linear cryptanalysis, taking account of the exact effect of the key schedule seems to be a challenging problem. Therefore, the key schedule will deliberately be omitted throughout this chapter. This amounts to consider an SPN mapping a partition Ato a fixed partition B, independently of the round keys used.

2.1. The key addition and diffusion layer

Substitution-permutation networks belong to the class of iterated block ciphers. As every iterated block cipher, the encryption function consists in applying a simple keyed operation called round functionseveral times. A different round keyis used for each iteration of the round function. In practice, these rounds keys are extracted from a master key using an algorithm called key schedule. In an SPN, the round function is made up of three distinct stages: a key addition, a substitution layerand a permutationor diffusion layer. The substitution layer consists of the parallel evaluation of several S-boxes and is the only part of the cipher, which is not linear or affine. Then, the diffusion layer is the evaluation of some linear mappings (generally one).

Before tackling the full cipher, we look at its basic operations and primitives. The attacker knows the specifications of the substitution and diffusion layers, but he does not know the round key used in the key addition. Therefore, the key addition should not be considered as one operation but rather as a family of permutations. To get back to the subject at hand, we must first determine the partitions A, which are mapped to a unique partition under the action of all round keys.

The next proposition explains the fundamental property of linear partitions according to the key addition. This result was introduced by Harpes in [19]. Later, Caranti et al. gave a similar result expressed for imprimitive groups in [18]. For convenience, we restate this result with our own notations.

Proposition 2.13. Let nbe a positive integer. Let Aand Bbe two partitions of F2n. For each kin F2n, let αkdenote the permutation of F2ndefined by αk(x)=x+k. Then, the permutation αkmaps Ato Bfor any kin F2nif and only if A=Band Ais a linear partition.

Even if this result was easily obtained, it has maybe the most important impact on our study. Due to this result and its generalization given later in the next section, only linear partitions will be considered. By definition, the linear partitions are quotient spaces and hence highly structured algebraic objects. Consequently, the apparent combinatorial aspect of our study is reduced to an algebraic problem. This result is indeed quite restrictive since the linear partitions account for a small proportion of all partitions.

Example 2.14. Let nand kbe nonnegative integers and qbe a prime power. The q-binomial (or Gaussian) coefficient is defined by


It can be proved that this coefficient counts the number of d-dimensional subspaces of an n-dimensional vector space over the finite field Fq. Therefore, the number of subspaces of F23is given by


Since a linear partition of F23is uniquely determined by a subspace of F23, there are exactly 16 linear partitions. All these partitions are represented graphically at the top of Figure 2.3. For instance, the linear partition associated with the subspace span(2,4)={0,2,4,6}is L(span(2,4))={{0,2,4,6},{1,3,5,7}}.

Figure 2.3.

Every linear partitions and key addition inF23.

Proposition 2.13 states that among the set of all the partitions of F2n, only the linear ones yield a unique output partition for every key. The Bell number Bmcounts the number of partitions of a set of size m. Thus, the number of partitions of F2nis B2n. For n =3, there are B8 = 4140 partitions in all. Hence, the linear partitions represent a fraction of 16/B8 ≈ 2−8.0. This ratio falls greatly as nincreases. In fact, for n= 4, only 67/B16 ≈ 2−27.2 are linear and for n= 5, this ratio becomes 374/B32 ≈ 2−78.2. This underlines how Proposition 2.13 is restrictive.

All the key additions are given at the bottom of Figure 2.3. The reverse implication of Proposition 2.13 states that any linear partition is preserved by all the key additions. For instance,

α2(L(span(6))={f({0,6}),f({1,7}),f({2,4}),f({3,5})}={{2,4}, {3,5}, {0,6},{1,7}}=L(span(6)).E16

Thus, the permutation α2 preserves L(span(6)). Figure 2.4 illustrates graphically that this linear partition is preserved by all the key additions. It is then not hard to check that the same holds for every linear partition given in Figure 2.3.

Figure 2.4.

The key additions preserving the partitionL(span(6)).

Now that we know linear partitions are of major importance, we focus on how the diffusion layer deals with these partitions.

Proposition 2.15. Let nbe a positive integer. Let Lbe an automorphism of F2nand Va subspace of F2n. Then, L(L(V))=L(L(V)). In particular, Lmaps a linear partition to another one.

Proof. Since Lis an automorphism, we have


Moreover, L(V) is a subspace of F2nbecause Lis a linear mapping. Consequently, L(L(V))=L(L(V)). ▪

If Vand Ware two subspaces of F2n, it is straightforward to design a linear permutation Lof F2nmapping L(V)to L(W). Indeed, Proposition 2.15 establishes that Lmaps L(V)to L(W)is and only if L(V)=W. In other words, we only need to consider the image of Vand not the whole linear partition L(V).

2.2. From the encryption function to the substitution layer

Along with the two results of the previous section, we can now address our main issue. For the rest of this chapter, we consider a generic SPN whose parameters are defined as follows.

Definition 2.16 (SPN). Let m, nand rbe positive integers. A substitution-permutation networkis an iterated block cipher whose encryption function is defined as follows. Let S0,…,Sm−1 be n-bit S-boxes.

  • The additionof the round key kis denoted by αk:F2nmF2nm, xx+k.

  • The substitution layeris denoted by σand maps (xi)0i<mto (Si(xi))0i<m.

  • The diffusion layeris a linear permutation denoted by π:F2nmF2nm.

The round function Fkassociated with the round key kis defined by Fk=πσαk. The encryption functionassociated with the round keys K=(k[0],,k[r])in (F2nm)r+1is defined by


We can now prove the following result.

Theorem 2.17. Let Aand Bbe two partitions of F2nm. Suppose for any (r+ 1)-tuples of round keys K=(k[0],,k[r])in (F2nm)r+1that the encryption function EKmaps Ato B. Define A[0]=Aand for all 1ir, A[i]=(πσ)i(A). Then,

  • A[r]=B;

  • for any 0 ≤ i< rand for any k[i] in F2nm, Fk[i](A[i])=A[i+1];

  • for any 0ir, A[i]is a linear partition.

Proof. Observe that for the round key k= 0nm, the key addition α0nmis the identity mapping on F2nm, and thus F0nm=πσα0nm=πσ. Now, choosing K=(k[0],,k[r])=(0nm,,0nm)gives


Let 0i<rbe an integer. Let k[i] be any element of F2nm. Define k[j]=0nmfor all 0jrsuch that ji. By hypothesis, the equality αk[r]Fk[r1]Fk[0](A[0])=A[r]holds. Thus,


On one hand,


On the other hand,


Therefore, Fk[i](A[i])=A[i+1], or equivalently αk[i](A[i])=(πσ)1(A[i+1]). Since this equality holds for every k[i], Proposition 2.13 states that the partition A[i]is linear.

It remains to show that A[r]is linear as the previous argument holds only for i< r. Let k[r] be an element of F2nm. Define k[i] = 0nmfor each 0i<r. Then,


Again, Proposition 2.13 implies that A[r]is linear and the result is proven. ▪

This theorem can be restated in the following way. First, the input partition Aand the output partition Bmust be linear. This result generalizes Proposition 2.13 in the sense that it applies to the full cipher and not only to the key addition. As was pointed out earlier, linear partitions are very specific partitions. This means that our combinatorial hypothesis implies to consider only algebraic objects.

Second, we have only supposed that the encryption function maps Ato Bafter rrounds. Nevertheless, Theorem 2.17 ensures that each iteration of the round function also maps a fixed linear partition to another one. As a consequence, the study of the full cipher is reduced to the study of the round function. Additionally, this result can be strengthened as follows.

Corollary 2.18. Keep the notations of Theorem 2.17. For all 0ir, let V[i] denote the part of A[i]containing 0. According to Theorem 2.17, A[i]=L(V[i]). Let 0i<rbe an integer. Then,


where W[i] denotes the subspace π1(V[i+1]). In particular, the substitution layer must at least map one linear partition to another one.

Proof. By definition, πσ(A[i])=A[i+1]or, equivalently, σ(A[i])=π1(A[i+1]). This equality can be restated as


As πis an automorphism of F2nm, then so π−1 is. Next, Proposition 2.15 ensures that π1(L(V[i+1]))=L(π1(V[i+1])). The result follows. ▪

A diagrammatic representation of Theorem 2.17 and Corollary 2.18 is given in Figure 2.5. This highlights that the input partition is always transformed in the same way through each basic operation of the encryption process. The results obtained so far can be summarized as follows: if an SPN maps a partition Aof the plaintext space to a partition Bof the ciphertext space no matter the round keys used, then the substitution layer has to map at least one linear partition to another one. This shows that our study can be reduced to the substitution layer without loss of generality.

Figure 2.5.

Results of Section 2.2.

3. Structure of the substitution layer

In the remainder of this chapter, Vand Wwill denote two subspaces of (F2n)m.

As explained in the previous section, it remains to understand how the substitution layer can map the linear partition L(V)to L(W). This problem is far more complex for the substitution layer than it was for the diffusion layer. The reasons for this are twofold. First, the substitution layer is nonlinear. It is even the only part of the SPN, which is not affine. As a consequence, to map the linear partition L(V)to L(W), we have to consider all the parts of both partitions and not only the subspaces Vand W, as was the case for the diffusion layer (see Proposition 2.15).

Second, the substitution layer should not be considered as a whole, but as the parallel application of its S-boxes. Therefore our problem becomes the following. Given two subspaces Vand W, what are the necessary and/or sufficient conditions on the S-boxes for the substitution layer to map L(V)to L(W).

Before going any further, let us introduce an example that we will continue throughout this section.

Example 2.19. Consider the substitution layer made up of the four 5-bit S-boxes S0, S1, S2 and S3 described in Figure 2.6. Its parameters are then m= 4 and n= 5. Observe that the S-box S2 was previously studied in Example 2.7. Define the two families EV=(vi)0i<7and EW=(wi)0i<7of elements of (F25)4by


Figure 2.6.

Specification of the S-boxes used throughout Section 3.

Finally, define Vand Was the subspaces spanned by EVand EW, respectively. Note that the family EVis linearly independent because it is echelonized. Hence, EVis a basis of V. The same applies for EWand W. As a consequence, Vand Ware both seven-dimensional subspaces of (F25)4.

We claim that the substitution layer σmaps L(V)to L(W). Naturally, we will not verify this statement by hand because it requires to check for each of the 213 cosets of Vthat the 27 images of its elements under σlies in the same coset of W. However, the reader who is relectant to accept this claim is encouraged to check it with a computer.▴

3.1. Truncating the substitution layer

To understand how the substitution layer can map L(V)to L(W), we will adopt a divide and conquerstrategy. That is to say, we want to break down this problem into several independent sub-problems, each involving less S-boxes than the full substitution layer. The first idea is to truncate the substitution layer and the subspaces Vand Wto get a local view of what happens on some S-boxes.

Definition 2.20 (truncation and substitution layer). Let Ebe any non-empty subset of 0,mand define the following mappings


If Ehas cardinality p, then we identify (F2n)Ewith (F2n)p.

The mapping TEallows to shorten a vector of (F2n)mto keep only the coordinates whose indices belong to E. The application σEis a substitution layer truncated to the S-boxes whose indices lie in E.

Remark 2.21. Note that TEis a linear mapping. Observe that σ0,mis the substitution layer of the SPN. Moreover, the truncated substitution layer σ{i}and the S-box Siare equal for all 0i<m.

Proposition 2.22 (truncating to a few S-boxes). Suppose that σmaps L(V)to L(W). Let Ebe a nonempty subset of 0,m. Then, the permutation σEmaps L(TE(V))to L(TE(W)).

Proof. Let x=(xi)iEbe an element of (F2n)E. Let ybe the element of (F2n)mdefined by yi= xiif ibelongs to Eand yi=0notherwise. Thus, TE(y)=x. By hypothesis, σmaps L(V)to L(W). Hence, Lemma 2.8 implies that σ(y+V)=σ(y)+W. Next,


since TEis a linear mapping. Furthermore,


Therefore, σE(x+TE(V))=TE(σ(y))+TE(W). In other words, the image of any part of L(TE(V))under σElies in L(TE(W)). The result is a consequence of Lemma 2.4. ▪

Example 2.23. By choosing E={0,3}, the previous proposition ensures that the truncated substitution layer σ{0,3}maps L(T{0,3}(V))to L(T{0,3}(W)). First, it is easy to see that


Again, we will not explicitly check that σ{0,3}maps L(T{0,3}(V))to L(T{0,3}(W))but limit ourselves to prove that the coset (07,03)+T{0,3}(V)is mapped to one coset of T{0,3}(W). Its image can be found using Lemma 2.8 as follow


The images of every element of this coset are given in Figure 2.7. For instance,


Figure 2.7.

σ{0,3} mapping a coset of T{0,3}(V) to a coset of T{0,3}(W).

This explains the second image.▴

Choosing E= {i} in Proposition 2.22 gives that the S-box Simaps L(T{i}(V))to L(T{i}(W)). As this result holds for each index iin 0,m, we deduce that


However, the equivalence does not hold in general. Hence, this only gives a necessary condition on each S-box. In other words, this means that we can lose information when considering each S-box independently. The next example stresses this fact.

Example 2.24. In our example, the truncated subspaces T{i}(V) and T{i}(W) are the following:


First, observe that the truncated subspaces for S0 and S1 are trivial. Hence, the associated linear partitions are also trivial and no information on S0 or S1 can be drawn from 2.1. Yet, the last two truncated subspaces are nontrivial and 1 gives the following equalities:


The first property has already been highlighted in Example 2.7 and in Figure 2.2. The second one is represented in Figure 2.8.

Figure 2.8.

The S-boxS3 mappingL(V)toL(W)whereV=span(0B,17)andW=span(08,15).

Let us now show that the converse of Implication 2.1 does not hold in general. Consider the substitution layer σ′ made up of the four S-boxes S0, S1, S2and S3where


Thus, this new substitution layer differs from σby only one S-box. Recall that the linear partition associated with T{0}(V)=T{0}(W)is trivial. Therefore, S0necessarily preserves this partition. As the other S-boxes remain the same, the right side of 2.1 still holds for σ′, that is


However, we will prove that σ′ does not map L(V)to L(W). Suppose by contradiction that it does. Then Proposition 2.22 ensures that σ{0,3}maps L(T{0,3}(V))to L(T{0,3}(W)). By Lemma 2.8,




This is a contradiction since (0B,1D) does not belong to T{0,3}(W)as can be seen in Figure 2.7. As a consequence, the substitution layer σ′ does not map L(V)to L(W).▴

As shown in the previous example, truncating the substitution layer and the subspaces Vand Wto each S-box independently of the others is too restrictive in general. This suggests that some S-boxes can in a way be linked together. That is to say, considering them independently results in a loss of information on the subspaces Vand W. Recall that we are interested in splitting the problem of finding all the substitution layers σmapping L(V)to L(W)into several independent smaller problems. Taking into account that some S-boxes can be linked together, we require the following:

  • a sub-problem can involve several S-boxes;

  • the same S-box cannot be involved in two different sub-problems (in other words, the sub-problems are independent);

  • each S-box is involved in one sub-problem (possibly trivial).

This is naturally formalized by a partition Iof 0,m. Each part Iof Irepresents a sub-problem, and its elements are the indices of the S-boxes involved in. By virtue of Proposition 2.22, it holds that


The next section aims to find a sufficient condition on the partition Ito obtain the equivalence. In such a case, this means that combining the solutions of these sub-problems yields a substitution layer mapping L(V)to L(W)and vice versa.

3.2. Structure of the subspaces Vand W

With the aim of ending up with partitions for which the converse of 2.2 holds, let us introduce a few definitions and notations.

Definition 2.25 (trivial product). Let Ebe a subset of 0,m. The trivial product subspaceassociated with E, denoted by TrivE, is defined to be


Moreover, we denote by VEthe intersection of Vand TrivE, that is VE=VTrivE={vV|iEc,vi=0n}. The subspace WEis defined in the same way.

Remark 2.26. It is easily seen that


Thus, a trivial product subspace is the Cartesian product of trivial spaces for each S-box; this justifies its name. Additionally, if EF, then TrivETrivF, and hence VEVFand WEWF.

The subspaces TrivEare essential in the study of the substitution layer because the latter always preserves the partition L(TrivE)regardless of its S-boxes. This result, together with Proposition 2.10, establishes the following corollary.

Corollary 2.27. Let Ebe a subset of 0,m. If σmaps L(V)to L(W), then σalso maps L(VE)to L(WE).

Example 2.28. All the subspaces VEare graphically represented in Figure 2.9. For instance,


Figure 2.9.

The subspacesVE,WEfor each subsetEof {0,1,2,3}.

Additionally, this figure also highlights the expected inclusions given by Remark 2.26. Observe that BV=(vi)0i<7is a basis of V. This new basis is more convenient than the echelonized basis EVpreviously introduced in Example 2.19 since all the VEare then easily described. It is worth noting that the same picture remains valid for the subspace W. For example,


This emphasizes that when the substitution layer maps L(V)to L(W), the subspaces Vand Whave the same structure.

According to Corollary 2.27, the substitution layer maps L(V{0})to L(W{0}). Next, truncate to E= {0} using Proposition 2.22 to obtain


This property is depicted in Figure 2.10. Finally, it should be underlined that with Proposition 2.22 alone, no property can be established on the S-box S0 (see Example 2.24).▴

Figure 2.10.

The S-boxS0 mappingL(V)toL(W)whereV=span(03,0D,15)andW=span(01,0E,14).

Definition 2.29 (projection PE). Let Ebe a subset of 0,m. The projectionPEfrom (F2n)monto TrivEis defined by PE(x0,,xm1)=(y0,,ym1)where yi= xiif ibelongs to Eand yi=0notherwise.

Remark 2.30. It is not hard to see that PEis a linear mapping and that VEis always a subspace of PE(V). Moreover, it holds that TE(V)=TE(PE(V)).

The next lemma gives some relations between the previous definitions. It is quite important and will be used several times by the end of the current chapter.

Lemma 2.31. Let Ibe a partition of 0,m. Then Vequals the internal direct sum IIVIif and only if VI=PI(V)for any part Iof I. In this case, the decomposition of an element vof Vis v=IIPI(v).

Remark 2.32. Suppose that Iis a partition of 0,msuch that V=IIVI. The previous lemma, together with Remark 2.30, establishes that TI(V)=TI(VI)for each part Iof I.

Proposition 2.33 (Substitution layer structure). Let Ibe a partition of 0,msatisfying both V=IIVIand W=IIWI. The permutation σmaps L(V)to L(W)if and only if σImaps L(TI(V))to L(TI(W))for any Iin I.

The preceding proposition establishes that the converse of Implication 2.2 (page 21) holds whenever the partition Isatisfies both V=IIVIand W=IIWI. For such a partition, the problem of finding all the substitution layers σmapping L(V)to L(W)can equivalently be broken down into the independent sub-problems of finding all the σImapping L(TI(V))to L(TI(W))for each part Iof I.

3.3. Linked and independent S-boxes

Of course, there may be several partitions Isuch that V=IIVIand W=IIWI, each yielding a different decomposition of the substitution layer. A few of these decompositions are certainly more interesting or easier to solve. The purpose of this section is to study such partitions. Let us begin with the following lemma.

Lemma 2.34. Suppose that σmaps L(V)to L(W). For every partition Iof 0,m, V=IIVIif and only if W=IIWI.

The contrapositive of Lemma 2.34 is the following: if there exists a partition Isuch that V=IIVIand WIIWIor such that VIIVIand W=IIWI, then there exists no substitution layer mapping L(V)to L(W). Because we intend to study the substitution layers mapping L(V)to L(W), Lemma 2.34 suggests to assume the following.

Assumption 2.35. For the remainder of this section, we assume that for any partition Iof 0,m, it holds that


Proposition 2.33, together with the preceding assumption, suggests the following definition.

Definition 2.36 (decomposition partition). A decomposition partition(with respect to Vand W) is a partition of 0,msuch that V=IIVI.

Remark 2.37 (partial order on partitions). Recall that if Iand Jare two partitions of 0,m, then the partition Iis said to be finerthan Jif for any part Iin I, there exists a part Jin Jsuch that IJ.

Example 2.38. The purpose of this example is to find all the decomposition partitions with regard to Vand W. By virtue of Lemma 2.31, the subspace Vcan be decomposed as IIVIif and only if VIis equal to PI(V) for each part Iof I. The eight-framed subspaces in the middle of Figure 2.9 are exactly those that satisfy VE=PE(V). Hence, the decomposition partitions are the partitions whose parts are selected from the following:


It is then easy to check that the decomposition partitions of Vare:


In Figure 2.11, all the partitions of 0,4are ordered by the “finer-than” relation, and the decomposition partitions are emphasized. What stands out is that the decomposition partition {{1}, {2}, {0, 3}} is finer than all other decomposition partitions.▴

Figure 2.11.

The partitionsIof {0, 1, 2, 3} such thatV=IIVI.

The existence of this least decomposition partition in the example above is a very welcome and nontrivial property. This means that all the truncated substitution layers obtained using Proposition 2.33 are the smallest possible. Thus, such a partition should be preferred to any other decomposition partition. We will now prove that this least decomposition partition always exists.

Proposition 2.39. The set of the partitions Iof 0,msatisfying V=IIVIhas a least element denoted Ild.

Consequently, the only decomposition partition that will be considered in the remainder of this chapter is the least decomposition partition Ild. The following definition is inspired by Proposition 2.33 and Proposition 2.39.

Definition 2.40 (linked and independent S-boxes). Suppose that σmaps L(V)to L(W). Let Ibe a part of Ild.

  • If I= {i}, the S-box Siis said to be independentof the other S-boxes.

    Moreover, if V{i}={0nm}or V{i}=Triv{i}, the S-box Siis said to be inactive. Otherwise, Siis active.

  • If #I≥ 2, then the S-boxes whose indices lie in Iare said to be linked together.

Remark 2.41. Let 0 ≤ imbe an integer. We have already noted that the substitution layer σalways preserves L({0nm})and L(Triv{i}). In addition, Proposition 2.33 ensures that σmaps L(V{i})to L(W{i}). Consequently, if V{i}={0nm}or if V{i}=Triv{i}, then V{i}= W{i}.

Suppose that the S-box Siis independent with regard to the subspaces Vand W. As established by Proposition 2.33 and Remark 2.32, if Siis replaced with another S-box Si, then this new substitution layer still maps L(V)to L(W)provided that Simaps L(T{i}(V{i}))to L(T{i}(W{i})).

Suppose further that Siis active. By definition, {0nm}V{i}Triv{i}. Observe that the restriction of T{i} to Triv{i} is one-to-one, hence


Thus, T{i}(V{i}) is a nontrivial subspace of F2nand the requirement that Simaps L(T{i}(V{i}))to L(T{i}(W{i}))is also nontrivial. Therefore, an independent active S-box can be chosen independently of the other S-boxes but has to respect the structure of the subspaces Vand W.

Now suppose that Siis inactive. By definition, V{i}={0nm}or V{i}=Triv{i}. Then, the equality V{i}=W{i}follows from Remark 2.41 and we have that


In either case, the condition that Simaps L(T{i}(V{i}))to L(T{i}(W{i}))is trivial, and any S-box fulfills it. As a consequence, an independent inactive S-box can be freely chosen. In other words, such an S-box has no impact on the fact that σmaps L(V)to L(W).

Finally, suppose that some S-boxes are linked together. If only one of these S-boxes is replaced independently of the others, then the desired property of the substitution layer may not hold.

Example 2.42. As we have seen in Example 2.38 and Figure 2.11, the least decomposition partition with regard to the subspaces Vand Wis Ild={{1},{2},{0,3}}. By Proposition 2.33, the substitution layer maps L(V)to L(W)is and only if the following equalities hold:


Thus, the S-box S1 is independent of the other S-boxes, the same applies to S2 and the S-boxes S0 and S3 are linked together. As was already noted in Figure 2.9, we have that


Therefore, the S-box S2 is active while S1 is inactive.▴

3.4. The forbidden case

Throughout this section, we assume that the substitution layer σmaps L(V)to L(W). In order to prove the last main theorem of this chapter, we need to consider the following particular case.

Proposition 2.43. Let Ibe a decomposition partition. Let Ibe a part of Isuch that #I≥ 2 and let Ebe a nonempty proper subset of I. Suppose that VE=VI\E={0nm}and PE(V)=TrivE. Then, for all iin E, Siis an affine mapping.

If the subspace Vsatisfies the assumption of the proposition above, then at least one of S-boxes has to be affine. Nowadays, an SPN whose substitution layer has an affine S-box cannot be taken seriously. Additionally, such a cipher is likely to be very weak to differential and linear cryptanalysis. This discussion explains the title of this section.

Example 2.44. As seen in Example 2.38, the least decomposition partition is Ild={{1},{2},{0,3}}. Its only part of cardinality greater than or equal to 2 is I={0,3}. The nonempty proper subsets of Iare the E= {0} and E= {1}. According to Figure 2.9, we have V{0}{020}. Consequently, Proposition 2.43 does not apply to this example, and this is good news because none of the S-boxes is affine. Otherwise, this would have disproved the contrapositive of Proposition 2.43.

Now let us introduce another example. Consider a substitution layer σ′ made up of two 3-bit S-boxes S0and S1; hence, its parameters are m= 2 and n= 3. Define the subspaces V′ and W′ of (F23)2by


Finally, suppose that σ′ maps L(V)to L(W). It is easily seen that


Thus, the least decomposition partition with regard to V′ and W′ is {{0, 1}}. The S-boxes S0and S1are then linked together. Choosing E= {0} in Proposition 2.43 ensures that S0must be affine. Similarly, we can prove that S1must also be affine by considering E= {1}. As a result, any substitution layer σ′ mapping L(V)to L(W)is necessary affine. These subspaces are thus completely prohibited as the whole cipher is then affine.▴

3.5. Reduction to one S-box

To prove our main result about the substitution layer, we need the following preliminary lemma.

Lemma 2.45. Let Ibe a part of Ildand Ebe a non-empty proper subset of I.

  • If VEis a trivial product subspace, then VE=Triv={0nm}.

  • If PE(V) is a trivial product subspace, then PE(V)=TrivE.

Now we have all the results needed, let us state and prove the main result of Section 3 which is depicted in Figure 2.12.

Figure 2.12.

Diagrammatic representation of Theorem 2.46.

Theorem 2.46. Let n≥ 2 and mbe two positive integers. Let S0,…,Sm–1 be n-bit S-boxes. Define the permutation σof (F2n)m, which maps the element (xi)0i<mto (Si(xi))0i<m. Let Vand Wbe two subspaces of (F2n)msuch that σmaps L(V)to L(W). Suppose that Vis not a trivial product subspace. Then, at least one of the S-boxes maps a nontrivial linear partition to another one.

Proof. Let us prove this result by complete induction on the number mof S-boxes. Suppose that m= 1. In this case, σ= S0. By hypothesis, Vis different from {0n} and F2n. Hence, L(V)is a nontrivial partition and S0 maps L(V)to L(W).

Let m≥ 2 be an integer. Suppose that the result holds for any positive integer strictly lower than m. First, suppose that all the S-boxes are independent. In other words, Ild={{i}|i0,m}. If each S-box is inactive, then Vis a trivial product subspace, a contradiction with our hypothesis. Thus, there exists at least one active S-box Si. In this case, {0nm}V{i}Triv{i}. According to Lemma 2.31, the equality P{i}(V)=V{i}holds. Then, T{i}(V{i})=T{i}(P{i}(V))=T{i}(V)is a nontrivial subspace of F2n, so L(T{i}(V))is also nontrivial. Finally, Proposition 2.22 states that Simaps L(T{i}(V))to L(T{i}(W)), and thus the result holds in this case.

Now, suppose that some S-boxes are linked together. Then, there exists an element Iof Ildsuch that #I≥ 2. Next, at least one of the following three cases holds.

  1. Suppose that there exists a nonempty proper subset Eof Isuch that PE(V) is not a trivial product subspace. Let pdenote the cardinality of E. Recall that TE(PE(V))=TE(V). It follows that TE(V) is not a trivial product subspace of (F2n)p. According to Proposition 2.22, σEmaps L(TE(V))to L(TE(W)). Note that Eis a non-empty proper subset of I, so of 0,m. Hence p< m, so the induction hypothesis ensures that at least one of the S-boxes of σmmaps a nontrivial partition to another one.

  2. Suppose that there exists a nonempty proper subset Eof Isuch that VEis not a trivial product subspace. Recall that σmaps L(VE)to L(WE). Proposition 2.22 ensures that σEmaps L(TE(VE))to L(TE(WE)). It is easily seen that TE(VE) is not a trivial product subspace. As before, the result is a consequence of the induction hypothesis.

  3. Suppose that there exists a nonempty proper subset Eof Isuch that PE(V), VEand VI\Eare all trivial product subspaces. Then, Lemma 2.45 implies that PE(V) = TrivEand VE=VI\E={0nm}. According to Proposition 2.43, the S-boxes whose indices belong to Eare affine mappings. Combining Proposition 2.15 and 2.13, we see that these S-boxes map any non-trivial linear partition to another one.

In any case, the result holds for this integer m. The result follows by induction. ▪

Example 2.47. It is worthwhile to note that the proof of Theorem 2.46 is constructive. Therefore, it gives a method to find necessary conditions on the S-boxes for the substitution layer to map L(V)to L(W). Let us apply this method to our main example.

The first step is equivalent to what had been done in Examples 2.38 and 2.42. Consider the least decomposition partition Ild={{1},{2},{0,3}}and deduce that:

  • S1 is inactive;

  • S2 is active and maps L(span(07,1A))to L(span(0E,12))(see Figure 2.2);

  • S0 and S3 are linked together.

Now, consider the part I= {0,3} of Ild. Thus, the nonempty proper subsets of Iare {0} and {3}. The first case requires to compute the following projections:


Thus, P{3}(V) is not a trivial product subspace. As in Example 2.24 and Figure 2.8, we see that S3 maps L(0B,1C)to L(08,15)by truncating σand the subspaces P{3}(V), P{3}(W)to {3}. Now, we need to compute the following subspaces:


Since V{0} is not a trivial product subspace, the second case apply. Then, truncate the substitution layer σand the subspaces V{0} and W{0} to prove that S0 maps L(03,0D,15)to L(01,0E,14). This property was stressed in Example 2.28 and Figure 2.9. Finally, recall that the third case does not apply to these subspaces, as observed in Example 2.44.▴

The preceding example covers only the first and the second cases in the treatment of linked S-boxes given by the proof of Theorem 2.46. To illustrate the third case, we introduced the following example.

Example 2.48. Let n= m= 3. Thus, the substitution layer σis made up of three 3-bit S-boxes denoted by S0, S1 and S2. Define the subspaces Vand Wof (F23)3by


and assume that the substitution layer σmaps L(V)to L(W). By definition, it holds that P(V)={(0,0,0)}and P{0,1,2}(V)=V. Then, for each nonempty proper subset Eof {0,1,2}, it is easily seen that PE(V)=TrivE. For instance,


We know that V={(0,0,0)}and V{0,1,2}(V)=V. The other subspaces VEare the following:


Thus, the equality PE(V)=VEholds only for E=and E= {0,1,2}. Consequently, the least decomposition partition is Ild={{0,1,2}}, and hence, all the S-boxes are linked together.

From now on, we follow the method given in the proof of Theorem 2.46. As previously noted, for each nonempty proper subset Eof {0,1,2}, the projection PE(V) is a trivial product. Therefore, the first case does not apply to this example. We move on to the second case. By induction, the substitution layer and the subspaces V{0,1} and W{0,1} are truncated to {0,1}. Hence, we now consider the permutation σ′ = σ{0,1}, which maps L(V)to L(W)where


Such a substitution layer has already been studied in Example 2.44. Recall that


Thus, the least decomposition partition with regard to V′ and W′ is {{0,1}}. Since V{0}, V{1}, P{0}(V)and P{1}(V)are all trivial products, the first and second cases do not apply. Choosing E= {0} and E= {1} in the third case proves that S0 and S1 are affine mappings. Come back to the full substitution layer. Similarly, it is straightforward to verify that S2 must be affine by truncating σand the subspaces V{0,2}, W{0,2} to {0,2}. To summarize, we have proven that any substitution layer mapping L(V)to L(W)is necessarily affine.▴

In this chapter, we have studied a generic SPN mapping a partition Aof F2nmto a partition Bof F2nm, independently of the round keys used. Combining Theorem 2.17 and Corollary 2.18, we proved that there exist two families (V[i])0irand (W[i])0irof subspaces of F2nmsuch that the substitution layer σmaps L(V[i])to L(W[i])for each 0 ≤ ir. This result has been illustrated in Figure 2.5.

First, suppose that all the V[i] are trivial products. In such a case, the diffusion layer of the cipher is probably not playing its role (or the round number is very small). As is generally the case, suppose that there is no diffusion layer in the last round of the SPN. Then, the input and the output partitions are both linear partitions associated with a trivial product subspace. This implies that some ciphertext bits are independent of some plaintext bits. Such a property must be avoided in any good cipher.

Now, suppose that at least one of the V[i] is not a trivial product. This second case is far more interesting than the previous one. By virtue of Theorem 2.46, at least one of the S-boxes must map a nontrivial linear partition to another one, as illustrated in Figure 2.12.

Thus, we have proven in this chapter that any good partition-based trapdoor SPN has at least on S-box mapping a nontrivial linear partition to another one. The following chapter aims to design such an S-box with the best security against both differential and linear cryptanalysis.

Analysis of a backdoor S-box

Differential [21] and linear [22] cryptanalysis are considered as the most important attacks against block ciphers [23]. The resistance of an S-box against these attacks is assessed by its difference distribution table and its linear approximation table respectively.

Let Sbe an n-bit S-box. The difference distribution table and the linear distribution table of Sare the two families DTSand LTSindexed by (F2n)2and defined for any (a, b) in (F2n)2by


Moreover, the S-box Sis said to be differentially δ-uniformif DTS(a,b)δfor any (a, b) in (F2n)2with a≠ 0. Similarly, Sis linearly λ-uniformif |LTS(a,b)|λfor every (a, b) in (F2n)2with b≠ 0. It is worthwhile to mention that the smaller the differential uniformity is, the more resistant Sis against differential cryptanalysis. The same applies for linear cryptanalysis.

Remark 3.1. It can be proven that any n-bit S-box is at least linearly 2n12-uniform.

Recall that two permutations S1 and S2 of F2nare said to be equivalentif there exist two linear mappings L1, L2 of F2nand two elements v1, v2 of F2nsuch that


It is well known that equivalent permutations have the same differential uniformity and the same linear uniformity, see for instance [24, 25]. More precisely, their differential tables are equal up to row and column permutations. This result holds for linear tables up to the sign of the coefficients.

Let Vand Wbe two subspaces of F2n. Suppose that S′ is an n-bit S-Box mapping L(V)to L(W). Proposition 2.11 ensures that there exists an automorphism Lof F2nsuch that L(V)=W. Since L1(W)=V, Proposition 2.15 states that L−1 maps L(W)to L(V). Then, S=L1Sis equivalent to S′ and maps L(V)to L(V). This discussion establishes the following proposition.

Proposition 3.2. Let Vand Wbe two subspaces of F2n. If S′ is an n-bit S-box mapping L(V)to L(W), then there exists an S-box Sequivalent to S′ preserving L(V).

Remark 3.3. Conversely, suppose that Spreserves L(V). Let Wbe any subspace isomorphic to V. Then find an automorphism Lsuch that L(V) = W. By Proposition 2.15, LSmaps L(V)to L(W).

As with Section 3, let us introduce an example that we will continue throughout this section.

Example 3.4. Consider the 5-bit S-box S′ given in Figure 3.1. This S-box has already been met twice in Examples 2.7 and 2.19 (refered to as fand S2 respectively). Thus, we know that S′ maps L(V)to L(W)where V=span(07,1A)and W=span(0E,12). Following the proof of Proposition 2.11, an automorphism Lof F25satisfying L(V) = Wwas constructed in Example 2.12. Its inverse L−1 and the composition S= L−1S′ are given in Figure 3.1. For instance, S(07)=L1(S(07))=L1(10)=18. It is easy to check in Figure 3.2 that Spreserves the linear partition L(V). Finally, it is worth observing how Figures 2.2 and 3.2 look similar. This explains our choices to construct the automorphism L.▴

Figure 3.1.

Construction of the S-boxSused throughout Chapter 3.

Figure 3.2.

The permutationSpreservingL(V)whereV=span(07,1A).

By virtue of Proposition 3.2, we can assume without loss of generality that V= Win our study of the linear and differential properties of an S-box mapping L(V)to L(W).

Throughout this section, we consider the following

  • let Vbe a d-dimensional nontrivial subspace of F2n,

  • let Ube a complement space of V,

  • let Sbe an n-bit S-box preserving L(V).

Therefore, the space F2ncan be written as the direct sum UV. In other words, every element xof F2ncan be uniquely written as the sum x= u+ vwhere uand vbelong to Uand V, respectively. Let [u] denote the coset of Vwith respect to u. Thus, [u] = u+ Vis the unique part of L(V)where ulies in and we have


Since Vis d-dimensional, the complement space Uis (nd)-dimensional. In addition, we have the following inequalities


because Vis assumed to be a nontrivial subspace of F2n.

The following theorem describes the structure of permutations preserving a linear partition. It can be seen as a corollary of the Krasner-Kaloujnine embedding theorem [26]. However, for convenience, we give a direct constructive proof.

Theorem 3.5. There exist a unique permutation ρof Uand a unique family of permutations (τu)uUof Vsuch that, for all x= u+ vin F2n,


Conversely, if ρis a permutation of Uand if (τu)uUis a family of permutations of V, then the mapping S′ defined by S(u+v)=ρ(u)+τu(v)preserves L(V).

Proof. By hypothesis, Spreserves L(V). Thus, Sinduces a permutation ρof Udefined as follows. Let ube an element of U. Hence, there exists a unique u′ in Usuch as f([u])=[u]. Define then ρ(u)=u. For each element uof U, define the permutation τuof V, which maps vto S(u+v)+ρ(u). By construction, for any uin Uand any vin V, we have


The existence of the permutations ρand τuis proven. Now, let us show their uniqueness. Suppose that there exist a permutation ρ˜of Uand a family of permutations (τ˜u)uUof Vsatisfying the result. Let (u, v) be an element of U× V. By hypothesis, we have


Because the sum of Uand Vis direct, it follows that ρ(u)=ρ˜(u)and τu(v)=τ˜u(v). The uniqueness of ρand the τufollows.

Conversely, let ρbe a permutation of Uand (τu)uUbe a family of permutations of V. Denote S′ the mapping from F2nto F2ndefined by S(u+v)=ρ(u)+τu(v). Since F2n=UVand ρand the τuare permutations of Uand Vrespectively, The mapping S′ is a permutation of F2n. Let ube an element of U. It holds that


Hence, S′ preserves the linear partition L(V). ▪

This theorem allows us to design an S-box that preserves L(V)using permutations with smaller domains. Furthermore, these permutations can be chosen arbitrarily.

Example 3.6. Consider the complement subspace Uof Vdefined by


Figure 3.2 shows that Sinduces a permutation ρof U. For instance, ρ(00)=02because Smaps the part [00] to [02]. The whole permutation ρis given in Figure 3.3. For each uin U, define the permutation τuof Vby τu(v)=S(u+v)+ρ(u). For example,


Figure 3.3.

The permutationSpreservingL(V)whereV=span(07,1A).

The permutations τuare also given in Figure 3.3. Informally, the permutation ρtells us how Spermutes the parts of L(V)and the permutations (τu)uUdescribe how the elements are moved inside each part (Figure 3.4).▴

Figure 3.4.

The linear transformationsLUandLV.

In the rest of this section, the permutation ρand the family (τu)uUgiven by Theorem 3.5 are fixed.

The goal of this part is to express the linear and differential properties of Saccording to the ones of the permutations ρand (τu)uU. However, these permutations are not defined on F2nbut on the subspaces Uand Vof F2n. Thus, the concept of linear or differential table is inexistent for such maps. To solve this problem, we define two isomorphisms between Uand F2ndand between Vand F2d. Then, we consider the maps induced by ρand (τu)uUon these spaces.

Notation 3.7. Let BU=(ui)i<ndand BV=(vi)i<ndbe two bases of Uand Vrespectively. Define the following mappings:


It is easily seen that LUand LVare both isomorphisms of vector spaces. Define the permutation ρ=LU1ρLUof F2nd. Finally, for each uin U, let τudenote the permutation LV1τuLVof F2d.

Example 3.8. Consider the bases BU=(01,02,08)and BV=(07,1A)and define the isomorphisms LUand LV. The permutation ρ′ of F23and the permutations τuof F22are given in Figure 3.5.

Figure 3.5.

The family of permutations(τu)uUand the permutationρ′.

1. Linear approximation table

The next theorem links the linear tables of Sand ρ′. The coefficients of the linear approximation table of Staken into account by this result are in practice the greatest. Thus, they generally determine the linear uniformity of S.

Theorem 3.9. Let aand bbe two elements of V. Denote at=LU(a)and bt=LU(b). Then,


Remark 3.10. Consider the map LU:F2nF2nd. Then, ker(LU)=(ImLU)=U.Observe that UV=(U+V)=(F2n)={0}. Consequently, the restriction LU:VF2ndis one-to-one and thus onto because of the rank-nullity theorem.

Example 3.11. The restriction LU:VF23is given by the following table.


Reorder the rows and the columns of the linear approximation table of Sto begin with ((LU)1(x))xF23, as suggested by Theorem 3.9. The reordered linear table is shown in Figure 3.6. Each dot “⋅” in this figure stands for the integer 0. With this order, it is easily seen that the top left part of LTSis exactly the linear table of ρ′ multiplied by 2d= 4. For instance, LTS(1D,16)=22×LTρ(5,2)=8because LU(1D)=5and LU(16)=2.▴

Figure 3.6.

The reordered linear table ofS.

Corollary 3.12. The S-box Sis at least linearly 2(n+d1)/2-uniform.

Proof. As noted in Remark 3.1, there exist two elements atand btof F2ndboth nonzero such that |LTρ(at,bt)|2(nd1)/2. Let aand bdenote the elements (LU)1(at)and (LU)1(bt)of F2n. Then, Theorem 3.9 implies that


Observe that aand bare nonzero and the result is proven. ▪

Remark 3.13. It is well-known that any 4-bit S-box is at least linearly 4-uniform, see for example [27]. As a consequence, the permutation Sis at least 2d+2-uniform if nd= 4. Similarly, any 2-bit S-Box is linearly 2-uniform, and hence Sis at least 2d+1-uniform if nd= 2.

Example 3.14. It is easily seen that Sis linearly 8-uniform in Figure 3.6. The lower bound given by Corollary 3.12 is 2(n+d1)/2=2(5+21)/2=8. Therefore, this bound is tight on this example.▴

2. Differential distribution table

Unlike linear cryptanalysis, where only a local view of the table was provided, the results for differential cryptanalysis bring both local and global outlooks.

Theorem 3.15. Let a=ua+vaand b=ub+vbbe elements of F2n. Denote ua=LU1(ua)and ub=LU1(ub). Then


Especially, DTS(a,b)2d×DTρ(ua,ub).

The preceding theorem can be restated in the following way. If DTSis rearranged coset by coset, a simple operation enables recovery of DTρ′. On the other hand, the next theorem is similar to Theorem 3.9 but for differential cryptanalysis. Again, it generally highlights the coefficients of DTSinvolved in the differential uniformity of S.

Theorem 3.16. Let vaand vbbe two elements of V. Denote va=LV1(va)and vb=LV1(vb). Then


Particularly, the subtable (DTS(va,vb))va,vbVis uniquely determined by the differential tables (DTτu)uU.

Example 3.17. To illustrate Theorems 3.15 and 3.16, reorder the rows and the columns of the differential table of Sas presented in Figure 3.7. With this order, we can see the differential table of ρ′ by considering the differential table of Scoset by coset. In fact, Theorem 3.15 states that the sum of all elements in the same row or column of the subtable DTS([u1],[u2])is equal to the coefficient (x1, x2) of DTρmultiplied by 22, where xi=LV1(ui). For instance, if we consider the subtable

Figure 3.7.

The reordered differential table ofS.

we can see that the sum of each row or column is equal to 8=22×DTρ(5,3)since LV(5) =09and LV(3) =03.

Finally, Theorem 3.16 ensures that the subtable DTS(V,V)=DTS([00],[00])is the sum of the differential tables (DTτu)uU.▴

Corollary 3.18. The permutation Sis at least δ-uniform for the differential cryptanalysis where δdenotes the even integer directly greater than or equal to 2n2d1.

Example 3.19. In Figure 3.7, we can see that Sis differentially 12-uniform. Thus, this S-box reaches the lower bound given by Corollary 3.18.▴

3. The design of a trapdoor S-box

First, let us summarize the theorems of this section.

  • Theorem 3.9 implies to reduce at most the linear uniformity of ρ′ to keep the one of Sas small as possible.

  • In the same way, Theorem 3.15 implies to reduce at most the differential uniformity of ρ′.

  • The same theorem also stresses that the greater the number of nonzero coefficients of DTρis, the better.

  • Finally, Theorem 3.16 teaches us that the sum of the differential distribution tables DTτushould be as low as possible.

Now, to design the S-box S, one needs to pick a permutation ρ′ of F2ndwith the smallest uniformities for linear and differential cryptanalysis. Then, one searches for permutations τuof F2dsatisfying the last condition. This search can be conducted randomly over every d-bit S-boxes. Finally, construct the S-box Sas in the converse of Theorem 3.5. If the differential and linear uniformities of Sare too far from the lower bounds given by Corollaries 3.12 and 3.18 and by Remark 3.13, then start again. In practice, these bounds are reached (or almost reached) after a small number of iterations.

Moreover, observe that the closer the dimension dof Vfrom nis, the weaker the S-box Sis against linear cryptanalysis and the stronger Sis against differential cryptanalysis. The lower bounds given by Corollaries 3.12 and 3.18 and by Remark 3.13 are given in Figure 3.8 for each 3 ≤ n≤ 8.

Figure 3.8.

Lower bounds for the linear (left) and differential (right) uniformities ofS.

Finally, it should be highlighted that these results can be used to easily prove that a given S-box does not map any linear partition to another one. For instance, the linear and differential uniformities of the S-box of Rijndeal [11] are far below the lower bounds given by Corollaries 3.12 and 3.18, no matter what the dimension dof the subspace Vis. As a consequence, this S-box does not map any linear partition to another linear one.

Backdoored Encryption Algorithm 1

BEA-1 [28] (Backdoored Encryption Algorithm) is an AES-like cipher together with a backdoor based on the theory developed in Chapters 2 and 3. This cipher is designed to resist linear and differential cryptanalysis. Nonetheless, the backdoor enables recovery of the full 120-bit cipher key in just a few seconds on a laptop computer using 216 chosen plaintext blocks, as presented in [29].

This chapter is organized as follows. First, the specification of the cipher BEA-1 and its security analysis against linear and differential cryptanalysis are given in Section 1. Then, Section 2 explains the hidden property of the algorithm and its design. To conclude, the cryptanalysis exploiting the backdoor is detailed in Sections 3 and 4.

1. Presentation of BEA-1

The cipher BEA-1 is directly inspired by Rijndael[7], the block cipher designed by Joan Daemen and Vincent Rijmen, now known as the AES. Our algorithm encrypts 80-bit plaintext blocks using a 120-bit cipher key. Unlike the AES, the internal state is not seen as a matrix of bytes but as an array of 10-bit bundles. Therefore, the message and key spaces are respectively (F210)8and (F210)12.

1.1. Specification of the encryption process

The encryption consists in applying 11 times a simple keyed operation called round functionto the data block. A different 80-bit round key is used for each iteration of the round function. Since the last round is slightly different and uses two round keys, the encryption requires twelve 80-bit round keys. These round keys are derived from the 120-bit cipher key using a key schedule.

Like any other substitution-permutation network, the round function is made up of three stages: a key addition, a substitution layerand a diffusion layer.

  • The key addition is just a bitwise “exclusive or” (XOR) between the data block and the round key.

  • The substitution layer consists in the parallel evaluation of four different 10-bit S-boxes and is the only part of the cipher that is not affine. These S-boxes are referred to as S0, S1, S2, S3 and are defined in Figures 5A, 7A, 9A and 11A given in Appendix. They should not be confused with the secret S-boxes S0, S1, S2 and S3, only used in the design and the cryptanalysis of BEA-1.

  • Following the design principles of the AES, the diffusion layer comes in two parts: theShiftRowsand theMixColumnsoperations. The first part is a bundle permutation. The second evaluates in parallel the linear transformation M:(F210)4(F210)4processing four 10-bit bundles. Because of its linearity, Mis only defined over the standard basis of (F210)4in Figure 3A in Appendix. For convenience, its inverse M−1 is also in the same figure.

The pseudo-codes for the key schedule and the encryption algorithm are both given in Figure 4.1. To provide an overview of their structures, the first step of the key schedule and the round function is illustrated in Figure 4.2. This representation also emphasizes the similarities between our algorithm and the AES.

Figure 4.1.

The key schedule and the encryption function of BEA-1.

Figure 4.2.

Diagrammatic representations of the key schedule and the round function of BEA-1.

Remark 4.1. The decryption is straightforward from the encryption since all the primitives are bijective. Thus, to decrypt, we just have to apply the inverse operations in the reverse order. It should be stressed that the key addition and theShiftRowsare involutions; therefore the same operations are used in the decryption process. Finally, note that the inverse S-boxes are not given here but can be computed by using the equation Si1(S(x))=xholding for each xin F210.

1.2. Differential and linear cryptanalysis

In [7], Daemen and Rijmen introduced the differential and the linear branch numbers of a linear transformation. With an exhaustive search, it can be checked that the differential and linear branch numbers of Mare both equal to 5, which is the maximum. This implies that any 2-round trail has at least 5 active S-boxes. Thus, a 10-round trail involves at least 25 active S-boxes.

Note that all the S-boxes are (at most) differentially 40-uniform and linearly 128-uniform. Therefore, the probability of any 10-round differential trail is upper bounded by (401024)252116.9and the absolute bias of a 10-round linear trail is upper bounded by (128512)25=250. Consequently, a differential cryptanalysis of the 10-round version of our cipher would require at least 2117 chosen plaintext/ciphertext pairs and a linear cryptanalysis would require 2100 known plaintext/ciphertext pairs.

Even if this is a rough approximation since it does not take into account the inter-column diffusion provided by theShiftRowsoperation, it suffices to prove the cipher’s practical resistance against classical differential and linear cryptanalysis. In fact, there are only 280 different plaintext/ciphertext pairs for a fixed cipher key.

2. Design of the backdoor

The presentation of secret structure of BEA-1 comes in two parts. First, Section 2.1 explains the nature of this backdoor and provides all the results needed to address the cryptanalysis. Then, the design of BEA-1’s primitives is given in Sections 2.2 and 2.3. The reader who just wants to understand how the backdoor works can skip these two sections. Indeed, they are more technical and are also independent of the remainder of this chapter.

2.1. The linear partitions throughout the encryption

As said in introduction, the backdoor of BEA-1 relies on the theoretical framework developed in Chapters 2 and 3. Thus, it should not be surprising that linear partitions must play a key role in it. For this purpose, let us introduce the following 5-dimensional subspaces of F210


Then, define the 40-dimensional subspaces V=i=07Vimod4and W=i=07Wimod4of message space (F210)8. Therefore, the linear partitions L(V)and L(W)are both made up with 240 cosets, each containing 240 elements.

The S-boxes S0, S1, S2 and S3 given in the specification of BEA-1 are actually derived from the secretS-boxes S0, S1, S2 and S3 given in Figures 4A, 6A, 8A and 10A in Appendix. The relation between the secret S-boxes Siand their modified versions Siwill be detailed later in Section 2.2. In the first place, let us state the following theorem relating BEA-1 to the theory of partition-based backdoor ciphers.

Theorem 4.2. Consider the encryption function of BEA-1 where the modifiedS-boxes S0, S1, S2, and S3 are replaced with their secretcounterparts S0, S1, S2, and S3. Then, the round function preserves the linear partition L(V)of (F210)8and the last round maps L(V)to L(W), no matter the round keys used. As a consequence, the full encryption maps L(V)to L(W).

More precisely, Figure 4.3 depicts the evolution of the linear partition L(V)throughout each primitive of the (secret) encryption process. For instance, we can see that the S-box Simaps the linear partition L(Vi)to L(Wi), and hence, the substitution layer maps L(V)to L(W). Similarly, the diffusion layer comes back to the original partition, since it maps L(W)to L(V).

Figure 4.3.

The linear partitions throughout the encryption.

Remark 4.3. Theorem 4.2, as well as Theorem 18 stated hereinafter, will be proven in Sections 2.2 and 2.3. Indeed, they establish the main properties of the backdoor and are hence closely related to the design of the cipher’s primitives.

Thanks to Theorem 4.2, we can now explain our choices for the Viand Wi. Each of these subspaces of F210is a five-dimensional linear code whose minimal distance is equal to 4. This property ensures that the Hamming distance of any two different elements lying in the same coset is at least equal to 4. The subspaces Vnd Wof F280inherit this property. Thus, if pis a plaintext, then any other plaintext p′lying in the same coset of Vdiffers from pin at least four bits. Considering the secret encryption function, Theorem 4.2 establishes that their ciphertexts cand c′ belong to the same coset of W. Thus, cand c′have at least four different bits. As it will become clear in the next two sections, the subspaces Viand Wicould have been freely chosen among the five-dimensional subspaces of F210. We surmised that using linear codes with high minimal distance should reduce the likelihood of observing the backdoor by accident, hence our choice for the Viand Wi.

Having explained the main property of the secret encryption function, now is the time to introduce the following theorem establishing a link between the secret cipher and BEA-1.

Theorem 4.4. Let Fand Edenote the round function and the encryption function of BEA-1 using the secret S-boxes. Let p=p[0]be any plaintext. Define the following elements with respect to the round keys k[0],,k[10]:


Assume that the round keys k[0],…,k[10] are independent and uniformly distributed. The probability that all the equalities p[i] = p[i] hold for each 1 ≤ i≤ 11 is given by


Therefore, the probability that pis encrypted equally with Eand Ecan be approximated by 2−11.

Remark 4.5. The fact that theMixColumnsoperation is replaced with a key addition in the last round of BEA-1 does not matter in Theorem 4.4. For the sake of simplicity, we then ignore this detail. This explains why the last round key k[11] does not appear in the statement of this result.

Needless to say, the hypothesis that the round keys are independent and uniformly distributed is mathematically wrong in any practical cryptanalysis. Indeed, the twelve 80-bit round keys are all extracted from one 120-bit cipher key. However, the cipher key needs to have (at least) 960 bits to provide independence and uniform distribution to its round keys. Such a cipher key must be related to the concept of long-key cipher defined in [30]. Nonetheless, if the cipher key is uniformly distributed, the same applies for each round key.

In our cryptanalysis of BEA-1, we are given plaintexts with their ciphertexts encrypted under a fixed cipher key. Even if we forget about the independence of the round keys, each plaintext must be encrypted with a random cipher key to make use of Theorem 4.4.

Fortunately, our experiments suggest that the proportion of the plaintexts encrypted equally with EKand EKis approximatively 2−11, even when the round keys are derived from a fixed cipher key K. To put it another way, if Pis a subset of the plaintext space (F210)8, it seems reasonable to assume that


Now, suppose that Pis included in a coset of Vdenoted by x + V. As the secret encryption function EKmaps L(V)to L(W)(see Theorem 4.2), we know that the image of Punder EKis included in a coset of W. More precisely, Lemma 2.8 establishes that EK(P)is included in y+ Wwhere y= EK(x). Hence,


Combining (4.1) with (4.2), we conclude that approximately #P×211ciphertexts in C=Ek(P)belong to y+ W. In addition, we have observed that the ciphertexts c= EK(p) such that EK(p)EK(p)are spread over the 240 cosets of W.

The backdoor of BEA-1 is hence the following. First, choose a set Pof 216 plaintexts uniformly chosen in one coset x+ Vand collect their ciphertexts C=EK(P)encrypted under an unknown cipher key K. Then search for the most represented coset of Win Cand denote by yone of its representatives. According to our experiments, this coset should have roughly 216–11 = 32 elements, and the second most represented coset is unlikely to have more than six elements. As a consequence of the preceding discussion, we know that the coset x+ Vis mapped to y+ Wby the secret encryption function EK. This information can then be used to recover the cipher key Kwith a low computation cost, as detailed later in Sections 3 and 4.

To conclude this section, observe that no particular property of the key schedule has been used. It can be proven that each round of the key schedule preserves the linear partition L(i=011Wi), provided that the S-boxes Siare replaced with their secret equivalents Si. This implies that if two cipher keys Kand K′are in the same coset of i=011Wi, then we can approximate the probability that each pair of round keys k[i] and k′[i] are in the same coset of Wby (9443925240)723.5. However, for this property to be easily exploitable, the round keys ought to stay in the same coset of Vinstead of W(which can be simply achieved by switching the mappings Mand (S0||S1||S2||S3) in the key schedule). Therefore, if compared with our cryptanalysis, this property appears not to be very useful and was intentionally left as a wrong track.

2.2. The substitution layer

The nature of the hidden property of BEA-1 having been emphasized, this and the following sections detail the design of the cipher’s primitives and prove Theorems 4.2 and 4.4 stated above. As explained in introduction, these two sections are aimed at the reader who wants to understand how BEA-1 was made. For a first read, it is possible to jump directly to Section 3 explaining the basic principle of the cryptanalysis using the backdoor.

Let {0*} and {*0} denote respectively the subspaces {05}×F25and F25×{05}of F210. It should be noted that {*0} is a complement space of {0*} in F210. The design of each secret S-box Sirests on a permutation S'iof F210preserving the linear partition L({0*}). Following Theorem 3.5, we just need to choose a permutation ρiof {*0} and a family (τi,u)u{*0}of permutations of {0*}. Then, we define S'ifor all x= u+ vin F210by


where uis in {*0} and vin {0*}. The permutations ρiand τi,uwere selected following the method given in Section 3, in order to maximize the resistance of S'iagainst both differential and linear cryptanalysis.

Figure 1A in Appendix defines the linear mappings LViand LWi(for 0 ≤ i≤ 4) over the standard basis of F210. It is worthwhile to note that these mappings are automorphisms of F210. Moreover, LVi({0*})=Viand LWi({0*})=Wi. By virtue of Proposition 2.15, we know that LVimaps L({0*})to L(Vi)and that LWimaps L({0*})to L(Wi). Last, but not least, define for each 0 ≤ i< 4 the secret S-box Siby


These S-boxes are given in Figures 4A, 6A, 8A and 10A in Appendix. Obviously, (LVi)1maps L(Vi)to L({0*}), then Sipreserves L({0*}), and LWimaps L({0*})to L(Wi). This implies the following proposition.

Proposition 4.6. For each 0 ≤ i< 4, the secret S-box Simaps L(Vi)to L(Wi).

Remark 4.7. If the reader is interested in an explicit definition of the permutations ρiand the families of permutations (τi,u)i{*0}, they can be recovered in the following way. First, compute Si=(LWi)1SiLViusing the tables of Figures 1A and 4A (or 6A, 8A, 10A). As noted previously, the permutation Sipreserves the linear partition L({0*}). To obtain its decomposition, we just have to follow the proof of Theorem 3.5. Thus, for each uin {*0}, define ρi(u) as the unique element of {*0}(Si(u)+{0*}). It is not hard to see that ρi(u) is simply equal to the element of F210, where the five leftmost bits are exactly the ones of Si(u)and the five remaining bits are all zero. Finally, for each uin {*0}, let τi,ube the permutation of {0*} defined by τi,u(v)=Si(u+v)+ρi(u). Again, τi,u(v) is just the 10-bit vector having its five leftmost bits all zero and its five rightmost bits identical to the ones of Si(u+v). Naturally, the permutations ρiand τi,ucan be seen as permutations of F25(instead of {*0} and {0*}) to obtain the more convenient definition


The modified S-boxes Sigiven in the specification of BEA-1 are such that Si(x) = Si(x) for almost all input xin F210. For instance, S0(x)=S0(x)for all except 80 elements xin F210. The images of these 80 particular points are emphasized in Figures 4A and 5A. These modifications were chosen so as to improve the differential and linear resistances of S0 compared to the original secret S-box S0. More generally, Siand Sihave 80 different images for iin {0,1,2}. The last-modified S-box S3 is less close to it secret equivalent since S3 and S3 have 99 different images.

Consequently, if xis uniformly distributed over F210, then the equality Si(x) = Si(x) holds with probability qiwhere


This implies that when xis uniformly distributed over (F210)8, the images of xunder the secret and the modified substitution layers are equal with probability q=(i=03qi)2.

Let p= p[0] be a plaintext. In the following, we use the notations of Theorem 4.4. If k[i] is uniformly distributed, then so is p[i] + k[i]. Thus, p[i+1]=Fk[i](p[i])is equal to p[i+1]=Fk[i](p[i])with probability q. Assuming moreover that the round keys are independent implies that the events p[i] = p[i] for each 1 ≤ i≤ 11 are independent. Therefore, the probability that the equalities p[i] = p[i] hold for all 1 ≤ i≤ 11 is given by q11. This discussion proves Theorem 4.4.

2.3. The diffusion layer

Some components used to design the linear transformation Mare defined over the finite field F25. In order to have an explicit construction of this field, we consider the irreducible polynomial X5 + X2 + 1 over F2and define F25as the quotient ring F2[X]/(X5+X2+1). Let αdenote the equivalence class of Xin F25. By construction, the equality α5+α2+1=0holds, or equivalently, α5=α2+1. Each element of F25can hence be uniquely written as i=04xiαiwhere (x4,…, x0) belongs to F25. More precisely, the family (αi)i<5is a basis of F25seen as a 5-dimensional vector space over F2. The field F25will then be identified with (F2)5via the isomorphism from F25to F25mapping (x4,…, x0) to i=04xiαi. For instance, the element α2 + α+ 1 in F25is identified with 07 in F25. Now define the 4 × 4 matrices MUand MVover F25by


It can be verified that these matrices are MDS. In other words, the [8, 4]-linear code having G=[Id4,MU]as generator matrix has minimal distance equals to 5, which is the maximum achievable.

Each of these matrices naturally induces an automorphism of (F25)4and hence of (F210)4. For instance, MUmaps the element x=(x0,x1,x2,x3)to x× MU. Observe that we chose to see elements of (F210)4as row vectors to keep the common notations of linear codes.

Example 4.8. To illustrate these notations, let us compute the image of the element x=(00,02,00,00)of (F210)4under the automorphism induced by MU. First, xis identified with the element (0, α, 0, 0) of (F25)4. Then,


Therefore, (00,02,00,00)×MU=(1B,0D,0F,18).▴

As was the case for the secret S-boxes Si, the linear transformation Mrests upon the linear transformation M′ defined as follows


where ρ(u)=u×MUand τu(v)=v×MV+PUV(u). The strength of this construction is that M′ inherits the linear and differential branch numbers of MUand MV, as stated in the proposition hereunder. But first, we introduce the following example.

Example 4.9. Let us compute the image of x=(000,070,000,000)under M′. As a first step, observe that xcan be written as


where u=(00,03,00,00)and v=(00,10,00,00). Let e9=(00,02,00,00)and e10=(00,01,00,00). Then u=e9+e10, it is indeed its decomposition over the standard basis of (F25)4. Thus, for any linear mapping L, it holds that L(u)=L(e9)+L(e10). The image of uunder ρcan hence be computed by


In the same way,


Consequently, M(x)=(0408,1911,1A1A,141E)=(088,331,35A,29E).▴

Proposition 4.10. The linear and the differential branch numbers of M′are both equal to 5. Thus, M′is a perfect diffusion layer.

Proof. Let x=(uivi)i<4be a nonzero element of (F210)4. In order to prove that the differential branch number of M′is equal to 5, we need to show that w10(x)+w10(M(x))is greater than or equal to 5. First, assume that u=(ui)i<4is nonzero. Using the fact that MUis MDS, we obtain the inequality w5(u)+w5(u×MU)5. Next,


Now, suppose that u= 0. It must be the case that v≠ 0 as xis nonzero by definition. Again, it holds that w5(v)+w5(v×MV)5because MVis also MDS. Then,


We have proven that w10(x)+w10(M(x))5for any nonzero element xof (F210)4. Consequently, the differential branch number of M′is greater than or equal to 5. The equality BD(M)=5follows as 5 is the maximum achievable. Similarly, it can be proven that M′has also the maximum linear branch number. It follows that M′is a perfect diffusion layer and the result is proven. ▪

Recall that the notation {0*} denotes the subspace {05}×F25and that the linear mappings LViand LWi(see Figure 1A) map respectively L({0*})to L(Vi)and L({0*})to L(Wi). It is then easily seen that M′maps {0*}4 to itself. Thus, M′preserves the partition L({0*}4)by Proposition 2.15. Finally, define


From its definition, it is straightforward to check that Mmaps the linear partition L(i=03Wi)to L(i=03Vi).

Example 4.11. We are going to compute M(000,080,000,000). First, we have that


Then, the image of (000,070,000,000)under M′is (088,331,35A,29E), as already established in Example 4.9. Finally,


Indeed, LV0(088)=LV0(080)+LV0(008)=21D+343=15E. The three other bundles are computed in the same manner.▴

Because each mapping LVior LWioperates on different bundles and is invertible, it is clear that the linear and differential branch numbers of Mare the same as M′. This discussion completes the proof of the following corollary.

Corollary 4.12. The linear mapping Mis a perfect diffusion layer, which maps L(i=03Wi)to L(i=03Vi).

In conclusion, Proposition 2.13 ensures that any key addition preserves all the linear partitions, and hence it preserves L(V). Next, it has been proven in Section 2.2 that every secret S-box Simaps L(Vi)to L(Wi). Thus, the secret substitution layer maps L(V)to L(W). It is clear that theShiftRowsoperation is linear and maps Wto itself. According to Proposition 2.15, this mapping preserves L(W). Finally, theMixColumnoperation maps L(W)to L(V)by Corollary 4.12. This discussion is summarized in Figure 4.3 and proves Theorem 4.2 previously given in Section 2.1.

3. Main idea of the cryptanalysis

As we have seen in Section 2.1, the cipher BEA-1 does not map a linear partition to another one but behaves as though it did for a nonnegligible fraction of the message space. This nontrivial property can be used to recover the cipher key in an operational cryptanalysis. But before considering the full cipher, we give the main idea of this attack.

3.1. A detailed example

To explain how to take advantage of this backdoor, we introduce a toy example. First, let us mention that all the notations of this section are independent of the remainder of this chapter. The message space of this toy cipher is simply F26. Then, consider the subspaces Vand Wof F26defined by


Thus, L(V)={x+V|x{00,04,08,0C}}and L(W)={y+W|y{00,08,20,28}}.

Let Sbe the permutation of F26given in Figure 4.4. We defined another permutation Sof F26satisfying S(x) = S(x) for any input xin F26except00, 01, 04, 05, 08, 09, 0Cand0D. The images of these eight specific points under Sare also given in Figure 4.4. By analogy with Section 2, the permutation Srepresents the secretS-box used to design the trapdoor whereas Srepresents the modifiedS-box given in the specification of the algorithm. Lastly, define the following keyed mappings


Figure 4.4.

The theoretical and the modified S-boxes.

representing respectively the secret and the modified round functions. Naturally, the key kcan be any element of F26.

It can be easily verified that the secret S-box Smaps L(V)to L(W). In fact, we have that


In contrast with the secret permutation S, the modified S-box Sdoes not map L(V)to L(W). However, the equality S(x) = S(x) holds with probability 56/64assuming that xis uniformly distributed over F26. This can be stated equivalently as


It should also be noted that this statement remains valid when considering their inverse mappings, that is #{yF26|S1(y)=S1(y)}=56. Indeed, if xis an element of F26such that S(x) = S(x), then y = S(x) satisfies the equality S−1(x) = S−1(y). As a consequence,


The converse inequality can be proven in the same way, establishing the equality.

Now, consider the subset Pof F26defined hereinafter. We assume that the round key is k=37. The image of Punder Sand its encryption with F37 are given below.

It should be stressed that the coset04+ Vis significantly more represented in Pthan any other coset of V. Since F37(P)maps the linear partition L(V)to L(W), the messages belonging to the same coset of Vare all mapped to the same coset of W. Therefore, the most represented coset of Win F37(P)has also ten elements.

As we have seen above, the modified round function F37does not map L(V)to L(W). Figure 4.5 displays the differences between the encryption of Pwith F37and its encryption with F37by highlighting the messages xin Psuch that S(x) ≠ S(x) (that is04, 05, and0D) and their images throughout the encryption.

Figure 4.5.

Encryption withF37andF37.

To explain these differences, let us first consider the set Qof the ten messages lying in both Pand04+ V. Knowing that the equality S(x) = S(x) holds with probability 56/64when xis uniformly distributed, it seems reasonable to assume that only 10×56/64=8.75messages of Qwill remain in the same coset when computing their images under S. By comparing with the actual messages in Q, we can see that this is a good approximation since eight messages in S(Q)belong to the same coset of W.

Needless to say, there are also eight messages in F37(Q)lying in the same coset of Wbecause the key addition preserves L(W).

We focus now to the set Pas a whole. According to the discussion above, we know that the most represented coset of Win F37(P)has at least eight elements. We have seen that the images under Sof messages lying in the same coset may not stay together. Nonetheless, the converse can also be true, and messages in different cosets may end up in the same coset. This is exactly what happens with the message0D, as illustrated in Figure 4.5. Consequently, the most represented coset in F37(P)has actually nine elements.

The fact that the most represented coset may not only lose but occasionally retrieve elements should be seen as a side effect. Its impact remains low when

  • one coset has significantly more elements than all other cosets (say at least 5 times more), and

  • when the number of messages is lower than the total number of cosets.

We must nevertheless keep this fact in mind to understand why the right key will not necessarily have the best score.

It is now time to explain how to recover the round key using only the set C=F37(P)of encrypted messages. First, we have to determine the most represented coset in C. In our example, this coset is08+ Wwith nine messages, and u=08is one of its representatives.

Now, assume that kis the round key used to encrypt C. We need to find the coset of Vwhich is mapped to u+ Wby the secret round function Fk. According to Lemma 2.8, Fkmaps t+ Vto Fk(t) + W. A representative of this coset of Vis then t= S−1(u+ k). Finally, the scoreof the guessed key kis the number of messages Fk1(c)=S1(c+k)that belong to the theoretical coset t+ V, that is to say


Figure 4.6 illustrates the scoring process applied to the right key (37) and to a wrong key (07). We naturally recover the set Pand the coset t+V=34+V=04+Vwhen using the right key. Thus, the score of k=37is equal to 10. In the same way, the score of k=07is the number of decrypted messages in the coset t+V=32+V=00+V, so score(07)=8.

Figure 4.6.

Decryption with the right key and with a wrong key.

Let us now explain why a wrong key tends to have a lower score than the right key. First, the addition of the wrong key randomizes the cosets and the messages within. Recall that when the input xis uniformly distributed, the equality S−1(x) = S−1(x) holds with probability 56/64. The most represented coset after the addition of the wrong key should then lose some elements by applying S−1. Thus, the score of any wrong key should be lower than or equal to 8.

It goes without saying that the previous discussion gives just the main idea of the cryptanalysis. For some wrong keys, the side effects are significant, and their scores can even be higher than the score of the right key, as shown in Figure 4.7. Indeed, the key 37 is one the four best keys but is not the one that has the highest score (0B). For this reason, we will not only return the best key but also the NbCand candidate keys having the highest scores when running this cryptanalysis.

Figure 4.7.

The scores for each key.

3.2. Formalization of the attack

The aim of this section is to formalize and to generalize the cryptanalysis introduced previously in Section 3.1. As we have just seen, this attack really begins in Figure 4.6. The very first data needed is the set Ccontaining the encrypted messages under the unknown key, given by


Naturally, Cis included in the set c=F26of all possible ciphertexts. Similarly, the set of all possible round keys is denoted by k=F26. Next, define the keyed mapping


Each mapping Gk:cG(k,c)is the inverse of the round function Fk. The secret counterpart of Gis G:(k,c)S1(c+k). Observe that for each round key k, the mapping Gkmaps L(W)to L(V). It is also necessary to know the most represented coset u+ Win C. Using these notations, the cryptanalysis is formalized in Algorithm 3. Finally, to include potential information on the round keys, this attack processes only a subset Kof k.

More generally, the parameters can be outlined as follows.

  • The sets of all possible keys and ciphertexts are referred to as kand c.

  • The keyed mapping G:k×cEtypically undoes (or partially undoes) one or two rounds of the encryption process.

  • Its secret counterpart is denoted by G:k×cE. It is assumed that Gkmaps a linear partition L(W)to another partition L(V)no matter the key kused.

  • The set of the given ciphertexts is denoted by C. The set of the keys that must be scored by this attack is denoted by K.

  • It is assumed that there is a coset of Wcontaining significantly more ciphertexts than any other coset. The element uof cis a representative of this coset.

  • Finally,NbCandis the number of candidate keys to return.

Remark 4.13. Taking a closer look at Algorithm 3, we can see that the structureCandrequires an efficient way to remove the lowest scored key. In our implementation,Candis a sorted array of couples (s, L) where Lis a list containing the keys having the score s. Since there are very few different scores, the sorted insertion inCandis (almost) in constant time. Removing the lowest scored key is also in constant time. Thus, the time complexity of this cryptanalysis is O(#K×#C).

4. Cryptanalysis of BEA-1 using the backdoor

The algorithmSelectKeys(see Algorithm 3) detailed into the previous section enables recovery of information on the last round key, using the fact that the round function acts as a function mapping a linear partition to another one with high probability. In this section, we explain how this algorithm can be used to recover the full 120-bit cipher key in just a few seconds on a laptop computer.

This cryptanalysis requires N = 216 chosen plaintexts and their corresponding ciphertexts encrypted under one unknown cipher key K. As BEA-1 operates on 80-bit blocks, this amounts to 2 × 640 KiB of data. The plaintexts only need to be uniformly chosen in one coset of V, and there is no requirement on the cipher key.

Our cryptanalysis is naturally divided in five distinct parts. First, we give a brief overview of each part. By hypothesis, all the plaintexts are in the same coset of V. As explained in Section 2.1, a coset of Wshould be more represented among the ciphertexts. The first part is aimed at finding a representative uof this coset. The second part consists in using the algorithmSelectKeysto find 215 candidates for the full 80-bit last round key k[11]. Next, relying on a property of the key schedule,SelectKeysis applied to these 215 candidates to find the right last key in a third part. So far, we have recovered 80 bits of the cipher key. Knowing the last round key, it is then possible to undo the last round of each ciphertext. The fourth part is really close to the first one and provides 215 candidates for the 40 remaining bits. Finally, deduce the 215 candidate cipher keys from k[11] and the preceding candidates. The last part involves testing these cipher keys on the plaintext/ciphertext pairs available to find the right one.

The presentation of our cryptanalysis is structured as follows. First, we provide the full attack in Algorithm 4. Then, each part of this algorithm is detailed in one dedicated section. It should be noted that we keep the notations of Section 2 (and not those of Section 3) in the remainder of this chapter. This work has been presented at the RusKrypto 2017 conference [31].

4.1. Part 1: finding the right output coset

Let Pdenote the set of the 216 plaintexts uniformly chosen in one coset of Vand let C={EK(p)|pP}denote the set of their ciphertexts. As said previously, we first need to find the most represented coset of Win C. Let Uibe the subspace of F210defined by Ui=LWi({*0})for each 0 ≤ i< 3. Since {*0} is a complement space of {0*} and LWiis an automorphism, we know that Uiis a complement space of LWi({0*})=Wi. Define Uas the subspace i=07Uimod4of (F210)8. Of course, Uis a complement space of W.

Let cbe a ciphertext and u=(ui)i<8be in U. Because both Uand Ware product spaces, it is easily seen that uis the unique representative in Uof the coset c+ Wif, and only if, ciand uiare in the same coset of Wimod4 for each i< 8. We deduce the following efficient way to compute the representative in Uof the coset c+ W. First, precompute the four tablesRepWisuch that, for each xin F210, RepWi[x]gives the representative in Uiof x+ Wi. These tables are just arrays of 1024 integers. Then, the representative of c=(ci)i<8is just u=(RepWimod4[ci])i<8.

To find the most represented coset of Win C, we first compute the representative in Uof each ciphertext as described above. Then, we search for the representative that occurs the most. Any naive algorithm should work since there are only 215 representatives.

4.2. Part 2: obtaining candidates for the last round key

This part is intended to find candidates for the last round key k[11] using the algorithm SelectKeys (see Algorithm 3) to undo the last round of BEA-1. However, if this algorithm is naively applied, then the last round has to be undone for each of the 216 ciphertexts and 280 possible values of k[11], yielding an order of 296 time complexity.

To solve this problem, the 215 candidates for k[11] are obtained bundle by bundle, as illustrated in Figure 4.8. First, we partially decrypt the bundles of index 3 and 7. We begin by these bundles since they both involve the S-box S3, being the most different from its secret equivalent. Following the notations ofSelectKeys, the set containing the ciphertexts is C{3,7}={(c3,c7)|cC}, and the set of the keys is K{3,7}={(k3,k7)|k3,k7F210}. The mapping used to partially decrypt the last round of these ciphertexts is