key: cord-0061124-olpu4yvq
authors: Bao, Zhenzhen; Guo, Chun; Guo, Jian; Song, Ling
title: TNT: How to Tweak a Block Cipher
date: 2020-03-25
journal: Advances in Cryptology - EUROCRYPT 2020
DOI: 10.1007/978-3-030-45724-2_22
sha: 7cbff1ffa67b008b7ba51ddb8ffbf03f230ec1ff
doc_id: 61124
cord_uid: olpu4yvq

In this paper, we propose the Tweak-aNd-Tweak (TNT for short) mode, which builds a tweakable block cipher from three independent block ciphers. TNT handles the tweak input by simply XOR-ing the unmodified tweak into the internal state of the block ciphers twice. Due to its simplicity, TNT can also be viewed as a way of turning a block cipher into a tweakable block cipher by dividing the block cipher into three chunks and adding the tweak at the two cutting points only. TNT is proven to have beyond-birthday-bound $2^{2n/3}$ security, under the assumption that the three chunks are independent secure n-bit SPRPs. It clearly brings the minimum possible overhead to both software and hardware implementations. To demonstrate this, an instantiation named TNT-AES, with 6, 6, and 6 rounds of AES as the underlying block ciphers, is proposed. Besides the inherent proven security bound and the tweak-independent rekeying feature of the TNT mode, the performance of TNT-AES is comparable with all existing TBCs designed through modular methods.

Together with the development of authenticated encryption (AE) in the CAESAR competition [1] and the ongoing lightweight cryptography competition [64], tweakable block ciphers (TBCs) are playing an increasingly important role. Besides the plaintext, a TBC takes a tweak as an additional input, which can be viewed as an index into the underlying block cipher, so that it becomes a family of (independent) block ciphers rather than a single block cipher instance. Its formalization is motivated by the need for more than one independent block cipher in some modes, e.g., OCB [67], where using multiple independent ciphers or keys could cause efficiency issues. In contrast, with a TBC, which typically lends itself to very efficient software and hardware implementations, a new block cipher instance is obtained by simply choosing a new value of the tweak.

Beyond-Birthday-Bound Security. Most of the current (tweakable) block cipher standards have a block length of 128 bits or less, providing a security level of at most 64 bits when instantiated in designs offering only birthday-bound security. Such a security level has become largely inadequate [35]. Even worse, in order to save hardware implementation costs, many lightweight block cipher designs tend to have a smaller block length such as 64 bits, providing a birthday security of only 32 bits. Hence, the need for modes providing beyond-birthday-bound (BBB) security is emerging, as has also been observed by Gueron and Lindell [35] and in the whitepaper [2].

There are two different ways to construct TBCs. Following the modular approach, they can be built from classical block ciphers via various modular constructions, and security is ensured by a reduction to that of the underlying block ciphers. Alternatively, one could appeal to (probably more efficient) dedicated algorithms, the security guarantees of which come from comprehensive cryptanalysis. Below we'll review both methods.

A classical and popular approach is to construct TBCs from existing (traditional) block ciphers in a black-box fashion. Such proposals are further divided into two classes. The "old school" approach, initiated by Liskov et al. [54], works in the so-called standard model and models the underlying block cipher as a pseudorandom permutation. The "new school" approach, recently popularized by Mennink [56], models the block cipher as an ideal cipher. The two approaches differ not only in their security assumptions, but also in their design philosophies. Concretely, standard assumption-based constructions typically try to avoid tweak-dependent rekeying, which is deemed (arguably) costly. Another shortcoming of rekeying is the unavoidable "hybrid security loss" in the security bounds [58, 69] (some constructions withstand this loss using carefully chosen parameters [17, 61]). Such a loss does not appear in the ideal cipher model, and this is leveraged by many constructions to obtain good bounds and efficiency at the same time. Indeed, ideal cipher-based TBCs have achieved ≥ n-bit security within 1 or 2 cipher calls [43, 53, 77].

In this paper we follow the standard model. In this respect, the original paper of Liskov et al. [54] proposed two constructions that were subsequently named LRW1 and LRW2 by Landecker et al. [51]. The former is based on a block cipher E with key space $\mathcal{K}_E$ and message space $\{0,1\}^n$, and is defined as

$$\mathrm{LRW1}\big((K, K'), T, X\big) = E_{K'}\big(T \oplus E_K(X)\big),$$

where $(K, K') \in \mathcal{K}_E \times \mathcal{K}_E$ is the key, $T \in \mathcal{T}$ is the tweak, and $X \in \{0,1\}^n$ is the message. Unfortunately, it is only CPA secure up to a tight birthday bound, i.e., $2^{n/2}$ adversarial queries. In fact, achieving CCA security was an important motivation for their second proposal LRW2, which is based on a block cipher E with message space $\{0,1\}^n$ and an almost XOR-universal (AXU) family of hash functions $\mathcal{H} = (H_K)_{K \in \mathcal{K}_H}$ from some set $\mathcal{T}$ to $\{0,1\}^n$, and is defined as

$$\mathrm{LRW2}\big((K, K'), T, X\big) = H_{K'}(T) \oplus E_K\big(H_{K'}(T) \oplus X\big),$$

where $(K, K') \in \mathcal{K}_E \times \mathcal{K}_H$ is the key. This construction was proved CCA secure in [54] up to a tight birthday bound. In the search for beyond-birthday-bound (BBB) secure TBCs, pioneered by Landecker et al. [51], subsequent works studied cascades of LRW2 (with independent underlying keys): the 2-cascade was first proved secure up to about $2^{2n/3}$ queries [51], later improved to a tight bound of $2^{3n/4}$ queries [44, 59], while the r-cascade for general r was proved secure up to roughly $2^{rn/(r+2)}$ adversarial queries. A somewhat independent series of works considered tweakable Even-Mansour (TEM) ciphers built upon public random permutations [18, 20, 57], which could also be instantiated with fixed-key block ciphers. It is important to note that their security is only provable in the ideal (permutation) model.
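To make the LRW2 structure concrete, the following minimal Python sketch instantiates it on 8-bit blocks. The block cipher is replaced by a seeded random permutation and the AXU hash by multiplication with a secret key in GF(2^8); both choices, the block size, and all function names are illustrative assumptions and are not taken from [54].

```python
import random

def gf_mul(a: int, b: int, poly: int = 0x11B) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result

def make_toy_cipher(seed: int) -> list[int]:
    """Stand-in for E_K (assumption): a seeded random permutation of 0..255."""
    table = list(range(256))
    random.Random(seed).shuffle(table)
    return table

def lrw2(perm: list[int], hash_key: int, tweak: int, x: int) -> int:
    """LRW2 on 8-bit blocks: H(T) xor E(H(T) xor X), with H(T) = hash_key * T."""
    mask = gf_mul(hash_key, tweak)
    return mask ^ perm[mask ^ x]

perm = make_toy_cipher(seed=1)   # plays the role of E_K
hash_key = 0x53                  # plays the role of the hash key K'
ciphertext = lrw2(perm, hash_key, tweak=0x07, x=0x2A)
```

For every fixed tweak the map is a permutation of the block space, because it only masks the input and output of a permutation with the same value.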

The Tweakey framework, introduced in 2014 by Jean et al. [41], provides a general guideline for TBC designs. The core idea is to treat the key and the tweak in the same way during the primitive design process, so that the cryptanalysis can be unified and becomes simpler than before. The word "tweakey" was coined to reflect the combined input of tweak and key. Following the Tweakey framework, various dedicated algorithms such as Deoxys-BC in the Deoxys AE design [42], SKINNY [7], and Kiasu [40] have been proposed. In detail, SKINNY takes lightweightness into account and hence uses a lightweight linear layer (a 0/1 matrix that is almost MDS rather than MDS), although it still follows the AES-like design strategy. To date, Deoxys is one of the finalists of the CAESAR competition, and SKINNY is one of the lightest TBCs in terms of area in optimized hardware implementations.

When the tweak length is long, TBC-based designs [3, 38] can take advantage of its efficiency to process additional input such as associated data. There is also a recent direction of designing TBCs of short tweaks to offer a small family of yet independent block ciphers [12] , where tweaks are mainly used as domain separators in the design of authenticated encryption schemes.

It is well known that hiding the key of a block cipher requires several iterations of a simple round function. Since the Tweakey framework does not distinguish between key and tweak, the tweak input is iterated over the same number of rounds as well. We notice that, rather than being hidden, a tweak functions as no more than an index to the block cipher in most use cases, and is even assumed to be under the attacker's full control in some cryptanalytic settings. Hence, the required level of "protection" for a tweak is essentially lower than that for the key. Inspired by this observation, a natural question is: what is the minimum number of iterations (or tweak additions) required to produce a secure TBC (especially one with BBB security) with provable security? We seek an approach that sits between the above two and (hopefully) enjoys the advantages of both, i.e., achieving (some level of) provable guarantees and high efficiency at the same time. Our result is a new design of dedicated TBCs based on AES. Our approach is "prove-then-prune", i.e., proving security and then instantiating with a scaled-down primitive (a reduced-round block cipher); this approach has been used in symmetric designs for a long time, see e.g., [60] (while the terminology is due to Hoang et al. [37]). Below we elaborate in detail.

TNT: A New TBC Construction with BBB Security. Our starting point is a new block cipher-based TBC construction with provable BBB security. Concretely, the idealized version of our mode is built upon three secret independent random permutations $\pi_1$, $\pi_2$, and $\pi_3$, and is defined as

$$\mathrm{TNT}^{\pi_1,\pi_2,\pi_3}(T, X) = \pi_3\big(T \oplus \pi_2(T \oplus \pi_1(X))\big),$$

as pictured in Fig. 1. We term our mode TNT, meaning Tweak-aNd-Tweak. It can also be viewed as a cascaded LRW1 TBC construction (if we "split" $\pi_2$ into two permutations, then the scheme turns into a cascade of two LRW1 constructions).

Fig. 1. The $\mathrm{TNT}^{\pi_1,\pi_2,\pi_3}$ mode with the notations (for the intermediate values) used in this paper.
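The idealized mode is small enough to exercise directly. The sketch below is a toy Python model on 8-bit blocks (the block size, the seeded random permutations, and all names are assumptions for illustration; the paper targets n = 128). It implements encryption and decryption, and also checks the remark that splitting the middle permutation into two permutations yields a cascade of two two-permutation LRW1 constructions.

```python
import random

N_BITS = 8                 # toy block size (assumption); the paper uses n = 128
SIZE = 1 << N_BITS

def random_permutation(rng):
    """Return (forward, inverse) lookup tables of a random permutation."""
    forward = list(range(SIZE))
    rng.shuffle(forward)
    inverse = [0] * SIZE
    for x, y in enumerate(forward):
        inverse[y] = x
    return forward, inverse

def tnt_encrypt(perms, tweak, x):
    """TNT(T, X) = pi3(T xor pi2(T xor pi1(X)))."""
    (p1, _), (p2, _), (p3, _) = perms
    return p3[tweak ^ p2[tweak ^ p1[x]]]

def tnt_decrypt(perms, tweak, y):
    """Inverse direction: undo pi3, the tweak, pi2, the tweak, then pi1."""
    (_, i1), (_, i2), (_, i3) = perms
    return i1[tweak ^ i2[tweak ^ i3[y]]]

def lrw1(pa, pb, tweak, x):
    """Two-permutation LRW1: pb(T xor pa(X))."""
    return pb[tweak ^ pa[x]]

rng = random.Random(2020)
perms = [random_permutation(rng) for _ in range(3)]

# Split pi2 as b after a: pick a at random and set b = pi2 composed with a^-1.
a, a_inv = random_permutation(rng)
b = [perms[1][0][a_inv[y]] for y in range(SIZE)]

def cascaded_lrw1(tweak, x):
    """LRW1 with (pi1, a) followed by LRW1 with (b, pi3)."""
    return lrw1(b, perms[2][0], tweak, lrw1(perms[0][0], a, tweak, x))
```

With these definitions, `cascaded_lrw1(t, x)` and `tnt_encrypt(perms, t, x)` agree on every input, since `b[a[u]]` equals the middle permutation applied to `u`.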

While the original (two-permutation-based) LRW1 construction was proved CPA secure up to the birthday bound of $2^{n/2}$ queries, which turns out to be tight, the security of TNT (or cascaded LRW1) has remained a long-standing open problem. In this paper, using the $\chi^2$ technique recently proposed by Dai et al. [24], we prove that the idealized TNT construction is CCA secure up to the BBB bound of $2^{2n/3}$ queries. To our knowledge, this constitutes the first "non-trivial" application of the $\chi^2$ technique to domain-expanding constructions, and our proof thus demonstrates the relevant issues and their solutions.

We refer to Table 1 for a summary of comparison to existing TBC constructions (we omit the TEM ciphers as they either appear a bit theoretical or are specific for sponges [57] ). It is rather difficult to make a comparison with the ideal cipher-based designs [43, 53, 56, 77] . In general, they achieve ≥n bits security (as mentioned) at the expense of a smaller safety margin (similar concern has been raised in other settings [36] ). Also, their provable bounds should be interpreted with a bit of caution [58] . In terms of efficiency, it is widely believed that tweak-dependent rekeying used in the above designs as well as [61] is a bit costly, particularly when AES-NI is available. It appears that LRW2 and its cascades are the closest designs. In short, while LRW2 and CLRW2 accept long tweaks, their uses of AXU hash are expected to result in a lower efficiency when n-bit tweaks already suffice. The additional requirement of AXU hash usually results in lower software efficiency and/or higher gate counts as additional registers and operations are needed.

Instantiation from AES. To take advantage of AES-NI for better software performance, it is natural to instantiate TNT with AES. To further improve the software performance, we reduce the numbers of rounds of the permutations $\pi_1$, $\pi_2$, and $\pi_3$ to 6, 6, and 6 respectively (rather than using the full AES for each), and name the resulting instantiation TNT-AES. Although it is no longer possible to assume that round-reduced AES is ideal, we show through comprehensive cryptanalysis that the security of TNT-AES is sound. A similar design strategy was introduced by Hoang et al. [37] and used in the design of AES-PRF [60] by Mennink and Neves. The estimated performance shows that, with help from AES-NI, TNT-AES is among the fastest TBCs in software, and in some cases it can be implemented as lightly as AES itself in area-constrained hardware environments thanks to the simplicity of TNT, i.e., smaller than most of the existing TBCs.

Organization. The rest of the paper is organized as follows. Section 2 gives the preliminaries necessary for the introduction of the new mode in Sect. 3. The security of TNT is proven in Sect. 4. Section 5 proposes a concrete AES-based design following TNT, and finally Sect. 6 concludes the paper.

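For a finite set $\mathcal{X}$, $X \xleftarrow{\$} \mathcal{X}$ denotes selecting an element from $\mathcal{X}$ uniformly at random and $|\mathcal{X}|$ denotes its cardinality.

A tweakable permutation with tweak space $\mathcal{T}$ and message space $\mathcal{M}$ is a mapping $\Pi : \mathcal{T} \times \mathcal{M} \to \mathcal{M}$ such that for any tweak $T \in \mathcal{T}$, $X \mapsto \Pi(T, X)$ is a permutation of $\mathcal{M}$. We denote by $\mathrm{TP}(\mathcal{T}, n)$ the set of all tweakable permutations with tweak space $\mathcal{T}$ and message space $\{0,1\}^n$. A tweakable block cipher with key space $\mathcal{K}$, tweak space $\mathcal{T}$, and message space $\mathcal{M}$ is a mapping $\mathrm{TBC} : \mathcal{K} \times \mathcal{T} \times \mathcal{M} \to \mathcal{M}$ such that for any key $K \in \mathcal{K}$, $(T, X) \mapsto \mathrm{TBC}(K, T, X)$ is a tweakable permutation in $\mathrm{TP}(\mathcal{T}, n)$.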

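A secure TBC should be indistinguishable from a tweakable random permutation. As our mode TNT is specified in an idealized manner, our security definition is also given for such cases. For this, we denote by $\mathcal{P}(n)$ the set of all n-bit permutations. By default, we always allow D to make forward and inverse queries to its tweakable permutation oracle (though we do not write this explicitly). With these, for the TBC construction $C^{\pi_1,\ldots,\pi_r}$ built upon r independent secret n-bit permutations, we define the advantage of any distinguisher D breaking its strong tweakable pseudorandomness (STPRP) as

$$\mathrm{Adv}^{\mathrm{stprp}}_{C^{\pi_1,\ldots,\pi_r}}(D) = \Big| \Pr\big[\pi_1, \ldots, \pi_r \xleftarrow{\$} \mathcal{P}(n) : D^{C^{\pi_1,\ldots,\pi_r}} = 1\big] - \Pr\big[\Pi \xleftarrow{\$} \mathrm{TP}(\mathcal{T}, n) : D^{\Pi} = 1\big] \Big|.$$

And for any non-negative integer q, we define the insecurity of $C^{\pi_1,\ldots,\pi_r}$ as

$$\mathrm{Adv}^{\mathrm{stprp}}_{C^{\pi_1,\ldots,\pi_r}}(q) = \max_{D} \mathrm{Adv}^{\mathrm{stprp}}_{C^{\pi_1,\ldots,\pi_r}}(D),$$

where the maximum is taken over all distinguishers D making exactly q queries to the oracle.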

The above definition focuses on the information-theoretic setting. Later, in Sect. 5, we will instantiate the multiple secret permutations $\pi_1, \ldots, \pi_r$ with multiple "independent" block ciphers $E_1, \ldots, E_r$ using the same secret key K (thus the key space does not grow with the number of permutations). Proving the indistinguishability of the two systems $(\pi_1, \ldots, \pi_r)$ and $((E_1)_K, \ldots, (E_r)_K)$ seems out of reach of current techniques (note that existing works typically instantiate $\pi_1, \ldots, \pi_r$ with the same block cipher using r independent keys $K_1, \ldots, K_r$, which differs from our setting). As such, our mode TNT will be specified only in the idealized manner.

For the proof, we will employ the $\chi^2$ method of Dai et al. [24], which we recall here. We mainly follow the notation of Dai et al. (with some necessary supplements borrowed from Chen et al. [13]). Concretely, consider two stateless systems $S_0$ and $S_1$ (e.g., $S_0$ and $S_1$ may be the tweakable random permutation $\Pi$ and the TNT construction $\mathrm{TNT}^{\pi_1,\pi_2,\pi_3}$, respectively) and any computationally unbounded deterministic distinguisher D that has query access to either of these systems. The distinguisher's goal is to distinguish the two systems. It is well known that the distinguishing advantage $\mathrm{Adv}_{S_0,S_1}(D)$ is bounded by the statistical distance $\|p_{S_0,D}(\cdot) - p_{S_1,D}(\cdot)\|$, where $p_{S_0,D}(\cdot)$ and $p_{S_1,D}(\cdot)$ are the respective probability distributions of the answers obtained by D. The $\chi^2$ method is concerned with bounding $\|p_{S_0,D}(\cdot) - p_{S_1,D}(\cdot)\|$. To this end, if we denote the maximum number of queries by q, we can define a transcript $Q = (\tau_1, \ldots, \tau_q)$ with $\tau_i = (T_i, X_i, Y_i)$, and let $Q_\ell = (\tau_1, \ldots, \tau_\ell)$ for every $\ell \le q$. The distinguisher D can make its queries adaptively, but as it makes them in a deterministic manner, the $\ell$-th query input is determined by the first $\ell - 1$ query-responses $Q_{\ell-1}$.

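For system $S_b$ with $b \in \{0, 1\}$ and a fixed tuple $Q_{\ell-1}$, we denote by $p_{S_b,D}(Q_{\ell-1})$ the probability that D interacting with $S_b$ yields transcript $Q_{\ell-1}$ for its first $\ell - 1$ queries. If $p_{S_b,D}(Q_{\ell-1}) > 0$, then we denote by $p_{S_b,D}(R \mid Q_{\ell-1})$ the conditional probability that D receives response R upon its $\ell$-th query, given transcript $Q_{\ell-1}$ of the first $\ell - 1$ queries (which deterministically fixes the $\ell$-th query). Define for any $\ell \in \{1, \ldots, q\}$ and any query-response tuple $Q_{\ell-1}$:

$$\chi^2(Q_{\ell-1}) = \sum_{R} \frac{\big(p_{S_1,D}(R \mid Q_{\ell-1}) - p_{S_0,D}(R \mid Q_{\ell-1})\big)^2}{p_{S_0,D}(R \mid Q_{\ell-1})},$$

where the sum is taken over all R in the support of the distribution $p_{S_0,D}(\cdot \mid Q_{\ell-1})$. The $\chi^2$ method states the following:

Lemma 1 ([24, Lemma 3]). Consider a fixed deterministic distinguisher D and two systems $S_0$, $S_1$. Suppose that for any $\ell \in \{1, \ldots, q\}$ and any query-response tuple $Q_\ell$, $p_{S_0,D}(Q_\ell) > 0$ whenever $p_{S_1,D}(Q_\ell) > 0$. Then:

$$\|p_{S_0,D}(\cdot) - p_{S_1,D}(\cdot)\| \le \Big(\frac{1}{2} \sum_{\ell=1}^{q} \mathbf{Ex}\big[\chi^2(Q_{\ell-1})\big]\Big)^{1/2},$$

where the expectation is taken over $Q_{\ell-1}$ of the $\ell - 1$ first answers sampled according to interaction with $S_1$.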
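As a sanity check of how the lemma is used, consider the single-query case q = 1, where the bound reduces to comparing the statistical distance of two response distributions with the square root of half their $\chi^2$ divergence. The Python sketch below evaluates both quantities for two made-up distributions over four responses; the numbers and function names are assumptions chosen purely for illustration.

```python
import math

def statistical_distance(p, q):
    """Half of the L1 distance between two distributions on the same support."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def chi_squared(p1, p0):
    """Sum over the support of p0 of (p1(R) - p0(R))^2 / p0(R)."""
    return sum((a - b) ** 2 / b for a, b in zip(p1, p0) if b > 0)

# Hypothetical response distributions of the two systems for one query.
p_s0 = [0.25, 0.25, 0.25, 0.25]   # ideal system S0
p_s1 = [0.30, 0.20, 0.27, 0.23]   # real system S1

distance = statistical_distance(p_s0, p_s1)
bound = math.sqrt(0.5 * chi_squared(p_s1, p_s0))
print(f"statistical distance = {distance:.4f}, chi-squared bound = {bound:.4f}")
```

For q > 1 the same per-query quantity is averaged over transcripts sampled from $S_1$ and summed over the queries before taking the square root.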

In this section, we describe our mode TNT. As discussed in Sect. 2, we only give its idealized description, which is built upon secret random permutations rather than efficient block ciphers.

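Concretely, TNT is built upon three independent secret random permutations $\pi_1$, $\pi_2$, and $\pi_3$, and is formally defined as

$$\mathrm{TNT}^{\pi_1,\pi_2,\pi_3}(T, X) = \pi_3\big(T \oplus \pi_2(T \oplus \pi_1(X))\big).$$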

4 Security Proof for TNT Mode

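Proof. In our proof, $S_0$ denotes the tweakable random permutation $\Pi$, while $S_1$ denotes the $\mathrm{TNT}^{\pi_1,\pi_2,\pi_3}$ TBC. The condition stated in Lemma 1, i.e., $\forall Q_\ell$, $p_{S_0,D}(Q_\ell) > 0$ whenever $p_{S_1,D}(Q_\ell) > 0$, is clearly satisfied. Given $Q_{\ell-1}$, let $T_\ell$ be the tweak of the $\ell$-th query (note that it is determined by $Q_{\ell-1}$). It is easy to see that, regardless of the direction of this query, it holds

$$p_{\Pi,D}(R \mid Q_{\ell-1}) = \frac{1}{2^n - \mu_\ell},$$

where $\mu_\ell$ denotes the number of previous queries in $Q_{\ell-1}$ made under the tweak $T_\ell$.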

The real-world probability $p_{\mathrm{TNT},D}(R \mid Q_{\ell-1})$, however, depends on the concrete state of the $\ell$-th query and $Q_{\ell-1}$, for which we distinguish eight cases as follows.

Case 1: the $\ell$-th query is forward $\mathrm{TNT}(T_\ell, X_\ell) \to Y_\ell$, and $X_\ell$,

where the sum is taken over all the vectors of intermediate values

that are possible to appear given $Q_{\ell-1}$. Now, for a certain intermediate vector Inter, it can be seen that there are three possibilities, according to which we divide all intermediate vectors into three disjoint classes A, B, and C:

• i.e., the vector Inter specifies $S_\ell$ and $W_\ell$ as the values corresponding to $X_\ell$ and $Y_\ell$, as well as an input-output relation on $\pi_2$ (subsequently abbreviated as

• i.e., the two corresponding values $U_\ell = T_\ell \oplus S_\ell$ and $V_\ell = T_\ell \oplus W_\ell$ (as before) are "contradictory" to Inter: there exists a $\pi_2$-relation (

By these, we have

With this, we derive upper and lower bounds as follows.

The Upper Bound: It is easy to see that $\beta(\mathrm{Inter}) \le \ell - 1$. By this and Eq. (7), it holds

and $(W_1, \ldots, W_{\ell-1})$. Out of these $\ge 2^{2n}/4$ choices, the number of choices that ensure the desired property $\mathrm{TNT}(T_\ell, X_\ell) = Y_\ell$ is at most $\ell - 1$, which results from the following selection process: we first pick a pair of input-output

The Lower Bound: It can be seen that $\beta(\mathrm{Inter}) \ge \mu_\ell$, since every previous query under the tweak $T_\ell$ gives rise to a unique pair

). Therefore, still from Eq. (7), we have

As before, out of the $(2^n - \alpha(Q_{\ell-1}))(2^n - \gamma(Q_{\ell-1}))$ choices of $(S_\ell, W_\ell)$, the number of choices that ensure the desired property

Summary. In all, in the first case, we have 

By these, we have

.

The Upper Bound: For this we need to consider

Then it can be seen

.

The Lower Bound: Still by Eq. (13), for any Inter ∈ A we have

where GW ("good W set") is the set of W such that:

It can be seen that $|\mathrm{GW}| \ge 2^n - \ell - \ell + \mu_\ell = 2^n - 2\ell + \mu_\ell$: the reason is, for any

By these and Eq. (12), we have

To bound $\Pr[\mathrm{Inter} \in C \mid Q_{\ell-1}]$, note that if Inter ∈ C, then there exists

as the lower bound. Further note that

Summary. In all, in the second case, we have

Case 3: the $\ell$-th query is forward $\mathrm{TNT}(T_\ell, X_\ell) \to Y_\ell$, and $X_\ell \notin Q_{\ell-1}$, $Y_\ell \in Q_{\ell-1}$. The analysis is similar to Case 2 by symmetry, resulting in the same bound

Case 4: the $\ell$-th query is forward $\mathrm{TNT}(T_\ell, X_\ell) \to Y_\ell$, and $X_\ell, Y_\ell \notin Q_{\ell-1}$. The analysis for this case closely resembles that of Case 2. First, the same upper bound

where GS is the set of S such that:

and GW is the set of W such that:

for which

Therefore,

To conclude, when the $\ell$-th query is forward, from Eqs. (11), (14), (15), and (16) we have

The remaining Cases 5, 6, 7, and 8 concern the case where the $\ell$-th query is backward, and the analyses are similar to Cases 1, 2, 3, and 4 by symmetry, resulting in the same bound

Consequently,

which implies Eq. (6) by Lemma 1.

In this section, we propose our instantiation of the TNT construction based on AES, which allows fast software implementations when AES-NI is available. We call the instantiation TNT-AES. To also benefit from the long-standing security of AES, we try to make the minimum possible modifications to AES. Following these considerations, we only extend the number of rounds, without any modification to the round function or key schedule, and pick the respective numbers of rounds for the three permutations $\pi_1$, $\pi_2$, and $\pi_3$ so that the design is secure against all relevant attacks. More explicitly, when the tweak T = 0, TNT-AES simply becomes AES with more rounds, which clearly leaves higher security margins than AES. Besides, we let the last round be complete instead of omitting the MixColumns operation. In the remainder of the section, we give the description of TNT-AES, followed by a comprehensive cryptanalysis and a comparison of software and hardware performance against other existing TBCs with similar security levels.

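The Advanced Encryption Standard (AES) [23] is an iterated block cipher with a block size of 128 bits and secret key sizes of 128, 192, and 256 bits. The internal state of AES, as well as each round key, can be represented as a 4 × 4 matrix whose elements are bytes (8 bits). The round function consists of four basic transformations in the following order (see Fig. 2):

- SubBytes (SB) is a nonlinear substitution that applies the same S-box to each byte of the internal state.
- ShiftRows cyclically shifts the i-th row of the state by i bytes to the left, for i = 0, 1, 2, 3.
- MixColumns multiplies each column of the state by a constant MDS matrix over GF($2^8$).
- AddRoundKey XORs the round key into the internal state.

The key schedule of AES transforms the master key into subkeys that are used in each of the rounds. Here, we describe the key schedule of AES-128. The master key is loaded into the words $W[0], \ldots, W[3]$, and for $i \ge 4$,

$$W[i] = \begin{cases} W[i-4] \oplus \mathrm{SubBytes}(\mathrm{RotByte}(W[i-1])) \oplus \mathrm{Rcon}[i/4] & \text{if } i \equiv 0 \pmod 4, \\ W[i-4] \oplus W[i-1] & \text{otherwise.} \end{cases}$$

The i-th round key is the concatenation of 4 words $W[4i] \,\|\, W[4i+1] \,\|\, W[4i+2] \,\|\, W[4i+3]$.

RotByte is a cyclic shift by one byte to the left, and Rcon are the round constants defined as

$$\mathrm{Rcon}[j] = (\mathrm{RC}[j], 0, 0, 0), \qquad \mathrm{RC}[1] = 1, \qquad \mathrm{RC}[j] = 2 \cdot \mathrm{RC}[j-1],$$

where '·' denotes multiplication in GF($2^8$) with irreducible polynomial $x^8 + x^4 + x^3 + x + 1$.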
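The round constants can be generated by repeated multiplication by x in GF(2^8). The Python sketch below does this for the 10 constants of AES-128 and, under the assumption that an 18-round extension of the same key schedule simply continues the sequence, for 18 constants as well; the function names are illustrative.

```python
def xtime(b: int) -> int:
    """Multiply a byte by x (i.e., by 2) in GF(2^8) mod x^8 + x^4 + x^3 + x + 1."""
    b <<= 1
    if b & 0x100:
        b ^= 0x11B
    return b

def round_constants(count: int) -> list[int]:
    """Return the first `count` constants 1, 2, 4, ... (successive powers of x)."""
    constants = []
    value = 1
    for _ in range(count):
        constants.append(value)
        value = xtime(value)
    return constants

rcon_aes128 = round_constants(10)      # constants used by the 10 rounds of AES-128
rcon_18_rounds = round_constants(18)   # assumption: an 18-round schedule continues the sequence
print([f"{c:02x}" for c in rcon_18_rounds])
```

The first reduction modulo the polynomial happens at the ninth constant, which is why the sequence stops doubling as plain integers after 0x80.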

Although AES-128 consists of 10 rounds, it can be naturally extended to more rounds, each composed of all 4 transformations (AddRoundKey ∘ MixColumns ∘ ShiftRows ∘ SubBytes), with the pre-whitening key addition before the first round kept as it is. Then, TNT-AES[$n_1, n_2, n_3$] is defined to be the extension of AES to $(n_1 + n_2 + n_3)$ rounds, i.e., $\pi_1, \pi_2, \pi_3$ consist of $n_1, n_2, n_3$ full AES rounds respectively, and the 128-bit tweak is XOR-ed into the internal state at the outputs of $\pi_1$ and $\pi_2$. It is natural to set $n_1 = n_3$ due to the symmetry of the design. Concretely, we define TNT-AES[6, 6, 6], and will use TNT-AES to denote this choice for the sake of simplicity. We will justify the round numbers in the security analysis below.
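The control flow of this definition can be sketched independently of the AES round function. In the Python sketch below, `tnt_rounds` only captures where the tweak enters; `toy_round` is a placeholder that is explicitly NOT the AES round, and the round keys are arbitrary constants rather than the output of the AES key schedule. All names and values are assumptions for illustration.

```python
MASK = (1 << 128) - 1

def tnt_rounds(state, tweak, round_keys, round_fn, n1=6, n2=6, n3=6):
    """Apply n1 + n2 + n3 rounds, XOR-ing the tweak in after pi1 and after pi2."""
    total = n1 + n2 + n3
    assert len(round_keys) == total + 1      # one whitening key plus one key per round
    state ^= round_keys[0]                   # pre-whitening key addition
    for r in range(1, total + 1):
        state = round_fn(state, round_keys[r])
        if r == n1 or r == n1 + n2:          # the two cutting points
            state ^= tweak
    return state

def plain_rounds(state, round_keys, round_fn):
    """The same iterated cipher without any tweak addition."""
    state ^= round_keys[0]
    for round_key in round_keys[1:]:
        state = round_fn(state, round_key)
    return state

def toy_round(state, round_key):
    """Placeholder round (not AES): rotate by 7 bits, add a constant, XOR the key."""
    state = ((state << 7) | (state >> 121)) & MASK
    state = (state + 0x9E3779B97F4A7C15) & MASK
    return state ^ round_key

round_keys = [(0x0123456789ABCDEF * (i + 1)) & MASK for i in range(19)]
block = 0x00112233445566778899AABBCCDDEEFF
zero_tweak_output = tnt_rounds(block, 0, round_keys, toy_round)
```

With a zero tweak the function coincides with `plain_rounds`, mirroring the remark that TNT-AES with T = 0 is AES extended to 18 rounds.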

In this subsection, we give our preliminary cryptanalysis of TNT-AES. As TNT-AES consists of 18 rounds in total, which is 8 more rounds than AES-128, we expect higher security margins for TNT-AES when the tweak is treated as a given constant. Hence, we focus only on the cases where the tweak helps the attack from the cryptanalyst's point of view, i.e., the tweak is assumed to be under the attacker's full control (open tweak) and possibly extends the existing attacks against round-reduced AES. Under such a setting, we examine the most efficient attacks, in terms of the number of attacked rounds, against TNT-AES and claim the absence of key-recovery attacks against the full TNT-AES in the single-key setting. While we do not claim security under the related-key setting for TNT-AES due to the lack of a security proof for TNT in such a setting, our preliminary cryptanalysis below shows that there is no key-recovery attack either.

Following the proven security bound of TNT, TNT-AES offers 2n/3-bit security, i.e., there exists no key-recovery attack, given that the data (the combination of tweak and plaintext, with no restriction on individual inputs) and time complexities are bounded by $2^{2 \cdot 128/3} \approx 2^{85}$. Since there is no known attack against TNT matching the $2^{2n/3}$ bound, all our security analyses of TNT-AES follow the $2^n = 2^{128}$ bound for both data and time. This allows TNT-AES to offer higher security strength should a bound better than 2n/3 bits be proven for TNT. In summary, we claim that there is no shortcut attack on TNT-AES better than the generic attacks against the corresponding TNT mode.
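The security levels quoted in the text follow from simple arithmetic on the block size. The short Python sketch below tabulates the birthday level, the proven 2n/3 level, and the full-block level for n = 128 and n = 64; the helper name and dictionary labels are assumptions made for this illustration.

```python
from fractions import Fraction

def security_bits(n: int) -> dict[str, Fraction]:
    """Exponent (in bits) of the query bound for an n-bit block under each regime."""
    return {
        "birthday (n/2)": Fraction(n, 2),
        "TNT proven bound (2n/3)": Fraction(2 * n, 3),
        "full block (n)": Fraction(n),
    }

levels_128 = security_bits(128)
levels_64 = security_bits(64)
for label, bits in levels_128.items():
    print(f"n = 128, {label}: about {float(bits):.2f} bits")
```

For n = 128 the proven level is 256/3 bits, slightly above 85, which is the figure rounded in the text.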

In what follows, explicit security margins are given under each attack method whenever possible. Before moving to the individual attack methods, we give an overview of the impact of the tweak on the security at the model level. As mentioned above, the security margin will be higher for TNT-AES when the tweak is a given constant, and we call such a tweak inactive. When the tweak is active, it may be used to cancel differences in differential attacks, or as the source of the input structure in integral attacks. Under the single-key setting, the activeness of the round functions is consistent within each of the three permutations $\pi_1$, $\pi_2$, and $\pi_3$. This allows us to use 0/1 to denote the activeness of the permutations, with 1 for active and 0 for inactive, and a simple exhaustive search shows that the possible activity patterns are {(0, 1, 0), (0, 1, 1), (1, 1, 0), (1, 0, 1), (1, 1, 1)} for differential attacks, and {(1, 1, 1), (1, 1, 0), (0, 1, 1)} for integral attacks and the like.
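The differential activity patterns can be reproduced with a few lines of Python under one plausible model of the exhaustive search, which is an assumption of this sketch rather than the authors' stated procedure: a permutation is active exactly when its input difference is nonzero, a permutation preserves activeness, and XOR-ing an active tweak into an active state may either cancel the difference or not.

```python
from itertools import product

def next_activity(prev_active: int, tweak_active: int) -> tuple[int, ...]:
    """Possible activeness of the state after XOR-ing the tweak difference."""
    if prev_active and tweak_active:
        return (0, 1)                     # the two differences may cancel or not
    return (1,) if (prev_active or tweak_active) else (0,)

def differential_activity_patterns() -> set[tuple[int, int, int]]:
    """Enumerate (pi1, pi2, pi3) activeness over all nonzero input differences."""
    patterns = set()
    for plaintext_active, tweak_active in product((0, 1), repeat=2):
        if not plaintext_active and not tweak_active:
            continue                      # no difference at all
        a1 = plaintext_active
        for a2 in next_activity(a1, tweak_active):
            for a3 in next_activity(a2, tweak_active):
                patterns.add((a1, a2, a3))
    return patterns

patterns = differential_activity_patterns()
print(sorted(patterns))
```

Under this model the enumeration yields exactly the five differential patterns listed above; the integral patterns depend on additional attack-specific constraints and are not modelled here.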

Differential and Linear Attacks. In the single-key setting, we employ the known results on 4-round AES to justify the security of TNT-AES. It is well known that there are at least 25 active S-boxes in 4 rounds of AES, which ensures that there exists no 4-round differential characteristic (resp. linear approximation) with differential probability (resp. linear correlation) higher than $2^{-6 \times 25}$ (resp. $2^{-3 \times 25}$) [22]. For the maximum expected differential and linear probability (MEDP and MELP), known results can be obtained following the work of Keliher and Sui [47], which suggests that the upper bound on the MEDP (and MELP) of 4-round AES is about $(53/2^{34})^4 \approx 2^{-110}$. For TNT-AES in the single-key setting, where the difference can be injected through the plaintext or the tweak, there is at least one active permutation among $\pi_1, \pi_2, \pi_3$, since their activity patterns fall in {(0, 1, 0), (0, 1, 1), (1, 1, 0), (1, 0, 1), (1, 1, 1)}. As long as $\pi_2$ is active, there must be more than 25 active S-boxes. The pattern (1, 0, 1) occurs only when the first addition of the tweak cancels the differences introduced from the plaintext through $\pi_1$, and the same difference is then re-introduced into $\pi_3$ by the second addition of the tweak. Since the same tweak is added and the tweak difference is the same as well, $\pi_1$ and $\pi_3$ can be concatenated with respect to differences. Note that $\pi_1 + \pi_3$ comprises 12 rounds in total, any 4 consecutive rounds of which ensure 25 active S-boxes. We also note that the security analysis of TNT under such a setting is very similar to that of AES-PRF [60], except that one has control over the extra tweak input in TNT, which is added to the unknown internal state.

In the related-key setting, we only consider differential cryptanalysis, as there is no cancellation of active S-boxes between subkeys and the state in linear approximations. In [73], it is shown that in the related-key setting, there are at least 21 active S-boxes in 6 consecutive rounds of AES-128, and the optimal 6-round differential has probability $2^{-131}$. Therefore, no useful related-key differential characteristic covering more than $\pi_2$ can be found, regardless of whether there is a difference in the tweak.

Impossible-Differential Attacks. In [71], it is proven that there does not exist any truncated impossible differential of AES that covers more than 5 rounds. Furthermore, the best impossible-differential attack, in terms of the number of attacked rounds, covers 7 rounds of AES-128 [55]. Following a discussion similar to that for differential attacks, when $\pi_2$ is active, the impossible-differential attack does not apply naturally since $\pi_2$ has 6 rounds, more than an impossible-differential distinguisher can cover. For the activity pattern (1, 0, 1), there are 12 rounds in total for $\pi_1 + \pi_3$, more than the best attack against AES-128 can cover.

The Demirci-Selçuk Meet-in-the-Middle Attack. The Demirci-Selçuk meet-in-the-middle attack led to the best cryptanalytic result on 7 rounds of AES-128 in the single-key setting, where the data/time/memory complexities are below $2^{100}$ [25]. The distinguisher covers 4 rounds, following a differential characteristic. Note that the distinguisher here tries to limit the number of possibilities for the actual values related to the differential characteristic, and it is not clear how the addition of the tweak helps reduce that number. In fact, it is not even clear that the addition of round keys can help reduce the count either. Hence, round keys are treated as independent fixed constants in such attacks, and we can treat the tweak in the same way. Therefore, the Demirci-Selçuk meet-in-the-middle attack would work in the same way on TNT-AES as on AES, and 7 rounds of TNT-AES can be attacked.

Yoyo Tricks. In [68], Rønjom et al. presented several key-independent yoyo distinguishers on 3- to 5-round AES, which require up to $2^{25.8}$ data and $2^{24.8}$ XOR computations. A key-independent impossible-differential yoyo distinguisher on 6-round AES requiring $2^{122.83}$ data was also proposed. Besides, a key-recovery attack on 5-round AES with practical complexities was devised based on the 4-round yoyo distinguisher. In these attacks, the attacker queries pairs of plaintexts to the encryption oracle, uses a swap operation on the obtained pair of ciphertexts to generate new queries to the decryption oracle, and observes the difference in the obtained pair of plaintexts; she may then continue to construct new pairs of plaintexts by swapping words in the obtained pairs, iterating the same procedure sufficiently many times. It can be seen that, instead of collecting all chosen plaintexts/ciphertexts (CPs/CCs) at once, these attacks use adaptively chosen plaintexts/ciphertexts (ACPs/ACCs). In TNT-AES, tweaks are always inserted as input to the encryption/decryption, and are never output. So, for activity pattern (0, 1, 1) (resp. (1, 1, 0) for decryption), the attacker cannot play the yoyo game by adaptively choosing and observing the differences of tweak pairs and ciphertext (resp. plaintext) pairs. Accordingly, we claim that these yoyo distinguishers and yoyo-distinguisher-based key-recovery attacks cannot be directly applied in their current form to TNT-AES.

Subspace Trail Attacks. Subspace trail cryptanalysis [32] can be seen as a generalization of invariant subspace cryptanalysis [52], but it can be launched independently of specific choices of round constants or subkeys. By analyzing subspace trails, Grassi et al. re-interpreted the 3-round truncated differential and integral distinguishers and the 4-round impossible-differential and integral distinguishers on AES [33]. Besides, new distinguishers on round-reduced AES have been found using subspace trail cryptanalysis, including the 5-round impossible-differential distinguisher [33], the 5-round multiple-of-8 distinguisher [34], the 4-round mixture-differential distinguisher [31], and the 5-round (probabilistic, threshold, and impossible) mixture-differential distinguishers [30]. Exploiting the 4-round mixture-differential distinguisher, a record for key-recovery attacks on 5-round AES-128 in the single-key model was set [4]. In [6], Bardeh and Rønjom proposed the exchange attacks. As in yoyo and mixture-differential attacks, exchange attacks also involve swap (exchange) operations on pairs of chosen data. On 6-round AES, the exchange distinguisher requires $2^{88.2}$ CPs and $2^{88.2}$ encryptions. In these attacks, new plaintext pairs are obtained by exchanging certain active diagonals of other pairs that differ in those diagonals, and an invariant property on the number of active columns in the differences of ciphertext pairs under such an exchange operation is considered.

Using subspace trail cryptanalysis and comparing with distinguishers on round-reduced AES, we analyze distinguishers and corresponding attacks on round-reduced TNT-AES. The activity patterns of the three permutations that we consider are (0, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 0), and (1, 1, 1). The activity pattern (0, 1, 0) requires that all differences come from the tweaks and are canceled by the same tweaks after passing through $n_2$ (i.e., 6) AES rounds, for which no shortcut method is known so far. Considering that all subspace-trail-based distinguishers on round-reduced AES cover no more than $n_2$ (i.e., 6) AES rounds, it seems hard to construct an exploitable subspace trail under the activity patterns (0, 1, 1), (1, 1, 0), and (1, 1, 1), which involve more than one chunk of active 6-round AES. The activity pattern (1, 0, 1) implies that the coset of the subspace related to the internal states at the end of $\pi_1$ (resulting from a set of plaintexts) equals a coset of the same subspace formed by the chosen tweaks (and the differences between tweak pairs should cancel the differences caused by the plaintext pairs), and thus the coset formed by the chosen tweaks causes the internal states at the beginning of $\pi_3$ to form a coset of the same subspace. A subspace trail on internal states can thus be seen as bypassing $\pi_2$ by choosing a coset of a subspace for the tweak. Hence, devising an attack using a subspace trail under activity pattern (1, 0, 1) requires a subspace trail attack on the concatenated permutation $\pi_3 \circ \pi_1$ of $(n_1 + n_3)$ AES rounds, which is unknown when $(n_1 + n_3) > 6$. In Appendix A, we discuss in detail the subspace-trail-based distinguishers and key-recovery attacks on round-reduced TNT-AES.

Cube Attack, Dynamic Cube Attack. AES is immune to cube attacks [27] and dynamic cube attacks [28] due to the high algebraic degree of the AES S-box. Specifically, the algebraic degree is 7 for one round of AES and increases to 32 ($< 7^2$) and 128 ($< 32 \times 7$) for two and three rounds. Therefore, AES, which has 10 rounds, is believed to be resistant to such types of attacks. So is TNT-AES, since it has more rounds than AES.

Integral Attacks and Division Property. Integral attacks utilize an integral distinguisher for 3 rounds (or 4 rounds without MixColumns in the last round), starting from ALL values for a diagonal and ending with a BALANCED output, i.e., the sum of each individual byte is 0. The best attack setting is to use the degrees of freedom from the tweak so that the distinguisher starts from the input of $\pi_2$ in the forward direction with activity pattern (0, 1, 1) (or from the output of $\pi_2$ in the backward direction with activity pattern (1, 1, 0)). The attack starts with a fixed plaintext and takes ALL values of a diagonal from the tweak. Thus, the target is $\pi_2 + \pi_3$ only, with a secret input to $\pi_2$. In the key-recovery phase, the attacker is able to append one round only, so this attack works for at most $n_1 + 5$ out of $(n_1 + n_2 + n_3)$ rounds, i.e., 6 + 5 out of 18 for TNT-AES.

The division property due to Todo et al. [74, 75] can be viewed as an extension of the integral distinguisher and has been successfully applied to many block ciphers. However, no results on AES better than the integral attack have been reported so far.
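The ALL and BALANCED notions used above can be illustrated on a single byte. The following is only a minimal sketch under stated assumptions: `SBOX` is a randomly generated bijective byte map standing in for any invertible byte layer (it is not the AES S-box), and `secret` is a hypothetical value playing the role of the unknown internal state that is XOR-ed with the tweak. It does not reproduce the 3-round AES distinguisher; it only shows that a tweak byte taking ALL values stays ALL after the unknown XOR offset and yields a BALANCED (zero XOR-sum) output after one bijective layer.

```python
import random
from functools import reduce


def random_byte_permutation(seed):
    """Return a random bijection on {0, ..., 255} (a stand-in, NOT the AES S-box)."""
    rng = random.Random(seed)
    table = list(range(256))
    rng.shuffle(table)
    return table


def xor_sum(values):
    """XOR all values together; a result of 0 means the set is BALANCED."""
    return reduce(lambda a, b: a ^ b, values, 0)


SBOX = random_byte_permutation(2020)  # hypothetical bijective byte layer
secret = 0xA7                         # hypothetical unknown internal-state byte

# The tweak byte takes ALL 256 values; the plaintext (hence `secret`) is fixed.
shifted = [secret ^ t for t in range(256)]
outputs = [SBOX[x] for x in shifted]

# ALL is preserved by the unknown XOR offset, and a bijection maps ALL to ALL,
# so the XOR-sum of the outputs is the XOR of 0..255, which is 0.
is_all = sorted(shifted) == list(range(256))
is_balanced = xor_sum(outputs) == 0
```

The attacker never needs to know `secret`: the property depends only on the tweak bytes ranging over all values.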

Slide Attacks. The slide attack was first described by Biryukov and Wagner [10, 11] in 1999 to attack round-reduced DES. The core idea is to exploit the similarity of the round functions and of the key schedule, so that the difference between the encryption process in its original form and the same process shifted by one (or a few) rounds is under control, e.g., with high probability. The addition of the tweak allows canceling the difference in at most one round, while TNT-AES has 8 more rounds than AES-128. Hence, we expect a higher security margin here. Furthermore, no slide attack against full AES-128 has been reported so far.

(Related-Subkey) Boomerang Attacks. Boomerang attacks [76] construct long distinguishers by connecting two short differential characteristics. Recently, a new tool named the Boomerang Connectivity Table [16] was proposed to formulate the dependency between the two differential characteristics and to offer guidance towards better boomerang distinguishers. We use the framework of the boomerang connectivity table when mounting boomerang attacks on TNT-AES. First, we consider the single-key setting, where the difference can be introduced on the plaintext or the tweak. When the difference is introduced only on the tweak, as shown in Fig. 3 in Appendix B, high-probability boomerang distinguishers can be constructed on $n_1 + n_2 + n_3$ rounds, where $n_1$ and $n_3$ can be any number and $n_2 < 6$. When $n_2 \geq 6$, such high-probability distinguishers do not exist. Note that these distinguishers, with zero plaintext and ciphertext difference, are not useful in key-recovery attacks. When the difference is also introduced to the plaintext or ciphertext, by making $\pi_2$ inactive through the tweak difference, the cipher can be seen as $\pi_1 \circ \pi_3$ with respect to differences, and boomerang attacks on $n_2 + r$ rounds can be mounted, where $r = 5$ is the number of rounds that boomerang attacks on AES-128 can cover. That is, only 11 rounds can be attacked. Next, we consider the related-subkey setting, where the key difference can be injected on a round key. The related-subkey setting is more powerful and usually allows longer boomerang distinguishers than the related-key setting, where the difference is injected on the master key. In the related-subkey setting, there exists a 6-round boomerang distinguisher of AES-128 with probability $2^{-109.42}$ [70]. This distinguisher can be naturally extended to the 7 middle rounds of TNT-AES with the same probability, under the condition that the tweak difference cancels the input difference or the output difference of the 6-round boomerang distinguisher.
When we add one more round to the bottom or to the top of the 7-round distinguisher, the number of active S-boxes increases by at least one, leading to a negligible probability. Therefore, there seem to be no boomerang distinguishers of TNT-AES in the related-subkey setting that cover more than 8 rounds.

Software Performance. We estimate the software performance of TNT-AES on the basis of the best results of AES software provided by Park et al. [65] . In what follows, we consider both "Plaintext" and "Tweak" as data since when used in some authenticated encryption schemes, both of them are used to process data such as associated data. Hence, the software performance is then calculated as the total number of CPU cycles divided by the total byte length of plaintext and tweak of the TBCs. To obtain a fair comparison, we estimate the same for other existing TBCs as well (omitting their additional cost for updating tweaks), using the following formula:

For TNT-AES, the number of rounds is different from that of AES. To evaluate the performance, we scale the speed of AES by the ratio of the round numbers. Accordingly, the formula we use to calculate the software speed of TNT-AES is (where AES means AES-128):

$$\text{speed of AES} \times \frac{\text{block size}}{\text{block size} + \text{tweak size}} \times \frac{\text{TNT-AES round number}}{\text{AES round number}}.$$
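The scaling formula above can be sketched as a small helper. This is an illustration only: the function name and its parameter names are hypothetical, the defaults assume 128-bit blocks and tweaks, 18 rounds for TNT-AES (6 + 6 + 6), and 10 rounds for AES-128, and the input speed used in the example is a placeholder rather than a measured value from [65].

```python
def estimate_tnt_aes_speed(aes_speed, block_bits=128, tweak_bits=128,
                           tnt_rounds=18, aes_rounds=10):
    """Estimate the TNT-AES speed (cycles per byte of plaintext plus tweak).

    aes_speed is the AES-128 speed in cycles per byte of plaintext.
    Counting the tweak as data lowers the per-byte cost by
    block / (block + tweak); the extra rounds raise it by
    tnt_rounds / aes_rounds.
    """
    data_factor = block_bits / (block_bits + tweak_bits)
    round_factor = tnt_rounds / aes_rounds
    return aes_speed * data_factor * round_factor


# Placeholder input: 1.0 cycle/byte for AES-128 (NOT a figure from the paper).
example_speed = estimate_tnt_aes_speed(1.0)
```

With equal block and tweak sizes the two factors are 1/2 and 18/10, so the estimate is 0.9 times the AES-128 figure.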

We note that the optimization technique proposed in [65] targets the CTR mode of AES and extends counter-mode caching [9, 78]. It caches and reuses intermediate results up to AES round 1 (R1) or up to AES round 2 (R2). For TNT-AES, tweaks are not added until round $(n_1 + 1)$, so this optimization technique is applicable. For other TBCs in which tweaks are added before the first round, by contrast, this technique may not be applicable. Table 2 presents the estimated software performance of TNT-AES, together with the results of other TBCs under a similar setting (considering both "Plaintext" and "Tweak" as data).

To illustrate a scenario that benefits considerably from a tweakable block cipher that processes the tweak efficiently, we compared the performance of retweaking in TNT-AES with that of rekeying in AES-128. Table 3 reports the timing results. In the AES-NI instruction set, the reciprocal throughput of the AESKEYGENASSIST instruction, which assists the key schedule, is higher than that of the AESENC instruction, which executes one round of encryption. Accordingly, Table 3 shows that rekeying in AES is slower, whereas retweaking in TNT-AES benefits greatly from the fast AES-NI encryption instruction.

Table 2. Comparison with other TBCs in software (all TBCs have a 128-bit block and a 128-bit master key). The platform is an Intel Haswell i7-4770 CPU, which is commonly used in references [8, 40, 42, 65].

Hardware Performance. We estimate the hardware performance of TNT-AES with area minimization as the optimization target. The current record for the minimized area of AES is held by the bit-serial implementations provided by Jean et al. [39]. Apart from AES, Jean et al. also provided bit-serial implementations of another tweakable block cipher, SKINNY. Using the state-of-the-art results provided by Jean et al. [39], we estimate the area and latency of TNT-AES and compare them with those of other TBCs. The results are summarized in Table 5.

In the table, results for AES, SKINNY-128-256, and Deoxys-BC-256 are all from existing studies. The results for TNT-AES are calculated from the results for AES using the following method. Let $\delta$ be the number of bits in the data path in all implementations. Let $C_{1DFF}$ be the cost of a 1-bit D flip-flop (D FF), let $C_{XOR}$ be the cost of a 2-input XOR gate, and let $C_{MUX}$ be the cost of a 2-to-1 multiplexer in a library. We use Table 4 to estimate $C_{1DFF}$, $C_{XOR}$, and $C_{MUX}$ in various libraries.

Table 3. Software performance of AES-128 when rekeying for every block and that of TNT-AES when fixing a key but retweaking for every block, both with plaintexts as data (unlike in Table 2, where we consider both "Plaintext" and "Tweak" as data), and both with the help of AES-NI (on an Intel(R) Core(TM) i7-8565U CPU at 1.80 GHz, which belongs to the products formerly known as Whiskey Lake).

Compared with implementations of AES, the additional area cost for implementations of TNT-AES comes from the cost of storing a 128-bit tweak and the cost of implementing the XOR with the tweak (we ignore the additional cost of the signals controlling the tweak/key inputs). We note that there are cases where the tweak, as an input, can be sent twice by the external provider. In such cases, the extra storage for the tweak can be saved. This is possible for a design without a "tweak-schedule". For other designs, such as those that permute the bytes of the tweak, this becomes difficult, because the permutation must be followed by the external provider if the tweak is not stored locally. In TNT-AES, there is no tweak-schedule, hence no storage for the tweak is required. When storage is required, the 128-bit tweak can be stored using 128 1-bit D FFs. To implement the XOR with the tweak, besides $\delta$ 2-input XOR gates, $\delta$ 2-to-1 multiplexers are also required for selecting the bits of the tweak after the $n_1$-th round and the $(n_1 + n_2)$-th round and selecting the constant 0 after other rounds. The additional area cost for XOR gates and multiplexers is $\delta \times (C_{XOR} + C_{MUX})$. Thus, the additional area cost is $128 \times C_{1DFF} + \delta \times (C_{XOR} + C_{MUX})$ when the tweak needs to be stored locally, and $\delta \times (C_{XOR} + C_{MUX})$ otherwise. To give a better view of the performance, we provide the gate sizes for both scenarios.

Comparison to TAES. Here, we briefly compare the performance of TNT-AES with that of TAES, an AES-based TBC used to instantiate ZOCB and ZOTR, two tweakable blockcipher modes for authenticated encryption with full absorption [3]. TAES tweaks AES-256 by simply replacing the second half of the secret key with the 128-bit tweak and keeping all other operations and parameters unchanged. Thus, it has 14 rounds, 128-bit blocks, 128-bit keys, and 128-bit tweaks.

Because TNT-AES consists of 18 AES-rounds, i.e., 4 more rounds than TAES, under the use-cases where both the key and the tweak are fixed and all subtweaks/sub-keys can be precomputed, TAES outperforms TNT-AES. Whereas, for other use-cases where retweaking is necessary, TNT-AES is expected to perform better. The reasons are as follows. TNT-AES has no tweak-schedule, while that for TAES is related to the key-schedule for AES-256. For software implementation using AES-NI, the instruction for one-round encryption outperforms that for the key-schedule as mentioned above. Thus, in retweaking use-cases, TNT-AES will be much faster than TAES. For hardware implementation, when the 128-bit tweak can be stored in external storage, TNT-AES does not need additional storage to process the tweak. The area requirement is hence much less than that of TAES, which requires local storage to hold and process the tweak.

In this paper, we proposed a new mode named TNT for constructing tweakable block ciphers with proven BBB security based on three block ciphers. To demonstrate the effectiveness of the mode, an instantiation based on AES named TNT-AES was proposed, which enjoys the long-standing security of AES, fast software performance due to the AES new instructions, and hardware efficiency due to the simplicity of the TNT mode. Following the prove-then-prune design strategy, we reduced the number of rounds of the three underlying AES-based block ciphers from 10 for the original AES to 6, 6, and 6, respectively. Our comprehensive cryptanalysis shows no security issues with TNT-AES, while the reduced number of rounds allows software and hardware performance competitive with existing TBCs designed through modular methods. We expect TNT to be a generic way to turn a block cipher into a tweakable block cipher securely, especially for lightweight block ciphers with smaller block lengths.

Potential Applications. While TNT-AES only supports n-bit tweaks, which seems a limitation compared to CLRW2 2 , such a parameter is already sufficient for many important applications. For example, many TBC-based MACs, including the chaining-via-tweak mode proposed by Liskov et al. [54] (its security was later proved optimal by Landecker et al. [51]) and the AXU-hash-based MACs proposed by Cogliati et al. [19], are built exactly from TBCs with n-bit tweaks, and thus instantiating the TBCs with CLRW2 2 (as done in [51]) clearly wastes power and causes unnecessary efficiency loss. Consequently, TNT-AES would probably be a better building block. Moreover, TNT-AES could also be used to build BBB-secure variable-length domain extenders via the construction of Chen et al. [13] or double-length block ciphers via the construction of Coron et al. [21]. As discussed in [13], such constructions may further motivate highly secure format-preserving encryption schemes, which might be a very valuable alternative to the recently broken standards.

Besides, TNT-AES could be used to replace the TBC module in the standard OCB3 mode and in the OTR mode [62] (a second-round candidate in the CAESAR competition). Both modes are optimally secure when the underlying TBC module is optimal [49, 62] but fall to the birthday bound because the TBC is instantiated with XEX-like constructions [67]. Therefore, once they are instantiated with TNT-AES, we obtain corresponding variants with BBB security up to $2^{2n/3}$ queries in both cases. Consider the application to OCB3 for concreteness. The resulting AE scheme TNT-AES-ΘCB is a ΘCB instance [49] with TNT-AES as its underlying TBC, and the security is boosted from the n/2 bits of OCB3 to 2n/3 bits. Perhaps surprisingly, the hardware efficiency might be improved as well: the original OCB3 mode requires storing an AXU hash key $E_K(0)$ during the lifetime of the master key K, which is avoided in TNT-AES-TAE.

We anticipate more such applications, especially when AES-based TBCs are used and constructed from other modes than TNT.

The Security Gap. Although the security of TNT is proven to be $2^{2n/3}$, there is no matching attack. Note that the attack strategy of Dinur et al. [26] against the 3-round Even-Mansour ciphers does not help here, since the permutations in TNT cannot be queried by the adversary, and Mennink's distinguisher [59] does not work directly either, due to the $2^{3n/2}$ offline computational complexity besides the $2^{3n/4}$ online query complexity. The same applies to the instantiation TNT-AES. It will be interesting to see this gap closed, by either improving the proven security bound or finding a better attack. We leave this as an open problem to the community.

In this section, we discuss subspace-trail-based distinguishers and key-recovery attacks on TNT-AES under the activity pattern (0, 1, 1). Attacking encryption under the pattern (1, 1, 0) can be seen as attacking decryption under the pattern (0, 1, 1). Thus, similar attacks under the pattern (1, 1, 0) can be devised once attacks under the pattern (0, 1, 1) are established. Subspace trail cryptanalysis of TNT-AES under the activity pattern (0, 1, 1) can be compared with subspace trail cryptanalysis of $(n_2 + n_3)$-round AES. The difference is that the initial coset of the concerned subspace is formed by chosen tweaks instead of chosen plaintexts, and that the elements of the coset are XOR-ed with the internal state $c^*$ (an unknown constant), which cannot be observed during the attack. Besides, the same chosen tweaks are XOR-ed again after $\pi_2$.
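The setting described above can be made concrete with a toy model. The sketch below is an assumption-laden illustration, not TNT-AES: it uses three seeded random 8-bit permutations in place of the three 6-round AES chunks, and it takes the TNT structure to be $\pi_3(\pi_2(\pi_1(M) \oplus T) \oplus T)$, i.e., the unmodified tweak XOR-ed at the two cutting points. All names (`toy_tnt`, `pi2_inputs`, the seeds, and the chosen plaintext) are hypothetical. It shows that, for a fixed plaintext, a subspace of tweaks yields inputs to $\pi_2$ that form a coset of the same subspace shifted by the unobservable constant $c^*$.

```python
import random


def random_permutation(seed, size=256):
    """Seeded random bijection on {0, ..., size-1} (toy stand-in for an AES chunk)."""
    rng = random.Random(seed)
    table = list(range(size))
    rng.shuffle(table)
    return table


PI1, PI2, PI3 = (random_permutation(seed) for seed in (1, 2, 3))


def toy_tnt(plaintext, tweak):
    """Toy TNT on 8-bit blocks: pi3(pi2(pi1(M) xor T) xor T)."""
    c_star = PI1[plaintext]            # internal state before the first tweak XOR
    middle = PI2[c_star ^ tweak]
    return PI3[middle ^ tweak]


def pi2_inputs(plaintext, tweaks):
    """Inputs to pi2 for one fixed plaintext and a list of tweaks."""
    c_star = PI1[plaintext]
    return [c_star ^ t for t in tweaks]


# A 4-dimensional subspace of tweaks: only the high nibble varies.
subspace = [a << 4 for a in range(16)]
inputs = pi2_inputs(0x3C, subspace)

# XOR-ing every input with one representative removes the unknown c*,
# leaving exactly the tweak subspace: the inputs form a coset of it.
recovered = {x ^ inputs[0] for x in inputs}
forms_coset = recovered == set(subspace)
```

Because the plaintext is fixed, $\pi_1$ is inactive and only the differences between tweaks matter, which is the activity pattern (0, 1, 1) in miniature.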

As introduced in Sect. 5.2, a series of attacks on round-reduced (no more than 6 rounds) AES based on subspace trail cryptanalysis and on the extended mixture-differential and exchange attacks were proposed in [4] [5] [6] [31] [32] [33] [34]. Among these r-round distinguishers, those that do not require knowledge of part of the secret key can be directly turned into $(n_1 + r)$-round distinguishers on round-reduced TNT-AES with the same complexity. This can be done by using a unique plaintext p and a structure of tweaks to construct the required cosets of the concerned subspace at the beginning of $\pi_2$. Although the exact cosets are unknown, the required relations among the input states at the beginning of the active permutation can be constructed using chosen tweaks. For example, when turning the 4-round mixture-differential distinguisher on AES [4, 30] into an $(n_1 + 4)$-round distinguisher on TNT-AES, if some chosen tweaks form mixture quadruples, then after being XOR-ed with a common unknown internal state, the resulting states remain mixture quadruples. There are also r-round distinguishers on round-reduced AES that require considering part of the key, and these can likewise be turned into $(n_1 + r)$-round distinguishers on TNT-AES. Take the 5-round impossible-differential distinguishers based on the impossible subspace trail on 4-round AES [33] as an example. When we turn one into an $(n_1 + 5)$-round distinguisher on TNT-AES, we use a unique plaintext p and structures of chosen tweaks (chosen in the same way as the plaintexts in the original distinguisher). Then, unlike in the original distinguisher on AES, where we guess the single-byte key difference $k_{0,0} \oplus k_{1,1}$, we guess the single-byte difference $c^*_{0,0} \oplus c^*_{1,1}$, where $c^*$ is the unknown internal state before XOR-ing the tweak. Again, the complexities of these $(n_1 + r)$-round distinguishers on TNT-AES are almost the same as those of the r-round distinguishers on AES.
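The claim that mixture quadruples survive the XOR with a common unknown state can be checked mechanically. The sketch below is a simplified illustration: a 128-bit state is modelled as a tuple of four 32-bit words, each word standing for one diagonal, and a mixture pair is formed by exchanging the diagonals selected by a mask. The helper names, the sample tweak values, and the value chosen for `c_star` are all hypothetical.

```python
def exchange(state_a, state_b, swap_mask):
    """Swap the diagonals (modelled as words) selected by swap_mask between two states."""
    new_a = tuple(b if swap else a for a, b, swap in zip(state_a, state_b, swap_mask))
    new_b = tuple(a if swap else b for a, b, swap in zip(state_a, state_b, swap_mask))
    return new_a, new_b


def xor_state(state, offset):
    """Word-wise XOR of a state with a constant offset."""
    return tuple(s ^ o for s, o in zip(state, offset))


# Two chosen tweaks (illustrative values) and the mixture pair built from them
# by exchanging diagonals 0 and 2.
t0 = (0x00000001, 0x11111111, 0x22222222, 0x33333333)
t1 = (0xAAAAAAAA, 0x11111111, 0xCCCCCCCC, 0x33333333)
mask = (True, False, True, False)
t2, t3 = exchange(t0, t1, mask)

# Unknown internal state c* (hypothetical value; the attacker never sees it).
c_star = (0xDEADBEEF, 0x01234567, 0x89ABCDEF, 0x0BADF00D)
s0, s1, s2, s3 = (xor_state(t, c_star) for t in (t0, t1, t2, t3))

# The shifted states are still related by the same diagonal exchange.
still_mixture = exchange(s0, s1, mask) == (s2, s3)
```

The check succeeds for any `c_star`, because selecting a word and XOR-ing it with a constant commute; this is why the attacker can build the quadruples from tweaks alone.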

As for key-recovery attacks exploiting those r-round distinguishers on round-reduced AES (e.g., the 5-round key-recovery attack exploiting the 4-round mixture-differential distinguisher [4] and the 6-round key-recovery attack exploiting the 5-round probabilistic mixture-differential distinguisher [30]), they add one round in front of the distinguisher and guess parts of the whitening key (e.g., key bits in $SR^{-1}(Col(i))$, or equivalently in the diagonal space $\mathcal{D}_i$, $i \in \{0, 1, 2, 3\}$) to filter out useful plaintexts from a chosen structure or to classify chosen plaintexts into properly defined sets. Such attacks may not be directly usable to construct corresponding attacks on $(n_1 + 1 + r)$-round TNT-AES by guessing part of the subkey, because the internal state is also unknown. However, by guessing the internal state before XOR-ing the tweak, we can recover this unknown state part by part (instead of recovering key bits). Using this recovered internal state, one may further analyze $\pi_1$ to recover the key. However, because the dependent unknown values lie in the diagonal $SR^{-1}(Col(i))$ and depend on the full state one round before, extending such attacks to cover one more round seems difficult. Thus, with the current techniques used in such attacks on r-round AES-128, an attack on TNT-AES is limited to no more than $(n_1 + 1 + r)$ rounds. Key-recovery attacks that use those $(n_1 + r)$-round ($r \geq 5$) distinguishers to recover the subkey in an appended (complete) round also seem very hard. This is because the cosets considered at the end of the exploited distinguishers are commonly cosets of the mixed space $\mathcal{M}_I$ ($I \subseteq \{0, 1, 2, 3\}$), which are mapped into the full state. Thus, in an $(n_1 + r + 1)$-round ($r \geq 5$) key-recovery attack, checking the distinguishable properties one round before the last round requires guessing the entire key.

Based on these analyses and together with previous analyses of other activity patterns, we believe TNT-AES is strong enough to resist subspace trail attacks. 

CAESAR: Competition for Authenticated Encryption: Security, Applicability, and Robustness

Challenges in authenticated encryption

ZOCB and ZOTR: tweakable blockcipher modes for authenticated encryption with full absorption

Improved key recovery attacks on reduced-round AES with practical data and memory complexities

A key-independent distinguisher for 6-round AES in an adaptive setting

The exchange attack: how to distinguish six rounds of AES with 2 88.2 chosen plaintexts

The SKINNY family of block ciphers and its low-latency variant MANTIS

The SKINNY family of block ciphers and its low-latency variant MANTIS. Cryptology ePrint Archive

New AES software speed records

Slide attacks

Advanced slide attacks

Elastictweak: a framework for short tweak tweakable block cipher

Short variable length domain extenders with beyond birthday bound security

Advances in Cryptology -ASIACRYPT 2016, Part I

Progress in Cryptology -INDOCRYPT 2008: 9th International Conference in Cryptology in India, Kharagpur

Boomerang connectivity table: a new cryptanalysis tool

Tweaking a block cipher: multi-user beyond-birthday-bound security in the standard model

Tweaking Even-Mansour ciphers

New constructions of MACs from (tweakable) block ciphers

Beyond-birthday-bound security for tweakable Even-Mansour ciphers with linear tweak and key mixing

A domain extender for the ideal cipher

AES Proposal: Rijndael

The Design of Rijndael: AES -The Advanced Encryption Standard. Information Security and Cryptography

Information-theoretic indistinguishability via the Chi-Squared method

Improved key recovery attacks on reduced-round AES in the single-key setting

Key recovery attacks on 3-round Even-Mansour, 8-step LED-128, and full AES^2

Cube attacks on tweakable black box polynomials

Breaking grain-128 with dynamic cube attacks

Advances in Cryptology -CRYPTO 2015, Part I

Structural truncated differential attacks on round-reduced AES. Cryptology ePrint Archive

Mixture differential cryptanalysis: a new approach to distinguishers and attacks on round-reduced AES

Subspace trail cryptanalysis and its applications to AES

Subspace trail cryptanalysis and its applications to AES

A new structural-differential property of 5-Round AES

Better bounds for block cipher modes of operation via nonce-based key derivation

Fast garbling of circuits under standard assumptions

Robust authenticated-encryption AEZ and the problem that it solves

ZMAC: a fast tweakable block cipher mode for highly secure message authentication

Bit-sliding: a generic technique for bit-serial implementations of SPN-based primitives

KIASU v1. Additional first-round candidates of CAESAR competition

Tweaks and keys for block ciphers: the TWEAKEY framework

Deoxys-II. Finalist of CAESAR competition

XHX -a framework for optimally secure tweakable block ciphers from classical block ciphers and universal hashing

Tight security of cascaded LRW2

Fast Software Encryption -FSE

Advances in Cryptology -CRYPTO 2017, Part III

Exact maximum expected differential and linear probability for 2-round advanced encryption standard (AES)

Fast Software Encryption -FSE 1999

The software performance of authenticated-encryption modes

Tweakable blockciphers with asymptotically optimal security

Tweakable blockciphers with beyond birthday-bound security

A cryptanalysis of PRINTcipher: the invariant subspace attack

Tweakable block ciphers secure beyond the birthday bound in the ideal cipher model

Tweakable block ciphers

New impossible differential attacks on AES

Optimally secure tweakable blockciphers

XPX: generalized tweakable Even-Mansour with improved security guarantees

Insuperability of the standard versus ideal model gap for tweakable blockcipher security

Towards tight security of cascaded LRW2

Optimal PRFs from blockcipher designs

Beyond-birthday-bound security based on tweakable block cipher

Parallelizable rate-1 authenticated encryption from pseudorandom functions

Pushing the limits: a very compact and a threshold implementation of AES

FACE: Fast AES CTR mode encryption techniques based on the reuse of repetitive data

Advances in Cryptology -ASIACRYPT 2018, Part I

Efficient instantiations of tweakable blockciphers and refinements to modes OCB and PMAC

Yoyo tricks with AES

Salvaging weak security bounds for blockcipher-based constructions

Boomerang connectivity table revisited

Provable security evaluation of structures against impossible differential and zero correlation linear cryptanalysis

Links among impossible differential, integral and zero correlation linear cryptanalysis

Analysis of AES, SKINNY, and others with constraint programming

Integral cryptanalysis on full MISTY1

Bit-based division property and application to Simon family

The boomerang attack

How to build fully secure tweakable blockciphers from classical blockciphers

Hongjun's optimized C-code for AES-128 and AES-256. eSTREAM project

Acknowledgements. We thank the anonymous reviewers for their helpful comments and thank Tetsu Iwata, Eik List and Kazuhiko Minematsu for fruitful discussions. 

* In column 8 for AES, in the form x/y, x is the number of cycles taken by the entire encryption, and y is the number of cycles taken by one full round, which is used to estimate the latency of TNT-AES.

† In columns 3-7 for TNT-AES, in the form x/y, x is the area when the tweak is stored locally, and y is the area when the tweak is not stored locally.

The references marked for TNT-AES are the works whose AES results we used to calculate the presented results for TNT-AES.

For latency, selecting and XOR-ing the bits of the tweak can be implemented in the same clock cycles as AddRoundKey and SubBytes, and thus costs no additional cycles. The additional cycle cost comes from the fact that TNT-AES has more rounds and that its last round is complete instead of omitting MixColumns. Thus, to estimate the latency of TNT-AES, we take the clock cycles of one full round of AES (denoted by $Cycles_{round}$), multiply them by the total number of rounds $(n_1 + n_2 + n_3)$, and add the cycles taken by the last AddRoundKey ($128/\delta$ cycles), i.e., $Cycles_{round} \times (n_1 + n_2 + n_3) + 128/\delta$, where $Cycles_{round}$ is listed in Table 5 (column 8 for AES).

From Table 5, when the tweak has to be stored locally, the hardware performance of TNT-AES is slightly inferior to those of SKINNY-128-256 and Deoxys-BC-256; otherwise, the hardware performance of TNT-AES can be superior.
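The two hardware estimates used for Table 5, namely the additional area $128 \times C_{1DFF} + \delta \times (C_{XOR} + C_{MUX})$ (or $\delta \times (C_{XOR} + C_{MUX})$ without local tweak storage) and the latency $Cycles_{round} \times (n_1 + n_2 + n_3) + 128/\delta$, can be sketched as follows. The function names are hypothetical, and the gate costs and per-round cycle count in the example are placeholder values chosen only to exercise the formulas; they are not taken from Table 4 or Table 5.

```python
def additional_area(delta, c_dff, c_xor, c_mux, store_tweak_locally, tweak_bits=128):
    """Extra area of TNT-AES over AES, in the same unit as the gate costs."""
    area = delta * (c_xor + c_mux)      # delta XOR gates plus delta multiplexers
    if store_tweak_locally:
        area += tweak_bits * c_dff      # one D flip-flop per stored tweak bit
    return area


def latency_cycles(cycles_per_round, delta, n1=6, n2=6, n3=6, block_bits=128):
    """Latency estimate: full rounds plus the final AddRoundKey (block_bits / delta cycles)."""
    return cycles_per_round * (n1 + n2 + n3) + block_bits // delta


# Placeholder library costs and round latency (NOT values from the paper).
delta = 1                                   # bit-serial data path
c_dff, c_xor, c_mux = 5.0, 2.0, 2.5         # hypothetical gate equivalents
area_stored = additional_area(delta, c_dff, c_xor, c_mux, store_tweak_locally=True)
area_external = additional_area(delta, c_dff, c_xor, c_mux, store_tweak_locally=False)
cycles = latency_cycles(cycles_per_round=100, delta=delta)
```

For a bit-serial data path the flip-flops dominate the additional area, which matches the observation that TNT-AES compares more favourably when the tweak does not have to be stored locally.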