The "Conference on Cryptographic Hardware and Embedded Systems (CHES)" is the world's largest and one of the most important international conferences on hardware-related cryptography and will be held in Prague (Czech Republic) from September 10 to 14, 2023.
CASA scientists will present a total of six papers and three posters this year (overview below). The conference is organized by the International Association for Cryptologic Research (IACR) and attracts researchers as well as participants from industry and government. CHES was founded in 1999 by CASA speaker Christof Paar at Worcester Polytechnic Institute in Massachusetts.
Florian Stolz, Jan Philipp Thoma, Pascal Sasdrich, Tim Güneysu, Horst Görtz Institute for IT-Security, Ruhr University Bochum
Abstract: Microarchitectural side-channel vulnerabilities in modern processors are known to be a powerful attack vector that can be utilized to bypass common security boundaries like memory isolation. As shown by recent variants of transient execution attacks related to Spectre and Meltdown, those side channels allow to leak data from the microarchitecture to the observable architectural state. The vast majority of attacks currently build on the cache-timing side channel, since it is easy to exploit and provides a reliable, fine-grained communication channel. Therefore, many proposals for side-channel secure cache architectures have been made. However, caches are not the only source of side-channel leakage in modern processors and mitigating the cache side channel will inevitably lead to attacks exploiting other side channels. In this work, we focus on defeating side-channel attacks based on page translations.It has been shown that the Translation Lookaside Buffer (TLB) can be exploited in a very similar fashion to caches.
Since the main caches and the TLB share many features in their architectural design, the question arises whether existing countermeasures against cache-timing attacks can be used to secure the TLB. We analyze state-ofthe-art proposals for side-channel secure cache architectures and investigate their applicability to TLB side channels. We find that those cache countermeasures are notdirectly applicable to TLBs, and propose TLBcoat, a new side-channel secure TLB architecture. We provide evidence of TLB side-channel leakage on RISC-V-based Linux systems, and demonstrate that TLBcoat prevents this leakage. We implement TLBcoat using the gem5 simulator and evaluate its performance using the PARSEC benchmark suite.
Aein Rezaei Shahmirzadi, Siemen Dhooghe, Amir Moradi, Ruhr University Bochum; KU Leuven/ imec-COSIC
Abstract: Masking schemes are the most popular countermeasure to mitigate Side-Channel Analysis (SCA) attacks. Compared to software, their hardware implementations require certain considerations with respect to physical defaults, such as glitches. To counter this extended leakage effect, the technique known as Threshold Implementation (TI) has proven to be a reliable solution. However, its efficiency, namely the number of shares, is tied to the algebraic degree of the target function. As a result, the application of TI may lead to unaffordable implementation costs. This dependency is relaxed by the successor schemes where the minimum number of d + 1 shares suffice for dth-order protection independent of the function’s algebraic degree.
By this, although the number of input shares is reduced, the implementation costs are not necessarily low due to their high demand for fresh randomness. It becomes even more challenging when a joint low-latency and low-randomness cost is desired. In this work, we provide a methodology to realize the second-order glitch-extended probing-secure implementation of cubic functions with three shares while allowing to reuse fresh randomness. This enables us to construct low-latency second-order secure implementations of several popular lightweight block ciphers, including Skinny, Midori, and Prince, with a very limited number of fresh masks. Notably, compared to state-of-the-art equivalent implementations, our designs lower the latency in terms of the number of clock cycles while keeping randomness costs low.
Lukasz Chmielewski, Peter Schwabe, Lejla Batina, Niels Samwel, Björn Haase, Masaryk University, Brno, Czechia; Radboud University, The Netherlands; Riscure, The Netherlands; Max Planck Institute for Security and Privacy, Bochum, Germany; Endress+Hauser Liquid Analysis GmbH&Co. KG, Germany
Abstract: This paper describes an ECC implementation computing the X25519 keyexchange protocol on the Arm Cortex-M4 microcontroller. For providing protections against various side-channel and fault attacks we first review known attacks and countermeasures, then we provide software implementations that come with extensive mitigations, and finally we present a preliminary side-channel evaluation. To our best knowledge, this is the first public software claiming affordable protection against multiple classes of attacks that are motivated by distinct real-world application scenarios. We distinguish between X25519 with ephemeral keys and X25519 with static keys and show that the overhead to our baseline unprotected implementation is about 37% and 243%, respectively. While this might seem to be a high price to pay for security, we also show that even our (most protected) static implementation is at least as efficient as widely-deployed ECC cryptographic libraries, which offer much less protection.
Jannik Zeitschner, Nicolai Müller, Amir Moradi, Ruhr University Bochum
Abstract: A decisive contribution to the all-embracing protection of cryptographic software, especially on embedded devices, is the protection against Side-Channel Analysis (SCA) attacks. Masking countermeasures can usually be integrated into the software during the design phase. In theory, this should provide reliable protection against such physical attacks. However, the correct application of masking is a non-trivial task that often causes even experts to make mistakes. In addition to human-caused errors, micro-architectural Central Processing Unit (CPU) effects can lead even a seemingly theoretically correct implementation to fail to satisfy the desired level of security in practice. This originates from different components of< the underlying CPU which complicates the tracing of leakage back to a particular source and hence avoids making general and device-independent statements about its security.PROLEAD has recently been presented at CHES 2022 and has originally been developed as a simulation-based tool to evaluate masked hardware designs.
In this work, we adapt PROLEAD for the evaluation of masked software, and enable the transfer of the already known benefits of PROLEAD into the software world. These include (1) evaluation of larger designs compared to the state of the art, e.g. a full Advanced Encryption Standard (AES) masked implementation, and (2) formal verification under our new generic leakage model for CPUs. Concretely, we formalize leakages, observed across different CPU architectures, into a generic abstraction model that includes all these leakages and is therefore independent of a specific CPU design. Our resulting tool PROLEAD_SW allows to provide a formal statement on the security based on the derived generic model. As a concrete result, using PROLEAD_SW we evaluated the security of several publicly available masked software implementations in our new generic leakage model and reveal multiple vulnerabilities.
Marvin Staib, Amir Moradi, Ruhr University Bochum
Abstract: With the breakthrough of Deep Neural Networks, many fields benefited from its enormously increasing performance. Although there is an increasing trend to utilize Deep Learning (DL) for Side-Channel Analysis (SCA) attacks, previous works made specific assumptions for the attack to work. Especially the concept of template attacks is widely adapted while not much attention was paid to other attack strategies. In this work, we present a new methodology, that is able to exploit side-channel collisions in a black-box setting. In particular, our attack is performed in a non-profiled setting and requires neither a hypothetical power model (or let’s say a many-to-one function) nor details about the underlying implementation. While the existing non-profiled DL attacks utilize training metrics to distinguish the correct key, our attack is more efficient by training a model that can be applied to recover multiple key portions, e.g., bytes.
In order to perform our attack on raw traces instead of pre-selected samples, we further introduce a DL-based technique that can localize input-dependent leakages in masked implementations, e.g., the leakages associated to one byte of the cipher state in case of AES. We validated our approach by targeting several publicly available power consumption datasets measured from implementations protected by different masking schemes. As a concrete example, we demonstrate how to successfully recover the key bytes of the ASCAD dataset with only a single trained model in a non-profiled setting.
Alexander May, Carl Richard Theodor Schneider, Ruhr University Bochum
Abstract: Assume that we have a group G of known order q, in which we want to solve discrete logarithms (dlogs). In 1994, Maurer showed how to compute dlogs in G in poly time given a Diffie-Hellman (DH) oracle in G, and an auxiliary elliptic curve ˆÊ (Fq) of smooth order. The problem of Maurer’s reduction of solving dlogs via DH oracles is that no efficient algorithm for constructing such a smooth auxiliary curve is known. Thus, the implications of Maurer’s approach to real-world applications remained widely unclear.In this work, we explicitly construct smooth auxiliary curves for 13 commonly used, standardized elliptic curves of bit-sizes in the range [204, 256], including e.g., NIST P-256, Curve25519, SM2 and GOST R34.10. For all these curves we construct a corresponding cyclic auxiliary curve ˆÊ(Fq), whose order is 39-bit smooth, i.e., its largest factor is of bit-length at most 39 bits.This in turn allows us to compute for all divisors of the order of ˆÊ(Fq) exhaustively a codebook for all discrete logarithms. As a consequence, dlogs on ˆÊ(Fq) can efficiently be computed in a matter of seconds. Our resulting codebook sizes for each auxiliary curve are less than 29 TByte individually, and fit on our hard disk.We also construct auxiliary curves for NIST P-384 and NIST P-521 with a 65-bit and 110-bit smooth order.
Further, we provide an efficient implementation of Maurer’s reduction from the dlog computation in G with order q to the dlog computation on its auxiliary curve ˆÊ (Fq). Let us provide a flavor of our results, e.g., when G is the NIST P-256 group, the results for other curves are similar. With the help of our codebook for the auxiliary curve Ê(Fq), and less than 24,000 calls to a DH oracle in G (that we simulate), we can solve discrete logarithms on NIST P-256 in around 30 secs.From a security perspective, our results show that for current elliptic curve standards< the difficulty of solving DH is practically tightly related to the difficulty of computing dlogs. Namely, unless dlogs are easy to compute on these curves G, we provide a very concrete security guarantee that DH in G must also be hard. From a cryptanalytic perspective, our results show a way to efficiently solve discrete logarithms in the presence of a DH oracle.
1. A Thorough Evaluation of RAMBAM. Daniel Lammers; Amir Moradi; Nicolai Müller; Aein Rezaei Shahmirzadi, Ruhr University Bochum, Horst Görtz Institute for IT Security, Germany
2. Proper Evaluation of Glitch-free Masked Circuits. Nicolai Müller; Daniel Lammers; Amir Moradi, Ruhr University Bochum, Horst Görtz Institute for IT Security, Germany
3. Static Leakage in Dual-Rail Precharge. LogicsBijan Fadaeinia; Amir Moradi, Ruhr University Bochum, Horst Görtz Institute for IT Security, Germany
General note: In case of using gender-assigning attributes we include all those who consider themselves in this gender regardless of their own biological sex.