Papers updated in last 183 days (1594 results)
Plinko: Single-Server PIR with Efficient Updates via Invertible PRFs
We study single-server private information retrieval (PIR) where a client wishes to privately retrieve the $x$-th entry from a database held by a server without revealing the index $x$. In our work, we focus on PIR with client pre-processing where the client may compute hints during an offline phase. The hints are then leveraged during queries to obtain sub-linear online time. We present Plinko, the first single-server PIR with client pre-processing that obtains optimal trade-offs between client storage and total (client and server) query time for all parameters. Our scheme uses $t = \tilde{O}(n/r)$ query time for any client storage size $r$. This matches known lower bounds of $r \cdot t = \Omega(n)$ up to logarithmic factors for all parameterizations, whereas prior works could only match the lower bound when $r = \tilde{O}(\sqrt{n})$. Moreover, Plinko is also the first updatable PIR scheme where an entry can be updated in worst-case $\tilde{O}(1)$ time.
As our main technical tool, we define the notion of an invertible pseudorandom function (iPRF) that generalizes standard PRFs to be equipped with an efficient inversion algorithm. We present a construction of an iPRF from one-way functions where forward evaluation runs in $\tilde{O}(1)$ time and inversion runs in time linear in the inverse set (output) size. Furthermore, our iPRF construction is the first that remains efficient and secure for arbitrary domain and range sizes (including small domains and ranges). In the context of single-server PIR, we show that iPRFs may be used to construct the first hint set representation where finding a hint containing an entry $x$ may be done in $\tilde{O}(1)$ time.
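To fix intuition for the interface (a minimal toy sketch, not the paper's construction: the table-based inversion below stores the whole function graph, whereas the paper inverts in time linear in the preimage set size using only one-way functions):

```python
# Toy iPRF interface: forward evaluation plus inversion that returns the
# full preimage set of a range element. Table-based inversion is shown for
# clarity only and is feasible only for tiny domains.
import hashlib

class ToyInvertiblePRF:
    def __init__(self, key: bytes, domain_size: int, range_size: int):
        self.key, self.m = key, range_size
        self.table = {}                      # range element -> preimage set
        for x in range(domain_size):
            self.table.setdefault(self.eval(x), set()).add(x)

    def eval(self, x: int) -> int:
        digest = hashlib.sha256(self.key + x.to_bytes(8, "big")).digest()
        return int.from_bytes(digest, "big") % self.m

    def invert(self, y: int) -> set:
        return self.table.get(y, set())

f = ToyInvertiblePRF(b"secret-key", domain_size=1024, range_size=64)
y = f.eval(42)
assert 42 in f.invert(y)          # inversion recovers every preimage of y
```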
Post-Quantum Online/Offline Signatures
Post-quantum signatures have high costs compared to RSA and ECDSA, in particular for smart cards. A line of work originating from Even, Goldreich, and Micali (CRYPTO'89) aimed to reduce digital signature latency by splitting up signing into an online and offline phase. The online/offline paradigm combines an ordinary long-term signature scheme with a fast, generally one-time, signature scheme. We reconsider this paradigm in the context of lattice-based post-quantum signatures in the GPV framework, with an example instantiation based on Falcon.
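To make the paradigm concrete, the following is a minimal sketch, assuming the third-party `cryptography` package; Ed25519 stands in for the long-term scheme (the paper instead instantiates the GPV framework with a lattice-based scheme such as Falcon), and a hash-based Lamport scheme serves as the fast one-time signer:

```python
# Even-Goldreich-Micali online/offline signing. Offline: certify a fresh
# one-time Lamport key with the long-term key. Online: signing the message
# costs only table lookups and hashing.
import os, hashlib
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

H = lambda b: hashlib.sha256(b).digest()

def lamport_keygen():
    sk = [[os.urandom(32), os.urandom(32)] for _ in range(256)]
    pk = [[H(s0), H(s1)] for s0, s1 in sk]
    return sk, pk

def lamport_sign(sk, msg):
    bits = int.from_bytes(H(msg), "big")
    return [sk[i][(bits >> i) & 1] for i in range(256)]

def lamport_verify(pk, msg, sig):
    bits = int.from_bytes(H(msg), "big")
    return all(H(sig[i]) == pk[i][(bits >> i) & 1] for i in range(256))

# Offline phase (message not yet known): generate and certify a one-time key.
lt_sk = Ed25519PrivateKey.generate()
ot_sk, ot_pk = lamport_keygen()
cert = lt_sk.sign(b"".join(h for pair in ot_pk for h in pair))

# Online phase: one-time-sign the actual message.
msg = b"message chosen at signing time"
ots = lamport_sign(ot_sk, msg)

# Verification: check the certificate, then the one-time signature.
lt_sk.public_key().verify(cert, b"".join(h for pair in ot_pk for h in pair))
assert lamport_verify(ot_pk, msg, ots)
```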
Mira: Efficient Folding for Pairing-based Arguments
Pairing-based arguments offer remarkably small proofs and space-efficient provers, but aggregating such proofs remains costly. Groth16 SNARKs and KZG polynomial commitments are prominent examples of this class of arguments. These arguments are widely deployed in decentralized systems, with millions of proofs generated per day. Recent folding schemes have greatly reduced the cost of proving incremental computations, such as batch proof verification. However, existing constructions require encoding pairing operations in generic constraint systems, leading to high prover overhead. In this work, we introduce Mira, a folding scheme that directly supports pairing-based arguments. We construct this folding scheme by generalizing the framework in Protostar to support a broader class of special-sound protocols. We demonstrate the versatility and efficiency of this framework through two key applications: Groth16 proof aggregation and verifiable ML inference. Mira achieves 5.8x faster prover time and 9.7x lower memory usage than the state-of-the-art proof aggregation system while maintaining a constant-size proof. To improve the efficiency of verifiable ML inference, we provide a new lincheck protocol with a verifier degree that is independent of the matrix order. We show that Mira scales effectively to larger models, overcoming the memory bottlenecks of current schemes.
A Note on P $\neq$ NP
The question of whether the complexity class P equals NP is a major unsolved problem in theoretical computer science. Proving that a language belonging to NP has no polynomial-time algorithm, i.e., proving P $\neq$ NP, is almost impossible. In any attempt to prove P $\neq$ NP, all we can do is provide the best possible clues or evidence for P $\neq$ NP.
In this paper, we introduce a new language, the Add/XNOR problem, which has the simplest structure and perfect randomness, by extending the subset sum problem. We show that the square-root complexity is necessary to solve the Add/XNOR problem, which is by far the strongest evidence for P $\neq$ NP. That is, problems that are verifiable in polynomial time are not necessarily solvable in polynomial time.
Furthermore, by giving up commutativity and associativity, we design a magma equipped with a permutation and arrive at Conjecture 1. Based on this conjecture, we obtain the Add/XOR/XNOR problem and one-way functions that are believed to require exhaustive search to solve or invert.
Shadowfax: Combiners for Deniability
As cryptographic protocols transition to post-quantum security, most adopt hybrid solutions combining pre-quantum and post-quantum assumptions. However, this shift often introduces trade-offs in terms of efficiency, compactness, and, in some cases, even security. One such example is deniability, which enables users, such as journalists or activists, to deny authorship of potentially incriminating messages. While deniability was once mainly of theoretical interest, protocols like X3DH, used in Signal and WhatsApp, provide it to billions of users. Recent work (Collins et al., PETS'25) has further bridged the gap between theory and real-world applicability. In the post-quantum setting, however, protocols like PQXDH, as well as others such as Apple's iMessage with PQ3, do not support deniability. This work investigates how to preserve deniability in the post-quantum setting by leveraging unconditional (statistical) guarantees instead of computational assumptions, thereby distinguishing deniability from confidentiality and authenticity.
As a case study, we present a hybrid authenticated key encapsulation mechanism (AKEM) that provides statistical deniability, while maintaining authenticity and confidentiality through a combination of pre-quantum and post-quantum assumptions. To this end, we introduce two combiners at different levels of abstraction. First, at the highest level, we propose a black-box construction that combines two AKEMs, showing that deniability is preserved only when both constituent schemes are deniable. Second, we present Shadowfax, a non-black-box combiner that integrates a pre-quantum NIKE, a post-quantum KEM, and a post-quantum ring signature. We demonstrate that Shadowfax ensures deniability in both dishonest and honest receiver settings. When instantiated, we rely on statistical security for the former, and on a pre- or post-quantum assumption for the latter. Finally, we provide an optimised, yet portable, implementation of a specific instantiation of Shadowfax yielding ciphertexts of 1781 bytes and public keys of 1449 bytes. Our implementation achieves competitive performance: encapsulation takes 1.9 million cycles and decapsulation takes 800,000 cycles on an Apple M1 Pro.
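The shape of the black-box combiner can be sketched as follows; the `auth_encap`/`auth_decap` interface and the KDF-based key binding are illustrative assumptions rather than the paper's exact construction:

```python
# Parallel AKEM combiner sketch: run both constituent AKEMs independently
# and bind their session keys with a KDF. Per the result above, deniability
# of the combined scheme requires *both* constituents to be deniable.
import hashlib

def kdf(*parts: bytes) -> bytes:
    return hashlib.sha256(b"|".join(parts)).digest()

def combined_encap(akem1, akem2, sender_sk, receiver_pk):
    ct1, k1 = akem1.auth_encap(sender_sk[0], receiver_pk[0])
    ct2, k2 = akem2.auth_encap(sender_sk[1], receiver_pk[1])
    return (ct1, ct2), kdf(k1, k2, ct1, ct2)

def combined_decap(akem1, akem2, receiver_sk, sender_pk, ct):
    k1 = akem1.auth_decap(receiver_sk[0], sender_pk[0], ct[0])
    k2 = akem2.auth_decap(receiver_sk[1], sender_pk[1], ct[1])
    return kdf(k1, k2, ct[0], ct[1])
```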
Optimizing Key Recovery in Impossible Cryptanalysis and Its Automated Tool
Impossible differential (ID) cryptanalysis and impossible boomerang (IB) cryptanalysis are two methods of impossible cryptanalysis against block ciphers. Since the seminal work of Boura et al. in 2014, there have been no substantial advancements in the key recovery process for impossible cryptanalysis, particularly for the IB attack. In this paper, we propose a generic key recovery framework for impossible cryptanalysis that supports arbitrary key-guessing strategies, enabling optimal key recovery attacks. Within the framework, we provide a formal analysis of probabilistic extensions in impossible cryptanalysis for the first time. Besides, for the construction of IB distinguishers, we propose a new method for finding contradictions in multiple rounds.
By incorporating these techniques, we propose a Mixed-Integer Linear Programming (MILP)-based tool for finding full ID and IB attacks. To demonstrate the power of our methods, we apply the tool to several block ciphers, including SKINNY, SKINNYee, Midori, and Deoxys-BC. Our approach yields a series of optimal results in impossible cryptanalysis, achieving significant improvements in time and memory complexities. Notably, our IB attack on SKINNYee is the first 30-round attack.
Breaking the Blindfold: Deep Learning-based Blind Side-channel Analysis
Physical side-channel analysis (SCA) operates on the foundational assumption of access to known plaintext or ciphertext. However, this assumption can be easily invalidated in various scenarios, ranging from common encryption modes like Cipher Block Chaining (CBC) to complex hardware implementations, where such data may be inaccessible. Blind SCA addresses this challenge by operating without the knowledge of plaintext or ciphertext. Unfortunately, prior such approaches have shown limited success in practical settings.
In this paper, we introduce the Deep Learning-based Blind Side-channel Analysis (DL-BSCA) framework, which leverages deep neural networks to recover secret keys in blind SCA settings. In addition, we propose a novel labeling method, Multi-point Cluster-based (MC) labeling, accounting for dependencies between leakage variables by exploiting multiple sample points for each variable, improving the accuracy of trace labeling.
We validate our approach across four datasets, including symmetric key algorithms (AES and Ascon) and a post-quantum cryptography algorithm, Kyber, with platforms ranging from high-leakage 8-bit AVR XMEGA to noisy 32-bit ARM STM32F4. Notably, previous methods failed to recover the key on the same datasets. Furthermore, we demonstrate the first successful blind SCA on a desynchronization countermeasure enabled by DL-BSCA and MC labeling. All experiments are validated with real-world SCA measurements, highlighting the practicality and effectiveness of our approach.
Non Linearizable Entropic Operator
In [Pan21] a linearization attack is proposed in order to break the cryptosystem proposed in [Gli21]. We propose here a non-linearizable operator that disables this attack, as this operator does not give rise to a quasigroup and does not obey the Latin square property.
TallyGuard: Privacy Preserving Tallied-as-cast Guarantee
This paper presents a novel approach to verifiable vote tallying using additive homomorphism, which can be appended to existing voting systems without modifying the underlying infrastructure. Existing End-to-End Verifiable (E2E-V) systems like Belenios and ElectionGuard rely on distributed trust models or are vulnerable to decryption compromises, making them less suitable for general elections. Our approach introduces a tamper-evident commitment to votes through cryptographic hashes derived from homomorphic encryption schemes such as Paillier. The proposed system provides tallied-as-cast verifiability without revealing individual votes, thereby preventing coercion. The system also provides the ability to perform public verification of results. We also show that this system can be transitioned to quantum-secure encryption like Regev for future-proofing the system. We discuss how to deploy this system in a real-world scenario, including for general political elections, analyze the security implications, and report on the limitations of this system. We believe that the proposed system offers a practical solution to the problem of verifiable vote tallying in general elections.
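A minimal demonstration of the additive homomorphism this relies on (a toy Paillier instance with textbook-size primes, nothing like deployment parameters): the product of ciphertexts decrypts to the sum of the votes, so a running tally never requires opening an individual ballot.

```python
# Toy Paillier demo of homomorphic vote tallying. Real deployments use
# 2048+-bit moduli; the tiny primes here are for readability only.
import math, random

p, q = 61, 53
n, n2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)
g = n + 1
mu = pow((pow(g, lam, n2) - 1) // n, -1, n)        # (L(g^lam mod n^2))^-1 mod n

def encrypt(m: int) -> int:
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c: int) -> int:
    return ((pow(c, lam, n2) - 1) // n) * mu % n

ballots = [1, 0, 1, 1, 0]                          # 1 = yes, 0 = no
tally_ct = 1
for b in ballots:
    tally_ct = (tally_ct * encrypt(b)) % n2        # homomorphic addition
assert decrypt(tally_ct) == sum(ballots)           # tally without opening votes
```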
A notion on S-boxes for a partial resistance to some integral attacks
In two recent papers, we introduced and studied the notion of $k$th-order sum-freedom of a vectorial function $F:\mathbb F_2^n\to \mathbb F_2^m$. This notion generalizes that of almost perfect nonlinearity (which corresponds to $k=2$) and has some relation with the resistance to integral attacks of those block ciphers using $F$ as a substitution box (S-box), by preventing the propagation of the division property of $k$-dimensional affine spaces. In the present paper, we show that this notion, which is rarely satisfied by vectorial functions, can be weakened while retaining the property that the S-boxes do not propagate the division property of $k$-dimensional affine spaces. This leads us to the property that we name $k$th-order $t$-degree-sum-freedom, whose strength decreases when $t$ increases, and which coincides with $k$th-order sum-freedom when $t=1$. The condition for $k$th-order $t$-degree-sum-freedom is that, for every $k$-dimensional affine space $A$, there exists a non-negative integer $j$ of 2-weight at most $t$ such that $\sum_{x\in A}(F(x))^j\neq 0$. We show, for a general $k$th-order $t$-degree-sum-free function $F$, that $t$ can always be taken smaller than or equal to $\min(k,m)$ under some reasonable condition on $F$, and that it is larger than or equal to $\frac k{\deg(F)}$, where $\deg(F)$ is the algebraic degree of $F$. We also show two other lower bounds: one, that is often tighter, by means of the algebraic degree of the compositional inverse of $F$ when $F$ is a permutation, and another (valid for every vectorial function) by means of the algebraic degree of the indicator of the graph of the function. We study examples for $k=2$ (case in which $t=1$ corresponds to APNness) showing that finding $j$ of 2-weight 2 can be challenging, and we begin the study of power functions, for which we prove upper bounds. We study in particular the multiplicative inverse function (used as an S-box in the AES), for which we characterize the $k$th-order $t$-degree-sum-freedom by the coefficients of the subspace polynomials of $k$-dimensional vector subspaces (deducing the exact value of $t$ when $k$ divides $n$) and we extend to $k$th-order $t$-degree-sum-freedom the result that it is $k$th-order sum-free if and only if it is $(n-k)$th-order sum-free.
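The defining condition can be checked exhaustively on toy examples. The sketch below brute-forces it for $k=2$ on the multiplicative inverse S-box over $\mathbb{F}_{2^4}$ (a small stand-in for the AES field $\mathbb{F}_{2^8}$), reporting the smallest $t$ that works across all $2$-dimensional affine spaces:

```python
# For each 2-dimensional affine space A, find an exponent j of least
# 2-weight such that the field sum of F(x)^j over A is nonzero; the maximum
# of these weights over all A is the function's t for k = 2.
N, MOD = 4, 0b10011                    # GF(2^4) = GF(2)[x]/(x^4 + x + 1)

def gmul(a, b):                        # carry-less multiplication mod MOD
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a >> N:
            a ^= MOD
        b >>= 1
    return r

def gpow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gmul(r, a)
        a, e = gmul(a, a), e >> 1
    return r

F = [gpow(x, 2**N - 2) for x in range(2**N)]       # inverse S-box, 0 -> 0

def least_weight_j(A):
    best = None
    for j in range(1, 2**N - 1):
        s = 0
        for x in A:                    # field addition in char 2 is XOR
            s ^= gpow(F[x], j)
        if s != 0:
            w = bin(j).count("1")
            best = w if best is None else min(best, w)
    return best

t = 0
for a in range(1, 2**N):
    for b in range(a + 1, 2**N):
        for c in range(2**N):
            w = least_weight_j({c, c ^ a, c ^ b, c ^ a ^ b})
            if w is not None:
                t = max(t, w)
print("2nd-order t-degree-sum-freedom of inversion over GF(2^4): t =", t)
```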
SHIFT SNARE: Uncovering Secret Keys in FALCON via Single-Trace Analysis
This paper presents a novel single-trace side-channel attack on FALCON, a lattice-based post-quantum digital signature protocol recently approved for standardization by NIST. We target the discrete Gaussian sampling operation within the FALCON key generation scheme and use a single power measurement trace to succeed. Notably, negating the 'shift right 63-bit' operation (for 64-bit values) leaks critical information about the '-1' vs. '0' assignments to intermediate coefficients. These leaks enable full recovery of the generated secret keys. The proposed attack is implemented on an ARM Cortex-M4 microcontroller running both the reference and optimized software implementations from FALCON's NIST Round 3 package.
Statistical analysis with 500k tests reveals a per-coefficient success rate of 99.9999999478% and a full key recovery success rate of 99.99994654% for FALCON-512. This work highlights the vulnerability of current software solutions to single-trace attacks and underscores the urgent need to develop single-trace resilient software for embedded systems.
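The arithmetic behind the leak is easy to model (an illustration of the operation named above, not of the measurement itself): a 64-bit arithmetic right shift by 63 yields an all-zero word for a 0 coefficient and an all-one word for a -1 coefficient, so the Hamming weight of the intermediate, a first-order driver of power consumption, differs by the full word size.

```python
# Model of the 'shift right 63-bit' intermediate for 64-bit values.
MASK = (1 << 64) - 1

def sar63(x: int) -> int:
    """64-bit arithmetic shift right by 63 (two's complement semantics)."""
    u = x & MASK
    return MASK if u >> 63 else 0

for coeff in (0, -1):
    print(f"coefficient {coeff:>2}: Hamming weight after shift = "
          f"{bin(sar63(coeff)).count('1')}")
# coefficient  0: Hamming weight after shift = 0
# coefficient -1: Hamming weight after shift = 64
```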
A Comprehensive Formal Security Analysis of OPC UA
OPC UA is a standardized Industrial Control System (ICS) protocol, deployed in critical infrastructures, that aims to ensure security. The forthcoming version 1.05 includes major changes in the underlying cryptographic design, including a Diffie-Hellman-based key exchange, as opposed to the previous RSA-based version. Version 1.05 is supposed to offer stronger security, including Perfect Forward Secrecy (PFS).
We perform a formal security analysis of the security protocols specified in OPC UA v1.05 and v1.04, for the RSA-based and the new DH-based mode, using the state-of-the-art symbolic protocol verifier ProVerif. Compared to previous studies, our model is much more comprehensive: it includes the new protocol version and the combination of the different sub-protocols for establishing secure channels, sessions and their management, covering a large range of possible configurations. This results in one of the largest models ever studied in ProVerif, raising many verification challenges, mainly due to the complexity of the state machine. We discuss how we mitigated this complexity to obtain meaningful analysis results. Our analysis uncovered several new vulnerabilities, which have been reported to and acknowledged by the OPC Foundation. We designed and proposed provably secure fixes, most of which are included in the upcoming version of the standard.
Cycles and Cuts in Supersingular L-Isogeny Graphs
Supersingular elliptic curve isogeny graphs underlie isogeny-based cryptography. For isogenies of a single prime degree $\ell$, their structure has been investigated graph-theoretically.
We generalise the notion of $\ell$-isogeny graphs to $L$-isogeny graphs (studied in the prime field case by Delfs and Galbraith), where $L$ is a set of small primes dictating the allowed isogeny degrees in the graph. We analyse the graph-theoretic structure of $L$-isogeny graphs. Our approaches may be put into two categories: cycles and graph cuts.
On the topic of cycles, we provide: a count for the number of non-backtracking cycles in the $L$-isogeny graph using traces of Brandt matrices; an efficiently computable estimate based on this approach; and a third ideal-theoretic count for a certain subclass of $L$-isogeny cycles. We provide code to compute each of these three counts.
On the topic of graph cuts, we compare several algorithms to compute graph cuts which minimise a measure called the edge expansion, outlining a cryptographic motivation for doing so. Our results show that a greedy neighbour algorithm out-performs standard spectral algorithms for computing optimal graph cuts. We provide code and study explicit examples.
Furthermore, we describe several directions of active and future research.
KZH-Fold: Accountable Voting from Sublinear Accumulation
Accumulation schemes are powerful primitives that enable distributed and incremental verifiable computation with less overhead than recursive SNARKs. However, existing schemes with constant-size accumulation verifiers suffer from linear-sized accumulators and deciders, leading to linear-sized proofs that are unsuitable in distributed settings. Motivated by the need for bandwidth-efficient accountable voting protocols, (I) we introduce KZH, a novel polynomial commitment scheme, and (II) KZH-fold, the first sublinear accumulation scheme, in which the verifier performs only $3$ group scalar multiplications while the accumulator size and decider time are $O(n^{1/2})$. Our scheme generalizes to achieve accumulator and decider complexity of $k \cdot n^{1/k}$ with verifier complexity $k$. Using the BCLMS compiler, (III) we build an IVC/PCD scheme with sublinear proof and decider. (IV) Next, we propose a new approach to non-uniform IVC, where the cost of proving a step is proportional only to the size of the step instruction circuit, and, unlike previous approaches, the witness size is not linear in the number of instructions. (V) Leveraging these advancements, we demonstrate the power of KZH-fold by implementing an accountable voting scheme using a novel signature aggregation protocol supporting millions of participants, significantly reducing communication overhead and verifier time compared to BLS-based aggregation. We implemented and benchmarked our protocols; KZH-fold achieves a 2000x reduction in communication and a 50x improvement in decider time over Nova when proving 2000 Poseidon hashes, at the cost of 3x the prover time.
Error floor prediction with Markov models for QC-MDPC codes
Quasi-cyclic moderate-density parity check (QC-MDPC) code-based encryption schemes under iterative decoders offer highly competitive performance in the quantum-resistant space of cryptography, but the decoding-failure rate (DFR) of these algorithms is not well understood. The DFR decreases extremely rapidly as the ratio of code length to error bits increases, then decreases much more slowly, in regimes known as the waterfall and the error floor, respectively.
This work establishes three successively more detailed probabilistic models of the DFR for iterative decoders for QC-MDPC codes: the simplified model, the refined model for perfect keys, and the refined model for all keys. The models are built upon a Markov model introduced by Sendrier and Vasseur that closely predicts decoding behavior in the waterfall region but does not capture the error floor behavior. The simplified model introduces a modification that captures the dominant contributor to error floor behavior: convergence to near codewords, as introduced by Vasseur in his PhD thesis. The refined models give more accurate predictions taking into account certain structural features of specific keys.
Our models are based on the step-by-step decoder, which is highly simplified and experimentally displays worse decoding performance than parallel decoders used in practice. Despite the use of the simplified decoder, we obtain an accurate prediction of the DFR in the error floor and demonstrate that the error floor behavior is dominated by convergence to a near codeword during a failed decoding instance. Furthermore, we have run this model for a simplified version of the QC-MDPC code-based cryptosystem BIKE to better ascertain whether the DFR is low enough to achieve IND-CCA2 security. Our model for a modified version of BIKE 1 gives a DFR which is below $2^{-129.5}$, using a block length $r = 13477$ instead of the BIKE 1 parameter $r = 12323$.
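The kind of computation these models perform can be illustrated with an absorbing Markov chain (the transition probabilities below are made up for illustration; the paper's models derive them from decoder behavior), where the modeled DFR is the probability of absorption into the failure state:

```python
# Toy absorbing Markov chain: transient states track residual errors, with
# absorbing "success" and "failure" states. The DFR is the absorption
# probability into failure, B = (I - Q)^{-1} R.
import numpy as np

S = 10                                       # transient states: 1..10 errors
Q = np.zeros((S, S))                         # transient -> transient
R = np.zeros((S, 2))                         # transient -> [success, failure]
p_down, p_up = 0.6, 0.3                      # illustrative per-iteration dynamics
for e in range(S):
    if e == 0: R[e, 0] = p_down              # last error corrected -> success
    else:      Q[e, e - 1] = p_down
    if e == S - 1: R[e, 1] = p_up            # error weight blows up -> failure
    else:          Q[e, e + 1] = p_up
    Q[e, e] = 1 - p_down - p_up

B = np.linalg.solve(np.eye(S) - Q, R)        # absorption probabilities
print("modeled DFR starting from 3 errors:", B[2, 1])
```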
On Maximum Size Simultaneous Linear Approximations in Ascon and Keccak and Related Translation and Differential Properties
In this paper we study the S-box known as Chi or $\chi$, initially proposed by Daemen in 1995 and very widely used ever since in Keccak, Ascon, and many other designs. Ciphers of this type are typically analyzed [in recent research] in terms of subspace trail attacks [TeDi19] and vector space invariants. An interesting question is then when different spaces are mapped to each other by translation by a constant.
In this paper we relax this fundamental question and consider arbitrary sets of points and their translations. We generalize previous S-box partial linearization results on Keccak and Ascon from Eurocrypt 2017. We introduce new ways to linearize the Ascon S-box to the maximum possible extent. On this basis we show further remarkable properties and some surprising connections between [simultaneous] linear and [prominent] differential properties. We exhibit a new family of maximum size and optimal approximation properties with 11 points, beyond the maximum size of any set in the DDT table.
We prove a theorem which guarantees that this type of properties are stable by arbitrary input-side translations which holds for all quadratic permutations. All this will be placed in the context of previous work on classification of 5-bit quadratic permutations.
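For reference, the 5-bit Chi map and its DDT can be enumerated in a few lines; the sketch below (assuming the standard Keccak formulation of Chi, of which Ascon's S-box is an affine-equivalent) recovers the maximal DDT entry that the 11-point sets above are measured against:

```python
# 5-bit Chi: y_i = x_i XOR ((NOT x_{i+1}) AND x_{i+2}), indices mod 5.
def chi(x: int) -> int:
    bit = lambda v, i: (v >> (i % 5)) & 1
    return sum((bit(x, i) ^ ((bit(x, i + 1) ^ 1) & bit(x, i + 2))) << i
               for i in range(5))

# Difference distribution table: ddt[dx][dy] counts x with
# chi(x) ^ chi(x ^ dx) == dy.
ddt = [[0] * 32 for _ in range(32)]
for x in range(32):
    for dx in range(32):
        ddt[dx][chi(x) ^ chi(x ^ dx)] += 1

print("max DDT entry:", max(ddt[dx][dy] for dx in range(1, 32)
                            for dy in range(32)))   # 8 for the 5-bit Chi
```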
Efficient Quantum-safe Distributed PRF and Applications: Playing DiSE in a Quantum World
We propose the first $\textit{distributed}$ version of a simple, efficient, and provably quantum-safe pseudorandom function (PRF). The distributed PRF (DPRF) supports arbitrary threshold access structures based on the hardness of the well-studied Learning with Rounding (LWR) problem. Our construction (abbreviated as $\mathsf{PQDPRF}$) practically outperforms not only existing constructions of DPRFs based on lattice-based assumptions, but also (in terms of evaluation time) existing constructions of: (i) classically secure DPRFs based on discrete-log-hard groups, and (ii) quantum-safe DPRFs based on any generic quantum-safe PRF (e.g. AES). The efficiency of $\mathsf{PQDPRF}$ stems from the extreme simplicity of its construction, consisting of a simple inner product computation over $\mathbb{Z}_q$, followed by a rounding to a smaller modulus $p < q$. The key technical novelty of our proposal lies in our proof technique, where we prove the correctness and post-quantum security of $\mathsf{PQDPRF}$ (against semi-honest corruptions of any less than threshold number of parties) for a polynomial modulus-to-modulus ratio $q/p$.
Our proposed DPRF construction immediately enables efficient yet quantum-safe instantiations of several practical applications, including key distribution centers, distributed coin tossing, long-term encryption of information, etc. We showcase a particular application of $\mathsf{PQDPRF}$ in realizing an efficient yet quantum-safe version of distributed symmetric-key encryption ($\mathsf{DiSE}$ -- originally proposed by Agrawal et al. in CCS 2018), which we call $\mathsf{PQ-DiSE}$. For semi-honest adversarial corruptions across a wide variety of corruption thresholds, $\mathsf{PQ-DiSE}$ substantially outperforms existing instantiations of $\mathsf{DiSE}$ based on discrete-log hard groups and generic PRFs (e.g. AES). We illustrate the practical efficiency of our $\mathsf{PQDPRF}$ via prototype implementation of $\mathsf{PQ-DiSE}$.
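The evaluation algebra can be sketched in a few lines (toy parameters; in the actual protocol each party rounds its partial evaluation locally, and the correctness proof handles the resulting rounding error, which this sketch sidesteps by combining unrounded partials):

```python
# LWR-style evaluation: PRF_k(x) = round_p(<H(x), k> mod q), with the key
# additively shared so each party contributes a partial inner product.
import hashlib, random

q, p, n = 2**16, 2**8, 64                 # toy moduli and dimension, q/p = 2^8

def hash_to_vector(x: bytes):             # public a = H(x) in Z_q^n
    out, ctr = [], 0
    while len(out) < n:
        d = hashlib.sha256(x + ctr.to_bytes(4, "big")).digest()
        out += [int.from_bytes(d[i:i + 2], "big") % q for i in range(0, 32, 2)]
        ctr += 1
    return out[:n]

def round_p(v: int) -> int:               # rounding Z_q -> Z_p
    return (v * p + q // 2) // q % p

key = [random.randrange(q) for _ in range(n)]
s1 = [random.randrange(q) for _ in range(n)]          # additive key shares
s2 = [(k - a) % q for k, a in zip(key, s1)]

a = hash_to_vector(b"some input")
partial = lambda s: sum(x * y for x, y in zip(a, s)) % q
exact = round_p(partial(key))                          # direct evaluation
combined = round_p((partial(s1) + partial(s2)) % q)    # from partial evaluations
assert combined == exact                               # shares recombine exactly
```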
Towards Verifiable FHE in Practice: Proving Correct Execution of TFHE's Bootstrapping using plonky2
In this work we demonstrate for the first time that a full FHE bootstrapping operation can be proven using a SNARK in practice. We do so by designing an arithmetic circuit for the bootstrapping operation and prove it using plonky2. We are able to prove the circuit on an AWS Hpc7a instance in under 20 minutes. Proof size is about 200kB and verification takes less than 10ms. As the basis of our bootstrapping operation we use TFHE's programmable bootstrapping and modify it in a few places to more efficiently represent it as an arithmetic circuit (while maintaining full functionality and security). In order to achieve our results in a memory-efficient way, we take advantage of the structure of the computation and plonky2's ability to efficiently prove its own verification circuit to implement a recursion-based IVC scheme. Lastly, we present a security proof in the UC model that captures active attacks in real world applications of verifiable FHE and augment our prototype to fit such applications.
Quantum function secret sharing
We propose a quantum function secret sharing scheme in which the communication is exclusively classical. In this primitive, a classical dealer distributes a secret quantum circuit $C$ by providing shares to $p$ quantum parties. Given an input state $\ket{\psi}$ and a projection $\Pi$, the parties compute values $y_i$ that they then classically communicate back to the dealer, who can compute $\lVert\Pi C\ket{\psi}\rVert^2$ using only classical resources. Moreover, the shares do not leak much information about the secret circuit $C$.
Our protocol for quantum function secret sharing uses the Cayley path, a tool that has been extensively used to support quantum primacy claims. More concretely, the shares of $C$ correspond to randomized versions of $C$ which are delegated to the quantum parties, and the reconstruction can be done by extrapolation. Our scheme has two limitations, which we prove to be inherent to our techniques: first, our scheme is only secure against single adversaries, and we show that if two parties collude, then they can break its security. Second, the evaluation done by the parties requires time exponential in the number of gates.
ProbeShooter: A New Practical Approach for Probe Aiming
Electromagnetic side-channel analysis is a powerful method for monitoring processor activity and compromising cryptographic systems in air-gapped environments. As analytical methodologies and target devices evolve, the importance of leakage localization and probe aiming becomes increasingly apparent for capturing only the desired signals with a high signal-to-noise ratio. Despite its importance, there remains substantial reliance on unreliable heuristic approaches and inefficient exhaustive searches. Furthermore, related studies often fall short in terms of feasibility, practicality, and performance, and are limited to controlled DUTs and low-end MCUs.
To address the limitations and inefficiencies of the previous approaches, we propose a novel methodology, ProbeShooter, for leakage localization and probe aiming. This approach leverages new insights into the spatial characteristics of amplitude modulation and intermodulation distortion in processors. As a result, ProbeShooter provides substantial improvements in various aspects: 1) it is applicable not only to simple MCUs but also to complex SoCs, 2) it effectively handles multi-core systems and dynamic frequency scaling, 3) it can be applied to uncontrollable DUTs, making it viable for constrained real-world attacks, and 4) it performs significantly faster than previous methods. To demonstrate this, we experimentally evaluate ProbeShooter on a high-end MCU (the NXP i.MX RT1061 featuring a single ARM Cortex-M7 core) and a complex SoC (the Broadcom BCM2711 of the Raspberry Pi 4 Model B, featuring four ARM Cortex-A72 cores).
Distributional Private Information Retrieval
A private-information-retrieval (PIR) scheme lets a client fetch a record from a remote database without revealing which record it fetched. Classic PIR schemes treat all database records the same but, in practice, some database records are much more popular (i.e., commonly fetched) than others. We introduce distributional PIR, a new type of PIR that can run faster than classic PIR---both asymptotically and concretely---when the popularity distribution is skewed. Distributional PIR provides exactly the same cryptographic privacy as classic PIR. The speedup comes from a relaxed form of correctness: distributional PIR guarantees that in-distribution queries succeed with good probability, while out-of-distribution queries succeed with lower probability. Because of its relaxed correctness, distributional PIR is best suited for applications where "best-effort" retrieval is acceptable. Moreover, for security, a client's decision to query the server must be independent of whether its past queries were successful.
We construct a distributional-PIR scheme that makes black-box use of classic PIR protocols, and prove a lower bound on the server runtime of a natural class of distributional-PIR schemes. On two real-world popularity distributions, our construction reduces compute costs by $5$-$77\times$ compared to existing techniques. Finally, we build CrowdSurf, an end-to-end system for privately fetching tweets, and show that distributional PIR reduces the end-to-end server cost by $8\times$.
MicroSecAgg: Streamlined Single-Server Secure Aggregation
This work introduces MicroSecAgg, a framework that addresses the intricacies of secure aggregation in the single-server landscape, specifically tailored to situations where distributed trust among multiple non-colluding servers presents challenges. Our protocols are purpose-built to handle situations featuring multiple successive aggregation phases among a dynamic pool of clients who can drop out during the aggregation. Our protocols target three distinct cases: firstly, secure aggregation within a small input domain; secondly, secure aggregation within a large input domain; and finally, facilitating federated learning for cases where moderately sized models are considered. Compared to the prior works of Bonawitz et al. (CCS 2017), Bell et al. (CCS 2020), and the recent work of Ma et al. (S&P 2023), our approach significantly reduces the overheads. In particular, MicroSecAgg halves the round complexity to just 3 rounds, thereby offering substantial improvements in communication cost efficiency. Notably, it outperforms Ma et al. by a factor of $n$ on the user side, where $n$ represents the number of users. Furthermore, in MicroSecAgg the computation complexity of each aggregation per user exhibits a logarithmic growth with respect to $n$, contrasting with the linearithmic or quadratic growth observed in Ma et al. and Bonawitz et al., respectively. We also require linear (in $n$) computation work from the server, as opposed to quadratic in Bonawitz et al., or linearithmic in Ma et al. and Bell et al. In the realm of federated learning, a delicate tradeoff comes into play: our protocols shine brighter as the number of participating parties increases, yet they exhibit diminishing computational efficiency as the sheer volume of weights/parameters increases significantly.
We report an implementation of our system and compare the performance against prior works, demonstrating that MicroSecAgg significantly reduces the computational burden and the message size.
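For context, the pairwise-masking idea underlying this line of single-server protocols fits in a few lines (dropout recovery, where the protocols above differ, is omitted): masks shared between each pair of users cancel in the sum, so the server learns only the aggregate.

```python
# Toy mask-based secure aggregation: each pair of users shares a mask that
# one adds and the other subtracts, so all masks cancel in the server's sum.
import random

M = 2**32                                  # inputs and masks live in Z_M
users = range(5)
inputs = {i: random.randrange(1000) for i in users}
seed = {(i, j): random.randrange(M) for i in users for j in users if i < j}

def masked_input(i: int) -> int:
    m = inputs[i]
    for j in users:
        if i < j:
            m = (m + seed[(i, j)]) % M
        if j < i:
            m = (m - seed[(j, i)]) % M
    return m

aggregate = sum(masked_input(i) for i in users) % M
assert aggregate == sum(inputs.values()) % M   # masks cancel pairwise
```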
On pairs of primes with small order reciprocity
We give a sieving algorithm for finding pairs of primes with small multiplicative orders modulo each other. Such pairs are a necessary ingredient for obtaining constructions of $2$-cycles of pairing-friendly curves, which have found use in cryptographic applications. Our database of examples suggests that, with the exception of a well-known infinite family of such primes, instances become increasingly rare as the size of the primes increases. This leads to some interesting open questions for which we hope our database prompts further investigation.
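The condition itself is easy to test naively for small primes; the toy search below (not the paper's sieving algorithm; the bound and maximal order are arbitrary choices) lists pairs whose multiplicative orders modulo each other are both at most 6:

```python
# Naive search for prime pairs (r, s) with ord_s(r) <= t and ord_r(s) <= t.
def is_prime(m: int) -> bool:
    if m < 2:
        return False
    return all(m % d for d in range(2, int(m ** 0.5) + 1))

def has_small_order(a: int, m: int, t: int) -> bool:
    x = a % m                      # checks a^1, ..., a^t mod m against 1
    for _ in range(t):
        if x == 1:
            return True
        x = x * a % m
    return False

BOUND, MAX_ORDER = 2000, 6
pairs = [(r, s) for r in range(3, BOUND) if is_prime(r)
         for s in range(r + 1, BOUND) if is_prime(s)
         if has_small_order(r, s, MAX_ORDER) and has_small_order(s, r, MAX_ORDER)]
print(pairs[:10])
```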
Practical Asynchronous Distributed Key Reconfiguration and Its Applications
In this paper, we study practical constructions of asynchronous distributed key reconfiguration ($\mathsf{ADKR}$), which enables an asynchronous fault-tolerant system with an existing threshold cryptosystem to efficiently generate a new threshold cryptosystem for a reconfigured set of participants. While existing asynchronous distributed threshold key generation ($\mathsf{ADKG}$) protocols theoretically solve $\mathsf{ADKR}$, they fail to deliver satisfactory scalability due to cubic communication overhead, even with simplifications to the reconfiguration setting.
We introduce a more efficient \textit{share-dispersal-then-agree-and-recast} paradigm for constructing $\mathsf{ADKR}$ while preserving adaptive security. The method replaces the expensive $O(n)$ asynchronous verifiable secret sharing protocols in classic $\mathsf{ADKG}$ with $O(n)$ cheaper dispersals of publicly-verifiable sharing transcripts; after consensus confirms a set of finished dispersals, it selects a small $\kappa$-subset of finished dispersals for verification, reducing the total overhead to $O(\kappa n^2)$ from $O(n^3)$, where $\kappa$ is a small constant (typically $\sim$30 or less). To further optimize concrete efficiency, we propose an interactive protocol with linear communication to generate publicly verifiable secret sharing (PVSS) transcripts, avoiding computationally expensive non-interactive PVSS. Additionally, we introduce a distributed PVSS verification mechanism, minimizing redundant computations across different parties and reducing the dominating PVSS verification cost by about one-third.
Our design also enables diverse applications: (i) given a quadratic-communication asynchronous coin-flipping protocol, it implies the first quadratic-communication $\mathsf{ADKG}$; and (ii) it can be extended to realize the first quadratic-communication asynchronous dynamic proactive secret sharing (ADPSS) protocol with adaptive security. Experimental evaluations on a global network of 256 AWS servers show up to 40\% lower latency compared to state-of-the-art $\mathsf{ADKG}$ protocols (with simplifications to the reconfiguration setting), highlighting the practicality of our $\mathsf{ADKR}$ in large-scale asynchronous systems.
Efficient algorithms for the detection of $(N,N)$-splittings and endomorphisms
We develop an efficient algorithm to detect whether a superspecial genus 2 Jacobian is optimally $(N, N)$-split for each integer $N \leq 11$. Incorporating this algorithm into the best-known attack against the superspecial isogeny problem in dimension 2 (due to Costello and Smith) gives rise to significant cryptanalytic improvements. Our implementation shows that when the underlying prime $p$ is 100 bits, the attack is sped up by a factor of $25$; when the underlying prime is 200 bits, the attack is sped up by a factor of $42$; and, when the underlying prime is 1000 bits, the attack is sped up by a factor of $160$. Furthermore, we describe a more general algorithm to find endomorphisms of superspecial genus 2 Jacobians.
Breaking RSA with Overclocking-induced GPU Faults
Overclocking is a supported functionality of Nvidia GPUs and a common performance enhancement practice. However, overclocking poses a danger for cryptographic applications. As the temperature in the overclocked GPU increases, spurious computation faults occur. Coupled with well-known fault attacks against RSA implementations, one can expect such faults to allow compromising RSA private keys during decryption or signing.
We first validate this hypothesis: we evaluate two commercial-grade GPU-based implementations of RSA within OpenSSL (called RNS and MP), under a wide range of overclocking levels and temperatures, and demonstrate that both implementations are vulnerable.
However, and more importantly, we show for the first time that even if the GPU is benignly overclocked to a seemingly ``safe'' rate, a successful attack can still be mounted, over the network, by simply sending requests at an aggressive rate to increase the temperature. Hence, setting any level of overclocking on the GPU is risky.
Moreover, we observe a huge difference in the implementations' vulnerability: the rate of RSA breaks for RNS is 4 orders of magnitude higher than that of MP. We attribute this difference to the implementations' memory usage patterns: RNS makes heavy use of the GPU's global memory, which is accessed via both the Unified (L1) cache and the L2 cache; MP primarily uses ``shared'' on-chip memory, which is local to each GPU Streaming MultiProcessor (SM) and is uncached, utilizing the memory banks used for the L1 cache. We believe that the computation faults are caused by reads from the global memory, which under a combination of overclocking, high temperature and high memory contention, occasionally return stale values.
How to Prove False Statements: Practical Attacks on Fiat-Shamir
The Fiat-Shamir (FS) transform is a prolific and powerful technique for compiling public-coin interactive protocols into non-interactive ones. Roughly speaking, the idea is to replace the random coins of the verifier with the evaluations of a complex hash function.
The FS transform is known to be sound in the random oracle model (i.e., when the hash function is modeled as a totally random function). However, when instantiating the random oracle using a concrete hash function, there are examples of protocols in which the transformation is not sound. So far all of these examples have been contrived protocols that were specifically designed to fail.
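As a concrete reminder of how the transform operates (the textbook Schnorr example with toy parameters, not the GKR-based protocol attacked below), the verifier's random challenge is simply replaced by a hash of the transcript:

```python
# Fiat-Shamir applied to Schnorr's identification protocol.
import hashlib, random

p, q, g = 1019, 509, 4                  # toy group: order-q subgroup of Z_p^*

def H(*parts) -> int:                   # the FS hash replacing verifier coins
    data = b"|".join(str(x).encode() for x in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

x = random.randrange(1, q)              # prover's witness
y = pow(g, x, p)                        # public statement y = g^x

# Prove: commit, derive the challenge by hashing the transcript, respond.
r = random.randrange(1, q)
t = pow(g, r, p)
c = H(g, y, t)
s = (r + c * x) % q

# Verify: recompute the challenge and check the Schnorr equation.
assert c == H(g, y, t)
assert pow(g, s, p) == t * pow(y, c, p) % p
```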
In this work we show such an attack for a standard and popular interactive succinct argument, based on the GKR protocol, for verifying the correctness of a non-deterministic bounded-depth computation. For every choice of FS hash function, we show that a corresponding instantiation of this protocol, which has been widely studied in the literature and is also used in practice, is not (adaptively) sound when compiled with the FS transform. Specifically, we construct an explicit circuit for which we can generate an accepting proof for a false statement.
We further extend our attack and show that for every circuit $C$ and desired output $y$, we can construct a functionally equivalent circuit $C^*$, for which we can produce an accepting proof that $C^*$ outputs $y$ (regardless of whether or not this statement is true). This demonstrates that any security guarantee (if such exists) would have to depend on the specific implementation of the circuit $C$, rather than just its functionality.
Lastly, we also demonstrate versions of the attack that violate non-adaptive soundness of the protocol -- that is, we generate an attacking circuit that is independent of the underlying cryptographic objects. However, these versions are either less practical (as the attacking circuit has very large depth) or make some additional (reasonable) assumptions on the underlying cryptographic primitives.
On Linear Equivalence, Canonical Forms, and Digital Signatures
Given two linear codes, the code equivalence problem asks to find an isometry mapping one code into the other.
The problem can be described in terms of group actions and, as such, finds a natural application in signatures derived from a Zero-Knowledge Proof system.
A recent paper, presented at Asiacrypt 2023, showed how a proof of equivalence can be significantly compressed by describing how the isometry acts only on an information set. Still, the resulting signatures are far from being optimal, as the size for a witness to this relation is still significantly larger than the theoretical lower bound, which is twice the security parameter.
In this paper, we fill this gap and propose a new notion of equivalence, which leads to a drastically reduced witness size. For many cases, the resulting size is exactly the optimal one given by the lower bound. We achieve this by introducing the framework of canonical representatives, that is, representatives for classes of codes which are equivalent under some notion of equivalence. We propose new notions of equivalence which encompass and further extend all the existing ones: this allows us to identify broader classes of equivalent codes, for which the equivalence can be proved with a very compact witness. We associate these new notions to a specific problem, called the Canonical Form Linear Equivalence Problem (CF-LEP), which we show to be as hard as the original one (when random codes are considered), providing reductions in both directions. As an added consequence, this reduction leads to a new solver for the code equivalence problem, which is the fastest solver when the finite field size is large enough. Finally, we show that our framework yields a remarkable reduction in signature size when compared to the LESS submission.
Our variant is able to obtain very compact signatures, around 2 KB or less, which are among the smallest in the code-based setting.
Trustless Bridges via Random Sampling Light Clients
The increasing number of blockchain projects introduced annually has led to a pressing need for secure and efficient interoperability solutions. Currently, the lack of such solutions forces end-users to rely on centralized intermediaries, contradicting the core principle of decentralization and trust minimization in blockchain technology. In this paper, we propose a decentralized and efficient interoperability solution (aka Bridge Protocol) that operates without additional trust assumptions, relying solely on the Byzantine Fault Tolerance (BFT) of the two chains being connected. In particular, relayers (actors that exchange messages between networks) are permissionless and decentralized, hence eliminating any single point of failure. We introduce Random Sampling, a novel technique for on-chain light clients to efficiently follow the history of PoS blockchains by reducing the number of signature verifications required. Here, the randomness is drawn on-chain, for example, using Ethereum's RANDAO. We analyze the security of the bridge from a crypto-economic perspective and provide a framework to derive the security parameters. This includes handling subtle concurrency issues and randomness bias in strawman designs. While the protocol is applicable to various PoS chains, we demonstrate its feasibility by instantiating a bridge between Polkadot and Ethereum (currently deployed), and discuss some practical security challenges. We also evaluate the efficiency (gas costs) of an on-chain light-client verifier implemented as a smart contract on Ethereum against SNARK-based approaches. Even for large validator set sizes (up to $10^6$), the signature verification gas costs of our light-client verifier are an order of magnitude lower.
Efficient Oblivious Sorting and Shuffling for Hardware Enclaves
Oblivious algorithms are being deployed at large scale in the real world to enable privacy-preserving applications such as Signal's private contact discovery. Oblivious sorting is a fundamental building block in the design of oblivious algorithms for numerous computation tasks. Unfortunately, there is still a theory-practice gap for oblivious sort. The commonly implemented bitonic sorting algorithm is not asymptotically optimal, whereas known asymptotically optimal algorithms suffer from large constants.
In this paper, we construct a new oblivious sorting algorithm called flexway o-sort, which is asymptotically optimal, concretely efficient, and suitable for implementation in hardware enclaves such as Intel SGX. For moderately large inputs of $12$ GB, our flexway o-sort algorithm outperforms known oblivious sorting algorithms by $1.32\times$ to $28.8\times$ when the data fits within the hardware enclave, and by $4.1\times$ to $208\times$ when the data does not fit within the hardware enclave. We also implemented various applications of oblivious sorting, including histogram, database join, and initialization of an ORAM data structure. For these applications and data sets from 8GB to 32GB, we achieve $1.44 \sim 2.3\times$ speedup over bitonic sort when the data fits within the enclave, and $4.9 \sim 5.5\times$ speedup when the data does not fit within the enclave.
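For reference, the bitonic baseline referred to above is a data-oblivious sorting network: its sequence of compare-exchange positions is fixed in advance and independent of the data, at the cost of $O(n \log^2 n)$ comparisons, which is why it is not asymptotically optimal:

```python
# Bitonic sorting network: the (i, l) compare-exchange pairs depend only on
# the array length, never on the values, which is what makes it oblivious.
def bitonic_sort(a):
    n = len(a)                       # n must be a power of two
    k = 2
    while k <= n:
        j = k // 2
        while j >= 1:
            for i in range(n):
                l = i ^ j
                if l > i:
                    up = (i & k) == 0
                    if (up and a[i] > a[l]) or (not up and a[i] < a[l]):
                        a[i], a[l] = a[l], a[i]
            j //= 2
        k *= 2
    return a

import random
data = [random.randrange(1000) for _ in range(16)]
assert bitonic_sort(data) == sorted(data)
```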
DeepFold: Efficient Multilinear Polynomial Commitment from Reed-Solomon Code and Its Application to Zero-knowledge Proofs
This work presents Deepfold, a novel multilinear polynomial commitment scheme (PCS) based on Reed-Solomon code that offers optimal prover time and a more concise proof size. For the first time, Deepfold adapts the FRI-based multilinear PCS to the list decoding radius setting, requiring significantly fewer query repetitions and thereby achieving a 3$\times$ reduction in proof size compared to Basefold (Crypto'24), while preserving its advantages in prover time. Compared with PolyFRIM (USENIX Security'24), Deepfold achieves a 2$\times$ improvement in prover time, verifier time, and proof size. Another contribution of this work is a batch evaluation scheme, which enables the FRI-based multilinear PCS to handle polynomials whose size is not a power of two more efficiently.
Our scheme has broad applications in zk-SNARKs, since PCS is a key component in modern zk-SNARK constructions. For example, when replacing the PCS component of Virgo (S&P'20) with Deepfold, our scheme achieves a 2.5$\times$ faster prover time when proving the knowledge of a Merkle tree with 256 leaves, while maintaining a similar proof size. When replacing the PCS component of HyperPlonk (Eurocrypt'23) with Deepfold, our scheme has about 3.6$\times$ faster prover time. Additionally, when applying our arbitrary-length input commitment to verifiable matrix multiplications for matrices of size 1200$\times$768 and 768$\times$2304, which are actual use cases in the GPT-2 model, the performance showcases a 2.4$\times$ reduction in prover time compared to previous approaches.
Miller Inversion is Easy for the Reduced Tate Pairing of Embedding Degree Greater than one
We present algorithms for Miller inversion for the reduced Tate pairing with embedding degree $k>1$. Let $q$ be the number of elements of the field of definition of an elliptic curve. For even $k$, our algorithm runs deterministically with $O((k \log q)^3)$ bit operations. For odd $k$, our algorithm runs probabilistically with $O(k^6 (\log q)^3)$ bit operations on average.
A New Way to Achieve Round-Efficient Asynchronous Byzantine Agreement
We translate the expand-and-extract framework by Fitzi, Liu-Zhang, and Loss (PODC 21) to the asynchronous setting.
While they use it to obtain a synchronous BA with $2^{-\lambda}$ error probability in $\lambda+1$ rounds, we make it work in asynchrony in $\lambda+3$ rounds. At the heart of their solution is a generalization of crusader agreement and graded agreement to any number of grades called proxcensus. They achieve graded consensus with $2^r+1$ grades in $r$ rounds by reducing proxcensus with $2s-1$ grades to proxcensus with $s$ grades in one round. The expand-and-extract paradigm uses proxcensus to expand binary inputs to $2^\lambda+1$ grades in $\lambda$ rounds before extracting a binary output by partitioning the grades using a $\lambda$ bit common coin. However, the proxcensus protocol by Fitzi et al. does not translate to the asynchronous setting without lowering the corruption threshold or using more rounds in each recursive step.
This is resolved by attaching justifiers to all messages: forcing the adversary to choose between being ignored by the honest parties, or sending messages with certain validity properties. Using these we define validated proxcensus and show that it can be instantiated in asynchrony with the same recursive structure and round complexity as synchronous proxcensus. In asynchrony the extraction phase incurs a security loss of one bit which is recovered by expanding to twice as many grades using an extra round of communication. This results in a $\lambda+2$ round VABA and a $\lambda+3$ round BA, both with $2^{-\lambda}$ error probability and communication complexity matching Fitzi et al.
Bounded CCA2 Secure Proxy Re-encryption Based on Kyber
Proxy re-encryption (PRE) allows a semi-honest party (called a proxy) to convert ciphertexts under a public key into ciphertexts under another public key. Due to this functionality, there are various applications such as encrypted email forwarding, key escrow, and secure distributed file systems. On the other hand, post-quantum cryptography (PQC) is one of the most important research areas. However, although many PRE schemes have been proposed so far, there is no post-quantum PRE scheme with security against adaptive chosen ciphertext attacks (denoted by $\mathsf{CCA}2$ security).
In this paper, we propose a bounded $\mathsf{CCA}2$ secure PRE scheme based on CRYSTALS-Kyber (Kyber, for short), which is a selected algorithm in the NIST PQC competition. To this end, we present generic constructions of bounded $\mathsf{CCA}2$ secure PRE. Our generic constructions start from PRE with a variant of security against chosen plaintext attacks (denoted by $\mathsf{CPA}$ security) and a new property of PRE introduced in this paper. In order to instantiate our generic constructions, we present a Kyber-based PRE scheme with the required property. As a result, we obtain a bounded $\mathsf{CCA}2$ secure PRE scheme from Kyber.
hax: Verifying Security-Critical Rust Software using Multiple Provers
We present hax, a verification toolchain for Rust targeted at security-critical software such as cryptographic libraries, protocol implementations, authentication and authorization mechanisms, and parsing and sanitization code. The key idea behind hax is the pragmatic observation that different verification tools are better at handling different kinds of verification goals. Consequently, hax supports multiple proof backends, including domain-specific security analysis tools like ProVerif and SSProve, as well as general proof assistants like Coq and F*. In this paper, we present the hax toolchain and show how we use it to translate Rust code to the input languages of different provers. We describe how we systematically test our translated models and our models of the Rust system libraries to gain confidence in their correctness. Finally, we briefly overview various ongoing verification projects that rely on hax.
Arbitrary-Threshold Fully Homomorphic Encryption with Lower Complexity
Threshold fully homomorphic encryption (ThFHE) enables multiple parties to compute functions over their sensitive data without leaking data privacy. Most existing ThFHE schemes are restricted to full threshold and require the participation of all parties to output computing results. Compared with these full-threshold schemes, arbitrary threshold (ATh)-FHE schemes are robust to non-participants and can be a promising solution to many real-world applications. However, existing AThFHE schemes are either inefficient to be applied with a large number of parties $N$ and a large data size $K$, or insufficient to tolerate all types of non-participants. In this paper, we propose an AThFHE scheme to handle all types of non-participants with lower complexity than existing schemes. At the core of our scheme is a reduction from AThFHE construction to the design of a new primitive called approximate secret sharing (ApproxSS). In particular, we formulate ApproxSS and prove the correctness and security of AThFHE on top of the properties of arbitrary-threshold (ATh)-ApproxSS. This reduction reveals that existing AThFHE schemes implicitly design ATh-ApproxSS following a similar idea, called ``noisy share''. Nonetheless, their ATh-ApproxSS designs have high complexity and become the performance bottleneck. By developing ATASSES, an ATh-ApproxSS scheme based on a novel ``encrypted share'' idea, we reduce the computation (resp. communication) complexity from $\mathcal{O}(N^2K)$ to $\mathcal{O}(N^2+K)$ (resp. from $\mathcal{O}(NK)$ to $\mathcal{O}(N+K)$). We not only theoretically prove the (approximate) correctness and security of ATASSES, but also empirically evaluate its efficiency against existing baselines. In particular, when applied to a system with one thousand parties, ATASSES achieves a speedup of $3.83\times$ -- $15.4\times$ over baselines.
Falcon on ARM Cortex-M4: an Update
This note reports new implementation results for the Falcon signature algorithm on an ARM Cortex-M4 microcontroller. Compared with our previous implementation (in 2019), runtime cost has been about halved.
Isogeny-based Cryptography using Isomorphisms of Superspecial Abelian Surfaces
We investigate the algorithmic problem of computing isomorphisms between products of supersingular elliptic curves, given their endomorphism rings. This computational problem seems to be difficult when the domain and codomain are fixed, whereas we provide efficient algorithms to compute isomorphisms when part of the codomain is built during the construction. We propose an authentication protocol whose security relies on this asymmetry. Its most prominent feature is that the endomorphism rings of the elliptic curves are not hidden. Furthermore, it does not require a trusted setup.
Shortly after this preprint was published, Benjamin Wesolowski found a way to efficiently solve Problem 5.1, which we assumed to be hard. This breaks our authentication protocol.
Space-Lock Puzzles and Verifiable Space-Hard Functions from Root-Finding in Sparse Polynomials
Timed cryptography has initiated a paradigm shift in the design of cryptographic protocols: using timed cryptography we can realize tasks fairly, which is provably out of reach of standard cryptographic concepts. To a certain degree, the success of timed cryptography is rooted in the existence of efficient protocols based on the sequential squaring assumption.
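The primitive alluded to here is the classic Rivest-Shamir-Wagner time-lock puzzle (a minimal sketch with toy parameters, assuming sympy for prime generation): without the factorization of $N$, computing $x^{2^T} \bmod N$ is believed to require $T$ sequential squarings, while the puzzle creator shortcuts via $\varphi(N)$.

```python
# RSW time-lock puzzle on the sequential squaring assumption.
from sympy import nextprime

p = nextprime(10**6)
q = nextprime(p)
N, phi, T, x = p * q, (p - 1) * (q - 1), 10_000, 42

target = pow(x, pow(2, T, phi), N)   # creator: reduce the exponent mod phi(N)

y = x                                # solver: T squarings, one after another
for _ in range(T):
    y = y * y % N
assert y == target
```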
In this work, we consider space analogues of timed cryptographic primitives, which we refer to as space-hard primitives. Roughly speaking, these notions require honest protocol parties to invest a certain amount of space and provide security against space constrained adversaries. While inefficient generic constructions of timed-primitives from strong assumptions such as indistinguishability obfuscation can be adapted to the space-hard setting, we currently lack concrete and versatile algebraically structured assumptions for space-hard cryptography.
In this work, we initiate the study of space-hard primitives from concrete algebraic assumptions relating to the problem of root-finding of sparse polynomials. Our motivation to study this problem is a candidate construction of VDFs by Boneh et al. (CRYPTO 2018) which are based on the hardness of inverting permutation polynomials. Somewhat anticlimactically, our first contribution is a full break of this candidate. However, we then revise this hardness assumption by dropping the permutation requirement and considering arbitrary sparse high degree polynomials. We argue that this type of assumption is much better suited for space-hardness rather than timed cryptography. We then proceed to construct both space-lock puzzles and verifiable space-hard functions from this assumption.
Blink: An Optimal Proof of Proof-of-Work
Designing light clients to securely and efficiently read Proof-of-Work blockchains has been a foundational problem since the inception of blockchains. Nakamoto themselves, in the original Bitcoin paper, presented the first client protocol, i.e., the Simplified Payment Verification, which consumes an amount of bandwidth, computational, and storage resources that grows linearly in the system's lifetime $\mathcal{C}$.
Today, the blockchain ecosystem is more mature and presents a variety of applications and protocols deployed on-chain and, often, cross-chain. In this landscape, light clients have become the cornerstone of decentralized bridges, playing a pivotal role in the security and efficiency of cross-chain operations.
These new use cases, combined with the growth of blockchains over time, raise the need for more minimalist clients, which further reduce the resource requirements and, when applicable, on-chain costs.
Over the years, light client resource consumption has been reduced from $\mathcal{O}(\mathcal{C})$ to $\mathcal{O}(\text{polylog}(\mathcal{C}))$, and then down to $\mathcal{O}(1)$ with zero-knowledge techniques, at the cost of often assuming a trusted setup.
In this paper, we present Blink, the first interactive provably secure $\mathcal{O}(1)$ PoW light client without trusted setup. Blink can be used for a variety of applications ranging from payment verification and bootstrapping, to bridges.
We prove Blink secure in the Bitcoin Backbone model, and we evaluate its proof size demonstrating that, at the moment of writing, Blink obtains a commitment to the current state of Bitcoin by downloading only 1.6KB, instead of 67.3MB and 197KB for SPV and zk-based clients, respectively.
Vector Commitments With Proofs of Smallness: Short Range Proofs and More
Vector commitment schemes are compressing commitments to vectors that make it possible to succinctly open a commitment for individual vector positions without revealing anything about other positions. We describe vector commitments enabling constant-size proofs that the committed vector is small (i.e., binary, ternary, or of small norm). As a special case, we obtain range proofs featuring the shortest proof length in the literature with only $3$ group elements per proof. As another application, we obtain short pairing-based NIZK arguments for lattice-related statements. In particular, we obtain short proofs (comprised of $3$ group elements) showing the validity of ring LWE ciphertexts and public keys. Our constructions are proven simulation-extractable in the algebraic group model and the random oracle model.
An Improved Algorithm for Code Equivalence
We study the linear code equivalence problem (LEP) for linear $[n,k]$-codes over finite fields $\mathbb{F}_q$. Recently, Chou, Persichetti and Santini gave an elegant algorithm that solves LEP over large finite fields (with $q = \Omega(n)$) in time $2^{\frac{1}{2}\operatorname{H}\left(\frac{k}{n}\right)n}$, where $\operatorname{H}(\cdot)$ denotes the binary entropy function. However, for small finite fields, their algorithm can be significantly slower. In particular, for fields of constant size $q = \mathcal{O}(1)$, its runtime increases by an exponential factor in $n$.
We present an improved version of their algorithm, which achieves the desired runtime of $2^{\frac{1}{2}\operatorname{H}\left(\frac{k}{n}\right)n}$ for all finite fields of size $q \geq 7$. For a wide range of parameters, this improves over the runtime of all previously known algorithms by an exponential factor.
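For concreteness, the runtime exponent $\frac{1}{2}\operatorname{H}\left(\frac{k}{n}\right)n$ can be evaluated numerically; the parameters below are sample values chosen here for illustration, not ones from the paper.

```python
# Illustrative arithmetic only: the runtime exponent (1/2) * H(k/n) * n
# from the abstract, at sample parameters (n, k chosen here).
from math import log2

def H(x):
    # binary entropy function
    return -x * log2(x) - (1 - x) * log2(1 - x)

n, k = 256, 128
print(f"runtime ~ 2^{0.5 * H(k / n) * n:.1f}")   # H(1/2) = 1, so 2^128 here
```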
New Quantum Cryptanalysis of Binary Elliptic Curves (Extended Version)
This paper improves upon the quantum circuits required for Shor's attack on binary elliptic curves. We present two types of quantum point addition, taking both qubit count and circuit depth into consideration.
In summary, we propose an in-place point addition that improves upon the work of Banegas et al. from CHES'21, reducing the qubit count–depth product by more than $73\%$–$81\%$, depending on the variant. Furthermore, we develop an out-of-place point addition using additional qubits. This method achieves the lowest circuit depth and offers an improvement of over $92\%$ in the qubit count–depth product (for a single step).
To the best of our knowledge, our work improves on all previous works (including the CHES'21 paper by Banegas et al., the IEEE Access'22 paper by Putranto et al., and the CT-RSA'23 paper by Taguchi and Takayasu) in terms of circuit depth and qubit count–depth product.
Equipped with the implementations, we discuss the post-quantum security of binary elliptic curve cryptography. Under the MAXDEPTH metric (proposed by the US government's NIST), the deepest quantum circuit in our work has depth $2^{24}$, significantly below the MAXDEPTH limit of $2^{40}$. For the gate count–full depth product, a metric for estimating quantum attack cost (proposed by NIST), the highest complexity in our work is $2^{60}$ for the degree-571 curve (comparable to AES-256 in terms of classical security), considerably below the post-quantum security level 1 threshold (of the order of $2^{156}$).
FINAL bootstrap acceleration on FPGA using DSP-free constant-multiplier NTTs
This work showcases Quatorze-bis, a state-of-the-art Number Theoretic Transform circuit for TFHE-like cryptosystems on FPGAs. It contains a novel design for modular multiplication by a constant with respect to a constant modulus. This design requires neither DSP units nor any dedicated multiplier unit, nor does it require extra logic when compared to state-of-the-art modular multipliers. Furthermore, we present an implementation of a constant-multiplier Number Theoretic Transform design for TFHE-like schemes. Lastly, we use this Number Theoretic Transform design to implement a FINAL hardware accelerator for the AMD Alveo U55c, which improves the throughput of TFHE-like cryptosystems on FPGAs by a factor of 9.28x over Li et al.'s NFP CHES 2024 accelerator, and by 10-25% over the absolute state-of-the-art design FPT while using one third of FPT's DSPs.
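To see why a constant-multiplier design is natural here: in an NTT with a fixed modulus and fixed transform size, every butterfly multiplies by a precomputed constant twiddle factor. The textbook software sketch below makes those constants explicit (illustrative only, not the Quatorze-bis circuit; the modulus and size are toy choices).

```python
# Textbook iterative radix-2 NTT over Z_Q, as a software sketch (not the
# Quatorze-bis circuit; Q and N are toy choices). For fixed Q and N, every
# butterfly multiplies by a precomputed constant twiddle factor -- exactly
# what a DSP-free constant-multiplier hardware design can hard-wire.
Q = 12289                        # NTT-friendly prime (Q - 1 divisible by N)
N = 8                            # toy transform size
G = pow(11, (Q - 1) // N, Q)     # primitive N-th root of unity (11 generates Z_Q*)

def ntt(a):
    a = list(a)
    j = 0                        # bit-reversal permutation of the input
    for i in range(1, N):
        bit = N >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= N:
        w_len = pow(G, N // length, Q)   # constant twiddle for this stage
        for start in range(0, N, length):
            w = 1
            for k in range(start, start + length // 2):
                u, v = a[k], a[k + length // 2] * w % Q
                a[k], a[k + length // 2] = (u + v) % Q, (u - v) % Q
                w = w * w_len % Q
        length <<= 1
    return a

def naive_dft(a):                # O(N^2) reference to sanity-check the NTT
    return [sum(a[i] * pow(G, i * k, Q) for i in range(N)) % Q
            for k in range(N)]

x = [1, 2, 3, 4, 5, 6, 7, 8]
assert ntt(x) == naive_dft(x)
```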
HELP: Everlasting Privacy through Server-Aided Randomness
Everlasting (EL) privacy offers an attractive solution to the Store-Now-Decrypt-Later (SNDL) problem, where future increases in the attacker's capability could break systems that are believed to be secure today. Instead of requiring full information-theoretic security, everlasting privacy allows computationally secure transmissions of ephemeral secrets, which are only "effective" for a limited period of time, after which their compromise is provably useless to the SNDL attacker.
In this work we revisit the everlasting privacy model of Dodis and Yeo (ITC'21), which we call Hypervisor EverLasting Privacy (HELP). HELP is a novel architecture for generating shared randomness using a network of semi-trusted servers (or "hypervisors"), trading the need to store/distribute large shared secrets for the assumptions that it is hard to: (a) simultaneously compromise too many publicly accessible ad-hoc servers; and (b) break a computationally secure encryption scheme very quickly. While Dodis and Yeo presented good HELP solutions in the asymptotic sense, their solutions were concretely expensive and used heavy tools (like large finite fields or gigantic Toeplitz matrices).
We abstract and generalize the HELP architecture to allow for more efficient instantiations, and construct several concretely efficient HELP solutions. Our solutions use elementary cryptographic operations, such as hashing and message authentication. We also prove a very strong composition theorem showing that our EL architecture can use any message transmission method which is computationally secure in the Universal Composability (UC) framework. This is the first positive composition result for everlasting privacy, which was otherwise known to suffer from many "non-composition" results (Müller-Quade and Unruh, J. of Cryptology '10).
Path Privacy and Handovers: Preventing Insider Traceability Attacks During Secure Handovers
The rise of 5G and IoT has shifted secure communication from centralized and homogeneous to a landscape of heterogeneous mobile devices constantly travelling between myriad networks. In such environments, it is desirable for devices to securely extend their connection from one network to another, often referred to as a handover. In this work we introduce the first cryptographic formalisation of secure handover schemes. We leverage our formalisation to propose path privacy, a novel security property for handovers that has hitherto remained unexplored. We further develop a syntax for secure handovers, and identify security properties appropriate for secure handover schemes. Finally, we introduce a generic handover scheme that captures all the strong notions of security we have identified, combining our novel path privacy concept with other security properties characteristic to existing handover schemes, demonstrating the robustness and versatility of our framework.
Application-Aware Approximate Homomorphic Encryption: Configuring FHE for Practical Use
Fully Homomorphic Encryption (FHE) is a powerful tool for performing computations on encrypted data. The Cheon-Kim-Kim-Song (CKKS) scheme, an instantiation of approximate FHE, is particularly effective for privacy-preserving machine learning applications over real and complex numbers. Although CKKS offers clear efficiency advantages, confusion persists around accurately describing applications in FHE libraries and securely instantiating the scheme for these applications, particularly after the key recovery attacks by Li and Micciancio (EUROCRYPT'21) for the $IND-CPA^D$ setting.
There is presently a gap between the application-agnostic, generic definition of $IND-CPA^D$, and efficient, application-specific instantiation of CKKS in software libraries, which led to recent attacks by Guo et al. (USENIX Security'24).
To close this gap, we introduce the notion of application-aware homomorphic encryption (AAHE) and devise related security definitions. This model corresponds more closely to how FHE schemes are implemented and used in practice, while also identifying and addressing the potential vulnerabilities in popular libraries. We then provide an application specification language (ASL) and formulate guidelines for implementing the AAHE model to achieve $IND-CPA^D$ security for practical applications of CKKS. We present a proof-of-concept implementation of the ASL in the OpenFHE library showing how the attacks by Guo et al. can be countered. Moreover, we show that our new model and ASL can be used for the secure and efficient instantiation of exact FHE schemes and to counter the recent $IND-CPA^D$ attacks by Cheon et al. (CCS'24) and Checri et al. (CRYPTO'24).
Further Improvements in AES Execution over TFHE: Towards Breaking the 1 sec Barrier
Making the most of TFHE's advanced capabilities, such as programmable or circuit bootstrapping and their generalizations for manipulating data larger than the native plaintext domain of the scheme, is a very active line of research. In this context, AES is a particularly interesting benchmark: it is an example of a nontrivial algorithm that has eluded "practical" FHE execution performance for years, and it will most likely be selected by NIST as a flagship reference in its upcoming call on threshold (homomorphic) cryptography. Since 2023, the algorithm has thus been the subject of renewed attention from the FHE community and has served as a playground to test advanced operators following the LUT-based approach, $p$-encodings, or several variants of circuit bootstrapping, each time leading to further timing improvements. Still, AES is also interesting as a benchmark because of the tension between boolean- and byte-oriented operations within the algorithm. In this paper, we resolve this tension by proposing a new approach, coined "Hippogryph", which consistently combines the (byte-oriented) LUT-based approach with a generalization of the (boolean-oriented) $p$-encodings one to get the best of both worlds. In doing so, we obtain the best timings so far, taking a single-core execution of the algorithm over TFHE from 46 down to 32 seconds and approaching the 1 second barrier with only a mild amount of parallelism. We should also stress that all the timings reported in this paper are consistently obtained on the same machine, which is often not the case in previous studies. Lastly, we emphasize that the techniques we develop are applicable beyond just AES, since the boolean-byte tension is a recurrent issue when running algorithms over TFHE.
Preprocessing Security in Multiple Idealized Models with Applications to Schnorr Signatures and PSEC-KEM
In modern cryptography, relatively few instantiations of foundational cryptographic primitives are used across most cryptographic protocols. For example, elliptic curve groups are typically instantiated using P-256, P-384, Curve25519, or Curve448, while block ciphers are commonly instantiated with AES, and hash functions with SHA-2, SHA-3, or SHAKE. This limited diversity raises concerns that an adversary with nation-state-level resources could perform a preprocessing attack, generating a hint that might later be exploited to break protocols built on these primitives. It is often notoriously challenging to analyze and upper bound the advantage of a preprocessing attacker even if we assume that we have idealized instantiations of our cryptographic primitives (ideal permutations, ideal ciphers, random oracles, generic groups). Coretti et al. (CRYPTO/EUROCRYPT'18) demonstrated a powerful framework to simplify the analysis of preprocessing attacks, but they only proved that their framework applies when the cryptographic protocol uses a single idealized primitive. In practice, however, cryptographic constructions often utilize multiple different cryptographic primitives.
We verify that the framework of Coretti et al. (CRYPTO/EUROCRYPT'18) extends to settings with multiple idealized primitives, and we apply it to analyze the multi-user security of (short) Schnorr signatures and the CCA-security of PSEC-KEM against preprocessing attackers in the Random Oracle Model (ROM) plus the Generic Group Model (GGM). Prior work of Blocki and Lee (EUROCRYPT'22) used complicated compression arguments to analyze the security of {\em key-prefixed} short Schnorr signatures, where the random oracle is salted with the user's public key. However, that security analysis did not extend to standardized implementations of Schnorr signatures (e.g., BSI-TR-03111 or ISO/IEC 14888-3) which do not adopt key-prefixing, but take other measures to protect against preprocessing attacks by disallowing signatures that use a preimage of $0$. Blocki and Lee (EUROCRYPT'22) left the (in)security of such "nonzero Schnorr signature" constructions as an open question. We fully resolve this open question, demonstrating that (short) nonzero Schnorr signatures are also secure against preprocessing attacks. We also analyze PSEC-KEM in the ROM+GGM, demonstrating that this Key Encapsulation Mechanism (KEM) is CPA-secure against preprocessing attacks.
Early Stopping for Any Number of Corruptions
Minimizing the round complexity of Byzantine broadcast is a fundamental question in distributed computing and cryptography. In this work, we present the first early-stopping Byzantine broadcast protocol that tolerates up to $t=n-1$ malicious corruptions and terminates in $O(\min\{f^2,t+1\})$ rounds for any execution with $f\leq t$ actual corruptions. Our protocol is deterministic, adaptively secure, and works assuming a plain public key infrastructure. Prior early-stopping protocols all either require an honest majority or tolerate only up to $t=(1-\epsilon)n$ malicious corruptions while requiring either trusted setup or strong number-theoretic hardness assumptions. As our key contribution, we show a novel tool called a polariser that allows us to transfer certificate-based strategies from the honest-majority setting to settings with a dishonest majority.
Naysayer proofs
This work introduces the notion of naysayer proofs. We observe that in numerous (zero-knowledge) proof systems, it is significantly more efficient for the verifier to be convinced by a so-called naysayer that a false proof is invalid than it is to check that a genuine proof is valid. We show that every NP language has constant-size and constant-time naysayer proofs. We also show practical constructions for several example proof systems, including FRI polynomial commitments, post-quantum secure digital signatures, and verifiable shuffles. Naysayer proofs enable an interesting new optimistic verification mode potentially suitable for resource-constrained verifiers, such as smart contracts.
Quadratic Modelings of Syndrome Decoding
This paper presents enhanced reductions of the bounded-weight and exact-weight Syndrome Decoding Problem (SDP) to a system of quadratic equations. Over $\mathbb{F}_2$, we improve on a previous work and study the degree of regularity of the modeling of the exact weight SDP. Additionally, we introduce a novel technique that transforms SDP instances over $\mathbb{F}_q$ into systems of polynomial equations and thoroughly investigate the dimension of their varieties. Experimental results are provided to evaluate the complexity of solving SDP instances using our models through Gröbner bases techniques.
PRISM: Simple And Compact Identification and Signatures From Large Prime Degree Isogenies
The problem of computing an isogeny of large prime degree from a supersingular elliptic curve of unknown endomorphism ring is assumed to be hard both for classical as well as quantum computers.
In this work, we first build a two-round identification protocol whose security reduces to this problem. The challenge consists of a random large prime $q$ and the prover simply replies with an efficient representation of an isogeny of degree $q$ from its public key.
Using the hash-and-sign paradigm, we then derive a signature scheme with a very simple and flexible signing procedure and prove its security in the standard model.
Our optimized C implementation of the signature scheme shows that signing is roughly $1.8\times$ faster than all SQIsign variants, whereas verification is $1.4\times$ slower. The sizes of the public key and signature are comparable to those of existing schemes.
TockOwl: Asynchronous Consensus with Fault and Network Adaptability
BFT protocols usually suffer a waterfall-like performance degradation in the face of crash faults. Some BFT protocols avoid this sudden degradation under crash faults, but they do so at the expense of increased communication and round complexity in fault-free scenarios. In a nutshell, existing protocols lack the adaptability needed to perform optimally under varying conditions.
We propose TockOwl, the first asynchronous consensus protocol with fault adaptability. TockOwl features quadratic communication and constant round complexity, allowing it to remain efficient in fault-free scenarios. TockOwl also possesses crash robustness, enabling it to maintain stable performance when facing crash faults. These properties collectively ensure the fault adaptability of TockOwl.
Furthermore, we propose TockOwl+ that has network adaptability. TockOwl+ incorporates both fast and slow tracks and employs hedging delays, allowing it to achieve low latency comparable to partially synchronous protocols without waiting for timeouts in asynchronous environments. Compared to the latest dual-track protocols, the slow track of TockOwl+ is simpler, implying shorter latency in fully asynchronous environments.
Impossible Differential Automation: Model Generation and New Techniques
In this paper, we aim to enhance and automate advanced techniques for impossible differential attacks. To demonstrate these advancements, we present improved attacks on the LBlock and HIGHT block ciphers. More precisely, we
(a) introduce a methodology to automatically invert symmetric ciphers when represented as directed acyclic graphs, a fundamental step in the search for impossible differential trails and in key recovery techniques;
(b) automate the search for impossible differential distinguishers, reproducing recent techniques and results;
(c) present a new hybrid model combining cell-wise properties and bit-wise granularity;
(d) integrate these techniques in the automated tool CLAASP;
(e) demonstrate the effectiveness of the tool by
reproducing a state-of-the-art 16-round impossible differential for LBlock previously obtained using a different technique and
exhibiting a new 18-round improbable trail;
(f) improve the state-of-the-art single-key recovery attack on 27-round HIGHT by automating the application of hash tables to the current best results.
Cryptanalysis of an Efficient Signature Based on Isotropic Quadratic Forms
We present a key-recovery attack on DEFI, an efficient signature scheme proposed recently by Feussner and Semaev, and based on isotropic quadratic forms, borrowing from both multivariate and lattice cryptography.
Our lattice-based attack is partially heuristic, but works on all proposed parameters: experimentally, it recovers the secret key in a few minutes, using less than ten (message,signature) pairs.
Improved Provable Reduction of NTRU and Hypercubic Lattices
Lattice-based cryptography typically uses lattices with special properties to improve efficiency. We show how blockwise reduction can exploit lattices with special geometric properties, effectively reducing the required blocksize to solve the shortest vector problem to half of the lattice's rank, and in the case of the hypercubic lattice $\mathbb{Z}^n$, further relaxing the approximation factor of blocks to $\sqrt{2}$.
We study both provable algorithms and the heuristic well-known primal attack, in the case where the lattice has a first minimum that is almost as short as that of the hypercubic lattice $\mathbb{Z}^n$.
Remarkably, these near-hypercubic lattices cover Falcon and most concrete instances of the NTRU cryptosystem: this is the first provable result showing that breaking NTRU lattices can be reduced to finding shortest lattice vectors in halved dimension, thereby providing a positive response to a conjecture of Gama, Howgrave-Graham and Nguyen at Eurocrypt 2006.
Yet, the best primal attack on NTRU heuristically decreases the $1/2$ provable dimension reduction factor to $4/9$.
Intmax2: A ZK-rollup with Minimal Onchain Data and Computation Costs Featuring Decentralized Aggregators
We present a blockchain scaling solution called Intmax2, which is a Zero-Knowledge rollup (ZK-rollup) protocol with stateless and permissionless block production, while minimizing the usage of data and computation on the underlying blockchain. Our architecture distinctly diverges from existing ZK-rollups since essentially all of the data and computational costs are shifted to the client-side as opposed to imposing heavy requirements on the block producers or the underlying Layer 1 blockchain. The only job for block producers is to periodically generate a commitment to a set of transactions, distribute inclusion proofs to each sender, and collect and aggregate signatures by the senders. This design allows permissionless and stateless block production, and is highly scalable with the number of users. We give a proof of the main security property of the protocol, which has been formally verified by the Nethermind Formal Verification Team in the Lean theorem prover.
On the Anonymity of Linkable Ring Signatures
Security models provide a way of formalising security properties in a rigorous way, but it is sometimes difficult to ensure that the model really fits the concept that we are trying to formalise. In this paper, we illustrate this fact by showing the discrepancies between the security model of anonymity for linkable ring signatures and the security that is actually expected for this kind of signature. These signatures allow a user to sign anonymously within an ad hoc group generated from the public keys of the group members, but all their signatures can be linked together. Reading the related literature, it seems obvious that users' identities must remain hidden even when their signatures are linked, but we show that, surprisingly, almost none of the existing models guarantee it. We illustrate this by presenting two counter-examples which are secure in most anonymity models for linkable ring signatures, but which trivially leak a signer's identity after only two signatures.
A natural fix to this model, already introduced in some previous work, is to adopt a corruption model in which the attacker can itself generate the keys of certain users, which seems much more coherent in a context where the group of users can be constructed in an ad hoc way at signing time. We believe that these two changes make the security model more realistic. Indeed, within the framework of this model, our counter-examples become insecure. Furthermore, we show that most of the schemes in the literature we surveyed appear to have been designed to achieve the security guaranteed by the latest model, which reinforces the idea that the model is closer to the informal intuition of what anonymity should be for linkable ring signatures.
Symmetric Perceptrons, Number Partitioning and Lattices
The symmetric binary perceptron ($\mathrm{SBP}_{\kappa}$) problem with parameter $\kappa : \mathbb{R}_{\geq1} \to [0,1]$ is an average-case search problem defined as follows: given a random Gaussian matrix $\mathbf{A} \sim \mathcal{N}(0,1)^{n \times m}$ as input where $m \geq n$, output a vector $\mathbf{x} \in \{-1,1\}^m$ such that $$|| \mathbf{A} \mathbf{x} ||_{\infty} \leq \kappa(m/n) \cdot \sqrt{m}~.$$
The number partitioning problem ($\mathrm{NPP}_{\kappa}$) corresponds to the special case of setting $n=1$. There is considerable evidence that both problems exhibit large computational-statistical gaps.
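Spelling the definition out in code may help; the checker below is a direct transcription of the $\mathrm{SBP}_{\kappa}$ condition (the dimensions are toy choices of ours, and the $\kappa$ shown is the Bansal-Spencer regime discussed below, picked here purely for illustration).

```python
# Direct transcription of the SBP_kappa search condition from the
# definition above (toy sizes; kappa chosen here only for illustration).
import math, random

def is_sbp_solution(A, x, kappa):
    n, m = len(A), len(A[0])
    bound = kappa(m / n) * math.sqrt(m)
    return all(abs(sum(A[i][j] * x[j] for j in range(m))) <= bound
               for i in range(n))

n, m = 4, 64
A = [[random.gauss(0, 1) for _ in range(m)] for _ in range(n)]
kappa = lambda a: 1 / math.sqrt(a)               # Bansal-Spencer regime
x = [random.choice([-1, 1]) for _ in range(m)]   # random (likely invalid) candidate
print(is_sbp_solution(A, x, kappa))
```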
In this work, we show (nearly) tight average-case hardness for these problems, assuming the worst-case hardness of standard approximate shortest vector problems on lattices.
For $\mathrm{SBP}_\kappa$, statistically, solutions exist with $\kappa(x) = 2^{-\Theta(x)}$ (Aubin, Perkins and Zdeborova, Journal of Physics 2019). For large $n$, the best that efficient algorithms have been able to achieve is a far cry from the statistical bound, namely $\kappa(x) = \Theta(1/\sqrt{x})$ (Bansal and Spencer, Random Structures and Algorithms 2020). The problem has been extensively studied in the TCS and statistics communities, and Gamarnik, Kizildag, Perkins and Xu (FOCS 2022) conjecture that Bansal-Spencer is tight: namely, $\kappa(x) = \widetilde{\Theta}(1/\sqrt{x})$ is the optimal value achieved by computationally efficient algorithms.
We prove their conjecture assuming the worst-case hardness of approximating the shortest vector problem on lattices.
For $\mathrm{NPP}_\kappa$, statistically, solutions exist with $\kappa(m) = \Theta(2^{-m})$ (Karmarkar, Karp, Lueker and Odlyzko, Journal of Applied Probability 1986). Karmarkar and Karp's classical differencing algorithm achieves $\kappa(m) = 2^{-O(\log^2 m)}~.$
We prove that Karmarkar-Karp is nearly tight: namely, no polynomial-time algorithm can achieve $\kappa(m) = 2^{-\Omega(\log^3 m)}$, once again assuming the worst-case subexponential hardness of approximating the shortest vector problem on lattices to within a subexponential factor.
Our hardness results are versatile, and hold with respect to different distributions of the matrix $\mathbf{A}$ (e.g., i.i.d. uniform entries from $[0,1]$) and weaker requirements on the solution vector $\mathbf{x}$.
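For the $n=1$ case, the Karmarkar-Karp differencing algorithm referenced above is short enough to sketch in full (a standard textbook rendition with toy random inputs, not code from the paper): repeatedly commit the two largest remaining numbers to opposite sides of the partition and replace them by their difference.

```python
# The classical Karmarkar-Karp largest-differencing heuristic for number
# partitioning (the kappa(m) = 2^{-O(log^2 m)} algorithm cited above).
import heapq, random

def karmarkar_karp(values):
    """Return the residual |sum_i x_i * v_i| achieved by differencing."""
    heap = [-v for v in values]          # max-heap via negation
    heapq.heapify(heap)
    while len(heap) > 1:
        a = -heapq.heappop(heap)
        b = -heapq.heappop(heap)
        heapq.heappush(heap, -(a - b))   # the two largest go to opposite sides
    return -heap[0]

vals = [abs(random.gauss(0, 1)) for _ in range(1000)]
print(karmarkar_karp(vals))              # typically a very small residual
```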
DewTwo: a transparent PCS with quasi-linear prover, logarithmic verifier and 4.5KB proofs from falsifiable assumptions
We construct the first polynomial commitment scheme (PCS) that has a transparent setup, quasi-linear prover time, $\log N$ verifier time, and $\log \log N$ proof size, for multilinear polynomials of size $N$. Concretely, we have the smallest proof size amongst transparent PCS, with proof size less than $4.5$KB for $N\leq 2^{30}$. We prove that our scheme is secure entirely under falsifiable assumptions about groups of unknown order. The scheme significantly improves on the prior work of Dew (PKC 2023), which has super-cubic prover time and relies on the Generic Group Model (a non-falsifiable assumption). Along the way, we make several contributions that are of independent interest: PoKEMath, a protocol for efficiently proving that an arbitrary predicate over committed integer vectors holds; SIPA, a bulletproofs-style inner product argument in groups of unknown order; we also distill out what prior work required from the Generic Group Model and frame this as a falsifiable assumption.
Asynchronous YOSO a la Paillier
We present the first complete asynchronous MPC protocols for the YOSO (You Speak Only Once) setting. Our protocols rely on threshold additively homomorphic Paillier encryption and are adaptively secure. We rely on the paradigm of Blum et al. (TCC '20) in order to recursively refresh the setup needed for running future steps of YOSO MPC, but replace any use of heavy primitives such as threshold fully homomorphic encryption in their protocol with more efficient alternatives. In order to obtain an efficient YOSO MPC protocol, we also revisit the consensus layer upon which our protocol is built. To this end, we present a novel total-order broadcast protocol with subquadratic communication complexity in the total number $M$ of parties in the network and whose complexity is optimal in the message length. This improves on recent work of Banghale et al. (ASIACRYPT '22) by giving a simplified and more efficient broadcast extension protocol for the asynchronous setting that avoids the use of cryptographic accumulators.
Summation-based Private Segmented Membership Test from Threshold-Fully Homomorphic Encryption
In many real-world scenarios, a client wishes to check whether a data element it holds is included in a set segmented across a large number of data holders. To protect user privacy, the client's query and the data holders' sets should remain encrypted throughout the whole process. Prior work on Private Set Intersection (PSI), Multi-Party PSI (MPSI), Private Membership Test (PMT), and Oblivious RAM (ORAM) falls short in this scenario in many ways: it either requires data holders to possess the sets in plaintext, incurs prohibitively high latency when aggregating results from a large number of data holders, leaks which party holds the intersection element, or induces a high false-positive rate.
This paper introduces the primitive of a Private Segmented Membership Test (PSMT). We give a basic construction of a protocol to solve PSMT using a threshold variant of approximate-arithmetic homomorphic encryption, and show how to overcome existing challenges to construct a PSMT protocol that neither leaks information about the party holding the intersection element nor incurs false positives for a large number of data holders, while ensuring $IND-CPA^D$ security. Our novel approach is superior to existing state-of-the-art approaches in scalability with regard to the number of supported data holders. This is enabled by a novel summation-based homomorphic membership check rather than a product-based one, as well as various novel ideas addressing technical challenges. Our PSMT protocol supports many more parties (up to 4096 in experiments) compared to prior related work, which supports only around 100 parties efficiently. Our experimental evaluation shows that our method's aggregation of results from data holders can run in 92.5s for 1024 data holders and a set size of $2^{25}$, and that our method's overhead increases very slowly with the number of senders. We also compare our PSMT protocol to other state-of-the-art PSI and MPSI protocols and discuss our improvements in usability with a better privacy model and a larger number of parties.
A Revision of CROSS Security: Proofs and Attacks for Multi-Round Fiat-Shamir Signatures
Signature schemes from multi-round interactive proofs are becoming increasingly relevant in post-quantum cryptography. A prominent example is CROSS, recently admitted to the second round of the NIST on-ramp standardisation process for post-quantum digital signatures. While the security of these constructions relies on the Fiat-Shamir transform, in the case of CROSS the use of the fixed-weight parallel-repetition optimisation makes the security analysis fuzzier than usual. A recent work has shown that the fixed-weight parallel repetition of a multi-round interactive proof is still knowledge sound, but no matching result appears to be known for the non-interactive version.
In this paper we provide two main results. First, we explicitly prove the EUF-CMA security of CROSS, filling a gap in the literature. We do this by showing that, in general, the Fiat-Shamir transform of an HVZK and knowledge-sound multi-round interactive proof is EUF-CMA secure. Second, we present a novel forgery attack on signatures obtained from fixed-weight repetitions of 5-round interactive proofs, substantially improving upon a previous attack on parallel repetitions due to Kales and Zaverucha. Our new attack has particular relevance for CROSS, as it shows that several parameter sets achieve a significantly lower security level than claimed, with reductions up to 24% in the worst case.
Universal Adaptor Signatures from Blackbox Multi-Party Computation
Adaptor signatures (AS) extend the functionality of traditional digital signatures by enabling the generation of a pre-signature tied to an instance of a hard NP relation, which can later be turned (adapted) into a full signature upon revealing a corresponding witness. The recent work by Liu et al. [ASIACRYPT 2024] devised a generic AS scheme that can be used for any NP relation---which we will refer to here as a universal adaptor signature scheme, in short UAS---from any one-way function. However, this generic construction depends on the Karp reduction to the Hamiltonian cycle problem, which adds significant overhead and hinders practical applicability.
In this work, we present an alternative approach to constructing universal adaptor signature schemes, relying on the multi-party computation in the head (MPCitH) paradigm. This overcomes the reliance on the costly Karp reduction, while inheriting the core property of MPCitH---which makes it an invaluable tool in efficient cryptographic protocols---namely, that the construction is black-box with respect to the underlying cryptographic primitive (while remaining non-black-box in the relation being proven). Our framework simplifies the design of UAS and enhances its applicability across a wide range of decentralized applications, such as blockchain and privacy-preserving systems. Our results demonstrate that MPCitH-based UAS schemes offer strong security guarantees, making them a promising tool in the design of real-world cryptographic protocols.
Near-Optimal Oblivious Key-Value Stores for Efficient PSI, PSU and Volume-Hiding Multi-Maps
In this paper, we study oblivious key-value stores (OKVS) that enable encoding $n$ key-value pairs into length-$m$ encodings while hiding the input keys. The goal is to obtain a high rate, $n/m$, with efficient encoding and decoding algorithms. We present $\mathsf{RB\text{-}OKVS}$, built on random band matrices, which obtains near-optimal rates as high as 0.97, whereas prior works could only achieve rates up to 0.81 with similar encoding times.
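A minimal random-band OKVS over $\mathbb{F}_2$ can be sketched as follows (an illustrative toy of ours, not $\mathsf{RB\text{-}OKVS}$ itself, which encodes field values and carefully tunes $m$ and the band width to reach rate 0.97): each key hashes to a short band row, encoding solves the resulting banded linear system, and decoding is a single inner product.

```python
# Minimal random-band OKVS over GF(2); parameters and hashing are toy
# choices of ours, not the paper's.
import hashlib, random

M, W = 64, 16                    # encoding length and band width (toy)

def okvs_row(key):
    # Hash the key to a width-W band starting at a random position.
    h = hashlib.sha256(key.encode()).digest()
    start = int.from_bytes(h[:4], "big") % (M - W)
    band = int.from_bytes(h[4:8], "big") % (1 << W) | 1   # force lowest bit
    return band << start         # row of the band matrix, as a bitmask

def encode(pairs):
    # Gaussian elimination over GF(2); rows sorted by band start position.
    rows = sorted(((okvs_row(k), v) for k, v in pairs),
                  key=lambda rv: rv[0] & -rv[0])
    pivots = {}                  # pivot bit -> (reduced row, reduced value)
    for r, v in rows:
        while r:
            lsb = r & -r
            if lsb not in pivots:
                pivots[lsb] = (r, v)
                break
            pr, pv = pivots[lsb]
            r, v = r ^ pr, v ^ pv
        else:
            if v:
                raise ValueError("encoding failed; retry with fresh hashes")
    z = 0                        # back-substitute, highest pivot first
    for lsb in sorted(pivots, reverse=True):
        r, v = pivots[lsb]
        if bin(r & z).count("1") % 2 != v:
            z ^= lsb
    return z

def decode(z, key):              # one inner product over GF(2)
    return bin(okvs_row(key) & z).count("1") % 2

pairs = [(f"key{i}", random.randint(0, 1)) for i in range(20)]
z = encode(pairs)
assert all(decode(z, k) == v for k, v in pairs)
```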
Using $\mathsf{RB\text{-}OKVS}$, we obtain state-of-the-art protocols for private set intersection (PSI) and union (PSU). Our semi-honest PSI has up to 12% smaller communication and a 13% reduction in monetary cost, with slightly larger computation. We also obtain similar improvements for both malicious and circuit PSI. For PSU, our protocol obtains improvements of up to 22% in communication, 40% in computation and 21% in monetary cost. In general, we obtain the most communication- and cost-efficient protocols for all of the above primitives.
Finally, we present the first connection between OKVS and volume-hiding encrypted multi-maps (VH-EMM) where the goal is to outsource storage of multi-maps while hiding the number of values associated with each key (i.e., volume). We present $\mathsf{RB\text{-}MM}$ with 16% smaller storage, 5x faster queries and 8x faster setup than prior works.
SoK: PQC PAKEs - Cryptographic Primitives, Design and Security
PAKE protocols are used to establish secure communication channels using a relatively short, often human-memorable, password for authentication. The currently standardized PAKEs, however, rely on classical asymmetric (public-key) cryptography. Thus, these classical PAKEs may no longer maintain their security should the expected quantum threat become a reality. Unlike prominent security protocols such as TLS, IKEv2 and VPNs, quantum-safe PAKEs have not received much attention from the ongoing PQC integration efforts. Thus, there is a significant gap in awareness compared to PQC schemes that are subject to the official governmental and institutional standardization processes. In the work at hand, we provide a comprehensive overview of the existing PQC PAKEs, focusing on their design rationales, authentication methods and the asymmetric key agreement primitives they use. We highlight their performance and properties as per their assumed security assurances and practical usage in applications. Moreover, we address PAKE designs that are not yet present in the PQC realm and discuss the possibility of their adaptation. Thus, we offer a detailed reference and derive future work for quantum-safe PAKEs.
STIR: Reed–Solomon Proximity Testing with Fewer Queries
We present STIR (Shift To Improve Rate), an interactive oracle proof of proximity (IOPP) for Reed-Solomon codes that achieves the best known query complexity of any concretely efficient IOPP for this problem. For $\lambda$ bits of security, STIR has query complexity $O(\log d + \lambda \cdot \log \log d )$, while FRI, a popular protocol, has query complexity $O(\lambda \cdot \log d )$ (including variants of FRI based on conjectured security assumptions). STIR relies on a new technique for recursively improving the rate of the tested Reed-Solomon code.
We provide an implementation of STIR compiled to a SNARK. Compared to a highly-optimized implementation of FRI, STIR achieves an improvement in argument size that ranges from $1.25\times$ to $2.46\times$ depending on the chosen parameters, with similar prover and verifier running times. For example, in order to achieve 128 bits of security for degree $2^{26}$ and rate $1/4$, STIR has argument size $114$ KiB, compared to $211$ KiB for FRI.
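As a back-of-the-envelope comparison (the suppressed constants make these trends rather than the papers' exact query counts), one can plug $d = 2^{26}$ and $\lambda = 128$ into the two asymptotic query complexities quoted above:

```python
# Back-of-the-envelope only: the asymptotic query complexities from the
# abstract at d = 2^26 and lambda = 128; O() constants are suppressed, so
# these are trends, not the papers' exact query counts.
from math import log2

d, lam = 2**26, 128
fri  = lam * log2(d)                   # O(lambda * log d)          -> 3328
stir = log2(d) + lam * log2(log2(d))   # O(log d + lambda loglog d) -> ~628
print(fri, round(stir))
```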
Always by Your Side: Constructing Traceable Anonymous Credentials with Hardware-Binding
With the development of decentralized identity (DID), anonymous credential (AC) technology, as well as its traceability, is receiving more and more attention. Most works introduce a trusted party (regulator) that holds a decryption key or backdoor to directly deanonymize the user identity behind an anonymous authentication. While some cryptographic primitives can help regulators handle complex tracing tasks among large amounts of user profiles (stored by the issuer) and authentication records (stored by the service provider), additional security primitives are still needed to ensure the privacy of other users. Besides, although hardware-binding anonymous credential (hbAC) systems have been proposed to prevent credential sharing or to address platform resource constraints, the traceability of hbAC has yet to be discussed.
In this paper, we introduce a public-key encryption with equality test as a regulatory text for each authentication record to address the above-mentioned challenges. The security of this feature is guaranteed by the verifiability, non-frameability, and round isolation of the proposed scheme. We compare the asymptotic complexity of our scheme with that of other traceable AC schemes and show that our scheme has advantages in tracing tasks as well as in securely outsourcing them. The key feature of our scheme is that the ability to run equality tests on regulatory texts is independent of the public key and instead depends on the round identifier of the authentication. We instantiate a traceable, hardware-binding AC scheme based on smart cards and BBS+ signatures, and give a performance analysis of it.
Succinct Randomized Encodings from Laconic Function Evaluation, Faster and Simpler
Succinct randomized encodings allow encoding the input $x$ of a time-$t$ uniform computation $M(x)$ in sub-linear time $o(t)$. The resulting encoding $\tilde{x}$ allows recovering the result of the computation $M(x)$, but hides any other information about $x$. These encodings have powerful applications, including time-lock puzzles, reducing communication in MPC, and bootstrapping advanced encryption schemes.
Until not long ago, the only known constructions were based on indistinguishability obfuscation, and in particular were not based on standard post-quantum assumptions. In terms of efficiency, these constructions' encoding time is $\rm{polylog}(t)$, essentially the best one can hope for. Recently, a new construction was presented based on Circular Learning with Errors, an assumption similar to the one used in fully-homomorphic encryption schemes, and which is widely considered to be post-quantum resistant. However, its encoding efficiency falls significantly behind the obfuscation-based schemes and is $\approx \sqrt{t} \cdot s$, where $s$ is the space of the computation.
We construct, under the same assumption, succinct randomized encodings with encoding time $\approx t^{\varepsilon} \cdot s$ for arbitrarily small constant $\varepsilon<1$. Our construction is relatively simple, generic and relies on any laconic function evaluation scheme that satisfies a natural "efficiency preservation" property.
Under sub-exponential assumptions, the encoding time can be further reduced to $\approx \sqrt{s}$, but at the cost of a huge security loss.
As a corollary, assuming also bounded-space languages that are worst-case hard-to-parallelize, we obtain time-lock puzzles with an arbitrary polynomial gap between encoding and decoding times.
A Privacy Model for Classical & Learned Bloom Filters
The Classical Bloom Filter (CBF) is a class of Probabilistic Data Structures (PDS) for handling approximate membership queries (AMQ). The Learned Bloom Filter (LBF) is a recently proposed class of PDS that combines the Classical Bloom Filter with a learning model while preserving the Bloom Filter's one-sided error guarantees. Bloom Filters have been used in settings where inputs are sensitive and need to remain private in the presence of an adversary with access to the Bloom Filter through an API, or in the presence of an adversary with access to the internal state of the Bloom Filter. Prior work has investigated the privacy of the Classical Bloom Filter, providing attacks and defenses under various privacy definitions. In this work, we formulate a stronger differential-privacy-based model for the Bloom Filter. We propose constructions of the Classical and Learned Bloom Filter that satisfy $(\epsilon, 0)$-differential privacy. This is also the first work that analyses and addresses the privacy of the Learned Bloom Filter under any rigorous model, which is an open problem.
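For reference, a classical Bloom filter, the CBF that the privacy model above is defined over, fits in a few lines; note that this plain sketch contains none of the randomization that the paper's $(\epsilon, 0)$-differentially-private constructions add.

```python
# A plain Classical Bloom Filter: one-sided error (no false negatives,
# possible false positives). No differential-privacy mechanism included.
import hashlib

class BloomFilter:
    def __init__(self, m_bits, k_hashes):
        self.m, self.k = m_bits, k_hashes
        self.bits = bytearray((m_bits + 7) // 8)

    def _positions(self, item):
        # derive k indices from salted SHA-256 hashes
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def query(self, item):
        return all(self.bits[p // 8] >> (p % 8) & 1
                   for p in self._positions(item))

bf = BloomFilter(1024, 4)
bf.add("alice")
assert bf.query("alice")
```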
Double Auction Meets Blockchain: Consensus from Scored Bid-Assignment
A double auction system, where buyers and sellers trade through bids, requires a transparent and immutable mechanism to record allocation results. This demand can be met with robust ledgers that ensure persistence and liveness, as exemplified by the Bitcoin blockchain (EUROCRYPT '15). While existing blockchain-aided auction systems often rely on secure smart contracts or layer-$2$ techniques, this work proposes a more fundamental approach by constructing a provably secure blockchain protocol directly from the computation of bid allocations. The core component is an alternative proof-of-work (PoW) scheme based on a scored generalized multiple assignment problem (SGMAP), integrated into a tailored blockchain protocol. Unlike conventional PoW-based protocols, our leader selection is driven by block scores derived from the SGMAP scoring function, which is designed to be flexible enough to define the difficulty level and accommodate real-life requirements of the underlying double auction system. We prove persistence and a modified liveness property for our design, and present implementation results to validate its robustness and practicality.
GPU Implementations of Three Different Key-Switching Methods for Homomorphic Encryption Schemes
In this work, we report on the latest GPU implementations of the three well-known methods for the key-switching operation, which is critical for Fully Homomorphic Encryption (FHE). Additionally, for the first time in the literature, we provide GPU implementations of all three methods for leveled CKKS schemes. To ensure a fair comparison, we employ the most recent GPU implementation of the number-theoretic transform (NTT), which is the most time-consuming operation in key switching, and evaluate the performance across two fully homomorphic schemes: BFV and CKKS. Furthermore, we highlight the advantages and shortcomings of the three methods in the context of leveled HE schemes, and discuss other aspects such as memory requirements. Our GPU implementation is integrated with the HEonGPU library and delivers up to a $380\times$ improvement in execution time compared to the Microsoft SEAL library. Since key switching is a specialized form of the external product common in many HE schemes, our results are directly relevant to time-intensive homomorphic operations such as relinearization and rotation. As homomorphic rotation is one of the most dominant operations in bootstrapping, our results are also applicable to the bootstrapping algorithms of the BFV, BGV and CKKS schemes.
SPY-PMU: Side-Channel Profiling of Your Performance Monitoring Unit to Leak Remote User Activity
The Performance Monitoring Unit (PMU), a standard feature in all modern computing systems, presents significant security risks by leaking sensitive user activities through microarchitectural event data. This work demonstrates the feasibility of remote side-channel attacks leveraging PMU data, revealing vulnerabilities that compromise user privacy and enable covert surveillance without physical access to the target machine. By analyzing the PMU feature space, we create distinct micro-architectural fingerprints for benchmark applications, which are then utilized in machine learning (ML) models to detect the corresponding benchmarks. This approach allows us to build a pre-trained model for benchmark detection using the unique micro-architectural fingerprints derived from PMU data. Subsequently, when an attacker remotely accesses the victim’s PMU data, the pre-trained model enables the identification of applications used by the victim with high accuracy. In our proof-of-concept demonstration, the pre-trained model successfully identifies applications used by a victim when the attacker remotely accesses PMU data, showcasing the potential for malicious exploitation of PMU data. We analyze stress-ng benchmarks and build our classifiers using logistic regression, decision tree, k-nearest neighbors, and random forest ML models. Our proposed models achieve an average prediction accuracy of 98%, underscoring the potential risks associated with remote side-channel analysis using PMU data and emphasizing the need for more robust safeguards. This work underscores the urgent need for robust countermeasures to protect against such vulnerabilities and provides a foundation for future research in micro-architectural security.
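The classification step can be pictured with a short, purely hypothetical sketch: the feature dimensions, labels, and data below are invented stand-ins for PMU counter vectors (the paper profiles stress-ng benchmarks), and we use one of the ML models named above, a random forest, via scikit-learn.

```python
# Hypothetical sketch of the benchmark-fingerprint classification step.
# The data here is random toy data, NOT real PMU measurements; shapes and
# label counts are invented for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 12))      # 12 PMU counters per sample (toy data)
y = rng.integers(0, 5, size=500)    # 5 benchmark labels (toy data)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print("accuracy on held-out toy data:", clf.score(X_te, y_te))
```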
Qelect: Lattice-based Single Secret Leader Election Made Practical
In a single secret leader election (SSLE) protocol, all parties collectively and obliviously elect one leader. No one else should learn its identity unless it reveals itself as the leader. The problem was first formalized by Boneh \textit{et al.} (AFT'20), who propose an efficient construction based on the Decision Diffie-Hellman (DDH) assumption. Considering the potential risk of quantum computers, several follow-ups have focused on designing a post-quantum secure SSLE protocol based on pure lattices or fully homomorphic encryption. However, no concrete benchmarks demonstrate the feasibility of deploying such heavy cryptographic primitives.
In this work, we present Qelect, the first practical constant-round post-quantum secure SSLE protocol. We first adapt the commitment scheme in Boneh \textit{et al.} (AFT'23) into a \textit{multi-party randomizable commitment} scheme, and propose our novel construction based on an adapted version of the ring learning with errors (RLWE) problem. We then use it as a building block and construct a \textit{constant-round} single secret leader election (crSSLE) scheme. We utilize the single instruction multiple data (SIMD) property of a specific threshold fully homomorphic encryption (tFHE) scheme to evaluate our election circuit efficiently. Finally, we build Qelect from the crSSLE scheme, with performance optimizations including a preprocessing phase to amortize the local computation runtime and a retroactive detection phase to avoid the heavy zero-knowledge proofs during the election phase. Qelect achieves asymptotic improvements and is concretely practical. We implemented a prototype of Qelect and evaluated its performance in a WAN. Qelect is at least two orders of magnitude faster than the state-of-the-art.
On symbolic computations over arbitrary commutative rings and cryptography with the temporal Jordan-Gauss graphs
The paper is dedicated to Multivariate Cryptography over a general commutative ring $K$ and to protocols of symbolic computation for the safe delivery of multivariate maps. We consider an iterative algorithm for generating multivariate maps of prescribed degree or density with a trapdoor accelerator, i.e., a piece of information which allows one to compute the preimage of the map in polynomial time. The concept of Jordan-Gauss temporal graphs is used for the obfuscation of known graph-based public keys and for the construction of new cryptosystems. We suggest using platforms of Noncommutative Cryptography, defined in terms of Multivariate Cryptography over $K$, for the conversion of Multivariate Public Keys into El Gamal-type cryptosystems. Some new platforms are introduced.
Multi-Key Homomorphic Secret Sharing
Homomorphic secret sharing (HSS) is a distributed analogue of fully homomorphic encryption (FHE) where following an input-sharing phase, two or more parties can locally compute a function over their private inputs to obtain shares of the function output.
Over the last decade, HSS schemes have been constructed from an array of different assumptions. However, all existing HSS schemes, except ones based on assumptions known to imply multi-key FHE, require a public-key infrastructure (PKI) or a correlated setup between parties. This limitation carries over to many applications of HSS.
In this work, we construct multi-key homomorphic secret sharing (MKHSS), where given only a common reference string (CRS), two parties can secret share their inputs to each other and then perform local computations as in HSS. We present the first MKHSS schemes supporting all NC1 computations from either the decisional Diffie-Hellman (DDH), decisional composite residuosity (DCR), or class group assumptions.
Our constructions imply the following applications in the CRS model:
- Succinct two-round secure computation. Under the same assumptions as our MKHSS schemes, we construct succinct, two-round secure two-party computation for NC1 circuits. Previously, such a result was only known from the learning with errors assumption.
- Attribute-based NIKE. Under DCR or class group assumptions, we construct non-interactive key exchange (NIKE) protocols where two parties agree on a key if and only if their secret attributes satisfy a public NC1 predicate. This significantly generalizes the existing notion of password-based NIKE.
- Public-key PCFs. Under DCR or class group assumptions, we construct public-key pseudorandom correlation functions (PCFs) for any NC1 correlation. This yields the first public-key PCFs for Beaver triples (and more) from non-lattice assumptions.
- Silent MPC. Under DCR or class group assumptions, we construct a $p$-party secure computation protocol in the silent preprocessing model where the preprocessing phase has communication $O(p)$, ignoring polynomial factors. All prior protocols that do not rely on spooky encryption require $\Omega(p^2)$ communication.
Simultaneous-Message and Succinct Secure Computation
We put forth and instantiate a new primitive we call simultaneous-message and succinct (SMS) secure computation. An SMS scheme enables a minimal communication pattern for secure computation in the following scenario: Alice has a large private input $X$, Bob has a small private input $y$, and Charlie wants to learn $f(X, y)$ for some public function $f$.
Given a common reference string (CRS) setup phase, an SMS scheme for a function $f$ is instantiated with two parties holding inputs $X$ and $y$, and has the following structure:
- The parties simultaneously exchange a single message.
- Communication is succinct, scaling sublinearly in the size of $X$ and the output $f(X, y)$.
- Without further interaction, the parties can locally derive additive secret shares of $f(X, y)$.
Indeed, Alice and Bob simultaneously send each other a message using the CRS and their private inputs. Using the transcript and their private state, the parties locally derive additive secret shares of $f(X, y)$, which they can send to Charlie. As such, an SMS scheme incurs a communication cost to Charlie that is only twice the function output length. Importantly, the size of Alice's message does not grow with the size of her input $X$, and both Alice's and Bob's first-round messages grow sublinearly in the size of the output. Additionally, Alice's or Bob's view provides no information about the other party's input besides the output of $f(X, y)$, even when colluding with Charlie.
We obtain the following results:
- Assuming Learning With Errors (LWE), we build an SMS scheme supporting the evaluation of depth-$d$ circuits, where Alice's message is of size $|f(X, y)|^{2/3} \cdot \mathrm{poly}(\lambda, d)$, Bob's message is of size $(|y| + |f(X, y)|^{2/3}) \cdot \mathrm{poly}(\lambda, d)$, and $\lambda$ is the security parameter. We can further extend this to support all functions by assuming the circular security of LWE.
- Assuming sub-exponentially secure indistinguishability obfuscation, in conjunction with other standard assumptions, we build an SMS scheme supporting arbitrary polynomial-sized batch functions of the form $(f(x_1, y), \ldots, f(x_L, y))$, for $X = (x_1, \ldots, x_L)$. The sizes of Alice's and Bob's messages in this construction are $\mathrm{poly}(\lambda)$ and $\mathrm{poly}(\lambda, |f|, \log L)$, respectively.
We show that SMS schemes have several immediate applications. An SMS scheme gives:
- A direct construction of trapdoor hash functions (TDH) (Döttling et al., Crypto'19) for the same class of functions as the one supported by the SMS scheme.
- A simple and generic compiler for obtaining compact, rate-1 fully homomorphic encryption (FHE) from any non-compact FHE scheme.
- A simple and generic compiler for obtaining correlation-intractable (CI) hash functions that are secure against all efficiently-searchable relations.
In turn, under the LWE assumption, we obtain the first construction of TDH for all functions and generic approaches for obtaining rate-1 FHE and CI hashing. We also show that our iO-based construction gives an alternative approach for two-round secure computation with communication succinctness in the output length (Hubáček and Wichs, ITCS'15).
Module Learning with Errors with Truncated Matrices
The Module Learning with Errors ($\mathsf{MLWE}$) problem is one of the most commonly used hardness assumptions in lattice-based cryptography. In its standard version, a matrix $\mathbf{A}$ is sampled uniformly at random over a quotient ring $R_q$, as well as noisy linear equations of the form $\mathbf{A} \mathbf{s}+ \mathbf{e} \bmod q$, where $\mathbf{s}$ is the secret, sampled uniformly at random over $R_q$, and $\mathbf{e}$ is the error, coming from a Gaussian distribution. Many previous works have focused on variants of $\mathsf{MLWE}$ where the secret and/or the error are sampled from different distributions. Only a few works have focused on different distributions for the matrix $\mathbf{A}$. One variant proposed in the literature is to consider matrix distributions where the low-order bits of a uniform $\mathbf{A}$ are deleted. This is a natural approach for saving bandwidth. We call it truncated $\mathsf{MLWE}$.
In this work, we show that the hardness of standard $\mathsf{MLWE}$ implies the hardness of truncated $\mathsf{MLWE}$, both for search and decision versions. Prior works only covered the search variant and relied on the (module) $\mathsf{NTRU}$ assumption, limitations which we are able to overcome. Overall, we provide two approaches, offering different advantages. The first uses a general Rényi divergence argument, applicable to a wide range of secret/error distributions, but which only works for the search variants of (truncated) $\mathsf{MLWE}$. The second applies to the decision versions, by going through an intermediate variant of $\mathsf{MLWE}$, where additional hints on the secret are given to the adversary. However, the reduction makes use of discrete Gaussian distributions.
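A plain-LWE toy of the truncation in question (over the integers rather than a module over $R_q$, with parameters that are illustrative rather than the paper's): the low-order bits of a uniform $\mathbf{A}$ are zeroed, saving bandwidth, and the noisy products are formed with the truncated matrix.

```python
# Toy plain-LWE version of truncated matrices (illustrative parameters,
# not the paper's; the paper works with modules over a ring R_q).
import random

q, n, m, B = 2**16, 8, 16, 2**4     # B = 2^4: delete 4 low-order bits

A = [[random.randrange(q) for _ in range(n)] for _ in range(m)]
A_trunc = [[(a // B) * B for a in row] for row in A]   # low bits zeroed
s = [random.randrange(q) for _ in range(n)]            # secret
e = [random.randint(-3, 3) for _ in range(m)]          # small error
b = [(sum(A_trunc[i][j] * s[j] for j in range(n)) + e[i]) % q
     for i in range(m)]
# the reductions above show (A_trunc, b) is no easier than standard (M)LWE
```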
DEEP Commitments and Their Applications
This note studies a method of committing to a polynomial in a way that allows executions of low-degree tests such as FRI to be batched and even deferred. In particular, it achieves (unlimited-depth) aggregation for STARKs.
Efficient 2PC for Constant Round Secure Equality Testing and Comparison
Secure equality testing and comparison are two important primitives widely used in many secure computation scenarios, such as privacy-preserving machine learning, private set intersection, and secure data mining. This work proposes new constant-round two-party computation (2PC) protocols for secure equality testing and comparison.
Our protocols are designed in the online/offline paradigm.
For 32-bit inputs, the online communication costs of our equality testing and secure comparison protocols are as low as 76 bits (1\% of ABY) and 384 bits (5\% of ABY), respectively.
Our benchmarks show that (i) for 32-bit equality testing, our scheme performs $9 \times$ faster than Guo \emph{et al.} (EUROCRYPT 2023) and $15 \times$ faster than the garbled circuit (GC) with the half-gate optimization (CRYPTO 2015); (ii) for 32-bit secure comparison, our scheme performs $3 \times$ faster than Guo \emph{et al.} (EUROCRYPT 2023), and $6 \times$ faster than both Rathee \emph{et al.} (CCS 2020) and GC with the half-gate optimization.
Twist and Shout: Faster memory checking arguments via one-hot addressing and increments
A memory checking argument enables a prover to prove to a verifier that it is correctly processing reads and writes to memory. Such arguments are widely used in modern SNARKs, especially in zkVMs, where the prover proves the correct execution of a CPU, including the correctness of memory operations.
We describe a new approach for memory checking, which we call the method of one-hot addressing and increments. We instantiate this method via two different families of protocols, called Twist and Shout. Twist supports read/write memories, while Shout targets read-only memories (also known as lookup arguments). Both Shout and Twist have logarithmic verifier costs. Unlike prior works, these protocols do not invoke "grand product" or "grand sum" arguments.
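The algebraic core suggested by the name can be pictured in a few lines (our own toy rendering; the actual protocols commit to the one-hot vectors and prove the relations below with sum-check machinery rather than computing them in the clear):

```python
import numpy as np

memory = np.array([7, 1, 8, 2, 8, 1, 8, 2])   # a memory of size N = 8
addr, delta = 5, 3

e = np.zeros(len(memory), dtype=int)          # one-hot address vector e_addr
e[addr] = 1

read_value = e @ memory                       # Shout-style read: <e_addr, mem>
assert read_value == memory[addr]

memory = memory + delta * e                   # Twist-style write: place an
assert memory[addr] == read_value + delta     # "increment" at one position
```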
Twist and Shout significantly improve the prover costs of prior works across the full range of realistic memory sizes, from tiny memories (e.g., 32 registers as in RISC-V), to memories that are so large they cannot be explicitly materialized (e.g., structured lookup tables of size $2^{64}$ or larger, which arise in Lasso and the Jolt zkVM). Detailed cost analysis shows that Twist and Shout are well over 10x cheaper for the prover than state-of-the-art memory-checking procedures configured to have logarithmic proof length.
Prior memory-checking procedures can also be configured to have larger proofs. Even then, we estimate that Twist and Shout are at least 2--4x faster for the prover in key applications.
Finally, using Shout, we provide two fast-prover SNARKs for non-uniform constraint systems, both of which achieve minimal commitment costs (the prover commits only to the witness): (1) SpeedySpartan applies to Plonkish constraints, substantially improving the previous state-of-the-art protocol, BabySpartan; and (2) Spartan++ applies to CCS (a generalization of Plonkish and R1CS), improving prover times over the previous state-of-the-art protocol, Spartan, by 6x.
Tighter Security for Schnorr Identification and Signatures: A High-Moment Forking Lemma for $\Sigma$-Protocols
The Schnorr identification and signature schemes have been amongst the most influential cryptographic protocols of the past three decades. Unfortunately, although the best-known attacks on these two schemes are via discrete-logarithm computation, the known approaches for basing their security on the hardness of the discrete logarithm problem encounter the ``square-root barrier''. In particular, in any group of order $p$ where Shoup's generic hardness result for the discrete logarithm problem is believed to hold (and is thus used for setting concrete security parameters), the best-known $t$-time attacks on the Schnorr identification and signature schemes have success probability $t^2/p$, whereas existing proofs of security only rule out attacks with success probabilities $(t^2/p)^{1/2}$ and $(q_{\mathcal{H}} \cdot t^2/p)^{1/2}$, respectively, where $q_{\mathcal{H}}$ denotes the number of random-oracle queries issued by the attacker.
We establish tighter security guarantees for identification and signature schemes which result from $\Sigma$-protocols with special soundness based on the hardness of their underlying relation, and in particular for Schnorr's schemes based on the hardness of the discrete logarithm problem. We circumvent the square-root barrier by introducing a high-moment generalization of the classic forking lemma, relying on the assumption that the underlying relation is ``$d$-moment hard'': The success probability of any algorithm in the task of producing a witness for a random instance is dominated by the $d$-th moment of the algorithm's running time.
In the concrete context of the discrete logarithm problem, Shoup's original proof already shows that the discrete logarithm problem is $2$-moment hard in the generic-group model, and thus our assumption can be viewed as a highly-plausible strengthening of the discrete logarithm assumption in any group where no better-than-generic algorithms are currently known. Applying our high-moment forking lemma in this context shows that, assuming the $2$-moment hardness of the discrete logarithm problem, any $t$-time attacker breaks the security of the Schnorr identification and signature schemes with probabilities at most $(t^2/p)^{2/3}$ and $(q_{\mathcal{H}} \cdot t^2/p)^{2/3}$, respectively.
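For intuition, an illustrative calculation with parameters of our choosing (not taken from the paper): for a group of order $p \approx 2^{256}$ and a $t = 2^{80}$-time attacker,
$$\frac{t^2}{p} = \frac{2^{160}}{2^{256}} = 2^{-96}, \qquad \Big(\frac{t^2}{p}\Big)^{1/2} = 2^{-48}, \qquad \Big(\frac{t^2}{p}\Big)^{2/3} = 2^{-64},$$
so the new analysis caps such an attacker's success probability at $2^{-64}$, where the classic forking lemma only caps it at $2^{-48}$.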
VECTIS: Efficient Batching Framework for Group-based CP-SNARKs
Blockchain applications in finance and identity management increasingly require scalable and privacy-preserving solutions. Cryptographic commitments secure sensitive data on-chain, but verifying properties of these commitments efficiently remains challenging, particularly in large-scale scenarios. For multiple commitments, CP-SNARKs, a family of zk-SNARKs, enhance prover efficiency by shifting large-cost operations outside the circuit and verifying linkages between commitments, but incur verifier-side overhead due to linkage checks. Verification costs grow with the number of commitments, leading to inefficiencies in key size, proof size, and verification time.
We propose $\textbf{VECTIS}$, an efficient batching framework for proving multiple commitments. Our approach aggregates multiple commitments into a single batched commitment, enabling the linking proof system to operate on the aggregated commitment instead of individual commitments, thereby significantly reducing the overall verification cost.
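One standard way to realize such aggregation for homomorphic commitments is a random linear combination, sketched below for toy Pedersen-style commitments; this is our illustration of the batching principle, not necessarily VECTIS's concrete mechanism.

```python
# Batching homomorphic commitments via a random linear combination.
import random

p = 2**61 - 1        # toy prime modulus; NOT secure parameters
g, h = 3, 7          # fixed bases (illustrative)

def commit(m, r):
    return pow(g, m, p) * pow(h, r, p) % p

msgs  = [random.randrange(1 << 16) for _ in range(4)]
rands = [random.randrange(1 << 16) for _ in range(4)]
comms = [commit(m, r) for m, r in zip(msgs, rands)]

rho = random.randrange(1 << 32)      # verifier's batching challenge
C, m_agg, r_agg = 1, 0, 0
for i, (Ci, mi, ri) in enumerate(zip(comms, msgs, rands)):
    w = rho ** i                     # weight rho^i for the i-th commitment
    C = C * pow(Ci, w, p) % p
    m_agg, r_agg = m_agg + w * mi, r_agg + w * ri

assert C == commit(m_agg, r_agg)     # one opening check instead of four
```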
Experimental results show meaningful efficiency gains. For $2^{16}$ commitments, $\textbf{VECTIS}$ reduces the verification time to $0.064$s, achieving over $30\times$ improvement compared to LegoSNARK’s $1.972$s. These results show $\textbf{VECTIS}$’s potential for enabling scalable and efficient privacy-preserving solutions in blockchain applications.
One-shot Signatures and Applications to Hybrid Quantum/Classical Authentication
We define the notion of one-shot signatures, which are signatures where any secret key can be used to sign only a single message, after which it self-destructs. While such signatures are of course impossible classically, we construct one-shot signatures using quantum no-cloning. In particular, we show that such signatures exist relative to a classical oracle, which we can then heuristically obfuscate using known indistinguishability obfuscation schemes.
We show that one-shot signatures have numerous applications for hybrid quantum/classical cryptographic tasks, where all communication is required to be classical, but local quantum operations are allowed. Applications include one-time signature tokens, quantum money with classical communication, decentralized blockchain-less cryptocurrency, signature schemes with unclonable secret keys, non-interactive certifiable min-entropy, and more. We thus position one-shot signatures as a powerful new building block for novel quantum cryptographic protocols.
Constrained Pseudorandom Functions for Inner-Product Predicates from Weaker Assumptions
In this paper, we provide a novel framework for constructing constrained pseudorandom functions (CPRFs) with inner-product constraint predicates, using ideas from subtractive secret sharing and related-key-attack security.
Our framework can be instantiated using a random oracle or any suitable related-key-attack (RKA) secure pseudorandom function. This results in three new CPRF constructions:
1. an adaptively-secure construction in the random oracle model;
2. a selectively-secure construction under the DDH assumption; and
3. a selectively-secure construction with a polynomial domain under the assumption that one-way functions exist.
All three instantiations are constraint-hiding and support inner-product predicates, leading to the first constructions of such expressive CPRFs under each corresponding assumption. Moreover, while the OWF-based construction is primarily of theoretical interest, the random oracle and DDH-based constructions are concretely efficient, which we show via an implementation.
QuietOT: Lightweight Oblivious Transfer with a Public-Key Setup
Oblivious Transfer (OT) is at the heart of secure computation and is a foundation for many applications in cryptography. Over two decades of work have led to extremely efficient protocols for evaluating OT instances in the preprocessing model, through a paradigm called OT extension.
A few OT instances generated in an offline phase can be used to perform many OTs in an online phase efficiently, i.e., with very low communication and computational overheads.
Specifically, traditional OT extension protocols use a small number of “base” OTs, generated using any black-box OT protocol, and convert them into many OT instances using only lightweight symmetric-key primitives.
Recently, a new paradigm of OT with a *public-key setup* has emerged, which replaces the base OTs with a non-interactive setup: Using only the public key of the other party, two parties can efficiently compute a virtually unbounded number of OT instances on-the-fly.
In this paper, we put forth a novel framework for OT extension with a public-key setup and concretely efficient instantiations. An implementation of our framework is 30-100 times faster when compared to the previous state-of-the-art public-key OT protocols, and remains competitive even when compared to OT protocols that *do not* offer a public-key setup. Additionally, our instantiations result in the first public-key schemes with plausible post-quantum security.
In summary, this paper contributes:
- QuietOT: A framework for OT extension with a public-key setup that uses fast, symmetric-key primitives to generate OT instances following a one-time public-key setup, and offers additional features such as precomputability.
- A public-key setup for QuietOT from the RingLWE assumption, resulting in the first post-quantum construction of OT extension with a public-key setup.
- An optimized, open-source implementation of our construction that can generate up to 1M OT extensions per second on commodity hardware. In contrast, the state-of-the-art public-key OT protocol is limited to approximately 20K OTs per second.
- The first formal treatment of the security of OT with a public-key setup in a multi-party setting, which addresses several subtleties that were overlooked in prior work.
Application of Mordell-Weil lattices with large kissing numbers to acceleration of multi-scalar multiplication on elliptic curves
This article aims to speed up (the precomputation stage of) multi-scalar multiplication (MSM) on ordinary elliptic curves of $j$-invariant $0$ with respect to specific ``independent'' (a.k.a. ``basis'') points. For this purpose, so-called Mordell--Weil lattices (up to rank $8$) with large kissing numbers (up to $240$) are employed. In a nutshell, the new approach consists in obtaining, more efficiently, a considerable number (up to $240$) of certain elementary linear combinations of the ``independent'' points. By scaling the point (re)generation process, it is thus possible to get a significant performance gain. As usual, the resulting curve points can then be regularly used in the main stage of an MSM algorithm to avoid repeating computations. Seemingly, this is the first usage of lattices with large kissing numbers in cryptography, while such lattices have already found numerous applications in other mathematical domains. Without exaggeration, MSM is a widespread primitive (often the unique bottleneck) in modern protocols of real-world elliptic curve cryptography. Moreover, the new (re)generation technique is amenable to further improvements by considering Mordell--Weil lattices with even greater kissing numbers.
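To see why tables of elementary combinations of basis points pay off, consider the classic two-scalar joint double-and-add (Shamir's trick), which reuses the precomputed sum $P_1 + P_2$; this is textbook material shown for contrast, not the paper's lattice-based generation technique, and plain integers stand in for curve points.

```python
def shamir(a1, P1, a2, P2, add, dbl, zero):
    # Precomputed "elementary combinations" of the two basis points:
    table = {(0, 1): P2, (1, 0): P1, (1, 1): add(P1, P2)}
    R = zero
    for i in range(max(a1.bit_length(), a2.bit_length()) - 1, -1, -1):
        R = dbl(R)                        # one shared doubling per bit
        bits = ((a1 >> i) & 1, (a2 >> i) & 1)
        if bits != (0, 0):
            R = add(R, table[bits])       # at most one addition per bit
    return R

# Toy group: integers under addition stand in for elliptic-curve points.
assert shamir(13, 5, 7, 11, lambda x, y: x + y, lambda x: 2 * x, 0) == 13 * 5 + 7 * 11
```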
A Horizontal Attack on the Codes and Restricted Objects Signature Scheme (CROSS)
CROSS is a post-quantum secure digital signature scheme submitted to NIST’s Call for Additional Signatures that was recently selected for round 2. It features signature and key sizes in the range of SLH-DSA while providing a substantially faster signing operation. In this work, we provide the first passive side-channel attack on the scheme. The attack recovers the secret key for all but one parameter set from a single power trace, and requires at most two power traces for the R-SDP(G) 1 Fast instance. To successfully mount the attack, we show how to recover the secret key from side-channel information gained from the syndrome computation in CROSS’ identification protocol. We furthermore show how the hypothesis space of the attack can be restricted using information from the published signature.
Last updated: 2025-01-24
A Compact and Parallel Swap-Based Shuffler Based on the Butterfly Network and Its Complexity Against Side-Channel Analysis
A prominent countermeasure against side-channel attacks, the hiding countermeasure, typically involves shuffling operations using a permutation algorithm. Especially in the era of Post-Quantum Cryptography, the importance of the hiding countermeasure is emphasized due to the computational characteristics of lattice- and code-based cryptography. In this context, swiftly and securely generating permutations has a critical impact on an algorithm's security and efficiency. The Fisher-Yates shuffle is widely adopted because of its high security and ease of implementation, but its sequential nature limits it to complexity $O(N)$. In response, we propose a time-area trade-off swap algorithm, FSS, based on the butterfly network, with only $\log(N)$ depth, $\log(N)$ work, and $O(1)$ operation time in parallel. We calculate the maximum gain that an attacker can achieve through butterfly operations with only $\log(N)$ depth from a side-channel analysis perspective. In particular, we show that it is possible to derive a generalized formula for the complexity of higher-order side-channel attacks for arbitrary input sizes, via the fractal structure of the butterfly network. Furthermore, our research highlights the possibility of generating efficient and secure permutations using a minimal amount of randomness.
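A minimal sketch of a butterfly-network shuffle is given below (our illustration; the paper's FSS algorithm, its randomness schedule, and its side-channel analysis are not reproduced). Each of the $\log_2(N)$ layers consists of $N/2$ independent one-bit swap decisions, so a whole layer is a single step for parallel hardware; note that a single butterfly pass cannot produce all $N!$ permutations uniformly, which is why quantifying the attacker's gain matters.

```python
import secrets

def butterfly_shuffle(a):
    n = len(a)
    assert n & (n - 1) == 0, "N must be a power of two"
    stride = n // 2
    while stride >= 1:                       # log2(N) layers
        for i in range(n):
            if i & stride == 0:              # visit each pair (i, i+stride) once
                if secrets.randbits(1):      # swap decided by one random bit
                    a[i], a[i + stride] = a[i + stride], a[i]
        stride //= 2
    return a

print(butterfly_shuffle(list(range(8))))
```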
A Formal Analysis of Apple’s iMessage PQ3 Protocol
We present the formal verification of Apple’s iMessage PQ3, a highly performant, device-to-device messaging protocol offering strong security guarantees even against an adversary with quantum computing capabilities. PQ3 leverages Apple’s identity services together with a custom, post-quantum secure initialization phase and afterwards it employs a double ratchet construction in the style of Signal, extended to provide post-quantum, post-compromise security.
We present a detailed formal model of PQ3, a precise specification of its fine-grained security properties, and machine-checked security proofs using the TAMARIN prover. Particularly novel is the integration of post-quantum secure key encapsulation into the relevant protocol phases and the detailed security claims along with their complete formal analysis. Our analysis covers both key ratchets, including unbounded loops, which was believed by some to be out of scope of symbolic provers like TAMARIN (it is not!).
Signatures with Tight Adaptive Corruptions from Search Assumptions
We construct the first tightly secure signature schemes in the multi-user setting with adaptive corruptions from the classical discrete logarithm, RSA, factoring, or post-quantum group action discrete logarithm assumptions. In contrast, previous tightly secure schemes are based on decisional assumptions (e.g., (group action) DDH) or interactive search assumptions (e.g., one-more CDH).
The security of our schemes is independent of the number of users, signing queries, and random-oracle queries, and forging our signatures is as hard as solving the underlying search problem. Our starting point is an identification scheme with multiple secret keys per public key (e.g., Okamoto identification (CRYPTO'92) and parallel-OR identification (CRYPTO'94)). This property allows a reduction to solve a search problem while answering corruption queries for all users in the signature security game. To convert such an identification scheme tightly into a signature scheme, we employ the randomized Fischlin transformation introduced by Kondi and shelat (Asiacrypt 2022), which provides straight-line extraction. Intuitively, the properties of the transformation guarantee the tight security of our signature scheme in the programmable random-oracle model, but we successfully prove its tight security in the non-programmable random-oracle model.
Delegated Multi-party Private Set Intersection from Secret Sharing
In this work, we address the problem of Delegated PSI (D-PSI), where a cloud server is introduced to handle most computational and communication tasks. D-PSI enables users to securely delegate their private sets to the cloud, ensuring the privacy of their data while allowing efficient computation of the intersection. The cloud operates under strict security requirements, learning nothing about the individual sets or the intersection result. Moreover, D-PSI minimizes user-to-user communication and supports "silent" processing, where the cloud can perform computations independently without user interaction, apart from set delegation and result retrieval.
We formally define the D-PSI problem and propose a novel construction that extends beyond two-party scenarios to support multi-party settings. Our construction adheres to the D-PSI requirements, including security against semi-honest adversaries, and achieves computational and communication complexities close to the ideal "perfect" D-PSI protocol. Additionally, we demonstrate the practicality of our approach through a baseline implementation and an optimized version that further reduces computational overhead. Our results establish a strong foundation for secure and efficient PSI in real-world cloud computing scenarios.
NTRU+Sign: Compact NTRU-Based Signatures Using Bimodal Distributions
We present a new lattice-based signature scheme, called ‘NTRU+Sign’, using the Fiat-Shamir with Aborts framework. The proposed scheme is based on a novel NTRU-based key structure that fits well with bimodal distributions, enabling efficiency improvements compared to its predecessor, BLISS. The novel key structure is characterized by: (1) effectively changing the modulus from $2q$ to $q$, in contrast to the existing use of $2q$ for bimodal distributions, and (2) drastically reducing the magnitude of the secret key, which directly leads to more compact signature sizes. We provide two concrete parameter sets for NTRU+Sign, supporting 93-bit and 211-bit security levels. Using the technique from GALACTICS (suggested for the constant-time implementation of BLISS), our analysis shows that NTRU+Sign achieves a good balance between computational efficiency and signature compactness with a constant-time implementation. For instance, at the NIST-3 security level, NTRU+Sign produces signatures that are significantly smaller than those of Dilithium and HAETAE, while providing faster verification speeds. These advantages position NTRU+Sign as a competitive and practical solution for real-world deployments.
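For context, the bimodal mechanism inherited from BLISS works as follows (standard presentation with our notation; NTRU+Sign applies it with modulus $q$ rather than $2q$): the signer samples a bit $b$ and outputs $z = y + (-1)^b \mathbf{S}c$, accepting with probability
$$\frac{1}{M \exp\left(-\frac{\|\mathbf{S}c\|^2}{2\sigma^2}\right)\cosh\left(\frac{\langle z, \mathbf{S}c\rangle}{\sigma^2}\right)},$$
which makes the distribution of $z$ a centered Gaussian independent of the secret $\mathbf{S}$; a smaller $\|\mathbf{S}c\|$, as in the scheme's reduced secret keys, permits a smaller $\sigma$ and hence smaller signatures.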
GCKSign: Simple and Efficient Signatures from Generalized Compact Knapsacks
In 2009, Lyubashevsky proposed a lattice-based signature scheme by applying the Fiat-Shamir transformation and proved its security under the generalized compact knapsack (GCK) problem. This scheme has a simple structure but large signature and key sizes, due to the requirements of its security reduction. Dilithium, which was submitted to the NIST Post-Quantum Cryptography standardization and selected as one of the final candidates, improves on Lyubashevsky's scheme and decreases key and signature sizes by modifying the form of the public key and adding steps to the key generation, signing, and verification algorithms. Dilithium thus has a more complex structure to implement than Lyubashevsky's scheme. To combine the strengths of both schemes, we modify Lyubashevsky's signature scheme and present a new security proof that removes the previous reduction's requirement. As a result, we propose GCKSign, a simple and practical signature scheme based on the hardness of a new GCK assumption, called target-modified one-wayness of the GCK function. Compared to Dilithium at NIST security level III, the signature size decreases by 40 percent, the sum of the signature and public key sizes decreases by 25 percent, and the secret key size decreases by 90 percent. Furthermore, owing to the simplicity of our structure, the key generation, signing, and verification algorithms of our scheme run 2.4$\times$, 1.7$\times$, and 2.0$\times$ faster than those of Dilithium, respectively.
Comprehensive Robustness Analysis of GCM, CCM, and OCB3
Clarifying the robustness of authenticated encryption (AE) schemes, such as security under nonce misuse or Release of Unverified Plaintext (RUP), is critically important due to the extensive use of AEs in real-world applications.
We present a comprehensive analysis of the robustness of well-known standards, namely GCM, CCM, and OCB3. Despite many existing studies, we uncovered several robustness properties for them that were not known in the literature.
In particular, we show that both GCM and CCM maintain authenticity under RUP. Moreover, CCM keeps this feature even if a nonce is misused. Together with existing analysis, our work gives a complete picture of the robustness of these standards for the first time. Our results also imply several new robust AE schemes based on GCM and CCM.
EQSIGN: Practical Digital Signatures from the Non-Abelian Hidden Subgroup Problem and Information Theoretic Equivocation
We present a novel digital signature scheme grounded in non-commutative cryptography and implemented over a bilinear matrix group platform. At the core of our design is a unique equivocation function that obfuscates intermediate elements, effectively concealing outputs and minimizing observable information leakage. To the best of our knowledge, this is the first digital signature scheme to combine information-theoretic security with computational hardness, relying on a challenging instance of the Non-Abelian Hidden Subgroup Problem (NAHSP) and strengthened by practical guarantees. This dual-layered security approach ensures robustness against both classical and quantum adversaries while maintaining communication overheads competitive with RSA. Our work represents a significant advancement toward efficient, quantum-resilient digital signatures for real-world applications. This paper is an early pre-release intended to invite collaboration and feedback. The work is presented for research purposes only and is not intended for use in production systems.
Discrete Gaussian sampling for BKZ-reduced basis
Discrete Gaussian sampling on lattices is a fundamental problem in lattice-based cryptography. In this paper, we revisit the Markov chain Monte Carlo (MCMC)-based Metropolis-Hastings-Klein (MHK) algorithm proposed by Wang and Ling, and study its complexity under the Geometric Series Assumption (GSA) when the given basis is BKZ-reduced. We give experimental evidence that the GSA is accurate in this context, and we give a very simple approximate formula for the complexity of the sampler that is accurate over a large range of parameters and easily computable. We apply our results to the dual attack on LWE of [Pouly and Shen 2024] and significantly improve the complexity estimates of the attack. Finally, we provide some results of independent interest on the Gaussian mass of random $q$-ary lattices.
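For reference, the GSA (standard statement, our notation) posits that the Gram-Schmidt norms $\|\mathbf{b}_i^*\|$ of a reduced basis $(\mathbf{b}_1, \dots, \mathbf{b}_n)$ decay geometrically:
$$\|\mathbf{b}_i^*\| \approx \alpha^{\,i-1}\,\|\mathbf{b}_1^*\|, \qquad i = 1, \dots, n,$$
for some $0 < \alpha < 1$ determined by the BKZ block size.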
Last updated: 2025-01-23
Better Codes for the HQC Cryptosystem
In the HQC cryptosystem, the length $n$ of the code determines several concrete parameters such as the bandwidth usage, the memory consumption, and the decoding efficiency. In this paper, we show that currently known methods to explicitly generate asymptotically good binary codes (especially ones with high relative distance) with efficient associated procedures cannot be used to improve $n$. We also show that concatenated codes are currently better suited, and, by exhaustive search over small codes, we find a closer-to-optimal concatenated code for HQC, improving upon the currently used codes.
Verifiable Oblivious Pseudorandom Functions from Lattices: Practical-ish and Thresholdisable
We revisit the lattice-based verifiable oblivious PRF construction from PKC'21 and remove or mitigate its three central sources of inefficiency. First, applying Rényi divergence arguments, we eliminate one superpolynomial factor from the ciphertext modulus \(q\), allowing us to reduce the overall bandwidth consumed by RLWE samples by about a factor of four. This necessitates introducing intermediate unpredictability notions to argue PRF security of the final output in the random oracle model. Second, we remove the reliance on the \(\mathsf{1D-SIS}\) assumption, which removes another superpolynomial factor, albeit leaving a factor that is still superpolynomial. Third, by applying the state of the art in zero-knowledge proofs for lattice statements, we achieve a reduction in bandwidth of several orders of magnitude for this material.
Finally, we give a \(t\)-out-of-\(n\) threshold variant of the VOPRF for constant \(t\) and with trusted setup, based on a \(n\)-out-of-\(n\) distributed variant of the VOPRF (and without trusted setup).