Robustness of implemented device-independent protocols against constrained leakage

Ernest Y.-Z. Tan

arXiv:2302.13928·quant-ph·July 6, 2023

TL;DR

This paper analyzes the robustness of device-independent cryptographic protocols against constrained information leakage, providing new proof techniques and estimates of leakage effects on keyrates to enhance practical security assurances.

Contribution

It introduces a tailored leakage model and compatible proof methods for analyzing recent DI protocol demonstrations, improving understanding of leakage tolerance.

Findings

1. Leakage impacts on keyrates are quantitatively estimated.

2. The new proof structure is compatible with recent DI protocols.

3. Protocols can tolerate certain leakage levels while maintaining positive keyrates.

Tables

Table 1: List of notation

Symbol	Definition
$\log$	Base-$2$ logarithm
$H(\cdot)$	Base-$2$ von Neumann entropy
$\|\cdot\|_{p}$	Schatten $p$-norm
$\operatorname{S}_{=}(A)$ (resp. $\operatorname{S}_{\leq}(A)$)	Set of normalized (resp. subnormalized) states on register $A$
$A_{j}^{k}$	Registers $A_{j}A_{j+1}\dots A_{k-1}A_{k}$

Equations

$$F(\rho,\sigma)\coloneqq\left\|\sqrt{\rho}\sqrt{\sigma}\right\|_{1}+\sqrt{(1-\operatorname{Tr}[\rho])(1-\operatorname{Tr}[\sigma])},$$

$$H_{\min}(A|B)_{\rho}$$

$$H_{\max}(A|B)_{\rho}$$

$$H_{\min}^{\epsilon_{s}}(A|B)_{\rho}\coloneqq\max_{\substack{\tilde{\rho}\in\operatorname{S}_{\leq}(AB)\ \text{s.t.}\\ P(\tilde{\rho},\rho)\leq\epsilon_{s}}}H_{\min}(A|B)_{\tilde{\rho}},\qquad H_{\max}^{\epsilon_{s}}(A|B)_{\rho}\coloneqq\min_{\substack{\tilde{\rho}\in\operatorname{S}_{\leq}(AB)\ \text{s.t.}\\ P(\tilde{\rho},\rho)\leq\epsilon_{s}}}H_{\max}(A|B)_{\tilde{\rho}}.$$

$$H_{\alpha}(A)_{\rho}\coloneqq\frac{1}{1-\alpha}\log\left\|\rho_{A}\right\|_{\alpha}^{\alpha}.$$

$$\mathcal{L}_{j}[\omega]$$

$$\rho^{\mathrm{test}}_{Q^{A}Q^{B}L^{\mathrm{all}}XYR}=\sum_{xy}p^{\mathrm{test}}_{xy}\,\rho^{xy}_{Q^{A}Q^{B}L^{\mathrm{all}}XYR},\qquad\rho^{\mathrm{gen}}_{Q^{A}Q^{B}L^{\mathrm{all}}XYR}=\sum_{xy}p^{\mathrm{gen}}_{xy}\,\rho^{xy}_{Q^{A}Q^{B}L^{\mathrm{all}}XYR},$$
$$\text{where}\quad\rho^{xy}_{Q^{A}Q^{B}L^{\mathrm{all}}XYR}=(\mathcal{L}\otimes I_{R})\left[\omega_{Q^{A}Q^{B}R}\otimes\left|xy\right>\!\!\left<xy\right|_{XY}\right].$$

$$\rho^{\mathrm{test}}_{ABXYR}=\sum_{xy}p^{\mathrm{test}}_{xy}\,\rho^{xy}_{ABXYR},\qquad\rho^{\mathrm{gen}}_{ABXYR}=\sum_{xy}p^{\mathrm{gen}}_{xy}\,\rho^{xy}_{ABXYR},$$
$$\text{where}\quad\rho^{xy}_{ABXYR}=(\mathcal{M}^{A}\otimes\mathcal{M}^{B}\otimes I_{R})\left[\rho^{xy}_{Q^{A}Q^{B}L^{B\to A}L^{A\to B}XYR}\right].$$

$$\inf_{\omega,\mathcal{L},\mathcal{M}^{A},\mathcal{M}^{B}}H(S|XYR)_{\rho^{\mathrm{gen}}}\quad\text{s.t.}\quad\rho^{\mathrm{test}}_{ABXY}=\sum_{abxy}p^{\mathrm{test}}_{xy}\,\mu_{ab|xy}\left|abxy\right>\!\!\left<abxy\right|,$$

$$\inf_{\omega,\mathcal{L},\mathcal{M}^{A},\mathcal{M}^{B}}H(S|XYR)_{\rho^{\mathrm{gen}}}\quad\text{s.t.}\quad\rho^{xy}_{AB}=\sum_{ab}\mu_{ab|xy}\left|ab\right>\!\!\left<ab\right|\quad\forall x,y.$$

$$F\!\left(\rho^{xy}_{Q^{A}L^{B\to A}Q^{B}L^{A\to B}XYR},\ \rho^{xy}_{Q^{A}Q^{B}XYR}\otimes\left|\phi\right>\!\!\left<\phi\right|^{\otimes 2}\right)\geq 1-\delta_{\mathrm{leak}}.$$

$$\inf_{\omega,\mathcal{M}^{A},\mathcal{M}^{B}}H(S|XYR)_{\sigma^{\mathrm{gen}}}-f_{\mathrm{cont}}(\delta_{\mathrm{leak}})\quad\text{s.t.}\quad F\!\left(\sum_{ab}\mu_{ab|xy}\left|ab\right>\!\!\left<ab\right|,\ \sigma^{xy}_{AB}\right)\geq 1-\delta_{\mathrm{leak}}\quad\forall x,y,$$

$$f_{\mathrm{cont}}(\delta)=t\log\dim(S)+(1+t)\,h_{2}\!\left(\frac{t}{1+t}\right),\quad\text{where}\quad t=\sqrt{2\delta-\delta^{2}}.$$

$$\inf_{\omega,\mathcal{M}^{A},\mathcal{M}^{B}}H(S|XYR)_{\sigma^{\mathrm{gen}}}\quad\text{s.t.}\quad\sigma^{xy}_{AB}=\sum_{ab}\mu_{ab|xy}\left|ab\right>\!\!\left<ab\right|\quad\forall x,y,$$

$$\rho^{xy}_{ABXYR}=(1-\delta_{\mathrm{leak}})\,\sigma^{xy}_{ABXYR}+\delta_{\mathrm{leak}}\,\hat{\sigma}^{xy}_{ABXYR},$$

$$\inf_{\omega,\mathcal{M}^{A},\mathcal{M}^{B},\hat{\mu}_{ab|xy}\in\mathbb{R}_{\geq 0}}(1-\delta_{\mathrm{leak}})\,H(S|XYR)_{\sigma^{\mathrm{gen}}}\quad\text{s.t.}\quad(1-\delta_{\mathrm{leak}})\,\sigma^{xy}_{AB}=\sum_{ab}\left(\mu_{ab|xy}-\delta_{\mathrm{leak}}\,\hat{\mu}_{ab|xy}\right)\left|ab\right>\!\!\left<ab\right|\ \text{ and }\ \sum_{ab}\hat{\mu}_{ab|xy}=1\quad\forall x,y,$$

$$\rho^{W}_{q}\coloneqq(1-2q)\left|\Phi^{+}\right>\!\!\left<\Phi^{+}\right|+2q\,\frac{\mathbb{I}}{4},\quad\text{where}\quad\left|\Phi^{+}\right>\coloneqq\frac{1}{\sqrt{2}}\left(\left|00\right>+\left|11\right>\right).$$

$$x=0:\ \theta=0,\quad x=1:\ \theta=\frac{\pi}{4},\quad x=2:\ \theta=\frac{\pi}{2},\quad x=3:\ \theta=\frac{3\pi}{4},\qquad y=0:\ \theta=0,\quad y=1:\ \theta=\frac{\pi}{4},\quad y=2:\ \theta=\frac{\pi}{2},\quad y=3:\ \theta=\frac{3\pi}{4}.$$

$$x=0:\ \theta=0,\quad x=1:\ \theta=\frac{\pi}{2},\qquad y=0:\ \theta=\frac{\pi}{4},\quad y=1:\ \theta=\frac{3\pi}{4}.$$

$$\vartheta_{p}\coloneqq-\log\left(1-\sqrt{1-p^{2}}\right)\leq\log\frac{2}{p^{2}},$$

$$\begin{aligned}H_{\min}^{\epsilon_{s}'}(Q|Q'Q'')&\geq H_{\min}^{\epsilon_{s}}(Q|Q'')+H_{\min}^{\nu}(Q'|QQ'')-H_{\max}^{\nu}(Q'|Q'')-3\vartheta_{\tau}\\&\geq H_{\min}^{\epsilon_{s}}(Q|Q'')-2H_{\max}^{\nu}(Q')-3\vartheta_{\tau},\end{aligned}$$

$$H_{\min}^{\epsilon_{s}'}(S|L^{A\to E}L^{B\to E}EP)_{\rho_{|\mathrm{PE}}}\geq H_{\min}^{\epsilon_{s}}(S|EP)_{\rho_{|\mathrm{PE}}}-2H_{\max}^{\nu}(L^{A\to E})_{\rho_{|\mathrm{PE}}}-2H_{\max}^{\nu}(L^{B\to E})_{\rho_{|\mathrm{PE}}}-6\vartheta_{\tau}.$$

$$\begin{aligned}H_{\max}^{\nu}(L)_{\rho_{|\mathrm{PE}}}&\leq H_{\alpha}(L)_{\rho_{n}}+\frac{\log(1/\epsilon_{\mathrm{PE}})}{1/\alpha-1}+\frac{\vartheta_{\nu}}{1/\alpha-1}\\&\leq\sum_{j=1}^{n}\sup_{\omega}H_{\alpha}(L_{j})_{\mathcal{L}_{j}[\omega]}+\frac{\log(1/\epsilon_{\mathrm{PE}})+\vartheta_{\nu}}{1/\alpha-1},\end{aligned}$$

$$H_{\alpha}(L_{j})_{\rho}\leq H_{\alpha}(L_{j})_{P[\rho]}=\frac{1}{1-\alpha}\log\sum_{k}w_{k}^{\alpha},$$

$$\sup_{w\in D}\sum_{k}w_{k}^{\alpha}\quad\text{s.t.}\quad w_{0}\geq 1-\delta_{\mathrm{leak}},$$

$$(1-\delta_{\mathrm{leak}})^{\alpha}+(d_{L}-1)\left(\frac{\delta_{\mathrm{leak}}}{d_{L}-1}\right)^{\alpha},$$

$$H_{\max}^{\nu}(L)_{\rho_{|\mathrm{PE}}}\leq\frac{n}{1-\alpha}\log\left((1-\delta_{\mathrm{leak}})^{\alpha}+(d_{L}-1)\left(\frac{\delta_{\mathrm{leak}}}{d_{L}-1}\right)^{\alpha}\right)+\frac{\log(1/\epsilon_{\mathrm{PE}})+\vartheta_{\nu}}{1/\alpha-1}.$$

$$\sup_{w_{k}\in\mathbb{R}_{\geq 0}}\sum_{k}w_{k}^{\alpha}\quad\text{s.t.}\quad w_{0}\geq 1-\delta_{\mathrm{leak}},\quad\sum_{k}w_{k}E_{k}\leq E_{\mathrm{exp}},\quad\sum_{k}w_{k}=1,$$

$$g(\kappa,\beta,\lambda)\coloneqq(1-\alpha)\left(\left(\frac{\alpha}{\lambda-\kappa}\right)^{\frac{\alpha}{1-\alpha}}+\sum_{k\neq 0}\left(\frac{\alpha}{\beta E_{k}+\lambda}\right)^{\frac{\alpha}{1-\alpha}}\right)-\kappa(1-\delta_{\mathrm{leak}})+\beta E_{\mathrm{exp}}+\lambda.$$
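As a rough numerical illustration of the closed-form max-entropy bound on a leakage register listed above, the following sketch evaluates its two terms. The function names and the sample parameter values are hypothetical choices made for this example and are not taken from the paper; $\alpha$ is assumed to lie in $(0,1)$, logarithms are base 2 as in Table 1, and the per-round term is read as being multiplied by $n$.

```python
import math

def vartheta(p):
    """vartheta_p = -log2(1 - sqrt(1 - p^2)), in a form that stays stable for small p.

    Uses the identity 1 - sqrt(1 - p^2) = p^2 / (1 + sqrt(1 - p^2)).
    """
    return math.log2(1.0 + math.sqrt(1.0 - p * p)) - 2.0 * math.log2(p)

def per_round_renyi_bound(delta_leak, d_leak, alpha):
    """(1/(1-alpha)) * log2((1-delta)^alpha + (d-1) * (delta/(d-1))^alpha)."""
    s = (1.0 - delta_leak) ** alpha + (d_leak - 1) * (delta_leak / (d_leak - 1)) ** alpha
    return math.log2(s) / (1.0 - alpha)

def leakage_max_entropy_bound(n, delta_leak, d_leak, alpha, eps_pe, nu):
    """n * (per-round term) + (log2(1/eps_PE) + vartheta_nu) / (1/alpha - 1)."""
    correction = (math.log2(1.0 / eps_pe) + vartheta(nu)) / (1.0 / alpha - 1.0)
    return n * per_round_renyi_bound(delta_leak, d_leak, alpha) + correction

# Hypothetical parameters, chosen only to show the scaling with delta_leak.
for delta in (1e-4, 1e-3, 1e-2):
    bound = leakage_max_entropy_bound(n=10**6, delta_leak=delta, d_leak=2,
                                      alpha=0.9, eps_pe=1e-6, nu=1e-6)
    print(f"delta_leak={delta:g}: bound on H_max ~ {bound:.1f} bits")
```

Since the first term grows linearly in $n$, it is the per-round quantity that matters when comparing against the entropy produced per round; the second term is a finite-size correction that does not grow with $n$.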

Taxonomy

Topics: Physical Unclonable Functions (PUFs) and Hardware Security · Cryptographic Implementations and Security · Cryptography and Data Security

Full text

Ernest Y.-Z. Tan

Institute for Quantum Computing and Department of Physics and Astronomy, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada.

Abstract

Device-independent (DI) protocols have experienced significant progress in recent years, with a series of demonstrations of DI randomness generation or expansion, as well as DI quantum key distribution. However, existing security proofs for those demonstrations rely on a typical assumption in DI cryptography, that the devices do not leak any unwanted information to each other or to an adversary. This assumption may be difficult to perfectly enforce in practice. While there exist other DI security proofs that account for a constrained amount of such leakage, the techniques used are somewhat unsuited for analyzing the recent DI protocol demonstrations. In this work, we address this issue by studying a constrained leakage model suited for this purpose, which should also be relevant for future similar experiments. Our proof structure is compatible with recent proof techniques for flexibly analyzing a wide range of DI protocol implementations. With our approach, we compute some estimates of the effects of leakage on the keyrates of those protocols, hence providing a clearer understanding of the amount of leakage that can be allowed while still obtaining positive keyrates.

1 Introduction

Device-independent (DI) cryptography is the concept of exploiting Bell inequality violations from quantum devices to achieve cryptographic tasks [BHK05, PAB+09, Sca12]. Informally, it relies on the observation that if two or more devices are not allowed to communicate between each other, then the only way for them to violate a Bell inequality is for them to be performing some “genuinely quantum” operations; therefore, certifying a Bell inequality violation ensures that some form of quantum behaviour is occurring in the devices, which could potentially be used for cryptographic purposes. The critical point about this reasoning is that it holds regardless of what states and measurements are being implemented in the devices, hence the term “device-independent” — informally, the only assumption made on the devices in a DI security proof is that they do not communicate information outside of the protocol’s specifications. This is in contrast to more “standard” quantum cryptography protocols, which are “device-dependent” in the sense that they rely on the honest parties’ device measurements (and/or state preparation) being well-characterized. DI cryptography hence offers a path towards quantum cryptography that is robust to a wide class of device imperfections, since it remains secure even if the states and measurements deviate from the intended ones.

In the past years, a number of theoretical [DFR20, ZKB18, ZFK20] and experimental [HBD+15, SMSC+15, GVW+15, RBG+17] advances have led to significant development in DI security proofs and protocol implementations. In particular, two families of DI protocols have undergone especially notable progress: device-independent randomness expansion (DIRE), in which the goal is to expand a short secret string into a longer one, and device-independent quantum key distribution (DIQKD), in which the goal is for two parties to establish a shared secret key. There have been several recent experimental demonstrations of DIRE [LLR+21, SZB+21, LZL+21] (and the related task of DI randomness generation [LZL+18, ZSB+20]), and progress has also been made on DIQKD demonstrations [NDN+22, ZLR+22, LZZ+22], though with somewhat worse performance as compared to DIRE.

In light of the above, it is important to ensure that the security proofs for these DI protocols are based on as accurate a model of the physical devices as possible. One limitation of most existing DI security proofs is that they rely on the assumption that absolutely no unwanted information “leaks” from the devices: this assumption is used not only to enforce the condition that the devices do not communicate in order to achieve the Bell violation, but also to ensure that the raw data produced by the devices is not simply broadcasted to an adversary. While this is already a fairly weak assumption as compared to device-dependent quantum cryptography, it is perhaps somewhat unrealistic as an absolute condition — one would expect that in a given physical implementation, some small amount of information could potentially leak out from the devices. It is hence important to ensure that DI protocols have some “robustness” against a limited amount of such leakage: while it is certainly intuitive that such robustness should hold, past observations such as the phenomenon of information locking [KRB+07, DHL+04, Win17] indicate that some care is needed here, as there can be situations where a small amount of additional information can have an unexpectedly large effect.

There have been a few previous works studying the effects of leakage or communication between devices in DI protocols; however, the models they have used are not a very close fit for describing the recent experimental implementations of DIRE or DIQKD. For instance, [SPM13] studied a model for weak cross-talk between the devices, but it is not straightforward to extend that model to account for leakage to an adversary when the device behaviour is not independent and identically distributed (IID) across protocol rounds. A model that accounts for non-IID behaviour and leakage to an adversary was studied in [JK21], but the leakage constraint in that model is that only a bounded number of qubits is leaked between the devices and/or adversary over the course of the protocol. [Footnote 1: The number of leakage qubits can be linear with respect to the number of protocol rounds $n$, but must be smaller than the amount of smooth min-entropy that would be produced without leakage (see Sec. 4 for more details).] While this is a clean model to study, in the context of experimental implementations it is not straightforward to rigorously formulate a nontrivial upper bound on the number of qubits leaked during a protocol; for instance, if the leakage occurs via photonic systems with some classical Poissonian number distribution, then in theory the state has infinite-dimensional support. (Of course, qualitatively one would expect that if for instance the state has a large vacuum component, then it should not leak too much “useful information”, but formalizing this idea is part of the goal in this work.) In a somewhat different direction, [MS14, CL19, LRR19] have studied methods to mitigate leakage arising from device-reuse attacks [BCK13] and malicious classical post-processing units, but this is mostly focused on leakage that occurs after the protocol has finished distributing and measuring the states, rather than leakage that occurs during that process.

We also note that outside of DI cryptography, there have been studies of modified prepare-and-measure scenarios that can potentially be viewed as describing some form of untrusted leakage between the devices [TZCW+22, TPW+21, PPW+22]; however, the setups studied in those works are currently somewhat different from those in DI cryptography (though we note that the “bounded-weight” model we describe later in this work has similarities to the model used in [PPW+22]). Also, for device-dependent QKD in particular, the possibility of leakage due to detector backflash was considered in e.g. [KZM+01, PCS+18]. That model is quite similar to what we consider in this work, but we extend the analysis to the DI scenario and handle a number of complications that arise when non-IID behaviour is allowed (see Sec. 2.2).

The contribution of this work is to study a model for constrained leakage from the devices in a DI protocol, suited for analyzing existing DIRE and DIQKD demonstrations (though the analysis may generalize to some other DI protocols). Qualitatively, the idea is that in each round we assign some registers that model the leakage processes, and impose the constraint that with high probability the leakage registers are in some fixed “blank” state. While this model is fairly simple, it seems possible to estimate such leakage probabilities in some DI experimental setups [NDN+22], and we note that (as observed in [SPM13]) allowing for leakage in any sense is already covering a strictly wider class of scenarios than is typical in DI security proofs, or for that matter most device-dependent security proofs. Our approach for analyzing this leakage model takes place in roughly two parts. First, we analyze its effect on single rounds of the protocol, essentially by arguing that in this constrained leakage model, the state in each round is “close” to one where no leakage occurred. Next, we describe how this can then be converted into a security proof for the full protocol by using a series of entropic chain rules, taking into account non-IID effects (in particular, the possibility that leakage in later rounds could contain information about the secret data produced in earlier rounds). We also remark that although the focus of this work is on DI cryptography, the techniques described here should generalize to a variety of device-dependent protocols as well, such as QKD or QRNG.

This paper is structured as follows. In Sec. 2, we introduce notation and specify the leakage model, as well as describing the overall proof structure. In Sec. 3 we present the analysis of single rounds, and in Sec. 4 we describe how to obtain a security proof for the full protocol; in each section we explicitly compute some examples showing how much the keyrates are reduced by the leakage effects. Finally, in Sec. 5 we describe some potential future directions to explore.

2 Preliminaries

2.1 Notation and definitions

For technical reasons (as some of the theorems we use may not have been proven yet for infinite-dimensional systems) we take all systems to be finite-dimensional, but we will not impose any bounds on the system dimensions unless otherwise specified, i.e. they can have unboundedly large finite dimension.

Definition 1.

For $\rho,\sigma\in\operatorname{S}_{\leq}(A)$ , the generalized fidelity is

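$$F(\rho,\sigma)\coloneqq\left\|\sqrt{\rho}\sqrt{\sigma}\right\|_{1}+\sqrt{(1-\operatorname{Tr}[\rho])(1-\operatorname{Tr}[\sigma])},$$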

and the purified distance is $P(\rho,\sigma)\coloneqq\sqrt{1-F(\rho,\sigma)^{2}}$ .

(Note that this means we are using the convention that for normalized pure states, we have $F(\left|\psi\right>\!\!\left<\psi\right|,\left|\phi\right>\!\!\left<\phi\right|)=|\langle\psi|\phi\rangle|$ , not $|\langle\psi|\phi\rangle|^{2}$ .)
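To make the convention in Definition 1 concrete, here is a minimal numerical sketch. It assumes the generalized fidelity has the form $F(\rho,\sigma)=\|\sqrt{\rho}\sqrt{\sigma}\|_{1}+\sqrt{(1-\operatorname{Tr}[\rho])(1-\operatorname{Tr}[\sigma])}$; the helper names (`psd_sqrt`, `generalized_fidelity`, `purified_distance`, `ket_to_dm`) are hypothetical and introduced only for this illustration.

```python
import numpy as np

def psd_sqrt(m):
    """Square root of a positive semidefinite matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(m)
    vals = np.clip(vals, 0.0, None)  # remove tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def generalized_fidelity(rho, sigma):
    """||sqrt(rho) sqrt(sigma)||_1 + sqrt((1 - Tr rho)(1 - Tr sigma))."""
    singular_values = np.linalg.svd(psd_sqrt(rho) @ psd_sqrt(sigma), compute_uv=False)
    deficit_rho = max(0.0, 1.0 - np.trace(rho).real)
    deficit_sigma = max(0.0, 1.0 - np.trace(sigma).real)
    return float(singular_values.sum() + np.sqrt(deficit_rho * deficit_sigma))

def purified_distance(rho, sigma):
    """P = sqrt(1 - F^2), with F clipped to 1 to absorb round-off."""
    f = min(1.0, generalized_fidelity(rho, sigma))
    return float(np.sqrt(1.0 - f * f))

def ket_to_dm(ket):
    """Density matrix |ket><ket| of a state vector."""
    ket = np.asarray(ket, dtype=complex)
    return np.outer(ket, ket.conj())

psi = np.array([1.0, 0.0])
phi = np.array([1.0, 1.0]) / np.sqrt(2.0)

# For normalized pure states, F equals |<psi|phi>| (not its square).
fid_pure = generalized_fidelity(ket_to_dm(psi), ket_to_dm(phi))
overlap = abs(np.vdot(psi, phi))
print(fid_pure, overlap)

# For identical subnormalized states, the second term restores F = 1.
half = 0.5 * ket_to_dm(psi)
fid_sub = generalized_fidelity(half, half)
print(fid_sub, purified_distance(half, half))
```

The second example shows why the extra term matters for subnormalized states: without it, two identical states of trace $1/2$ would have fidelity $1/2$ rather than $1$.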

We now state the definitions of various entropies required in our analysis. We follow the presentation in [DFR20], which can be shown to be equivalent to the definitions in [Tom16].

Definition 2.

For $\rho\in\operatorname{S}_{\leq}(AB)$ , the min- and max-entropies of $A$ conditioned on $B$ are

[TABLE]

where in the first equation the $(\mathbb{I}_{A}\otimes\sigma_{B})^{-\frac{1}{2}}$ term should be understood in terms of the Moore-Penrose generalized inverse. (Note that the optimum is indeed attained in both equations [Tom16], and it can be attained by a normalized state so $\operatorname{S}_{\leq}(B)$ can be replaced by $\operatorname{S}_{=}(B)$ without loss of generality.)

Definition 3.

For $\rho\in\operatorname{S}_{\leq}(AB)$ and ${\epsilon_{s}}\in\left[0,\sqrt{\operatorname{Tr}\!\left[\rho_{AB}\right]}\right)$ , the ${\epsilon_{s}}$ -smooth min- and max-entropies of $A$ conditioned on $B$ are

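$$H_{\min}^{\epsilon_{s}}(A|B)_{\rho}\coloneqq\max_{\substack{\tilde{\rho}\in\operatorname{S}_{\leq}(AB)\ \text{s.t.}\\ P(\tilde{\rho},\rho)\leq\epsilon_{s}}}H_{\min}(A|B)_{\tilde{\rho}},\qquad H_{\max}^{\epsilon_{s}}(A|B)_{\rho}\coloneqq\min_{\substack{\tilde{\rho}\in\operatorname{S}_{\leq}(AB)\ \text{s.t.}\\ P(\tilde{\rho},\rho)\leq\epsilon_{s}}}H_{\max}(A|B)_{\tilde{\rho}}.$$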

Definition 4.

For $\rho\in\operatorname{S}_{=}(A)$ and $\alpha\in(0,1)\cup(1,\infty)$ , the $\alpha$ -Rényi entropy of $A$ is

$$H_{\alpha}(A)_{\rho}\coloneqq\frac{1}{1-\alpha}\log\left\|\rho_{A}\right\|_{\alpha}^{\alpha}.$$

In the $\alpha\to 1$ limit, the Rényi entropy reduces to the von Neumann entropy. (The Rényi entropy can be extended to include conditioning systems in multiple different ways [Tom16], but we will not require those in this work; all those definitions match the above expression when there is no conditioning system.)
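The $\alpha\to 1$ limit can be checked numerically. The sketch below assumes the unconditioned form $H_{\alpha}(A)_{\rho}=\frac{1}{1-\alpha}\log\|\rho_{A}\|_{\alpha}^{\alpha}$ with $\|\rho_{A}\|_{\alpha}^{\alpha}=\sum_{i}\lambda_{i}^{\alpha}$ over the eigenvalues $\lambda_{i}$ of $\rho_{A}$; the function names and the example state are hypothetical.

```python
import numpy as np

def renyi_entropy(rho, alpha):
    """Base-2 Renyi entropy: log2(sum_i lambda_i^alpha) / (1 - alpha)."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must lie in (0, 1) or (1, infinity)")
    eigs = np.linalg.eigvalsh(rho)
    eigs = eigs[eigs > 1e-12]  # zero eigenvalues contribute nothing to the sum
    return float(np.log2(np.sum(eigs ** alpha)) / (1.0 - alpha))

def von_neumann_entropy(rho):
    """Base-2 von Neumann entropy: -sum_i lambda_i log2(lambda_i)."""
    eigs = np.linalg.eigvalsh(rho)
    eigs = eigs[eigs > 1e-12]
    return float(-np.sum(eigs * np.log2(eigs)))

# Example state: a qubit that is "blank" with weight 0.9.
rho = np.diag([0.9, 0.1])
for alpha in (0.5, 0.999, 1.001, 2.0):
    print(f"alpha={alpha}: H_alpha = {renyi_entropy(rho, alpha):.4f}")
print(f"von Neumann: H = {von_neumann_entropy(rho):.4f}")
```

The printed values decrease as $\alpha$ increases and bracket the von Neumann entropy on either side of $\alpha=1$.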

2.2 Leakage model

We shall suppose (as is the case in DIRE and DIQKD) that the protocol begins with Alice and Bob performing $n$ sequential rounds of supplying some classical inputs to their devices and receiving some classical outputs. To account for leakage from the devices over this part of the protocol, we shall consider the following model. (The specific order of events described here may appear slightly restrictive, but the analysis remains essentially similar if we consider some more general versions; see Appendix A.1 for further discussion.) We use the following notation: for the $j^{\text{th}}$ round, $Q^{A}_{j}$ (resp. $Q^{B}_{j}$ ) denotes the quantum register that will be measured in Alice’s (resp. Bob’s) device, $X_{j}$ (resp. $Y_{j}$ ) denotes the input supplied to the device, $A_{j}$ (resp. $B_{j}$ ) denotes the output obtained, and $M^{A}_{j}$ (resp. $M^{B}_{j}$ ) denotes a memory register the device can retain from previous rounds. We require the classical registers storing the input and output values to have some known finite dimension; the other registers can be of unknown dimension. We also introduce several registers $L^{A\to B}_{j},L^{A\to E}_{j},L^{B\to A}_{j},L^{B\to E}_{j}$ to track the “leakage” processes we are about to describe. Finally, let $\mathsf{E}$ denote a register Eve holds to collect quantum side-information across all the rounds (she can update this register as each round occurs, as we shall describe below — in principle we could use a different register to denote her updated side-information after each such process, following e.g. [MFS+22], but as we allow the dimension of $\mathsf{E}$ to be unbounded, there is no loss of generality by just discussing this one register). We then model the physical process in each round as follows:

1. A state preparation process takes place, modelled as follows: Eve first performs some channel $\mathsf{E}\to\mathsf{E}Q^{A}_{j}Q^{B}_{j}$, and then some other “update” channel $M^{A}_{j}M^{B}_{j}Q^{A}_{j}Q^{B}_{j}\to Q^{A}_{j}Q^{B}_{j}$ is performed using the memory registers retained from the previous round, with the registers $Q^{A}_{j}$ and $Q^{B}_{j}$ at the end of this process being held in Alice and Bob’s devices respectively. [Footnote 2: Here we have allowed this channel to act across the memory registers of both devices; the proof approach we shall use is compatible with such a structure [DFR20, AFRV19, MFS+22], so we include this possibility for generality, even if it may not correspond to some intuitive physical process.]

2. Alice prepares an input $X_{j}$ to supply to her device, which performs some “leakage channel” $\mathcal{L}_{j}^{A}:Q^{A}_{j}X_{j}\to Q^{A}_{j}L^{A\to B}_{j}L^{A\to E}_{j}X_{j}$ that does not disturb the classical register $X_{j}$. Analogously, Bob supplies an input $Y_{j}$ to his device, which performs some leakage channel $\mathcal{L}_{j}^{B}:Q^{B}_{j}Y_{j}\to Q^{B}_{j}L^{B\to A}_{j}L^{B\to E}_{j}Y_{j}$ that does not disturb $Y_{j}$. For brevity, we shall write the overall leakage channel as $\mathcal{L}_{j}\coloneqq\mathcal{L}_{j}^{A}\otimes\mathcal{L}_{j}^{B}$. The registers $L^{A\to E}_{j}L^{B\to E}_{j}$ are now sent to Eve, while $L^{B\to A}_{j}$ is sent to Alice’s device and $L^{A\to B}_{j}$ is sent to Bob’s device. [Footnote 3: This no-disturbance condition is just for ease of discussion in our subsequent analysis, so that we only need a single register $X_{j}$ to keep a “persistent” record of Alice’s input throughout the protocol. In principle, we could have instead said more formally that Alice copies the classical register $X_{j}$ onto another register $\hat{X}_{j}$ that is supplied to the device, which then performs some channel $Q^{A}_{j}\hat{X}_{j}\to Q^{A}_{j}L^{A\to B}_{j}L^{A\to E}_{j}$ without involving the original $X_{j}$ register. However, for brevity we will usually just say that this overall process is a channel that “does not disturb $X_{j}$”.]

3. Alice’s device performs some uncharacterized measurement channel $\mathcal{M}^{A}:Q^{A}_{j}L^{B\to A}_{j}X_{j}\to A_{j}M^{A}_{j+1}X_{j}$ that does not disturb $X_{j}$, where $A_{j}$ is a classical register storing the measurement outcome and $M^{A}_{j+1}$ is the memory register retained for the next round. Analogously, Bob’s device receives $L^{A\to B}_{j}$ and performs some uncharacterized measurement channel $\mathcal{M}^{B}:Q^{B}_{j}L^{A\to B}_{j}Y_{j}\to B_{j}M^{B}_{j+1}Y_{j}$ that does not disturb $Y_{j}$. Alice and Bob then announce their inputs $X_{j}Y_{j}$, and Eve can use those values to update her register $\mathsf{E}$. [Footnote 4: Technically, depending on the exact protocol specification, Alice and Bob might not announce their inputs after each round. However, this is a rather specialized discussion and we defer the details to Appendix B.]

Note that in the above description, we have imposed a subtle restriction on Eve — specifically, while we allow her to update the register $\mathsf{E}$ (and thus the state preparation for the next round) using the values $X_{j}Y_{j}$ , we do not include the possibility of her using the leakage registers $L^{A\to E}_{j}L^{B\to E}_{j}$ as well when doing so in each round. This “restricted adaptiveness” condition is currently required for our proof approach, as we discuss in more detail (along with some motivating circumstances) in Sec. 4 later.

If the leakage registers are unconstrained, any attempt at DI cryptography is futile — Bell violations can be trivially faked by using the registers $L^{A\to B}_{j}L^{B\to A}_{j}$ to communicate either party’s input to the other’s device, and furthermore the registers $L^{A\to E}_{j}L^{B\to E}_{j}$ could just give copies of all the device outputs to Eve. However, note that the standard assumption in DI cryptography is equivalent to stating that all these leakage registers are trivial (or at least independent of the inputs/outputs), which might be considered to be too extreme of an assumption. Hence in this work, we study a scenario where we impose a more relaxed constraint on these registers. Specifically, we consider the following constraint. (As noted in the introduction, similar ideas have been explored in the context of device-dependent QKD, e.g. in [KZM+01, PCS+18]. However, here we consider the DI case; furthermore there are some technical issues to address for non-IID leakage — see Remark 1 below.)

Bounded-weight leakage constraint:

We suppose we have certified some value $\delta_{\mathrm{leak}}>0$ such that all $\mathcal{L}_{j}$ have the following property: there exists some state $\left|\phi\right>\!\!\left<\phi\right|$ such that if the measurement described by projectors $\left(\left|\phi\right>\!\!\left<\phi\right|^{\otimes 4},\mathbb{I}-\left|\phi\right>\!\!\left<\phi\right|^{\otimes 4}\right)$ is performed on the $L^{A\to B}_{j}L^{A\to E}_{j}L^{B\to A}_{j}L^{B\to E}_{j}$ registers in any state produced by $\mathcal{L}_{j}$ , the probability of getting the outcome $\left|\phi\right>\!\!\left<\phi\right|^{\otimes 4}$ is always at least $1-\delta_{\mathrm{leak}}$ (regardless of the input state to $\mathcal{L}_{j}$ ).

Qualitatively, an example of physical reasoning behind such a constraint could be if $\left|\phi\right>\!\!\left<\phi\right|$ is a ground state of those registers, and we have certified that if we measure those registers in some basis that includes the ground state as a possible outcome, we will obtain the ground state with high probability. Note that the bounded-weight constraint also straightforwardly implies (see Appendix A.2) that if e.g. we consider only the registers $L^{A\to B}_{j}L^{B\to A}_{j}$ and perform the projective measurement $\left(\left|\phi\right>\!\!\left<\phi\right|^{\otimes 2},\mathbb{I}-\left|\phi\right>\!\!\left<\phi\right|^{\otimes 2}\right)$ , then the probability of getting the outcome $\left|\phi\right>\!\!\left<\phi\right|^{\otimes 2}$ is at least $1-\delta_{\mathrm{leak}}$ as well (and analogous statements hold for any other subset of the registers $L^{A\to B}_{j}L^{A\to E}_{j}L^{B\to A}_{j}L^{B\to E}_{j}$ ). (A minor variant of the bounded-weight constraint would be to impose an analogous condition on each of the registers individually, but that yields basically the same results up to rescaling $\delta_{\mathrm{leak}}$ by a factor of $4$ ; we discuss it further in Appendix A.2.)

We remark that the bounded-weight leakage model is somewhat more general than a constraint of the form “with probability at least $1-\delta_{\mathrm{leak}}$ , the channel $\mathcal{L}_{j}$ acts locally in the devices and independently sets all leakage registers to $\left|\phi\right>\!\!\left<\phi\right|$ ”. Such a constraint would be more correctly expressed as the following version, which we shall term as “classical-probabilistic leakage”:

Classical-probabilistic leakage constraint:

We suppose we have certified some value $\delta_{\mathrm{leak}}>0$ such that the following holds: there exist channels $\hat{\mathcal{L}}^{A}_{j}:Q^{A}_{j}X_{j}\to Q^{A}_{j}X_{j}$ and $\hat{\mathcal{L}}^{B}_{j}:Q^{B}_{j}Y_{j}\to Q^{B}_{j}Y_{j}$ and $\hat{\mathcal{L}}_{j}:Q^{A}_{j}Q^{B}_{j}X_{j}Y_{j}\to Q^{A}_{j}Q^{B}_{j}L^{A\to B}_{j}L^{A\to E}_{j}L^{B\to A}_{j}L^{B\to E}_{j}X_{j}Y_{j}$ (that all do not disturb the classical registers $X_{j}Y_{j}$ ), such that all $\mathcal{L}_{j}$ have the form

[TABLE]

where the $\left|\phi\right>\!\!\left<\phi\right|^{\otimes 4}$ term is on registers $L^{A\to B}_{j}L^{A\to E}_{j}L^{B\to A}_{j}L^{B\to E}_{j}$ .

Since this version also seems to be a potentially plausible model to study, we shall analyze it as well in this work. Note that a classical-probabilistic leakage constraint implies the bounded-weight version (with the same $\delta_{\mathrm{leak}}$ ), but the converse is not generally true (as a simple counterexample, consider a channel $\mathcal{L}_{j}$ that reads the classical value $X_{j}$ and sets the register $L^{A\to B}_{j}$ to a state of the form $\sqrt{1-\delta_{\mathrm{leak}}}\left|\phi\right>+\sqrt{\delta_{\mathrm{leak}}}\left|\psi_{x_{j}}\right>$ ; a physical example of such states would be weak coherent states in photonic systems). Accordingly, we expect that more “optimistic” results could be obtained by assuming a classical-probabilistic leakage constraint, and (as we show in our results later) we find this can indeed be the case.
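The counterexample can be checked numerically in a toy setting. The sketch below makes two assumptions that are not spelled out in the text above: the states $\left|\psi_{x}\right>$ are taken orthogonal to $\left|\phi\right>$, and the classical-probabilistic constraint is taken to require that the state on the leakage register can be written as $p\left|\phi\right>\!\!\left<\phi\right|+(1-p)\tau$ for some state $\tau$ and some $p\geq 1-\delta_{\mathrm{leak}}$. The dimension, the value of $\delta_{\mathrm{leak}}$, and all variable names are hypothetical.

```python
import numpy as np

delta_leak = 0.05

# Toy 3-dimensional leakage register: |phi> is the "blank" state, and
# |psi_x> encodes the input x (assumed orthogonal to |phi>).
phi = np.array([1.0, 0.0, 0.0])
psi = {0: np.array([0.0, 1.0, 0.0]), 1: np.array([0.0, 0.0, 1.0])}
proj_phi = np.outer(phi, phi)

def leak_state(x):
    """Pure state sqrt(1 - delta)|phi> + sqrt(delta)|psi_x> as a density matrix."""
    ket = np.sqrt(1.0 - delta_leak) * phi + np.sqrt(delta_leak) * psi[x]
    return np.outer(ket, ket.conj())

results = {}
for x in (0, 1):
    rho = leak_state(x)
    # Bounded-weight check: probability of the |phi><phi| outcome.
    blank_weight = float(np.trace(proj_phi @ rho).real)
    # A decomposition p|phi><phi| + (1 - p) tau with p >= 1 - delta would need
    # rho - (1 - delta)|phi><phi| to be positive semidefinite.
    min_eig = float(np.linalg.eigvalsh(rho - (1.0 - delta_leak) * proj_phi).min())
    results[x] = (blank_weight, min_eig)
    print(f"x={x}: blank weight = {blank_weight:.3f}, min eigenvalue = {min_eig:.3f}")
```

The blank weight equals $1-\delta_{\mathrm{leak}}$, so the bounded-weight constraint holds, while the negative eigenvalue shows that no valid $\tau$ exists under the assumed classical-probabilistic form. The coherence between $\left|\phi\right>$ and $\left|\psi_{x}\right>$ is what the mixture cannot reproduce.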

However, we now note that when non-IID behaviour is allowed, the above constraints by themselves seem to still be not quite sufficient to allow us to obtain nontrivial security guarantees. This is due to the following attack (that applies for either form of leakage constraint above), which for later reference we shall term a “random full-leakage attack”: independently in each round, with probability $1-\delta_{\mathrm{leak}}$ all the leakage registers of that round are set to $\left|\phi\right>\!\!\left<\phi\right|$ , otherwise both devices set their respective leakage registers to be a copy of all the past outputs from that device, which for brevity we shall call a “full-leakage event” (note that this event can indeed be coordinated between the devices, because they can hold preshared randomness). The underlying idea behind this attack is somewhat similar to the device-reuse attacks of [BCK13], but we now argue in detail that leakage of this form also poses a problem within a single protocol implementation.

Specifically, recall that if a DIQKD or DIRE protocol is to achieve some nontrivial asymptotic keyrate, that means there is some constant $r>0$ (the asymptotic keyrate) such that the length of the final key approaches $rn$ at large $n$ . The preceding attack renders this impossible, by the following argument: if we take any fraction $r^{\prime}\in[0,1]$ , observe that under that attack, the probability of at least one full-leakage event occurring within the last $r^{\prime}n$ protocol rounds is $1-(1-\delta_{\mathrm{leak}})^{r^{\prime}n}$ , which is close to $1$ at large $n$ . Furthermore, a full-leakage event renders the preceding rounds useless, since Eve learns all previous outputs — hence this means that with probability close to $1$ , only at most the last $r^{\prime}n$ rounds are useful for generating the final key. However, note that if we had picked for instance $r^{\prime}=0.01r$ (or some appropriately smaller value if the device outputs have high dimension), then those $r^{\prime}n$ rounds cannot have enough smooth min-entropy (see [RR12] or Sec. 4 below) to produce a secret key of length $rn$ .
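The counting argument above can be checked numerically. The following is a minimal sketch, assuming illustrative values for $n$, $r$ and $\delta_{\mathrm{leak}}$ (none of which are taken from the paper) and a hypothetical helper name `prob_full_leakage`; it evaluates $1-(1-\delta_{\mathrm{leak}})^{r^{\prime}n}$ for the last $r^{\prime}n$ rounds.

```python
import math


def prob_full_leakage(delta_leak: float, num_rounds: int) -> float:
    """Probability of at least one full-leakage event in num_rounds rounds.

    Computes 1 - (1 - delta_leak)**num_rounds, using log1p/expm1 so that
    the result stays accurate when delta_leak is very small.
    """
    return -math.expm1(num_rounds * math.log1p(-delta_leak))


# Illustrative (assumed) parameters, not values from the paper.
n = 10**8            # total number of rounds
r = 0.1              # hypothetical target asymptotic keyrate
r_prime = 0.01 * r   # fraction of rounds at the end of the protocol
delta_leak = 1e-3    # per-round leakage parameter

tail_rounds = round(r_prime * n)
p_tail = prob_full_leakage(delta_leak, tail_rounds)
print(f"last {tail_rounds} rounds: P(full-leakage event) = {p_tail:.12f}")
```

For any fixed $\delta_{\mathrm{leak}}>0$ and $r^{\prime}>0$ this probability tends to $1$ as $n$ grows, which is the step the argument relies on.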

Note that since the above attack directly obstructs the final goal of secret key generation (with nontrivial asymptotic keyrate), it does not represent a limitation of the proof approaches we use, but rather an inherent limitation of imposing only the preceding constraints on the leakage model. Hence to obtain nontrivial results in the non-IID case, we shall need to impose further constraints on the leakage registers. Informally, one could say that the main power of the above attack comes from being able to encode a large amount of information in the leakage registers when they are not set to the $\left|\phi\right>\!\!\left<\phi\right|$ state. To prevent this, in this work we shall consider two possible approaches: in Sec. 4.1 we consider a dimension bound on the leakage registers, while in Sec. 4.2 we consider a “softer” version of a dimension constraint in the form of an energy bound. We elaborate more on those constraints (and how they can be motivated) in their respective sections.

Remark 1.

It is perhaps worth noting that in the IID case, however, these further constraints are in fact unnecessary — the preceding attack relies on highly non-IID behaviour, and if one makes an IID assumption, it is possible to obtain nontrivial results without those further constraints. We sketch out the proof structure for that case in Remark 2 in Sec. 4. (This is also why we defer the discussion of the dimension or energy constraints to Sec. 4 on the full protocol analysis against non-IID attacks, because we do not in fact need those constraints yet when analyzing single rounds in Sec. 3.)

We also note that in device-dependent QKD, a common proof technique for the non-IID case is to use de Finetti arguments [Ren05, CKR09] to reduce the analysis to the IID case. However, once (non-IID) leakage is allowed, there is a subtle obstacle in making this reduction valid even for device-dependent QKD. Specifically, the de Finetti arguments show that (roughly speaking) it suffices to consider the case where the quantum states supplied to Alice and Bob are IID; however, when leakage is allowed, this does not imply that the leakage registers are also IID. Hence even for device-dependent QKD, existing arguments do not seem sufficient to reduce the non-IID case to the IID case when leakage is allowed.⁵

⁵ As for device-dependent non-IID security proofs based on complementarity [SP00, Koa09] or one-shot uncertainty relations [TL17], their relation to the IID case is somewhat less straightforward, so the question of whether they achieve a reduction to the IID case in the presence of leakage is somewhat ill-posed. However, the latter approach could be compatible with our analysis in Sec. 4, since it proceeds via first finding a bound on the smooth min-entropy.

For convenience, before proceeding further we list a quick summary of the constraints imposed in our leakage model:

• Either a bounded-weight or classical-probabilistic leakage constraint for each round

• Either a dimension bound or energy bound for each round (only needed in a non-IID scenario; see Remark 2)

• A “restricted adaptiveness” constraint on Eve (only relevant in a non-IID scenario; discussed further in Sec. 4)

2.3 Security proof structure and protocol requirements

In this work, we shall focus on the security proofs for DIRE or DIQKD that are based on the entropy accumulation theorem (EAT) [DFR20, DF19, LLR+21, NDN+22].⁶ The EAT provides a flexible and modular framework for security analysis of such protocols. Roughly speaking, the structure of proofs based on this approach can be divided into two core aspects (see e.g. [DFR20, TSB+22, LLR+21, NDN+22] for details). The first aspect is focused on analyzing the individual rounds, where the goal is roughly to solve an optimization problem that lower-bounds the von Neumann entropy of the device outputs in a single round (conditioned on some side-information Eve may hold); this quantity is useful for characterizing the asymptotic keyrates of such protocols, according to the EAT (or in the simpler IID case, this follows from e.g. the Devetak-Winter formula [DW05] or the quantum asymptotic equipartition property (AEP) [TCR09]). The second aspect is to convert the solution to this optimization into an explicit bound on the length of secure key that can be produced by the protocol after a finite number of rounds. In the subsequent sections, we discuss each of these aspects in more detail, and how they can be modified to account for the leakage model we have described above.

⁶ A different proof technique known as quantum probability estimation (QPE) [KZB20, ZKB18, ZFK20] was used in some DIRE experiments [SZB+21, LZL+21], but this approach is sufficiently different that we will not be discussing it in detail here. However, it does basically end up providing a bound on smooth min-entropy similar to that discussed in Sec. 4 based on the EAT, so the analysis in that section should also generalize to the QPE approach.

Since in this work we are focusing on EAT-based proofs, we suppose that the protocol has the following structure, to maintain compatibility with the EAT analysis. First, we suppose that each of the $n$ rounds is independently chosen with some probability $\gamma$ to be a test round (informally, one that is used to gather statistics that determine whether the protocol aborts), and is otherwise taken to be a generation round (informally, one that generates entropy for a raw key that will be processed into a final key). When a test round occurs, Alice and Bob generate values $x$ and $y$ respectively with some fixed probabilities $p^{\mathrm{test}}_{xy}$ (using trusted local randomness), and supply these values as the inputs to their devices in that round. Analogously, in a generation round, Alice and Bob generate inputs with some other probabilities $p^{\mathrm{gen}}_{xy}$. After these $n$ input-output pairs have been gathered, some further classical post-processing procedures are performed to produce the final secret key for the protocol. In particular, these procedures include⁷ a parameter estimation step, where the protocol accepts if⁸ (for all input-output tuples $(a,b,x,y)$) the observed frequency of rounds that are test rounds in which Alice and Bob supplied inputs $x,y$ and obtained outputs $a,b$ lies within some small interval⁹ around $\gamma p^{\mathrm{test}}_{xy}\mu^{\mathrm{hon}}_{ab|xy}$, where $\mu^{\mathrm{hon}}_{ab|xy}$ are the probabilities of getting outputs $a,b$ given inputs $x,y$ when the devices are honest. Furthermore, the last classical post-processing procedure is a privacy amplification step, in which Alice and Bob process their data in such a way that it “amplifies” its privacy with respect to Eve, producing an ideal secret key (see [Ren05] for details). Our subsequent discussion will be based around protocols with this structure (in particular, we remark that this is indeed sufficient to cover the existing finite-size demonstrations of DIRE and DIQKD).

⁷ Some other post-processing procedures that we do not discuss here include for instance an error correction step in the case of DIQKD; these procedures do not significantly affect the analysis in this work and hence we do not describe them further.

⁸ The EAT can in fact also accommodate other forms of accept condition, for instance an accept condition based only on the CHSH winning probability, rather than a condition that involves every input-output tuple $(a,b,x,y)$ individually. However, for brevity in this work we will focus on the version described.

⁹ Informally, this interval is simply a “tolerance” parameter to ensure the honest behaviour is still accepted with high probability when accounting for finite-size statistical fluctuations; see e.g. [LLR+21, TSB+22] for precise calculations of the required interval widths.
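The round structure and accept condition described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function names `run_rounds` and `accepts`, the tolerance `tol`, the values of $n$ and $\gamma$, and the honest distribution `mu_hon` (a CHSH-type distribution with correlators $\pm 1/\sqrt{2}$) are all hypothetical choices for illustration, and the devices are simulated as IID classical samplers.

```python
import math
import random
from collections import Counter

OUTPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def run_rounds(n, gamma, p_test, mu, rng):
    """Simulate n rounds and count the tuples (a, b, x, y) seen in test rounds."""
    counts = Counter()
    inputs = list(p_test)
    weights = [p_test[xy] for xy in inputs]
    for _ in range(n):
        if rng.random() >= gamma:
            continue  # generation round: not used for parameter estimation
        x, y = rng.choices(inputs, weights)[0]
        a, b = rng.choices(OUTPUTS, [mu[(a, b, x, y)] for a, b in OUTPUTS])[0]
        counts[(a, b, x, y)] += 1
    return counts


def accepts(counts, n, gamma, p_test, mu_hon, tol):
    """Accept iff every observed frequency is within tol of gamma * p_test * mu_hon."""
    for (a, b, x, y), mu in mu_hon.items():
        expected = gamma * p_test[(x, y)] * mu
        observed = counts.get((a, b, x, y), 0) / n
        if abs(observed - expected) > tol:
            return False
    return True


# Assumed honest behaviour: uniform marginals, correlators +-1/sqrt(2).
signs = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): -1}
mu_hon = {
    (a, b, x, y): (1 + (-1) ** (a + b) * signs[(x, y)] / math.sqrt(2)) / 4
    for (a, b) in OUTPUTS
    for (x, y) in signs
}
p_test = {xy: 0.25 for xy in signs}  # uniform test inputs

rng = random.Random(0)
n, gamma, tol = 50_000, 0.1, 0.003
counts = run_rounds(n, gamma, p_test, mu_hon, rng)
print("honest devices accepted:", accepts(counts, n, gamma, p_test, mu_hon, tol))
```

The tolerance here is chosen by hand; a rigorous protocol would derive the interval widths from finite-size statistics as in the references cited above.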

3 Single-round entropy bounds

3.1 Fundamental optimization task

We now describe more precisely the relevant single-round optimization problem that has to be analyzed in an EAT-based security proof [DFR20, TSB+22, NDN+22]. Since we are focusing on a single round, for brevity in this section we shall omit the $j$ subscripts specifying individual rounds. Let us first recall the physical process in each round according to our leakage model: after the state preparation process involving the memory registers, there is some quantum state $\omega_{Q^{A}Q^{B}}$ in which Alice and Bob’s devices hold registers $Q^{A}$ and $Q^{B}$ respectively. Let $\omega_{Q^{A}Q^{B}R}$ be an arbitrary purification of this state (this can be informally thought of as Eve’s side-information, though for the purposes of this optimization problem it is just an abstract register holding a purification of $\omega_{Q^{A}Q^{B}}$ ). Now, depending on whether it is a test or generation round, the input values $XY$ are generated according to the probabilities $p^{\mathrm{test}}_{xy}$ or $p^{\mathrm{gen}}_{xy}$ , using trusted local randomness. After these inputs are supplied to the devices, the leakage channel $\mathcal{L}:Q^{A}Q^{B}XY\to Q^{A}Q^{B}L^{\mathrm{all}}XY$ (writing $L^{\mathrm{all}}\coloneqq L^{A\to B}L^{A\to E}L^{B\to A}L^{B\to E}$ for brevity) is then applied. For the test and generation round cases, let us denote the resulting states as $\rho^{\mathrm{test}}$ and $\rho^{\mathrm{gen}}$ respectively, so we have

[TABLE]

Alice and Bob’s devices then perform some unknown measurements $\mathcal{M}^{A}:Q^{A}L^{B\to A}X\to AX$ and $\mathcal{M}^{B}:Q^{B}L^{A\to B}Y\to BY$ respectively (in a minor abuse of notation, here for brevity we omit the memory registers from these channel outputs, as they are not involved in analyzing this round). Let us use the following notation for the reduced states on $ABXYR$ after these measurements:

[TABLE]

(The $\rho^{\mathrm{test}}_{ABXYR}$ state as written above matches $\rho^{\mathrm{test}}_{Q^{A}Q^{B}L^{\mathrm{all}}XYR}$ on all registers that are present in both expressions, so there is no danger of ambiguity in using $\rho^{\mathrm{test}}$ to denote both; analogously for $\rho^{\mathrm{gen}}$ and for each $\rho^{xy}$ .) Note that these states are classical on the registers $ABXY$ . Finally, let us extend the states (12) to an additional classical register $S$ that is computed from $ABXY$ ; this register $S$ represents the value produced from this round that will be incorporated into the raw key of the protocol (we introduce this register for flexibility in our discussion, since roughly speaking in DIRE one typically takes $S=AB$ , whereas in DIQKD one often takes $S=A$ ; see e.g. [BRC20, TSB+22]).

With this process in mind, the core single-round optimization problem that needs to be solved is as follows: for any values $\mu_{ab|xy}\in\mathbb{R}$, we need to evaluate or lower-bound the following optimization (following standard conventions in optimization theory, if the optimization is infeasible then its value is taken as $+\infty$):

[TABLE]

where the states $\rho^{\mathrm{test}},\rho^{\mathrm{gen}}$ are to be understood as functions of the state $\omega_{Q^{A}Q^{B}R}$ and the channels $\mathcal{L},\mathcal{M}^{A},\mathcal{M}^{B}$ via (9)–(12) (and those channels implicitly have the structure described in Sec. 2.2, including whichever leakage constraint we are considering). On an informal level, the objective value $H(S|XYR)_{\rho^{\mathrm{gen}}}$ roughly characterizes Eve’s uncertainty about the raw-key value $S$ , conditioned on a side-information register $R$ and the input values $XY$ (which are typically publicly announced at some point in DI protocols). As for the constraints, the values $p^{\mathrm{test}}_{xy}\mu_{ab|xy}$ can roughly be thought of as characterizing the states $\rho^{\mathrm{test}}_{ABXY}$ produced by devices that we would accept with high probability in the protocol (for instance, in the IID asymptotic case, we can view them as being the values $p^{\mathrm{test}}_{xy}\mu^{\mathrm{hon}}_{ab|xy}$ that would be produced by the honest devices, and suppose that the protocol only accepts devices that exactly reproduce those values). However, a detailed discussion of how to convert an algorithm for solving this optimization into a full EAT-based security proof is beyond the scope of this work; we defer the details to e.g. [DF19, LLR+21, TSB+22, BFF21] (refer to the sections on crossover min-tradeoff functions).

We highlight that in the above discussion, the registers $L^{A\to E}L^{B\to E}$ do not in fact appear in the final optimization (15), despite the fact that we are allowing Eve to collect them as side-information. This is because we will be handling those registers separately, in the full-protocol analysis in Sec. 4. Still, we note that if one makes an assumption that Eve’s attack is IID, then it would indeed be possible to perform a security proof by directly including $L^{A\to E}L^{B\to E}$ in the conditioning registers in the analysis here. However, without this IID assumption, we would need to rely on other techniques such as the EAT, in which case the registers $L^{A\to E}L^{B\to E}$ potentially violate a Markov condition [DFR20, DF19] or no-signalling condition [MFS+22] in the theorem, and hence it is not useful to compute a bound that already includes them at this step — we instead handle them separately in the Sec. 4 analysis.

Remark 2.

To elaborate further on the IID case: first let us specify that by “IID attacks”, in this work we mean that in every round Eve independently generates and distributes the same state $\omega_{Q^{A}Q^{B}R}$ across the devices, and the measurement channels $\mathcal{M}_{j}$ and leakage channels $\mathcal{L}_{j}$ are also identical in every round, with the memory registers $M^{A}_{j}M^{B}_{j}$ and “update channels” all being trivial. With this, the state produced after all measurements are performed will be of the form $(\gamma\rho^{\mathrm{test}}_{ABXYL^{A\to E}L^{B\to E}R}+(1-\gamma)\rho^{\mathrm{gen}}_{ABXYL^{A\to E}L^{B\to E}R})^{\otimes n}$ . In that case, to compute the keyrate it basically suffices (again, due to the AEP [TCR09]; see [TSB+22] for a detailed explanation of the security proof structure) to have a method to evaluate the optimization (15) except with $H(S|XYL^{A\to E}L^{B\to E}R)_{\rho^{\mathrm{gen}}}$ as the objective function, which can be achieved using the same arguments as we have given above. With $L^{A\to E}L^{B\to E}$ included in the conditioning registers of this optimization (and given the IID structure), it is not necessary to separately handle those registers in the subsequent analysis, i.e. the Sec. 4 analysis can be omitted in this scenario.

3.2 Relaxing the optimization

The optimization (15) is not straightforward to solve directly, because the leakage channel $\mathcal{L}$ may potentially have some complicated structure. (Furthermore, the optimization over the states and measurements makes it a nonconvex problem with systems of unbounded dimension, though techniques have been developed [NPA08, BFF21] to address these issues in the context of Bell nonlocality and DI cryptography, which we shall also be relying on subsequently.) However, we shall now discuss methods to relax it to more tractable versions, for each of the leakage models discussed in Sec. 2.2.

We begin by first rewriting the optimization (15) slightly: note that in the constraint, the $\left|xy\right>\!\!\left<xy\right|$ terms are orthogonal for distinct $(x,y)$ values, and hence that constraint is equivalent to having an individual constraint for each $(x,y)$ value. Written in the latter form, the factors of $p^{\mathrm{test}}_{xy}$ and the $\left|xy\right>\!\!\left<xy\right|$ terms can be “cancelled off” in the constraints, allowing us to write the optimization as

[TABLE]

With this, we now discuss the bounded-weight leakage model. Our key observation for this model is that since we have the constraint that measuring the systems $L^{A\to B}L^{B\to A}$ (in an appropriate basis) would return the outcome $\left|\phi\right>\!\!\left<\phi\right|^{\otimes 2}$ with probability at least $1-\delta_{\mathrm{leak}}$ , we can apply a slight modification of the Gentle Measurement Lemma (see Appendix C) to conclude that the states $\rho^{xy}$ after applying the leakage channel have the following property:

[TABLE]

In other words, they are in fact “close” to the states that would be produced if we simply discard the leakage registers $L^{A\to B}L^{B\to A}$ and re-initialize them in the fixed state $\left|\phi\right>\!\!\left<\phi\right|^{\otimes 2}$ . By concavity of fidelity, analogous bounds hold for the states $\rho^{\mathrm{test}},\rho^{\mathrm{gen}}$ , which are mixtures of the states $\rho^{xy}$ . (Here we began by presenting the bound (19) for each $\rho^{xy}$ rather than the mixtures $\rho^{\mathrm{test}},\rho^{\mathrm{gen}}$ , because this will allow us to impose sharper constraints in our final optimization — this is also basically the reason we first rewrote the optimization (15) in the form (18).)

The above property implies that we can lower-bound the optimization (18) by noting that the states involved in it are “close” to some other states where effectively no leakage has occurred, after which we can handle the latter using standard techniques for DI optimizations. More precisely, we see that (18) is lower-bounded by the following optimization, which is written entirely in terms of some states $\sigma$ that are produced without leakage (this result is quite intuitive, but we provide a detailed derivation in Appendix D):

[TABLE]

where $\widetilde{\mathcal{M}}^{A},\widetilde{\mathcal{M}}^{B}$ are measurement channels analogous to $\mathcal{M}^{A},\mathcal{M}^{B}$ except without involving leakage registers, and the states $\sigma^{\mathrm{gen}},\sigma^{xy}$ are defined in terms of these channels and the state $\omega$ in an analogous fashion to (9)–(12), omitting the leakage processes (see (59)–(60) in Appendix D for an exact formula). The function $f_{\mathrm{cont}}$ in the objective is required to be a (uniform) continuity bound for conditional entropies, in the following sense: for any states $\rho,\sigma$ with $F(\rho,\sigma)\geq 1-\delta$ , we have $\left|H(S|XYR)_{\rho}-H(S|XYR)_{\sigma}\right|\leq f_{\mathrm{cont}}(\delta)$ . Such continuity bounds have been a subject of some previous interest [Win16, SBV+21]; for the purposes of this work, we use the following approach: by the Fuchs–van de Graaf inequalities, the constraint $F(\rho,\sigma)\geq 1-\delta$ implies $d(\rho,\sigma)\leq\sqrt{1-(1-\delta)^{2}}=\sqrt{2\delta-\delta^{2}}$ , and hence the trace-distance-based continuity bound in [Win16] lets us take

[TABLE]

(The intermediate conversion to trace distance here makes this approach somewhat suboptimal; however, this appears to be the best bound we can obtain using only existing results — we return to this point at the end of Sec. 3.3 for further discussion.)
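To make the two-step conversion concrete, the sketch below first applies the Fuchs–van de Graaf step $d\leq\sqrt{2\delta-\delta^{2}}$ and then a trace-distance continuity bound. The second step is an assumption: it uses the general form $2\epsilon\log|S|+(1+\epsilon)\,h_{2}(\epsilon/(1+\epsilon))$ of the conditional-entropy bound from [Win16], which need not coincide with the exact expression the paper uses in (23) (tighter variants exist for classical $S$). The function names and the choice $|S|=2$ are illustrative.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy h_2(p) in bits, with h_2(0) = h_2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def trace_distance_bound(delta: float) -> float:
    """Fuchs-van de Graaf: F >= 1 - delta implies d <= sqrt(2*delta - delta**2)."""
    return math.sqrt(2 * delta - delta**2)


def f_cont(delta: float, dim_s: int = 2) -> float:
    """Assumed continuity bound on |H(S|XYR)_rho - H(S|XYR)_sigma| in bits.

    Uses the general trace-distance form 2*eps*log|S| + (1+eps)*h_2(eps/(1+eps));
    this is an illustrative stand-in, not necessarily the paper's formula (23).
    """
    eps = trace_distance_bound(delta)
    return 2 * eps * math.log2(dim_s) + (1 + eps) * binary_entropy(eps / (1 + eps))


for delta in (1e-5, 1e-4, 1e-3, 1e-2):
    print(f"delta_leak={delta:.0e}  d<={trace_distance_bound(delta):.4f}  "
          f"f_cont<={f_cont(delta):.4f}")
```

Even at small $\delta_{\mathrm{leak}}$ the penalty is a noticeable fraction of a bit for a binary $S$, which reflects the suboptimality noted in the text.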

The optimization (22) can now be tackled, because a substantial body of work [NPA08, NSPS14, BSS14, BFF21] in DI cryptography has been developed on the topic of solving optimizations (without leakage) of the form

[TABLE]

where the channels $\widetilde{\mathcal{M}}^{A},\widetilde{\mathcal{M}}^{B}$ and states $\omega,\sigma^{xy}$ have the same structure as in (22). In particular, the techniques in [NSPS14, BSS14, BFF21] transform the objective function in the above optimization problem in such a way that the optimization over states and measurements can be lower-bounded using semidefinite programming (SDP) bounds developed in [NPA08]. Compared to our optimization (22), the only significant difference is that we have a looser fidelity-based constraint instead of an exact equality constraint (the $f_{\mathrm{cont}}(\delta_{\mathrm{leak}})$ term is simply a constant in the optimization and hence does not introduce any difficulties). Fortunately, this fidelity-based constraint indeed has an SDP formulation (see Appendix D.1), and hence we can apply the SDP techniques in [NSPS14, BSS14, BFF21] to evaluate our optimization (22).

As for the classical-probabilistic leakage model, the analysis is easier: by similar arguments to the above, note that the channel structure (6) implies we can write (for each $(x,y)$ value)

[TABLE]

for some states $\sigma^{xy}$ produced without leakage (more precisely, of the form presented in (59)) and some other uncharacterized¹¹ states $\hat{\sigma}^{xy}$. Again, a similar decomposition holds for the states $\rho^{\mathrm{test}},\rho^{\mathrm{gen}}$ as well, and by concavity of conditional entropy, the decomposition for the latter then implies we have $H(S|XYR)_{\rho^{\mathrm{gen}}}\geq(1-\delta_{\mathrm{leak}})H(S|XYR)_{\sigma^{\mathrm{gen}}}$. Putting things together, we can thus relax the optimization (18) in this case¹² to

[TABLE]

where the states $\sigma^{\mathrm{gen}},\sigma^{xy}$ are again defined via (59)–(60), and $\hat{\mu}_{ab|xy}$ are some non-negative scalars (basically, the constraints simply arise from writing out the classical state $\hat{\sigma}^{\mathrm{test}}_{ABXY}$ in its explicit form $\hat{\sigma}^{\mathrm{test}}_{ABXY}=\sum_{abxy}p^{\mathrm{test}}_{xy}\hat{\mu}_{ab|xy}\left|abxy\right>\!\!\left<abxy\right|$ for some unknown conditional probabilities $\hat{\mu}_{ab|xy}$).

¹¹ Strictly speaking, the channel structure $\mathcal{L}=\mathcal{L}^{A}\otimes\mathcal{L}^{B}$ imposes some constraints on the channel $\hat{\mathcal{L}}$ in (6), which could in principle slightly constrain the “uncharacterized” states $\hat{\sigma}^{xy}_{ABXYR}$ (and thus the values $\hat{\mu}_{ab|xy}$ in the optimization (30)). However, we will not attempt to analyze this in detail here.

¹² Here it does not really matter whether we consider the original optimization (15) or the rewritten version (18) that has individual constraints for each $(x,y)$; both approaches yield equivalent results in this case because the constraints in (30) are affine and the states $\left|xy\right>\!\!\left<xy\right|_{XY}$ are orthogonal.
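The concavity step used for the classical-probabilistic model can be written out explicitly. The following is a sketch that assumes the decomposition of the generation-round state takes the form $\rho^{\mathrm{gen}}=(1-\delta_{\mathrm{leak}})\,\sigma^{\mathrm{gen}}+\delta_{\mathrm{leak}}\,\hat{\sigma}^{\mathrm{gen}}$ on the registers $SXYR$ (the mixture weights are inferred from the stated bound rather than copied from the paper's equation), and that $S$ is a classical register:

$$H(S|XYR)_{\rho^{\mathrm{gen}}}\;\geq\;(1-\delta_{\mathrm{leak}})\,H(S|XYR)_{\sigma^{\mathrm{gen}}}+\delta_{\mathrm{leak}}\,H(S|XYR)_{\hat{\sigma}^{\mathrm{gen}}}\;\geq\;(1-\delta_{\mathrm{leak}})\,H(S|XYR)_{\sigma^{\mathrm{gen}}}.$$

The first inequality is concavity of the conditional von Neumann entropy. The second holds because the conditional entropy of a classical register is non-negative, so the uncharacterized term $H(S|XYR)_{\hat{\sigma}^{\mathrm{gen}}}$ can simply be dropped.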

3.3 Numerical results

As a small demonstration of our method, and to get a qualitative sense of how the value of $\delta_{\mathrm{leak}}$ affects the single-round entropies, we now evaluate the optimizations (22) and (30) for some choices of the constraint values $\mu_{ab|xy}$ . This is meant to be just a simple example; we do not aim to address the full range of implementations that have been used in DI protocol demonstrations, since in any case the appropriate choice of $\mu_{ab|xy}$ for a given experiment would depend on the details of the implementation. (Our approach does not place any particular requirements on $\mu_{ab|xy}$ , so it should generically be usable for any choice of those values, as long as the input and output sizes are not so large that the [NPA08] SDPs become intractable.)

Specifically, we follow e.g. [PAB+09, AFRV19] and consider distributions $\mu_{ab|xy}$ with binary-valued outputs $a,b\in\{0,1\}$ that are produced by measurements on a Werner state, parametrized by a depolarizing-noise value $q\in[0,1/2]$ (this parametrization arises from the fact that if matching Pauli measurements in the $X$ - $Z$ plane of the Bloch sphere are performed on the qubits in the Werner state, then the probability of getting different outcomes is simply $q$ ):

[TABLE]

We consider two choices of measurements on this state, with different numbers of inputs (i.e. measurement settings) in each. Firstly, we consider a scenario with $4$ inputs each for Alice and Bob, where each input value corresponds to the same measurement by either Alice or Bob, namely a (rotated) Pauli measurement in the $X$ – $Z$ plane at the following polar angles $\theta$ :

[TABLE]

This can be viewed as a combination of the Mayers-Yao self-test [MY98] with the measurements that maximize the violation of the CHSH inequality (up to sign conventions) [CHS+69], or alternatively as having both Alice and Bob perform the latter measurements. Secondly, we consider a simpler situation with just $2$ inputs each for Alice and Bob, corresponding to (rotated) Pauli measurements at the following angles in the $X$ – $Z$ plane:

[TABLE]

These measurements are the ones that maximize the violation of the CHSH inequality, and are very commonly studied in DI security proofs.
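As an illustration of how such constraint values $\mu_{ab|xy}$ can be generated, the sketch below computes the output distribution for measurements in the $X$–$Z$ plane on a two-qubit Werner state. Two things are assumed here rather than taken from the paper: the state is written as $(1-2q)\left|\Phi^{+}\right>\!\!\left<\Phi^{+}\right|+2q\,\mathbb{I}/4$ (chosen so that matching measurements disagree with probability $q$, as stated above), and the angles in `ALICE_ANGLES`, `BOB_ANGLES` are a textbook CHSH-optimal choice that may differ from the paper's tables. For this state the correlator is $(1-2q)\cos(\theta_{a}-\theta_{b})$ and the marginals are uniform.

```python
import math


def werner_correlator(theta_a: float, theta_b: float, q: float) -> float:
    """Correlator of X-Z plane measurements on the assumed Werner state."""
    return (1 - 2 * q) * math.cos(theta_a - theta_b)


def werner_distribution(angles_a, angles_b, q):
    """Return mu[(a, b, x, y)] = Pr(a, b | x, y), using uniform marginals."""
    mu = {}
    for x, theta_a in enumerate(angles_a):
        for y, theta_b in enumerate(angles_b):
            corr = werner_correlator(theta_a, theta_b, q)
            for a in (0, 1):
                for b in (0, 1):
                    mu[(a, b, x, y)] = (1 + (-1) ** (a + b) * corr) / 4
    return mu


def chsh_value(mu) -> float:
    """CHSH expression E00 + E01 + E10 - E11 computed from a distribution."""
    def corr(x, y):
        return sum((-1) ** (a + b) * mu[(a, b, x, y)] for a in (0, 1) for b in (0, 1))
    return corr(0, 0) + corr(0, 1) + corr(1, 0) - corr(1, 1)


# Assumed textbook CHSH-optimal angles (hypothetical, for illustration only).
ALICE_ANGLES = (0.0, math.pi / 2)
BOB_ANGLES = (math.pi / 4, -math.pi / 4)

for q in (0.0, 0.02, 0.05):
    mu = werner_distribution(ALICE_ANGLES, BOB_ANGLES, q)
    print(f"q={q:.2f}  CHSH value={chsh_value(mu):.4f}")
```

Under these assumptions the CHSH value is $2\sqrt{2}\,(1-2q)$, so the noise parameter $q$ directly controls how far the constraint values sit from the maximal violation.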

As for the choices of the testing probabilities $p^{\mathrm{test}}_{xy}$ and $p^{\mathrm{gen}}_{xy}$ , in both scenarios we take $p^{\mathrm{test}}_{xy}$ to be uniform over all inputs, and we set $p^{\mathrm{gen}}_{00}=1$ and $p^{\mathrm{gen}}_{xy}=0$ otherwise, i.e. the only inputs we use in the generation rounds are $x=y=0$ . We also set the register $S$ to just be equal to $A$ , i.e. the relevant quantity in DIQKD protocols. With these choices, the objective function can be written as $H(A|R;X=Y=0)$ . We plot our resulting bounds on the optimizations (22), (30) in Figs. 2–2. Regarding the details of the numerical computations: we used the SDP approaches from [NSPS14, BSS14] which lower-bound $H(A|R;X=Y=0)$ in terms of guessing probability, and for the 4-input scenario (34) we used local level 1 of the [NPA08] hierarchy, while for the 2-input scenario (37) we used local level 2 of the [NPA08] hierarchy.

From the plots, it can be seen that our entropy bounds for the bounded-weight leakage model are lower than those for the classical-probabilistic leakage model (given the same $\delta_{\mathrm{leak}}$ value). This is as expected since the latter is a special case of the former; however, it is noteworthy that the difference turned out to be quite dramatic — for instance at $\delta_{\mathrm{leak}}=10^{-3}$ the entropy in the bounded-weight model is already very low, while that in the classical-probabilistic model is only slightly affected. This indicates that under the bounded-weight model, we cannot tolerate particularly large values of $\delta_{\mathrm{leak}}$ before the results become trivial, at least with the methods proposed here. (Similar behaviour was observed in the analysis of prepare-and-measure scenarios in [PPW+22], though the analysis in that work was mostly based on trace distance rather than fidelity.)

Still, from the plots it can also be seen that the continuity-bound term $f_{\mathrm{cont}}(\delta_{\mathrm{leak}})$ is playing a significant role in reducing the entropy in the bounded-weight model. This is likely because the approach we have used here to obtain the continuity bound $f_{\mathrm{cont}}(\delta_{\mathrm{leak}})$ is rather suboptimal — firstly, despite originally having a bound on the fidelity, we took an intermediate step of converting it to trace distance; secondly, working with the trace distance has an inherent disadvantage that the von Neumann entropy is not Lipschitz continuous with respect to trace distance (i.e. viewing the formula (23) as a function of trace distance $t$ , it grows faster than any linear function at small values of $t$ , and this scaling behaviour is essentially unavoidable [Win16]). These points combined led to a continuity bound $f_{\mathrm{cont}}(\delta_{\mathrm{leak}})$ that scales approximately as $O(h_{2}(\sqrt{2\delta_{\mathrm{leak}}}))$ (taking the dominant terms in (23) at small $\delta_{\mathrm{leak}}$ ). If we had directly worked with the lower bound of $1-\delta_{\mathrm{leak}}$ on the fidelity, some results in [SBV+21, LKA+22] heuristically suggest that it might be possible to instead obtain a continuity bound that is Lipschitz with respect to the angular distance $\cos^{-1}(1-\delta_{\mathrm{leak}})$ , i.e. $f_{\mathrm{cont}}(\delta_{\mathrm{leak}})$ would scale as $O(\cos^{-1}(1-\delta_{\mathrm{leak}}))\approx O(\sqrt{2\delta_{\mathrm{leak}}})$ . This is significantly better than the $O(h_{2}(\sqrt{2\delta_{\mathrm{leak}}}))$ scaling in (23), and would lead to tighter bounds. However, it has not been rigorously proven that such a Lipschitz continuity bound (with respect to angular distance) holds when the conditioning systems are quantum; the results here indicate that deriving such a result in future work could be very useful in improving the keyrates from this approach.
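The gap between the two scalings discussed above can be seen numerically. The sketch below compares the leading-order quantities $h_{2}(\sqrt{2\delta_{\mathrm{leak}}})$ and $\cos^{-1}(1-\delta_{\mathrm{leak}})$ directly; it ignores the constants hidden in the $O(\cdot)$ notation, so it only indicates relative growth, and the function names are illustrative.

```python
import math


def h2(p: float) -> float:
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def entropic_scaling(delta: float) -> float:
    """Leading-order behaviour of the trace-distance-based bound: h2(sqrt(2*delta))."""
    return h2(math.sqrt(2 * delta))


def angular_distance(delta: float) -> float:
    """Angular distance arccos(1 - delta), approximately sqrt(2*delta) for small delta."""
    return math.acos(1 - delta)


for delta in (1e-2, 1e-3, 1e-4, 1e-5):
    ratio = entropic_scaling(delta) / angular_distance(delta)
    print(f"delta_leak={delta:.0e}  h2(sqrt(2d))={entropic_scaling(delta):.4f}  "
          f"arccos(1-d)={angular_distance(delta):.4f}  ratio={ratio:.2f}")
```

The ratio grows as $\delta_{\mathrm{leak}}$ shrinks, roughly like $\log(1/\sqrt{2\delta_{\mathrm{leak}}})$, which is why a bound that is Lipschitz in the angular distance would help most in the small-leakage regime.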

4 Security of full protocol

We now turn to the question of ensuring security of the entire protocol. Informally, the idea here is to argue that with the constraints on the leakage systems, they cannot reveal “too much” additional information to Eve as compared to a situation where no leakage occurs. However, formalizing this intuition turns out to be rather technical, and we begin by laying out some of the required foundations.

Recall we have the requirement that the protocol ends in a privacy amplification step, where the parties process their data (after all the other classical post-processing steps) into an ideal secret key. Let $\mathbf{S}$ denote the string to be processed in that step, and let $E_{\mathrm{all}}$ denote all the side-information that Eve holds at that point. It is known that the length of secret key that can be produced through privacy amplification is essentially characterized [RW05]¹³ by the (conditional) smooth min-entropy $H_{\mathrm{min}}^{{\epsilon_{s}}}\left(\mathbf{S}|E_{\mathrm{all}}\right)$ (Definition 3). (To be precise, in a full security proof this entropy should be evaluated for the state conditioned on the protocol accepting, and there are subtleties in choosing the security definitions in a manner compatible with rigorous analysis of this conditioning. However, a detailed discussion is beyond the scope of this work; refer to e.g. [TL17, Tan21, PR21] for further discussion.)

¹³ Recent work [Dup21] has provided an alternative characterization in terms of Rényi entropies, but we defer further discussion of this version to the conclusion.

In the standard DI scenario where there is no leakage, a security proof can typically proceed by taking $E_{\mathrm{all}}$ to consist of two parts: the register $\mathsf{E}$ representing the side-information Eve holds immediately after Alice and Bob have collected all their device outputs and announced their input choices (see Appendix B for a discussion of some technical details on this point), and a register $\mathbf{P}$ holding all the public communication after that point. (In the case of DIRE, $\mathbf{P}$ is typically small or trivial, but we include it here to maintain generality for DIQKD.) The main contribution of the EAT is that it provides a lower bound on $H_{\mathrm{min}}^{{\epsilon_{s}}}\left(\mathbf{S}|\mathsf{E}\right)$ (given a method to bound the single-round optimization (15)). This is then converted into a lower bound on the quantity of interest $H_{\mathrm{min}}^{{\epsilon_{s}}}\left(\mathbf{S}|E_{\mathrm{all}}\right)=H_{\mathrm{min}}^{{\epsilon_{s}}}\left(\mathbf{S}|\mathsf{E}\mathbf{P}\right)$ , by simply lower bounding it with $H_{\mathrm{min}}^{{\epsilon_{s}}}\left(\mathbf{S}|\mathsf{E}\right)-\log\dim(\mathbf{P})$ using a chain rule [WTH+11, Tom16] for smooth min-entropy.

In our context, it would seem we could modify the above approach by simply appending the string of leakage registers $\mathbf{L}^{A\to E}\mathbf{L}^{B\to E}$ (collected by Eve) to $E_{\mathrm{all}}$ , and aim to find a way to relate $H_{\mathrm{min}}^{{\epsilon_{s}}}\left(\mathbf{S}|\mathbf{L}^{A\to E}\mathbf{L}^{B\to E}\mathsf{E}\mathbf{P}\right)$ to $H_{\mathrm{min}}^{{\epsilon_{s}}}\left(\mathbf{S}|\mathsf{E}\mathbf{P}\right)$ (since by the above outline, an EAT-based security proof already provides methods to bound the latter). However, this runs into a subtle difficulty — since the leakage registers $L^{A\to E}_{j}L^{B\to E}_{j}$ in each round are immediately leaked to Eve, in principle she could perform a joint operation on these registers and the side-information she holds at that point, in order to generate the states sent to Alice and Bob’s devices in the next round. Such an operation could potentially couple the registers $L^{A\to E}_{j}L^{B\to E}_{j}$ to $Q^{A}_{j+1}Q^{B}_{j+1}$ in some complicated way, which prevents one from applying the EAT to bound $H_{\mathrm{min}}^{{\epsilon_{s}}}\left(\mathbf{S}|\mathsf{E}\mathbf{P}\right)$ (see Appendix B for a specialized discussion of the details). More trivially, it could also change the state on the registers $L^{A\to E}_{j}L^{B\to E}_{j}$ in some uncharacterized fashion, making it hard to preserve any particular structure for the state on those registers at the end.

Hence in this work, we impose a form of “restricted adaptiveness” assumption on Eve, as was described in the leakage model in Sec. 2.2. Specifically, we have supposed in that model that Eve’s strategy consists of gathering the leakage registers $L^{A\to E}_{j}L^{B\to E}_{j}$ in each round, but not operating further on them until all the state distribution and measurement steps have been completed. We note that this condition would indeed be plausible in, for instance, a DIRE implementation where the entire process of state generation and measurement takes place within a “shielded” lab where the leakage between the devices and out of the lab can be constrained (but apart from this constraint, the honest parties do not have any a priori certification of the states or measurements), and any other side-information Eve can hold about the measurement outcomes must come from some (possibly quantum) extension that she kept before distributing the devices. Such a context has been considered in for instance [PM13] (though only for zero leakage rather than bounded leakage, and focused on classical side-information), and is similar to the usage contexts for existing QRNG devices (though those are based on characterized states and/or measurements). On the other hand, in the context of DIQKD it is perhaps harder to justify this assumption, since in this setting one usually considers Alice and Bob to be receiving their states from an untrusted source. Still, we note that if an adaptive attack for Eve can be modelled via some state preparation process using the memory registers $M^{A}_{j}M^{B}_{j}$ described in Sec. 2.2, for instance if there is a way to “partition off” the parts of the leakage that encode the adaptive behaviour and include them in $M^{A}_{j}M^{B}_{j}$ rather than $L^{A\to E}_{j}L^{B\to E}_{j}$ , then our model would be sufficient to cover it as well.

With this restriction, our task is then to bound the smooth min-entropy of $\mathbf{S}$ conditioned on $\mathbf{L}^{A\to E}\mathbf{L}^{B\to E}\mathsf{E}\mathbf{P}$ , where the registers $\mathbf{L}^{A\to E}\mathbf{L}^{B\to E}$ were not modified after they were initially generated. Intuitively, we would expect that it should be possible to compensate for the leakage registers by subtracting some amount from $H_{\mathrm{min}}^{{\epsilon_{s}}}\left(\mathbf{S}|\mathsf{E}\mathbf{P}\right)$ (which can be bounded using the EAT-based analysis as described above). However, the simple dimension-based chain rule that was used to handle the $\mathbf{P}$ register is not sufficient in this context, because the log-dimension of $\mathbf{L}^{A\to E}\mathbf{L}^{B\to E}$ could easily be larger than $H_{\mathrm{min}}^{{\epsilon_{s}}}\left(\mathbf{S}|\mathsf{E}\mathbf{P}\right)$ (for instance simply if each $L^{A\to E}_{j}$ and/or $L^{B\to E}_{j}$ register is the same dimension as $S_{j}$ ). Instead, we use some different chain rules [VDT+13, Tom16] involving the smooth max-entropy (Definition 3). Specifically, introducing the notation

$$\vartheta_{\epsilon}\coloneqq\log\frac{1}{1-\sqrt{1-\epsilon^{2}}},$$

if we take any values ${\epsilon_{s}},{\epsilon^{\prime}_{s}},\tau,\nu\in(0,1)$ such that ${\epsilon^{\prime}_{s}}={\epsilon_{s}}+2\tau+4\nu$ , the following inequality holds (for any state on some registers $QQ^{\prime}Q^{\prime\prime}$ ):

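$$\begin{aligned}
H_{\mathrm{min}}^{{\epsilon^{\prime}_{s}}}(Q|Q^{\prime}Q^{\prime\prime}) &\geq H_{\mathrm{min}}^{{\epsilon_{s}}+\tau+2\nu}(QQ^{\prime}|Q^{\prime\prime})-H_{\mathrm{max}}^{\nu}(Q^{\prime}|Q^{\prime\prime})-2\vartheta_{\tau}\\
&\geq H_{\mathrm{min}}^{{\epsilon_{s}}}(Q|Q^{\prime\prime})+H_{\mathrm{min}}^{\nu}(Q^{\prime}|QQ^{\prime\prime})-H_{\mathrm{max}}^{\nu}(Q^{\prime}|Q^{\prime\prime})-3\vartheta_{\tau}\\
&\geq H_{\mathrm{min}}^{{\epsilon_{s}}}(Q|Q^{\prime\prime})-2H_{\mathrm{max}}^{\nu}(Q^{\prime})-3\vartheta_{\tau},
\end{aligned}\tag{39}$$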

where the last line holds by applying a duality relation $H_{\mathrm{min}}^{\nu}(Q^{\prime}|QQ^{\prime\prime})=-H_{\mathrm{max}}^{\nu}(Q^{\prime}|Q^{\prime\prime\prime})$ [Tom16] (where $Q^{\prime\prime\prime}$ is any purification of $QQ^{\prime}Q^{\prime\prime}$ ) and a data-processing property $H_{\mathrm{max}}^{\nu}(Q^{\prime}|R)\leq H_{\mathrm{max}}^{\nu}(Q^{\prime})$ . (In principle one could of course choose the parameters in the two chain rules separately, arriving at a final bound $H_{\mathrm{min}}^{{\epsilon^{\prime}_{s}}}(Q|Q^{\prime}Q^{\prime\prime})\geq H_{\mathrm{min}}^{{\epsilon_{s}}}(Q|Q^{\prime\prime})-H_{\mathrm{max}}^{\nu}(Q^{\prime})-H_{\mathrm{max}}^{\nu^{\prime}}(Q^{\prime})-2\vartheta_{\tau}-\vartheta_{\tau^{\prime}}$ for ${\epsilon^{\prime}_{s}}={\epsilon_{s}}+\tau+\tau^{\prime}+2\nu+2\nu^{\prime}$ , but it seems unclear if this provides any benefit — it essentially comes down to whether the value of $H_{\mathrm{max}}^{\nu}(Q^{\prime})+H_{\mathrm{max}}^{\nu^{\prime}}(Q^{\prime})$ subject to an upper bound on $\nu+\nu^{\prime}$ is minimized by taking $\nu=\nu^{\prime}$ , and analogously for $2\vartheta_{\tau}+\vartheta_{\tau^{\prime}}$ .)

Remark 3.

If the system $Q^{\prime}$ is classical, then a sharper result is possible since in that case we have $H_{\mathrm{min}}^{{\epsilon_{s}}}(QQ^{\prime}|Q^{\prime\prime})\geq H_{\mathrm{min}}^{{\epsilon_{s}}}(Q|Q^{\prime\prime})$ (Lemma 6.7 of [Tom16]), so we can basically stop after the first inequality in (39) and end up subtracting only about $H_{\mathrm{max}}^{\nu}(Q^{\prime})$ instead of $2H_{\mathrm{max}}^{\nu}(Q^{\prime})$ .

In our context, let $\rho_{n}$ denote the state just before privacy amplification, and let $\rho_{|\mathrm{PE}}$ denote that state conditioned (with normalization) on the event that the protocol accepted during the parameter-estimation step.$^{14}$ By applying the above result twice$^{15}$, we see that for any ${\epsilon_{s}},{\epsilon^{\prime}_{s}},\tau,\nu\in(0,1)$ such that ${\epsilon^{\prime}_{s}}={\epsilon_{s}}+4\tau+8\nu$ (again, it would be possible to choose different smoothing parameters in each use of the bound, but we omit this here for brevity), we have

$$H_{\mathrm{min}}^{{\epsilon^{\prime}_{s}}}\left(\mathbf{S}|\mathbf{L}^{A\to E}\mathbf{L}^{B\to E}\mathsf{E}\mathbf{P}\right)_{\rho_{|\mathrm{PE}}}\geq H_{\mathrm{min}}^{{\epsilon_{s}}}(\mathbf{S}|\mathsf{E}\mathbf{P})_{\rho_{|\mathrm{PE}}}-2H_{\mathrm{max}}^{\nu}\left(\mathbf{L}^{A\to E}\right)_{\rho_{|\mathrm{PE}}}-2H_{\mathrm{max}}^{\nu}\left(\mathbf{L}^{B\to E}\right)_{\rho_{|\mathrm{PE}}}-6\vartheta_{\tau}.\tag{40}$$

$^{14}$ As briefly mentioned at the start of this section, when performing a security proof what we finally need would instead be the smooth min-entropy of the state conditioned on all steps of the protocol accepting. However, we will not further discuss here how to handle conditioning on any additional events; techniques to handle this are described in e.g. [TL17] (Lemma 10) or [TSB+22] (Sec. 4.2).

$^{15}$ Alternatively, we could just apply it a single time, identifying all the registers $\mathbf{L}^{A\to E}\mathbf{L}^{B\to E}$ with $Q^{\prime}$. Our subsequent analysis should also essentially be able to provide upper bounds on $H_{\mathrm{max}}^{\nu}\left(\mathbf{L}^{A\to E}\mathbf{L}^{B\to E}\right)$ with appropriate modifications (e.g. the choice of value for the dimension bound in Sec. 4.1 may change, or the Hamiltonian used in Sec. 4.2). Whether this alternative approach yields better results would seem to depend on the details of the protocol setup and parameter choices, so we do not discuss it in further depth within this work.

In other words, this basically means that we can compensate for the registers $\mathbf{L}^{A\to E}\mathbf{L}^{B\to E}$ by subtracting (twice) their smooth max-entropies along with an additional constant “correction” term $6\vartheta_{\tau}$ (as well as slightly changing the smoothing parameter, which slightly worsens the final secrecy properties of the key, but not too much — see the Leftover Hashing Lemma in e.g. [Ren05, TL17] for details). Since the EAT can be used to prove that the $H_{\mathrm{min}}^{{\epsilon_{s}}}(\mathbf{S}|\mathsf{E}\mathbf{P})_{\rho_{|\mathrm{PE}}}$ term is of order $\Omega(n)$ in typical protocols, the $6\vartheta_{\tau}$ term is an almost-negligible correction, and we do not consider it in further detail. Our task is hence reduced to upper-bounding the smooth max-entropies $H_{\mathrm{max}}^{\nu}\left(\mathbf{L}^{A\to E}\right)_{\rho_{|\mathrm{PE}}}$ and $H_{\mathrm{max}}^{\nu}\left(\mathbf{L}^{B\to E}\right)_{\rho_{|\mathrm{PE}}}$ . Since these terms have basically the same structure in our model, in the remainder of this section we shall for brevity use $\mathbf{L}$ to denote either $\mathbf{L}^{A\to E}$ or $\mathbf{L}^{B\to E}$ , and describe how to bound $H_{\mathrm{max}}^{\nu}\left(\mathbf{L}\right)_{\rho_{|\mathrm{PE}}}$ .
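To get a feel for the size of these corrections, the following is a minimal sketch that evaluates the smoothing budget ${\epsilon^{\prime}_{s}}={\epsilon_{s}}+4\tau+8\nu$ and the total amount subtracted from the min-entropy. It assumes the closed form $\vartheta_{\epsilon}=\log\frac{1}{1-\sqrt{1-\epsilon^{2}}}$ familiar from the chain rules of [VDT+13]; the function names and every numerical input (including the placeholder max-entropy values) are hypothetical choices made for illustration only.

```python
import math


def vartheta(eps: float) -> float:
    """Base-2 log of 1 / (1 - sqrt(1 - eps^2)), for eps in (0, 1).

    ASSUMPTION: this closed form for the correction term is the one used in
    the chain-rule literature; it is not spelled out in the passage above.
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie in (0, 1)")
    root = math.sqrt(1.0 - eps * eps)
    # 1 - sqrt(1 - x) = x / (1 + sqrt(1 - x)); this form avoids the
    # catastrophic cancellation of the naive expression for tiny eps.
    return math.log2((1.0 + root) / (eps * eps))


def leakage_budget(eps_s, tau, nu, hmax_alice, hmax_bob):
    """Smoothing parameter and entropy penalty after two uses of the bound."""
    eps_s_prime = eps_s + 4.0 * tau + 8.0 * nu
    if eps_s_prime >= 1.0:
        raise ValueError("smoothing parameters are too large")
    penalty = 2.0 * hmax_alice + 2.0 * hmax_bob + 6.0 * vartheta(tau)
    return eps_s_prime, penalty


if __name__ == "__main__":
    # Hypothetical inputs: max-entropies of 1e5 bits per leakage string.
    eps_prime, penalty = leakage_budget(1e-10, 1e-10, 1e-10, 1e5, 1e5)
    print(f"eps_s' = {eps_prime:.3e}, bits subtracted = {penalty:.1f}")
```

Under this assumed form, even $\tau=10^{-10}$ gives a $6\vartheta_{\tau}$ term of only a few hundred bits, which is consistent with treating it as negligible next to an $\Omega(n)$ min-entropy.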

The approach we shall use to bound this smooth max-entropy is to relate it to the Rényi entropies (Definition 4), using some intermediate results that were proven in the derivation of the EAT in [DFR20], as well as some security proof techniques used in e.g. [Ren05, AFRV19, TSB+22]. Specifically, let $p_{\mathrm{PE}}$ be the (unknown) probability that the state was accepted during parameter estimation. Then for any value $\epsilon_{\mathrm{PE}}\in(0,1)$ , one of the following must be true:

• Either $p_{\mathrm{PE}}\leq\epsilon_{\mathrm{PE}}$, in which case the protocol’s security condition is trivially satisfied (see e.g. [AFRV19, TSB+22] for details; the more precise claim would be that the secrecy condition holds with secrecy parameter $\epsilon_{\mathrm{PE}}$) and hence we do not discuss it further here;

• Or $p_{\mathrm{PE}}>\epsilon_{\mathrm{PE}}$, in which case for any $\alpha\in[1/2,1)$ we have

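$$\begin{aligned}
H_{\mathrm{max}}^{\nu}\left(\mathbf{L}\right)_{\rho_{|\mathrm{PE}}} &\leq H_{\alpha}(\mathbf{L})_{\rho_{|\mathrm{PE}}}+\frac{\vartheta_{\nu}}{1/\alpha-1}\\
&\leq H_{\alpha}(\mathbf{L})_{\rho_{n}}+\frac{\log(1/\epsilon_{\mathrm{PE}})+\vartheta_{\nu}}{1/\alpha-1}\\
&\leq\sum_{j=1}^{n}\sup_{\omega}H_{\alpha}(L_{j})_{\mathcal{L}_{j}[\omega]}+\frac{\log(1/\epsilon_{\mathrm{PE}})+\vartheta_{\nu}}{1/\alpha-1},
\end{aligned}\tag{41}$$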

where we have used the following results from [DFR20]: the first line is Lemma B.10 (which holds for $\alpha\in[1/2,1)$), the second line is Lemma B.5$^{16}$ (which holds for $\alpha\in(0,1)$) together with the bound $p_{\mathrm{PE}}>\epsilon_{\mathrm{PE}}$, and the third line follows from a chain rule for Rényi entropies$^{17}$ presented as Corollary 3.5 in that work (see also Appendix B below), with the supremum in the last expression taking place over all input states to the channel $\mathcal{L}_{j}$.

$^{16}$ It would also have been possible to apply Lemma B.6 instead, but that would yield slightly worse dependence on the Rényi parameter in this context.

$^{17}$ We cannot simply claim that $H_{\alpha}(\mathbf{L})$ is upper bounded by $\sum_{j=1}^{n}H_{\alpha}(L_{j})$ (without the supremum over input states to the channels), because the Rényi entropies are not subadditive.

For the purposes of our analysis, $\alpha\in[1/2,1)$ can be considered a free parameter that should be chosen to optimize the upper bound on $H_{\mathrm{max}}^{\nu}\left(\mathbf{L}\right)_{\rho_{|\mathrm{PE}}}$. To estimate the scaling of this bound at large $n$, we can follow [DFR20] and take $1-\alpha\propto 1/\sqrt{n}$, so we have $1/({1/\alpha-1})\leq 1/({1-\alpha})=O(\sqrt{n})$, and for the dimension-bounded case$^{18}$ each $H_{\alpha}(L_{j})$ term converges to $H(L_{j})$ at a rate of order $O(1-\alpha)=O(1/\sqrt{n})$, by the continuity bounds in [DFR20]. Accounting for the sum over $j$, this means our upper bound in this case scales as $\left(\sum_{j=1}^{n}\sup_{\omega}H(L_{j})_{\mathcal{L}_{j}[\omega]}\right)+O(\sqrt{n})$, roughly similar to the AEP [TCR09]. (In fact we could also have obtained a bound with this scaling for the dimension-bounded case by directly applying the final EAT result [DFR20]; however, the approach we use here yields tighter results as we shall analyze $H_{\alpha}(L_{j})$ directly rather than taking an intermediate step of relating it to $H(L_{j})$.)

$^{18}$ This analysis does not carry over to the energy-bounded case with no dimension bound, because the $O(1-\alpha)$ continuity bound presented in [DFR20] has a dependence on $\dim(L_{j})$ (or at least the dimension of the support of the state, due to a $H_{0}$ term). In fact there exist infinite-dimensional states with infinite Rényi entropy for all $\alpha<1$ but finite von Neumann entropy; whether such states can be ruled out would depend on the Hamiltonian describing a given implementation.

With the bound (41), our task is reduced to upper-bounding $\sup_{\omega}H_{\alpha}(L_{j})_{\mathcal{L}_{j}[\omega]}$ (for all $j$ ). At first glance, it might seem that this is possible using only the bounded-weight leakage constraint (or the classical-probabilistic leakage constraint), since that enforces that the state produced by the leakage channel $\mathcal{L}_{j}$ is “close” to the pure state $\left|\phi\right>\!\!\left<\phi\right|$ , which has zero entropy. However, this alone does not quite work, since if the $L_{j}$ systems can have arbitrarily high dimension, they can still have arbitrarily high $H_{\alpha}(L_{j})$ despite the bounded-weight or classical-probabilistic leakage constraint (as can be seen from the subsequent sections, and also basically implied by the “random full-leakage attack” previously described in Sec. 2.2). Hence in this section we shall also require some choice of additional constraint as mentioned in Sec. 2.2, i.e. either a dimension bound or an energy bound.

Before proceeding, we show a helpful reduction to classical probability distributions (rather than quantum states): for each $L_{j}$ register, let $\big{\{}\left|e_{k}\right>_{L_{j}}\big{\}}$ be any orthonormal basis such that the first basis state $\left|e_{0}\right>$ is equal to $\left|\phi\right>$ , and let $\mathcal{P}$ denote the pinching channel with respect to that basis, i.e. $\mathcal{P}[\rho]\coloneqq\sum_{k}\left|e_{k}\right>\!\!\left<e_{k}\right|\rho\left|e_{k}\right>\!\!\left<e_{k}\right|$ . Since pinching channels are unital, they cannot decrease the Rényi entropy [Tom16], i.e. for any state $\rho$ we have

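$$H_{\alpha}(L_{j})_{\rho}\leq H_{\alpha}(L_{j})_{\mathcal{P}[\rho]}=\frac{1}{1-\alpha}\log\sum_{k}w_{k}^{\alpha},$$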

where $w_{k}\coloneqq\left<e_{k}\right|\rho\left|e_{k}\right>$ (these values $\mathbf{w}$ form a probability distribution, i.e. we have $w_{k}\geq 0$ and $\sum_{k}w_{k}=1$ ). Since the right-hand-side of the above bound is monotone increasing with respect to the $\sum_{k}w_{k}^{\alpha}$ term, to upper bound $H_{\alpha}(L_{j})$ it suffices to just consider the latter instead. Furthermore, note that under either the bounded-weight or classical-probabilistic leakage constraint (both these models give the same results in our subsequent analysis, in contrast to Sec. 3), we have $w_{0}=\left<\phi\right|\rho\left|\phi\right>\geq 1-\delta_{\mathrm{leak}}$ . With this, our task is reduced to studying the following optimization:

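$$\sup_{\mathbf{w}\in D}\ \sum_{k}w_{k}^{\alpha}\quad\text{s.t.}\quad w_{0}\geq 1-\delta_{\mathrm{leak}},\tag{45}$$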

where the optimization domain $D$ encodes the requirement that $\mathbf{w}$ is a probability distribution, as well as either a dimension bound or an energy bound, depending on which we choose to use (the above optimization is unbounded if $D$ is allowed to be e.g. all probability distributions $\mathbf{w}$ of arbitrary finite dimension). Given some upper bound $U_{\alpha}$ on the above optimization, the quantity of interest $\sup_{\omega}H_{\alpha}(L_{j})_{\mathcal{L}_{j}[\omega]}$ is simply upper bounded by $\frac{1}{1-\alpha}\log U_{\alpha}$ .

This is now just an optimization of Rényi entropy (with the logarithm omitted) over classical probability distributions; furthermore, the objective function is a concave function of $\mathbf{w}$ (recalling that we are using $\alpha<1$ ) and hence this is a concave optimization as long as $D$ is a convex set. We remark that this approach yields a tight bound on $\sup_{\omega}H_{\alpha}(L_{j})_{\mathcal{L}_{j}[\omega]}$ whenever for instance the leakage channels satisfy the following property: there exists a state $\omega$ attaining the supremum in $\sup_{\omega}H_{\alpha}(L_{j})_{\mathcal{L}_{j}[\omega]}$ , such that $\mathcal{P}\circ\mathcal{L}_{j}[\omega]$ is also a possible output state of $\mathcal{L}_{j}$ (i.e. basically that the leakage channels can also produce a classical version of an output state attaining the supremum). We now discuss the details of how to bound the above optimization when choosing $D$ to encode either a dimension bound or an energy bound.
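The pinching step can be checked by hand on a single qubit. The sketch below is an illustration under stated assumptions rather than part of the proof: it takes a hypothetical two-dimensional leakage register with diagonal entry $w_{0}$ and an off-diagonal coherence $c$ (both made-up values), and compares the Rényi entropy of the state's spectrum with that of its pinched diagonal $(w_{0},1-w_{0})$.

```python
import math


def renyi_entropy(probs, alpha):
    """Base-2 Rényi entropy of a probability vector, for alpha in (0, 1)."""
    return math.log2(sum(p ** alpha for p in probs if p > 0.0)) / (1.0 - alpha)


def qubit_spectrum(w0, coherence):
    """Eigenvalues of the qubit state [[w0, c], [conj(c), 1 - w0]]."""
    if abs(coherence) ** 2 > w0 * (1.0 - w0):
        raise ValueError("not a valid density matrix")
    radius = math.sqrt((2.0 * w0 - 1.0) ** 2 + 4.0 * abs(coherence) ** 2)
    return [(1.0 + radius) / 2.0, (1.0 - radius) / 2.0]


def pinched_distribution(w0):
    """Diagonal of the same state in the basis {|e_0> = |phi>, |e_1>}."""
    return [w0, 1.0 - w0]


if __name__ == "__main__":
    alpha, w0, c = 0.7, 0.95, 0.15  # hypothetical example values
    before = renyi_entropy(qubit_spectrum(w0, c), alpha)
    after = renyi_entropy(pinched_distribution(w0), alpha)
    print(f"H_alpha before pinching: {before:.4f}, after: {after:.4f}")
```

Since pinching only removes the off-diagonal coherence, the entropy after pinching is never smaller, and the two coincide when $c=0$.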

4.1 Dimension bounds

Here, we shall suppose that we are given some constant $d_{L}\in\mathbb{N}$ such that every $L_{j}$ register has dimension at most $d_{L}$ , i.e. so the domain $D$ in (45) is the set of $d_{L}$ -dimensional probability distributions. We first remark that this constraint could be motivated, for instance, if we suppose that the leakage register $L_{j}$ in each round is produced by some channel acting on a classical memory register $C_{j}$ with dimension upper bounded by some constant $d_{C}\in\mathbb{N}$ , i.e. a form of bounded-memory constraint. In that case, one can show that without loss of generality we can set $d_{L}=d_{C}+1$ : intuitively, this is because we only need that many dimensions to “encode the information” in $C_{j}$ while preserving the leakage constraints; we formalize this model and present the rigorous details in Appendix E (there are some subtleties, e.g. we need to start by analyzing $H_{\mathrm{min}}^{{\epsilon^{\prime}_{s}}}\left(\mathbf{S}|\mathbf{L}^{A\to E}\mathbf{L}^{B\to E}\mathsf{E}\mathbf{P}\right)_{\rho_{|\mathrm{PE}}}$ directly rather than the optimization (45)). As an example of a possible choice for $d_{C}$ , focusing on the case where $\mathbf{L}=\mathbf{L}^{A\to E}$ , we could for instance suppose that $C_{j}$ only stores the past ${k_{\mathrm{max}}}$ outputs from Alice’s device for some fixed ${k_{\mathrm{max}}}$ , in which case we have $d_{C}=d_{A}^{k_{\mathrm{max}}}$ where $d_{A}$ is the dimension of Alice’s single-round output (or if we want to allow $C_{j}$ to store the past ${k_{\mathrm{max}}}$ outputs from both devices, just take $d_{C}=(d_{A}d_{B})^{k_{\mathrm{max}}}$ instead).

We also highlight that such a dimension constraint is rather different from the one analyzed in the leakage model of [JK21]: in that work, to obtain nontrivial results, the total dimension of all the leakage registers over the protocol must be small (more precisely, while the log-dimension can be of order $\Omega(n)$, it must be strictly less than the amount of smooth min-entropy the devices would have generated without leakage). In our model, however, we can allow the total log-dimension of $\mathbf{L}$ to be much larger than $H_{\mathrm{min}}^{{\epsilon_{s}}}(\mathbf{S}|\mathsf{E}\mathbf{P})_{\rho_{|\mathrm{PE}}}$, e.g. we can set $d_{L}$ to be larger than the dimension of an $S_{j}$ register (as would be the case if we choose $d_{L}=d_{A}^{k_{\mathrm{max}}}+1$ following the above bounded-memory discussion), and still obtain nontrivial results.$^{19}$

$^{19}$ Still, qualitatively speaking it seems that a potential alternative approach in our setting might have been to use our leakage constraints to argue that in a classical sense, with high probability “not too many” of the $L_{j}$ registers are in a nontrivial state, and use this to bound the dimension of the support of the state on $\mathbf{L}$ (given a dimension bound $d_{L}$), then apply the analysis in [JK21]. However, it currently does not seem straightforward to formalize this into a rigorous argument when accounting for non-IID behaviour. We highlight, though, that in any case $H_{\mathrm{max}}^{\nu}\left(\mathbf{L}\right)$ approximately characterizes the size of the support of the state (see e.g. [TL17]), so the approach we present here could already be viewed in some sense as one approach to formalize this intuition.

We now turn to the main goal of upper-bounding (45) given some dimension bound $d_{L}$ . This is in fact straightforward: intuitively, the maximum entropy should be achieved by setting $w_{0}$ to be as low as possible and then distributing the remaining probability uniformly over the other $d_{L}-1$ variables $w_{k}$ (assuming that $\delta_{\mathrm{leak}}<1-1/d_{L}$ ; otherwise we can just set $\mathbf{w}$ to be the uniform distribution and attain the trivial maximum value $H_{\alpha}(L_{j})=\log d_{L}$ for the Rényi entropy), i.e. set $w_{0}=1-\delta_{\mathrm{leak}}$ and $w_{k}=\delta_{\mathrm{leak}}/(d_{L}-1)$ otherwise. This can be quickly confirmed by a symmetry argument: since the optimization is invariant under permutations of the variables $\{w_{k}|k\neq 0\}$ , and the objective function is concave, the maximum value can always be attained by some solution in which all $\{w_{k}|k\neq 0\}$ have the same value, from which the claim easily follows. (Alternatively, one could use a Lagrange-multiplier argument; see Appendix G.) Hence the optimization (45) evaluates to

$$(1-\delta_{\mathrm{leak}})^{\alpha}+(d_{L}-1)^{1-\alpha}\,\delta_{\mathrm{leak}}^{\alpha},$$

as long as $\delta_{\mathrm{leak}}<1-1/d_{L}$ (otherwise the optimal value becomes just the trivial maximum value and we cannot obtain any useful results; though in any case, for any scenario where we can expect to obtain nontrivial results we almost certainly have $\delta_{\mathrm{leak}}\leq 1/2$ and thus $\delta_{\mathrm{leak}}\leq 1-1/d_{L}$ ).
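As a sanity check on the symmetry argument, the following sketch compares the value obtained from $w_{0}=1-\delta_{\mathrm{leak}}$ and $w_{k}=\delta_{\mathrm{leak}}/(d_{L}-1)$ against randomly sampled feasible distributions. The sampling scheme, function names and parameter values are hypothetical and serve only to illustrate that no feasible point exceeds the closed-form value.

```python
import math
import random


def dimension_bound_optimum(delta_leak, d_L, alpha):
    """Maximum of sum_k w_k^alpha over d_L-dimensional w with w_0 >= 1 - delta_leak."""
    if delta_leak >= 1.0 - 1.0 / d_L:
        # The uniform distribution is feasible, giving the trivial maximum.
        return d_L ** (1.0 - alpha)
    return (1.0 - delta_leak) ** alpha + (d_L - 1.0) ** (1.0 - alpha) * delta_leak ** alpha


def random_feasible(delta_leak, d_L, rng):
    """Sample a distribution on d_L outcomes with w_0 >= 1 - delta_leak."""
    leak = rng.uniform(0.0, delta_leak)          # total weight off the state |phi>
    cuts = [rng.random() for _ in range(d_L - 1)]
    total = sum(cuts)
    return [1.0 - leak] + [leak * c / total for c in cuts]


if __name__ == "__main__":
    rng = random.Random(0)
    delta_leak, d_L, alpha = 0.05, 33, 0.8       # hypothetical example values
    optimum = dimension_bound_optimum(delta_leak, d_L, alpha)
    sampled = max(
        sum(p ** alpha for p in random_feasible(delta_leak, d_L, rng))
        for _ in range(10000)
    )
    print(f"closed form: {optimum:.6f}, best random sample: {sampled:.6f}")
    print(f"Renyi entropy bound: {math.log2(optimum) / (1.0 - alpha):.6f} bits")
```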

Putting this together with (41)–(42), we obtain an explicit upper bound on $H_{\mathrm{max}}^{\nu}\left(\mathbf{L}\right)_{\rho_{|\mathrm{PE}}}$ , for $\delta_{\mathrm{leak}}<1-1/d_{L}$ and any $\alpha\in[1/2,1)$ :

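$$H_{\mathrm{max}}^{\nu}\left(\mathbf{L}\right)_{\rho_{|\mathrm{PE}}}\leq\frac{n}{1-\alpha}\log\!\left((1-\delta_{\mathrm{leak}})^{\alpha}+(d_{L}-1)^{1-\alpha}\,\delta_{\mathrm{leak}}^{\alpha}\right)+\frac{\log(1/\epsilon_{\mathrm{PE}})+\vartheta_{\nu}}{1/\alpha-1}.$$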

To provide a simple example calculation of the above bound, we plot it as a function of $n$ for various choices of $\delta_{\mathrm{leak}}$ and the parameters $\nu,\epsilon_{\mathrm{PE}}$ in Fig. 3, with numerical optimization of the Rényi parameter $\alpha\in[1/2,1)$ . We choose $d_{L}=2^{5}+1$ , which in terms of our above discussion regarding bounded memory, we can view as supposing each leakage register $L_{j}$ is produced from a classical memory register of no more than $5$ bits (alternatively we can just suppose the leakage registers inherently have maximum dimension $d_{L}$ for some physical reason). We see that it is possible to obtain nontrivial bounds on $\frac{1}{n}H_{\mathrm{max}}^{\nu}\left(\mathbf{L}\right)_{\rho_{|\mathrm{PE}}}$ for these parameter choices, and also that the value becomes close to the asymptotic value (which should be $\sup_{\omega}H(L_{j})_{\mathcal{L}_{j}[\omega]}$ ) at approximately $n\sim 10^{9}$ .
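A calculation of this kind can be sketched in a few lines. The code below is not the script used for Fig. 3: it assumes that the bound takes the form $\frac{n}{1-\alpha}\log\!\big((1-\delta_{\mathrm{leak}})^{\alpha}+(d_{L}-1)^{1-\alpha}\delta_{\mathrm{leak}}^{\alpha}\big)+\frac{\log(1/\epsilon_{\mathrm{PE}})+\vartheta_{\nu}}{1/\alpha-1}$ with $\vartheta_{\nu}=\log\frac{1}{1-\sqrt{1-\nu^{2}}}$, replaces the numerical optimizer by a simple log-spaced grid over $1-\alpha$, and uses hypothetical values for $\delta_{\mathrm{leak}}$, $\nu$ and $\epsilon_{\mathrm{PE}}$.

```python
import math


def vartheta(eps):
    """ASSUMED form of the correction term: log2(1 / (1 - sqrt(1 - eps^2)))."""
    root = math.sqrt(1.0 - eps * eps)
    return math.log2((1.0 + root) / (eps * eps))  # cancellation-free form


def per_round_bound(n, delta_leak, d_L, nu, eps_pe, alpha):
    """Upper bound on H_max^nu(L) / n for one alpha in [1/2, 1).

    Requires delta_leak < 1 - 1/d_L so that the closed-form optimum applies.
    """
    u_alpha = (1.0 - delta_leak) ** alpha + (d_L - 1.0) ** (1.0 - alpha) * delta_leak ** alpha
    single_round = math.log2(u_alpha) / (1.0 - alpha)
    correction = (math.log2(1.0 / eps_pe) + vartheta(nu)) / (1.0 / alpha - 1.0)
    return single_round + correction / n


def optimized_bound(n, delta_leak, d_L, nu, eps_pe, grid=2000):
    """Minimize over a log-spaced grid of 1 - alpha in [5e-7, 1/2]."""
    best = math.inf
    for i in range(grid + 1):
        one_minus_alpha = 0.5 * 10.0 ** (-6.0 * i / grid)
        value = per_round_bound(n, delta_leak, d_L, nu, eps_pe, 1.0 - one_minus_alpha)
        best = min(best, value)
    return best


def asymptotic_value(delta_leak, d_L):
    """Shannon entropy of the optimizing distribution (the alpha -> 1 limit)."""
    return (-(1.0 - delta_leak) * math.log2(1.0 - delta_leak)
            - delta_leak * math.log2(delta_leak / (d_L - 1.0)))


if __name__ == "__main__":
    for n in (10**6, 10**8, 10**10):
        print(n, optimized_bound(n, 0.01, 2**5 + 1, 1e-10, 1e-10))
    print("asymptotic", asymptotic_value(0.01, 2**5 + 1))
```

The grid search is a deliberately crude stand-in for a proper one-dimensional optimizer; it is adequate here because the objective varies slowly on a logarithmic scale in $1-\alpha$.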

4.2 Energy bounds

If we do not wish to impose a “hard” dimension bound on the leakage registers (or the registers they are generated from), we can instead follow a common approach for avoiding such dimension bounds, namely to impose an upper bound $E_{\mathrm{exp}}$ on the expectation value of the energy with respect to some Hamiltonian; this informally ensures that the state cannot have too much “weight” on high energy levels.$^{20}$ At first glance, it would seem that with this we could argue that the state is close to one with support on some finite-dimensional low-energy subspace, then apply the dimension-bound analysis in the previous section. Unfortunately, this argument does not quite work out because without further structure in the Hamiltonian, for any fixed $\delta>0$ it is possible for a state on an infinite-dimensional space to be $\delta$-close (in e.g. trace distance) to one with finite-dimensional support and yet have arbitrarily high entropy, i.e. we do not have uniform continuity of entropy for infinite-dimensional systems.$^{21}$ Hence in this section we analyze how to tackle the optimization (45) directly under an energy constraint, rather than attempting to relate it to some intermediate state with finite-dimensional support.

$^{20}$ In the analysis for this section we will allow the leakage registers to have infinite dimension. Technically, this is not precisely consistent with our generic assumption in this work that all systems have finite (though possibly unknown) dimension, but we can just view this as a minor relaxation of that condition within this section.

$^{21}$ This can be seen from e.g. the discussion in [Win16], but for a concrete example in our context, take any Hamiltonian with infinitely many energy levels below some finite value $E_{\star}$. Note that there always exists some sufficiently small $t>0$ such that setting $w_{0}=1-t$ and distributing the remaining weight in any fashion across the energy levels below $E_{\star}$ will still satisfy the energy constraint. Given there are infinitely many such energy levels, we can distribute this weight uniformly across arbitrarily many of them, which yields arbitrarily high entropy (and in fact this allows the “random full-leakage attack” described in Sec. 2.2).

We briefly remark on some possible alternatives. One approach could have been to use the continuity bound in [Win16] based on an energy bound (and some properties of the Hamiltonian) to formalize the above argument involving an intermediate state with finite-dimensional support. However, the continuity bound in [Win16] is for the von Neumann entropy, and hence in our context we would need to either generalize the proof to Rényi entropy, or bound the difference between the von Neumann entropy and Rényi entropy under an energy bound (again, if the dimension is bounded, this follows from e.g. the dimension-dependent continuity bound in [DFR20], but with only an energy bound it is less clear how to proceed). In any case, this approach seems more indirect than what we use below, and hence is likely to yield worse bounds. Another potential approach could be to try using the energy bound to argue that the entire state produced by the protocol is within e.g. trace distance $\tilde{\epsilon}$ of another state where all the $L_{j}$ registers have bounded dimension. In that case, if we use the bounded-dimension analysis to prove that the latter produces an $\epsilon^{\mathrm{sec}}$-secret key (see e.g. [PR21, Tan21] for a detailed definition and discussion of $\epsilon^{\mathrm{sec}}$-secrecy), we can conclude that the original state produces an $(\epsilon^{\mathrm{sec}}+\tilde{\epsilon})$-secret key by the triangle inequality. This approach bypasses the continuity-bound obstacles described above, because the $\tilde{\epsilon}$-closeness is used directly to bound the trace distance to some final ideal state produced at the very end of the protocol, rather than to invoke a continuity argument in the intermediate entropic analysis. However, as we discuss in Appendix F, the scaling of the dimension bounds we could obtain from such an argument seems unlikely to be useful.

We now present our approach for handling the optimization (45) with an energy bound. We require the Hamiltonian $H$ to have the property that one of the ground states is equal to the state $\left|\phi\right>$ we defined in our leakage model, and that the system has countable (possibly infinite) dimension. With this, we can take the eigenbasis of the Hamiltonian as the orthonormal basis $\big{\{}\left|e_{k}\right>_{L_{j}}\big{\}}$ used to construct the optimization (45), ordering the eigenvectors such that $\left|e_{0}\right>=\left|\phi\right>$ . We denote the energy eigenvalues as $E_{k}$ , and for ease of presentation we set the ground state energy $E_{0}$ to be zero without loss of generality. The energy bound we shall consider is to say that for any state $\rho$ that could be produced by the leakage channel $\mathcal{L}_{j}$ , the expectation value of the energy with respect to $H$ (i.e. $\operatorname{Tr}\!\left[\rho H\right]$ ) is upper bounded by some constant $E_{\mathrm{exp}}\geq 0$ . Recalling how the $w_{k}$ variables in (45) were defined in terms of the basis $\big{\{}\left|e_{k}\right>_{L_{j}}\big{\}}$ , we have $\operatorname{Tr}\!\left[\rho H\right]=\sum_{k}w_{k}E_{k}$ . With this, the optimization (45) can be written as follows (for use in our later analysis, we now write out the normalization condition as an explicit constraint):

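$$\begin{aligned}
\sup_{\mathbf{w}}\quad &\sum_{k}w_{k}^{\alpha}\\
\text{s.t.}\quad & w_{0}\geq 1-\delta_{\mathrm{leak}},\\
&\sum_{k}w_{k}E_{k}\leq E_{\mathrm{exp}},\\
&\sum_{k}w_{k}=1,\\
& w_{k}\geq 0\quad\forall k,
\end{aligned}\tag{50}$$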

for a countable (possibly infinite) number of variables $w_{k}$ . (Note that even with the conditions listed above on the Hamiltonian, it is still possible for the optimization (50) to be unbounded; e.g. as a trivial example, if it has an infinite number of energy levels below $E_{\mathrm{exp}}$ . However, we shall not attempt to impose further conditions here, instead leaving it up to individual applications whether the Hamiltonian of interest yields a finite value in the optimization.)

The optimization (50) is essentially just an entropy maximization problem subject to an energy constraint (and a ground-state constraint) — this is similar to standard questions in thermodynamics, except that here we are considering a Rényi entropy rather than Shannon/Boltzmann entropy. Hence we can apply analogous approaches to tackle the optimization; specifically, here we use a Lagrange dual analysis. We present the details in Appendix G, with the main result being the following upper bound on (50). (In particular, we remark that the parameter $\beta$ in this bound is a Lagrange dual variable for the energy constraint, somewhat analogous to the thermodynamic inverse-temperature parameter $\beta=1/(k_{B}T)$ ; hence our choice of notation.) In the following lemma, the optimization (50) and the function $g(\kappa,\beta,\lambda)$ should be understood to have values in the extended reals $\mathbb{R}\cup\{\pm\infty\}$ , following standard conventions in optimization theory (i.e. for instance they take value $+\infty$ if (50) is unbounded above or if the summation in (51) diverges).

Lemma 1.

For any values $\kappa,\beta,\lambda\in\mathbb{R}$ such that $\beta\geq 0$ and $\lambda>\kappa\geq 0$ , the optimization (50) is upper bounded by

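$$g(\kappa,\beta,\lambda)\coloneqq\lambda-\kappa(1-\delta_{\mathrm{leak}})+\beta E_{\mathrm{exp}}+(1-\alpha)\left[\left(\frac{\alpha}{\lambda-\kappa}\right)^{\frac{\alpha}{1-\alpha}}+\sum_{k\neq 0}\left(\frac{\alpha}{\beta E_{k}+\lambda}\right)^{\frac{\alpha}{1-\alpha}}\right].\tag{51}$$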

Furthermore, $g$ is a convex function of $(\kappa,\beta,\lambda)$ , and as long as $\delta_{\mathrm{leak}},E_{\mathrm{exp}}>0$ , it yields a tight bound in the sense that the optimal value of (50) is equal to

$$\inf_{\substack{\beta\geq 0\\ \lambda>\kappa\geq 0}}g(\kappa,\beta,\lambda).\tag{52}$$

This gives us a variational method to bound the optimization (50), by simply optimizing over the choice of $\kappa,\beta,\lambda$ ; furthermore, since the optimization (52) is convex, it should be well-behaved under heuristic numerical methods. We remark that for the scenarios we studied in this work, we found that the optimal value of $\beta$ could be very small, and hence numerical stability was improved by reparametrizing it as $\beta=10^{-z}$ and optimizing over $z$ instead. (With this reparametrization the optimization (52) may become nonconvex; however, since $10^{-z}$ is a strictly monotone function this reparametrization preserves the property that any local minimum is a global minimum, so heuristic numerical methods should still perform well.)
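The variational use of the lemma can be illustrated on a toy Hamiltonian with finitely many levels. The sketch below rests on an assumption: it takes the dual function to be the standard Lagrangian form $g(\kappa,\beta,\lambda)=\lambda-\kappa(1-\delta_{\mathrm{leak}})+\beta E_{\mathrm{exp}}+(1-\alpha)\big[(\tfrac{\alpha}{\lambda-\kappa})^{\frac{\alpha}{1-\alpha}}+\sum_{k\neq 0}(\tfrac{\alpha}{\beta E_{k}+\lambda})^{\frac{\alpha}{1-\alpha}}\big]$, obtained by maximizing $w^{\alpha}-cw$ over $w\geq 0$ for each level. The energy levels, parameter values and the coarse grid search (standing in for a real optimizer) are all hypothetical.

```python
import math


def dual_g(kappa, beta, lam, energies, alpha, delta_leak, e_exp):
    """ASSUMED dual function for the energy-constrained problem.

    `energies` lists E_k for k != 0; the ground state has E_0 = 0.
    """
    if not (beta >= 0.0 and lam > kappa >= 0.0):
        raise ValueError("need beta >= 0 and lam > kappa >= 0")
    s = alpha / (1.0 - alpha)
    levels = (alpha / (lam - kappa)) ** s
    levels += sum((alpha / (beta * e + lam)) ** s for e in energies)
    return lam - kappa * (1.0 - delta_leak) + beta * e_exp + (1.0 - alpha) * levels


def crude_dual_search(energies, alpha, delta_leak, e_exp, steps=30):
    """Coarse grid search over (kappa, z, lam), with beta = 10**(-z)."""
    best_value, best_point = math.inf, None
    for i in range(1, steps + 1):
        lam = 5.0 * i / steps
        for j in range(steps):
            kappa = lam * j / steps            # keeps 0 <= kappa < lam
            for k in range(steps + 1):
                z = -1.0 + 7.0 * k / steps     # beta from 10 down to 1e-6
                value = dual_g(kappa, 10.0 ** (-z), lam, energies,
                               alpha, delta_leak, e_exp)
                if value < best_value:
                    best_value, best_point = value, (kappa, z, lam)
    return best_value, best_point


def primal_objective(w, alpha):
    """Sum of w_k^alpha for a candidate distribution w."""
    return sum(p ** alpha for p in w)


if __name__ == "__main__":
    energies = [float(l) for l in range(1, 21)]   # toy spectrum, 20 excited levels
    bound, point = crude_dual_search(energies, 0.9, 0.05, 0.1)
    feasible_w = [0.96, 0.04] + [0.0] * 19        # satisfies both constraints
    print("dual upper bound:", bound, "at (kappa, z, lam) =", point)
    print("feasible primal value:", primal_objective(feasible_w, 0.9))
```

Every grid point already yields a valid upper bound, so the search only affects how tight the result is, never its validity.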

As a demonstration of our method, we now analyze the case of a harmonic oscillator Hamiltonian with $M$ independent modes$^{22}$ and a common ground state $\left|\phi\right>$ for all the modes, i.e. if for each mode $m$ we use $\Delta_{m}$ to denote the energy level spacing and $\left|e_{m,l}\right>$ to denote the eigenstate for the $l^{\text{th}}$ energy level, it has the form

$$H=\sum_{m=1}^{M}\sum_{l=1}^{\infty}l\,\Delta_{m}\left|e_{m,l}\right>\!\!\left<e_{m,l}\right|.\tag{53}$$

$^{22}$ Here we suppose that $M$ is a finite fixed value, since again, if there are arbitrarily many distinguishable modes then in principle they could be used to encode enough information to allow the “random full-leakage attack”, even under the $\delta_{\mathrm{leak}}$ constraint. Ruling out that attack under such conditions would require more constraints on the Hamiltonian and/or states (for instance, that only a finite number $M$ of the modes are “nontrivial”, i.e. only those modes can have correlations with the secret data, in which case the analysis in this section can indeed be applied).

(In terms of the notation in (50), we are taking the index $k$ to have values in $\{0\}\cup\left(\{1,2,\dots,M\}\times\mathbb{N}\right)$ , where $k=0$ labels the ground state $\left|\phi\right>\!\!\left<\phi\right|$ and otherwise $k=(m,l)$ labels the $l^{\text{th}}$ energy level of mode $m$ .)

For this Hamiltonian, for each mode $m$ the infinite sum in (51) takes the form $\sum_{l=1}^{\infty}\left(\frac{\alpha}{\beta l\Delta_{m}+\lambda}\right)^{s}$ where the exponent $s\coloneqq\frac{\alpha}{1-\alpha}$ is larger than $1$ (and $\alpha,\beta,\Delta_{m},\lambda\geq 0$ ), hence the sum indeed converges as long as $\beta>0$ . If we consider the Hurwitz zeta function $\zeta(s,x)\coloneqq\sum_{l=0}^{\infty}(l+x)^{-s}$ , which some computational software programs can evaluate using inbuilt methods, we could write this sum as $\left(\frac{\alpha}{\beta\Delta_{m}}\right)^{s}\left(\zeta\!\left(s,\frac{\lambda}{\beta\Delta_{m}}\right)-\left(\frac{\beta\Delta_{m}}{\lambda}\right)^{s}\right)$ . However, for more flexibility (and numerical stability, as we find that the Hurwitz zeta computation can be unstable in some parameter regimes) we can instead use a simple upper bound in terms of the integral version of the sum (since the terms in the sum can be written as a decreasing function of $l\in\mathbb{R}_{\geq 0}$ ):

[TABLE]

Note that conversely, the sum is lower-bounded by the above expression with the $\left(\frac{\alpha}{\beta\Delta_{m}+\lambda}\right)^{\frac{\alpha}{1-\alpha}}$ term omitted, hence giving an estimate of the tightness of this bound. We find that for the examples considered below, this term is indeed very small, indicating the above bound is quite tight.
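The integral comparison above can be sketched numerically as follows. This is an illustrative reconstruction under our own assumptions, not the exact expression used in the main text: writing $f(l)=\left(\frac{\alpha}{\beta l\Delta_{m}+\lambda}\right)^{s}$ with $s=\frac{\alpha}{1-\alpha}>1$, a decreasing $f$ satisfies $\int_{1}^{\infty}f(l)\,dl\leq\sum_{l=1}^{\infty}f(l)\leq f(1)+\int_{1}^{\infty}f(l)\,dl$, and the integral evaluates to $\frac{\alpha^{s}(\beta\Delta_{m}+\lambda)^{1-s}}{\beta\Delta_{m}(s-1)}$. The function names and the parameter values in the usage lines are hypothetical.

```python
import math


def tail_sum_bounds(alpha, beta, delta_m, lam):
    """Return (lower, upper) integral-comparison bounds on
    sum_{l>=1} (alpha / (beta * l * delta_m + lam)) ** s, with s = alpha / (1 - alpha)."""
    if not (0.5 < alpha < 1.0 and beta > 0.0 and delta_m > 0.0 and lam >= 0.0):
        raise ValueError("need 1/2 < alpha < 1, beta > 0, delta_m > 0, lam >= 0")
    s = alpha / (1.0 - alpha)
    step = beta * delta_m
    first_term = (alpha / (step + lam)) ** s  # f(1)
    integral = alpha ** s * (step + lam) ** (1.0 - s) / (step * (s - 1.0))
    return integral, first_term + integral


def truncated_sum(alpha, beta, delta_m, lam, n_terms=200_000):
    """Partial sum of the series, used only as a sanity check on the bounds."""
    s = alpha / (1.0 - alpha)
    return math.fsum(
        (alpha / (beta * l * delta_m + lam)) ** s for l in range(1, n_terms + 1)
    )


# Hypothetical parameter values, chosen so that s = 3 exactly.
lower, upper = tail_sum_bounds(alpha=0.75, beta=1.0, delta_m=1.0, lam=1.0)
partial = truncated_sum(alpha=0.75, beta=1.0, delta_m=1.0, lam=1.0)
print(f"{lower:.6f} <= {partial:.6f} <= {upper:.6f}")
```

The gap between the two bounds is exactly the $f(1)$ term, which is the tightness estimate mentioned above. For $\alpha$ very close to $1$ the exponent $s$ becomes large and a direct floating-point evaluation like this one can overflow or underflow, so a log-domain evaluation may be preferable in that regime.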

With this, we compute upper bounds on $H_{\mathrm{max}}^{\nu}\left(\mathbf{L}\right)_{\rho_{|\mathrm{PE}}}$ for some parameter choices. We first remark that the formula (54) unfortunately appears to be numerically unstable if $\alpha$ is close to $1$ , and hence we were unable to optimize over all $\alpha\in[1/2,1)$ as we did in the previous section. Instead we simply chose a few fixed values of $\alpha$ , and computed the corresponding bounds on $\sup_{\omega}H_{\alpha}(L_{j})_{\mathcal{L}_{j}[\omega]}$ ; recalling our discussion below the bound (41), this means the resulting bound on $\frac{1}{n}H_{\mathrm{max}}^{\nu}\left(\mathbf{L}\right)_{\rho_{|\mathrm{PE}}}$ asymptotically approaches the $\alpha$ -Rényi entropy for the chosen value of $\alpha$ , rather than the von Neumann entropy (though as previously observed, this is not unexpected since in principle there could be Hamiltonians such that the difference between the $\alpha$ -Rényi entropy and von Neumann entropy in this optimization is unbounded). Still, we find that for some $\alpha$ choices it is possible to obtain reasonable results (also, we compute the corresponding values for the $(\log(1/\epsilon_{\mathrm{PE}})+\vartheta_{\nu})/({1/\alpha-1})$ term to verify that it is not unreasonably large).

Specifically, we take the example of a Hamiltonian with two modes, with energy level spacings $\Delta_{1}=u$ and $\Delta_{2}=2u$ for some arbitrary energy unit $u$ (as one would expect, the choice of unit does not affect the final bound, as it corresponds to just rescaling the parameter $\beta$ in (51)). We compute results for $\alpha\in\{0.9,0.99,0.999\}$ , for which the corresponding values of $1/({1/\alpha-1})$ are $9$ , $99$ , and $999$ respectively (basically, when $\alpha=1-x$ for some $x\in(0,1)$ , the value is $1/x-1$ ). These values (even after multiplying by the numerator $\log(1/\epsilon_{\mathrm{PE}})+\vartheta_{\nu}$ of that term in (41)) should not be too large compared to the $\Omega(n)$ smooth min-entropy term, for instance in photonic implementations which may have $n\geq 10^{11}$ [LLR+21]. With this model, we obtain the following upper bounds on $\sup_{\omega}H_{\alpha}(L_{j})_{\mathcal{L}_{j}[\omega]}$ for $E_{\mathrm{exp}}=10^{5}u$ :

$\delta_{\mathrm{leak}}$	$\alpha=0.9$	$\alpha=0.99$	$\alpha=0.999$
$10^{-2}$	$1.14987$	$0.37129$	$0.33713$
$10^{-3}$	$0.19596$	$0.04566$	$0.04053$
$10^{-4}$	$0.03199$	$0.00545$	$0.00476$

and for $E_{\mathrm{exp}}=10^{12}u$ :

$\delta_{\mathrm{leak}}$	$\alpha=0.9$	$\alpha=0.99$	$\alpha=0.999$
$10^{-2}$	$5.37979$	$0.68498$	$0.57672$
$10^{-3}$	$1.00513$	$0.07861$	$0.06460$
$10^{-4}$	$0.16480$	$0.00890$	$0.00715$

We see that despite the large $E_{\mathrm{exp}}$ values, we can still obtain nontrivial bounds on $\sup_{\omega}H_{\alpha}(L_{j})_{\mathcal{L}_{j}[\omega]}$ for smaller values of $\delta_{\mathrm{leak}}$ and/or values of $\alpha$ closer to $1$ . We leave for future work the topic of choosing more specialized Hamiltonians tailored for specific implementations, and finding if they yield nontrivial bounds.
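To make the finite-size prefactor $1/({1/\alpha-1})$ discussed above concrete, the following sketch evaluates it exactly with rational arithmetic and estimates the resulting per-round overhead $(\log(1/\epsilon_{\mathrm{PE}})+\vartheta_{\nu})/\big(n({1/\alpha-1})\big)$. The function names are ours, and the numerical values passed for $\epsilon_{\mathrm{PE}}$ and $\vartheta_{\nu}$ are placeholder assumptions rather than values taken from this work; only $n=10^{11}$ and the three $\alpha$ choices come from the text.

```python
import math
from fractions import Fraction


def renyi_prefactor(alpha):
    """Return 1 / (1/alpha - 1) exactly for a rational alpha in (0, 1)."""
    alpha = Fraction(alpha)
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return 1 / (1 / alpha - 1)


def per_round_overhead(alpha, eps_pe, vartheta_nu, n_rounds):
    """(log2(1/eps_pe) + vartheta_nu) / (1/alpha - 1), divided by n_rounds.

    vartheta_nu is treated as an externally supplied number (assumption).
    """
    numerator = math.log2(1.0 / eps_pe) + vartheta_nu
    return float(renyi_prefactor(alpha)) * numerator / n_rounds


for a in ("0.9", "0.99", "0.999"):
    # With alpha = 1 - x the prefactor equals 1/x - 1.
    overhead = per_round_overhead(a, eps_pe=1e-10, vartheta_nu=50.0, n_rounds=1e11)
    print(a, renyi_prefactor(a), f"{overhead:.2e}")
```

Under these placeholder inputs the overhead per round is several orders of magnitude below the tabulated bounds on $\sup_{\omega}H_{\alpha}(L_{j})_{\mathcal{L}_{j}[\omega]}$, consistent with the remark that the term should not be too large for $n\geq 10^{11}$.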

5 Conclusion and further work

In this work, we have provided techniques to compute lower bounds on the achievable key lengths in DI protocols with constrained leakage, covering both the analysis of single rounds and the required steps to obtain a finite-size security proof without an IID assumption. While we have not considered specific implementations in detail, the techniques we provide are intended to be flexible and easily built into the existing proof techniques, with the exact parameter choices being fine-tuned for individual implementations. Our results suggest that the existing DI protocol implementations should be robust against a small amount of leakage from the devices, although with our current proof techniques, we may require the leakage parameter $\delta_{\mathrm{leak}}$ to have rather small values to obtain nontrivial results (especially for the bounded-weight leakage model). However, we highlight that for the bounded-weight model, there is still room for potential sharpening of the bounds; specifically, the continuity bound (23) could be significantly sharpened if one were to find a continuity bound based directly on fidelity instead of trace distance.

As a possible extension or variant of the approach presented here, we note that recently, a privacy amplification theorem was developed in [Dup21] based on Rényi entropy rather than smooth min-entropy. If desired, it seems possible to implement our approach in a proof based on that theorem as well. Specifically, there exists a powerful chain rule for (appropriately defined) conditional Rényi entropies [Dup15]: for any $\alpha,\alpha^{\prime},\alpha^{\prime\prime}\in(1/2,1)\cup(1,\infty)$ such that $\frac{\alpha}{\alpha-1}=\frac{\alpha^{\prime}}{\alpha^{\prime}-1}+\frac{\alpha^{\prime\prime}}{\alpha^{\prime\prime}-1}$ , one has $H_{\alpha}(QQ^{\prime}|Q^{\prime\prime})\geq H_{\alpha^{\prime}}(Q|Q^{\prime}Q^{\prime\prime})+H_{\alpha^{\prime\prime}}(Q^{\prime}|Q^{\prime\prime})$ if $(\alpha-1)(\alpha^{\prime}-1)(\alpha^{\prime\prime}-1)>0$ , and the inequality is reversed if $(\alpha-1)(\alpha^{\prime}-1)(\alpha^{\prime\prime}-1)<0$ . By using this chain rule, we should be able to obtain an analogue of the bounds (39)–(40) here, hence lower-bounding $H_{\alpha}\left(\mathbf{S}|\mathbf{L}^{A\to E}\mathbf{L}^{B\to E}\mathsf{E}\mathbf{P}\right)$ in terms of $H_{\overline{\alpha}}(\mathbf{S}|\mathsf{E}\mathbf{P})$ and $H_{\hat{\alpha}}\left(\mathbf{L}^{A\to E}\right),H_{\hat{\alpha}}\left(\mathbf{L}^{B\to E}\right)$ for some $\overline{\alpha},\hat{\alpha}$ . Now, while in this work we have described the EAT as bounding $H_{\mathrm{min}}^{{\epsilon_{s}}}\left(\mathbf{S}|\mathsf{E}\mathbf{P}\right)$ , in fact it more fundamentally provides a lower bound on the corresponding Rényi entropy, so that would let us handle $H_{\overline{\alpha}}(\mathbf{S}|\mathsf{E}\mathbf{P})$ . Similarly, our approach for bounding $H_{\mathrm{max}}^{\nu}\left(\mathbf{L}\right)$ in this work also proceeds via bounding the Rényi entropy, so it already provides a bound on the $H_{\hat{\alpha}}\left(\mathbf{L}^{A\to E}\right),H_{\hat{\alpha}}\left(\mathbf{L}^{B\to E}\right)$ terms. 
Hence this approach seems plausible (in fact, this chain rule has already been applied in part of a security proof for device-dependent QKD in [GLH+22]), though we leave the details and potential numerical comparison for future work.

Acknowledgements

We thank Jean-Daniel Bancal, Peter Brown, Christopher Chubb, Omar Fawzi, Srijita Kundu, Tony Metger, Joseph Renes, Renato Renner, Nicolas Sangouard, Pavel Sekatski, and Marco Tomamichel for helpful discussions.

Financial support for this work has been provided by the Natural Sciences and Engineering Research Council of Canada (NSERC) Alliance, and Huawei Technologies Canada Co., Ltd.

Computations were performed using the MATLAB package YALMIP [Löf04] with the solver MOSEK [MOS19], as well as Mathematica.

Appendix A Minor variations

A.1 Event ordering in leakage channel

One slightly restrictive property of the structure we have imposed on the leakage process is that (focusing on Alice; the situation for Bob is analogous) the leakage register $L^{A\to E}_{j}$ for Eve is produced before Alice’s device receives $L^{B\to A}_{j}$ from Bob and measures it to produce an output. This is mainly to ensure we have a well-defined joint state on $L^{A\to B}_{j}L^{A\to E}_{j}L^{B\to A}_{j}L^{B\to E}_{j}$ after applying $\mathcal{L}_{j}$ , but has the drawback that $L^{A\to E}_{j}$ may not be fully able to encode information about Alice’s output in that round. Still, this does not seem to be a very significant restriction, since Alice’s device can for instance produce some “preliminary” output $\hat{A}_{j}$ using only $Q^{A}_{j}X_{j}$ , then use $L^{A\to E}_{j}$ to encode at least some information about this preliminary output (although this preliminary output may differ from the final output $A_{j}$ produced after $L^{B\to A}_{j}$ is received). Alternatively, if the device memories can retain the outputs for at least one round, the next round’s registers $L^{A\to E}_{j+1}$ could be used to leak information about the $j^{\text{th}}$ round outputs, i.e. the devices could just “defer” the leakage by one round. Hence when the number of rounds is large, this issue seems unlikely to be significant. A perhaps more significant restriction in the model is that only one leakage register is sent in each direction (per round), rather than allowing for arbitrarily many iterations of leakage between the devices in both directions.

Still, we can in fact somewhat accommodate the above possibilities while retaining the results we derived. To do so, we could instead allow the leakage channels $\mathcal{L}_{j}$ to have the following structure. After the state preparation process, Alice’s device performs some “preliminary” operations on $Q^{A}_{j}X_{j}$ to produce a state on some registers $Q^{A}_{j}L^{A}_{j}X_{j}$ without disturbing $X_{j}$ . Analogously, Bob’s device acts on $Q^{B}_{j}Y_{j}$ and produces a state on registers $Q^{B}_{j}L^{B}_{j}Y_{j}$ . Then some channel is applied on the registers $L^{A}_{j}L^{B}_{j}$ to produce registers $L^{A\to B}_{j}L^{A\to E}_{j}L^{B\to A}_{j}L^{B\to E}_{j}$ . Note that this last channel does not need to act locally in Alice and Bob’s devices; it can act arbitrarily across the registers $L^{A}_{j}L^{B}_{j}$ .

This model is more general than the main one we focused on in this work, since it could in theory even model multiple rounds of interaction between the devices in the last step. However, all the bounds we compute in this work apply to this model as well (under a bounded-weight or classical-probabilistic leakage constraint on the overall channel $\mathcal{L}_{j}$ ). This is because for instance in the Sec. 3 analysis, considering the state obtained by tracing out $L^{A\to B}_{j}L^{A\to E}_{j}L^{B\to A}_{j}L^{B\to E}_{j}$ from the output of $\mathcal{L}_{j}$ is equivalent to considering the state obtained by tracing out $L^{A}_{j}L^{B}_{j}$ from the output of the “preliminary” operations in this model. Since these operations act locally on Alice and Bob’s systems, when $L^{A}_{j}L^{B}_{j}$ are removed from their outputs we can absorb the effects of these operations into the state before the measurement, in which case the remainder of the analysis holds by the same arguments. As for the Sec. 4 analysis, it only used the fact that the $L^{A\to E}_{j}L^{B\to E}_{j}$ registers are subject to the $\delta_{\mathrm{leak}}$ constraint, which is the same in this model.

One potential drawback here is that interpreting the $\delta_{\mathrm{leak}}$ constraint in this model seems less straightforward, since the registers $L^{A\to B}_{j}L^{A\to E}_{j}L^{B\to A}_{j}L^{B\to E}_{j}$ produced by such a model seem less directly related to the physical registers being sent between the devices during the actual leakage process — they are more of a “summary” of the final result. On the other hand, if the physical setup justifies imposing the leakage constraints we have used on the “preliminary registers” $L^{A}_{j}L^{B}_{j}$ themselves, then it seems our analysis should also basically generalize to this model, by a suitable data-processing argument — we leave the details for future work, if there is a setup which seems reasonably described by this model.

A.2 Relations between probability bounds

We first make a simple observation: if we have $k$ registers $Q_{1}\dots Q_{k}$ , then the outcome probabilities produced by performing the projective measurement $\left(\left|\phi\right>\!\!\left<\phi\right|^{\otimes k},\mathbb{I}-\left|\phi\right>\!\!\left<\phi\right|^{\otimes k}\right)$ on these $k$ registers are the same as what we would obtain if we had measured each register individually with the projectors $\left(\left|\phi\right>\!\!\left<\phi\right|,\mathbb{I}-\left|\phi\right>\!\!\left<\phi\right|\right)$ and then coarse-grained the outcomes (in the sense that if all $k$ of the individual measurements in the latter scenario returned $\left|\phi\right>\!\!\left<\phi\right|$ then we identify it with the outcome $\left|\phi\right>\!\!\left<\phi\right|^{\otimes k}$ in the former scenario, and otherwise we identify it with the outcome $\mathbb{I}-\left|\phi\right>\!\!\left<\phi\right|^{\otimes k}$ ). Note that we are only claiming that the outcome probabilities from these two processes are the same; the post-measurement states will in general be different, but we will not require them in our discussion here (our arguments only involve the outcome probabilities).

With this, to see the effect of the bounded-weight leakage constraint on e.g. just the registers $L^{A\to B}_{j}L^{B\to A}_{j}$ , we see that the above observation implies that the probability of getting the outcome $\left|\phi\right>\!\!\left<\phi\right|^{\otimes 2}$ from the projective measurement $\left(\left|\phi\right>\!\!\left<\phi\right|^{\otimes 2},\mathbb{I}-\left|\phi\right>\!\!\left<\phi\right|^{\otimes 2}\right)$ on these registers is the same as the probability of both of them giving outcome $\left|\phi\right>\!\!\left<\phi\right|$ when measured individually with $\left(\left|\phi\right>\!\!\left<\phi\right|,\mathbb{I}-\left|\phi\right>\!\!\left<\phi\right|\right)$ . Now note that this probability must be at least the probability of all $4$ outcomes being $\left|\phi\right>\!\!\left<\phi\right|$ when individually measuring all $4$ registers $L^{A\to B}_{j}L^{A\to E}_{j}L^{B\to A}_{j}L^{B\to E}_{j}$ with $\left(\left|\phi\right>\!\!\left<\phi\right|,\mathbb{I}-\left|\phi\right>\!\!\left<\phi\right|\right)$ ; however, by again invoking the above observation, this is just the probability of getting the outcome $\left|\phi\right>\!\!\left<\phi\right|^{\otimes 4}$ from the measurement $\left(\left|\phi\right>\!\!\left<\phi\right|^{\otimes 4},\mathbb{I}-\left|\phi\right>\!\!\left<\phi\right|^{\otimes 4}\right)$ on the registers $L^{A\to B}_{j}L^{A\to E}_{j}L^{B\to A}_{j}L^{B\to E}_{j}$ . As an alternative, one could just prove the desired result by direct calculation:

[TABLE]

as claimed.

Furthermore, that observation allows us to easily analyze a minor variant of the bounded-weight leakage model, where we instead say that for each of the registers $L^{A\to B}_{j},L^{A\to E}_{j},L^{B\to A}_{j},L^{B\to E}_{j}$ individually, the probability of getting outcome $\left|\phi\right>\!\!\left<\phi\right|$ when measured with $\left(\left|\phi\right>\!\!\left<\phi\right|,\mathbb{I}-\left|\phi\right>\!\!\left<\phi\right|\right)$ is at least $1-\delta_{\mathrm{leak}}^{\prime}$ for some $\delta_{\mathrm{leak}}^{\prime}>0$ . We shall now show the version in the main text can be straightforwardly converted into this variant and vice versa, up to a small change of the leakage parameter (for one direction of the conversion). Specifically, first note that the argument in the previous paragraph already shows that the bounded-weight model in the main text (with leakage parameter $\delta_{\mathrm{leak}}$ ) automatically implies this variant with $\delta_{\mathrm{leak}}^{\prime}=\delta_{\mathrm{leak}}$ . As for the reverse conversion, let us rephrase this variant as the statement that for each individual measurement the probability of getting outcome $\overline{\left|\phi\right>\!\!\left<\phi\right|}$ is at most $\delta_{\mathrm{leak}}^{\prime}$ , where for brevity we introduce the notation $\overline{\left|\phi\right>\!\!\left<\phi\right|}\coloneqq\mathbb{I}-\left|\phi\right>\!\!\left<\phi\right|$ and analogously $\overline{\left|\phi\right>\!\!\left<\phi\right|^{\otimes k}}\coloneqq\mathbb{I}-\left|\phi\right>\!\!\left<\phi\right|^{\otimes k}$ for any $k\in\{1,2,3,4\}$ (i.e. this is just a compact notation for “complementary” outcomes). 
Invoking the observation in the first paragraph, the probability of getting the outcome $\overline{\left|\phi\right>\!\!\left<\phi\right|^{\otimes 4}}$ from the measurement $\left(\left|\phi\right>\!\!\left<\phi\right|^{\otimes 4},\overline{\left|\phi\right>\!\!\left<\phi\right|^{\otimes 4}}\right)$ on all $4$ registers is the same as the probability for individually measuring each register with $\left(\left|\phi\right>\!\!\left<\phi\right|,\overline{\left|\phi\right>\!\!\left<\phi\right|}\right)$ and getting at least one $\overline{\left|\phi\right>\!\!\left<\phi\right|}$ outcome. Applying the union bound, that probability is at most $4\delta_{\mathrm{leak}}^{\prime}$ (note that no independence assumptions on the state across the registers are needed to apply the union bound). Hence we can conclude that this variant model implies the bounded-weight leakage model in the main text with $\delta_{\mathrm{leak}}=4\delta_{\mathrm{leak}}^{\prime}$ .
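The two conversions above only involve outcome probabilities, so they can be illustrated with a classical joint distribution over the four individual measurement outcomes (such a joint distribution exists because the projectors act on distinct registers and hence commute). In the following sketch, each outcome is a tuple of flags, with $0$ denoting the $\left|\phi\right>\!\!\left<\phi\right|$ outcome and $1$ the complementary outcome; the example distribution is a hypothetical, deliberately correlated one.

```python
def leak_probabilities(joint):
    """joint maps tuples of 0/1 flags (one flag per leakage register) to probabilities.

    Returns (marginals, any_leak): the per-register probability of the
    complementary outcome, and the probability that at least one register
    gives the complementary outcome.
    """
    total = sum(joint.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("joint distribution must be normalized")
    n_registers = len(next(iter(joint)))
    marginals = [
        sum(p for flags, p in joint.items() if flags[i] == 1)
        for i in range(n_registers)
    ]
    any_leak = sum(p for flags, p in joint.items() if any(flags))
    return marginals, any_leak


# Hypothetical correlated distribution over the registers (A->B, A->E, B->A, B->E).
example = {
    (0, 0, 0, 0): 0.97,
    (1, 1, 1, 1): 0.01,
    (1, 0, 0, 0): 0.01,
    (0, 0, 1, 0): 0.01,
}
marginals, any_leak = leak_probabilities(example)
delta_prime = max(marginals)  # smallest valid per-register parameter
print(marginals, any_leak, 4 * delta_prime)
```

Here `any_leak` plays the role of the smallest valid $\delta_{\mathrm{leak}}$ for the main-text model and `delta_prime` that of $\delta_{\mathrm{leak}}^{\prime}$ for the variant, so the two conversions read `delta_prime <= any_leak` and `any_leak <= 4 * delta_prime`, with no independence assumption needed.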

While the above discussion gives a simple conversion between this variant and the bounded-weight model in the main text, it is technically slightly suboptimal to just convert the former to the latter and then apply the analysis in Sec. 3–4; instead, slightly sharper results for this model could be obtained by modifying the analysis in those sections appropriately, using similar arguments as those we have described above. For instance, noting that the analysis in Sec. 3 technically only requires analyzing the $2$ leakage registers $L^{A\to B}_{j}L^{B\to A}_{j}$ , one can show that in this variant model, we can substitute $\delta_{\mathrm{leak}}$ in the optimization (22) with $2\delta_{\mathrm{leak}}^{\prime}$ rather than $4\delta_{\mathrm{leak}}^{\prime}$ . [Footnote 23: However, if we are instead considering the specialized IID analysis described in Remark 2, we would still need to substitute $\delta_{\mathrm{leak}}$ with $4\delta_{\mathrm{leak}}^{\prime}$ since in that case the $L^{A\to E}_{j}L^{B\to E}_{j}$ registers are also involved in the analysis. In that analysis, though, there is no need to separately subtract off the smooth max-entropy of $\mathbf{L}^{A\to E}\mathbf{L}^{B\to E}$ , so no further $\delta_{\mathrm{leak}}^{\prime}$ -dependent corrections are involved in that case.]

Similarly, for Sec. 4, in the optimization (45) we can substitute $\delta_{\mathrm{leak}}$ with $\delta_{\mathrm{leak}}^{\prime}$ rather than $4\delta_{\mathrm{leak}}^{\prime}$ , since we are only considering a single register $L^{A\to E}_{j}$ or $L^{B\to E}_{j}$ .

Appendix B Technical details regarding entropy accumulation

In this appendix, we give some brief specialized comments regarding the application of the EAT in our context, assuming some background familiarity with the use of the EAT in security proofs.

Firstly, in Sec. 2.2, we have described Alice and Bob as announcing their inputs immediately after each round, and Eve is allowed to update her side-information in each round using those values. We remark that strictly speaking, to validly accommodate such a process, one would have to use a more recent version of the EAT [MFS+22] rather than some earlier versions [DFR20, DF19] that had more restrictive conditions on the “update structure” of the side-information; however, all versions yield a smooth min-entropy bound of basically the same form, so it does not affect our claims in this work. (Alternatively, one could follow the approach used in the DIQKD security proofs [AFRV19, TSB+22] based on the EAT versions in [DFR20, DF19]. Specifically, in those works, during the physical protocol itself Alice and Bob do not announce their inputs until all the measurements have been performed. In that case the device measurements commute with the process of Eve preparing the states to send to the devices, allowing the security proof to instead be focused on a virtual process where Eve prepares the entire $\mathsf{E}$ register before the protocol begins, without updating it based on the input values; see e.g. [TSB+22] for further explanation. Another point worth highlighting is that there have been recent proposals for DI protocols in which the inputs are not revealed to the adversary [BRC21], though as noted in that work, for some such protocols it is not currently clear how to perform a full finite-size analysis due to some technical limitations of the EAT.)

Next, regarding the difficulty mentioned at the start of Sec. 4 in applying the EAT when the leakage registers are present: the main issue is that in each round, the leakage registers could potentially depend on the secret data generated in preceding rounds. If Eve updates her side-information in each round based on these leakage registers, this means that for the original EAT versions [DFR20, DF19], the technique mentioned above of commuting the measurement and preparation processes no longer works. As for the generalized version [MFS+22], it does not seem straightforward to cleanly “defer” the update processes involving the leakage registers to the end in such a way that the no-signalling condition in that version is fulfilled.

We also remark that technically, the versions of the EAT proven in [DFR20, DF19, MFS+22] may rely on the $Q^{A}_{j}Q^{B}_{j}$ systems being finite-dimensional (though the final bounds are independent of these dimensions, so we can still allow them to have unboundedly large finite dimension). However, a recent work [FGR22] has extended a version of the EAT to states on general von Neumann algebras, hence it may be possible to allow the systems to actually have infinite dimension. We also note that the techniques in [BFF21] for computing entropy bounds were inherently derived for infinite-dimensional systems.

Finally, a technical point regarding the derivation of the last line in (41). To obtain that bound from Corollary 3.5 of [DFR20], we technically needed to use the fact that from our model of the devices, one can define some registers $R_{j}$ , an initial state $\rho^{0}_{R_{0}}$ , and a sequence of channels $\mathcal{E}_{j}:R_{j-1}\to R_{j}L_{j}$ , such that the state we consider on $\mathbf{L}$ is of the form $(\mathcal{E}_{n}\circ\dots\circ\mathcal{E}_{1})[\rho^{0}_{R_{0}}]$ (leaving some identity channels implicit). (Basically, the idea would be simply to encode the processes described in Sec. 2.2 into the channels $\mathcal{E}_{j}$ , using the registers $R_{j}$ to store all registers other than $L_{j}$ after each round.) With this we can (inductively) apply Corollary 3.5 of [DFR20], identifying the $L_{j}$ and $R_{j}$ registers in our situation with the $A_{j}$ and $R$ registers in that theorem statement (and setting the $B_{j}$ registers in that statement to be trivial registers). Strictly speaking, this would technically give us a bound where the sum in the right-hand-side of (41) instead has terms of the form $\sup_{\omega}H_{\alpha}(L_{j}|L_{1}^{j-1})_{\mathcal{E}_{j}[\omega]}$ where $\omega$ is a state on $R_{j}L_{1}^{j-1}$ (and the definition of conditional Rényi entropy follows that used in [DFR20]). However, by noting that $H_{\alpha}(L_{j}|L_{1}^{j-1})\leq H_{\alpha}(L_{j})$ for $\alpha\geq 1/2$ [Tom16], and that the channels $\mathcal{E}_{j}$ can be defined such that they end with producing the $L_{j}$ systems via the $\mathcal{L}_{j}$ channels, we can upper bound these terms with the expression in (41).

Appendix C Modified Gentle Measurement Lemma

Lemma 2.

Let $\rho_{AB}$ be a state such that if a measurement with projectors $(\left|0\right>\!\!\left<0\right|,\mathbb{I}-\left|0\right>\!\!\left<0\right|)$ (for some pure state $\left|0\right>$ ) is performed on register $A$ , the probability of getting the $\left|0\right>\!\!\left<0\right|$ outcome is at least $1-\delta$ . Then we have $F(\rho_{AB},\left|0\right>\!\!\left<0\right|_{A}\otimes\rho_{B})\geq 1-\delta$ .

Proof.

Let $\left|\rho\right>_{ABR}$ be a purification of $\rho_{AB}$ . Extend the state $\left|0\right>_{A}$ to an orthonormal basis $\{\left|j\right>_{A}\}$ for $A$ , in which case we can write $\left|\rho\right>_{ABR}=\sum_{j}\left|j\right>_{A}\left|\omega^{j}\right>_{BR}$ for some subnormalized states $\left|\omega^{j}\right>_{BR}\coloneqq\left(\left<j\right|_{A}\otimes\mathbb{I}_{BR}\right)\left|\rho\right>_{ABR}$ . More specifically, these states have (squared) norm $\langle\omega^{j}|\omega^{j}\rangle=\operatorname{Tr}\!\left[\left|j\right>\!\!\left<j\right|_{A}{\rho}_{A}\right]$ , which implies that in particular we have $\langle\omega^{0}|\omega^{0}\rangle\geq 1-\delta$ by the condition on the state $\rho_{AB}$ . Also observe that by tracing out $A$ from that expression for $\left|\rho\right>_{ABR}$ , we can write ${\rho}_{BR}=\sum_{j}\left|\omega^{j}\right>\!\!\left<\omega^{j}\right|_{BR}$ (this is not a spectral decomposition of ${\rho}_{BR}$ because the terms may be non-orthogonal, but this does not affect our argument). With this, by monotonicity of fidelity we have

[TABLE]

as claimed. ∎

This lemma differs slightly from the standard Gentle Measurement Lemma [Win99, Wat18] in that it does not show $\rho_{AB}$ is close to a post-measurement state (after the measurement with projectors $(\left|0\right>\!\!\left<0\right|_{A}\otimes\mathbb{I}_{B},\mathbb{I}_{AB}-\left|0\right>\!\!\left<0\right|_{A}\otimes\mathbb{I}_{B})$ is performed and the first outcome is obtained), but rather a state where the reduced state on $B$ is the same as that before the measurement. [Footnote 24: In our analysis we need the latter property rather than the former, because the function that sends a state to the normalized post-measurement state (conditioned on a particular outcome) is not a valid CPTP map, which causes some problems in our argument. One potential modification of our approach would be to instead consider the subnormalized post-measurement state conditioned on the $\left|\phi\right>\!\!\left<\phi\right|$ outcome, in which case one could write the function as a completely positive trace-nonincreasing (rather than trace-preserving) map, which does have some usable properties. However, we leave a detailed analysis of this idea for future work.] Furthermore, the fidelity bound here is slightly worse: in the standard Gentle Measurement Lemma (e.g. the version in [Wat18]), the lower bound is $\sqrt{1-\delta}$ instead. However, note that at small $\delta$ we have $\sqrt{1-\delta}\approx 1-\delta/2$ , in which case the bounds are not too different (we have basically only lost about a factor of two on the $\delta$ parameter). There is also a technical restriction that in our version, we have only considered the case where the measurement outcome of interest corresponds to a rank- $1$ projector; it seems not entirely straightforward how to precisely formulate a generalization beyond this case, hence we leave it for future work.
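As a numerical sanity check of Lemma 2 in a restricted setting, the sketch below considers only pure states $\left|\psi\right>_{AB}$ on two qubits, for which $F(\rho_{AB},\left|0\right>\!\!\left<0\right|_{A}\otimes\rho_{B})=\sqrt{\left<\psi\right|\left(\left|0\right>\!\!\left<0\right|_{A}\otimes\rho_{B}\right)\left|\psi\right>}$ (taking $F$ to be the root fidelity, as in the Uhlmann characterization used in Appendix D.1). The restriction to pure two-qubit states and all function names are our own assumptions for illustration; this is not a general fidelity routine.

```python
import math
import random


def ground_probability(psi):
    """psi[a][b] are amplitudes of a pure state on A, B; returns Pr[outcome |0><0| on A]."""
    return sum(abs(amp) ** 2 for amp in psi[0])


def fidelity_to_reset_state(psi):
    """Root fidelity between |psi><psi| and |0><0|_A (x) rho_B, for pure psi."""
    dim_b = len(psi[0])
    # rho_B = Tr_A |psi><psi|
    rho_b = [
        [sum(row[b] * row[c].conjugate() for row in psi) for c in range(dim_b)]
        for b in range(dim_b)
    ]
    omega0 = psi[0]  # unnormalized vector <0|_A |psi>
    overlap = sum(
        omega0[b].conjugate() * rho_b[b][c] * omega0[c]
        for b in range(dim_b)
        for c in range(dim_b)
    )
    return math.sqrt(max(overlap.real, 0.0))


def random_pure_state(rng, dim_a=2, dim_b=2):
    """Sample a normalized random pure state with complex Gaussian amplitudes."""
    raw = [
        [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim_b)]
        for _ in range(dim_a)
    ]
    norm = math.sqrt(sum(abs(x) ** 2 for row in raw for x in row))
    return [[x / norm for x in row] for row in raw]


delta = 0.1
# Entangled example: sqrt(1-delta)|00> + sqrt(delta)|11>.
entangled = [[math.sqrt(1 - delta), 0.0], [0.0, math.sqrt(delta)]]
# Product example: (sqrt(1-delta)|0> + sqrt(delta)|1>)_A (x) |0>_B.
product_state = [[math.sqrt(1 - delta), 0.0], [math.sqrt(delta), 0.0]]
print(fidelity_to_reset_state(entangled), fidelity_to_reset_state(product_state))
```

In this restricted setting the entangled example attains $F=1-\delta$, suggesting the bound of Lemma 2 cannot be improved in general, whereas the product example gives $F=\sqrt{1-\delta}$, matching the value in the standard Gentle Measurement Lemma.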

Appendix D Details for relaxed optimizations

Recall that the leakage channel has the internal structure $\mathcal{L}=\mathcal{L}^{A}\otimes\mathcal{L}^{B}$ where $\mathcal{L}^{A}:Q^{A}X\to Q^{A}L^{A\to B}L^{A\to E}X$ and analogously for $\mathcal{L}^{B}$ . Let us define another channel $\widetilde{\mathcal{L}}^{A}:Q^{A}X\to Q^{A}X$ where

[TABLE]

i.e. it simply discards the leakage registers from the output of $\mathcal{L}^{A}$ and re-initializes $L^{A\to B}$ in the state $\left|\phi\right>\!\!\left<\phi\right|$ ; analogously define another channel $\widetilde{\mathcal{L}}^{B}$ from $\mathcal{L}^{B}$ . Putting this together with (9), observe that we can write the states $\rho^{xy}_{Q^{A}Q^{B}XYR}\otimes\left|\phi\right>\!\!\left<\phi\right|^{\otimes 2}$ from (19) in the form

[TABLE]

Now if we were to apply the measurement channel $\mathcal{M}^{A}\otimes\mathcal{M}^{B}$ on these states, the resulting states would have the form

[TABLE]

where we have defined a new channel $\widetilde{\mathcal{M}}^{A}\coloneqq\mathcal{M}^{A}\circ\widetilde{\mathcal{L}}^{A}:Q^{A}X\to AX$ , and analogously defined $\widetilde{\mathcal{M}}^{B}\coloneqq\mathcal{M}^{B}\circ\widetilde{\mathcal{L}}^{B}:Q^{B}Y\to BY$ . Crucially, note that these new channels are local measurement channels acting in Alice and Bob’s devices respectively (and they inherit the property that they do not disturb the inputs $XY$ ). In analogy to (12), let us now also define

[TABLE]

Note that by monotonicity of fidelity, the bound (19) ensures that these new $\sigma$ states are “close” to the original $\rho$ states defined in (12); more precisely, we have

[TABLE]

In other words, we have shown that each $\rho^{xy}$ state in the optimization (18) has the property that there is a “nearby” state $\sigma^{xy}$ (i.e. within fidelity $1-\delta_{\mathrm{leak}}$ ) of the form given by (59), and analogously for the $\rho^{\mathrm{gen}}$ state in the objective function. Therefore, we can relax the optimization (18) to the following problem:

[TABLE]

where the states $\rho^{xy}$ are now treated as optimization variables themselves (and with $\rho^{\mathrm{gen}}=\sum_{xy}p^{\mathrm{gen}}_{xy}\rho^{xy}$ ), while the states $\sigma^{\mathrm{gen}},\sigma^{xy}$ are to be understood as functions of the state $\omega_{Q^{A}Q^{B}R}$ and the measurement channels $\widetilde{\mathcal{M}}^{A},\widetilde{\mathcal{M}}^{B}$ via (59)–(60). (In this optimization, $\widetilde{\mathcal{M}}^{A},\widetilde{\mathcal{M}}^{B}$ are allowed to be arbitrary measurement channels acting on the appropriate registers without disturbing $X,Y$ . This technically means we have dropped the structure $\widetilde{\mathcal{M}}^{A}\coloneqq\mathcal{M}^{A}\circ\widetilde{\mathcal{L}}^{A}$ , $\widetilde{\mathcal{M}}^{B}\coloneqq\mathcal{M}^{B}\circ\widetilde{\mathcal{L}}^{B}$ in their construction; however, since the original measurement channels $\mathcal{M}^{A},\mathcal{M}^{B}$ were anyway arbitrary, this particular relaxation does not make a difference. The potential loss of tightness in changing from (18) to the above optimization arises only from relaxing the exact expressions for $\rho^{xy}$ to a “looser” characterization in terms of nearby states.)

Finally, to further relax the optimization to our final form (22), we simply note that the fidelity constraints imply that the objective function is lower-bounded by $H(S|XYR)_{\sigma^{\mathrm{gen}}}-f_{\mathrm{cont}}(\delta_{\mathrm{leak}})$ , after which we can relax the fidelity constraints to $F(\rho^{xy}_{AB},\sigma^{xy}_{AB})\geq 1-\delta_{\mathrm{leak}}$ by tracing out $XYR$ , and substitute in the equality constraints on $\rho^{xy}_{AB}$ . This yields the optimization (22). In the next section, we describe how to implement the constraints in that optimization in a manner suitable for an SDP.

D.1 Imposing fidelity constraints

We follow the approach in e.g. [Tom16]: for any $f_{\star}\in[0,1]$ and any normalized states $\tau_{Q},\sigma_{Q}$ on a register $Q$, if we choose some purification $\left|\tau\right>_{QQ^{\prime}}$ of $\tau_{Q}$ onto an isomorphic register $Q^{\prime}$, then by Uhlmann’s theorem we know that $F(\tau_{Q},\sigma_{Q})\geq f_{\star}$ if and only if there exists some extension $\sigma_{QQ^{\prime}}$ of $\sigma_{Q}$ such that $\sqrt{\left<\tau\right|\sigma\left|\tau\right>_{QQ^{\prime}}}\geq f_{\star}$. To express this in a more “SDP-compatible” form [footnote 25] (and to avoid ambiguity from overloading the $\sigma$ symbol), we can rephrase this equivalently as the statement that there exists an operator $\widetilde{\sigma}_{QQ^{\prime}}\geq 0$ such that $\left<\tau\right|\widetilde{\sigma}\left|\tau\right>_{QQ^{\prime}}\geq f_{\star}^{2}$ and $\operatorname{Tr}_{Q^{\prime}}\!\left[\widetilde{\sigma}_{QQ^{\prime}}\right]=\sigma_{Q}$ (note that the latter constraint automatically imposes normalization on $\widetilde{\sigma}_{QQ^{\prime}}$).

Footnote 25: To be more precise, this formulation is mainly only SDP-compatible in contexts where the state $\tau$ is a “fixed constant” rather than an optimization variable, and we can write a specific purification $\left|\tau\right>$ of it; if the state $\tau$ and/or its purification $\left|\tau\right>$ were intended to be an optimization variable itself, then this approach would not be SDP-compatible since the quantity $\left<\tau\right|\widetilde{\sigma}\left|\tau\right>_{QQ^{\prime}}=\operatorname{Tr}\!\left[\left|\tau\right>\!\!\left<\tau\right|_{QQ^{\prime}}\widetilde{\sigma}_{QQ^{\prime}}\right]$ is not jointly affine with respect to $\widetilde{\sigma},\tau$. However, the alternative approach we present next (namely, the formulation in (71)) would be usable in that scenario.

With this in mind, we can rewrite the optimization (22) as follows: for each $xy$ , let us define the state $\tau^{xy}_{AB}\coloneqq\sum_{ab}\mu_{ab|xy}\left|ab\right>\!\!\left<ab\right|_{AB}$ , so the constraints simply become $F\!\left(\tau^{xy}_{AB},\sigma^{xy}_{AB}\right)\geq 1-\delta_{\mathrm{leak}}$ . Picking any particular purifications of the $\tau^{xy}$ states, for instance $\left|\tau^{xy}\right>_{ABA^{\prime}B^{\prime}}\coloneqq\sum_{ab}\sqrt{\mu_{ab|xy}}\left|abab\right>_{ABA^{\prime}B^{\prime}}$ , from the above argument we see that the optimization can be equivalently written as

[TABLE]

where $\widetilde{\sigma}^{xy}$ are states on $ABA^{\prime}B^{\prime}$ , and the states $\sigma^{\mathrm{gen}},\sigma^{xy}$ are again understood as functions of $\omega_{Q^{A}Q^{B}R},\widetilde{\mathcal{M}}^{A},\widetilde{\mathcal{M}}^{B}$ via (59)–(60). With this formulation, the constraints can indeed be imposed in an SDP.
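The Uhlmann-based constraint can be sanity-checked on a small classical example. The sketch below is only an illustration, not the SDP used in this work: it assumes both $\tau^{xy}_{AB}$ and $\sigma^{xy}_{AB}$ are diagonal in the $\left|ab\right>$ basis, uses made-up distributions `mu` and `nu`, and picks the particular extension $\widetilde{\sigma}=\left|s\right>\!\!\left<s\right|$ with $\left|s\right>=\sum_{ab}\sqrt{\nu_{ab}}\left|abab\right>$. All function names are hypothetical.

```python
import math

def purification(probs):
    """Amplitudes of sum_i sqrt(p_i)|i>|i> as a flat vector of length d*d."""
    d = len(probs)
    vec = [0.0] * (d * d)
    for i, p in enumerate(probs):
        vec[i * d + i] = math.sqrt(p)
    return vec

def outer(vec):
    """Rank-one operator |v><v| for a real vector v."""
    return [[a * b for b in vec] for a in vec]

def partial_trace_second(op, d):
    """Trace out the second d-dimensional factor of an operator on C^d (x) C^d."""
    return [[sum(op[i * d + k][j * d + k] for k in range(d)) for j in range(d)]
            for i in range(d)]

def expectation(vec, op):
    """<v| op |v> for a real vector v."""
    n = len(vec)
    return sum(vec[r] * op[r][c] * vec[c] for r in range(n) for c in range(n))

# Hypothetical data: each index i stands for one outcome pair (a, b).
mu = [0.4, 0.1, 0.1, 0.4]        # target distribution mu_{ab|xy} (assumed)
nu = [0.35, 0.15, 0.15, 0.35]    # diagonal of sigma^{xy}_{AB} (assumed)
delta_leak = 0.01                # assumed leakage parameter

d = len(mu)
tau_vec = purification(mu)                 # |tau^{xy}> on ABA'B'
sigma_tilde = outer(purification(nu))      # candidate extension of sigma^{xy}
marginal = partial_trace_second(sigma_tilde, d)
overlap = expectation(tau_vec, sigma_tilde)
fidelity = sum(math.sqrt(m * n) for m, n in zip(mu, nu))  # classical fidelity

# The two SDP-style constraints: the marginal is sigma, and the overlap is at least f^2.
constraint_ok = overlap >= (1 - delta_leak) ** 2
```

For commuting states the chosen extension attains the fidelity exactly, so `overlap` equals `fidelity ** 2`; for general quantum $\sigma$ the optimal extension has to be found by the SDP solver.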

As an alternative approach, one could use a result derived in [Kil12, Wat12, Wat18] — for two normalized quantum states $\tau,\sigma$ of dimension $d$ , their fidelity $F(\tau,\sigma)=\left\lVert\sqrt{\tau}\sqrt{\sigma}\right\rVert_{1}$ can be expressed as the following SDP (in which the maximum is indeed attained):

[TABLE]

(By symmetry properties of the above optimization, $\Xi$ can be restricted to be hermitian if desired, given that $\tau,\sigma$ are hermitian.) In particular, this implies that

[TABLE]

which also leads to an SDP formulation of the fidelity constraint. However, we found that this approach seemed less numerically stable in some cases, and hence we mostly used the formulation (67). (In more general circumstances though, the formulation in (71) has the advantage that the states $\tau,\sigma$ can both be treated as optimization variables, unlike the preceding approach based on Uhlmann’s theorem.)
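For orientation, the fidelity SDP from [Wat12] is commonly stated in the form below. This is a transcription from the general literature rather than from this paper's own displays, so the notation and equation numbering are assumptions and may differ from the original; $\Xi$ ranges over $d\times d$ complex matrices.

```latex
F(\tau,\sigma) \;=\; \max_{\Xi}\;
  \tfrac{1}{2}\operatorname{Tr}[\Xi] + \tfrac{1}{2}\operatorname{Tr}[\Xi^{\dagger}]
\quad\text{s.t.}\quad
\begin{pmatrix} \tau & \Xi \\ \Xi^{\dagger} & \sigma \end{pmatrix} \geq 0 ,
\qquad\text{so that}\qquad
F(\tau,\sigma)\geq f_{\star}
\;\iff\;
\exists\,\Xi:\;
\begin{pmatrix} \tau & \Xi \\ \Xi^{\dagger} & \sigma \end{pmatrix} \geq 0
\ \text{ and }\
\tfrac{1}{2}\operatorname{Tr}\!\left[\Xi+\Xi^{\dagger}\right]\geq f_{\star} .
```

As a quick consistency check, taking $\tau=\sigma$ and $\Xi=\tau$ makes the block matrix positive semidefinite and gives objective value $\operatorname{Tr}[\tau]=1$, as expected for identical normalized states. Both constraints are affine in $(\tau,\sigma,\Xi)$ jointly, which is why either state may be an optimization variable in this form.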

Appendix E Dimension bound from memory bound

We first formalize the bounded-memory constraint within our leakage model. For each round $j$ , let $C_{j}$ denote a classical memory register that was stored within the memory registers $M^{A}_{j-1}M^{B}_{j-1}$ of the preceding round and passed forward to the registers $Q^{A}_{j}Q^{B}_{j}$ during the “update” channel; let $\widetilde{Q}^{A}_{j}\widetilde{Q}^{B}_{j}$ denote all registers in $Q^{A}_{j}Q^{B}_{j}$ other than $C_{j}$ ( $\widetilde{Q}^{A}_{j}\widetilde{Q}^{B}_{j}$ can still contain another copy of $C_{j}$ ). With this, we are requiring that $\mathcal{L}_{j}$ has the form $\overline{\mathcal{L}}^{A\leftrightarrow B}_{j}\otimes\overline{\mathcal{L}}^{A\to E}_{j}\otimes\overline{\mathcal{L}}^{B\to E}_{j}$ for some channels $\overline{\mathcal{L}}^{A\leftrightarrow B}_{j}:\widetilde{Q}^{A}_{j}\widetilde{Q}^{B}_{j}X_{j}Y_{j}\to\widetilde{Q}^{A}_{j}\widetilde{Q}^{B}_{j}L^{A\to B}_{j}L^{B\to A}_{j}X_{j}Y_{j}$ and $\overline{\mathcal{L}}^{A\to E}_{j}:C_{j}\to L^{A\to E}_{j}$ and $\overline{\mathcal{L}}^{B\to E}_{j}:C_{j}\to L^{B\to E}_{j}$ (in a minor abuse of notation we allow both the last two channels to have input register $C_{j}$ ; this does not cause any issues because $C_{j}$ is a classical register and can be copied without disturbance to input into both channels).

In the remainder of this analysis, we again sometimes omit the $j$ subscripts for brevity; also, analogous to our $L$ notation in Sec. 4, we let $\overline{\mathcal{L}}$ denote either $\overline{\mathcal{L}}^{A\to E}_{j}$ or $\overline{\mathcal{L}}^{B\to E}_{j}$ since the analysis is the same for both. For each classical basis state $\left|c\right>\!\!\left<c\right|_{C}$ of the classical register $C$ , let us write $\rho^{(c)}_{L}\coloneqq\overline{\mathcal{L}}\left[\left|c\right>\!\!\left<c\right|_{C}\right]$ . We first discuss the bounded-weight leakage constraint, which implies that every $\rho^{(c)}$ satisfies $F(\rho^{(c)},\left|\phi\right>\!\!\left<\phi\right|)=\sqrt{\left<\phi\right|\rho^{(c)}\left|\phi\right>}\geq\sqrt{1-\delta_{\mathrm{leak}}}$ . Hence letting $\widetilde{L}$ be a purifying register for $L$ , for each $c$ Uhlmann’s theorem gives a purification $\left|\rho^{(c)}\right>_{L\widetilde{L}}$ of $\rho^{(c)}_{L}$ such that $F\left(\left|\rho^{(c)}\right>\!\!\left<\rho^{(c)}\right|_{L\widetilde{L}},\left|\phi\right>\!\!\left<\phi\right|_{L}\otimes\left|\phi\right>\!\!\left<\phi\right|_{\widetilde{L}}\right)\geq\sqrt{1-\delta_{\mathrm{leak}}}$ (note that here we use the same purification $\left|\phi\right>_{L}\left|\phi\right>_{\widetilde{L}}$ of $\left|\phi\right>_{L}$ for every $c$ ). Then we have $\overline{\mathcal{L}}=\operatorname{Tr}_{\widetilde{L}}\circ\overline{\mathcal{L}}^{\prime}$ where $\overline{\mathcal{L}}^{\prime}$ is a classical-to-quantum channel defined by $\overline{\mathcal{L}}^{\prime}\left[\left|c\right>\!\!\left<c\right|_{C}\right]=\left|\rho^{(c)}\right>\!\!\left<\rho^{(c)}\right|_{L\widetilde{L}}$ , and this channel $\overline{\mathcal{L}}^{\prime}$ still satisfies the bounded-weight leakage constraint with the same $\delta_{\mathrm{leak}}$ value (with the new “reference state” being $\left|\phi\right>_{L}\left|\phi\right>_{\widetilde{L}}$ ). 
Also, the states $\left|\rho^{(c)}\right>_{L\widetilde{L}}$ and $\left|\phi\right>_{L}\left|\phi\right>_{\widetilde{L}}$ span a subspace of dimension at most $d_{C}+1$ , and hence we can restrict the output space of $\overline{\mathcal{L}}^{\prime}$ to be this subspace.

With the above properties, we see that if we were to consider a virtual process where the channels $\overline{\mathcal{L}}_{j}$ in the actual protocol are replaced with these new “virtual” channels $\overline{\mathcal{L}}^{\prime}_{j}$, we would end up with the same reduced state on $\mathbf{S}\mathbf{L}^{A\to E}\mathbf{L}^{B\to E}\mathsf{E}\mathbf{P}$, so the value of $H_{\mathrm{min}}^{{\epsilon^{\prime}_{s}}}\left(\mathbf{S}|\mathbf{L}^{A\to E}\mathbf{L}^{B\to E}\mathsf{E}\mathbf{P}\right)_{\rho_{|\mathrm{PE}}}$ in the actual protocol state is the same as for the state produced by the virtual process. Furthermore, for the latter state we can write $H_{\mathrm{min}}^{{\epsilon^{\prime}_{s}}}\left(\mathbf{S}|\mathbf{L}^{A\to E}\mathbf{L}^{B\to E}\mathsf{E}\mathbf{P}\right)_{\rho_{|\mathrm{PE}}}\geq H_{\mathrm{min}}^{{\epsilon^{\prime}_{s}}}\left(\mathbf{S}|\mathbf{L}^{A\to E}\mathbf{L}^{B\to E}\widetilde{\mathbf{L}}^{A\to E}\widetilde{\mathbf{L}}^{B\to E}\mathsf{E}\mathbf{P}\right)_{\rho_{|\mathrm{PE}}}$. We can then simply lower-bound the right-hand side by viewing $\overline{\mathcal{L}}^{\prime}_{j}$ as our leakage channels and noting that they still satisfy the original leakage constraints, and their output dimension can be restricted to $d_{C}+1$, so the analysis in Sec. 4.1 can be applied. (Note that we cannot use an argument of this form to upper bound $H_{\mathrm{max}}^{\nu}\left(\mathbf{L}\right)_{\rho_{|\mathrm{PE}}}$ instead, since for quantum systems it is not always upper bounded by $H_{\mathrm{max}}^{\nu}\left(\mathbf{L}\widetilde{\mathbf{L}}\right)_{\rho_{|\mathrm{PE}}}$; for our argument to work, we had to directly consider the conditional smooth min-entropy.)

As for the classical-probabilistic leakage constraint, there is a slight technicality: since it implies the bounded-weight constraint (with the same $\delta_{\mathrm{leak}}$ value), we could apply the above argument to again show that it suffices to consider $\overline{\mathcal{L}}^{\prime}$ to have an output space of dimension $d_{C}+1$ , but with the subtlety that this channel would only be subject to the bounded-weight constraint (with parameter $\delta_{\mathrm{leak}}$ ) rather than the original classical-probabilistic constraint. However, recall that the analysis in Sec. 4 yields the same results for either of those leakage constraints, and hence this relaxation of the type of constraint on the channel does not change the final results.

Appendix F Dimension bounds from energy bounds

The idea here is to introduce some finite “cutoff” energy value $E_{\mathrm{cutoff}}$ , and argue that if we remove all weight on the leakage registers with energy above this cutoff, the resulting state is still close to the original state. For this approach, we assume that the system Hamiltonians are noninteracting, so the total energy of the systems is given by just summing the energies of the individual systems. Furthermore, we just focus on discussing either one of the leakage systems $\mathbf{L}^{A\to E}$ or $\mathbf{L}^{B\to E}$ , since to analyze both of them we could repeat this argument for each of them and then apply the triangle inequality.

Let $P_{<}$ denote the projector onto the subspace spanned by energy eigenstates $\left|e_{k}\right>_{L}$ with energy (strictly) less than $E_{\mathrm{cutoff}}$. Consider measuring all the $L_{j}$ systems individually using the measurement with projectors $(P_{<},\mathbb{I}-P_{<})$, which is equivalent to measuring all the systems using the measurement with projectors $(P_{<}^{\otimes n},(\mathbb{I}-P_{<})\otimes P_{<}^{\otimes(n-1)},P_{<}\otimes(\mathbb{I}-P_{<})\otimes P_{<}^{\otimes(n-2)},\dots,(\mathbb{I}-P_{<})^{\otimes n})$. If we could certify that the probability of getting the outcome $P_{<}^{\otimes n}$ is at least some value $1-\tilde{\epsilon}^{2}$, then by the Gentle Measurement Lemma we would know that the original state is $\tilde{\epsilon}$-close (in purified or trace distance) to the post-measurement state conditioned on that outcome. Observe also that this post-measurement state is supported on the subspace spanned by considering only the eigenstates $\left|e_{k}\right>_{L}$ with energy less than $E_{\mathrm{cutoff}}$, hence we could try to prove its security by applying the dimension-bound-based analysis (assuming that there are only finitely many such $\left|e_{k}\right>_{L}$). [footnote 26]

Footnote 26: For this sketch we gloss over the technicality mentioned in Appendix C that sending some state to its corresponding post-measurement state does not form a valid CPTP map; it should probably be possible to address this issue with another suitably modified Gentle Measurement Lemma and/or allowing trace-nonincreasing maps.

Thus our task is reduced to finding the probability bound $1-\tilde{\epsilon}^{2}$ as a function of $E_{\mathrm{cutoff}}$ (and the expected-energy bound $E_{\mathrm{exp}}$ ). To do so, first observe that not getting the outcome $P_{<}^{\otimes n}$ implies that at least one of the systems had an outcome corresponding to energy at least $E_{\mathrm{cutoff}}$ , which implies that the outcome value for total energy is at least $E_{\mathrm{cutoff}}$ as well, recalling we chose our ground state energy such that all energies are non-negative. (Here we implicitly used the fact that the measurements $(P_{<},\mathbb{I}-P_{<})$ produce the same outcome probabilities as performing an energy measurement and then coarse-graining the outcome depending on whether the value is below $E_{\mathrm{cutoff}}$ .) However, we know the total expected-energy of the systems is bounded by $nE_{\mathrm{exp}}$ , and hence by Markov’s inequality, the probability of such an outcome cannot be more than $nE_{\mathrm{exp}}/E_{\mathrm{cutoff}}$ . Therefore we could take $\tilde{\epsilon}=\sqrt{nE_{\mathrm{exp}}/E_{\mathrm{cutoff}}}$ .
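To make the scaling concrete, the Markov-inequality bound can be evaluated numerically. The following is a minimal sketch with hypothetical function names and made-up parameter values; it only evaluates $\tilde{\epsilon}=\sqrt{nE_{\mathrm{exp}}/E_{\mathrm{cutoff}}}$ and its inverse, and assumes energies are measured from a ground-state energy of zero.

```python
import math

def closeness_bound(n, e_exp, e_cutoff):
    """Return eps = sqrt(n * E_exp / E_cutoff), the Gentle-Measurement closeness
    parameter obtained from Markov's inequality on the total energy."""
    if n <= 0 or e_exp < 0 or e_cutoff <= 0:
        raise ValueError("need n > 0, E_exp >= 0, E_cutoff > 0")
    return math.sqrt(n * e_exp / e_cutoff)

def cutoff_for_target(n, e_exp, eps):
    """Smallest cutoff energy for which the bound above gives closeness eps."""
    if not 0 < eps:
        raise ValueError("eps must be positive")
    return n * e_exp / eps ** 2

# Hypothetical numbers: per-round expected energy 1e-6 (arbitrary units).
eps_small_n = closeness_bound(100, 1e-6, 1.0)
eps_large_n = closeness_bound(10 ** 6, 1e-6, 1.0)   # bound is already trivial here
needed_cutoff = cutoff_for_target(10 ** 6, 1e-6, 0.01)
```

With the cutoff held fixed, the bound reaches $1$ once $nE_{\mathrm{exp}}\geq E_{\mathrm{cutoff}}$, and keeping $\tilde{\epsilon}$ fixed requires the cutoff to grow linearly in $n$.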

Unfortunately, this bound is of limited use because it increases with $n$: for fixed $E_{\mathrm{cutoff}}$ it becomes trivial ($\tilde{\epsilon}\geq 1$) once $n\geq E_{\mathrm{cutoff}}/E_{\mathrm{exp}}$. (It might be possible to choose $E_{\mathrm{cutoff}}\propto n$ and still obtain nontrivial results for specific Hamiltonians, but this does not appear very promising in general.) The core difficulty here seems to be that while the energy bound implies that the probability of each individual register having energy above $E_{\mathrm{cutoff}}$ is at most $E_{\mathrm{exp}}/E_{\mathrm{cutoff}}$ (by Markov’s inequality), in order to apply the dimension-bound argument we need to ensure that all the registers are subject to the cutoff. (Applying the Gentle Measurement Lemma to the $L_{j}$ registers individually also seems unlikely to help, because this would only give an exponentially decreasing fidelity bound between the original and post-measurement states on the full $\mathbf{L}$ system.) We leave for future work the question of whether approaching the analysis in a different way could overcome the scaling issue in this approach.

Appendix G Lagrange dual analysis

Here we focus on proving Lemma 1 for the energy-bound case (Sec. 4.2); the results for the dimension-bound case (Sec. 4.1) can be easily obtained by an analogous argument (simply omit the energy constraint and impose the fact that the number of $w_{k}$ variables is $d_{L}$ ).

In this section let us use the notation $w_{\mathrm{min}}\coloneqq 1-\delta_{\mathrm{leak}}$ , so the first constraint in (50) can be written more compactly as $w_{0}\geq w_{\mathrm{min}}$ . As mentioned above Lemma 1, we work in the extended reals $\mathbb{R}\cup\{\pm\infty\}$ . Since (50) is a constrained optimization problem, it is easily shown (see e.g. [BV04]) that for any choice of $\kappa,\beta\in\mathbb{R}_{\geq 0}$ and $\lambda\in\mathbb{R}$ , we can upper-bound the optimal value with the Lagrange dual function of the optimization:

[TABLE]

where $L$ is the Lagrangian (choosing a sign convention for the equality constraint that keeps the final expressions slightly cleaner):

[TABLE]

We shall now show that for $\lambda>\kappa$ , the above expression for $g$ reduces to the one presented in Lemma 1. (For $\kappa\geq\lambda$ , the above expression evaluates to $g(\kappa,\beta,\lambda)=+\infty$ and hence only yields a trivial bound — to see this, note that $L(\mathbf{w},\kappa,\beta,\lambda)$ is of the form $w_{0}^{\alpha}+\kappa w_{0}-\lambda w_{0}+Z$ where $Z$ denotes some terms independent of $w_{0}$ . Therefore, when $\kappa\geq\lambda$ the expression is unbounded as we take $w_{0}\to+\infty$ , and hence $g(\kappa,\beta,\lambda)=+\infty$ . We hence do not consider this regime in the rest of our analysis; if desired, from the perspective of optimization theory we can view the function $g(\kappa,\beta,\lambda)$ in Lemma 1 as being implicitly understood to have value $+\infty$ outside of the domain specified in the lemma statement.)

To simplify the expression for $g$ , we note that by concavity of the original optimization (50) (or by direct inspection) $L(\mathbf{w},\kappa,\beta,\lambda)$ is a concave function of $\mathbf{w}$ , which implies that if the domain contains a stationary point, then that point attains the maximum value over the domain. To find whether such a stationary point exists, we compute the partial derivatives $\frac{\partial L}{\partial w_{k}}$ over the interior of the domain, i.e. for $w_{k}>0$ :

[TABLE]

where in the first line we have used the fact that we set $E_{0}=0$ . We hence see that given $\lambda>\kappa$ , the system of equations $\frac{\partial L}{\partial w_{k}}=0$ indeed has a solution with strictly positive $w_{k}$ values, namely:

[TABLE]

Substituting this solution into (72), we conclude that for $\kappa,\beta,\lambda$ satisfying the lemma conditions, we have

[TABLE]

which simplifies to (51) after observing that

[TABLE]

and similarly for the terms in the summation.

As for the remaining claims in the lemma, the Lagrange dual function $g$ for a constrained maximization problem is always a convex function of the dual variables [BV04], since it is the supremum of a family of affine functions (of the dual variables). As for showing that the optimization (52) has the same value as the original optimization (50) when $\delta_{\mathrm{leak}}>0$ (i.e. $w_{\mathrm{min}}<1$) and $E_{\mathrm{exp}}>0$, this means we have to show that strong duality holds for such parameter values (viewing the former as the dual optimization and the latter as the primal optimization). In principle, we could do this by noting that the primal optimization (50) has a strictly feasible point for those parameter values (except for some edge cases where the optimization is already unbounded) [footnote 27], and then invoking an appropriate generalization of Slater’s condition to infinite-dimensional domains. However, to offer an alternative approach that avoids technicalities in handling infinite-dimensional vector spaces, we present below a proof for our case that bypasses this aspect, by extracting only the necessary intermediate steps from the standard proofs of Slater’s condition (see e.g. [BV04, SB14]).

Footnote 27: Explicitly: first consider a simple case where $E_{\mathrm{gap}}\coloneqq\inf\{E_{k}|E_{k}>0\}$ is strictly positive (i.e. we have a gapped Hamiltonian) and the ground state is nondegenerate. By the countable-dimension assumption, we can use $\mathbb{N}$ to label all the non-ground-state energy levels. Then for any $t>0$, if we set $w_{k}=t/(2^{k}E_{k})>0$ for $k\neq 0$, we have $\sum_{k=1}^{\infty}w_{k}E_{k}=\sum_{k=1}^{\infty}t/2^{k}=t$, and choosing $w_{0}$ to satisfy normalization we have $w_{0}=1-\sum_{k=1}^{\infty}w_{k}\geq 1-t/E_{\mathrm{gap}}$. Hence by choosing $t$ sufficiently small we can satisfy both the $w_{\mathrm{min}}$ and $E_{\mathrm{exp}}$ constraints with strict inequality (given $w_{\mathrm{min}}<1$ and $E_{\mathrm{exp}}>0$), with all $w_{k}$ values being strictly positive, yielding a strictly feasible point. To cover the edge cases, first note that for any Hamiltonian with $E_{\mathrm{gap}}=0$ there must be infinitely many energy levels arbitrarily close to zero and hence the optimization (50) is unbounded in the first place; finally, if $E_{\mathrm{gap}}>0$ and the ground state is degenerate, then either it is infinitely degenerate (in which case (50) is again unbounded) or it is finitely degenerate and we can just slightly modify the above construction to obtain a strictly feasible point (simply remove some weight from a sufficiently high energy level and redistribute it over the ground states).
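The strictly feasible point described above, with weights $w_{k}=t/(2^{k}E_{k})$ on the excited levels, can be checked numerically. The sketch below is illustrative only: it assumes a harmonic-oscillator-like spectrum $E_{k}=kE_{\mathrm{gap}}$, truncates to finitely many levels for the numerics (the argument itself is for countably many), and uses made-up values for $w_{\mathrm{min}}$ and $E_{\mathrm{exp}}$. The function name is hypothetical.

```python
def strictly_feasible_point(excited_energies, t):
    """Weights [w_0, w_1, ..., w_K] with w_k = t / (2^k E_k) for k >= 1
    and w_0 fixed by normalization."""
    excited = [t / (2 ** k * e) for k, e in enumerate(excited_energies, start=1)]
    w0 = 1.0 - sum(excited)
    return [w0] + excited

# Assumed spectrum: E_k = k * E_gap for k = 1..40 (truncated for the numerics).
e_gap = 1.0
excited_energies = [e_gap * k for k in range(1, 41)]

# Assumed constraint parameters (w_min < 1 and E_exp > 0).
w_min, e_exp = 0.99, 0.05

# Any t below both E_exp and E_gap * (1 - w_min) works; take half the smaller one.
t = 0.5 * min(e_exp, e_gap * (1 - w_min))

w = strictly_feasible_point(excited_energies, t)
mean_energy = sum(wk * ek for wk, ek in zip(w[1:], excited_energies))

strictly_feasible = (
    all(wk > 0 for wk in w)      # interior of the domain
    and w[0] > w_min             # ground-state weight constraint, strict
    and mean_energy < e_exp      # energy constraint, strict
)
```

Here `mean_energy` equals $t(1-2^{-40})$, just below $t$, and `w[0]` stays above $1-t/E_{\mathrm{gap}}$, matching the two estimates used in the construction.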

Specifically, consider the function $F(w_{\mathrm{min}},E_{\mathrm{exp}},p)$ defined as the optimal value of the following concave optimization (which is just a slight generalization of the optimization (50), by allowing the weights $w_{k}$ to sum to some $p\in\mathbb{R}$ instead of $1$ ):

[TABLE]

taking the optimization to have value $-\infty$ if it is infeasible. The domain $\mathrm{dom}(F)$ of this function is defined [BV04] to be the set of values $(w_{\mathrm{min}},E_{\mathrm{exp}},p)\in\mathbb{R}^{3}$ such that $F(w_{\mathrm{min}},E_{\mathrm{exp}},p)\neq-\infty$. Now to show that strong duality holds for some particular choice of $(w_{\mathrm{min}},E_{\mathrm{exp}},p)$ in this optimization, it suffices to show that this choice of $(w_{\mathrm{min}},E_{\mathrm{exp}},p)$ lies in the interior of $\mathrm{dom}(F)$ (see the proofs of Slater’s condition in e.g. [BV04, SB14]; the geometric idea is that this ensures the existence of a nonvertical supporting hyperplane of the subgraph of $F$ at that point, from which an optimal dual solution can be obtained). In particular, we are focusing on the situation where $w_{\mathrm{min}}<1$, $E_{\mathrm{exp}}>0$ and $p=1$, in which case it is straightforward to find some sufficiently small $t>0$ such that for all $(w_{\mathrm{min}}^{\prime},E_{\mathrm{exp}}^{\prime},p^{\prime})$ within distance $t$ (in some norm) of $(w_{\mathrm{min}},E_{\mathrm{exp}},1)$, we have $w_{\mathrm{min}}^{\prime}\leq p^{\prime}$ and $E_{\mathrm{exp}}^{\prime},p^{\prime}>0$. [footnote 28] The optimization is feasible for all such $(w_{\mathrm{min}}^{\prime},E_{\mathrm{exp}}^{\prime},p^{\prime})$, since there is a simple feasible point given by $w_{0}=p^{\prime}$ and $w_{k}=0$ for all $k\neq 0$. Hence $F$ is finite in that neighbourhood, i.e. $(w_{\mathrm{min}},E_{\mathrm{exp}},1)$ is an interior point of $\mathrm{dom}(F)$, as required.

Footnote 28: For example, we can set $t=\min\{(1-w_{\mathrm{min}})/2,E_{\mathrm{exp}}/2,1/2\}>0$ and use the $\infty$-norm, i.e. $\max\{|w_{\mathrm{min}}^{\prime}-w_{\mathrm{min}}|,|E_{\mathrm{exp}}^{\prime}-E_{\mathrm{exp}}|,|p^{\prime}-1|\}$.
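The interior-point claim can be verified exactly for any given parameters. The sketch below is an illustration with hypothetical function names and made-up parameter values: it takes the radius $t=\min\{(1-w_{\mathrm{min}})/2,E_{\mathrm{exp}}/2,1/2\}$ in the $\infty$-norm and checks the conditions $w_{\mathrm{min}}^{\prime}\leq p^{\prime}$ and $E_{\mathrm{exp}}^{\prime},p^{\prime}>0$ at every corner of the cube of that radius. Because each condition is linear in the perturbed parameters, holding at all corners implies holding throughout the cube. Exact rational arithmetic is used because the first condition holds with equality at one corner.

```python
import itertools
from fractions import Fraction

def neighbourhood_radius(w_min, e_exp):
    """Radius t = min{(1 - w_min)/2, E_exp/2, 1/2} of the infinity-norm ball."""
    return min((1 - w_min) / 2, e_exp / 2, Fraction(1, 2))

def trivial_point_is_feasible(w_min_p, e_exp_p, p_p):
    """Conditions under which w_0 = p', w_k = 0 (k != 0) is feasible."""
    return w_min_p <= p_p and e_exp_p > 0 and p_p > 0

def whole_ball_feasible(w_min, e_exp):
    """Check the conditions at every corner (and face centre) of the cube."""
    t = neighbourhood_radius(w_min, e_exp)
    return all(
        trivial_point_is_feasible(w_min + a * t, e_exp + b * t, 1 + c * t)
        for a, b, c in itertools.product((-1, 0, 1), repeat=3)
    )

# Assumed parameter values with w_min < 1 and E_exp > 0.
w_min = Fraction(99, 100)
e_exp = Fraction(1, 20)
radius = neighbourhood_radius(w_min, e_exp)
interior = whole_ball_feasible(w_min, e_exp)
```

For these values the radius is $1/200$, and the tightest corner has $w_{\mathrm{min}}^{\prime}=p^{\prime}=199/200$.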

We remark that if the optimization (50) is such that the optimal solution saturates both the inequality constraints (this will be the case in most situations; specifically, as long as the $E_{\mathrm{exp}}$ value is not so low that it enforces $w_{0}$ to be strictly larger than $w_{\mathrm{min}}$ , and the Hamiltonian is such that the maximum entropy is an increasing function of the expected energy), then we can replace them with equality constraints. In that case the theory of Lagrange multipliers (along with strict concavity of the objective function and linearity of the constraints) implies that solving the system of equations $\frac{\partial L}{\partial w_{k}},\frac{\partial L}{\partial\kappa},\frac{\partial L}{\partial\beta},\frac{\partial L}{\partial\lambda}=0$ for $\mathbf{w},\kappa,\beta,\lambda$ yields the optimal $\mathbf{w}$ in the primal optimization (50) and the optimal $\kappa,\beta,\lambda$ in the dual optimization (52). In particular, since the equations $\frac{\partial L}{\partial\kappa},\frac{\partial L}{\partial\beta},\frac{\partial L}{\partial\lambda}=0$ just reproduce the optimization constraints, this means that in principle the choice of $\kappa,\beta,\lambda$ that yields the best upper bound could be obtained by substituting the equations (74) into the optimization constraints and solving for $\kappa,\beta,\lambda$ . However, we currently do not have any explicit Hamiltonian in which we can solve the resulting expression (the various resulting summations can be expressed in terms of the Hurwitz zeta function or its derivatives via similar arguments as in the main text, but this is difficult to handle further in closed form).

We close this section with some side-remarks about the “thermodynamic version” of this argument, i.e. if we had instead taken the objective function to be the Shannon/Boltzmann entropy. (As mentioned in the main text, a solution to this version does not currently seem usable for our context as we do not have a good method to relate it to $H_{\alpha}(L_{j})$ without a dimension bound, and in any case our above approach should give better results as it directly analyzes $H_{\alpha}(L_{j})$ . Still, we mention it in case it highlights some useful properties.) Note that we still keep the constraint on $w_{0}$ , i.e. we are maximizing the entropy of a system subject to a ground-state probability constraint as well as an energy constraint. In that case, solving the system of equations $\frac{\partial L}{\partial w_{k}}=0$ yields solutions for $\{w_{k}|k\neq 0\}$ that are exponentially decreasing with respect to $E_{k}$ . If we furthermore suppose that the inequality constraints are saturated as mentioned above, this means the optimal solution is essentially a Gibbs state except with a larger value of $w_{0}$ due to the ground-state constraint, as one might intuitively expect. Furthermore, for some simple Hamiltonians (such as harmonic oscillators, as studied in the main text) one can explicitly solve for the optimal Lagrange-multiplier values, with the optimal value of $\beta$ yielding the inverse-temperature parameter $1/(k_{B}T)$ , and the optimal value of $\lambda$ being related to the partition function.

Bibliography

[AFRV19] Rotem Arnon-Friedman, Renato Renner and Thomas Vidick “Simple and Tight Device-Independent Security Proofs” In SIAM Journal on Computing 48.1, Society for Industrial & Applied Mathematics (SIAM), 2019, pp. 181–225. DOI: 10.1137/18m1174726
[BCK13] Jonathan Barrett, Roger Colbeck and Adrian Kent “Memory Attacks on Device-Independent Quantum Cryptography” In Physical Review Letters 110, American Physical Society, 2013, pp. 010503. DOI: 10.1103/PhysRevLett.110.010503
[BFF21] Peter Brown, Hamza Fawzi and Omar Fawzi “Device-independent lower bounds on the conditional von Neumann entropy” In arXiv:2106.13692 [quant-ph], 2021. URL: https://arxiv.org/abs/2106.13692v1
[BHK05] Jonathan Barrett, Lucien Hardy and Adrian Kent “No Signaling and Quantum Key Distribution” In Physical Review Letters 95, American Physical Society, 2005, pp. 010503. DOI: 10.1103/PhysRevLett.95.010503
[BRC20] P. J. Brown, S. Ragy and R. Colbeck “A Framework for Quantum-Secure Device-Independent Randomness Expansion” In IEEE Transactions on Information Theory 66.5, 2020, pp. 2964–2987. DOI: 10.1109/TIT.2019.2960252
[BRC21] Rutvij Bhavsar, Sammy Ragy and Roger Colbeck “Improved device-independent randomness expansion rates from tight bounds on the two sided randomness using CHSH tests” In arXiv:2103.07504v2 [quant-ph], 2021. URL: https://arxiv.org/abs/2103.07504v2
[BSS14] Jean-Daniel Bancal, Lana Sheridan and Valerio Scarani “More randomness from the same data” In New Journal of Physics 16.3, IOP Publishing, 2014, pp. 033011. DOI: 10.1088/1367-2630/16/3/033011
[BV04] Stephen Boyd and Lieven Vandenberghe “Convex Optimization” Cambridge University Press, 2004. DOI: 10.1017/CBO9780511804441
