Achieving GWAS with Homomorphic Encryption

Jun Jie Sim; Fook Mun Chan; Shibin Chen; Benjamin Hong Meng Tan; Khin Mi Mi Aung

arXiv:1902.04303·stat.AP·August 2, 2019

TL;DR

This paper presents a method using homomorphic encryption to perform genome-wide association studies securely, enabling privacy-preserving genetic analysis with practical performance on real datasets.

Contribution

It adapts the semi-parallel GWAS algorithm for homomorphic encryption, introducing matrix operations, approximations, and a cache module to improve efficiency and scalability.

Findings

01. Successfully performed GWAS with homomorphic encryption on real datasets

02. Achieved a computation time of under 25 minutes (24.70 minutes with SEAL) for 10,643 SNPs and 245 samples

03. Demonstrated feasibility of privacy-preserving genetic association studies

Abstract

A genome-wide association study (GWAS) is one way of investigating how genes affect human traits. GWAS uses genetic markers known as single-nucleotide polymorphisms (SNPs). This raises privacy and security concerns, as these genetic markers can be used to identify individuals uniquely. The problem is exacerbated by the large number of SNPs needed: more SNPs produce more reliable results, but at a higher risk of compromising the privacy of participants. We describe a method using homomorphic encryption (HE) to perform GWAS in a secure and private setting. This work is based on a proposed algorithm. Our solution mainly involves homomorphically encrypted matrix operations and suitable approximations that adapt the semi-parallel GWAS algorithm for HE. We leverage the complex space of the CKKS encryption scheme to increase the number of SNPs that can be packed within a…

Figures (4)

Tables (6)

Table 1: Depth of Homomorphic Operations

Homomorphic Operation	No. of Successive Operations
Plaintext Multiplication^∗	29
Ciphertext Multiplication^†	40
Ciphertext Rotation	256

Table 2: Time Taken and Memory Consumption with iDASH server (4 cores) using HEAAN

Process	Time Taken (min)	Memory (GB)
Preprocessing^∗	0.019	0.024
Context Generation	0.65	3.55
Encryption	0.79	0.802628
Computations	717.20	3.98849
Decryption	0.32	0.063

Table 3: Time Taken and Memory Consumption with our server (22 cores) using HEAAN

Process	Time Taken (min)	Memory (GB)
Preprocessing^∗	0.019	0.024
Context Generation	0.43	3.55
Encryption	0.404	0.886795
Computations	203.42	24.1119
Decryption	0.30	0.063

Table 4: HEAAN Accuracy

Error e	No. of Different Entries	HEAAN Accuracy (%)
0.1	0	100
0.01	168	98.42
0.005	645	93.94

Table 5: Time Taken and Memory Consumption with our server (22 cores) using SEAL

Process	Time Taken (min)	Memory (GB)
Preprocessing^∗	0.020	0.024
Context Generation	12.86	73.4
Encryption	0.20	1.60404
Computations	24.70	38.4843
Decryption	0.31	0.666103
Total	25.21	40.76
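
The timing rows in Tables 3 and 5 can be cross-checked with a few lines of arithmetic. The sketch below copies the reported minutes and makes one observation that the table itself does not state: the Total time matches encryption + computations + decryption, without preprocessing or context generation. The dictionary keys are illustrative names, not identifiers from the paper.

```python
# Timings in minutes, copied from Table 5 (SEAL, 22 cores).
seal_minutes = {
    "preprocessing": 0.020,
    "context_generation": 12.86,
    "encryption": 0.20,
    "computations": 24.70,
    "decryption": 0.31,
}

# Observation (an inference, not stated in the table): the Total row equals
# encryption + computations + decryption.
online_total = round(
    seal_minutes["encryption"]
    + seal_minutes["computations"]
    + seal_minutes["decryption"],
    2,
)

# Computation time with HEAAN on the same 22-core server (Table 3).
heaan_computations = 203.42
speedup = heaan_computations / seal_minutes["computations"]

print("encryption + computations + decryption:", online_total)
print("HEAAN / SEAL computation time ratio:", round(speedup, 2))
```

On these figures the SEAL computations run roughly eight times faster than the HEAAN computations on the same machine.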

Table 6: SEAL Accuracy

Error e	No. of Different Entries	SEAL Accuracy (%)
0.1	127	98.81
0.01	4061	61.84
0.005	5940	44.19
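
The accuracy columns in Tables 4 and 6 are consistent with a simple ratio over the 10,643 SNPs reported in the Findings. The sketch below assumes accuracy = 1 - (different entries / total SNPs); that formula is an inference from the numbers, and the function name is hypothetical.

```python
N_SNPS = 10_643  # number of SNPs reported in the Findings

def accuracy_percent(n_different: int, n_total: int = N_SNPS) -> float:
    """Share of entries that agree, as a percentage rounded to 2 decimals."""
    return round(100.0 * (1.0 - n_different / n_total), 2)

# Error threshold e -> number of differing entries, copied from Tables 4 and 6.
heaan = {0.1: 0, 0.01: 168, 0.005: 645}
seal = {0.1: 127, 0.01: 4061, 0.005: 5940}

for name, table in (("HEAAN", heaan), ("SEAL", seal)):
    for error, n_different in table.items():
        print(name, error, accuracy_percent(n_different))
```

Under this assumption every accuracy value in both tables is reproduced to two decimals.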

Equations (51)

\log\left(\frac{p}{1 - p}\right) = β_{0} + β_{1} x_{1} + \dots + β_{d} x_{d}.

p(x, β) = \frac{1}{1 + e^{-β^{⊺} x}}
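
The log-odds equation and the probability p(x, β) are inverses of each other, which a short numeric check makes concrete. This is a minimal sketch in plain Python; the coefficient and covariate values are hypothetical, and x is assumed to carry a leading 1 so that β_0 acts as the intercept.

```python
import math

def logistic_probability(x, beta):
    """p(x, beta) = 1 / (1 + exp(-beta^T x)); x[0] is assumed to be 1 (intercept)."""
    eta = sum(b * v for b, v in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-eta))

def log_odds(p):
    """log(p / (1 - p)), the left-hand side of the linear model."""
    return math.log(p / (1.0 - p))

beta = [-0.5, 1.2, 0.3]   # hypothetical coefficients beta_0, beta_1, beta_2
x = [1.0, 0.7, -2.0]      # hypothetical covariates with a leading 1

p = logistic_probability(x, beta)
linear_predictor = sum(b * v for b, v in zip(beta, x))

# The log-odds of p recovers the linear predictor beta^T x.
print(p, log_odds(p), linear_predictor)
```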

L(X, β) = \prod_{i=1}^{n} P(y_{i} ∣ x_{i}) = \prod_{i=1}^{n} p(x_{i}, β)^{y_{i}} (1 - p(x_{i}, β))^{1 - y_{i}}.

ℓ(X, β) = \sum_{i=1}^{n} y_{i} \log(p(x_{i}, β)) + \sum_{i=1}^{n} (1 - y_{i}) \log(1 - p(x_{i}, β))

= \sum_{i=1}^{n} y_{i} \log\left(\frac{p(x_{i}, β)}{1 - p(x_{i}, β)}\right) + \sum_{i=1}^{n} \log\left(\frac{1}{e^{β^{⊺} x_{i}} + 1}\right)

= \sum_{i=1}^{n} y_{i} (β^{⊺} x_{i}) - \sum_{i=1}^{n} \log(e^{β^{⊺} x_{i}} + 1)
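
The first and last lines of the log-likelihood derivation should agree numerically for any data. The sketch below checks that on a small hypothetical dataset; the function names and values are illustrative only.

```python
import math

def linear_predictor(x, beta):
    return sum(b * v for b, v in zip(beta, x))

def p(x, beta):
    return 1.0 / (1.0 + math.exp(-linear_predictor(x, beta)))

def loglik_direct(X, y, beta):
    """sum_i y_i log p_i + (1 - y_i) log(1 - p_i)."""
    return sum(
        yi * math.log(p(xi, beta)) + (1 - yi) * math.log(1.0 - p(xi, beta))
        for xi, yi in zip(X, y)
    )

def loglik_simplified(X, y, beta):
    """sum_i y_i (beta^T x_i) - log(exp(beta^T x_i) + 1)."""
    total = 0.0
    for xi, yi in zip(X, y):
        eta = linear_predictor(xi, beta)
        total += yi * eta - math.log(math.exp(eta) + 1.0)
    return total

# Hypothetical data: leading 1 for the intercept, one covariate.
X = [[1.0, -1.5], [1.0, 0.2], [1.0, 0.9], [1.0, 2.4]]
y = [0, 1, 0, 1]
beta = [0.3, -0.8]

direct = loglik_direct(X, y, beta)
simplified = loglik_simplified(X, y, beta)
print(direct, simplified)
```

The simplified form needs one exponential and one logarithm per sample, instead of two logarithms of probabilities.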

β^{(t+1)} = β^{(t)} - H^{-1}(β^{(t)}) g(β^{(t)})

g(β)

H(β)

\tilde{H} = \frac{1}{4} X^{⊺} X
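
Replacing H(β) with the constant matrix (1/4) X^⊺ X means the inverse is computed once and reused in every iteration. The sketch below is a plain-Python illustration for a two-column X, not the paper's implementation. It assumes g is the gradient of the negative log-likelihood, X^⊺(p - y), which makes the update a descent step with the sign convention shown above; the data and function names are hypothetical.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def neg_loglik(X, y, beta):
    """Negative log-likelihood: sum_i log(exp(eta_i) + 1) - y_i * eta_i."""
    total = 0.0
    for row, yi in zip(X, y):
        eta = beta[0] * row[0] + beta[1] * row[1]
        total += math.log(math.exp(eta) + 1.0) - yi * eta
    return total

def fixed_hessian_inverse(X):
    """Inverse of (1/4) X^T X for a two-column X, via the closed-form 2x2 inverse."""
    a = sum(r[0] * r[0] for r in X) / 4.0
    b = sum(r[0] * r[1] for r in X) / 4.0
    d = sum(r[1] * r[1] for r in X) / 4.0
    det = a * d - b * b
    return [[d / det, -b / det], [-b / det, a / det]]

def fixed_hessian_step(X, y, beta, H_inv):
    p = [sigmoid(beta[0] * r[0] + beta[1] * r[1]) for r in X]
    # Assumption: g is the gradient of the negative log-likelihood, X^T (p - y).
    g = [sum(r[j] * (pi - yi) for r, pi, yi in zip(X, p, y)) for j in range(2)]
    return [beta[j] - (H_inv[j][0] * g[0] + H_inv[j][1] * g[1]) for j in range(2)]

# Hypothetical, non-separable data: intercept column plus one covariate.
X = [[1.0, x] for x in (-2.0, -1.0, 0.0, 1.0, 2.0, 3.0)]
y = [0, 0, 1, 0, 1, 1]

H_inv = fixed_hessian_inverse(X)   # computed once, reused every iteration
beta = [0.0, 0.0]
losses = [neg_loglik(X, y, beta)]
for _ in range(20):
    beta = fixed_hessian_step(X, y, beta, H_inv)
    losses.append(neg_loglik(X, y, beta))

print(beta, losses[0], losses[-1])
```

Because each weight p(1 - p) is at most 1/4, the fixed matrix bounds the true curvature, so the negative log-likelihood does not increase from one step to the next.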

σ_{7}(x) = 0.5 - 1.73496 \left(\frac{x}{8}\right) + 4.19407 \left(\frac{x}{8}\right)^{3} - 5.43402 \left(\frac{x}{8}\right)^{5} + 2.50739 \left(\frac{x}{8}\right)^{7}.
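
A polynomial like σ_7 is useful under HE because it needs only additions and multiplications. The sketch below evaluates the polynomial exactly as printed, in plain Python. With the signs as printed the values decrease in x, so the comparison column uses 1/(1 + e^{x}); whether the paper applies the polynomial to x or to -x is not visible in this excerpt, so treat that pairing as an assumption.

```python
import math

# Odd-power coefficients of sigma_7, copied from the equation above.
COEFFS = {1: -1.73496, 3: 4.19407, 5: -5.43402, 7: 2.50739}

def sigma7(x):
    """Evaluate 0.5 + sum_k c_k (x/8)^k with the printed coefficients."""
    u = x / 8.0
    return 0.5 + sum(c * u ** k for k, c in COEFFS.items())

def decreasing_logistic(x):
    """1 / (1 + exp(x)), the curve the printed signs track (assumed pairing)."""
    return 1.0 / (1.0 + math.exp(x))

sample_points = (-8, -4, -2, 0, 2, 4, 8)
for x in sample_points:
    print(x, round(sigma7(x), 5), round(decreasing_logistic(x), 5))
```

On these sample points the two curves stay within a few hundredths of each other, with the largest gaps at the ends of the interval [-8, 8].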

z = Xβ + W^{-1}(y - p)
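
The working response z can be computed row by row once p is known. The sketch below assumes W is the diagonal matrix with entries p_i(1 - p_i), the usual choice in iteratively reweighted least squares; the excerpt does not define W, so this is an assumption, and the data are hypothetical.

```python
import math

def working_response(X, y, beta):
    """Return (z, w) with z = X beta + W^{-1} (y - p), assuming W = diag(p_i (1 - p_i))."""
    z, w = [], []
    for row, yi in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, row))
        pi = 1.0 / (1.0 + math.exp(-eta))
        wi = pi * (1.0 - pi)          # diagonal entry of W (assumed definition)
        w.append(wi)
        z.append(eta + (yi - pi) / wi)
    return z, w

# Hypothetical data: intercept column plus one covariate.
X = [[1.0, 0.5], [1.0, -1.0], [1.0, 2.0]]
y = [1, 0, 1]

# With beta = 0 every p_i is 0.5, so w_i = 0.25 and z_i = 4 * (y_i - 0.5).
z, w = working_response(X, y, [0.0, 0.0])
print(z, w)
```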

S^{*} = S - X (X^{⊺} W X)^{-1} X^{⊺} W S

z^{*} = z - X (X^{⊺} W X)^{-1} X^{⊺} W z.

b = \frac{(W z^{*})^{⊺} \cdot S^{*}}{\operatorname{colsum}(W (S^{*})^{2})}

err = \frac{1}{\operatorname{colsum}(W (S^{*})^{2})}.
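
The formulas for S^{*}, z^{*}, b and err can be traced on a toy case where X is a single column of ones. In that case X (X^⊺ W X)^{-1} X^⊺ W v reduces to the weighted mean of v, so the projection is a weighted centering. The sketch below assumes W is diagonal with weights w_i and that (S^{*})^{2} is an elementwise square; the data and function names are hypothetical.

```python
def weighted_center(v, w):
    """v - X (X^T W X)^{-1} X^T W v for X a column of ones: subtract the weighted mean."""
    mean = sum(wi * vi for wi, vi in zip(w, v)) / sum(w)
    return [vi - mean for vi in v]

def semi_parallel_estimates(snp_columns, z, w):
    """Per-SNP b and err, one entry per column of S."""
    z_star = weighted_center(z, w)
    b, err = [], []
    for column in snp_columns:
        s_star = weighted_center(column, w)
        # colsum(W (S*)^2) for this column, with an elementwise square (assumed).
        denom = sum(wi * si * si for wi, si in zip(w, s_star))
        b.append(sum(wi * zi * si for wi, zi, si in zip(w, z_star, s_star)) / denom)
        err.append(1.0 / denom)
    return b, err

# Hypothetical genotypes (two SNPs, five samples) and weights.
snp_columns = [[0.0, 1.0, 2.0, 1.0, 0.0], [1.0, 0.0, 1.0, 0.0, 2.0]]
w = [0.25, 0.2, 0.1, 0.2, 0.25]
# z is built as 3 + 2 * (first SNP), so the first estimate should be 2.
z = [3.0 + 2.0 * s for s in snp_columns[0]]

b, err = semi_parallel_estimates(snp_columns, z, w)
print(b, err)
```

Every SNP column is handled independently after the shared centering of z, which is what makes the method easy to batch.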

M = I - X (X^{⊺} X)^{-1} X^{⊺}.

S^{'} = MS = S - X (X^{⊺} X)^{-1} X^{⊺} S

z^{'} = Mz = z - X (X^{⊺} X)^{-1} X^{⊺} z.
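
Applying M to a vector removes the part of it that X can explain, and M no longer depends on W. The sketch below applies M without forming the n × n matrix, for a two-column X whose 2 × 2 inverse has a closed form. It is an illustration with hypothetical data, not the paper's encrypted procedure.

```python
def residualize(X, v):
    """Return M v = v - X (X^T X)^{-1} X^T v for a two-column X."""
    a = sum(r[0] * r[0] for r in X)
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X)
    det = a * d - b * b
    # X^T v
    t0 = sum(r[0] * vi for r, vi in zip(X, v))
    t1 = sum(r[1] * vi for r, vi in zip(X, v))
    # (X^T X)^{-1} X^T v via the closed-form 2x2 inverse
    c0 = (d * t0 - b * t1) / det
    c1 = (-b * t0 + a * t1) / det
    return [vi - (r[0] * c0 + r[1] * c1) for r, vi in zip(X, v)]

# Hypothetical covariates: intercept column plus one covariate.
X = [[1.0, c] for c in (0.0, 1.0, 2.0, 3.0, 5.0)]
s = [2.0, 0.0, 1.0, 4.0, 3.0]   # one hypothetical SNP column

s_prime = residualize(X, s)
covariate = [r[1] for r in X]

# The result is orthogonal to every column of X, and applying M twice changes nothing.
print(sum(s_prime), sum(si * ci for si, ci in zip(s_prime, covariate)))
print(residualize(X, s_prime))
```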

b^{'} = \frac{(W z^{'})^{⊺} \cdot S^{'}}{\operatorname{colsum}(W (S^{'})^{2})}.

err^{'} = \frac{1}{\operatorname{colsum}(W (S^{'})^{2})}.

(x + yi)^{2} = (x^{2} - y^{2}) + 2xyi.

(x + yi)(x - yi) = x^{2} + y^{2}

x^{2} = \frac{Re(z\bar{z} + z^{2})}{2}, \quad y^{2} = \frac{Re(z\bar{z} - z^{2})}{2}
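
These identities let two real values share one complex slot while still yielding both squares. The sketch below checks them with Python's built-in complex type on plaintext numbers; it illustrates the algebra only, with no encryption involved, and the function name is hypothetical.

```python
def squares_from_packed(x, y):
    """Pack x and y as z = x + yi and recover x^2 and y^2 from z*conj(z) and z^2."""
    z = complex(x, y)
    z_conj_product = z * z.conjugate()   # x^2 + y^2 (imaginary part is zero)
    z_squared = z * z                    # (x^2 - y^2) + 2xy i
    x_squared = (z_conj_product + z_squared).real / 2
    y_squared = (z_conj_product - z_squared).real / 2
    return x_squared, y_squared

print(squares_from_packed(3.0, 4.0))
```

One squaring and one multiplication by the conjugate produce both squares, so the packed representation doubles the number of values handled per slot.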

S_{even} = \frac{S_{i}^{'} \overline{S_{i}^{'}} + S_{i}^{'} S_{i}^{'}}{2}

S_{odd} = \frac{S_{i}^{'} \overline{S_{i}^{'}} - S_{i}^{'} S_{i}^{'}}{2}

τ = \frac{2^{N-1}}{2 \times ⌈n⌉_{PO2}}
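
The expression for τ is simple integer arithmetic once its symbols are fixed. The sketch below reads ⌈n⌉_{PO2} as the smallest power of two that is at least n, which is an assumption since the excerpt does not define it. The sample count n = 245 comes from the Findings, while N = 16 is a purely hypothetical value chosen for illustration.

```python
def next_power_of_two(n: int) -> int:
    """Smallest power of two >= n (assumed meaning of the PO2 ceiling)."""
    return 1 << (n - 1).bit_length()

def tau(N: int, n: int) -> int:
    """tau = 2^(N-1) / (2 * next_power_of_two(n)), using integer division."""
    return 2 ** (N - 1) // (2 * next_power_of_two(n))

n_samples = 245        # sample count from the Findings
N_hypothetical = 16    # hypothetical value; the excerpt does not give N

print(next_power_of_two(n_samples))
print(tau(N_hypothetical, n_samples))
```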

Full text

Achieving GWAS with Homomorphic Encryption

Jun Jie Sim