# Private Multi-File Retrieval From Distributed Databases

**Authors:** Zhifang Zhang, Jingke Xu

arXiv: 1704.00250 · 2017-04-12

## TL;DR

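This paper studies private multi-file retrieval (PMFR), in which a user retrieves r of M files from N replicated databases without revealing which files were requested. It proves an upper bound on PMFR capacity, gives a scheme that attains the bound for r ≥ M/2, and needs only N^2 subpackages per file instead of the N^M used by capacity-achieving PIR schemes.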

## Contribution

It proves an upper bound on the capacity of PMFR schemes and proposes a general scheme that attains this bound for r ≥ M/2, while using N^2 subpackages per file rather than the N^M needed by capacity-achieving PIR schemes.

## Key findings

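- The proposed general scheme attains the capacity upper bound for r ≥ M/2, so its communication cost is optimal in that range
- The number of subpackages per file drops from N^M (capacity-achieving PIR schemes) to N^2
- For smaller r, running r independent PIR instances achieves near-optimal communication cost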
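
To give a sense of scale for the subpackage reduction listed above, the following minimal sketch evaluates N^M and N^2 for a few parameter choices. The function name `subpackages_per_file` and the sample values of N and M are illustrative assumptions, not taken from the paper; only the two expressions N^M and N^2 come from the source.

```python
from typing import Tuple


def subpackages_per_file(n_databases: int, m_files: int) -> Tuple[int, int]:
    """Return (N^M, N^2): the per-file subpackage counts quoted in the abstract.

    N^M is the count attributed to capacity-achieving PIR schemes,
    N^2 the count attributed to the paper's PMFR scheme.
    """
    if n_databases < 1 or m_files < 1:
        raise ValueError("N and M must be positive integers")
    return n_databases ** m_files, n_databases ** 2


if __name__ == "__main__":
    # Sample (N, M) pairs chosen only for illustration.
    for n, m in [(2, 10), (3, 10), (4, 20)]:
        pir_count, pmfr_count = subpackages_per_file(n, m)
        print(f"N={n}, M={m}: N^M={pir_count}, N^2={pmfr_count}")
```

The N^2 count does not depend on M, which is why the gap widens quickly as the number of stored files grows.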

## Abstract

Suppose there are $N$ distributed databases each storing a full set of $M$ independent files. A user wants to retrieve $r$ out of the $M$ files without revealing the identity of the $r$ files. When $r=1$ it is the classic problem of private information retrieval (PIR). In this paper we study the problem of private multi-file retrieval (PMFR) which covers the case of general $r$. We first prove an upper bound on the capacity of PMFR schemes which indicates the minimum possible download size per unit of retrieved files. Then we design a general PMFR scheme which happens to attain the upper bound when $r\geq\frac{M}{2}$, thus achieving the optimal communication cost. As $r$ goes down we show the trivial approach of executing $r$ independent PIR instances achieves near-optimal communication cost. Compared with the capacity-achieving PIR schemes, our PMFR scheme reduces the number of subpackages needed for each file from $N^M$ to $N^2$, which implies a great reduction of implementation complexity.
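
The trivial baseline mentioned in the abstract, $r$ independent PIR instances, can be costed with a short sketch. It assumes each instance is a capacity-achieving single-file PIR scheme and uses the well-known PIR capacity expression for replicated databases, under which the download per unit of retrieved file is $1 + \frac{1}{N} + \dots + \frac{1}{N^{M-1}}$. That expression is not stated in this summary, and the function names below are hypothetical. The sketch does not compute the paper's PMFR upper bound, which the abstract does not give in closed form.

```python
from fractions import Fraction


def pir_download_per_unit(n_databases: int, m_files: int) -> Fraction:
    """Download per unit of retrieved file for capacity-achieving PIR.

    Assumption (not from this summary): the known capacity expression
    for N replicated databases and M files, whose reciprocal is
    1 + 1/N + ... + 1/N^(M-1).
    """
    return sum(
        (Fraction(1, n_databases ** k) for k in range(m_files)),
        Fraction(0),
    )


def independent_pir_download(n_databases: int, m_files: int, r_files: int,
                             file_size: int = 1) -> Fraction:
    """Total download when each of the r files is fetched by its own PIR run."""
    return r_files * file_size * pir_download_per_unit(n_databases, m_files)


if __name__ == "__main__":
    # Illustrative parameters only.
    n, m, r = 2, 3, 2
    total = independent_pir_download(n, m, r)
    print(f"N={n}, M={m}, r={r}: total download = {total} file units")
```

Exact fractions are used so the per-unit cost can be compared against other schemes without floating-point rounding.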

---
Source: https://tomesphere.com/paper/1704.00250