The Subset Assignment Problem for Data Placement in Caches
Shahram Ghandeharizadeh, Sandy Irani, Jenny Lam

TL;DR
This paper introduces the subset assignment problem for data placement in caches: assign items, possibly replicated, to subsets of memory banks so as to minimize total cost without exceeding bank capacities, in a setting with a very large number of data objects and a small number of memory banks.
Contribution
It formulates the subset assignment problem, proves its NP-hardness, and provides an efficient LP relaxation solution tailored for small numbers of memory banks.
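As a rough illustration of what such an LP relaxation might look like, here is one natural way to write it. The notation is an assumption made for this sketch and is not taken from the paper: s_i is the size of item i, C_j the capacity of bank j, c_{i,S} the cost of placing item i in subset S of the d banks (with S = ∅ meaning the item is left unassigned), and x_{i,S} the fraction of item i assigned to S.

```latex
\begin{align*}
\text{minimize}   \quad & \sum_{i=1}^{n} \sum_{S \subseteq \{1,\dots,d\}} c_{i,S}\, x_{i,S} \\
\text{subject to} \quad & \sum_{S \subseteq \{1,\dots,d\}} x_{i,S} = 1
                          && \text{for each item } i, \\
                        & \sum_{i=1}^{n} \sum_{S \ni j} s_i\, x_{i,S} \le C_j
                          && \text{for each bank } j, \\
                        & x_{i,S} \ge 0
                          && \text{for all } i, S.
\end{align*}
```

Requiring every x_{i,S} to be 0 or 1 gives back the integral problem. In this form there are n · 2^d variables, which is manageable only because d is small.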
Findings
The LP relaxation can be solved efficiently when d (the number of memory banks) is small
The algorithm runs in near-linear time in n (the number of items) for fixed d
Excluding fractionally assigned items has minimal impact when data objects are small
Abstract
We introduce the subset assignment problem in which items of varying sizes are placed in a set of bins with limited capacity. Items can be replicated and placed in any subset of the bins. Each (item, subset) pair has an associated cost. Not assigning an item to any of the bins is not free in general and can potentially be the most expensive option. The goal is to minimize the total cost of assigning items to subsets without exceeding the bin capacities. This problem is motivated by the design of caching systems composed of banks of memory with varying cost/performance specifications. The ability to replicate a data item in more than one memory bank can benefit the overall performance of the system with a faster recovery time in the event of a memory failure. For this setting, the number of data objects (items) is very large and the number of memory banks (bins) is a small…
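To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the paper's algorithm: it enumerates every item-to-subset mapping and is only usable on toy inputs. The function names, the toy cost function, and the example sizes and capacities are all hypothetical and chosen for illustration.

```python
from itertools import combinations, product


def all_subsets(d):
    """Every subset of bins 0..d-1, including the empty set (item left unassigned)."""
    return [frozenset(c) for r in range(d + 1) for c in combinations(range(d), r)]


def solve_brute_force(sizes, capacities, cost):
    """Try every item -> subset mapping and return (min_cost, assignment).

    sizes[i]      : size of item i
    capacities[j] : capacity of bin j
    cost(i, S)    : cost of storing a copy of item i in every bin of subset S
    """
    subsets = all_subsets(len(capacities))
    min_cost, min_assignment = float("inf"), None
    for assignment in product(subsets, repeat=len(sizes)):
        # A replicated item consumes space in every bin of its subset.
        load = [0] * len(capacities)
        for i, S in enumerate(assignment):
            for j in S:
                load[j] += sizes[i]
        if any(l > cap for l, cap in zip(load, capacities)):
            continue  # some bin capacity is exceeded
        total = sum(cost(i, S) for i, S in enumerate(assignment))
        if total < min_cost:
            min_cost, min_assignment = total, assignment
    return min_cost, min_assignment


def toy_cost(i, S):
    """Hypothetical cost: unassigned is most expensive, more replicas are cheaper."""
    if not S:
        return 10
    return 5 - 2 * len(S)  # 3 for one copy, 1 for two copies


# Two items (sizes 2 and 1) and two bins (capacities 2 and 1).
best_cost, best_assignment = solve_brute_force([2, 1], [2, 1], toy_cost)
print(best_cost, best_assignment)
```

In this toy instance the larger item fits only in bin 0 and fills it, so the smaller item cannot be replicated across both bins even though replication would be cheaper for it in isolation. The search space has (2^d)^n mappings, which is why exhaustive enumeration does not scale to the large item counts described in the abstract.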
