# Planted Hitting Set Recovery in Hypergraphs

**Authors:** Ilya Amburg, Jon Kleinberg, Austin R. Benson

arXiv: 1905.05839 · 2019-05-16

## TL;DR

This paper introduces a theoretical framework and a scalable algorithm for recovering the planted set of core nodes in a hypergraph whose hyperedges each contain at least one core node. On real-world datasets, the algorithm outperforms baselines derived from network centrality and core-periphery measures.

## Contribution

The paper presents the first theoretical analysis and practical algorithm for planted core node recovery in hypergraphs. Both rest on the observation that the core nodes form a hitting set: every hyperedge contains at least one core node.

## Key findings

- The algorithm accurately recovers core nodes in real-world hypergraph datasets.
- It outperforms baseline methods based on network centrality and core-periphery measures.
- The approach is scalable to large hypergraphs.

## Abstract

In various application areas, networked data is collected by measuring interactions involving some specific set of core nodes. This results in a network dataset containing the core nodes along with a potentially much larger set of fringe nodes that all have at least one interaction with a core node. In many settings, this type of data arises for structures that are richer than graphs, because they involve the interactions of larger sets; for example, the core nodes might be a set of individuals under surveillance, where we observe the attendees of meetings involving at least one of the core individuals. We model such scenarios using hypergraphs, and we study the problem of core recovery: if we observe the hypergraph but not the labels of core and fringe nodes, can we recover the "planted" set of core nodes in the hypergraph?

We provide a theoretical framework for analyzing the recovery of such a set of core nodes and use our theory to develop a practical and scalable algorithm for core recovery. The crux of our analysis and algorithm is that the core nodes are a hitting set of the hypergraph, meaning that every hyperedge has at least one node in the set of core nodes. We demonstrate the efficacy of our algorithm on a number of real-world datasets, outperforming competitive baselines derived from network centrality and core-periphery measures.
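The hitting set property described in the abstract can be made concrete with a small sketch. The code below is an illustration only and is not the paper's recovery algorithm: the names `is_hitting_set` and `prune_to_minimal`, the toy `meetings` data, and the choice of pruning order are all assumptions introduced here. It checks whether a candidate set intersects every hyperedge, then shrinks a hitting set to a minimal one by dropping nodes whose removal keeps every hyperedge covered.

```python
from typing import Hashable, Iterable, List, Set

Hyperedge = Set[Hashable]


def is_hitting_set(hyperedges: List[Hyperedge], candidate: Set[Hashable]) -> bool:
    """Return True if every hyperedge shares at least one node with candidate."""
    return all(edge & candidate for edge in hyperedges)


def prune_to_minimal(
    hyperedges: List[Hyperedge],
    candidate: Set[Hashable],
    order: Iterable[Hashable],
) -> Set[Hashable]:
    """Drop nodes in the given order whenever the rest still hits every hyperedge.

    Generic pruning heuristic (an assumption for illustration); the result is
    a minimal hitting set when candidate is itself a hitting set and order
    covers all of its nodes.
    """
    kept = set(candidate)
    for node in order:
        if node in kept and is_hitting_set(hyperedges, kept - {node}):
            kept.discard(node)
    return kept


# Hypothetical toy data: each hyperedge is the attendee set of one meeting.
meetings: List[Hyperedge] = [
    {"a", "x", "y"},
    {"a", "b"},
    {"b", "z"},
    {"c", "x"},
]
planted_core = {"a", "b", "c"}

core_hits_all = is_hitting_set(meetings, planted_core)   # every meeting has a core node
partial_hits_all = is_hitting_set(meetings, {"a", "b"})  # misses the meeting {"c", "x"}

all_nodes = set().union(*meetings)
minimal = prune_to_minimal(meetings, all_nodes, sorted(all_nodes))
```

In this toy example the pruned set is a valid minimal hitting set but differs from `planted_core`, which shows that the hitting set property alone does not pin down the core and that recovery needs further analysis.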

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.05839/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/1905.05839/full.md

## References

54 references — full list in the complete paper: https://tomesphere.com/paper/1905.05839/full.md

---
Source: https://tomesphere.com/paper/1905.05839