# Sparse One-Time Grab Sampling of Inliers

**Authors:** Maryam Jaberi, Marianna Pensky, Hassan Foroosh

arXiv: 1901.02338 · 2019-01-09

## TL;DR

This paper introduces a 'one-time-grab' sampling algorithm that draws, in a single random grab, the smallest sample from a large dataset containing multiple structures and gross outliers that still covers every structure with high probability.

## Contribution

It proposes a sampling method that, rather than maximizing the probability of sampling inliers, minimizes the number of samples needed to instantiate every underlying structure in a large dataset, even when gross outliers are present.

## Key findings

- Reduces sample size needed for structure detection
- Guarantees coverage of all structures with high probability
- Applicable as a front end to various clustering methods

## Abstract

Estimating structures in "big data" and clustering them are among the most fundamental problems in computer vision, pattern recognition, data mining, and many other research fields. Over the past few decades, many studies have focused on different aspects of these problems. One of the main approaches explored in the literature to tackle the problems of size and dimensionality is sampling subsets of the data in order to estimate the characteristics of the whole population, e.g. estimating the underlying clusters or structures in the data. In this paper, we propose a 'one-time-grab' sampling algorithm [jaberi2015swift, jaberi2018sparse]. This method can be used as the front end to any supervised or unsupervised clustering method. Rather than focusing on the strategy of maximizing the probability of sampling inliers, our goal is to minimize the number of samples needed to instantiate all underlying model instances. More specifically, our goal is to answer the following question: *"Given a very large population of points with $C$ embedded structures and gross outliers, what is the minimum number of points $r$ to be selected randomly in one grab in order to make sure with probability $P$ that at least $\varepsilon$ points are selected on each structure, where $\varepsilon$ is the number of degrees of freedom of each structure?"*
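
The question posed in the abstract can be explored numerically. The following Python sketch is not the paper's derivation. It assumes the size of each structure is known, models the one-time grab as sampling without replacement (so the number of points landing on a structure is hypergeometric), and combines the per-structure miss probabilities with a union bound, which yields a conservative lower bound on the coverage probability. The function names and the example sizes are hypothetical.

```python
from math import comb


def prob_fewer_than(eps, n_struct, n_total, r):
    """P(X < eps) for X ~ Hypergeometric(n_total, n_struct, r).

    X counts how many of the r grabbed points fall on a structure
    that holds n_struct of the n_total points.
    """
    denom = comb(n_total, r)
    # Sum the probability mass for k = 0 .. min(eps - 1, r) points on the structure.
    num = sum(
        comb(n_struct, k) * comb(n_total - n_struct, r - k)
        for k in range(min(eps, r + 1))
    )
    return num / denom


def coverage_lower_bound(struct_sizes, n_total, eps, r):
    """Union-bound lower bound on P(every structure receives >= eps points)."""
    miss = sum(prob_fewer_than(eps, n, n_total, r) for n in struct_sizes)
    return max(0.0, 1.0 - miss)


def min_grab_size(struct_sizes, n_total, eps, target_p):
    """Smallest r whose lower bound reaches target_p, or None if no r does.

    Assumption: a plain linear scan, adequate for small populations only.
    """
    # At least eps points per structure are needed, so start there.
    for r in range(eps * len(struct_sizes), n_total + 1):
        if coverage_lower_bound(struct_sizes, n_total, eps, r) >= target_p:
            return r
    return None


if __name__ == "__main__":
    # Hypothetical population: 1000 points, two structures of 200 and 300
    # points, the remaining 500 points are outliers; eps = 2 (e.g. lines in 2D).
    print(min_grab_size([200, 300], 1000, eps=2, target_p=0.99))
```

Because the union bound ignores the dependence between structures, the returned $r$ can be somewhat larger than the true minimum. The bound is non-decreasing in $r$, so the linear scan could be replaced by a binary search for large populations.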

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.02338/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/1901.02338/full.md

## References

5 references — full list in the complete paper: https://tomesphere.com/paper/1901.02338/full.md

---
Source: https://tomesphere.com/paper/1901.02338