# Beyond Moments: Robustly Learning Affine Transformations with Asymptotically Optimal Error

**Authors:** He Jia, Pravesh K. Kothari, Santosh S. Vempala

arXiv: 2302.12289 · 2023-02-27

## TL;DR

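This paper introduces a polynomial-time algorithm for robustly learning an affine transformation of the hypercube from $\epsilon$-corrupted samples. Its output is within total variation distance $O(\epsilon)$ of the true distribution, which is optimal up to a constant factor and improves on the $O(d\epsilon)$ bound implied by prior work.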

## Contribution

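The authors replace the classical method of moments with a new geometric certificate of correctness for an affine transformation, together with an iterative procedure that improves the current estimate whenever the certificate's requirements are not met. The approach is new even for outlier-free ICA.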

## Key findings

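- Constructs $\hat{A}, \hat{s}$ whose induced distribution $\hat{D}$ is within total variation distance $O(\epsilon)$ of $D$, using poly$(d)$ time and samples.
- The guarantee is optimal up to the absolute constant factor multiplying $\epsilon$.
- With the columns of $A$ normalized to unit length, the guarantee implies $\sum_{i=1}^d \|a_i-\hat{a}_i\|_2 = O(\epsilon)$. The strongest prior results give only an $\epsilon^{O(1)}$ (relative) bound on each column's error, which translates into an $O(d\epsilon)$ bound on the total variation distance.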
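
To make the total variation guarantee concrete, the following is a minimal Monte Carlo sketch, not the paper's algorithm. It estimates the total variation distance between the uniform distributions on two affine images of $[-1,1]^d$, using the identity $\mathrm{TV} = 1 - \mathrm{vol}(P \cap \hat{P}) / \max(\mathrm{vol}(P), \mathrm{vol}(\hat{P}))$ for uniform distributions on sets $P$ and $\hat{P}$. The function name `tv_uniform_affine_cubes` and its parameters are illustrative assumptions, and both matrices are assumed to be square and invertible.

```python
import numpy as np

def tv_uniform_affine_cubes(A, s, A_hat, s_hat, n=200_000, seed=0):
    """Monte Carlo estimate of the TV distance between the uniform
    distributions on P = A[-1,1]^d + s and P_hat = A_hat[-1,1]^d + s_hat.

    Assumes A and A_hat are invertible d x d matrices (hypothetical helper,
    not taken from the paper).
    """
    rng = np.random.default_rng(seed)
    d = A.shape[0]
    # Draw points uniformly from P by pushing cube samples through x -> Ax + s.
    x = rng.uniform(-1.0, 1.0, size=(n, d))
    y = x @ A.T + s
    # Pull each point back through the estimated map; it lies in P_hat
    # exactly when every pulled-back coordinate is in [-1, 1].
    z = np.linalg.solve(A_hat, (y - s_hat).T).T
    frac_inside = np.mean(np.all(np.abs(z) <= 1.0, axis=1))
    # vol(P) = 2^d |det A|, so the 2^d factors cancel in the ratio.
    det, det_hat = abs(np.linalg.det(A)), abs(np.linalg.det(A_hat))
    return 1.0 - frac_inside * det / max(det, det_hat)
```

For example, in one dimension with $A = [1]$, $\hat{A} = [2]$ and zero shifts, every sample from $[-1,1]$ lies in $[-2,2]$, so the estimate is $1 - 1/2 = 0.5$.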

## Abstract

We present a polynomial-time algorithm for robustly learning an unknown affine transformation of the standard hypercube from samples, an important and well-studied setting for independent component analysis (ICA). Specifically, given an $\epsilon$-corrupted sample from a distribution $D$ obtained by applying an unknown affine transformation $x \mapsto Ax+s$ to the uniform distribution on a $d$-dimensional hypercube $[-1,1]^d$, our algorithm constructs $\hat{A}, \hat{s}$ such that the total variation distance of the induced distribution $\hat{D}$ from $D$ is $O(\epsilon)$ using poly$(d)$ time and samples. Total variation distance is the information-theoretically strongest possible notion of distance in our setting and our recovery guarantees in this distance are optimal up to the absolute constant factor multiplying $\epsilon$. In particular, if the columns of $A$ are normalized to be unit length, our total variation distance guarantee implies a bound on the sum of the $\ell_2$ distances between the column vectors of $A$ and $\hat{A}$, $\sum_{i =1}^d \|a_i-\hat{a}_i\|_2 = O(\epsilon)$. In contrast, the strongest known prior results only yield an $\epsilon^{O(1)}$ (relative) bound on the distance between individual $a_i$'s and their estimates and translate into an $O(d\epsilon)$ bound on the total variation distance. Our key innovation is a new approach to ICA (even to outlier-free ICA) that circumvents the difficulties in the classical method of moments and instead relies on a new geometric certificate of correctness of an affine transformation. Our algorithm is based on a new method that iteratively improves an estimate of the unknown affine transformation whenever the requirements of the certificate are not met.
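
The input model in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the helper names `sample_corrupted` and `column_error` are hypothetical, and the "adversary" here simply overwrites a random $\epsilon$-fraction of the samples with far-away Gaussian points, whereas the corruption model allows those samples to be replaced arbitrarily. `column_error` also assumes the columns of $\hat{A}$ are already matched to those of $A$ in order and sign.

```python
import numpy as np

def sample_corrupted(A, s, n, eps, seed=0):
    """Draw n samples of y = Ax + s with x uniform on [-1,1]^d, then
    overwrite floor(eps * n) of them (hypothetical, simplistic adversary)."""
    rng = np.random.default_rng(seed)
    d = A.shape[1]
    x = rng.uniform(-1.0, 1.0, size=(n, d))
    y = x @ A.T + s
    k = int(np.floor(eps * n))
    idx = rng.choice(n, size=k, replace=False)
    # Assumption: outliers are far-away Gaussian points; a real adversary
    # may choose the replaced samples arbitrarily.
    y[idx] = 10.0 * rng.standard_normal((k, y.shape[1]))
    return y, idx

def column_error(A, A_hat):
    """Sum over i of ||a_i - a_hat_i||_2 after scaling columns to unit length.
    Assumes columns are already aligned in order and sign."""
    A_n = A / np.linalg.norm(A, axis=0)
    A_hat_n = A_hat / np.linalg.norm(A_hat, axis=0)
    return float(np.sum(np.linalg.norm(A_n - A_hat_n, axis=0)))
```

A robust learner would take the returned samples (without `idx`) and output $\hat{A}, \hat{s}$; `column_error(A, A_hat)` then measures the quantity that the abstract bounds by $O(\epsilon)$.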

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.12289/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/2302.12289/full.md

## References

19 references — full list in the complete paper: https://tomesphere.com/paper/2302.12289/full.md

---
Source: https://tomesphere.com/paper/2302.12289