Frank-Wolfe Algorithm for the Exact Sparse Problem
Farah Cherfaoui (QARMA), Valentin Emiya (QARMA), Liva Ralaivola (QARMA), Sandrine Anthoine (I2M)

TL;DR
This paper analyzes the Frank-Wolfe algorithm's effectiveness for the Exact Sparse reconstruction problem, showing it efficiently identifies support atoms and converges exponentially fast under quasi-incoherent dictionaries.
Contribution
It provides theoretical guarantees for support recovery and convergence speed of the Frank-Wolfe algorithm in sparse reconstruction with quasi-incoherent dictionaries.
Findings
At each iteration, the selected atom belongs to the support.
Exponential convergence is achieved beyond a certain iteration.
Algorithm performance depends on dictionary incoherence.
Abstract
In this paper, we study the properties of the Frank-Wolfe algorithm to solve the Exact Sparse reconstruction problem. We prove that when the dictionary is quasi-incoherent, at each iteration, the Frank-Wolfe algorithm picks up an atom indexed by the support. We also prove that when the dictionary is quasi-incoherent, there exists an iteration beyond which the algorithm converges exponentially fast.
Farah Cherfaoui¹, Valentin Emiya¹, Liva Ralaivola¹ and Sandrine Anthoine²
¹ Aix Marseille Univ, Université de Toulon, CNRS, LIS, Marseille, France
² Aix Marseille Univ, CNRS, Centrale Marseille, I2M, Marseille, France
This work was supported by the Agence Nationale de la Recherche under grant JCJC MAD (ANR-14-CE27-0002).
1 Introduction
Given a dictionary containing a large number of atoms, the sparse signal approximation problem consists of constructing the best linear combination of a small number of atoms to approximate a given signal. Sparse signal reconstruction is a sub-problem of sparse signal approximation: here, we suppose that the given signal has an exact representation with $m$ or fewer atoms from this dictionary. We say that the signal is $m$-sparse. This subset of atoms is indexed by a set called the support. In this paper, we only consider the sparse signal reconstruction problem, which is called the Exact Sparse problem.
Several algorithms have been developed to solve or approximate the Exact Sparse problem. The Matching Pursuit algorithm (MP) [6] and the Orthogonal Matching Pursuit algorithm (OMP) [7] are two fundamental greedy algorithms used for solving this problem. Tropp [8] and Gribonval and Vandergheynst [3] proved that, if the dictionary is quasi-incoherent, then at each iteration the MP and OMP algorithms pick up an atom indexed by the support. They also proved that these two algorithms converge exponentially fast. In fact, Tropp [8] demonstrates that OMP converges after exactly $m$ iterations, where $m$ is the size of the support. In this paper, we study the properties of the Frank-Wolfe algorithm [2] for solving the Exact Sparse problem. The Frank-Wolfe algorithm is an iterative algorithm designed for constrained convex optimization. It has been proven to converge exponentially if the objective function is strongly convex [4], and at the slower rate $O(1/k)$ in the general case [2]. The atom selection steps in Matching Pursuit and Frank-Wolfe are very similar. This inspired, for example, Locatello et al. [5] to use the Frank-Wolfe algorithm to prove the convergence of the MP algorithm when no conditions are imposed on the dictionary.
In this paper, we use the MP algorithm to prove that the Frank-Wolfe algorithm can have the same recovery and convergence properties as MP. We prove that when the dictionary is quasi-incoherent, the Frank-Wolfe algorithm picks up only atoms indexed by the support. Also, we prove that when the dictionary is quasi-incoherent, the Frank-Wolfe algorithm converges exponentially from a certain iteration even though the function we consider is not strongly convex.
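For reference, the MP iteration that the comparison above relies on can be sketched as follows. This is a minimal illustration, assuming real-valued data and a dictionary stored as a matrix `Phi` whose columns are unit-norm atoms; the function name `matching_pursuit` and the toy data are chosen here for illustration and do not come from the paper.

```python
import numpy as np

def matching_pursuit(Phi, y, n_iter):
    """Plain Matching Pursuit, assuming the columns of Phi have unit norm."""
    x = np.zeros(Phi.shape[1])
    r = y.astype(float).copy()              # residual, initially the signal itself
    for _ in range(n_iter):
        corr = Phi.T @ r                    # correlation of every atom with the residual
        i = int(np.argmax(np.abs(corr)))    # atom selection: largest absolute correlation
        x[i] += corr[i]                     # update the coefficient of the selected atom
        r -= corr[i] * Phi[:, i]            # remove its contribution from the residual
    return x, r

# Toy example: orthonormal dictionary, 2-sparse signal.
Phi = np.eye(3)
y = np.array([1.0, -1.0, 0.0])
x_mp, r_mp = matching_pursuit(Phi, y, n_iter=2)
```

The selection rule (largest absolute correlation with the residual) is the step that Frank-Wolfe on an $\ell_1$ ball shares with MP; the two methods differ in how the coefficients are updated afterwards.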
2 The problem and the algorithm
2.1 The Exact Sparse problem
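For any vector $x$, we denote by $x(i)$ its $i$-th coordinate. The support of $x$ is the set of indices of its nonzero coefficients:
$$\mathrm{supp}(x) = \{\, i : x(i) \neq 0 \,\}.$$
Fix a dictionary $\Phi$ of unit-norm vectors (the atoms). Assume that $y$ is $m$-sparse; then the Exact Sparse problem is to find:
$$\arg\min_{x} \|x\|_0 \quad \text{such that} \quad y = \Phi x,$$
where the pseudo-norm $\|\cdot\|_0$ counts the number of nonzero components in its argument. This problem has been proven to be NP-hard [1] and has been tackled essentially with two kinds of approaches. The first one is the local approach, using a greedy algorithm like MP or OMP. The second approach is a global one, where one relaxes the problem. A popular choice is the $\ell_1$ relaxation:
$$\arg\min_{x} \tfrac{1}{2}\|y - \Phi x\|_2^2 \quad \text{such that} \quad \|x\|_1 \le \beta, \tag{1}$$
where $\|\cdot\|_1$ is the $\ell_1$ norm and $\beta > 0$ is the radius of the constraint set.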
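The quantities used in this section are easy to compute numerically. The snippet below is a small sketch, assuming real-valued vectors stored as NumPy arrays; the helper names `support`, `l0_norm` and `l1_norm` are hypothetical and only serve the illustration.

```python
import numpy as np

def support(x):
    """Set of indices of the nonzero coefficients of x."""
    return set(np.flatnonzero(x).tolist())

def l0_norm(x):
    """Pseudo-norm: number of nonzero components of x."""
    return len(support(x))

def l1_norm(x):
    """Sum of the absolute values of the components of x."""
    return float(np.sum(np.abs(x)))

x = np.array([0.0, 1.5, 0.0, -2.0])
print(support(x))   # {1, 3}
print(l0_norm(x))   # 2
print(l1_norm(x))   # 3.5
```

Here `x` is 2-sparse, and any radius of at least 3.5 makes it feasible for the $\ell_1$-constrained relaxation.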
We present, in the next parts, the Frank-Wolfe algorithm [2] for the Exact Sparse problem, and then the recovery properties and convergence rate of this algorithm.
2.2 The Frank-Wolfe algorithm
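The Frank-Wolfe algorithm solves the optimization problem
$$\min_{x \in \mathcal{C}} f(x),$$
where $f$ is a convex and continuously differentiable function and $\mathcal{C}$ is a compact and convex set. In the original version of the Frank-Wolfe algorithm, each iterate $x_{k+1}$ is defined as a convex combination of $x_k$ and $s_k$, with $s_k = \arg\min_{s \in \mathcal{C}} \langle s, \nabla f(x_k) \rangle$.
In the case of the $\ell_1$ relaxation of the Exact Sparse problem (Equation (1)), $f(x) = \tfrac{1}{2}\|y - \Phi x\|_2^2$ and $\mathcal{C}$ is the $\ell_1$ ball of radius $\beta$. Noting that $\nabla f(x) = -\Phi^\top (y - \Phi x)$ and that a linear function attains its minimum over $\mathcal{C}$ at one of the vertices $\pm \beta e_i$, we obtain that $s_k$ can be calculated as in lines 4 and 5 of Algorithm 1. Note also that we initialize $x_0$ to zero (line 1) and that we select the convex combination parameter $\gamma_k$ as in line 6.
In the analysis of Algorithm 1, we use the residual $r_k = y - \Phi x_k$, whose squared norm gives the minimized objective function: $f(x_k) = \tfrac{1}{2}\|r_k\|_2^2$.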
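The steps described above (zero initialization, atom selection through the gradient, convex combination) can be sketched in Python as follows. This is an illustrative reading rather than a transcription of Algorithm 1: it assumes real-valued data, the objective $\tfrac{1}{2}\|y - \Phi x\|_2^2$, and an exact line search clipped to $[0, 1]$ for the convex combination parameter. The function name `frank_wolfe_l1` and the toy data are hypothetical.

```python
import numpy as np

def frank_wolfe_l1(Phi, y, beta, n_iter=100):
    """Frank-Wolfe sketch for min 0.5*||y - Phi x||^2 subject to ||x||_1 <= beta.

    Returns the last iterate and the list of atom indices selected along the way.
    """
    n_atoms = Phi.shape[1]
    x = np.zeros(n_atoms)                    # start from the zero vector
    selected = []
    for _ in range(n_iter):
        r = y - Phi @ x                      # residual
        corr = Phi.T @ r                     # equals minus the gradient of the objective at x
        i = int(np.argmax(np.abs(corr)))     # atom most correlated with the residual
        if corr[i] == 0.0:                   # zero gradient: x is already optimal
            break
        s = np.zeros(n_atoms)
        s[i] = beta * np.sign(corr[i])       # vertex of the l1 ball minimizing <s, gradient>
        d = Phi @ (s - x)                    # image of the search direction s - x
        denom = float(d @ d)
        if denom == 0.0:
            break
        # Assumption: exact line search on the quadratic objective, clipped to [0, 1].
        gamma = min(1.0, max(0.0, float(d @ r) / denom))
        x = x + gamma * (s - x)              # convex combination of x and s
        selected.append(i)
    return x, selected

# Toy example: orthonormal dictionary (coherence 0), 2-sparse signal, beta > ||x*||_1 = 2.
Phi = np.eye(3)
y = np.array([1.0, -1.0, 0.0])
x_hat, selected = frank_wolfe_l1(Phi, y, beta=3.0, n_iter=50)
residual_norm = np.linalg.norm(y - Phi @ x_hat)
```

In this toy run every selected index lies in the support $\{0, 1\}$, since the third atom is never correlated with the residual. Returning `selected` makes it easy to check this kind of support property on other dictionaries.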
3 Recovery property and convergence rate
For a dictionary $\Phi$, we denote by $\mu$ the coherence of $\Phi$ and by $\mu_1$ the Babel function. These two quantities measure how much the elements of the dictionary look alike. More details can be found in [8].
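Both quantities can be computed from the Gram matrix of the dictionary. The sketch below follows the definitions given by Tropp [8]: the coherence is $\mu = \max_{i \neq j} |\langle \varphi_i, \varphi_j \rangle|$, and the Babel function is $\mu_1(m) = \max_{|\Lambda| = m} \max_{j \notin \Lambda} \sum_{i \in \Lambda} |\langle \varphi_i, \varphi_j \rangle|$. It assumes a real-valued dictionary with unit-norm columns; the function names and the example dictionary are hypothetical.

```python
import numpy as np

def coherence(Phi):
    """Largest absolute inner product between two distinct atoms."""
    G = np.abs(Phi.T @ Phi)
    np.fill_diagonal(G, 0.0)                 # ignore the inner product of an atom with itself
    return float(G.max())

def babel(Phi, m):
    """Babel function mu_1(m): for each atom, sum its m largest absolute
    inner products with the other atoms, then take the maximum over atoms."""
    G = np.abs(Phi.T @ Phi)
    np.fill_diagonal(G, 0.0)
    top_m = -np.sort(-G, axis=0)[:m, :]      # m largest entries of each column
    return float(top_m.sum(axis=0).max())

# Three unit-norm atoms in the plane: two canonical vectors and their normalized sum.
a = 1.0 / np.sqrt(2.0)
Phi = np.array([[1.0, 0.0, a],
                [0.0, 1.0, a]])
mu = coherence(Phi)
mu1_of_2 = babel(Phi, 2)
```

For this dictionary, `mu` equals $1/\sqrt{2}$ and `mu1_of_2` equals $\sqrt{2}$, which illustrates the general bound $\mu_1(m) \le m\,\mu$.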
In this section we present our major results. Theorem 1 gives the recovery property for the Frank-Wolfe algorithm. We prove that when the dictionary is quasi-incoherent (i.e. ), the Frank-Wolfe algorithm reconstructs every -sparse signal. Theorem 2 shows that when the dictionary is quasi-incoherent, the Frank-Wolfe algorithm converges exponentially. We recall that a sequence converges exponentially if: , with .
Theorem 1.
Let be a dictionary, its coherence, and a -sparse signal (i.e. ).
If , then at each iteration, Algorithm 1 picks up a correct atom, i.e. , .
Sketch of proof.
The proof of this theorem is very similar to the proof of Theorem 3.1 in [8]. ∎
Theorem 2.
Let be a dictionary, its coherence, and a -sparse signal (i.e. ).
If and , then there exists a such that for all iterations of Algorithm 1, we have:
[TABLE]
where .
Sketch of proof.
The general idea of the proof can be summarized as follows. The first step will be to prove that if the dictionary is quasi-incoherent, then the step chosen in line 6 of Algorithm 1 is in . A consequence of this is that:
[TABLE]
We can then write the expression of :
[TABLE]
which yields using Eq. (3):
[TABLE]
The second step is to bound . Using Theorem 1, we can show that the sequence of is bounded by the sequence . Since the sequence converges to zero, the sequence of also converges to zero. Therefore, there exists an iteration such that for all : where is the ball centered at and of radius . As a result, . Since , we have .
By definition of :
[TABLE]
Noting that , one obtains
[TABLE]
By Theorem 1, lies in the linear span of atoms indexed by . Since we assume that these atoms are linearly independent, we have
[TABLE]
where is the matrix whose columns are the atoms indexed by and its smallest singular value. So, by Lemma 2.3 of [8], and we obtain:
[TABLE]
Finally, we show that using the fact that since the are of unit norm. ∎
Note that Tropp in [8] has already proved that if the dictionary is incoherent, then . As a result, is in . We also have that because . Finally, since is greater than , we have that is in . We conclude that Theorem 2 gives the exponential convergence rate of the residual norm. As , this implies that this theorem also gives the exponential convergence rate of the objective function beyond a certain iteration.
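To make the link between a per-iteration bound and exponential convergence explicit, here is a short worked derivation. It assumes that the bound of Theorem 2 is a one-step contraction of the squared residual norm by a factor $1 - \theta$ with $\theta \in (0, 1)$, valid from some iteration $K$ on, and that $f(x_k) = \tfrac{1}{2}\|r_k\|_2^2$. The symbols $r_k$, $\theta$ and $K$ are notation chosen for this sketch.

```latex
\begin{align*}
\|r_{k+1}\|_2^2 &\le (1-\theta)\,\|r_k\|_2^2
  && \text{for all } k \ge K \\
\Longrightarrow\quad \|r_k\|_2^2 &\le (1-\theta)^{k-K}\,\|r_K\|_2^2
  && \text{for all } k \ge K \text{ (induction on } k\text{)} \\
\Longrightarrow\quad f(x_k) = \tfrac{1}{2}\|r_k\|_2^2 &\le (1-\theta)^{k-K}\, f(x_K)
  && \text{for all } k \ge K .
\end{align*}
```

Under these assumptions, both the squared residual norm and the objective value decay geometrically once iteration $K$ is reached.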
It is possible to guarantee an exponential convergence from the first iteration if is big enough. Lemma 1 gives a lower bound of to obtain this result.
Lemma 1.
Let be a dictionary of coherence , a -sparse signal (i.e. ) and . If
[TABLE]
then Algorithm 1 converges exponentially from the first iteration. Here, is the matrix whose columns are the atoms indexed by .
We proved in Theorem 2 that when the iterates enter the ball , the Frank-Wolfe algorithm converges exponentially. The intuition of this lemma is to grow the value of compared to (then also grows). This implies that the iterates enter the ball earlier and the exponential convergence starts earlier.
References
- [1] Geoff Davis, Stephane Mallat, and Marco Avellaneda. Adaptive greedy approximations. Constructive Approximation, 13(1):57–98, 1997.
- [2] Marguerite Frank and Philip Wolfe. An algorithm for quadratic programming. Naval Research Logistics (NRL), 3(1-2):95–110, 1956.
- [3] Rémi Gribonval and Pierre Vandergheynst. On the exponential convergence of matching pursuits in quasi-incoherent dictionaries. IEEE Transactions on Information Theory, 52(1):255–261, 2006.
- [4] Jacques Guélat and Patrice Marcotte. Some comments on Wolfe’s ‘away step’. Mathematical Programming, 35(1):110–119, 1986.
- [5] Francesco Locatello, Rajiv Khanna, Michael Tschannen, and Martin Jaggi. A unified optimization view on generalized matching pursuit and Frank-Wolfe. arXiv preprint arXiv:1702.06457, 2017.
- [6] Stéphane G Mallat and Zhifeng Zhang. Matching pursuits with time-frequency dictionaries. IEEE Transactions on Signal Processing, 41(12):3397–3415, 1993.
- [7] Yagyensh Chandra Pati, Ramin Rezaiifar, and Perinkulam Sambamurthy Krishnaprasad. Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition. In Conference Record of the Twenty-Seventh Asilomar Conference on Signals, Systems and Computers, pages 40–44, 1993.
- [8] Joel A Tropp. Greed is good: Algorithmic results for sparse approximation. IEEE Transactions on Information Theory, 50(10):2231–2242, 2004.
