CURE: Curvature Regularization For Missing Data Recovery
Bin Dong, Haocheng Ju, Yiping Lu, Zuoqiang Shi

TL;DR
This paper introduces CURE, a novel regularization combining low-dimensional manifold constraints with curvature smoothness, improving missing data recovery in imaging tasks.
Contribution
The paper proposes CURE, a new regularization method that combines the low dimensionality of the data manifold with curvature smoothness, improving image inpainting and semi-supervised learning.
Findings
CURE outperforms LDMM in image inpainting.
WeCURE improves semi-supervised learning results.
Numerical experiments validate the effectiveness of the proposed methods.
Abstract
Missing data recovery is an important and yet challenging problem in imaging and data science. Successful models often adopt certain carefully chosen regularization. Recently, the low dimensional manifold model (LDMM) was introduced by S. Osher et al. and shown to be effective in image inpainting. They observed that enforcing low dimensionality on the image patch manifold serves as a good image regularizer. In this paper, we observe that the low dimensional manifold regularization alone is sometimes not enough, and that smoothness is needed as well. For that, we introduce a new regularization by combining the low dimensional manifold regularization with a higher order Curvature Regularization, and we call this new regularization CURE for short. The key step of solving CURE is to solve a biharmonic equation on a manifold. We further introduce a weighted version of CURE, called WeCURE, in a similar manner…
| Method | COIL20 (2) | COIL20 (5) | COIL20 (10) | ISOLET (2) | ISOLET (5) | ISOLET (10) |
|---|---|---|---|---|---|---|
| GL | 55.61 | 68.50 | 76.11 | 31.19 | 45.51 | 66.27 |
| WNLL[42] | 59.59 | 74.13 | 80.65 | 49.12 | 61.90 | 73.05 |
| CURE | 59.73 | 74.77 | 80.91 | 49.14 | 61.94 | 73.23 |
| WeCURE | 63.29 | 77.65 | 84.76 | 52.65 | 64.92 | 76.50 |
| Images | C.man | House | Peppers | Starfish | Monarch | Airplane | Parrot | Lena | Barbara | Boat | Man | Couple | Average |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sample rate: 10% | | | | | | | | | | | | | |
| LDMM | 19.9329 | 24.8723 | 20.6103 | 19.9285 | 19.3395 | 19.9612 | 19.5449 | 26.1005 | 23.3176 | 22.6681 | 23.9415 | 22.7225 | 21.9117 |
| WNLL | 21.9993 | 28.3325 | 23.3210 | 22.2705 | 22.4218 | 21.7954 | 21.6121 | 28.5089 | 26.3732 | 24.8116 | 25.8126 | 25.0263 | 24.3571 |
| CURE | 21.7095 | 28.3023 | 23.3315 | 22.0185 | 22.0650 | 21.4078 | 21.5080 | 28.3013 | 26.3031 | 24.6798 | 25.7207 | 24.9033 | 24.1876 |
| WeCURE | 21.8571 | 28.7967 | 23.7416 | 22.3540 | 22.5829 | 21.4335 | 21.7753 | 28.7926 | 26.7155 | 25.0060 | 25.7145 | 25.1940 | 24.4970 |
| Sample rate: 15% | | | | | | | | | | | | | |
| LDMM | 21.0948 | 26.4075 | 21.6434 | 20.9887 | 20.9843 | 21.0712 | 21.3412 | 27.7591 | 25.6175 | 23.8791 | 25.1269 | 24.0065 | 23.3267 |
| WNLL | 23.3052 | 29.1647 | 25.0635 | 23.5147 | 23.7171 | 22.7292 | 22.5851 | 29.5856 | 27.7837 | 25.8633 | 26.9433 | 26.2245 | 25.5400 |
| CURE | 22.8514 | 29.5745 | 25.1007 | 23.4509 | 23.8326 | 22.5211 | 22.4579 | 29.6253 | 27.7315 | 25.7653 | 26.9278 | 26.1798 | 25.5016 |
| WeCURE | 23.0993 | 30.9540 | 25.7840 | 24.0722 | 24.2587 | 22.8246 | 22.8708 | 30.1331 | 28.5615 | 26.2943 | 27.3484 | 26.7266 | 26.0773 |
| Sample rate: 20% | | | | | | | | | | | | | |
| LDMM | 21.9057 | 28.2924 | 22.7767 | 22.6264 | 22.4175 | 22.1073 | 21.9409 | 28.9160 | 26.8121 | 24.8777 | 26.2350 | 25.0044 | 24.4927 |
| WNLL | 23.9478 | 30.8222 | 25.8068 | 24.5382 | 24.6738 | 23.8359 | 23.2844 | 30.5140 | 28.7357 | 26.6614 | 27.7806 | 26.7532 | 26.4462 |
| CURE | 23.7846 | 31.4606 | 25.7513 | 24.7232 | 24.8360 | 23.7147 | 23.5282 | 30.6271 | 28.9715 | 26.6736 | 27.8198 | 26.8165 | 26.5589 |
| WeCURE | 24.5007 | 32.1789 | 26.6428 | 25.3982 | 25.5151 | 24.1406 | 24.0625 | 31.3711 | 29.7794 | 27.3033 | 28.3473 | 27.4934 | 27.2278 |
| Images | C.man | House | Peppers | Starfish | Monarch | Airplane | Parrot | Lena | Barbara | Boat | Man | Couple | Average |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sample rate: 10% | | | | | | | | | | | | | |
| LDMM | 0.2677 | 0.3406 | 0.4406 | 0.3856 | 0.4870 | 0.3338 | 0.4560 | 0.4508 | 0.4881 | 0.3121 | 0.3469 | 0.3389 | 0.3874 |
| WNLL | 0.3557 | 0.4236 | 0.5681 | 0.5415 | 0.6523 | 0.4352 | 0.5680 | 0.5316 | 0.6308 | 0.4383 | 0.4787 | 0.5123 | 0.5113 |
| CURE | 0.3591 | 0.4337 | 0.5849 | 0.5382 | 0.6537 | 0.4324 | 0.5733 | 0.5356 | 0.6392 | 0.4409 | 0.4817 | 0.5240 | 0.5164 |
| WeCURE | 0.3726 | 0.4397 | 0.6042 | 0.5721 | 0.6842 | 0.4448 | 0.5953 | 0.5402 | 0.6572 | 0.4628 | 0.5051 | 0.5476 | 0.5355 |
| Sample rate: 15% | | | | | | | | | | | | | |
| LDMM | 0.3622 | 0.4288 | 0.5308 | 0.4848 | 0.5986 | 0.4252 | 0.5464 | 0.5382 | 0.6164 | 0.4187 | 0.4483 | 0.4619 | 0.4884 |
| WNLL | 0.4456 | 0.5053 | 0.6380 | 0.6196 | 0.7076 | 0.5052 | 0.6247 | 0.5931 | 0.6964 | 0.5130 | 0.5544 | 0.5911 | 0.5828 |
| CURE | 0.4464 | 0.5294 | 0.6610 | 0.6294 | 0.7299 | 0.5115 | 0.6435 | 0.5994 | 0.7068 | 0.5226 | 0.5637 | 0.6067 | 0.5959 |
| WeCURE | 0.4577 | 0.5459 | 0.6766 | 0.6658 | 0.7473 | 0.5273 | 0.6621 | 0.6102 | 0.7275 | 0.5462 | 0.5939 | 0.6308 | 0.6159 |
| Sample rate: 20% | | | | | | | | | | | | | |
| LDMM | 0.4385 | 0.5148 | 0.5980 | 0.5783 | 0.6692 | 0.5003 | 0.6074 | 0.5997 | 0.6840 | 0.5003 | 0.5295 | 0.5501 | 0.5642 |
| WNLL | 0.4970 | 0.5735 | 0.6856 | 0.6691 | 0.7439 | 0.5684 | 0.6673 | 0.6376 | 0.7373 | 0.5722 | 0.6062 | 0.6364 | 0.6329 |
| CURE | 0.5063 | 0.6044 | 0.7051 | 0.6889 | 0.7687 | 0.5847 | 0.6850 | 0.6457 | 0.7515 | 0.5882 | 0.6203 | 0.6571 | 0.6505 |
| WeCURE | 0.5270 | 0.6167 | 0.7241 | 0.7214 | 0.7859 | 0.6009 | 0.7017 | 0.6570 | 0.7683 | 0.6093 | 0.6492 | 0.6806 | 0.6702 |
Taxonomy
Topics: Numerical methods in inverse problems · Image and Signal Denoising Methods · Sparse and Compressive Sensing Techniques
Bin Dong, Beijing International Center for Mathematical Research, Peking University, Beijing, 100871 China.
Haocheng Ju, School of Mathematical Sciences, Peking University, Beijing, 100871 China.
Yiping Lu, Institute for Computational and Mathematical Engineering (ICME), Stanford University, Stanford, CA, 94305.
Zuoqiang Shi, Department of Mathematical Sciences, Yau Mathematical Sciences Center, Tsinghua University, Beijing, 100084 China.
Abstract
Missing data recovery is an important and yet challenging problem in imaging and data science. Successful models often adopt certain carefully chosen regularization. Recently, the low dimensional manifold model (LDMM) was introduced by [36] and shown to be effective in image inpainting. The authors of [36] observed that enforcing low dimensionality on the image patch manifold serves as a good image regularizer. In this paper, we observe that the low dimensional manifold regularization alone is sometimes not enough, and that smoothness is needed as well. For that, we introduce a new regularization by combining the low dimensional manifold regularization with a higher order CUrvature REgularization, and we call this new regularization CURE for short. The key step of CURE is to solve a biharmonic equation on a manifold. We further introduce a weighted version of CURE, called WeCURE, in a similar manner to the weighted nonlocal Laplacian (WNLL) method [42]. Numerical experiments for image inpainting and semi-supervised learning show that the proposed CURE and WeCURE significantly outperform LDMM and WNLL, respectively.
Keywords: Graph Laplacian, Nonlocal Methods, Point Cloud, Biharmonic Equation, Interpolation, Image Inpainting.
AMS subject classifications: 62H35, 65D18, 68U10, 58C40, 58J50
1 Introduction
Missing data recovery is a fundamental problem in imaging science and data analysis. In many cases, it can be formulated as a function interpolation problem in a high-dimensional space. Let $u$ be an unknown function. We would like to acquire its values on a set of points $P$. However, due to practical limitations, we are only able to observe its values on a subset $S \subset P$. The goal of missing data recovery is to reconstruct the missing values of $u$ on $P \setminus S$ based on the observed values in $S$. In this paper, we focus on two typical and important tasks of missing data recovery, namely semi-supervised learning and image inpainting, though the approach applies to other related tasks as well.
Since missing data recovery is an under-determined inverse problem, we can only hope to recover the missing values of the unknown function if we have certain prior knowledge of it, e.g. that it belongs to a certain function class or has certain mathematical or statistical properties. Successful models include the Rudin–Osher–Fatemi (ROF) model [39] and its variants [26, 4, 13]; applied harmonic analysis models such as wavelets [44, 18], curvelets [43], shearlets [22, 32] and wavelet frames [2, 9, 12, 10, 47, 20]; and Bayesian statistics based methods [38, 40, 48]. The list goes on.
More recently, people started to use low dimensional manifolds to describe the underlying relationship between the data points, which serves as an effective geometric prior on the interpolant. For example, [36, 37] observed that image patches, regarded as data points in a high-dimensional space, often lie on a low dimensional manifold; and [15, 49] allowed the data to lie close to (but not necessarily on) a certain low dimensional manifold.
To harvest the low dimensional property of data, [36] applied the following Dirichlet energy [50] to regularize the dimension of the embedded manifold
[TABLE]
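In [36], the authors gave a geometric interpretation of the Dirichlet regularizer. They showed that the dimension of a smooth manifold $\mathcal{M}$ embedded in $\mathbb{R}^d$ can be calculated by a simple formula
$$\dim(\mathcal{M})(x) = \sum_{j=1}^{d} \|\nabla_{\mathcal{M}} \alpha_j(x)\|^2,$$
where $\alpha_j$ is the coordinate function: for any $x = (x_1, \dots, x_d) \in \mathcal{M}$, $\alpha_j(x) = x_j$.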
This means that we can minimize the Dirichlet energy to enforce a penalty on the (local) dimension of the underlying manifold. As a result, the authors referred to their method as the low dimensional manifold model (LDMM). To recover missing data, they proposed to minimize the Dirichlet energy subject to the constraints $u(x) = g(x)$, $x \in S$, where $S$ is the set of observed points and $g$ denotes the observed part of the underlying function $u$.
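As a quick sanity check of the identity $\dim(\mathcal{M})(x) = \sum_j \|\nabla_{\mathcal{M}} \alpha_j(x)\|^2$ from [36], here is a small worked example added for illustration (it does not appear in the paper). It assumes the unit circle in $\mathbb{R}^2$ parameterized by arc length $\theta$, so that the manifold gradient of a coordinate function is simply its derivative in $\theta$:

```latex
\begin{aligned}
\mathcal{M} &= \{(\cos\theta, \sin\theta) : \theta \in [0, 2\pi)\} \subset \mathbb{R}^2, \\
\alpha_1(\theta) &= \cos\theta, \qquad \alpha_2(\theta) = \sin\theta, \\
\|\nabla_{\mathcal{M}}\alpha_1\|^2 + \|\nabla_{\mathcal{M}}\alpha_2\|^2
  &= \sin^2\theta + \cos^2\theta = 1 = \dim(\mathcal{M}).
\end{aligned}
```

The sum equals 1 at every point, matching the intrinsic dimension of a curve even though the ambient dimension is 2.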
1.1 Higher Order Regularization
Low dimensional structure of the manifold alone does not ensure smoothness of the reconstructed manifold, which may lead to unsatisfactory results [34, 23, 11]. As a simple demonstration, Figure 1 shows a degenerate interpolation result from the two data points labeled in red. Although the interpolated surface is also a low dimensional manifold, it is certainly not a smooth interpolation.
In this paper, we look for a proper interpolation by assuming not only low dimensionality of the manifold but also its smoothness. For that, in addition to the Dirichlet energy, we further introduce a CUrvature REgularization (CURE) term via the biharmonic operator. The proposed CURE energy reads as follows
[TABLE]
where the LDMM energy is given by (1). Note that regularizing the curvature by introducing a higher order energy term has already been proposed in image processing [41]. However, to the best of our knowledge, we are the first to promote curvature-like regularization for nonlocal image processing. Furthermore, inspired by the weighted nonlocal Laplacian (WNLL) method proposed by [42], which can preserve the symmetry of the Laplace operator, we propose a weighted CURE (WeCURE) model which can significantly improve the results over the CURE model. To demonstrate the effectiveness of CURE and WeCURE, we test our models on semi-supervised learning and image inpainting tasks. Numerical results show that CURE/WeCURE produce better results than LDMM/WNLL in both tasks. A glimpse of the results for image inpainting is shown in Figure 2, where we can see the improvement of CURE over LDMM and of WeCURE over WNLL. More details and numerical results can be found in Section 3 and Section 4.
1.2 Other Related Works
Nonlocal patch-based image restoration methods [16, 17, 7, 6, 26] have achieved great success in the literature. In addition, [24, 3, 19] introduced different graph Laplacian-based regularizations on manifolds and graphs. Our method, however, focuses on both smoothness and low dimensionality of the underlying data manifold. The most similar work to ours is [1], where the authors also introduced a higher order regularization for semi-supervised learning. The difference is threefold. First, we extend the task to image inpainting rather than just semi-supervised learning. Second, we introduce a curvature perspective on the higher order regularization. Last but not least, the newly proposed weighted version of CURE, i.e. WeCURE, yields a significant performance boost in both image inpainting and semi-supervised learning.
Another approach to regularizing the dimension of the manifold is through low-rank matrix completion [27, 28]. The basic idea is to group the patches by similarity and penalize the rank or nuclear norm of the matrix obtained by reshaping the stack of similar patches. The work in this paper reveals a benefit of PDE-based approaches: higher order information, such as curvature, can be naturally incorporated in the model.
1.3 Organization of the Paper
The paper is organized as follows. The proposed CURE and WeCURE models are introduced in Section 2. Numerical comparisons of CURE and WeCURE with LDMM and WNLL for semi-supervised learning and image inpainting are presented in Section 3 and Section 4, respectively. The general setting of the asymptotic analysis of the proposed model is presented in Section 5, and the complete proofs are given in Sections A.4 and A.5. Conclusions and a summary are given in Section 6.
2 Curvature Regularization (CURE): Model and General Algorithm
In this section, we first propose the CURE model and a weighted version of CURE. Then, we will discuss how (We)CURE can be applied to missing data recovery in general.
2.1 CURE
Let be a smooth manifold embedded in and locally parameterized as
[TABLE]
where is the local dimension of at , and . Let be the coordinate function on , i.e. for
[TABLE]
To enforce smoothness of the underlying manifold, we further regularize the curvature of the manifold. Recall that the mean curvature of a manifold is defined as the trace of the second fundamental form [33], i.e. . Here is the metric tensor defined by . If the coordinate function is an isometric immersion, the mean curvature can be calculated as , where (see [33] for detail).
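To make the link between the Laplace–Beltrami operator and curvature concrete, here is a small worked example added for illustration (it is not part of the paper). It assumes a circle of radius $r$ parameterized by arc length $s$, so that the Laplace–Beltrami operator acting on the coordinate function $\alpha$ reduces to the second derivative in $s$:

```latex
\begin{aligned}
\alpha(s) &= \bigl(r\cos(s/r),\; r\sin(s/r)\bigr), \qquad \|\alpha'(s)\| = 1, \\
\Delta_{\mathcal{M}}\alpha(s) &= \alpha''(s) = -\frac{1}{r^{2}}\,\alpha(s), \\
\|\Delta_{\mathcal{M}}\alpha(s)\| &= \frac{1}{r}.
\end{aligned}
```

The norm $1/r$ is exactly the curvature of the circle, so penalizing $\|\Delta_{\mathcal{M}}\alpha\|^2$ favors large radii, i.e. flatter and smoother curves.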
Now, we are ready to introduce the CURE energy in continuum setting:
[TABLE]
where is given by (1). The gradient is commonly approximated by the nonlocal gradient in the discrete setting
[TABLE]
where is a set with points on the manifold . Then,
[TABLE]
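Here, $w(x, y)$ is a given symmetric weight function, often chosen to be a Gaussian weight $w(x, y) = \exp\left(-\frac{\|x - y\|^2}{\sigma^2}\right)$, where $\sigma$ is a parameter and $\|\cdot\|$ denotes the Euclidean norm in $\mathbb{R}^d$. The negative of the first variation of the discrete energy above takes the form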
[TABLE]
which is the nonlocal Laplacian that has been used in image processing [5, 6, 24, 26]. It is also called graph Laplacian in spectral graph and machine learning literature [14, 50]. To simplify the notation, we use to denote the graph Laplacian [31, 45, 46]:
[TABLE]
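As a minimal sketch of the discrete objects discussed above, the following NumPy code builds a dense Gaussian weight matrix on a point cloud and the corresponding graph Laplacian. The convention `L = D - W`, the single global bandwidth `sigma`, and all function names are assumptions made for illustration; the paper's operator may differ in sign or normalization.

```python
import numpy as np

def gaussian_weights(points, sigma):
    """Dense symmetric weights w(x, y) = exp(-|x - y|^2 / sigma^2)."""
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    return np.exp(-sq_dist / sigma ** 2)

def graph_laplacian(weights):
    """Unnormalized graph Laplacian L = D - W (assumed convention)."""
    degree = np.diag(weights.sum(axis=1))
    return degree - weights

def dirichlet_energy(weights, u):
    """Pairwise energy: sum over (x, y) of w(x, y) * (u(x) - u(y))^2."""
    return float(np.sum(weights * (u[:, None] - u[None, :]) ** 2))

rng = np.random.default_rng(0)
points = rng.normal(size=(6, 3))      # 6 points in R^3
W = gaussian_weights(points, sigma=1.0)
L = graph_laplacian(W)
u = rng.normal(size=6)

# For symmetric W the pairwise energy equals 2 * u^T L u.
energy_pairwise = dirichlet_energy(W, u)
energy_quadratic = 2.0 * float(u @ L @ u)
```

The last two lines illustrate why the graph Laplacian appears as the first variation of the discrete Dirichlet energy: the energy is a quadratic form in `u` with matrix `L`.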
Now, the proposed CURE model can be cast as the following optimization problem in the discrete setting
[TABLE]
In [42], a weighted nonlocal Laplacian (WNLL) method was introduced to balance the loss at both labeled and unlabeled points and to preserve the symmetry of the Laplace operator at the same time. Let be a set with labeled points. The WNLL model in the discrete setting is given by
[TABLE]
where
[TABLE]
and similarly for .
Following a similar idea as that in WNLL, we propose the weighted CURE model (WeCURE) in the discrete setting
[TABLE]
where
[TABLE]
and similarly for .
2.2 CURE for Missing Data Recovery
For missing data recovery, we can simply minimize the CURE or WeCURE energy with respect to the constraints where is the observed values of the underlying function to be recovered. We discuss implementation details for WeCURE. CURE is a special case of WeCURE with all weights equal to 1.
Recall the definition of the energy function of WeCURE (3) and notice that . Then, WeCURE model for missing data recovery can be rewritten as
[TABLE]
where with for and for , and is the matrix of graph Laplacian. The first variation of (4) is
[TABLE]
Note that
[TABLE]
Thus
[TABLE]
Then, the solution to problem (4) can be given by solving the following Euler-Lagrange equation
[TABLE]
where with and is the weighted coefficient in WNLL. The above linear system is symmetric positive definite and sparse, and can be solved efficiently by iterative solvers such as the conjugate gradient method. We remark that, for the (non-weighted) CURE method, we only need to replace the weight matrix above by the identity matrix. We summarize the (We)CURE algorithm for missing data recovery in Algorithm 1.
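The following is a simplified, unweighted sketch of this solve, not the paper's Algorithm 1. It assumes the discrete energy $u^\top L u + \lambda \|L u\|^2$ with a symmetric graph Laplacian `L`, eliminates the constraint $u = g$ on the labeled indices, and solves the reduced symmetric positive definite system with a hand-written conjugate gradient loop. The WNLL weights are not modeled, and `cure_interpolate`, `lam` and the toy path graph are hypothetical names and data.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Plain conjugate gradient for a symmetric positive definite matrix A."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rs_old = float(r @ r)
    for _ in range(max_iter):
        if np.sqrt(rs_old) < tol:
            break
        Ap = A @ p
        alpha = rs_old / float(p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = float(r @ r)
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

def cure_interpolate(L, labeled_idx, g, lam):
    """Minimize u^T L u + lam * |L u|^2 subject to u = g on labeled_idx."""
    n = L.shape[0]
    A = L + lam * (L.T @ L)                      # symmetric positive semi-definite
    free_idx = np.setdiff1d(np.arange(n), labeled_idx)
    A_ff = A[np.ix_(free_idx, free_idx)]         # block acting on unknown values
    A_fl = A[np.ix_(free_idx, labeled_idx)]      # coupling to the known values
    u = np.zeros(n)
    u[labeled_idx] = g
    u[free_idx] = conjugate_gradient(A_ff, -A_fl @ g)
    return u

# Path graph with 5 nodes; the two endpoints carry the values 0 and 1.
W = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)
L = np.diag(W.sum(axis=1)) - W
u = cure_interpolate(L, labeled_idx=np.array([0, 4]), g=np.array([0.0, 1.0]), lam=0.5)
```

Setting `lam=0` recovers plain harmonic (graph Laplacian) interpolation, which makes it easy to compare the effect of the higher order term on a given graph.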
3 CURE for Semi-Supervised Learning
Semi-supervised learning is a challenging and yet frequently encountered machine learning task. It can be formulated as a missing data recovery problem [50]. Given a data set $P$, we assume there are $l$ different classes in total. Let $S$ be a subset of $P$ with labels, i.e.
[TABLE]
where $S_i$ is the subset with label $i$. It is typical for semi-supervised learning that the number of labeled points is far smaller than the size of the whole data set. The objective of semi-supervised learning is to extend the labels to the entire data set. Our algorithm is summarized in Algorithm 2.
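Algorithm 2 itself is not reproduced in this text. A common scheme for this family of methods, assumed here purely for illustration, is one-vs-rest label extension: for each class, interpolate the indicator of that class from the labeled points, then assign every point to the class with the largest interpolated value. The sketch below uses a small dense direct solve and the simplified unweighted energy $u^\top L u + \lambda \|L u\|^2$; `classify`, `interpolate` and `lam` are hypothetical names.

```python
import numpy as np

def interpolate(A, labeled_idx, g):
    """Minimize u^T A u subject to u = g on labeled_idx (dense direct solve)."""
    n = A.shape[0]
    free = np.setdiff1d(np.arange(n), labeled_idx)
    u = np.zeros(n)
    u[labeled_idx] = g
    rhs = -A[np.ix_(free, labeled_idx)] @ g
    u[free] = np.linalg.solve(A[np.ix_(free, free)], rhs)
    return u

def classify(W, labeled_idx, labels, num_classes, lam=0.1):
    """One-vs-rest label extension: one interpolation per class, then argmax."""
    L = np.diag(W.sum(axis=1)) - W
    A = L + lam * (L @ L)
    scores = np.stack(
        [interpolate(A, labeled_idx, (labels == c).astype(float))
         for c in range(num_classes)],
        axis=1,
    )
    return scores.argmax(axis=1)

# Two well-separated clusters on the line, with one labeled point in each.
x = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
W = np.exp(-(x[:, None] - x[None, :]) ** 2)
pred = classify(W, labeled_idx=np.array([0, 3]), labels=np.array([0, 1]), num_classes=2)
```

In this toy example each cluster inherits the label of its single labeled point, because the weights between the two clusters are negligible.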
We test WNLL, Weighted Nonlocal Total Variation (WNTV) [30], CURE and WeCURE on the MNIST dataset [29] of handwritten digits [8], the COIL20 dataset [Nene96columbiaobject] for object classification and the ISOLET dataset [21] for spoken letter recognition. Some sample images from MNIST and COIL20 are shown in Figure 3. The MNIST dataset contains 70,000 gray-scale images of size 28 × 28 in 10 classes of digits from 0 to 9, with about 7,000 images per class. Each image can be seen as a point in a 784-dimensional Euclidean space. The COIL20 dataset contains 20 objects, and each object has 72 images. The size of each image is 32 × 32 pixels, with 256 grey levels per pixel. Thus, each image is represented by a 1024-dimensional vector. The ISOLET dataset contains 150 subjects who spoke the name of each letter of the alphabet twice. The speakers are grouped into sets of 30 speakers each, referred to as isolet1 through isolet5. In our experiment, we use isolet1, which consists of 1560 samples, each represented by a 617-dimensional vector.
The weight function is constructed as
[TABLE]
where is chosen to be the distance between and its th nearest neighbor ( in MNIST, in COIL20 and ISOLET). To make the weight matrix sparse, the weight is truncated to the 50 nearest neighbors.
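The construction described above can be sketched as follows. Since formula (6) is not reproduced in this text, the sketch assumes the weight $w(x, y) = \exp(-\|x - y\|^2 / \sigma(x)^2)$, with $\sigma(x)$ the distance from $x$ to its `k_sigma`-th nearest neighbor, and keeps only the `k_trunc` nearest neighbors of each point. The function name, both parameter names and the dense distance computation are illustrative choices; a real implementation would use a sparse matrix and a nearest neighbor search structure.

```python
import numpy as np

def knn_gaussian_weights(points, k_sigma, k_trunc):
    """Weights exp(-|x - y|^2 / sigma(x)^2), kept for the k_trunc nearest neighbors of x.

    sigma(x) is the distance from x to its k_sigma-th nearest neighbor
    (the point itself is not counted as a neighbor).
    """
    n = points.shape[0]
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=-1))
    order = np.argsort(dist, axis=1)             # column 0 is the point itself
    sigma = dist[np.arange(n), order[:, k_sigma]]
    W = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k_trunc)
    cols = order[:, 1:k_trunc + 1].ravel()
    W[rows, cols] = np.exp(-dist[rows, cols] ** 2 / sigma[rows] ** 2)
    return W

rng = np.random.default_rng(1)
points = rng.normal(size=(30, 4))
W = knn_gaussian_weights(points, k_sigma=5, k_trunc=10)
```

Note that with a point-dependent bandwidth the truncated matrix is generally not symmetric; whether and how to symmetrize it (for example by averaging with its transpose) is a design choice that this sketch leaves open.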
In our test on MNIST, we choose five different sampling rates to form the training set: labeling 700, 100, 70, 50 and 35 images in the whole dataset at random. For each sampling rate, we repeat the test 10 times. In our test on COIL20 and ISOLET, we choose three different sampling rates to form the training set (the three columns per dataset in Table 2), with the labeled samples drawn at random. For each sampling rate, we repeat the test 10 times. Figure 4 shows the success rate of the WNLL, CURE, and WeCURE methods on the MNIST dataset. The first five panels of Figure 4 show the success rate for each sampling rate, while the last panel shows the average success rate for each of the five sampling rates. It can be clearly observed that the proposed CURE and WeCURE outperform WNLL for all the tested cases. With a high sampling rate, WeCURE is comparable with CURE, whereas WeCURE outperforms CURE in the cases with lower sampling rates. In terms of average success rate, both CURE and WeCURE outperform WNLL. We also compare (We)CURE with WNLL and WNTV [30] in Table 1. It can be seen that (We)CURE significantly outperforms both WNLL and WNTV in cases with lower sample rates (50/70000, 100/70000). Table 2 shows the results on the COIL20 and ISOLET datasets. WeCURE outperforms CURE and WNLL by roughly 3 to 4 percentage points in every column, while CURE alone improves on WNLL only marginally (by at most 0.64 points).
4 CURE for Image Inpainting
In this section, we apply (We)CURE to reconstruct the images with partially observed pixels. We adopt the assumption that image patches lie on a low dimensional and smooth manifold. Given an image , for any , we define an image patch as
[TABLE]
where we assume and are odd integers and we adopt reflective boundary conditions for near image boundary. Define the patch set as the collection of all patches:
[TABLE]
Define a function on as
[TABLE]
where is the intensity of image at pixel .
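The patch set and the function defined on it can be sketched in a few lines of NumPy. The code below assumes odd patch sizes `s1` and `s2`, uses `np.pad` with `mode="reflect"` as one possible reading of "reflective boundary conditions" (the `"symmetric"` mode would be the other natural choice), and stores one flattened patch per pixel. The function and variable names are illustrative.

```python
import numpy as np

def extract_patches(image, s1, s2):
    """Return one s1-by-s2 patch per pixel, flattened, using reflective padding."""
    assert s1 % 2 == 1 and s2 % 2 == 1, "patch sizes are assumed odd"
    r1, r2 = s1 // 2, s2 // 2
    padded = np.pad(image, ((r1, r1), (r2, r2)), mode="reflect")
    m, n = image.shape
    patches = np.empty((m * n, s1 * s2), dtype=image.dtype)
    for i in range(m):
        for j in range(n):
            # Patch centered at pixel (i, j) of the original image.
            patches[i * n + j] = padded[i:i + s1, j:j + s2].ravel()
    return patches

image = np.arange(20, dtype=float).reshape(4, 5)
patches = extract_patches(image, 3, 3)

# The function on the patch set returns the intensity at the patch center.
center = (3 * 3) // 2
u = patches[:, center]
```

Each row of `patches` is a point in $\mathbb{R}^{s_1 s_2}$, and `u` assigns to that point the intensity of the center pixel, so recovering `u` on all patches recovers the image.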
Now, suppose we only observe the image on a subset of pixels. We would like to recover the entire image from the observed data. This problem can be recast as the interpolation of the function defined above on the patch set, with its values being given at the patches centered at the observed pixels. This falls into the general algorithmic framework of (We)CURE for missing data recovery (Algorithm 1). Notice that the patch set is unknown, since it depends on the image being recovered. Thus, we need to iteratively update the patch set. We summarize the (We)CURE algorithm for this problem in Algorithm 3.
The weight function is chosen as (6). Here, are semi-local patches and is chosen to be the distance between and its 20th nearest neighbor. To make the weight matrix sparse, the weight is truncated to the 50 nearest neighbors. In the semi-local patches, the local coordinate is normalized to have the same amplitude as the image intensity,
[TABLE]
with
[TABLE]
where and are the size of the image. The purpose of introducing semi-local patches is to constrain the search space to a local area. A larger weight on the local coordinate leads to a smaller search space and a quicker search, while a smaller weight leads to a more global search and more accurate results. Thus, following [42], we gradually reduce by and initialization .
We apply our algorithm to 12 widely used test images. In our experiment, we select the patch size to be . For each patch, the nearest neighbors are obtained by an approximate nearest neighbor (ANN) search based on a k-d tree, which reduces the computational cost. The linear systems arising from the weighted nonlocal Laplacian and the graph Laplacian are solved by the conjugate gradient method. We use the solution of WNLL after 6 steps as the initialization of our algorithm to get a proper initial guess of the similarity relationships between different groups. The initial image of WNLL is obtained by filling the missing pixels with random numbers drawn from a Gaussian distribution $N(\mu_0, \sigma_0^2)$, where $\mu_0$ is the mean and $\sigma_0$ is the standard deviation of the observed pixel values.
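The random initialization described above can be sketched as follows, assuming a boolean mask that marks the observed pixels. The function name, the `seed` argument and the synthetic test image are illustrative and not taken from the paper.

```python
import numpy as np

def initialize_missing(image, observed_mask, seed=0):
    """Fill unobserved pixels with draws from a Gaussian fitted to the observed pixels."""
    rng = np.random.default_rng(seed)
    observed = image[observed_mask]
    mu, sigma = observed.mean(), observed.std()
    filled = image.astype(float).copy()
    n_missing = int((~observed_mask).sum())
    filled[~observed_mask] = rng.normal(mu, sigma, size=n_missing)
    return filled

rng = np.random.default_rng(42)
truth = rng.uniform(0, 255, size=(16, 16))
mask = rng.random(truth.shape) < 0.1       # keep roughly 10% of the pixels
mask[0, 0] = True                          # guarantee at least one observed pixel
init = initialize_missing(truth, mask)
```

Only the unobserved entries are overwritten, so the observed data stay fixed and can serve as the interpolation constraints in later iterations.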
Quality of the restored images is measured by PSNR and SSIM. PSNR is defined as
[TABLE]
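where the reference image is the ground truth. SSIM is defined as a product of three terms that quantify the similarity of luminance, contrast and structure. It takes the following form
$$\mathrm{SSIM}(x, y) = [l(x, y)]^{\alpha} \cdot [c(x, y)]^{\beta} \cdot [s(x, y)]^{\gamma},$$
where
$$l(x, y) = \frac{2\mu_x\mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}, \quad c(x, y) = \frac{2\sigma_x\sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}, \quad s(x, y) = \frac{\sigma_{xy} + C_3}{\sigma_x\sigma_y + C_3},$$
where $\mu_x, \mu_y$, $\sigma_x, \sigma_y$ and $\sigma_{xy}$ are the local means, standard deviations and cross-covariance for images $x$ and $y$, and $C_1, C_2, C_3$ are small positive constants that stabilize the divisions.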
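For readers who want to compute these metrics, here is a minimal NumPy sketch. It makes several assumptions that are not confirmed by the text: PSNR is taken in its common form $10 \log_{10}(\text{peak}^2 / \text{MSE})$ with peak value 255, and SSIM is evaluated with statistics over the whole image (rather than local windows), with $\alpha = \beta = \gamma = 1$ and $C_3 = C_2 / 2$, which collapses the three factors into the usual two-factor formula. The constants `k1` and `k2` are the conventional defaults.

```python
import numpy as np

def psnr(restored, truth, peak=255.0):
    """Peak signal-to-noise ratio in dB, based on the mean squared error."""
    mse = np.mean((restored.astype(float) - truth.astype(float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def global_ssim(x, y, peak=255.0, k1=0.01, k2=0.03):
    """SSIM with statistics taken over the whole image instead of local windows."""
    x = x.astype(float)
    y = y.astype(float)
    c1, c2 = (k1 * peak) ** 2, (k2 * peak) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )

rng = np.random.default_rng(0)
truth = rng.uniform(0, 255, size=(32, 32))
noisy = truth + rng.normal(0, 10, size=truth.shape)
psnr_noisy = psnr(noisy, truth)
ssim_noisy = global_ssim(noisy, truth)
```

Values produced by this global variant are not directly comparable with windowed SSIM numbers such as those reported in the tables; the sketch is only meant to show how the three ingredients (means, variances, cross-covariance) enter the formula.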
The numerical results are shown in Table 3 (PSNR) and Table 4 (SSIM). For qualitative comparisons, Figure 6 and Figure 7 show inpainting results on images from the Set12 dataset. WeCURE gives better results than WNLL both visually and on average in terms of PSNR and SSIM: the average PSNR gain grows from about 0.14 dB at a 10% sample rate to about 0.78 dB at 20%, and WeCURE attains the higher SSIM on every image at every sample rate. WNLL retains a slightly higher PSNR on a few images at low sample rates (C.man, Airplane and Man at 10%, and C.man at 15%). We observe that (We)CURE recovers texture well and preserves sharp image features such as edges, though it also introduces mild artifacts in smooth regions. This is why (We)CURE outperforms WNLL most clearly in terms of SSIM.
5 Asymptotic Analysis
In this section, we aim to provide an asymptotic analysis of the proposed numerical scheme for the WeCURE model using $\Gamma$-convergence. The idea of the proof is sketched as follows. We first fix the bandwidth of the kernel and consider our scheme as an integral scheme of a non-local functional. Then, we reduce the bandwidth of the kernel to zero to show that the non-local functional is a good approximation to the original WeCURE functional. The proof mostly follows the notation and general idea of [45, 46, trillos2018error]. A recent paper [dunlop2019large] also established a $\Gamma$-convergence proof for the biharmonic equation. The difference between their paper and ours is mainly the setting of the problem. In their paper, labeled data is considered as the boundary condition, while in our paper we also consider the labeled data as samples from the data distribution, and the ratio between the numbers of labeled and unlabeled data is fixed. In this setting, we will show that the weights of WeCURE are crucial to achieving convergence.
Let and be uniformly sampled from , where is an open bounded domain in . Let be the set of labeled points where is uniformly sampled from . In this paper, we consider the ratio to be fixed. Let be a function whose value is only known at the labeled points. The empirical measure of data points is given by . We consider a graph with vertices and denote the weights of the edges as where is a radially symmetric function which satisfies the following assumptions:
(A1) and is continuous at 0.
(A2) is non-increasing.
(A3) has compact support. If , then .
The discrete WeCURE model is given by (the weight is , not in previous sections)
[TABLE]
The continuum nonlocal WeCURE model is given by
[TABLE]
The continuum (local) WeCURE model is given by
[TABLE]
where , is the first coordinate of vector .
Remark 5.1.
The models introduced above contain the corresponding CURE models as special cases if we simply modify some coefficients in the WeCURE models and replace the term by ( is a certain constant).
We are now ready to present the main theorems of this section.
Theorem 5.2.
Let , be an open, bounded, connected set with Lipschitz boundary. Let be a sequence of i.i.d. random points uniformly sampled from . is the set of labeled points whose value is given by . Assume the kernel satisfies conditions (A1)-(A3). Then $\Gamma$-converges to as in the sense.
Theorem 5.3.
Under the assumptions of Theorem 5.2, $\Gamma$-converges to as in with metric.
Theorem 5.4.
(Compactness) Under the assumptions of Theorem 5.2, satisfies the compactness property with respect to the metric.
The complete proofs of Theorem 5.2 and Theorem 5.3 can be found in Sections A.4 and A.5, respectively, and Theorem 5.4 is a direct consequence of [bourgain2001another, Theorem 4].
6 Conclusion and Future Work
In this paper, we proposed to use both low dimensionality and smoothness of the underlying data manifold as a regularizer for missing data recovery. For that, we introduced curvature regularization (CURE) and a weighted version of it (WeCURE). Compared to related models such as LDMM, WNLL, and WNTV, the new regularization was shown in numerical experiments to be more effective for semi-supervised learning and image inpainting on some datasets.
There are plenty of future directions worth exploring. For modelling, a natural question is whether different curvatures can also serve as good smoothing regularizers for data manifolds, and how they differ from the one we chose for CURE. Can these curvatures be easily computed? How does CURE work for other tasks of missing data recovery? Furthermore, the convergence analysis of solving the biharmonic equation (5) on a manifold also needs to be studied. The lack of understanding of numerical methods for the biharmonic equation currently prevents us from generalizing CURE to generic inverse problems.
Acknowledgments
Bin Dong is supported in part by NSFC 11671022 and Beijing Natural Science Foundation (Z180001). Haocheng Ju is supported by the Elite Undergraduate Training Program of the School of Mathematical Sciences at Peking University. Zuoqiang Shi is supported by NSFC 11671005. We would also like to thank Dr. Wei Zhu for his valuable comments and kindly sharing the codes of both LDMM and LDMM+WNLL for comparisons.
Appendix A Preliminaries
In this section we present a brief review of some basic concepts used in the asymptotic analysis. Interested readers should consult [45] for a more detailed introduction to these concepts.
A.1 Optimal transport
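$D$ is an open and bounded domain in $\mathbb{R}^d$. $\mathcal{B}(D)$ is the Borel $\sigma$-algebra of $D$ and $\mathcal{P}(D)$ is the set of all Borel probability measures on $D$. Given $1 \le p < \infty$, the $p$-OT distance between $\mu, \nu \in \mathcal{P}(D)$ is defined by:
$$d_p(\mu, \nu) := \min\left\{\left(\int_{D \times D} |x - y|^p \, d\pi(x, y)\right)^{1/p} : \pi \in \Gamma(\mu, \nu)\right\},$$
where $\Gamma(\mu, \nu)$ is the set of all Borel probability measures on $D \times D$ for which the marginal on the first variable is $\mu$ and the marginal on the second variable is $\nu$. The elements $\pi \in \Gamma(\mu, \nu)$ are also referred to as transportation plans between $\mu$ and $\nu$. When $p = \infty$,
$$d_\infty(\mu, \nu) := \inf\left\{\operatorname{ess\,sup}_{\pi}\{|x - y| : (x, y) \in D \times D\} : \pi \in \Gamma(\mu, \nu)\right\}$$
defines a metric on $\mathcal{P}(D)$, which is called the $\infty$-transportation distance.
Given a Borel map $T: D \to D$ and $\mu \in \mathcal{P}(D)$, the push-forward of $\mu$ by $T$, denoted by $T_{\sharp}\mu \in \mathcal{P}(D)$, is given by:
$$T_{\sharp}\mu(A) := \mu\left(T^{-1}(A)\right), \quad A \in \mathcal{B}(D).$$
Then for any bounded Borel function $\varphi: D \to \mathbb{R}$ the following change of variables in the integral holds:
$$\int_D \varphi(x) \, d(T_{\sharp}\mu)(x) = \int_D \varphi(T(x)) \, d\mu(x).$$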
When the measure is absolutely continuous with respect to the Lebesgue measure, (13) is equivalent to:
[TABLE]
A.2 The $TL^p$ Space
The $TL^p$ space was introduced in [45] to compare functions defined on a point cloud with functions defined on an open domain.
[TABLE]
The metric on the space is
[TABLE]
where the set of transportation plans defined in the previous subsection. When the measure is absolutely continuous with respect to the Lebesgue measure, (19) is equivalent to:
[TABLE]
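A.3 $\Gamma$-Convergence
We follow the definition of $\Gamma$-convergence by [slepcev2019analysis] in a random setting.
Definition A.1.
Let $(Z, d)$ be a metric space and $(\mathcal{X}, \mathbb{P})$ be a probability space. For each $n \in \mathbb{N}$ the functional $E_n^{(\omega)}: Z \to \mathbb{R} \cup \{\pm\infty\}$, $\omega \in \mathcal{X}$, is a random variable. We say $E_n^{(\omega)}$ $\Gamma$-converge almost surely on the domain $Z$ to $E_\infty: Z \to \mathbb{R} \cup \{\pm\infty\}$ with respect to $d$, and write $E_\infty = \Gamma\text{-}\lim_{n \to \infty} E_n^{(\omega)}$, if there exists a set $\mathcal{X}' \subset \mathcal{X}$ with $\mathbb{P}(\mathcal{X}') = 1$, such that for all $\omega \in \mathcal{X}'$ and all $f \in Z$:
(i) (liminf inequality) for every sequence $\{f_n\}_{n=1}^{\infty}$ converging to $f$,
$$E_\infty(f) \le \liminf_{n \to \infty} E_n^{(\omega)}(f_n);$$
(ii) (limsup inequality) there exists a sequence $\{f_n\}_{n=1}^{\infty}$ converging to $f$ such that
$$E_\infty(f) \ge \limsup_{n \to \infty} E_n^{(\omega)}(f_n).$$
Definition A.2.
We say that the sequence of nonnegative functionals $\{E_n\}_{n \in \mathbb{N}}$ satisfies the compactness property if the following holds: given an increasing sequence of natural numbers $\{n_k\}_{k \in \mathbb{N}}$ and a bounded sequence $\{f_k\}_{k \in \mathbb{N}}$ in $Z$ for which
$$\sup_{k \in \mathbb{N}} E_{n_k}(f_k) < \infty,$$
$\{f_k\}_{k \in \mathbb{N}}$ is relatively compact in $Z$.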
A.4 Proof of Theorem 5.2
A.4.1 Liminf inequality
Proof A.3.
Assume that as . First we show that
[TABLE]
Since , using the change of variables(16) it follows that
[TABLE]
Notice that
[TABLE]
Moreover, we have
[TABLE]
and
[TABLE]
Note that indicates , so the first two terms go to zero as . We only have to show
[TABLE]
Note that for almost every
[TABLE]
along with the monotonicity of , we have
[TABLE]
Note that from Theorem 2.5 in [45], we have
[TABLE]
along with the standard result in real analysis that if , then , we have
[TABLE]
Similarly, we can show that
[TABLE]
and we obtain(22) and , along with
[TABLE]
we have
[TABLE]
The remaining terms can be handled in a similar way, and we have
[TABLE]
A.4.2 Limsup inequality
Proof A.4.
Define to be the restriction of to the first data points , and we have . From the proof of the liminf inequality in the previous section, we have
[TABLE]
A.5 Proof of Theorem 5.3
A.5.1 Liminf inequality
Proof A.5.
Consider an arbitrary and suppose that as
[TABLE]
The inequality
[TABLE]
follows from the proof of Theorem 8 in[ponce2004new]. Next we show that
[TABLE]
We need the following lemma to establish the liminf inequality.
Lemma A.6.
Let be a bounded open subset of , is an open set compactly contained in . Suppose that is a sequence of functions such that
[TABLE]
if for some , then
[TABLE]
where , is the first coordinate of vector .
Proof A.7.
We claim that
[TABLE]
Using a simple change of variables , we have
[TABLE]
The second equality follows from the fact that is compactly contained in . The third equality follows from a fourth order Taylor expansion, and the vanishing of the first and third order terms is a direct result of the radial symmetry of . Combined with (33), we have (35). Note that implies using Hölder inequality. Taking to zero in the right hand side of (35) we have (34).
We can now proceed to the proof of the liminf inequality of Theorem 5.3. Our main idea follows from [45]. Consider an arbitrary and suppose that as . We want to show that . Without loss of generality, we assume that is uniformly bounded.
Consider a standard mollifier. is a smooth radially symmetric function, supported in the closed unit ball and is such that . We define .
Fix an open domain compactly contained in . Let . Set . . For and for a given function we define the mollified function by setting . The functions are smooth and satisfy as . Furthermore
[TABLE]
By taking the second derivative, it follows that there is a constant (only depending on the mollifier ) such that
[TABLE]
Since as the norms are uniformly bounded. Therefore, taking in the inequalities(37) and setting , implies
[TABLE]
Moreover, using (36) to express and , it is straightforward to deduce that
[TABLE]
for some constant independent of . In particular, as and hence we can apply Lemma A.6 to infer that
[TABLE]
[TABLE]
The second inequality is obtained by using the change of variables, and is contained in the transformed domain. The third inequality follows from Cauchy-Schwarz inequality. Using a change of variables , we have the third equality. The fourth equality follows from that has compact support, and thus the integral on is the same as the integral on . Let and apply (38), we have
[TABLE]
Since as and is lower semicontinuous, we have
[TABLE]
Take and we obtain the desired liminf inequality. Next we show
[TABLE]
As is uniformly bounded, we have
[TABLE]
Using nonlocal Green’s formula in[26], we have
[TABLE]
Substitute into (31), we have
[TABLE]
Let , it’s straightforward to show
[TABLE]
Summing up , we have
[TABLE]
A.5.2 Limsup inequality
Proof A.8.
From Remark 2.7 in [45], we only have to prove the limsup inequality for . We want to prove
[TABLE]
[TABLE]
The inequality
[TABLE]
follows from the proof of Theorem 8 in[ponce2004new]. Next we show
[TABLE]
Let .
[TABLE]
The first equality is obtained by setting and the vanishing of first order term is a direct result from the radial symmetry of . stands for the Hessian matrix. The first inequality is obtained by a change of variables and the transformed domain is contained in . As is compactly supported, it’s straightforward to show that
[TABLE]
then we have
[TABLE]
Similar to the proof of inequality(42), we have
[TABLE]
Let , it’s straightforward to show
[TABLE]
Summing up , we have
[TABLE]
References
- [1] S. Agarwal, K. Branson, and S. Belongie, Higher order learning with graphs, in Proceedings of the 23rd International Conference on Machine Learning, ACM, 2006, pp. 17–24.
- [2] C. Bao, B. Dong, L. Hou, Z. Shen, X. Zhang, and X. Zhang, Image restoration by minimizing zero norm of wavelet frame coefficients, Inverse Problems, 32 (2016), p. 115004.
- [3] A. L. Bertozzi and A. Flenner, Diffuse interface models on graphs for classification of high dimensional data, Multiscale Modeling & Simulation, 10 (2012), pp. 1090–1118.
- [4] K. Bredies, K. Kunisch, and T. Pock, Total generalized variation, SIAM Journal on Imaging Sciences, 3 (2010), pp. 492–526.
- [5] A. Buades, B. Coll, and J.-M. Morel, Neighborhood filters and PDE's, Numer. Math., 105 (2006), pp. 1–34.
- [6] A. Buades, B. Coll, and J.-M. Morel, A review of image denoising algorithms, with a new one, Multiscale Model. Simul., 4 (2005), pp. 490–530.
- [7] A. Buades, B. Coll, and J.-M. Morel, A non-local algorithm for image denoising, in Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, vol. 2, IEEE, 2005, pp. 60–65.
- [8] C. Burges, Y. LeCun, and C. Cortes, MNIST database.
