TL;DR
This paper introduces a new class of Markov Random Field models with bottleneck potentials, enabling MAP inference that accounts for maximum local potentials, and demonstrates effective algorithms and results on seismic horizon tracking.
Contribution
It extends MRFs with bottleneck potentials, providing novel relaxations and algorithms for MAP inference involving maximum-based potentials.
Findings
Effective algorithms for bottleneck MRF inference.
Successful application to large-scale seismic horizon tracking.
Improved inference quality with bottleneck potentials.
Abstract
We consider general discrete Markov Random Fields(MRFs) with additional bottleneck potentials which penalize the maximum (instead of the sum) over local potential value taken by the MRF-assignment. Bottleneck potentials or analogous constructions have been considered in (i) combinatorial optimization (e.g. bottleneck shortest path problem, the minimum bottleneck spanning tree problem, bottleneck function minimization in greedoids), (ii) inverse problems with -norm regularization, and (iii) valued constraint satisfaction on the -pre-semirings. Bottleneck potentials for general discrete MRFs are a natural generalization of the above direction of modeling work to Maximum-A-Posteriori (MAP) inference in MRFs. To this end, we propose MRFs whose objective consists of two parts: terms that factorize according to (i) , i.e. potentials as in plain MRFs, and…
| F3-Netherlands | Opunake-3D | Waka-3D | ||||||||||
| I | II* | III* | IV | V* | VI | I* | II | I | II* | III* | IV | |
| Instance sizes | ||||||||||||
| 263K | 153K | 362K | 153K | 231K | 101K | 443K | 2965K | 614K | 366K | 601K | 614K | |
| 3271K | 1768K | 4394K | 1422K | 1896K | 944K | 406K | 2630K | 3524K | 2020K | 3110K | 3646K | |
| Mean absolute deviation from ground truth | ||||||||||||
| MST | 1.6212 | 1.1245 | 1.6542 | 0.1283 | 0.1088 | 1.5319 | 0.4930 | 0.4596 | 2.0060 | 0.2759 | 0.9457 | 1.6108 |
| MRF | 7.1435 | 6.2105 | 3.2169 | 0.1540 | 0.0896 | 2.4498 | 0.4786 | 0.9002 | 1.4073 | 0.0221 | 0.4245 | 1.2388 |
| B-MRF | 0.8881 | 0.2644 | 0.0889 | 0.0463 | 0.0894 | 1.6049 | 0.4887 | 0.4209 | 1.1855 | 0.0116 | 0.4135 | 1.0377 |
| Runtimes (seconds) | ||||||||||||
| MST | 14 | 7 | 19 | 3 | 6 | 7 | 8 | 8 | 8 | 5 | 7 | 8 |
| MRF | 673 | 139 | 931 | 62 | 65 | 33 | 114 | 522 | 740 | 22 | 687 | 781 |
| B-MRF | 1820 | 1374 | 3108 | 1609 | 1554 | 1322 | 2979 | 2554 | 6548 | 2402 | 7738 | 8158 |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Bottleneck potentials in Markov Random Fields
Ahmed Abbas 1,2 Paul Swoboda1
1Max Planck Institute for Informatics, 2ZEISS
Abstract
We consider general discrete Markov Random Fields (MRFs) with additional bottleneck potentials which penalize the maximum (instead of the sum) over local potential value taken by the MRF-assignment. Bottleneck potentials or analogous constructions have been considered in (i) combinatorial optimization (e.g. bottleneck shortest path problem, the minimum bottleneck spanning tree problem, bottleneck function minimization in greedoids), (ii) inverse problems with -norm regularization, and (iii) valued constraint satisfaction on the -pre-semirings. Bottleneck potentials for general discrete MRFs are a natural generalization of the above direction of modeling work to Maximum-A-Posteriori (MAP) inference in MRFs. To this end, we propose MRFs whose objective consists of two parts: terms that factorize according to (i) , i.e. potentials as in plain MRFs, and (ii) , i.e. bottleneck potentials. To solve the ensuing inference problem, we propose high-quality relaxations and efficient algorithms for solving them. We empirically show efficacy of our approach on large scale seismic horizon tracking problems.
1 Introduction
In the field of computer vision MRFs have found many applications such as image segmentation, denoising, optical flow, 3D reconstruction and many more, see [16] for a non-exhaustive overview of problems and algorithms. The above application scenarios are modelled well as a sum of unary/pairwise/ternary/ potentials on the underlying graphical structure. More generally, whenever the error of fitting a model to data is captured by local terms, the objective can factorize into a sum of local error terms. However, such objectives are not always appropriate. In some computer vision applications, a single error can entail subsequent errors, rendering the solution useless and local terms are unable to properly penalize this. Prominent examples are tracking problems [44, 13, 28, 10], where making a single error and following a wrong track results in low accuracy nonetheless. In inverse problems, -norm regularization penalizes the maximum deviation from the fitted model and is appropriate e.g. for some types of group sparsity [27, 23, 24]. In all of the above scenarios global potentials, which penalize the maximum value assignment w.r.t. a given set of local costs are the appropriate choice. We call this maximum value a bottleneck and the aim is to find a configuration such that its bottleneck has minimum cost. Formally, optimizing a bottleneck objective can be written as
[TABLE]
where is the space of feasible elements of the optimization problem and is a real valued vector. Additionally, the bottleneck objective can be written as the infinity norm , where is the Hadamard product.
2 Related work
Bottleneck-type objectives occur throughout many subfields of mathematical programming. The optimization problems are also often called min-max problems.
**Bottleneck potentials in MRFs. ** The case of pairwise binary MRFs with bottleneck potentials has been addressed in [32] and has been applied to image segmentation. As noted by the authors of [32], incorporating a bottleneck term gives better segment boundaries and resolves ‘small cuts’ (or shrinking bias) of graph cut [6]. In [8] the authors interpret -norm regularization for discrete labeling problems and for as MAP-inference in MRFs and propose approximating the bottleneck potential corresponding to via a high value of . In [18] a more general class of MAP-inference with higher order costs were proposed which include the bottleneck potentials we investigate. In [25] the authors propose a model that can be specialized to bottleneck costs with linear distances and give a primal heuristic for solving it.
Furthermore, labeling problems containing only bottleneck objectives were considered in [37, 33, 11, 29], where [11, 29] devised algorithms for special cases of the pure min-max labeling problem, and [37, 33] devised message passing schemes with application in parallel machine scheduling.
In contrast to previous works [8, 32, 37, 33, 11, 29] we investigate the mixed problem where ordinary and bottleneck potentials are both present in the same inference problem. Unlike [32] we allow arbitrary MRFs. Also unlike the algorithm in [8, 25], our algorithms are based on a linear programming relaxation resulting in a more rigorous approach. Lastly, in comparison to [18] we propose a stronger relaxation (theirs corresponds to 1) and an algorithm that scales to large problem sizes.
**Combinatorial bottleneck optimization problems. ** Many classical optimization problems have bottleneck counterparts, e.g. bottleneck shortest paths [15], bottleneck spanning trees [12] and bottleneck function optimization in greedoids [22]. The above works consider the case of bottleneck potential only. More similar to our work is the mixed case of linear and bottleneck costs, as investigated for the shortest and bottleneck path problem in [31].
**-norm regularization in inverse problems. ** For inverse problems with uniform noise, the appropriate choice of norm is [5][Chapter 7.1.1]. Extensions of this basic -norm are used in [27] for multi-task learning and in [23] for the multi-task lasso. More generally, mixed norms with -norms are useful for problems where group sparsity is desirable. Proximal [27], block coordinate descent [23] as well as network flow techniques [24] have been proposed for numerical optimization of such problems.
**Semiring-based constraint satisfaction. ** In semiring-based constraint satisfaction problems the goal is to compute , where is a space of labellings on a node set , is collection of subsets of , and are functions that depend only on nodes in the subset [41, 42, 17, 3, 2]. Popular choices for the pair is the -semiring which corresponds to MAP-inference and the -semiring which corresponds to computing the partition function. In contrast to the above semirings, the algebra corresponding to bottleneck potentials is only a pre-semiring (the distributive law does not hold) and hence classical arc consistency algorithms as discussed in [41, 42] are not applicable and specialized methods are needed. For an in-depth study of the -pre-semiring we refer to [9]. Our case however does not completely fit into (pre-)semiring based constraint satisfaction setting, since we are concerned with the mixed case in which both the -semiring and the -pre-semiring occur together in a single optimization problem.
**Applications in horizon tracking. ** The seismic horizon tracking problem, i.e. identifying borders between layers of various types of rock beds, has been addressed in [44, 13, 43, 39] from the computational perspective. The authors in [44] use a greedy method inspired by the minimum spanning tree problem to that end. In [13] the authors propose to solve a shortest path problem to track seismic horizons along a 2-D sections of the original 3-D volume. In [43, 39] the authors set up linear equations to solve the horizon tracking problem.
Most similar to our work is the minimum spanning tree inspired approach of [44]. In contrast to [44], we consider a rigorously defined optimization problem for which we develop a principled LP-based approach instead of greedily selecting solutions. Conceptually, the 2-D shortest path method [13] is also similar to ours, however it cannot be extended to the 3-D setting we are interested in. Moreover, we allow for a more sophisticated objective function than [13]. Methods of [43, 39] do not use optimization at all but solve linear systems and require more user intervention.
**Contribution & organization. ** Section 3 introduces bottleneck potentials and their non-linear generalizations for general discrete MRFs. We also consider the mixed problem of MAP-inference for a combination of ordinary MRF-costs w.r.t. the -semiring and bottleneck potentials. In Section 4 we derive special algorithms to solve the problem for chain graphs via a dynamic shortest path method, and for graphs without pairwise interactions via an efficient enumerative procedure. Combining these two special cases, we derive a high-quality relaxation for the case of general graphs. To solve this relaxation, we propose an efficient dual decomposition algorithm. In Section 5 we show empirically that our approach results in a scalable algorithm that gives improved accuracy on seismic horizon tracking as compared to MAP-inference in ordinary MRFs and a state-of-the-art heuristic.
Code and datasets are available at https://github.com/LPMP/LPMP.
3 MRFs with bottleneck potentials
First, we will review the classical problem of Maximum-A-Posteriori (MAP) inference in Markov Random Fields (MRF). Second, we will introduce the bottleneck labeling problem which extends MAP-MRF by additional bottleneck term that penalize the maximum value of potentials taken in an assignment (as opposed to the sum for MRFs).
3.1 Markov Random Fields:
A graph will be a tuple with undirected edges . To each node a label set is associated. To each node a unary potential is associated, and to each edge a pairwise potential . We will call the label space and a labeling. For subsets we define , analogously we refer to labels . In particular, refers to a node labeling and to an edge labeling. A tuple consisting of a graph, corresponding label space and potentials is called a Markov Random Field (MRFs). The problem
[TABLE]
is called the Maximum-A-Posteriori (MAP) inference problem in MRFs.
**Local Polytope Relaxation: ** We use the over-complete representation to obtain a linear programming relaxation of the optimization problem (2). For we associate the -th label from with one-hot encoding which is a unit vector of length with a at the -th location. In other words, we can write . Analogously, we can write . To obtain a convex relaxation, we define unary marginals as , , and pairwise marginals as , . We couple unary and pairwise marginals together to obtain the local polytope relaxation [40].
[TABLE]
With the local polytope we can relax the problem of MAP-inference in MRFs (2) as:
[TABLE]
Note that the relaxation (4) subject to constraint (3) is tight for some graphs such as trees and for special families of cost functions including different forms of submodularity [20].
3.2 Bottleneck labeling problem:
Given an MRF, we associate to it a second set of potentials, which we call bottleneck potentials. As opposed to MRF, however, the corresponding assignment cost is not given by the sum of individual potential values but by their maximum. The goal of inference in a pure bottleneck labeling problems (i.e. with all zero MRF potentials) is thus to find a labeling such that the maximum bottleneck potential value taken by the labeling is minimal.
Definition 1** (Bottleneck labeling problem).**
Let an MRF be given. Additionally, let unary bottleneck potentials and pairwise bottleneck potentials are also given. We call the set of all possible values taken by bottleneck potentials as bottleneck values, i.e.
[TABLE]
Let bottleneck costs be given. The bottleneck labeling problem is defined as
[TABLE]
and is defined in (2).
It is straightforward to adapt our work to the case of multiple bottlenecks and triplet/quadrupelt/ potentials. However we focus on models with only a single bottleneck and pairwise potentials for simplicity.
A special case of the problem (6) was considered by [32] where the authors only allow binary labels, special form of MRF costs and . Additionally, heuristics for extending to the multi-label case were given in [8]. Below we propose exact algorithms for multi-label chain graphs and an LP-relaxation for general graphs.
**Example. ** In the special case when and MRF costs are zero i.e. then the problem reduces to a pure bottleneck labeling problem as:
[TABLE]
Note that if we are know the optimal bottleneck value in , then the bottleneck labeling problem can be reduced to the MAP-inference problem in MRFs. This reduction can be done by setting the unary and pairwise MRF costs to for the labelings which have bottleneck potentials greater than the optimal value . i.e.
[TABLE]
Then, the constraints (6b) will be automatically satisfied by a feasible solution of the MAP-inference problem.
4 Algorithms:
In this section, algorithms for solving the bottleneck labeling problem are proposed. We first present efficient and exact algorithms for edge-free graphs and chain graphs. Later, we will use these two algorithms for solving the problem on general graphs using dual-decomposition. The algorithms for edge-free graphs and chains are designed with the end goal of using them inside dual decomposition for general graphs and therefore contain extra steps.
4.1 Bottleneck labeling with unary potentials:
Assume that the graphical model does not contain any edges i.e. , so only unary potentials need to be considered. The problem (6) in this case can be efficiently solved with Algorithm LABEL:alg:unary-bottleneck-labeling-problemLABEL:alg:unary-bottleneck-labeling-problem.
Algorithm LABEL:alg:unary-bottleneck-labeling-problemLABEL:alg:unary-bottleneck-labeling-problem enumerates all bottleneck values in ascending order. For every bottleneck value , an optimal node labeling can be found by choosing for each node the best label w.r.t. potentials that is feasible to bottleneck constraints (6b). The costs of such labelings are stored in . Updating the best node labeling between consecutive bottleneck values can be done by only checking the nodes for which (6b) has changed the feasible set. Finally, the optimum bottleneck value can be computed by
[TABLE]
Once the optimal bottleneck value is computed, the problem can be reduced to inference in MRF by disallowing the configurations which have bottleneck potentials greater than as mentioned in (8).
Proposition 1**.**
The runtime of Algorithm LABEL:alg:unary-bottleneck-labeling-problemLABEL:alg:unary-bottleneck-labeling-problem is where
The most expensive operation in Algorithm LABEL:alg:unary-bottleneck-labeling-problemLABEL:alg:unary-bottleneck-labeling-problem is sorting. However, the sorting can be reused when the algorithm is run multiple times with varying linear potentials , which is the case in our dual decomposition approach for general graphs.
4.2 Bottleneck labeling problem on chains:
Assume and is a chain. Even though the relaxation over the local polytope for inference in chains is tight for pairwise MRFs, introducing the bottleneck potential (as also done in [18]) destroys this property which is demonstrated in the Appendix in Lemma 1. Therefore, we propose an efficient algorithm to solve the bottleneck labeling problem (6) exactly on chains.
First, we note that the MAP-MRF problem (2) can be modeled through a shortest path problem in a directed acyclic graph. Figure 1 illustrates the construction for the case of . We treat each label of a variable in in the chain as a pair of nodes , and in the shortest path digraph . Each unary cost becomes an arc cost for . Each pairwise cost becomes an arc cost . The source and sink nodes are connected with the labels of first and last node of the graphical model resp. with zero costs for modeling the shortest path problem. Algorithm LABEL:alg:chain-digraph-conversionLABEL:alg:chain-digraph-conversion in the Appendix gives the general construction.
The shortest -path in graph having cost corresponds to an optimum labeling for from (6) in the chain graphical model .
Algorithm LABEL:alg:chain-bottleneck-labeling-problemLABEL:alg:chain-bottleneck-labeling-problem uses the constructed directed acyclic graph to solve the bottleneck labeling problem (6) on chain MRF by an iterative procedure. For each bottleneck thresholds in , the shortest path is found such that only those edges are used whose bottleneck potentials are within the threshold. This way a matrix of all possible pairs of bottleneck values and shortest path costs is computed. The optimal bottleneck value can be then found by (9).
Naively implementing Algorithm LABEL:alg:chain-bottleneck-labeling-problemLABEL:alg:chain-bottleneck-labeling-problem by computing a shortest path from scratch in each iteration of the loop will result in high complexity. Note that when iterating over bottleneck threshold values in ascending order, exactly one edge is added per iteration. Therefore, dynamic shortest path algorithms can be used for recomputing shortest paths more efficiently. Additionally, since the graph is directed and acyclic, the shortest path in each iteration can be found in linear time by breadth-first traversal [7]. These improvements are detailed in Algorithm LABEL:alg:dynamic-shortest-path-for-chainsLABEL:alg:dynamic-shortest-path-for-chains in the Appendix.
Proposition 2**.**
The worst case run-time of Algorithm LABEL:alg:chain-bottleneck-labeling-problemLABEL:alg:chain-bottleneck-labeling-problem is where, is the arc-set in underlying graph .
While the worst-case runtime of LABEL:alg:chain-bottleneck-labeling-problemLABEL:alg:chain-bottleneck-labeling-problem is quadratic in the number of edges of the underlying shortest path graph, the average case runtime is better, since not for every edge a new shortest path needs to be computed and often a shortest path computation can reuse previous results for speedup. Moreover, the sorting operation on line 2 can be re-used similar to Algorithm LABEL:alg:unary-bottleneck-labeling-problemLABEL:alg:unary-bottleneck-labeling-problem.
Remark 1**.**
If we have a pure bottleneck labeling problem without the MRF potentials (7) then the problem can be solved in linear-time by sorting the bottleneck weights and halving the number of edges by the median edge as mentioned in [15]. The problem we consider is more general and needs to account for the MRF potentials.
Remark 2**.**
The chain bottleneck labeling problem can also be solved for linear bottleneck cost by the method from [31]. However, the method from [31] is not polynomial w.r.t. bottleneck values but requires them to be small integers to be efficient, making it unsuitable for our purposes.
Remark 3** (Bottleneck labeling on trees).**
The dynamic shortest path algorithm can be modified to work on trees as well, where we replace shortest path computations by belief propagation. For clarity of presentation we have chosen to restrict ourselves to chains.
4.3 Relaxation for the bottleneck labeling problem on arbitrary graphs:
Since MAP-MRF is NP-hard and the bottleneck labeling problem generalizes it, we cannot hope to obtain an efficient exact algorithm for the general case. As was done for MAP-MRF [40, 21, 34], we approach the general case with a Lagrangian decomposition into tractable subproblems. To this end, we decompose the underlying MRF problem into trees, which can be solved via dynamic programming. The bottleneck subproblem is decomposed into a number of bottleneck chain labeling problems. To account for the global bottleneck term, these bottleneck chain problems are connected through a unary bottleneck labeling problem defined on a higher level graph. We use Algorithms LABEL:alg:chain-bottleneck-labeling-problemLABEL:alg:chain-bottleneck-labeling-problem and LABEL:alg:unary-bottleneck-labeling-problemLABEL:alg:unary-bottleneck-labeling-problem as subroutines to solve the bottleneck decomposition. An example of our decomposition can be seen in Figure 2.
To account for the MRF-inference problem, we cover the graph by trees . For the bottleneck labeling problem we cover the graph by chains (trees) . For the decomposition we introduce variables to specify the labeling of the MRF subproblem for graph , and variables for the chain bottleneck labeling subproblems defined for graph . We propose the following overall decomposition:
[TABLE]
We constrain the variables for MRF tree subproblems and chain bottleneck subproblems to be consistent via (10b) and (10c). The labelings on chains () are required to be feasible with respect to the bottleneck value via (10f).
To obtain a tractable optimization problem, we dualize the constraints (10b) and (10c) using dual variables and respectively. We keep rest of the constraints and denote the feasible variables for chain w.r.t constraints (10e), (10f) by the set as:
[TABLE]
The dual problem is:
[TABLE]
Where and are defined as:
[TABLE]
Evaluating amounts to solving a MAP-MRF problem on a tree. Evaluating corresponds to computing for each bottleneck value the corresponding minimal assignment from the set for all chains in , and then choosing a bottleneck value such that the sum over all chain subproblems is minimal.
The dual problem (12a) is a non-smooth concave maximization problem. Evaluating and gives supergradients of (12a) which can be used in subgradient based solvers (e.g. subgradient descent, bundle methods).
**Finding : ** Each MRF tree subproblem can be solved independently using dynamic programming to get the optimal labeling .
**Finding : ** is solved by Algorithm LABEL:alg:multiple-chain-bottleneck-labeling-problemLABEL:alg:multiple-chain-bottleneck-labeling-problem. It proceeds as follows:
Build a higher level graph and represent each chain in as a node in . Populate the potentials for all nodes by going through all using Algorithm LABEL:alg:chain-bottleneck-labeling-problemLABEL:alg:chain-bottleneck-labeling-problem (lines 3-3). 2. 2.
Use Algorithm LABEL:alg:unary-bottleneck-labeling-problemLABEL:alg:unary-bottleneck-labeling-problem to find an optimal bottleneck value of in graph (lines 3-3). 3. 3.
Find optimal labelings for each chain subproblem by disallowing the configurations having bottleneck values more than . (lines 3- 3).
We optimize the Lagrange multipliers in the dual problem (12a) subject to constraints 12b with the proximal bundle method [35]. After solving the dual problem we have found a valid reparameterization of the MRF potentials (12b) which maximizes the lower bound. This lower bound can be used as a measure of the quality of the primal solution and helps in rounding a primal solution.
4.4 Primal rounding
For recovering the primal solution through rounding on MRF subproblems, we use the approach of [19] by treating the dual optimal values of as MRF potentials. Additional details on how to best choose and more details are given in Section 6.2 in the Appendix.
We also considered using message passing instead of subgradient ascent by computing the min-marginals through the procedure used in primal rounding. However, our strong relaxation does not easily allow an efficient way for min-marginal computation. Specifically, for the subproblem in (13b) it is hard to reuse computations to recalculate min-marginals efficiently for different variables, making any message passing approach slow. Rounding is only executed once at the end, hence very efficient min-marginal computation is of lesser concern.
5 Experiments
We apply our proposed technique for tracking layers (horizons) in open source sub-surface volumetric data (a.k.a. seismic volumes). Accurate tracking of horizons is one of the most important problems in seismic interpretation for geophysical applications. See Fig. 3 for an illustration of a cross-section of a typical seismic volume.
Popular greedy approaches such as [44] rely on tracking horizons by establishing correspondences between nearby points that lie on the same horizon with high probability. As such they are prone to fall into local optima.
On the other hand a natural option to track horizons is to state the problem as MAP-inference in MRFs for which high-quality solvers exist that are less sensitive to local optimal. However, the cost structure of MRFs is not adequate for our problem. Specifically, the MRF energy is composed of a sum of local terms defined over nodes and edges of the graphical model that account for similarity between seed and adjacent nodes respectively. For the cost function it may be more advantageous to track a wrong horizon if it is more self-similar and pay a higher dissimilarity cost at a few locations rather than follow the correct horizon if it is harder to follow. This problem is exacerbated by the typically rather weak unary costs (see below). For an illustration of this behaviour we refer to Figure 3.
We argue that the bottleneck potential remedies this shortcoming of MAP-MRF, while preserving its advantages. In contrast to unary and pairwise potentials of MRFs, the bottleneck potential penalizes a single large discontinuity in the tracking that comes from jumping between layers more than multiple small discontinuities stemming from a hard to track horizon. As such it prefers to follow a rugged horizon with no big jump over a single large jump followed by an easy to track horizon.
Further computational challenges in horizon tracking are:
- •
Nearby rock layers look very similar to the layer in consideration, due to which tracking algorithms can easily jump to a wrong layer.
- •
Due to structural deformations in the subsurface environment, rock layers can have discontinuities.
- •
The appearance of a horizon can vary at different locations. Nearby horizon layers can bifurcate or disappear. This makes estimation of reliable cost terms for our model (6) difficult.
- •
The size of a seismic survey can be huge. Some open-source seismic volumes cover a survey area. The corresponding input data has a size of GB.
- •
Lack of established open-source ground truth makes the evaluation procedure and learning of parameters difficult.
**Seismic horizon tracking formulation: ** Our input are 3-D seismic volumes of size , where corresponds to the depth axis and to x,y-axes resp. For each location in the volume we seek to assign a depth value . We formalize this as searching for a labeling function .
For computational experiments we used publicly available seismic volumes [36]. Due to lack of available ground truth, we have selected three volumes (F3 Netherlands, Opunake-3D, Waka-3D) and tracked 11 horizons by hand with the commercial seismic interpretation software [14]. Annotation took two days of work of an experienced seismic interpreter.
**Matching costs: ** In order to track horizons, established state-of-the-art tracking algorithms [44, 43, 13] rely on computing similarity scores between patches that might or might not lie on the same horizon. This is traditionally done using hand-crafted features that rely on cross-correlation or optical flow. We opt for a learning-based approach following the great success of CNN-based architectures for computing patch similarity [45, 38]. Specifically, we extend the architecture of the 2-D patch similarity CNN [45] to efficiently deal with the 3-D structure. CNN architecture and the procedure for training is mentioned in Figure 6 and Section 6.1 in Appendix. We train two CNNs to compute matching probabilities between depth labels of nodes . The two neural networks and compute matching probabilities between adjacent and non-adjacent nodes respectively:
[TABLE]
In (14) , is a two channel image of size centered around .
**Graph construction: ** We rephrase computing a depth labeling as labeling nodes in a graph. Consider the grid graph with nodes corresponding to points on and axes of the seismic volume. Edges are given by a neighborhood induced by the -grid. For each node the label space is . A depth labeling thus corresponds to a node labeling of . We compute unary , and pairwise potentials , based on the patch matching costs computed in (14). Further details about unary and pairwise potentials can be found in Section 6.1 in Appendix.
**Algorithms: ** We compare our method with a plain MRF, and a state-of-the-art heuristic for horizon tracking based on a variation of minimum spanning tree (MST) problem.
- •
MST: The horizon tracking approach of [44] is a greedy approach inspired by Borůvka’s algorithm for MST [4]. It starts by marking the nodes where the seed labels are given and iteratively marks assigns label to the adjacent node that contributes minimal cost w.r.t. unary and pairwise potentials.
- •
MRF: We solve MAP-MRF on with unary and pairwise potentials from above with TRWS [19].
- •
B-MRF: We additionally include a bottleneck term on pairwise bottleneck potentials (19). We solve the problem with the algorithm from Section 4.3, and round a primal solution with the procedure from Section 4.4.
**Results: ** Table 1 lists the mean absolute deviation (MAD) of different methods for tracking horizon surfaces. MAD is computed as i.e., the norm of the difference in tracked surface and the ground-truth , normalized by the dimensions of the surface.
From table 1 we see that our method B-MRF outperforms MRF and MST by a wide margin on most problems. The resulting depth surfaces can be seen in Figure 4, and the rest in the Appendix. Inspecting the results we see that MRF finds piece-wise continuous assignments not necessarily corresponding to the correct surface but preferring easy to track ones. On the other hand, MST often follows the horizon quite well but as soon as it marks a pixel wrongly, it often cannot recover due to its greedy nature. Our method addresses these shortcomings to a large degree. The resulting surfaces are smooth and track the correct surface to a larger extent than both MST and MRF by favoring piece-wise smooth regions due to the ordinary unary and pairwise MRF costs, yet following tracks that do not make incorrect jumps due to the additional bottleneck costs. Since we optimize a global energy by a convex relaxation method, we suffer less from poor local optimal than MST. On the other hand, the more expressive optimization model and advanced algorithms that allow B-MRF to obtain higher accuracy also lead to higher runtimes.
6 Appendix
Lemma 1**.**
The following relaxation over the local polytope (4) is not tight for bottleneck labeling problem:
[TABLE]
s.t.
[TABLE]
Proof.
We give a proof by example as in the Figure 5 with a 3 node binary graphical model only containing pairwise bottleneck potentials with . Both integer optimal solutions have cost: , whereas the optimal solution of the LP relaxation has cost .
∎
Lemma 2.
Shortest path computation (line 2) on directed acyclic graph can be done in linear time in the number of arcs using breadth-first traversal by Algorithm LABEL:alg:dynamic-shortest-path-for-chainsLABEL:alg:dynamic-shortest-path-for-chains. However, shortest path update is performed every time after introducing each arc in (lines 2-2), thus making the run-time i.e. quadratic in the number of arcs in . The sorting operation adds an additional factor of . ∎
6.1 Experiment details
**Cost formulation: **
For unary potentials of seed node , we set . Rest of the unary potentials are computed as:
[TABLE]
The pairwise potentials contain terms in-addition to patch similarity from the CNN. First, we compute optical flow using the gradient structure tensor based approach of [1] on the seismic volume along the x and y-axis resulting in optical flow and . From this we compute the displacement penalization for edges in x-direction as , and analogously for edges in y-direction. Next, we compute a coherence estimate [1] indicating whether the optical flows and are reliable. The pairwise MRF potentials combine the above terms into a sum of appearance terms (14) and weighted discontinuity penalizers:
[TABLE]
The second term allows discontinuities in the horizon surfaces where orientation estimates can be incorrect and penalizes discontinuities where the horizon is probably continuous. The authors from [44, 13] do not use the coherence estimate, making their cost functions less robust. The bottleneck pairwise potentials are:
[TABLE]
The scaling parameter in (19) makes the bottleneck potential invariant to the grid size.
**CNN training: **
We use six out of eleven labeled horizons for training the CNN adopted from [45] mentioned in Figure 6. To prevent the CNN from learning the whole ground truth for these six horizons only of the possible patches are used. For data augmentation, translation is performed on non-matching patches as also done in [38] for stereo. The validation set contains of the possible patches from each of the eleven horizons.
For training the in (18), edges in underlying MRF models are sampled for creating the patches. Training was done using PyTorch [26] for epochs and the model with the best validation accuracy () was used for computing the potentials.
For training the used in (17), we use the same procedure as above except that the patches are sampled randomly and thus their respective nodes do not need to be adjacent. The best validation accuracy on the trained model was .
6.2 Primal rounding details
The function defines some ordering for all nodes in . Based on this order, we sequentially fix labels of the respective node as in Algorithm LABEL:alg:primal-rounding-MRFLABEL:alg:primal-rounding-MRF.
For any given dual variables, there is a whole set of equivalent dual variables that give the same dual lower bound. However, when rounding with LABEL:alg:primal-rounding-MRFLABEL:alg:primal-rounding-MRF, the final solution is sensitive to the choice of dual equivalent dual variables. One of the reasons is that the rounding is done on the MRF subproblems only, so we want to ensure that the dual variables carry as much information as possible. Therefore, we propose a propagation mechanism that modifies the dual variables of the bottleneck potentials such that the overall lower bound is not diminished, while at the same time making the MRF dual variables as informative as possible. This procedure is described in Algorithm LABEL:alg:min-marginals-chainsLABEL:alg:min-marginals-chains.
Schemes similar to Algorithm LABEL:alg:min-marginals-chainsLABEL:alg:min-marginals-chains were also proposed in [30] for exchanging information between pure MRF subproblems. Intuitively, the goal in such schemes is to extract as much information out of a source subproblem (in our case bottleneck MRF subproblem) and send it to the target subproblem (in our case MRF subproblem) such that the overall lower bound does not decrease. Such a strategy helps the target subproblem in taking decisions which also comply with the source subproblem.
Algorithm LABEL:alg:min-marginals-chainsLABEL:alg:min-marginals-chains computes min-marginals of nodes for a given bottleneck chain subproblem . Proceeding in a similar fashion as Algorithm LABEL:alg:chain-bottleneck-labeling-problemLABEL:alg:chain-bottleneck-labeling-problem, min-marginals computation on a chain is done as follows:
As an input, the costs of solution in all other chains excluding are required, which can be obtained in the similar fashion as was done before in Algorithm LABEL:alg:multiple-chain-bottleneck-labeling-problemLABEL:alg:multiple-chain-bottleneck-labeling-problem (lines 3-3). 2. 2.
The algorithm maintains forward and backward shortest path distances (i.e., distances from source, sink resp.). This helps in finding the cost of minimum distance path passing through any given node in the directed acyclic graph of chain . 3. 3.
Similar to Algorithm LABEL:alg:chain-bottleneck-labeling-problemLABEL:alg:chain-bottleneck-labeling-problem, bottleneck potentials are sorted and the bottleneck threshold is relaxed minimally on each arc addition. 4. 4.
On every arc addition, the set of nodes for which distance from source/sink got decreased are maintained in resp. Only this set of nodes will need to re-compute their min-marginals.
The above-mentioned Algorithm LABEL:alg:min-marginals-chainsLABEL:alg:min-marginals-chains only calculates the min-marginals for nodes, the calculation of min-marginals for edges can be carried out in similar way by also updating those edges of the DAG whose adjacent node gets updated (lines 7- 7). After computing the min-marginals, messages to MRF tree Lagrangians can be computed by appropriate normalization.
6.3 Results
Figures 7- 17 contain the visual comparison of tracked horizon surfaces mentioned in Table 1. Similar to Figure 4, the color of the surface denotes depth.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] Peter Bakker. Image structure analysis for seismic interpretation . Ph D thesis, Delft University of Technology, 2002.
- 2[2] S. Bistarelli. Semirings for Soft Constraint Solving and Programming . Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2004.
- 3[3] S. Bistarelli, U. Montanari, F. Rossi, T. Schiex, G. Verfaillie, and H. Fargier. Semiring-based CS Ps and valued CS Ps: Frameworks, properties, and comparison. Constraints , 4(3):199–240, Sep 1999.
- 4[4] Otakar Boruvka. O Jistém Problému Minimálním (About a Certain Minimal Problem) (in Czech, German summary). Práce Mor. Prírodoved. Spol. v Brne III , 3, 1926.
- 5[5] Stephen Boyd and Lieven Vandenberghe. Convex Optimization . Cambridge University Press, March 2004.
- 6[6] Y. Boykov and O. Veksler. Graph Cuts in Vision and Graphics: Theories and Applications , pages 79–96. Springer US, Boston, MA, 2006.
- 7[7] Thomas T. Cormen, Charles E. Leiserson, and Ronald L. Rivest. Introduction to Algorithms . MIT Press, Cambridge, MA, USA, 1990.
- 8[8] C. Couprie, L. Grady, L. Najman, and H. Talbot. Power watershed: A unifying graph-based optimization framework. IEEE Transactions on Pattern Analysis and Machine Intelligence , 33(7):1384–1399, July 2011.
