Simplification of Polyline Bundles
Joachim Spoerhase, Sabine Storandt, Johannes Zink

TL;DR
This paper studies the problem of simplifying multiple shared polylines simultaneously, proving NP-hardness of approximation, and providing efficient bi-criteria approximation algorithms with fixed-parameter tractability results.
Contribution
It introduces a generalized polyline bundle simplification problem, proves its NP-hardness, and offers approximation algorithms with bounds based on error tolerance and shared bend points.
Findings
NP-hard to approximate within n^{1/3 - ε} for any ε > 0
Bi-criteria approximation with O(log(ℓ + n)) factor when allowing δ to be exceeded
Fixed-parameter tractability in the number of shared bend points
Joachim Spoerhase: Aalto University, Finland, and University of Würzburg. ORCID: https://orcid.org/0000-0002-2601-6452. Funded by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 759557).
Sabine Storandt: University of Konstanz. Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID 50974019 – TRR 161.
Johannes Zink: University of Würzburg. ORCID: https://orcid.org/0000-0002-7398-718X.
Copyright: Joachim Spoerhase, Sabine Storandt, and Johannes Zink
CCS concepts: Theory of computation → Approximation algorithms analysis; Theory of computation → Computational geometry; Theory of computation → Problems, reductions and completeness; Theory of computation → Fixed parameter tractability
Abstract
We propose and study a generalization to the well-known problem of polyline simplification. Instead of a single polyline, we are given a set of polylines possibly sharing some line segments and bend points. Our goal is to minimize the number of bend points in the simplified bundle with respect to some error tolerance $\delta$ (measuring Fréchet distance) but under the additional constraint that shared parts have to be simplified consistently. We show that polyline bundle simplification is NP-hard to approximate within a factor $n^{1/3-\varepsilon}$ for any $\varepsilon > 0$, where $n$ is the number of bend points in the polyline bundle. This inapproximability even applies to instances with only $\ell = 2$ polylines. However, we identify the sensitivity of the solution to the choice of $\delta$ as a reason for this strong inapproximability. In particular, we prove that if we allow $\delta$ to be exceeded by a factor of 2 in our solution, we can find a simplified polyline bundle with no more than $O(\log(\ell + n)) \cdot \mathrm{OPT}$ bend points in polytime, providing us with an efficient bi-criteria approximation. As a further result, we show fixed-parameter tractability in the number of shared bend points.
keywords:
Polyline Simplification, Bi-criteria Approximation, Hardness of Approximation, Geometric Set Cover
1 Introduction
Visualization of geographical information is a task of high practical relevance, e.g., for the creation of online maps. Such maps are most helpful if the information is neatly displayed and can be grasped quickly and unambiguously. This means that the full data often needs to be filtered and abstracted. Many important elements in maps like borders, streets, rivers, or trajectories are displayed as polylines (also known as polygonal chains). For such a polyline, a simplification is supposed to be as sparse as possible and as close to the original as necessary.
A simplified polyline is usually constructed from a subset of the bend points of the original polyline such that the (local) distance to the original polyline does not exceed a specifiable value according to a given distance measure, e.g., the Fréchet distance or the Hausdorff distance. The first such algorithm, which is still of high practical importance, was proposed by Ramer [16] and by Douglas and Peucker [7]. Hershberger and Snoeyink [13] proposed an implementation of this algorithm that runs in $O(n \log n)$ time, where $n$ is the number of bend points in the polyline. It is a heuristic algorithm as it does not guarantee optimality (or something close to it) in terms of retained bend points. An optimal algorithm in this sense was first proposed by Imai and Iri [14]. Chan and Chin [5] improved the running time of this algorithm to $O(n^2)$ for the Hausdorff distance. For the Fréchet distance, the optimal solution can be determined in $O(n^3)$ time as described by Godau [10].
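The shortest-path idea behind optimal simplification of a single polyline can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the implementation of Imai and Iri or of Chan and Chin: it explores the shortcut graph with a breadth-first search, and it assumes a simplified validity test (every skipped bend lies within `delta` of the shortcut segment) in place of a proper Fréchet or Hausdorff computation. All function names are hypothetical.

```python
from collections import deque
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the closed segment ab."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Parameter of the orthogonal projection, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def shortcut_ok(points, i, j, delta):
    """Assumed validity test: all skipped bends lie within delta of segment (i, j)."""
    return all(
        point_segment_distance(points[k], points[i], points[j]) <= delta
        for k in range(i + 1, j)
    )


def min_bend_simplification(points, delta):
    """Indices of a minimum-size simplification of a non-empty polyline.

    Bends are nodes, valid shortcuts are edges, and a BFS from the first
    bend finds a path to the last bend that uses the fewest bends.
    """
    n = len(points)
    parent = {0: None}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        if i == n - 1:
            break
        for j in range(i + 1, n):
            if j not in parent and shortcut_ok(points, i, j, delta):
                parent[j] = i
                queue.append(j)
    # Walk back from the last bend; it is always reachable because
    # consecutive bends form a valid (trivial) shortcut.
    path, v = [], n - 1
    while v is not None:
        path.append(v)
        v = parent[v]
    return path[::-1]
```

With this naive validity test the sketch takes cubic time in the number of bends; the cited algorithms obtain their better bounds by computing the set of valid shortcuts more cleverly.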
We remark that all of these algorithms consider the distance segment-wise. That is, the distance between each segment of the simplification and its corresponding sub-polyline of the input polyline does not exceed the given threshold. We adhere to this widespread approach. Intuitively and from an application point of view, it makes sense to map a point $p$ of the input polyline only to a point of a segment of the simplification “spanning over” $p$ with respect to the input polyline, as this ensures a certain degree of locality. However, the general unrestricted approach has also received attention in the literature. Here, the Hausdorff or Fréchet distance between the input polyline and the simplification as a whole polyline is considered. For the (undirected) Hausdorff distance, this problem becomes NP-hard [17], and for the Fréchet distance, there is an $O(kn^5)$ time algorithm, where $k$ is the output complexity of the simplification [17]. The problem variant where, in addition, the requirement is dropped that all bend points of the simplification must be bend points of the input polyline is called weak simplification. Agarwal et al. [1] show that an optimal simplification under the segment-wise Fréchet distance with distance threshold $\delta$, as computable using the algorithm by Imai and Iri, has no more bend points than an optimal weak simplification with distance threshold $\delta/4$. We note that the Fréchet distance between two polylines can be computed in polynomial time [2], but the problem may become NP-hard when considering additional properties like allowing shortcuts, which replace outliers in one of the polylines [3].
From a Single Polyline to a Bundle of Polylines
On a map, there are usually multiple polylines to display. Such polylines may share bend points (bends) and line segments between bends (segments) sectionwise. We call them a bundle of polylines. One example is a schematic map of a public transport network where bus lines are the polylines and these share some of the stations and legs.
One might consider simplifying the polylines of a bundle independently. This has some drawbacks, though. On the one hand, the total complexity might even increase when the shared parts are simplified in different ways; see Figure 1. On the other hand, it might suggest a misleading picture when we remove common segments and bends of some polylines, but not of all. Therefore, we require that a bend in a simplification of a bundle of polylines is either kept in all polylines containing it or discarded in all polylines. Our goal is then to minimize the total number of bend points that have to be kept. In Figure 2, we give an example of a simplification of a bundle of polylines.
Related Work
Polyline bundles were studied before in different contexts. In [4], the goal is to infer a concise graph which represents all trajectories in a given bundle sufficiently well. But this approach primarily aims at retrieving split and merge points of trajectories correctly and does not produce a simplification of each trajectory in the bundle. Methods for map generation based on movement trajectories [12] have a similar scope but explicitly allow discarding outliers and unifying sufficiently similar trajectories, which is not allowed in our setting.
Agarwal et al. [1] describe an $O(n \log n)$ time approximation algorithm for (classical) polyline simplification under the Fréchet distance. It is an approximation algorithm in the sense that the output simplification for distance threshold $\delta$ has at most as many bends as an optimal solution with distance threshold $\delta/2$. In Theorem 3.5, we also relate the size of our approximate solution respecting a distance threshold of $2\delta$ to an optimal solution with distance threshold $\delta$.
There is also a multitude of polyline simplification problem variants for single polylines which involve additional constraints. One important variant is the computation of the smallest possible simplification of a single polyline which avoids self-intersection [6]. Another practically relevant variant is the consideration of topological constraints. For example, if the polyline represents a country border, important cities within the country should remain on the same side of the polyline after simplification. It was proven that those problem variants are hard to approximate within a factor $n^{1/5-\varepsilon}$ [8]. Hence, in practice, they are typically tackled with heuristic approaches [8, 9].
Note that the only allowed inputs to those problem variants are either a single polyline without self-intersections or a set of polylines without self-intersections and without common bends or segments (except for common start or end points). In contrast, we explicitly allow non-planar inputs and polyline bundles in which bends and segments may be shared among multiple polylines. We also remark that the known results on hardness of approximation of these problems heavily rely on the constraint that feasible solutions are still non-intersecting. Since we do not require this, we have to resort to different techniques.
Contribution
We introduce the optimization problem of polyline bundle simplification, where we are given $\ell$ polylines on an underlying set of $n$ points as well as an error bound $\delta$ and seek to find a simplified polyline bundle with the smallest possible number of remaining points, where each simplified polyline has a Fréchet distance of no more than $\delta$ to the original polyline and the simplification is consistent for shared parts.
While the optimal simplification of a single polyline can be computed in polynomial time, we show that polyline bundle simplification is NP-hard to approximate within a factor $n^{1/3-\varepsilon}$ for any $\varepsilon > 0$. This result applies already to bundles of two polylines, hence excluding an efficient FPT-algorithm depending on parameter $\ell$.
On the positive side, we show that this strong inapproximability can be overcome when relaxing the error bound slightly. In particular, we design an efficient bi-criteria approximation algorithm. Here, we allow the simplified polylines in our solution to have a Fréchet distance of $2\delta$ instead of only $\delta$ to the original polylines. We can then approximate the optimal solution for the original choice of $\delta$ within a factor logarithmic in the input size. As the choice of $\delta$ for real-world problems often is made in a rather ad hoc fashion and uncertainties with respect to the precision of the input polylines have to be factored in as well, we deem our bi-criteria approximation to be of high practical relevance.
We furthermore show that, while the number of polylines in the bundles is not suitable to obtain an FPT-algorithm, the problem of polyline bundle simplification is indeed fixed-parameter tractable in the number of bend points that are shared among the polylines.
2 Formal Problem Definition
An instance of the polyline bundle simplification problem (from now on abbreviated by PBS) is specified by a triple $(P, \mathcal{L}, \delta)$, where $P = \{p_1, \dots, p_n\}$ is a set of $n$ points (bends) in the plane, $\mathcal{L} = \{L_1, \dots, L_\ell\}$ is a polyline bundle, that is, a set of $\ell$ polylines represented as lists of points from $P$, and $\delta$ is a distance parameter, which specifies a threshold for the maximum (segment-wise) Fréchet distance between the original and the simplified polyline bundle. Each polyline $L_i$ ($i \in \{1, \dots, \ell\}$) is simple in the sense that each bend of $P$ appears at most once in its list.
**Definition 2.1** (Polyline Bundle Simplification).
Given a triple $(P, \mathcal{L}, \delta)$, the goal is to obtain a minimum size subset $P^* \subseteq P$ of points, such that for each polyline $L \in \mathcal{L}$ its induced simplification $S_L$ (which is $L \cap P^*$ while preserving the order of points)
- contains the start and the end point of $L$, i.e., the first and the last point in the list of $L$ belong to $P^*$, and
- has a segment-wise Fréchet distance of at most $\delta$ to $L$, i.e., for each line segment $(a, b)$ of $S_L$ and the corresponding sub-polyline of $L$ from $a$ to $b$, abbreviated by $L[a, b]$, we have $d_F(L[a, b], (a, b)) \le \delta$.
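The two conditions of the definition can be checked mechanically for a candidate set of kept points. The following is a minimal Python sketch, assuming polylines are given as lists of hashable point identifiers and assuming a caller-supplied oracle `is_valid_shortcut(polyline, i, j)` that decides whether the segment between positions `i` and `j` stays within the threshold; both function names and the oracle are hypothetical and not part of the paper.

```python
def induced_simplification(polyline, kept):
    """Restrict a polyline (list of point ids) to the kept ids, preserving order."""
    return [p for p in polyline if p in kept]


def is_feasible(bundle, kept, is_valid_shortcut):
    """Check both conditions of the definition for every polyline in the bundle.

    bundle: list of polylines, each a list of point ids.
    kept: set of point ids retained in the simplification.
    is_valid_shortcut(polyline, i, j): hypothetical oracle returning True if
        the segment between positions i < j respects the distance threshold.
    """
    for polyline in bundle:
        # Condition 1: start and end point must be kept.
        if polyline[0] not in kept or polyline[-1] not in kept:
            return False
        # Condition 2: every segment of the induced simplification is valid.
        positions = [idx for idx, p in enumerate(polyline) if p in kept]
        for i, j in zip(positions, positions[1:]):
            if not is_valid_shortcut(polyline, i, j):
                return False
    return True
```

Because a single set `kept` is applied to all polylines, a shared bend is automatically either retained in every polyline containing it or discarded in all of them, which is the consistency requirement.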
For the sake of self-containedness we restate the definition of the Fréchet distance below.
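**Definition 2.2** (Fréchet Distance).
Between two polylines $L = \langle p_1, \dots, p_k \rangle$ and $L' = \langle q_1, \dots, q_{k'} \rangle$ in the Euclidean plane, the Fréchet distance is

$$d_F(L, L') = \inf_{\alpha, \beta} \max_{t \in [0,1]} \lVert L(\alpha(t)) - L'(\beta(t)) \rVert,$$

*where $\alpha \colon [0,1] \to [1,k]$ and $\beta \colon [0,1] \to [1,k']$ are continuous and non-decreasing functions with $\alpha(0) = \beta(0) = 1$, $\alpha(1) = k$, $\beta(1) = k'$,
and $L(i + \lambda) = (1 - \lambda)\, p_i + \lambda\, p_{i+1}$ with $\lambda \in [0,1]$ (analogously for $L'$).*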
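For the segment-wise setting, one of the two curves is always a single segment, which makes the decision question "is the Fréchet distance at most $\delta$?" easy to sketch. The Python sketch below relies on the standard observation that a polyline and a segment are within Fréchet distance $\delta$ exactly if the bends of the polyline can be matched, in order, to a non-decreasing sequence of points on the segment, each within distance $\delta$ (the distance between two linearly interpolated edges is largest at an end point). The function names are hypothetical, and floating-point boundary cases are not treated specially.

```python
import math


def within_delta_interval(a, b, p, delta):
    """Interval (lo, hi) of parameters t in [0, 1] with |a + t(b - a) - p| <= delta.

    Returns None if no point of segment ab is within delta of p.
    """
    dx, dy = b[0] - a[0], b[1] - a[1]
    fx, fy = a[0] - p[0], a[1] - p[1]
    qa = dx * dx + dy * dy
    qc = fx * fx + fy * fy - delta * delta
    if qa == 0:  # degenerate segment: a single point
        return (0.0, 1.0) if qc <= 0 else None
    qb = 2 * (fx * dx + fy * dy)
    disc = qb * qb - 4 * qa * qc
    if disc < 0:
        return None
    root = math.sqrt(disc)
    lo = max(0.0, (-qb - root) / (2 * qa))
    hi = min(1.0, (-qb + root) / (2 * qa))
    return (lo, hi) if lo <= hi else None


def segment_frechet_at_most(points, i, j, delta):
    """Decide whether the shortcut (points[i], points[j]) has Fréchet distance
    at most delta to the sub-polyline points[i..j]."""
    a, b = points[i], points[j]
    t = 0.0  # smallest parameter still available on the segment
    for k in range(i, j + 1):
        interval = within_delta_interval(a, b, points[k], delta)
        if interval is None or interval[1] < t:
            return False
        t = max(t, interval[0])  # greedy: match as early as possible
    return True
```

The zigzag `[(0, 0), (3, 0), (1, 0), (4, 0)]` illustrates why monotonicity matters: all its bends lie on the segment from the first to the last bend, yet the shortcut is only valid once the threshold reaches 1, because the polyline runs backwards by two units.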
3 Hardness of Polyline Bundle Simplification
In this section, we describe a polynomial-time reduction from Minimum Independent Dominating Set (MIDS) to PBS to show NP-hardness and hardness of approximation. In the MIDS problem, we are given a graph $G = (V, E)$, where $V$ is the vertex set and $E$ is the edge set of $G$. We define $\hat{n} = |V|$ and $\hat{m} = |E|$. The goal is to find a set $V^* \subseteq V$ of minimum cardinality that is a dominating set of $G$ as well as an independent set in $G$. A dominating set contains, for each vertex $v \in V$, $v$ itself or at least one of $v$’s neighbors. An independent set contains, for each edge, at most one of its endpoints. Halldórsson [11] has shown that MIDS, which is also referred to as Minimum-Maximal-Independent-Set, is NP-hard to approximate within a factor of $\hat{n}^{1-\varepsilon}$ for any $\varepsilon > 0$. In his proof, he uses a reduction from SAT to MIDS: from a SAT formula $\varphi$, he constructs a graph $G$ such that an algorithm approximating MIDS would decide whether $\varphi$ is satisfiable. We observe that this reduction is still correct if $\varphi$ is a 3-SAT formula. Moreover, we observe that the number of edges in the graph constructed in this reduction from a 3-SAT formula is linear in the number of vertices. Thus, we conclude the following corollary and assume henceforth that we reduce only from sparse graph instances of MIDS, in other words, $\hat{m} \le c\,\hat{n}$ for some sufficiently large constant $c$.
**Corollary 3.1.**
*MIDS on graphs of $\hat{n}$ vertices and $\hat{m} \le c\,\hat{n}$ edges, i.e., sparse graphs, is NP-hard to approximate within a factor of $\hat{n}^{1-\varepsilon}$ for any $\varepsilon > 0$.*
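The two properties that define a MIDS solution are easy to state in code, and any maximal independent set satisfies both. The sketch below is an illustration only, assuming the graph is given as a dictionary mapping each vertex to the set of its neighbors (no self-loops); the function names are hypothetical. The greedy routine returns some independent dominating set, not a minimum one, which is exactly the gap the hardness result is about.

```python
def is_independent_dominating(adj, chosen):
    """Check that `chosen` is both independent and dominating in the graph.

    adj: dict mapping each vertex to the set of its neighbors.
    """
    chosen = set(chosen)
    # Independent: no chosen vertex has a chosen neighbor.
    independent = all(adj[u].isdisjoint(chosen) for u in chosen)
    # Dominating: every vertex is chosen or has a chosen neighbor.
    dominating = all(v in chosen or not adj[v].isdisjoint(chosen) for v in adj)
    return independent and dominating


def greedy_maximal_independent_set(adj):
    """Scan the vertices in sorted order and pick every vertex that is not
    yet blocked by an earlier pick. The result is a maximal independent set
    and therefore also a dominating set."""
    chosen, blocked = set(), set()
    for v in sorted(adj):
        if v not in blocked:
            chosen.add(v)
            blocked.add(v)
            blocked |= adj[v]
    return chosen
```

On the path 1, 2, 3, 4 the greedy scan picks the vertices 1 and 3; the set consisting of 2 and 4 is another valid solution of the same size.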
In our reduction, we use three types of gadgets, which are in principle all lengthy zigzag pieces. We use vertex gadgets to indicate whether a vertex is in the set or not, edge gadgets to enforce the independent set property, and neighborhood gadgets to enforce the dominating set property. See Fig. 3 for an overview. We define our gadgets in terms of an arbitrary (threshold for the maximum Fréchet distance) and some . Note that our problem setting allows overlaps of different polylines without having a common bend or segment (non-planar input). In our reduction there can also be overlaps, which do not affect the involved polylines locally.
Vertex Gadget. For each vertex, we construct a vertex gadget (see Figure 3(a)), which we arrange vertically next to each other on a horizontal line in arbitrary order and with some distance between one and the next vertex gadget.
A vertex gadget has bends arranged in a zigzag course with x-distance ( for the first and the last segment) and y-distance between each two consecutive bends.
**Claim 1.**
In a vertex gadget, there is precisely one shortcut, which starts at the first and ends at the last bend.
Clearly, the line segment from the first to the last bend has Fréchet distance at most $\delta$ to the other bends and segments of the vertex gadget. Moreover, observe that there is no shortcut starting or ending at any inner bend. Thus, either none or all inner bends are skipped. We say that the corresponding vertex is in $V^*$ if and only if we do not skip the inner bends.
Edge Gadget. For each edge , we construct an edge gadget (see Figure 3(b)) being a zigzag course with bends and sharing its second and second last bend with one of the two corresponding vertex gadgets—the vertex gadgets of and . All neighboring bends from the second to the second last are equidistant in x-dimension, while the first and second bend, and the second last and last bend have the same x-coordinate. In y-dimension, the first and the last bend are below the second and second last bend, respectively. The other bends are above the second bend or below the first bend.
**Claim 2.**
In an edge gadget, there are precisely three long shortcuts. These are (i) from the first to the last bend, (ii) from the first to the second last bend, and (iii) from the second to the last bend. Besides these three shortcuts, there are more shortcuts, which skip only the second and the second last bend (and possibly also the third and third last bend). There is no shortcut not skipping one of the shared bends, i.e., the second or the second last bend.
In Appendix A, we argue that Claim 2 is correct. It follows that not skipping one of the two shared bends is a relatively expensive choice in terms of retained bends. Remember that not skipping one of the shared bends means not taking the shortcut in the corresponding vertex gadget, which means putting the corresponding vertex into $V^*$. So, skipping almost all bends in the edge gadget of $\{u, v\}$ implies that $u$ or $v$ is not in $V^*$, which means respecting the independent set property for the edge $\{u, v\}$.
Neighborhood Gadget. For each vertex , we construct a neighborhood gadget (see Figure 3(c)). This gadget shares a bend with every vertex gadget corresponding to a vertex of , which is and the vertices being adjacent to . These shared bends are on the same height. The vertex gadgets of appear in some horizontal order in our construction. Say the corresponding vertices in order are . Let the shared bends with and be and , respectively, and define as the distance between and . We place the first bend (the starting point) of the neighborhood gadget below and to the left of , where is the distance between and , and let the second bend be . Symmetrically, we place the last bend (the end point) of the gadget below and to the right of and let the second last bend be . Between each two bends and shared with the vertex gadgets of and for each , we add a zigzag with bends as in Figure 3(c).
**Claim 3.**
In a neighborhood gadget, the only shortcuts are (i) the shortcuts skipping only for and (ii) the shortcuts starting at the first bend or with and ending at the last bend or with —except for the shortcut starting at the first and ending at the last bend.
In Appendix A, we argue that Claim 3 is correct. Consequently, we can skip almost all bends in a neighborhood gadget if we keep at least one bend of . If we skip all of them, we can skip no other bend. So, to avoid high costs, we must not take the shortcut of the vertex gadget of at least one vertex of . This means that we must, for each , add a vertex of to , which enforces the dominating set property.
Observe that all shared bends are shared between only two polylines—by a vertex gadget and either an edge gadget or a neighborhood gadget. With inner bends, a vertex gadget provides enough bends that are shared with the edge and neighborhood gadgets as a vertex is contained in at most neighborhoods and has at most incident edges. In the following lemma, we analyze the size of the constructed PBS instance.
**Lemma 3.2.**
By our reduction, we obtain from an instance of MIDS an instance of PBS with bends such that , where , ( is constant).
- Proof.
To count the bends of the vertex, edge, and neighborhood gadgets without double counting, we charge the shared bends to the vertex gadgets. All vertex gadgets together have bends, all edge gadgets have bends without shared bends, and all neighborhood gadgets have bends without shared bends. Summing these values up and using yields (for )
[TABLE]
∎
We say a simplification of an instance of PBS obtained by this reduction corresponds to an independent and dominating set $V^*$ and vice versa if we take all “long” shortcuts in the vertex gadgets except for the ones corresponding to $V^*$ and we skip all inner unshared bends in all edge and neighborhood gadgets, which is possible since $V^*$ is independent and dominating. Observe that for each independent and dominating set there is precisely one corresponding simplification (which is also valid according to $\delta$).
**Lemma 3.3.**
Let be a solution for an instance of MIDS. In the instance of PBS obtained by our reduction, the size of the simplification corresponding to is , where and is constant.
- Proof.
Only for all , we take the shortcuts in the corresponding vertex gadgets in . This gives us remaining bends in all vertex gadgets combined. In the following, we will count shared bends for the vertex gadgets. We take a “long” shortcut in all of the edge gadgets. This gives us two remaining unshared bends in all edges gadgets ( bends in total). Moreover, we skip all inner unshared bends in all of the neighborhood gadgets ( bends remaining). Altogether, this sums up to . ∎
By Lemma 3.3, we know that for an optimal solution of an instance of MIDS, the corresponding simplification in the instance of PBS obtained by our reduction has size , where and which of course is at least the size of the optimal solution of . We formalize this in the following corollary.
**Corollary 3.4.**
For an instance of MIDS and the instance of PBS obtained by our reduction from , .
**Theorem 3.5.**
*PBS is NP-hard to approximate within a factor of $n^{1/3-\varepsilon}$ for any $\varepsilon > 0$, where $n$ is the number of bend points in the polyline bundle.*
- Proof.
Assume that there is an approximation algorithm solving any instance of PBS within a factor of for some constant relative to the optimal solution. We can transform any instance of MIDS, where , and , this is the size of an optimal solution, to an instance of PBS using the reduction described above in this section, where and the size of an optimal solution is .
Employing to solve yields a (simplified) polyline bundle . We denote the number of bends in by and we know that for some . If all -bend-sequences in all edge and neighborhood gadgets are skipped, we can immediately read an independent dominating vertex set from the vertex gadgets where the shortcut is not taken. Otherwise, we replace such that it corresponds to any maximal independent set (which is always an independent and dominating set and can be found greedily in polynomial time). Observe that this can only lower the number of bends compared to a solution not skipping all -bend-sequences in the edge and neighborhood gadgets as in all vertex gadgets together we can skip at most bends.
Using Lemma 3.3 and Corollary 3.4, we can state that
[TABLE]
which we can reformulate as . We can assume that as otherwise we could check all subsets of of size at most in polynomial time. Similarly, we can assume that is large enough so that . Beside this, we apply Lemma 3.2 and obtain
[TABLE]
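Since we know that it is NP-hard to approximate MIDS within a factor of $\hat{n}^{1-\varepsilon}$ for any $\varepsilon > 0$, where $\hat{n}$ is the number of vertices, it follows that the assumed approximation algorithm cannot be a polynomial time algorithm, unless P = NP. Or in other words, it is NP-hard to approximate PBS within a factor of $n^{1/3-\varepsilon}$ for any $\varepsilon > 0$. ∎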
Currently, we use one polyline per gadget. So, our reduction uses polylines. We can reduce the number of polylines to two by connecting all vertex gadgets—one after the other—in arbitrary order by two segments, which gives us the first polyline, and by connecting all edge and neighborhood gadgets in arbitrary order by two segments, which gives us the second polyline. The extra bend between each pair of new segments is placed far away from the construction, e.g. at . This never creates new shortcuts for skipping a bend in a vertex gadget or in a neighborhood gadget. Yet, we might create new shortcuts that allow for additionally skipping the first and the last bend of an edge gadget. However, we cannot skip any further bend unless the second or second last bend is skipped, which preserves the functionality of our gadget. For the analysis, this gives us an additive constant of at most bends that cannot be skipped, which we can include to Inequalities (2)–(4) in Theorem 3.5 with the same result to obtain the following corollaries.
**Corollary 3.6.**
*Even for instances of two polylines, PBS is NP-hard to approximate within a factor of $n^{1/3-\varepsilon}$ for any $\varepsilon > 0$, where $n$ is the number of bend points in the polyline bundle.*
**Corollary 3.7.**
*PBS is not fixed-parameter tractable in the number of polylines $\ell$, unless P = NP.*
4 Bi-criteria Approximation for Polyline Bundle Simplification
In this section, we describe a bi-criteria approximation algorithm for PBS. Conceptually, a bi-criteria approximation is a generalization of a (classical) approximation where it is allowed to violate a certain constraint by a specific factor. In particular, an algorithm is called a bi-criteria $(\alpha, \beta)$-approximation algorithm if it runs in polynomial time and produces a solution of size at most $\alpha \cdot \mathrm{OPT}$ while relaxing the constraint by a factor of $\beta$, where $\mathrm{OPT}$ is the size of an optimal solution for the unrelaxed constraint.
In our particular problem PBS, we relax the error bound $\delta$. In Section 3, we have shown that there is no bi-criteria $(n^{1/3-\varepsilon}, 1)$-approximation algorithm for PBS for any $\varepsilon > 0$ unless P = NP. This strong inapproximability comes from the high sensitivity towards choices of keeping or discarding single bends, which is modulated by the given value of $\delta$. By making a bad choice we cannot take (helpful) shortcuts that have a distance just a little greater than the given distance threshold to the original sub-polyline. This can be overcome by relaxing the constraint slightly. In particular, we show that allowing a constraint violation by a factor of 2, we can design an efficient algorithm with an approximation guarantee of $O(\log(\ell + n))$. For an overview of our algorithm see Fig. 6.
The key building block of our algorithm is a connection between PBS and a certain geometric set cover problem, which we call star cover problem. The star cover problem models the aspect of shortcutting polylines by few bend points but does not take into account consistency. We argue, however, that approximate solutions to the star cover problem can be post-processed to form consistent PBS solutions by slightly violating the error threshold $\delta$.
Star Cover Problem
Next, we introduce the star cover problem, which is a special type of the set cover problem defined over instances of PBS. Informally speaking, a star is a bend together with some incident shortcut segments. These shortcut segments span sets of original segments of the polylines. To this end, we first direct each polyline $L$ in a given PBS instance arbitrarily but ensuring that all (shortcut) segments of $L$ are oriented in the same direction. Then a star consists of a set of incoming shortcuts of some bend; see Fig. 5 for an example.
**Definition 4.1** (Star).
A star is the combination of a bend $b$ and, for each polyline that contains $b$, one or zero incoming shortcut segments (according to $\delta$).
We say a star $s$ covers a segment–polyline pair $(e, L)$ if $s$ contains for $L$ a shortcut $(a, b)$ and $e$ lies on $L$ between $a$ and $b$. Our goal is to find a small set of stars that cover all segment–polyline pairs. We denote the set of all segment–polyline pairs in the input by $U$ and the subset of pairs covered by a particular star $s$ by $U_s$. Then the star cover problem is defined as follows.
**Definition 4.2** (Star Cover).
A star cover is a set $S$ of stars such that $\bigcup_{s \in S} U_s = U$, i.e., all segment–polyline pairs are covered. The star cover problem (abbreviated by StCo) asks for a minimum size star cover.
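To make the covering relation concrete, the following Python sketch computes the segment–polyline pairs covered by a star and tests whether a set of stars is a star cover. The representation is an assumption made for illustration: polylines are lists of point identifiers directed as listed, a star is a pair of a central bend and a dictionary mapping a polyline index to the outer bend where the incoming shortcut starts, and a segment–polyline pair is encoded as `(line_id, k)` for the segment between positions `k` and `k + 1`. Validity of the shortcuts with respect to the distance threshold is not checked here.

```python
def covered_pairs(star, bundle):
    """Segment-polyline pairs covered by a star.

    star: (center, incoming), where incoming maps a polyline index to the
        outer bend at which the incoming shortcut on that polyline starts.
    bundle: list of polylines, each a list of point ids.
    """
    center, incoming = star
    pairs = set()
    for line_id, outer in incoming.items():
        line = bundle[line_id]
        start, end = line.index(outer), line.index(center)
        for k in range(start, end):
            pairs.add((line_id, k))  # segment k joins line[k] and line[k + 1]
    return pairs


def is_star_cover(stars, bundle):
    """True if the stars together cover every segment-polyline pair."""
    universe = {
        (line_id, k)
        for line_id, line in enumerate(bundle)
        for k in range(len(line) - 1)
    }
    covered = set().union(*(covered_pairs(s, bundle) for s in stars))
    return covered == universe
```

A single star centered at a shared bend can cover segments on several polylines at once, which is why the number of stars, and not the number of shortcuts, is the quantity being minimized.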
Relationship between Instances of Polyline Bundle Simplification and Star Cover
Next, we investigate the relationship between an instance of StCo and its corresponding instance of PBS. We argue that every (optimal) solution for PBS can be decomposed into a star cover. Hence an optimal StCo yields a lower bound for an optimal PBS solution.
**Lemma 4.3.**
The size $\mathrm{OPT}_{\mathrm{StCo}}$ of an optimal solution of any instance of StCo obtained from an instance $I$ of PBS is bounded by $\mathrm{OPT}_{\mathrm{StCo}} \le \mathrm{OPT}_{\mathrm{PBS}}$, where $\mathrm{OPT}_{\mathrm{PBS}}$ is the size of an optimal solution of $I$.
- Proof.
Consider an optimal solution $P^*$ of $I$. From the simplified polyline bundle induced by $P^*$, we can get a star cover for any instance of StCo obtained from $I$ by iteratively adding a star in the following way until there are only isolated bends. Get a star $s$ by taking any connected bend $b$ as a central bend and the bends that precede $b$ on each of the simplified polylines as its outer bends. Remove the segment–polyline pairs covered by $s$ from our simplified polyline bundle. Repeat this until there are no more segment–polyline pairs. The obtained star cover has at most $\mathrm{OPT}_{\mathrm{PBS}}$ stars and at least as many stars as a minimum star cover. So, $\mathrm{OPT}_{\mathrm{StCo}} \le \mathrm{OPT}_{\mathrm{PBS}}$. ∎
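The decomposition used in the proof can be sketched in a few lines. The Python sketch below is one possible reading of it, assuming polylines are lists of point identifiers directed as listed and that the kept points form a feasible PBS solution; instead of removing stars one by one, it groups the segments of the simplified bundle by the bend they end in, which yields the same stars. The function name and the star representation (central bend plus a dictionary from polyline index to outer bend) are hypothetical.

```python
def solution_to_star_cover(bundle, kept):
    """Decompose the simplified bundle induced by `kept` into stars.

    Every kept bend with at least one incoming simplified segment becomes a
    central bend; its outer bends are its predecessors on the simplified
    polylines. At most one star per kept bend is created.
    """
    incoming = {}
    for line_id, line in enumerate(bundle):
        simplified = [p for p in line if p in kept]
        for prev, cur in zip(simplified, simplified[1:]):
            incoming.setdefault(cur, {})[line_id] = prev
    return [(center, shortcuts) for center, shortcuts in incoming.items()]
```

Since each star is keyed by a distinct kept bend, the number of stars never exceeds the number of kept points, which is the inequality the lemma needs.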
Approximation for the Star Cover Problem
We can compute an approximate solution for StCo by employing the classical greedy algorithm [15] for set cover, which iteratively selects the set with the most uncovered elements until all elements are covered. However, if applied naively, the running time would be exponential in the size of the PBS instance as the number of stars might be in the order of . We observe, however, that it suffices to consider only maximal stars (containing on each polyline incident to the central bend the incoming shortcut that covers the largest number of segments). As there are only maximal stars, this guarantees polynomial running time.
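The restriction to maximal stars followed by the greedy set cover rule can be sketched as below. This is an illustrative Python sketch under stated assumptions, not the tuned procedure analyzed in the paper: the validity oracle `is_valid_shortcut(polyline, i, j)` is a hypothetical caller-supplied function that is assumed to accept every pair of consecutive positions, each maximal star is stored simply as the set of `(line_id, k)` pairs it covers, and no effort is made to reach the running time stated in the text.

```python
def maximal_stars(bundle, is_valid_shortcut):
    """Map each bend to the pairs covered by its maximal star.

    For every polyline and every position j > 0, the longest valid incoming
    shortcut (smallest start position i) is selected.
    """
    stars = {}
    for line_id, line in enumerate(bundle):
        for j in range(1, len(line)):
            # Validity need not be monotone in i, so scan from the front.
            i = next(i for i in range(j) if is_valid_shortcut(line, i, j))
            pairs = {(line_id, k) for k in range(i, j)}
            stars.setdefault(line[j], set()).update(pairs)
    return stars


def greedy_star_cover(stars):
    """Classical greedy set cover: repeatedly pick the star that covers the
    most still-uncovered pairs. Returns the chosen central bends in order."""
    uncovered = set().union(*stars.values())
    chosen = []
    while uncovered:
        center = max(stars, key=lambda c: len(stars[c] & uncovered))
        chosen.append(center)
        uncovered -= stars[center]
    return chosen
```

Every segment is covered by the maximal star centered at its end bend, so the union of all maximal stars is the full universe and the greedy loop always terminates.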
**Lemma 4.4.**
We can compute an -approximation for an instance of StCo obtained from an instance in time , where is the maximum number of polylines any bend point occurs in and is the maximum number of segments any valid shortcut (according to ) can skip.
- Proof.
There is a polynomial time greedy algorithm that yields an approximation for the set cover problem, where is the size of the largest set in the given collection of subsets of the universe [15]. The greedy algorithm works as follows. While there are uncovered elements from the universe, add the set with the largest number of uncovered elements to the set cover. In an instance of StCo, this is the maximum number of segment–polyline pairs a single star can cover. If the central bend point of a star lies in at most polylines, the star contains at most shortcut segments, and each of which covers at most segments, hence we have . Observe that .
Having settled the approximation ratio, it remains to prove the polynomial running time. Using the algorithm by Imai and Iri [14] independently for each polyline, we can find all (maximal) shortcuts for every bend on every polyline in time . Combining these shortcuts at every bend gives us all maximal stars in time . For each star, we also save the number of segment–polyline pairs it covers and, to each segment–polyline pair, we link all stars it appears in. Both can be done in time . As long as there are uncovered segments, we find the star with the most uncovered segments and then update the number of uncovered segments for the other stars. This can be done in time in total as well. ∎
Relationship between Star Covers and Solutions of Polyline Bundle Simplification
While a solution for PBS can be directly converted into a star cover as argued above, the converse is more intricate. The shortcuts contained in the selected stars may be overlapping or nested along a polyline, that is, bends skipped by one shortcut may be end points of another shortcut in the set. Moreover, shared parts of different polylines may be shortcut differently. Therefore consistency is not guaranteed. We explain how to derive from a star cover solution a solution for its corresponding instance of PBS. Some of the shortcuts of the StCo solution are replaced by shorter shortcuts in order to integrate some intermediate point to the PBS solution. Lemma 4.5 states that those newly introduced shortcuts can be at most away from the original polyline. The situation described there is depicted in Fig. 5. It follows immediately from a lemma by Agarwal et al. ([1], Lemma 3.3).
Lemma 4.5.
Given a polyline and a distance threshold . If there are with and (i.e., segment is a valid shortcut), then for any with , .
Equipped with this lemma, we now discuss the actual transformation from a StCo solution to a PBS solution. The idea is to keep, beside the starting points of all polylines, only the central bend points of the selected stars while dropping their leaves. This is closely tied with the fact that we minimize the number of stars while ignoring their degree in the algorithm. The main insight here is that the shortcuts induced by this augmented point set still have a small distance to the original polylines.
Lemma 4.6.
Let be a star cover for an instance of StCo obtained from an instance of PBS. If is an -approximation for its instance of StCo, a bi-criteria -approximation for can be computed in time from .
Proof.
Let be the set of central bends of the stars in and let be the set of first bends of all polylines from . We return as the bi-criteria approximate solution. Clearly, we can construct this set in time . According to Lemma 4.3, , where is the size of the optimal solution of and is the size of the optimal solution of the instance of StCo where is an approximation for. We conclude
[TABLE]
Let be the polyline bundle induced by . It remains to prove that the Fréchet distance between each induced segment of each polyline in and its corresponding sub-polyline in is at most . Consider any segment of any polyline corresponding to a polyline such that precedes in . There is a star in that covers all segments of . Clearly, all segments of are covered by the stars of and if there was no single star covering all segments of , but multiple stars, there would be another central bend of a star between and on and, in , would not be a segment. The central bend of succeeds or is equal to as otherwise would not cover all of . Accordingly, the outer bend of on precedes or is equal to as otherwise would not cover all of . By the definition of a star, we know that . By Lemma 4.5, it follows that . ∎
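The construction in this proof (keep the central bends of the selected stars together with the first bend of every polyline, and drop everything else) can be sketched as below. The representation is an assumption made for illustration: polylines are lists of hashable bend identifiers, a star cover is reduced to the set of its central bends, and the helper names `retained_bends` and `induced_bundle` are hypothetical. The sketch does not verify any distance bound; it only shows that the retained set is applied uniformly to all polylines, which is what makes the result consistent on shared bends.

```python
from typing import Hashable, Iterable, List, Sequence, Set

Bend = Hashable


def retained_bends(polylines: Sequence[Sequence[Bend]],
                   star_centers: Iterable[Bend]) -> Set[Bend]:
    """Central bends of the chosen stars plus the first bend of each polyline."""
    first_bends = {line[0] for line in polylines}
    return set(star_centers) | first_bends


def induced_bundle(polylines: Sequence[Sequence[Bend]],
                   keep: Set[Bend]) -> List[List[Bend]]:
    """Restrict every polyline to the retained bends, preserving their order."""
    return [[b for b in line if b in keep] for line in polylines]


# Hypothetical bundle: two polylines sharing the bends "b" and "c".
bundle = [["a", "b", "c", "d", "e"], ["f", "b", "c", "g"]]
# Assumed star cover, given only by its central bends. In a valid cover the
# last bend of every polyline is a central bend, as it is here.
centers = {"c", "e", "g"}

keep = retained_bends(bundle, centers)
simplified = induced_bundle(bundle, keep)
```

Here the shared bend "b" is dropped from both polylines and the shared bend "c" is kept in both.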
Bi-criteria Approximation for Polyline Bundle Simplification via Star Cover
Using the previous lemmas, we obtain the main theorem of this section. It is reasonable to assume that the number of polylines is polynomial in in practically relevant settings. Hence, we essentially obtain an exponential improvement over the complexity-theoretic lower bound if we allow the slight violation of the error bound.
Theorem 4.7.
There is a bi-criteria -approximation algorithm for PBS running in time , where is the number of polylines and is the number of bend points in the polyline bundle.
Proof.
We describe a (kind of) approximation-preserving reduction from PBS to StCo, which can be realized as a bi-criteria approximation algorithm. Its steps are depicted in Fig. 6. Given an instance of PBS, where we let the size of the optimal solution be , we assign an arbitrary direction to each . This yields our corresponding instance of StCo. For this corresponding instance of StCo, compute an approximation star cover . We can do this in time according to Lemma 4.4. According to Lemma 4.6, we can compute a bi-criteria -approximation for from in time. Since and , this is also a bi-criteria -approximation. ∎
5 Fixed-Parameter Tractability
A brute-force approach checks for every subset of the bend set in time whether it is a valid simplification and accepts the one with the smallest number of bends or segments. Consequently, the runtime of this approach is . When considering fixed-parameter tractability, investigating parameters of the input is a natural choice. According to Corollary 3.7, PBS is not fixed-parameter tractable (FPT) in the number of polylines $\ell$ unless P = NP. However, PBS is FPT in the number of shared bends, i.e., bends contained in more than one polyline. We denote the set of those bends by and we let .
Theorem 5.1.
PBS is FPT in the number of shared bends . There is an algorithm solving PBS in time .
Proof.
We describe an algorithm that solves PBS in time . Given an instance of PBS, the first step is to compute, for each , its shortcut graph using the algorithm by Chan and Chin [5]. This can be done in time . For a polyline and a distance threshold , the shortcut graph is the directed graph that has the bends of as its vertices and has an edge from to if , that is, if there is a shortcut from to in . Given the shortcut graph of , the vertices of a shortest path in from the first bend of to the last bend of define an optimal simplification of .
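The single-polyline step described here (build the shortcut graph, then take a shortest path from the first to the last bend) can be sketched as follows. The validity test is deliberately left abstract: `is_shortcut(i, j)` is a hypothetical oracle standing in for the Fréchet-distance check, and the quadratic construction below is a naive stand-in, not the algorithm from the cited literature. Since all edges have unit weight, breadth-first search yields a path with the fewest bends.

```python
from collections import deque
from typing import Callable, Dict, List, Optional


def shortcut_graph(num_bends: int,
                   is_shortcut: Callable[[int, int], bool]) -> Dict[int, List[int]]:
    """Directed graph on bend indices; original segments are always edges."""
    return {
        i: [j for j in range(i + 1, num_bends)
            if j == i + 1 or is_shortcut(i, j)]
        for i in range(num_bends)
    }


def min_bend_path(graph: Dict[int, List[int]],
                  source: int, target: int) -> Optional[List[int]]:
    """Breadth-first search; returns a path with the fewest vertices, or None."""
    parent: Dict[int, Optional[int]] = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            break
        for v in graph.get(u, []):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    if target not in parent:
        return None
    path: List[int] = []
    node: Optional[int] = target
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


# Toy oracle (an assumption for the example): a shortcut may skip one bend.
def toy_oracle(i: int, j: int) -> bool:
    return j - i <= 2


graph = shortcut_graph(6, toy_oracle)
path = min_bend_path(graph, 0, 5)
```

With six bends and shortcuts that skip at most one bend, any simplification needs at least four bends, and the search returns a path of exactly that size.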
The second step is to iterate over all subsets and check if is part of an optimal solution. Before the first iteration, we initialize a variable and we will save the current best solution by . Then, in each iteration, we temporarily remove from all shortcut graphs all vertices and all edges that correspond to a shortcut skipping a bend in . Clearly, removing can be performed in time for each . For the removal of the edges in , note that we can sort the list of bends and the list of all edges (defined by their endpoints) alphanumerically by the occurrence of the bends within the polyline . If we traverse both lists simultaneously in ascending order, we remove an edge if and only if its endpoint-bends come before and after the currently considered bend from . Therefore, the removal operations can be performed in time per .
If some shortcut graph becomes disconnected by these removal operations, we continue with the next iteration. Otherwise, we take the bends of a shortest path from the first to the last bend in each reduced version of . Together they define a simplification of our PBS instance. If the number of bends in is less than , we set and . After the iteration process, we return . Since we have subsets of and each iteration can be performed in time, the running time of the algorithm is in .
It remains to prove that is in the end an optimal solution of our input instance of PBS. First note that our algorithm always returns some polyline simplification because for , we do not get a disconnected after the removal operations.
The returned solution is valid because the shared bends of are taken in all simplified polylines (they cannot be skipped) and the other shared bends are skipped in all simplified polylines. Our algorithm finds the minimum size solution because in one iteration it considers , where is the set of retained bends of an optimal solution. Moreover, an optimal solution cannot have fewer bends occurring in only one polyline than our algorithm since this would imply a shorter shortest path within the reduced version of . ∎
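The enumeration over subsets of shared bends can be sketched as below. This is an illustration under stated assumptions, not the authors' code: polylines are lists of bend identifiers without repeated bends, `is_shortcut(p, i, j)` is a hypothetical oracle telling whether bend index `i` to bend index `j` is a valid shortcut in polyline `p`, and the reduced shortcut graphs are explored implicitly instead of being materialised, so the sketch does not match the running time claimed in the theorem.

```python
from collections import deque
from itertools import combinations
from typing import Callable, Dict, Hashable, List, Optional, Sequence, Set

Bend = Hashable
Oracle = Callable[[int, int, int], bool]


def shared_bends(polylines: Sequence[Sequence[Bend]]) -> Set[Bend]:
    """Bends that occur in more than one polyline."""
    seen: Set[Bend] = set()
    shared: Set[Bend] = set()
    for line in polylines:
        for b in set(line):
            if b in seen:
                shared.add(b)
            seen.add(b)
    return shared


def reduced_shortest_path(line: Sequence[Bend], p: int, keep: Set[Bend],
                          shared: Set[Bend], ok: Oracle) -> Optional[List[int]]:
    """BFS in the reduced shortcut graph: dropped shared bends are removed,
    and no edge may skip a retained shared bend."""
    n = len(line)
    allowed = [b not in shared or b in keep for b in line]
    if not (allowed[0] and allowed[-1]):
        return None
    parent: Dict[int, Optional[int]] = {0: None}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        if i == n - 1:
            break
        for j in range(i + 1, n):
            if j > i + 1 and line[j - 1] in keep:
                break  # every longer edge from i would skip a retained bend
            if allowed[j] and j not in parent and (j == i + 1 or ok(p, i, j)):
                parent[j] = i
                queue.append(j)
    if n - 1 not in parent:
        return None
    path, node = [], n - 1
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def solve_pbs(polylines: Sequence[Sequence[Bend]], ok: Oracle) -> Optional[Set[Bend]]:
    """Try every subset of shared bends as the retained shared bends."""
    shared = shared_bends(polylines)
    candidates = sorted(shared, key=repr)
    best: Optional[Set[Bend]] = None
    for r in range(len(candidates) + 1):
        for subset in combinations(candidates, r):
            keep, solution = set(subset), set()
            for p, line in enumerate(polylines):
                path = reduced_shortest_path(line, p, keep, shared, ok)
                if path is None:
                    break  # some reduced graph is disconnected
                solution.update(line[i] for i in path)
            else:
                if best is None or len(solution) < len(best):
                    best = solution
    return best


# Hypothetical instance: two polylines sharing "b" and "c", and a hand-made
# table of valid shortcuts given as (polyline, from index, to index).
bundle = [["a", "b", "c", "d"], ["e", "b", "c", "f"]]
valid = {(0, 0, 2), (1, 0, 2), (1, 1, 3)}
best = solve_pbs(bundle, lambda p, i, j: (p, i, j) in valid)
```

In this instance only two of the four subsets lead to connected reduced graphs, and retaining just "c" gives the smaller solution.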
6 Conclusion and Outlook
We have generalized the well-known problem of polyline simplification from a single polyline to polyline bundles. Although efficient algorithms have long been known for the case of one polyline, it turned out that simplifying two or more polylines is a problem that is hard to approximate within a factor of $n^{1/3-\varepsilon}$ for any $\varepsilon > 0$, where $n$ is the number of bend points. However, if we relax the constraint on the maximum Fréchet distance between original and simplified polyline by a factor of 2, we can overcome this strong inapproximability bound. Moreover, we can find an optimal simplification quickly if we have only a small number of shared bends since the problem of polyline bundle simplification is fixed-parameter tractable (FPT) in this parameter.
Based on our results, there are many possible directions for future research.
- Our current bi-criteria approximation guarantee is logarithmic in the number of polylines $\ell$ plus the number of bend points $n$. In most practical applications, $\ell$ is smaller than or at most polynomial in $n$. From a theoretical perspective, however, it might be interesting to get rid of the dependency on $\ell$ in the bi-criteria approximation in order to get improvements for the case where $\ell$ is significantly larger than $n$.
- As a distance measure, we employed the Fréchet distance, which we consider to be more natural and intuitive than the Hausdorff distance when comparing polylines. However, the Hausdorff distance is sometimes used in classical polyline simplification as well. Our hardness results also apply to the Hausdorff distance, but our bi-criteria approximation algorithm fails since Lemma 4.5 is not true for the Hausdorff distance. One might consider PBS using the Hausdorff distance or other (even non-segment-wise) distance measurements.
- In our generalization to bundles of polylines, we aim at minimizing the number of retained bends (Min-Bends). However, minimizing the number of retained segments (Min-Segments) is an alternative goal, which also generalizes the classical minimization problem for a single polyline. Optimal simplifications for both goals may differ; see Fig. 7. Our hardness and FPT results also apply to the goal Min-Segments. However, it is not clear how to obtain a similar result for the bi-criteria approximability.
- For practical purposes, the scalability of the proposed bi-criteria approximation algorithm, the FPT algorithm, and possibly new heuristics should be investigated on real-world data.
Appendix A Omitted Content of Section 3
It remains to show the correctness of Claim 2 and Claim 3, which we use in our reduction from MIDS to PBS. Our gadgets are depicted in Fig. 3. For convenience, we provide by Fig. 8 a copy of them with some additional details, to which we will refer in this appendix. For example is the x-distance between two consecutive (inner) vertices in an edge and a neighborhood gadget (if a gadget is rotated, the distance is measured along the corresponding rotated axis). We know that .
See Claim 2.
In (i), both of the shared bends, these are the second and the second last, are skipped and we can take the “long” shortcut from the first to the last bend because the line segment between them is horizontal and has y-distance or or to all inner bends. In (ii), the most critical part is the distance between the third last bend and the straight-line segment from the first to the second last bend (see Figure 3(b)). It is
[TABLE]
Observe that (iii) is the same as (ii) but mirrored. If neither the second nor the second last bend is skipped, i.e., if and are in the set , then we cannot cut short anything in this gadget. Clearly, we cannot take a “long” shortcut from the second to the second last bend because the lower row of inner bends has distance from the potential shortcut segment. Moreover, we cannot take a “short” shortcut from a bend of the lower row to a bend of the upper row or the other way around. If we aimed to skip two inner bends, the distance (see Figure 3(b)) from an inner bend to the shortcut segment would have to be at most . However, it is
[TABLE]
where
[TABLE]
and hence,
[TABLE]
Observe that this becomes even greater if we aim to skip four or more bends or if we start or end at one of the two shared bends. To make this clearer, we explicitly consider the latter case where a potential shortcut would start at the second bend and end at the -th bend. This situation is depicted in Fig. 9. If it were a valid shortcut, would be less than or equal to . Since is inside a right triangle, its length is
[TABLE]
where is inside another right triangle with legs of length and , so
[TABLE]
We can determine via the angles and as
[TABLE]
In the -functions, all parameters are positive, so they live in the range . Hence, lives in the range . In this range, the -function is monotonically increasing. Therefore, to give a lower bound on , we can use a lower bound on by specifying a lower bound on . Since , and , we state that
[TABLE]
where . A lower bound on is
[TABLE]
So, we can get a lower bound on by
[TABLE]
To prove that is always greater than , it suffices to show that the prefactor is equal to or greater than for all possible values of . We reformulate using well-known trigonometric identities:
[TABLE]
For , this is and, from Equation (16), it is easy to see that is even greater for . Thus, we conclude that always holds.
It remains to consider potential shortcuts starting or ending at the first or the last bend. Clearly, skipping only the second or second last bend is always possible. Skipping the second and the third bend or skipping the second last and the third last bend may sometimes be possible depending on how much the edge gadget is stretched horizontally. However, according to the previous analysis, skipping more bends is not possible since the distance between the potential shortcut segment and the bend before the end point of the potential shortcut is at least .
See Claim 3.
Clearly, the shortcuts (i) for skipping any (or exactly one neighbor of ) are valid and there is no shortcut from the first to the last bend since the potential shortcut segment has distance to the upper row of bends. In (ii), there clearly is a shortcut if we start at any and end at any . If we start at some and end at the last bend, observe that, in the most extreme case, the segment from to the last bend has a y-distance to the upper row of
[TABLE]
when it passes in x-dimension. Thus, this shortcut is valid and the same holds for the shortcuts from the first bend to some .
It remains to argue that there are no more shortcuts. A shortcut starting and ending at a bend on the upper or lower row is not possible because it would either be a horizontal segment, which has distance to the other row, or the distance to some bend in between would be at least , which we have shown to be greater than in Equations (7)–(9). It is easy to see that there is no shortcut starting at the first bend and ending at some inner bend of the upper or lower row. The same holds true for shortcuts starting at some inner bend of the upper or lower row and ending at the last bend.
Moreover, a shortcut segment starting (ending) at some for and skipping one bend would have a distance of to this bend as depicted in Fig. 10. Since is inside a right triangle, we can determine by
[TABLE]
where is in another right triangle and thus can be determined by
[TABLE]
Putting them together, we get
[TABLE]
For , this is and again, for , is even greater.
If we skip more than one inner bend, the distance to the last skipped bend becomes only greater. Hence, we conclude that Claim 3 is correct.
References
- [1] P. K. Agarwal, S. Har-Peled, N. H. Mustafa, and Y. Wang. Near-linear time approximation algorithms for curve simplification. Algorithmica, 42(3-4):203–219, 2005.
- [2] H. Alt and M. Godau. Computing the Fréchet distance between two polygonal curves. International Journal of Computational Geometry and Applications, 5:75–91, 1995.
- [3] M. Buchin, A. Driemel, and B. Speckmann. Computing the Fréchet distance with shortcuts is NP-hard. In Proceedings of the 30th Annual Symposium on Computational Geometry (SoCG’14), pages 367–376, 2014.
- [4] M. Buchin, B. Kilgus, and A. Kölzsch. Group diagrams for representing trajectories. In Proceedings of the 11th ACM SIGSPATIAL International Workshop on Computational Transportation Science, pages 1–10, 2018.
- [5] W. S. Chan and F. Chin. Approximation of polygonal curves with minimum number of line segments or minimum error. International Journal of Computational Geometry and Applications, 6(1):59–77, 1996.
- [6] M. de Berg, M. J. van Kreveld, and S. Schirra. Topologically correct subdivision simplification using the bandwidth criterion. Cartography and Geographic Information Systems, 25(1):243–257, 1998.
- [7] D. H. Douglas and T. K. Peucker. Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. Cartographica, 10(2):112–122, 1973.
- [8] R. Estkowski and J. S. Mitchell. Simplifying a polygonal subdivision while keeping it simple. In Proceedings of the 17th Annual Symposium on Computational Geometry (SoCG’01), pages 40–49. ACM, 2001.
