1 Introduction
Consider a given set
\({P}\) of
n distinct points in the plane. A (simple)
polygonization \({\mathcal {P}}\) of
\({P}\) is a closed tour on
\({P}\) that forms a simple polygon. The task of the Computational Geometry Challenge 2019 was to find polygonizations of
\({P}\) with minimum (“
Min-Area”) and maximum (“
Max-Area”) enclosed area for various input sets
\({P}\) and different sizes
n. We refer to Reference [
5] for a survey with background information on this problem and to References [
4,
8,
11,
13] for descriptions of the approaches taken by the other four top teams.
In general, the number of polygonizations of
\({P}\) is exponential in
n: García et al. [
7] prove a lower bound of
\(\Omega (4.642^n)\) for the number of polygonizations that a set of
n points can admit. Hence, a brute-force search for
Min-Area/
Max-Area polygonizations of
\({P}\) is not an option. This fact motivated us to base our approach on two key components:
•
The sampler computes an initial polygonization of \({P}\).
•
The optimizer takes a polygonization and, while maintaining its simplicity, modifies the polygonization to optimize its area.
It is obvious that the mere number of polygonizations makes it highly unlikely that a sampler will return a polygonization that already has a near-optimal area. It is less clear, though, whether different classes of initial polygonizations might allow the optimizer to yield better results. And even if a “good” class of initial polygonizations existed, it is not obvious how to compute them efficiently.
We opted for computing a large sample set of polygonizations with different characteristics for every given set of points. This initial generation of polygonizations on
\({P}\) is accomplished by our two random polygon generators
RandomPolygonGenerator (Rpg) and
SweepPolygonGenerator (Spg), by our simple generator for monotone polygons, Pao-Mono, and by Helsgaun’s TSP code LKH [
10]. Then, for every polygonization of our sample set, we apply local optimizations to optimize the area. This task is carried out by
Pao-Flip, our polygon area optimizer. Source code of our implementations is available on
GitHub and can be used freely under the GPL(v3) license.
In the sequel, we describe the algorithms of Rpg and Spg and analyze their properties. Then we present Pao-Flip’s optimization heuristics applied to convert our initial polygonizations into area-optimal polygonizations.
3 Algorithm Engineering
3.1 Efficient 2-opt Moves
The main building block for carrying out 2-opt moves among the edges of a polygon is given by the detection of pairs of edges that intersect. Nevertheless, we need to avoid an all-pairs check. For a static set of straight-line segments, the Bentley-Ottmann line sweep [
2] is a classical (and relatively easy to implement) tool for detecting all pairs of edges that intersect. In our application the situation is complicated by the fact that we need to do more than intersection detection on a static set of line segments: As we apply 2-opt moves to resolve intersections among edges, new intersections can be introduced, so our algorithm must be able to perform a limited version of dynamic segment intersection detection.
As it is not immediately obvious how to combine a line sweep with 2-opt moves in an efficient way, we were surprised that we could not find any experimental analysis of such a line sweep. Hence, we implemented Spg and tested three variants of the sweep. The main sweep direction of all our sweeps is from left to right. Our sweeps differ mainly in how they detect and resolve an intersection.
Assume that the sweep line is at x coordinate \(x_s\) and has encountered a pair of edges that intersect to the right of \(x_s\). This intersection is resolved by applying a 2-opt move.
•
The sweep continues rightwards and scans edges to the right of \(x_s\). When the sweep reaches the right-most input point, we restart the sweep from scratch at the left-most input point. We loop through this left-right sweeping until no further intersection is found.
•
We check for other intersections at the current sweep-line position \(x_s\) and resolve all those intersections. Then the sweep continues rightwards and scans edges to the right of \(x_s\). Again, after reaching the right-most input point we restart the sweep at the left-most input point.
•
We reverse the sweep direction and scan edges to the left of \(x_s\). This leftwards sweep allows us to deal with possibly new edge intersections introduced by the 2-opt move to the left of \(x_s\), i.e., to detect and resolve such intersections by additional 2-opt moves. We resume our rightwards sweep once the leftwards sweep has reached the left-most end-point of an edge involved in one of those 2-opt moves. Hence, we get the invariant that no intersection exists to the left of the current sweep position whenever we sweep rightwards. Thus, the sweep is finished once the right-most input point has been reached.
All three variants have in common that they keep sweeping rightwards or rightwards/leftwards until we get one full pass over all edges that does not reveal an intersection. This way it is guaranteed that the final polygon is simple.
In addition to these three variants of a Bentley-Ottmann line sweep combined with 2-opt moves, we implemented a brute-force algorithm (
BF-2opt) for comparison purposes: Initially, we sort all input points lexicographically
with respect to their
x and
y coordinates. Furthermore, we associate with each point
p the two edges of the polygon that are currently incident at
p. Then intersections are found and resolved by two nested loops, where the outer loop runs over all edges
e of the polygon and the inner loop runs over all edges
\(e^{\prime }\) whose projection onto the
x-axis overlaps with
e.
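The projection-overlap filter in BF-2opt's inner loop is a one-line interval test; a sketch under the assumption that each edge is represented by the x coordinates of its endpoints (the function name is ours):

```cpp
#include <algorithm>
#include <cassert>

// Two closed intervals [a1, a2] and [b1, b2] on the x-axis overlap
// exactly if each starts before the other ends; the endpoints may be
// given in either order.
bool xProjectionsOverlap(double a1, double a2, double b1, double b2) {
    if (a1 > a2) std::swap(a1, a2);
    if (b1 > b2) std::swap(b1, b2);
    return a1 <= b2 && b1 <= a2;
}
```

Since two segments can only intersect if their x-projections overlap, this test lets the inner loop skip most edge pairs after the lexicographic sort.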
Handling Collinearities. We note that collinear edges need special care, because a 2-opt move applied to collinear edges need not always result in a shortening of the perimeter of the polygon. Worse, we might get into a 2-opt cycle that keeps swapping collinear edges back and forth. To guarantee that the perimeter of the polygon decreases strictly monotonically, we
(1)
ensure that consecutive edges of the polygon that are collinear do not overlap each other, and
(2)
refine the 2-opt move for non-consecutive edges that overlap.
If the polygon under consideration contains a vertex sequence
\((\ldots ,v_{i},v_{i+1},\ldots ,v_{j-1},v_{j},\ldots)\) of three or more consecutive vertices that are collinear, then we arrange these vertices in lexicographical order. In the example depicted in Figure
3(b), where
\(j=i+4\), re-arranging these vertices either (as depicted) in descending lexicographical order
\((v_{i+1},v_{i+4},v_{i},v_{i+3},v_{i+2})\) or in ascending lexicographical order
\((v_{i+2},v_{i+3},v_{i},v_{i+4},v_{i+1})\) ensures that we reduce the perimeter of the polygon. Thus, in this example, we could choose the order randomly. If only one order reduces the perimeter of the polygon, then this order is chosen. (A simple case analysis shows that at least one of the two orders always reduces the perimeter.)
Now suppose that the polygon under consideration contains vertex sequences
\((\ldots ,v_{i-2}, v_{i-1},v_{i},v_{i+1},v_{i+2},\ldots)\) and
\((\ldots ,v_{j-2},v_{j-1},v_{j},v_{j+1},v_{j+2},\ldots)\) such that the edges
\(\overline{v_{i},v_{i+1}}\) and
\(\overline{v_{j},v_{j+1}}\) are collinear and overlap. We get two basic geometric scenarios of how these edges can interact; see Figure
3(c) and (d). If
\(v_{j}\) lies on
\(\overline{v_{i},v_{i+1}}\) and
\(v_i\) lies on
\(\overline{v_j,v_{j+1}}\), as shown in Figure
3(c), then we resolve the overlap by considering the new vertex sequences
\((\ldots ,v_{i-2},v_{i-1},v_{j-1},v_{j-2},\ldots)\) and
\((\ldots ,v_{i+2},v_{i+1},v_{j},v_{i},v_{j+1},v_{j+2},\ldots)\). If both
\(v_{i}\) and
\(v_{i+1}\) lie on
\(\overline{v_{j},v_{j+1}}\), as shown in Figure
3(d), then we resolve the overlap by considering the new vertex sequences
\((\ldots ,v_{i-2},v_{i-1},v_{i+2},v_{i+3},\ldots)\) and
\((\ldots ,v_{j-1},v_{j},v_{i+1},v_{i},v_{j+1},v_{j+2},\ldots)\). Note that the new vertex sequence, once again, may contain new intersections. However, it is easy to see that such a (generalized) 2-opt move reduces the perimeter of the polygon in both scenarios. Hence, this modification of the standard 2-opt moves guarantees that we will end up with a simple polygon even in the presence of collinear input points.
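Distinguishing the two overlap scenarios boils down to point-on-segment tests: the first scenario applies when \(v_j\) lies on \(\overline{v_{i},v_{i+1}}\) and \(v_i\) lies on \(\overline{v_j,v_{j+1}}\), the second when both \(v_i\) and \(v_{i+1}\) lie on \(\overline{v_{j},v_{j+1}}\). A sketch of the underlying predicate (our own illustration, with hypothetical names):

```cpp
#include <algorithm>
#include <cassert>

struct Pt { double x, y; };  // hypothetical point type

// Returns true if p lies on the closed segment ab: p must be collinear
// with a and b and fall inside the segment's bounding box.
bool onSegment(const Pt& a, const Pt& b, const Pt& p) {
    double cross = (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
    if (cross != 0.0) return false;
    return std::min(a.x, b.x) <= p.x && p.x <= std::max(a.x, b.x) &&
           std::min(a.y, b.y) <= p.y && p.y <= std::max(a.y, b.y);
}
```

(For robustness on real inputs, exact predicates rather than floating-point arithmetic would be used for the collinearity test.)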
3.2 Pao-Flip
The flipping and inverting methods were implemented in C++ in our optimizer Pao-Flip. Its input is a polygonization \({\mathcal {P}}\). Without loss of generality, assume that we seek a Min-Area polygonization. Since inverting can be seen as a special case of flipping, we focus on describing our approach to flipping.
We use Shewchuk’s
Triangle [
14] to construct a constrained Delaunay triangulation of
\(\mathop {CH}({P})\), with the edges of
\({\mathcal {P}}\) forming the constraints. By taking advantage of the triangulation, we can easily identify candidates for a flip. A candidate pair consists of two triangles such that we can flip the adjacent boundary, as described in Section
2.2 and shown in Figure
2.
Pao-Flip scans \({\mathcal {P}}\) until it arrives at a reflex vertex \(v_i\) whose exterior triangle \(\Delta (v_{i-1},v_i,v_{i+1})\) is empty. Due to an oversight in our implementation, which was discovered only recently and well after the end of our work, the symmetric case of convex vertices \(v_i\), where \(\Delta (v_{i-1},v_i,v_{i+1})\) is an interior triangle, was not considered. Hence, some invert/flip operations that might have further improved our results were missed.
Emptiness of such a triangle can easily be checked by considering the end-points of all triangulation edges incident to \(v_i\) that lie outside of \({\mathcal {P}}\). Then Pao-Flip iterates through the interior triangles incident at \(v_i\) and identifies the maximal-area triangle \(\Delta (v_i,v_{j},v_{j+1})\) that is incident to a boundary edge. Let \(A_e\) denote the area of the exterior triangle \(\Delta (v_{i-1},v_i,v_{i+1})\), and let \(A_i\) denote the area of that interior triangle. If \(A_i \gt A_e\), then we apply flipping and decrease the area bounded by \({\mathcal {P}}\). We continue the scan along \({\mathcal {P}}\) and repeat this process until all reflex vertices have been visited.
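The flip test compares two triangle areas; a minimal sketch of the decision (our own illustration with hypothetical names; the actual Pao-Flip code operates on the Triangle data structures):

```cpp
#include <cassert>
#include <cmath>

struct Pt { double x, y; };  // hypothetical point type

// Unsigned triangle area via the cross product.
double triArea(const Pt& a, const Pt& b, const Pt& c) {
    return 0.5 * std::fabs((b.x - a.x) * (c.y - a.y) -
                           (b.y - a.y) * (c.x - a.x));
}

// For Min-Area: flipping trades the interior triangle (area Ai) for the
// exterior triangle (area Ae), so it pays off exactly when Ai > Ae,
// decreasing the enclosed area by Ai - Ae.
bool shouldFlip(double Ai, double Ae) { return Ai > Ae; }
```

For Max-Area the inequality is simply reversed.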
This basic approach runs in \(\mathcal {O}(n)\) time, because we look at every reflex vertex at most once and we have a linear number of triangulation edges that are examined at most twice. Our tests quickly demonstrated that the polygonizations obtained were not necessarily area-optimal or even very close to area-optimal.
Hence, we introduce randomness by weakening the greediness of our approach: We allow flipping whenever \(v_i\) has at least one incident interior triangle with area \(A_i\) such that \(A_i \gt A_e\). The actual triangle to be flipped is chosen uniformly at random from all interior triangles whose area is greater than \(A_e\). Furthermore, we visit the vertices in random order. This modification does not change the overall runtime, since we iterate over all triangles, store the relevant triangle indices, and then choose an index at random.
Furthermore, we increase the number of candidates available for flipping at a reflex vertex \(v_i\) by first modifying the triangulation. Rather than sticking to the original constrained Delaunay triangulation during the entire optimization, we apply standard edge flipping: Whenever two adjacent interior triangles form a convex quadrilateral, we could flip their common diagonal. A linear number of random edge flips is applied between rounds of triangle flipping.
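The precondition of such an edge flip is a standard convexity test on the quadrilateral formed by the two adjacent triangles; a sketch with hypothetical names (not Triangle's API):

```cpp
#include <cassert>

struct Pt { double x, y; };  // hypothetical point type

// Sign of the signed area of triangle (a, b, c).
int orient(const Pt& a, const Pt& b, const Pt& c) {
    double d = (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
    return (d > 0.0) - (d < 0.0);
}

// The diagonal (a, c) of quadrilateral a-b-c-d may be flipped to (b, d)
// only if the quadrilateral is strictly convex, i.e., all four
// consecutive turns have the same nonzero orientation.
bool isConvexQuad(const Pt& a, const Pt& b, const Pt& c, const Pt& d) {
    int o1 = orient(a, b, c), o2 = orient(b, c, d);
    int o3 = orient(c, d, a), o4 = orient(d, a, b);
    return o1 != 0 && o1 == o2 && o2 == o3 && o3 == o4;
}
```

Flipping the diagonal of a non-convex quadrilateral would produce overlapping triangles, so this check guards every random edge flip.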
4 Results
In addition to attempting to compute
Min-Area/
Max-Area-polygonizations, we subjected our codes to various tests. All runtime experiments were run on an Intel Core i7-6700 CPU clocked at
\(3.40\,\mathrm{GHz}\), with \(256\,\mathrm{KiB}\) L1 cache, \(1\,\mathrm{MiB}\) L2 cache, and \(8\,\mathrm{MiB}\) L3 cache. Figure
4 shows a plot of the runtimes for the input sets of the Challenge, except for one input with one million points. As predicted by theory, both RPG-space and RPG-star show a slightly super-linear complexity and seem to run in roughly
\(10^{-7} \, n \ln n\) seconds. Among the different variants of the 2-opt algorithm, the brute-force intersection checking of BF-2opt is slowest and the rightwards/leftwards sweep of SPG-reverse is fastest for polygonizations of at least
\(10^3\) input points. For such inputs, SPG-reverse is roughly 25 times faster than BF-2opt, and still about 6 to 8 times faster than the other variants based on line sweeping. It is not surprising that BF-2opt works better the smaller the input is.
To get a better understanding of our
Spg variants, we investigated their main characteristics by running each of them 100 times per instance. The left plot of Figure
5 shows how often SPG-resolve and SPG-loop restart the sweep from scratch to pass over all edges again, and how often SPG-reverse reverses the sweep direction from rightwards to leftwards. The plot indicates a barely super-linear growth rate of the number of reversals and a clearly sub-linear growth rate of the number of passes. However, keep in mind that each new pass of SPG-resolve and SPG-loop involves scanning all edges once again, while SPG-reverse avoids scanning edges that are known to contain no intersections.
The right plot of Figure
5 tells us that SPG-reverse performs fewer 2-opt moves to obtain a polygonization than SPG-resolve and SPG-loop. Furthermore, SPG-loop again fares better than SPG-resolve. The numbers of 2-opt moves of all three variants seem to have a slightly super-linear growth rate. In particular, the number of 2-opt moves observed in our experiments always stayed far below the cubic worst-case bound established by van Leeuwen and Schoone [
15]. (But it is not particularly surprising that our tests failed to make this bound evident, since we tested far too few samples relative to the huge sample spaces of all polygons defined by
n vertices.)
These plots support our understanding that SPG-reverse requires fewer computational steps than SPG-loop, which in turn is better than SPG-resolve. This conclusion is reflected by the runtimes shown in Figure
4.
Our basic assumption during this work was that a polygonization with small (respectively, large) area is more likely to be obtained by our optimizer
Pao-Flip if the initial polygonization has already a comparatively small (respectively, large) area. A second initial assumption was that it would not matter which particular heuristic generated the initial polygon. To probe this claim we analyzed the percentage of the convex hull of a point set that was covered by the interior of an actual polygonization. Figure
6 shows that RPG-space tends to generate initial polygonizations with larger area than RPG-star. We use the same scoring scheme as in the Challenge: The score for an instance is the ratio given by the area achieved divided by the area of the convex hull. Thus, a lower value is better for
Min-Area, whereas a larger value is better for
Max-Area.
No obvious winner is discernible among the 2-opt variants: For each input size of the Challenge, the distribution of the area of a polygonization seems to be roughly uniform within about the same range. Figure
7, where we study the area distribution of a number of polygon samples on two specific point sets, also makes it apparent that the area of samples generated by RPG-star is, on average, comparatively smaller than that of other samples. Looking at the extreme ends of the samples in Table
1(a), we note that within this test-run, the smallest area for the small 60-vertex input was actually found by SPG-loop. For the bigger point set, however, RPG-star wins the smallest sample by a huge margin.
Table
1(b) lists the percentage values for the ancestry of our final
Min-Area/
Max-Area polygonizations. (In that table all 2-opt variants of RPG-2opt, BF-2opt and
Spg are summarized as 2opt.) We were surprised to learn that about \(47\%\) of our final
Min-Area polygonizations came from initial polygonizations obtained by running LKH, i.e., from (approximate) TSP tours, while none of our final
Max-Area polygonizations stems from a TSP tour. Apparently, it was easier for
Pao-Flip to trade large-area triangles for small-area triangles if the initial polygonization did not contain chains that zigzag wildly. At any rate, it seems that the initially smaller areas of the polygons produced by RPG-star did not help in the optimization step in most cases. For some (very) small inputs the initial polygonization produced by 2opt was already optimal or, at least, could not be improved by
Pao-Flip.
Note that the Challenge did not specify a time limit per instance. As the only time limit was given by the final deadline for submitting our results, we kept applying our heuristics and optimization procedures for as long as the competition remained open and whenever we had some capacity to spare on our systems. Instances subject to further sampling or optimization runs were selected at random, with each individual job limited to an hour of CPU time.
5 Discussion
Sample results for
Min-Area/
Max-Area polygonizations are shown in Figure
8. The surprisingly larger percentage of final
Min-Area polygonizations derived from initial TSP tours made us reconsider our basic assumption that a
Max-Area (respectively,
Min-Area) polygonization would be obtained best by applying
Pao-Flip to an initial polygonization with a reasonably large (respectively, small) area. Perhaps we should have been less greedy and should have applied
Pao-Flip randomly to initial polygonizations, without consideration of whether or not the initial polygonization has a reasonably small or large area.
During the tail end of the competition time we noticed that the optimization using Pao-Flip can get stalled at local maxima or minima, with nothing to gain by further flipping. Future work could explore the use of simulated annealing (or similar) to allow Pao-Flip to get away from local minima/maxima. For instance, while looking for a Min-Area polygonization, one could allow some flipping that increases the area of the polygonization. We leave the exploration of these variants to future research.
We note that every polygonization has a positive probability of appearing as one of our starting polygonizations. (Recall that all 2-opt variants start by generating random sequences of the input points, which form the seed polygons for the subsequent heuristics.) From a theoretical point of view it is interesting to ask whether the space of all polygonizations is connected under the invert/flip operations discussed in Section
2.2. So, given two polygonizations
\({\mathcal {P}} _1,{\mathcal {P}} _2\) on the same set of input points, is it always possible to derive
\({\mathcal {P}} _2\) from
\({\mathcal {P}} _1\) via a series of invert/flip operations? We suppose this to be true but can only prove it for the class of
x-monotone polygonizations. And if this conjecture were true: What is a tight bound on the number of invert/flip operations needed in the worst case to derive
\({\mathcal {P}} _2\) from
\({\mathcal {P}} _1\)?