Dynamic Data Structures for Interval Coloring
Girish Raguvir J, Manas Jyoti Kashyop, N. S. Narayanaswamy

TL;DR
This paper introduces efficient online and dynamic algorithms for interval graph coloring, achieving near-optimal color usage and update times, with theoretical bounds and practical implications for dynamic graph management.
Contribution
The paper presents a novel incremental coloring algorithm with bounded colors and efficient amortized update time, and a fully dynamic algorithm with improved worst-case update bounds for interval graphs.
Findings
Incremental algorithm uses at most 3ω - 2 colors.
Amortized update time is O(log n + Δ).
Fully dynamic updates support insertion in O(log n + Δ log ω) and deletion in O(Δ^2 log n).
Abstract
We consider the dynamic graph coloring problem restricted to the class of interval graphs. At each update step the algorithm is presented with an interval to be colored, or a previously colored interval to delete. The goal of the algorithm is to efficiently maintain a proper coloring of the intervals with as few colors as possible by an online algorithm. In the incremental model, each update step presents the algorithm with an interval to be colored. The problem is closely connected to the online vertex coloring problem of interval graphs for which the Kierstead-Trotter (KT) algorithm achieves the best possible competitive ratio. We first show that a sub-quadratic time direct implementation of the KT-algorithm is unlikely to exist conditioned on the correctness of the Online Boolean Matrix Vector multiplication conjecture due to Henzinger et al. [2]. We…
Table 1: Procedures used by the incremental and the fully dynamic algorithms.

| Procedure | Incremental | Fully Dynamic |
|---|---|---|
| Maintains the SLS for an endpoint: (i) in the incremental case as a dynamic array and a doubly linked list; (ii) in the fully dynamic case as Red-Black trees | Worst case (Lemma 9) | Worst case (Lemma 17) |
| compute-max-SLS: from the interval tree of endpoints, computes the set of endpoints contained in the given interval and the maximum SLS height over them | Worst case (Lemma 10) | Worst case (Lemma 18) |
| update-Endpoints: updates the endpoints in the given set on addition of an interval at a given level, using the SLS structures of the incremental or the fully dynamic case | Amortized (Lemma 11) | Worst case (Lemma 19) |
| compute-Offset: assigns an offset value to an interval by considering the offsets of the intervals intersecting it in the same level | Worst case (Lemma 8) | Worst case (Lemma 8) |

Table 2: Data structures and their operations.

| Method | Description | Running Time | Return Value |
|---|---|---|---|
| **Interval Tree [16]** | | | |
| .insert() | Inserts an interval into the tree | worst case | - |
| .delete() | Deletes an interval from the tree | worst case | - |
| .intersection() | Returns the set of intervals in the tree that intersect with the query interval | worst case | |
| **Doubly linked list [17]** | | | |
| .insert() | Inserts an element into the list | worst case | - |
| .delete() | Deletes an element from the list | worst case | - |
| .begin() | Returns the first element of the list | worst case | |
| **Set [18]** | | | |
| .insert() | Inserts a new element into the set | amortized | - |
| .begin() | Iterator to the first element of the set | worst case | - |
| .end() | Iterator to the last element of the set | worst case | - |
| **Dynamic Array [19]** | | | |
| .at() | Inserts at the given position of the array. Doubles the array size after initialization if the array is full | amortized | - |
| .size() | Returns the size of the array | worst case | size |
| **Red-Black Tree [20]** | | | |
| .insert() | Inserts an element into the tree | worst case | - |
| .delete() | Deletes an element from the tree | worst case | - |
| .max() | Returns the maximum element in the tree | worst case | |
| .min() | Returns the minimum element in the tree | worst case | |
| .empty() | Checks if the tree is empty | worst case | 0/1 |
Dynamic Data Structures for Interval Coloring
A preliminary version of this work appeared in the International Computing and Combinatorics Conference (COCOON), pages 478-489, 2019.
Girish Raguvir J
Manas Jyoti Kashyop
N. S. Narayanaswamy
Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai 600036, India
Abstract
We consider the dynamic graph coloring problem restricted to the class of interval graphs in the incremental and fully dynamic setting. The input consists of a sequence of intervals that are to be either colored, or deleted, if previously colored. For the incremental setting, we consider the well studied optimal online algorithm (KT-algorithm) for interval coloring due to Kierstead and Trotter [1]. We present the following results on the dynamic interval coloring problem.
Any direct implementation of the KT-algorithm requires $\Omega(\Delta^2)$ time per interval in the worst case.
There exists an incremental algorithm which supports insertion of an interval in amortized $O(\log n + \Delta)$ time per update and maintains a proper coloring using at most $3\omega - 2$ colors.
There exists a fully dynamic algorithm which supports insertion of an interval in $O(\log n + \Delta \log \omega)$ update time and deletion of an interval in $O(\Delta^2 \log n)$ update time in the worst case and maintains a proper coloring using at most $3\omega - 2$ colors.
The KT-algorithm crucially uses the maximum clique size in an induced subgraph in the neighborhood of a given vertex. We show that the problem of computing the induced subgraph among the neighbors of a given vertex has the same hardness as the online boolean matrix vector multiplication problem [2]. We show that
Any algorithm that computes the induced subgraph among the neighbors of a given vertex requires at least quadratic time unless the OMv conjecture [2] is false.
Finally, we obtain the following result on the OMv conjecture.
If the matrix and the vectors in the online sequence have the consecutive ones property, then the OMv conjecture [2] is false.
keywords:
Dynamic graph algorithms; Interval coloring; Lower bound.
Journal: Theoretical Computer Science
1 Introduction
Maintenance of data structures for graphs in the dynamic setting has been extensively studied. In the dynamic setting, a graph has a fixed set of vertices whereas the edge set keeps evolving by means of edge updates. An edge update consists of either insertion of a new edge or deletion of an existing edge. A dynamic graph is thus a sequence of graphs, $G_0, G_1, \ldots, G_t$, where $t$ is the total number of edge updates, the initial graph $G_0$ is an empty graph and graph $G_i$ is obtained from $G_{i-1}$ by a single edge update. In our work, each $G_i$ is an interval graph and an update consists of an interval to be inserted or deleted. Therefore, in our dynamic setting, a single update may insert or delete many edges in the underlying interval graph. This is different from the commonly studied case in the area of dynamic graph algorithms where on each edge update a single edge is inserted or deleted.
The graph coloring problem is one of the most extensively studied problems. In the dynamic setting, the graph coloring problem is as follows: there is an online sequence of edge updates and the goal is to maintain a proper coloring after every update. Several works ([3], [4], [5] and [6]) propose heuristic and experimental results on the dynamic graph coloring problem. To the best of our knowledge, the formal analysis of data structures for dynamic graph coloring has been done in [7], [8], [9], [10], [11], and [12]. We continue the study of dynamic data structures for graph coloring. We focus on interval graphs in the incremental as well as in the fully dynamic setting. The online update sequence consists of intervals and our goal is to maintain a proper coloring of the intervals with as few colors as possible while maintaining a small update time. In the incremental setting, each update in the online update sequence consists of an interval to be colored. In the fully dynamic setting, each update in the online update sequence consists of either an interval to be colored or a previously colored interval to be deleted.
In the incremental setting, intervals in the update sequence are inserted one after the other and we aim to efficiently maintain a proper coloring of the intervals using as few colors as possible after every update. Our approach is to consider efficient implementations of well-studied online algorithms for interval coloring. Online algorithms for interval coloring and its variants form a rich area with many results [13]. Note that an online algorithm is not allowed to re-color a vertex during the execution of the algorithm. On the other hand, an incremental algorithm is not restricted in any way during an update step except that we desire that the updates be done as efficiently as possible. Naturally, an online interval coloring algorithm which is efficiently implementable is a good candidate for an incremental interval coloring algorithm as it only assigns a color to the current interval, and does not change the color of any of the other intervals. For the online interval coloring problem, Kierstead and Trotter presented a 3-competitive algorithm (KT-algorithm) and they also proved that their result is tight [1]. The tightness is proved by showing the existence of an adaptive adversary that forces an online algorithm to use $3\omega - 2$ colors, where $\omega$ is the maximum clique size in the interval graph formed by the given set of intervals. On the other hand, the KT-algorithm uses at most $3\omega - 2$ colors.
1.1 Our Results
Our goal is to design incremental and fully-dynamic algorithms for interval coloring. Towards this, we study efficient implementations of the KT-algorithm. The KT-algorithm computes a coloring in which each color is a 2-tuple $(L(v), o(v))$, where $L(v)$ is the level value of $v$ and $o(v)$ is the offset of $v$. In the incremental and fully-dynamic setting, we design efficient 3-approximation algorithms for interval coloring. In the incremental case our results leave open the possibility of improving the number of colors used by sacrificing the constraint in online algorithms that an interval cannot be re-colored. We start by considering the efficiency of a direct implementation of the KT-algorithm. A direct implementation uses a data structure that only maintains the intervals and responds to intersection queries by reporting the intervals which intersect a queried interval. We show the following result in Section 2.2.
Any direct implementation of the KT-algorithm requires $\Omega(\Delta^2)$ time per interval in the worst case, where $\Delta$ is the maximum degree of a vertex in the associated interval graph. (Theorem 2)
We then show that a comparison based data structure which supports the insertion of a new interval or computes the number of intervals intersecting a given interval requires $\Omega(\log n)$ comparisons for at least one of the operations (Lemma 3). In Section 2.3, our next result is a different approach to compute the level value for an interval. This approach avoids the lower bound for a direct implementation by maintaining additional information associated with the intervals that have been colored. While our approach, called Algorithm KT-SLS, uses the same number of colors as the KT-algorithm, we show that the level value for each interval computed by our approach is at most the level value computed by the KT-algorithm (Lemma 5). We show an example where for an interval the level value computed by Algorithm KT-SLS is smaller than the level value computed by the KT-algorithm. We design an incremental interval coloring algorithm which implements Algorithm KT-SLS in Section 3 and show that it uses at most $3\omega - 2$ colors.
There exists an incremental algorithm which supports insertion of an interval in amortized $O(\log n + \Delta)$ time per update, where $n$ is the total number of intervals in the update sequence and $\Delta$ is the maximum degree of a vertex in the interval graph formed by those intervals. (Theorem 7)
In Section 4, we consider the fully dynamic setting, in which an interval that has already been colored can be deleted, apart from the insertions. At the end of each update, our aim is to maintain a $3\omega - 2$ coloring of the remaining set of intervals, where $\omega$ is the maximum clique size in the interval graph associated with the remaining set of intervals. In order to bound the number of colors to $3\omega - 2$, deletion of an interval may trigger a change in the colors of some of the remaining intervals, creating a set of dirty intervals. Cleaning up those dirty intervals may in turn create more dirty intervals, resulting in a cascading effect. We design an approach to efficiently compute the set of such dirty intervals after a deletion. Thus, we present a fully dynamic algorithm for interval coloring.
There exists a fully dynamic algorithm which supports insertion of an interval in $O(\log n + \Delta \log \omega)$ update time and deletion of an interval in $O(\Delta^2 \log n)$ update time in the worst case, where $n$ is the total number of intervals inserted and $\Delta$ is the maximum degree of a vertex in the interval graph formed by those intervals. (Theorem 16)
Our final contribution is motivated by the fact that the KT-algorithm computes the maximum clique size in an induced subgraph of the neighbors of the current interval. In our attempt to design efficient data structures to report the neighborhood of a vertex we encountered a connection to the online boolean matrix vector multiplication problem and the related OMv conjecture, which is due to Henzinger et al. [2]. We present a reduction in Section 5 where we show the following result.
Any algorithm that needs to compute the induced subgraph among the neighbors of a given vertex requires at least quadratic time unless the online boolean matrix vector multiplication conjecture is false. (Theorem 20)
Finally, we use the well-known interval tree data structure to obtain the following result on the online boolean matrix vector multiplication conjecture.
In the online boolean matrix vector multiplication problem, if the boolean matrix and the vectors in the online sequence have the consecutive ones property then the OMv conjecture is false. (Theorem 21)
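The connection between the consecutive ones property and intervals can be illustrated with a minimal sketch. It assumes that each matrix row and each query vector is encoded by the first and last index of its block of ones; this encoding and the names `to_block` and `bool_mat_vec` are illustrative assumptions, and a linear scan stands in for the interval tree mentioned above.

```python
from typing import List, Optional, Tuple

# (first, last) index of the block of ones, or None for an all-zero vector.
Block = Optional[Tuple[int, int]]


def to_block(bits: List[int]) -> Block:
    """Encode a 0/1 vector with the consecutive ones property as an interval."""
    ones = [i for i, b in enumerate(bits) if b]
    if not ones:
        return None
    first, last = ones[0], ones[-1]
    assert last - first + 1 == len(ones), "ones are not consecutive"
    return (first, last)


def bool_mat_vec(rows: List[Block], v: Block) -> List[int]:
    """Boolean product M v: entry i is 1 iff row i and v share a one,
    which happens exactly when the two index intervals intersect."""
    out = []
    for row in rows:
        hit = (row is not None and v is not None
               and row[0] <= v[1] and v[0] <= row[1])
        out.append(1 if hit else 0)
    return out


M = [[1, 1, 0, 0], [0, 0, 1, 1], [0, 1, 1, 0], [0, 0, 0, 0]]
v = [0, 0, 1, 0]
result = bool_mat_vec([to_block(r) for r in M], to_block(v))
```

Each entry of the product becomes an interval intersection test, which is the kind of query an interval tree is designed to answer.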
2 Kierstead-Trotter algorithm and Supporting Line Segment
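Let the set $\mathcal{I} = \{I_1, I_2, \ldots, I_n\}$ denote a sequence of $n$ intervals, and let $G(\mathcal{I})$ denote the associated interval graph. For $1 \leq i \leq n$, let $I_i = [l_i, r_i]$ where $l_i$ and $r_i$ represent the left and right endpoint of $I_i$, respectively. Let $\sigma = v_1, v_2, \ldots, v_n$ be the ordering of vertices of the interval graph $G(\mathcal{I})$ where vertex $v_i$ is the $i$-th vertex in $\sigma$ and it corresponds to the interval $I_i$ in $\mathcal{I}$. Let $\omega(G)$, $\Delta(G)$, and $\chi(G)$ denote the size of the maximum cardinality clique in $G$, the maximum degree of a vertex in $G$, and the chromatic number of $G$, respectively. It is well-known that for interval graphs $\omega(G) = \chi(G)$. When the graph is clear, we refer to these numbers as $\omega$, $\Delta$, and $\chi$.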
2.1 Kierstead-Trotter algorithm - overview
The intervals in the sequence $\mathcal{I}$ are presented to the online KT-algorithm. For $1 \leq i \leq n$, let $I_i$ be the interval presented and let $v_i$ be the corresponding vertex in $G(\mathcal{I})$. The KT-algorithm computes a color for $v_i$ based on the colors given to the vertices $v_1, \ldots, v_{i-1}$. The color assigned to a vertex $v_i$ is a tuple of two values and is denoted as $(L(v_i), o(v_i))$. $L(v_i)$ is called the level value, $o(v_i)$ is called the offset, and $v_i$ is said to be in level $L(v_i)$. $L(v_i)$ is computed in Step I and $o(v_i)$ is computed in Step II. The key property is that for each edge $(v_i, v_j)$, the tuple $(L(v_i), o(v_i))$ is different from $(L(v_j), o(v_j))$.
**Step I:** For $r \geq 0$, let $G_r(v_i)$ denote the induced subgraph of $G(\mathcal{I})$ on the vertex set $\{v_j \mid j < i,\ L(v_j) \leq r,\ (v_i, v_j) \in E(G(\mathcal{I}))\}$. Define $L(v_i) = \min\{r \mid \omega(G_r(v_i)) \leq r\}$.
**Key Properties maintained by Step I [1]:**
1. For each vertex $v_i$, $L(v_i) \leq \omega - 1$.
2. Property P: The set $\{v_i \mid L(v_i) = 0\}$ is an independent set. For each $r$, $1 \leq r \leq \omega - 1$, the subgraph of $G(\mathcal{I})$ induced on $\{v_i \mid L(v_i) = r\}$ has maximum degree at most 2.
**Step II:** $o(v_i)$ is chosen to be the smallest value from the set $\{1, 2, 3\}$ which is different from the offset of each of the at most two neighbors whose level is $L(v_i)$.
Analysis: Since the vertices with level value $0$ form an independent set, the offset for all these vertices is $1$. Therefore, the color for all the vertices in level $0$ is $(0, 1)$. By Property P, for each level $r \geq 1$, the maximum degree in the graph induced by vertices in the level is $2$. Therefore, the algorithm uses at most $3$ colors, $(r, 1)$, $(r, 2)$, and $(r, 3)$, to color the vertices in level $r$. Hence, the total number of colors used by the algorithm is at most $1 + 3(\omega - 1) = 3\omega - 2$.
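The two steps above can be sketched as a direct, unoptimized implementation. This is only an illustration under stated assumptions: intervals are closed, offsets are drawn from 1, 2, 3, the maximum clique is recomputed by a sweep for every candidate level, and the names `kt_color`, `max_clique` and `intersects` are hypothetical helpers rather than anything defined in the paper.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # closed interval [l, r]


def intersects(a: Interval, b: Interval) -> bool:
    return a[0] <= b[1] and b[0] <= a[1]


def max_clique(intervals: List[Interval]) -> int:
    """Maximum clique size of an interval graph: the largest number of
    intervals sharing a common point, found by a sweep over endpoints."""
    events = []
    for l, r in intervals:
        events.append((l, 0))  # at equal coordinates, openings sort first
        events.append((r, 1))
    events.sort()
    best = depth = 0
    for _, kind in events:
        depth += 1 if kind == 0 else -1
        best = max(best, depth)
    return best


def kt_color(sequence: List[Interval]) -> List[Tuple[int, int]]:
    """Color intervals in arrival order; returns (level, offset) per interval."""
    colors: List[Tuple[int, int]] = []
    for i, cur in enumerate(sequence):
        nbrs = [j for j in range(i) if intersects(sequence[j], cur)]
        # Step I: smallest r such that the neighbours of level <= r
        # have maximum clique size at most r.
        r = 0
        while max_clique([sequence[j] for j in nbrs if colors[j][0] <= r]) > r:
            r += 1
        # Step II: smallest offset in {1, 2, 3} unused by neighbours in level r.
        used = {colors[j][1] for j in nbrs if colors[j][0] == r}
        offset = min(o for o in (1, 2, 3) if o not in used)
        colors.append((r, offset))
    return colors


example = [(0, 2), (6, 8), (1.5, 4), (3.5, 7)]
coloring = kt_color(example)
```

The repeated calls to `max_clique` inside the loop are what make a direct implementation expensive; the sketch makes no attempt to avoid them.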
2.2 Quadratic lower bound for computing $\omega$ in the graph induced by a subset of neighbors of a vertex
We start by considering implementations of the KT-algorithm in which the data structures are designed only to store the input intervals and support only intersection queries among intervals. We refer to such an implementation as a direct implementation and prove a lower bound on the running time of a direct implementation. This lower bound motivates the additional data structures that are necessary to obtain an implementation of the KT-algorithm with a better running time. We start by observing a lower bound on the time to identify the size of a maximum clique in a given set of intervals.
**Lemma 1.**
A deterministic algorithm which computes the maximum clique size in the interval graph formed by a given set of $n$ intervals has running time $\Omega(n)$.
Proof.
The proof is by contradiction. Let $A$ be a deterministic algorithm such that on each input consisting of a set of $n$ intervals, it reports the maximum clique size in the corresponding interval graph in $o(n)$ time. Due to this assumption, it follows that algorithm $A$ does not read the entire input on $n$ intervals. Let us consider the execution of $A$ on an interval sequence $\mathcal{I}$ of $n$ pairwise intersecting intervals. Let $j$ be an index such that during the execution, $A$ does not read the $j$-th interval. Consider $\mathcal{I}'$ obtained from $\mathcal{I}$ by replacing the $j$-th interval by an interval disjoint from all the other intervals. Since the execution of $A$ does not read the $j$-th interval, the output on $\mathcal{I}$ and $\mathcal{I}'$ will be the same. However, both the outputs cannot be correct since the maximum clique size for $\mathcal{I}$ is $n$ and for $\mathcal{I}'$ is $n - 1$. Therefore, for a deterministic algorithm to compute the maximum clique size on all inputs, it must read all the intervals in the input, and thus its running time is $\Omega(n)$. Hence the Lemma. ∎
Using Lemma 1 we prove that a direct implementation of the KT-algorithm will have an $\Omega(\Delta^2)$ running time.
**Theorem 2.**
A direct implementation of the KT-algorithm has running time $\Omega(\Delta^2)$ where $\Delta$ is the maximum degree of a vertex in the associated interval graph.
Proof.
For each $v_i$ and $r \geq 0$, to check if $\omega(G_r(v_i)) \leq r$, the KT-algorithm computes the maximum clique size in the interval graph formed by the intervals with level value at most $r$ and intersecting with the input interval $I_i$. From Lemma 1, computing the size of the maximum clique takes time at least linear in the number of these intervals. The KT-algorithm repeats this computation for the values of $r$ starting from $0$ until the first value for which $\omega(G_r(v_i)) \leq r$ is true. The worst case is reached for the input sequence $\mathcal{I}$ in Lemma 1, for which the clique is computed in each of the graphs $G_0(v_i), G_1(v_i), \ldots$ in turn. The running time of a direct implementation on $\mathcal{I}$ is the sum of the sizes of these graphs, which is $\Omega(\Delta^2)$. Therefore, a direct implementation of the KT-algorithm takes $\Omega(\Delta^2)$ time in the worst case. Hence the Theorem. ∎
While our subsequent results show that we can maintain additional information to circumvent the lower bound faced by a direct implementation, we observe that any comparison based data structure to maintain the given set of intervals and respond to intersection queries uses $\Omega(\log n)$ comparisons.
**Lemma 3.**
Let D be a comparison based data structure which supports the following operations on an interval $[l, r]$:
1. D.Insert([l, r]): Inserts interval [l, r] into D.
2. D.Query([l, r]): Returns the total number of intervals in D which are intersecting with [l, r].
Let $c_I$ be the number of comparisons performed during D.Insert([l, r]) before inserting [l, r]. Let $c_Q$ be the number of comparisons performed during D.Query([l, r]) before responding to the query. Then either $c_I$ or $c_Q$ is $\Omega(\log n)$.
Proof.
It is possible to use data structure D to design a comparison based sorting algorithm which is defined as follows: Let $a_1, a_2, \ldots, a_n$ be distinct numbers given as input for comparison sorting. For every $a_i$, perform D.Insert($[a_i, a_i]$). A linear search is performed on $a_1, \ldots, a_n$ to find the minimum, denoted by min. Finally, to compute the sorted order, for each $a_i$, perform D.Query($[\text{min}, a_i]$). If the query returns $k$, then $a_i$ is the $k$-th element in the sorted order. Thus, the total number of comparisons required to find the sorted order is $n \cdot c_I + O(n) + n \cdot c_Q$. It is well-known that any comparison sorting of $n$ numbers performs $\Omega(n \log n)$ comparisons [14]. Therefore, $n \cdot c_I + O(n) + n \cdot c_Q$ is $\Omega(n \log n)$. This implies that either $c_I = \Omega(\log n)$ or $c_Q = \Omega(\log n)$. Hence the Lemma. ∎
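The reduction in the proof can be sketched as follows. The class `NaiveIntervalStore` is a hypothetical stand-in for D that answers queries by a linear scan (so it does not attain any particular bound), and encoding each number as a degenerate interval is an assumption of this sketch; the point is only that insert and count queries suffice to sort distinct numbers.

```python
from typing import List, Tuple


class NaiveIntervalStore:
    """Stand-in for the data structure D: stores closed intervals and
    counts, by linear scan, how many of them intersect a query interval."""

    def __init__(self) -> None:
        self._items: List[Tuple[float, float]] = []

    def insert(self, l: float, r: float) -> None:
        self._items.append((l, r))

    def query(self, l: float, r: float) -> int:
        return sum(1 for a, b in self._items if a <= r and l <= b)


def sort_with_store(numbers: List[float]) -> List[float]:
    """Sort distinct numbers using only insert and count queries on D."""
    store = NaiveIntervalStore()
    for a in numbers:
        store.insert(a, a)            # each number becomes the interval [a, a]
    lo = min(numbers)                 # linear search for the minimum
    ranked: List[float] = [0.0] * len(numbers)
    for a in numbers:
        k = store.query(lo, a)        # how many inputs lie in [lo, a]: the rank of a
        ranked[k - 1] = a
    return ranked


sorted_numbers = sort_with_store([5, 1, 4, 2])
```

Since the count returned for `[lo, a]` is exactly the rank of `a`, any structure that made both operations cheaper than logarithmic in comparisons would give a comparison sort that is too fast.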
The result in Lemma 3 shows that any interval tree based approach which maintains the input intervals and computes intersecting intervals will use $\Omega(\log n)$ comparisons. On the other hand, in Section 2.3 we overcome the lower bound presented in Theorem 2 by maintaining additional information about the coloring computed by the KT-algorithm. This additional information plays a crucial role in an efficient data structure for the KT-algorithm.
2.3 Supporting Line Segment (SLS): a geometric handle
To overcome the limitation of computing a maximum clique in an induced subgraph, we maintain the size of some cliques, and use the structure of interval graphs to conclude that these cliques indeed represent a maximum clique in the neighborhood of each interval. The algorithm can be seen as an efficient version of the KT-algorithm and we refer to it as KT-SLS (KT-algorithm using supporting line segment).
For each $I_i$, $L(I_i)$ and $L(v_i)$ denote the level value computed by KT-SLS and the KT-algorithm, respectively. Further, for each $I_i$, $o(I_i)$ and $o(v_i)$ denote the offset computed by KT-SLS and the KT-algorithm, respectively. Let $t$ be a non-negative real number and $\mathcal{I}_t$ be the set of all intervals in $\mathcal{I}$ which contain the point $t$. For the set $\mathcal{I}_t$, define the set $levels(\mathcal{I}_t)$ to be the set of levels assigned to intervals in $\mathcal{I}_t$. Define $h_t = \min\{r \geq 0 \mid r \notin levels(\mathcal{I}_t)\}$. In other words, $h_t$ is the smallest non-negative integer which is not the level value for an interval containing $t$. For $t$, the Supporting Line Segment (SLS) at $t$ is defined to be the set $e_t = \{0, 1, \ldots, h_t - 1\}$, and $h_t$ is called the height of the SLS $e_t$. Note that the set $\mathcal{I}_t$ is of size at least $h_t$, and there are intervals in $\mathcal{I}_t$ for which the level values are $0$ to $h_t - 1$.
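Algorithm KT-SLS: For $1 \leq i \leq n$, the color for $I_i$ is computed as follows:
Step I: For each $t \in I_i$, compute the height, $h_t$, of the SLS $e_t$. Define $L(I_i) = \max_{t \in I_i} h_t$.
Key Properties maintained by Step I
- (a) For each interval $I_i$, $L(I_i) \leq L(v_i)$ (Lemma 5).
- (b) Property P: The set $\{I_i \mid L(I_i) = 0\}$ is an independent set. For each $r$, $1 \leq r \leq \omega - 1$, the subgraph of $G(\mathcal{I})$ induced on $\{I_i \mid L(I_i) = r\}$ has maximum degree at most 2 (Lemma 6).
Step II: Compute $o(I_i)$ to be the smallest value from the set $\{1, 2, 3\}$ that is different from the offset of the neighbours of $I_i$ which have the level $L(I_i)$.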
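The SLS-based variant described above can be sketched directly from its definition. The sketch assumes closed intervals, offsets drawn from 1, 2, 3, and that it suffices to evaluate the SLS height at endpoints lying inside the new interval (including its own endpoints); everything is recomputed by linear scans, so it illustrates the rule rather than the data structures. The names `sls_height` and `kt_sls_color` are hypothetical.

```python
from typing import List, Tuple

Colored = Tuple[Tuple[float, float], int, int]  # ((l, r), level, offset)


def sls_height(t: float, colored: List[Colored]) -> int:
    """Smallest non-negative integer that is not the level of any
    already colored interval containing the point t."""
    levels = {lvl for (l, r), lvl, _ in colored if l <= t <= r}
    h = 0
    while h in levels:
        h += 1
    return h


def kt_sls_color(sequence: List[Tuple[float, float]]) -> List[Tuple[int, int]]:
    colored: List[Colored] = []
    for l, r in sequence:
        # Candidate points: own endpoints plus earlier endpoints inside [l, r].
        points = {l, r}
        for (a, b), _, _ in colored:
            points.update(p for p in (a, b) if l <= p <= r)
        # Step I: the level is the maximum SLS height over these points.
        level = max(sls_height(t, colored) for t in points)
        # Step II: smallest offset unused by intersecting intervals of that level.
        used = {off for (a, b), lvl, off in colored
                if lvl == level and a <= r and l <= b}
        offset = min(o for o in (1, 2, 3) if o not in used)
        colored.append(((l, r), level, offset))
    return [(lvl, off) for _, lvl, off in colored]


sls_coloring = kt_sls_color([(0, 2), (6, 8), (1.5, 4), (3.5, 7)])
```

Unlike a direct implementation of the KT-algorithm, Step I here never computes a maximum clique; it only inspects which levels are present at a handful of points.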
Remark: We show an example in the accompanying figure, where for interval , the level value computed by KT-SLS is strictly less than the level value computed by the KT-algorithm. The intervals arrive in the following order : , and . Intervals , , and get level value $0$. KT-algorithm computes and KT-SLS computes . The portion colored as gray is the overlapping portion between interval and interval .
Correctness of Algorithm KT-SLS. We start by proving that the level value computed by Algorithm KT-SLS depends only on the set of endpoints of the intervals. This finite set is denoted by $\mathcal{E}$ and we refer to the points in the set $\mathcal{E}$ as endpoints.
**Lemma 4.**
For each real number $t$, there is an endpoint $p \in \mathcal{E}$ such that $h_p \geq h_t$.
Proof.
If $t$ is an endpoint of an interval, then $t \in \mathcal{E}$ and hence the Lemma is proved. Suppose $t$ is not an endpoint. Let $l'$ denote the largest left endpoint among all the intervals in $\mathcal{I}_t$ and $r'$ denote the smallest right endpoint among all the intervals in $\mathcal{I}_t$. By definition, $l' \leq t$ and $t \leq r'$. Since $\mathcal{I}_t$ is a set of intervals, it follows that $l'$ and $r'$ are present in all the intervals in $\mathcal{I}_t$. Further, since both $l'$ and $r'$ are present in each interval in $\mathcal{I}_t$, it follows that the set of intervals that contain them is a superset of $\mathcal{I}_t$. Therefore, the height of the SLS at $l'$ and $r'$ is at least the height of the SLS at $t$. Hence the Lemma. ∎
From the description in Section 2.1, we know that the level value of $v_i$ computed by the KT-algorithm is given by $L(v_i) = \min\{r \mid \omega(G_r(v_i)) \leq r\}$. Further in Algorithm KT-SLS, $L(I_i)$ is defined to be the maximum height of the SLS at the endpoints contained in $I_i$. We next prove that $L(I_i) \leq L(v_i)$.
**Lemma 5.**
For each $I_i$, $L(v_i)$ is at least the maximum height of the SLS at any endpoint contained in the interval $I_i$.
Proof.
By definition, $L(I_i)$ is the maximum height of the SLS at any endpoint contained in $I_i$. By the definition of the height of an SLS at an endpoint $t$, we know that for each $0 \leq r < h_t$ there is an interval $I \in \mathcal{I}_t$ such that $L(I) = r$, and all these intervals form a clique of size $h_t$. Therefore, it follows that the neighborhood of $I_i$ has a clique of size at least $L(I_i)$ formed by intervals with level value less than $L(I_i)$. Therefore, it follows that $L(I_i) \leq L(v_i)$. Hence the Lemma. ∎
We prove in Lemma 6 that the level values computed by Algorithm KT-SLS satisfy Property P.
**Lemma 6.**
Algorithm KT-SLS satisfies Property P and thus uses at most $3\omega - 2$ colors.
Proof.
We first prove that the set $\{I_i \mid L(I_i) = 0\}$ is an independent set. To show this, we prove that for a pair of intersecting intervals $I_i$ and $I_j$, at least one of $L(I_i)$ or $L(I_j)$ is more than 0. Without loss of generality, let us assume that the interval $I_i$ appeared before $I_j$. If $L(I_i) > 0$, then our claim is correct. We now consider the case when $L(I_i) = 0$. Since $I_i$ and $I_j$ intersect, it follows that an endpoint of one of them is contained in the other. Therefore, after $I_i$ is presented to Algorithm KT-SLS, the height of the SLS at one of the endpoints in $I_j$ is more than 0. By the definition of $L(I_j)$ in Algorithm KT-SLS, it follows that $L(I_j) > 0$. Therefore, $\{I_i \mid L(I_i) = 0\}$ is an independent set. The same argument also shows that if $I_i$ and $I_j$ intersect and $L(I_i) = L(I_j)$, then $I_i \not\subseteq I_j$ and $I_j \not\subseteq I_i$.
We now prove that the level value computed by Algorithm KT-SLS satisfies Property P. Let $I_i$ be the first interval during the execution of the algorithm which has at least 3 intersecting intervals in level $L(I_i)$. Let these intervals be $I_a$, $I_b$, and $I_c$. Since there cannot be a containment relationship between two intersecting intervals with the same level value, it follows that two of these intervals contain a common endpoint of $I_i$. Without loss of generality, let $I_a$ and $I_b$ contain a common endpoint of $I_i$. Further, we know by the Helly property for intervals that one of the intervals is contained in the union of the other two [15]. Consequently, one of the 3 intervals contains a point for which the SLS has height at least $L(I_i) + 1$. This contradicts the hypothesis that the level value of each of these intervals is $L(I_i)$. Therefore, our assumption that an interval has at least 3 neighbours in its own level is wrong. Consequently, all the intervals with the same level value are assigned an offset from $\{1, 2, 3\}$. Further, if two intervals with the same level value intersect, then they get different offsets as described in Step II. From Lemma 5, we know that for any interval $I_i$, we have $L(I_i) \leq L(v_i)$. Therefore, the maximum level value of any interval is $\omega - 1$. For level $0$ we use one color and for every other level we use at most $3$ colors. Therefore, the number of colors used by Algorithm KT-SLS is at most $1 + 3(\omega - 1) = 3\omega - 2$. Hence the Lemma. ∎
In the rest of the paper we design data structures that are useful in an efficient implementation of Algorithm KT-SLS in both the incremental and fully dynamic settings. Apart from data structures to maintain intervals, we also use data structures to maintain supporting line segments. The data structures to maintain the supporting line segments are crucial in overcoming the limitations of a direct implementation of the KT-algorithm. The necessary data structures are described in Section 2.4.
2.4 Dynamic Data Structures for Algorithm KT-SLS
In this section, we present the various data structures used to implement KT-SLS in the incremental and fully dynamic setting. For this purpose, we use the procedures listed in Table 1. The procedures in the incremental setting differ from their counterparts in the fully dynamic setting in the data structures used to store the SLS. The running times of the incremental and the fully dynamic algorithms are governed by the running times of these procedures. Detailed descriptions of the procedures are given in Section 3.1 for the incremental setting and in Section 4.4 for the fully dynamic setting. Next, we describe the different data structures which are used to implement the procedures in Table 1. The running times of the operations on these data structures are listed in Table 2.
Interval Trees to store intervals and endpoints:
1. Set of intervals $\mathcal{I}$. The set of intervals is maintained as an interval tree. Therefore, $\mathcal{I}$ is an interval tree such that for each $i$, interval $I_i$ is maintained as its left and right endpoints $l_i$ and $r_i$. Further, the level value $L(I_i)$ and the offset $o(I_i)$ are computed at the time of insertion, and updated as necessary in the fully dynamic case. The index of the update when $I_i$ is inserted is also stored and referred to as the time of insertion whenever necessary.
2. Set of endpoints $\mathcal{E}$. The set of endpoints of the intervals in $\mathcal{I}$ is stored as an interval tree denoted by $\mathcal{E}$. For every interval $I_i = [l_i, r_i]$, we maintain the left endpoint $l_i$ and the right endpoint $r_i$ as intervals $[l_i, l_i]$ and $[r_i, r_i]$ respectively in $\mathcal{E}$.
3. A hash table points to the sets of intervals with the same level value. For a non-negative integer $r$, the hash table entry for $r$ points to the interval tree which maintains the set of intervals with level value $r$.
Data Structures to store supporting line segments: At every endpoint $t \in \mathcal{E}$, the SLS $e_t$ and the height, $h_t$, of $e_t$ are maintained. In the incremental setting, the height of an SLS is non-decreasing with the updates, and this need not be true in the fully dynamic case. Thus we have different data structures to represent the SLS in the incremental setting and the fully dynamic setting.
**Incremental setting:** In the incremental setting, the SLS $e_t$ is maintained using a dynamic array indexed by level value. For a level value $r$, the entry at index $r$ is defined to be $1$ if there is an interval containing $t$ whose level value is $r$. Otherwise, the entry is defined to be $0$. Clearly, the height $h_t$ of the supporting line segment is the smallest index $r$ whose entry is $0$. To respond to queries for $h_t$ efficiently, a doubly linked list and a second dynamic array are used as follows. The doubly linked list stores every index $r$ whose entry is $0$, in increasing order of $r$. Note that the value stored at the head node of the list is $h_t$. The second dynamic array is defined as follows: for each index $r$, if the entry at $r$ is $0$, then the array stores at position $r$ a pointer to the node of the list which stores the index $r$; otherwise, it stores a null pointer. Using this array of pointers, insert, delete, and search operations in the list can be performed in constant time. Insertion into a dynamic array takes amortized constant time [19]. A query for the value of $h_t$ can be answered in constant time by returning the value stored in the head node of the list.
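A minimal sketch of such an incremental SLS structure is given below, assuming one instance per endpoint. The class name `IncrementalSLS`, the choice to always keep one spare absent index at the end of the array, and the exact doubling policy are assumptions of this sketch, not details taken from the paper.

```python
class _Node:
    """Node of the doubly linked list of level indices that are still absent."""
    __slots__ = ("index", "prev", "next")

    def __init__(self, index: int) -> None:
        self.index = index
        self.prev = None
        self.next = None


class IncrementalSLS:
    def __init__(self) -> None:
        self._ptr = []      # _ptr[r]: list node for index r, or None if level r is present
        self._head = None
        self._tail = None
        self._grow(1)

    def _grow(self, new_size: int) -> None:
        # New indices are absent and larger than all old ones, so appending
        # them at the tail keeps the list in increasing order.
        for r in range(len(self._ptr), new_size):
            node = _Node(r)
            node.prev = self._tail
            if self._tail is None:
                self._head = node
            else:
                self._tail.next = node
            self._tail = node
            self._ptr.append(node)

    def height(self) -> int:
        """Smallest absent level: the index stored at the head of the list."""
        return self._head.index

    def add_level(self, r: int) -> None:
        """Record that an interval of level r now contains this endpoint."""
        while r >= len(self._ptr) - 1:      # keep at least one absent index
            self._grow(2 * len(self._ptr))
        node = self._ptr[r]
        if node is None:
            return                          # level r was already present
        if node.prev is None:
            self._head = node.next
        else:
            node.prev.next = node.next
        if node.next is None:
            self._tail = node.prev
        else:
            node.next.prev = node.prev
        self._ptr[r] = None


sls = IncrementalSLS()
heights = [sls.height()]
for level in (1, 0, 2, 3):
    sls.add_level(level)
    heights.append(sls.height())
```

Because levels are only ever added in the incremental setting, nodes are only ever unlinked, which is why a plain linked list with direct pointers is enough here.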
**Fully dynamic setting:** The SLS $e_t$ is maintained using two Red-Black trees. A level value $r$ is stored in the first tree if there is an interval in $\mathcal{I}$ which contains $t$ and whose level value is $r$. Otherwise, the level value $r$ is stored in the second tree. To compute the height $h_t$, we do the following: if the second tree is non-empty, then the minimum value in the second tree is the required height $h_t$. If the second tree is empty, then $h_t$ is one more than the maximum value in the first tree. The number of nodes in the two trees is at most $\omega$. Therefore, the time required to compute $h_t$ is $O(\log \omega)$.
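The two-tree idea can be sketched as follows, with sorted Python lists standing in for the Red-Black trees (so the individual operations here do not have logarithmic cost). The class name `DynamicSLS`, the per-level counter used to handle several intervals of the same level at one endpoint, and the rule that only gaps below the current maximum are tracked as absent are all assumptions of this sketch.

```python
import bisect
from collections import Counter


class DynamicSLS:
    def __init__(self) -> None:
        self._present = []        # sorted distinct levels present at this endpoint
        self._absent = []         # sorted levels below max(present) that are missing
        self._count = Counter()   # number of intervals per present level

    def height(self) -> int:
        if self._absent:
            return self._absent[0]          # minimum of the "absent" structure
        if self._present:
            return self._present[-1] + 1    # one more than the maximum present
        return 0

    def add_level(self, r: int) -> None:
        self._count[r] += 1
        if self._count[r] > 1:
            return
        top = self._present[-1] if self._present else -1
        if r > top:
            # Levels strictly between the old maximum and r become tracked gaps.
            self._absent.extend(range(top + 1, r))
            self._present.append(r)
        else:
            self._absent.pop(bisect.bisect_left(self._absent, r))
            bisect.insort(self._present, r)

    def remove_level(self, r: int) -> None:
        self._count[r] -= 1
        if self._count[r] > 0:
            return
        del self._count[r]
        self._present.pop(bisect.bisect_left(self._present, r))
        top = self._present[-1] if self._present else -1
        if r > top:
            # r was the maximum: gaps above the new maximum are dropped.
            del self._absent[bisect.bisect_left(self._absent, top + 1):]
        else:
            bisect.insort(self._absent, r)


d = DynamicSLS()
trace = [d.height()]
for op, lvl in (("add", 0), ("add", 2), ("add", 1), ("del", 0), ("add", 0), ("del", 2), ("del", 1)):
    (d.add_level if op == "add" else d.remove_level)(lvl)
    trace.append(d.height())
```

Unlike the incremental case, the height can decrease after a deletion, which is why an ordered structure supporting both minimum and maximum queries is used.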
3 Incremental Interval Coloring using Algorithm KT-SLS
We present an incremental algorithm which is an implementation of Algorithm KT-SLS. Its pseudo code is presented in Algorithm 1 along with the corresponding steps. The procedures used in Algorithm 1 are described in Section 3.1. The amortized update time of Algorithm 1 is given by Theorem 7.
**Computing $L(I_i)$:**
Step 1: Insert $I_i$ into the set of intervals $\mathcal{I}$ (Line 1 in Algorithm 1). Check if the endpoint $l_i$ is already present in the set of endpoints $\mathcal{E}$ (Line 4 in Algorithm 1). If not, then compute the SLS at endpoint $l_i$ (Line 5 in Algorithm 1) and insert $l_i$ into the set $\mathcal{E}$ (Line 6 in Algorithm 1). Repeat for endpoint $r_i$ (Lines 8-11 in Algorithm 1).
Step 2: Compute the set $S$ of endpoints contained in $I_i$. For each $t \in S$, compute $h_t$, the height of the SLS at $t$. Assign $L(I_i) = \max_{t \in S} h_t$ (Lines 12-13 in Algorithm 1).
Step 3: Update the SLS for each point $t \in S$ (Line 14 in Algorithm 1).
**Computing $o(I_i)$:** Compute $o(I_i)$ to be the smallest value from the set $\{1, 2, 3\}$ which is different from the offset of the neighbours of $I_i$ which have the level $L(I_i)$ (Line 15 in Algorithm 1).
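Algorithm 1: handles insertion of interval $I_i = [l_i, r_i]$
1: $\mathcal{I}$.insert($I_i$)
2:
3:
4: if ($\mathcal{E}$.intersection($[l_i, l_i]$) $= \emptyset$) then
5:     compute the SLS at endpoint $l_i$
6:     $\mathcal{E}$.insert($[l_i, l_i]$)
7: end if
8: if ($\mathcal{E}$.intersection($[r_i, r_i]$) $= \emptyset$) then
9:     compute the SLS at endpoint $r_i$
10:     $\mathcal{E}$.insert($[r_i, r_i]$)
11: end if
12: $S \leftarrow$ compute-max-SLS($\mathcal{E}$, $I_i$)
13: $L(I_i) \leftarrow \max_{t \in S} h_t$
14: update-Endpoints($S$, $L(I_i)$)
15: compute-Offset($I_i$)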
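The flow of Algorithm 1 can be sketched with naive containers in place of the interval trees and SLS structures. In this sketch a dictionary from endpoint to a set of levels plays the role of the maintained SLS, and all lookups are linear scans; the class name `IncrementalColoring` and these container choices are assumptions made for illustration, so the sketch shows the order of the steps but not the stated running times.

```python
from typing import Dict, List, Set, Tuple


class IncrementalColoring:
    def __init__(self) -> None:
        self.intervals: List[Tuple[float, float, int, int]] = []  # (l, r, level, offset)
        self.sls: Dict[float, Set[int]] = {}  # endpoint -> levels of intervals containing it

    def _height(self, t: float) -> int:
        h = 0
        while h in self.sls[t]:
            h += 1
        return h

    def insert(self, l: float, r: float) -> Tuple[int, int]:
        # Step 1: register endpoints not seen before and compute their SLS.
        for t in (l, r):
            if t not in self.sls:
                self.sls[t] = {lvl for a, b, lvl, _ in self.intervals if a <= t <= b}
        # Step 2: S is the set of endpoints inside [l, r]; the level is the
        # maximum SLS height over S.
        S = [t for t in self.sls if l <= t <= r]
        level = max(self._height(t) for t in S)
        # Step 3: the new interval contributes its level to every endpoint in S.
        for t in S:
            self.sls[t].add(level)
        # Offset: smallest value unused by intersecting intervals of the same level.
        used = {off for a, b, lvl, off in self.intervals
                if lvl == level and a <= r and l <= b}
        offset = min(o for o in (1, 2, 3) if o not in used)
        self.intervals.append((l, r, level, offset))
        return (level, offset)


coloring = IncrementalColoring()
assigned = [coloring.insert(l, r) for l, r in [(0, 2), (1, 3), (2, 4), (5, 6)]]
```

No previously assigned color is changed by `insert`, which matches the online flavor of the incremental algorithm.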
**Theorem 7.**
*Algorithm 1 is an incremental algorithm which supports insertion of a sequence of $n$ intervals in amortized $O(\log n + \Delta)$ time per update.*
Proof.
We analyze the running time for computing $L(I_i)$ and $o(I_i)$.
Analysis for computing $L(I_i)$: Computing $L(I_i)$ involves three steps.
1. Step 1 in computing $L(I_i)$ takes time: Insertion of $I_i = [l_i, r_i]$ into $\mathcal{I}$ takes time. Let $l = [l_i, l_i]$. Checking if $l$ is present in $\mathcal{E}$ by an intersection query takes time in the worst case. If $l$ is in $\mathcal{E}$, then no further processing is done. On the other hand, if $l$ is not in $\mathcal{E}$ then the procedure which computes the SLS at $l_i$ is invoked. From Lemma 9, this procedure takes time. The same steps are repeated for $r = [r_i, r_i]$. Hence Step 1 in computing $L(I_i)$ for interval $I_i$ takes time.
2. Step 2 in computing $L(I_i)$ takes time: Procedure compute-max-SLS($\mathcal{E}$, $I_i$) is invoked to perform this step. From Lemma 10, procedure compute-max-SLS (Algorithm 4) takes time. Hence Step 2 in computing $L(I_i)$ for interval $I_i$ takes time.
3. Step 3 in computing $L(I_i)$ takes amortized time: Procedure update-Endpoints($S$, $L(I_i)$) is invoked to perform this step. From Lemma 11, procedure update-Endpoints (Algorithm 5) takes amortized time. Hence Step 3 in computing $L(I_i)$ for interval $I_i$ takes amortized time.
Analysis for computing $o(I_i)$: To compute the offset value of interval $I_i$ with level value $L(I_i)$, procedure compute-Offset($I_i$) is invoked. From Lemma 8, procedure compute-Offset($I_i$) takes $O(\log n)$ time.
Therefore, the total time taken by Algorithm 1 for insertion of $n$ intervals is the total time taken for Step 1, Step 2, Step 3, and the total time spent in computing the offset. For interval graphs, it is well known that $\omega \leq \Delta + 1$. Thus the running time is $O(n \log n + n\Delta)$. Therefore, the amortized update time over a sequence of $n$ interval insertions is $O(\log n + \Delta)$. Hence the Theorem. ∎
3.1 Procedures used in
The data structures used in designing these procedures are listed in Table 2.
Lemma 8**.**
Procedure takes as input interval , computes the offset value for interval and takes time in the worst case.
Proof.
Hash table is used to access the interval tree . If is NULL, then an interval tree is created with as the first interval (Line 2-3 in Algorithm 2). Otherwise, is the interval tree which stores all the intervals with level value same as . An intersection query is performed on with to obtain all the intervals that intersect with (Line 5 in Algorithm 2). From Property P, the maximum number of intervals returned by the above query is . The offset value of interval , , is set to be the smallest value from not assigned to any of the at most two neighbors of in level (Line 6 in Algorithm 2). Interval is inserted to (Line 7 in Algorithm 2).
Running time of is dominated by intersection query in Line 5 and insertion of interval in Line 7. Since , worst case running time of is . Hence the Lemma. ∎
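The offset computation in this proof can be sketched as below. The sketch makes several assumptions: the offset set is taken to be {1, 2, 3} (the set is elided in the text above; three values suffice because an interval has at most two neighbors in its level), the name `compute_offset` and its parameters are hypothetical, and a per-level list with a linear scan stands in for the per-level interval tree.

```python
from collections import defaultdict


def compute_offset(table, lo, hi, level):
    """Pick the smallest offset in {1, 2, 3} unused by same-level neighbours.

    table: dict mapping a level to a list of (lo, hi, offset) tuples;
           a stand-in for the hash table of interval trees.
    """
    used = {off for (a, b, off) in table[level] if a <= hi and lo <= b}
    # Raises StopIteration if more than two distinct offsets are taken,
    # which cannot happen while Property P holds.
    offset = next(o for o in (1, 2, 3) if o not in used)
    table[level].append((lo, hi, offset))
    return offset
```

A caller would create the table as `defaultdict(list)` so that the first interval of a level starts an empty bucket, mirroring the creation of a new interval tree when the hash table entry is NULL.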
Lemma 9**.**
Procedure takes set of intervals and endpoint as input, maintains SLS at using dynamic array and doubly linked list , and takes time in the worst case.
Proof.
performs an intersection query on with interval (Line 2 in Algorithm 3). Let denote the set returned by the intersection query. Maximum height = and the set = are computed (Line 7-9 in Algorithm 3). For every in the range , is set to [math] and is inserted to (Line 11-13 in Algorithm 3). For every in the range , is reset to if and is deleted from (Line 15-17 in Algorithm 3). It returns and .
Running time of is dominated by the intersection query in Line 2, loop in Line 7-9, loop in Line 11-13, and loop in Line 15-17. At any level, SLS intersects with at most intervals and we have many levels. Hence, = , , and . Again, . Therefore, time taken by procedure in the worst case is . Hence the Lemma. ∎
Lemma 10**.**
Procedure takes set of endpoints and interval as input, computes the set of endpoints contained in and maximum among the height of the SLS at the endpoints in , and takes time in the worst case.
Proof.
Procedure performs an intersection query of on (Line 2 in Algorithm 4). This query returns the set of all the endpoints which intersect with . It computes the maximum among the height of the SLS at endpoints in , denoted by (Line 4-11). The procedure returns and the set .
Running time of is dominated by intersection query in Line 2 and loop from Line 4-11. Since is the maximum degree in the associated interval graph, interval can intersect with at most intervals. Therefore, = . Further, . Thus the worst case time taken by is . Hence the Lemma. ∎
The different procedures used in are described and analyzed below.
**Procedure $\mathsf{update\text{-}Endpoints}(S, L(I))$:** This procedure, described in Algorithm 5, is used to update the SLS at the endpoints contained in set : let . For every endpoint , size of is checked.
Case A: If .size(). In this case, is set to . The pointer stored in is used to delete the node in which stores the value and is set to NULL subsequently. If the deleted node in was the head node, then the head node is updated to the next node in and thus the value of also gets updated.
Case B: If .size(). In this case, the standard doubling technique for expansion of dynamic arrays [19] is used to increase the size of until .size() becomes strictly greater than . is also expanded along with and appropriate nodes are inserted to . Once .size() , the remaining operations are the same as in Case A.
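Cases A and B can be sketched for a single endpoint as follows. This is an illustration under assumptions, not the paper's data structure: the class `ArraySLS` and its fields are hypothetical, levels are assumed to start at 0, and an insertion-ordered Python dictionary stands in for the doubly linked list of unfilled levels (constant-time unlinking, first key as head). Because the setting is incremental, keys are appended in increasing order and never re-inserted, so the dictionary stays sorted.

```python
class ArraySLS:
    """SLS at one endpoint in the incremental setting."""

    def __init__(self):
        self.filled = [False]     # filled[v]: an interval of level v contains the endpoint
        self.missing = {0: None}  # ordered stand-in for the linked list of unfilled levels

    def update(self, level):
        # Case B: double the array until it can be indexed by `level`.
        while len(self.filled) <= level:
            old = len(self.filled)
            self.filled.extend([False] * old)
            for v in range(old, 2 * old):
                self.missing[v] = None   # new list nodes, in increasing order
        # Case A: mark the level and unlink its list node, if it still has one.
        if not self.filled[level]:
            self.filled[level] = True
            del self.missing[level]

    def height(self):
        # Head of the list; if no level is missing, every tracked level is filled.
        return next(iter(self.missing), len(self.filled))
```

Each call to `update` does constant work apart from the doubling loop, whose total cost over all calls is proportional to the final array size.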
Lemma 11**.**
Procedure takes amortized time.
Proof.
For every endpoint , size of is checked in constant time. To analyze the time required in , we observe that every update must perform the operations described in Case A. We refer to these operations as task M (M stands for mandatory). Some updates have to perform additional operations as described in Case B. We refer to these operations as task A (A stands for additional). The time taken by each update to perform task M is . Since is the maximum degree, it follows that . Therefore, every update takes time to perform task M in the worst case. To analyze the time required to perform task A, we crucially use the fact that our algorithm is incremental and hence only expansions of the dynamic arrays take place. Since is the size of the maximum clique, it follows that the maximum size of a dynamic array throughout the entire execution of the algorithm is upper bounded by . Over a sequence of insertions, the total number of endpoints is upper bounded by . Therefore, we maintain at most dynamic arrays. For every such array, the total number of inserts in the array and the associated doubly linked list is at most in the entire run of the algorithm. An insertion into the dynamic array takes constant amortized time and an insertion into the doubly linked list takes constant worst case time. Therefore, during the entire run of the algorithm the total time required to perform task A on one dynamic array and its associated doubly linked list is . This implies that during the entire run of the algorithm the total time spent on task A over all the updates is . Let be the total time spent on at the end of insertions. This is the sum of the total time for task A and the total time for task M. Further, since , it follows that . Hence the Lemma. ∎
4 Fully dynamic interval coloring
An update in the update sequence in the fully dynamic setting consists of an interval to be colored or a previously colored interval to be deleted. The -th update is Insert where is the interval presented to the algorithm. The update Delete is to delete the interval that was inserted during the -th update. The invariants maintained at the end of each update imply that the intervals are colored with at most colors, where is the size of the maximum clique in the interval graph just after the update. For an insert update, we use (Algorithm 1) to ensure that the invariants are maintained at the end of the update. However, to get a good bound on the update time, we use a different set of data structures to maintain SLS (see Section 2.4). Therefore, the main contribution of this section is handling the deletion of a previously colored interval. There are two aspects to the algorithm: the first is to ensure that the invariants are maintained after a delete, and the second is to ensure that the update is efficient.
4.1 Algorithm for Delete
Let be the color of at the beginning of the update. The pseudo code and steps of Algorithm are presented in Algorithm 6.
Step 1: Remove interval from the set of intervals and hash table (Line 1 and Line 2 in Algorithm 6).
Step 2: Compute the set of endpoints contained in . Let = . (Line 3 in Algorithm 6)
Step 3: For each endpoint , update SLS to reflect the deletion of interval . (Line 4 to 8 in Algorithm 6)
Step 4: Compute the set of intervals intersecting with and with level value strictly greater than . Let = (Line 10 in Algorithm 6). Sort in the increasing order of level value and break the ties in the increasing order of time of insertion. For every interval in , repeat the following steps:
Compute the set of endpoints intersecting with . Let = . For every endpoint compute the height of the SLS . Compute = (Line 11 in Algorithm 6).
If then no further processing is required for interval .
If then following steps are executed (Line 17 to 25 in Algorithm 6):
(a) Change level value of from to .
(b) Recompute the offset value for with the new level value .
(c) Update SLS for every point to reflect the change in level value of .
Algorithm 6 () is used to handle deletion of interval
1:delete
2:.delete()
3:.intersection()
4:for in do
5: if .intersection( then
6: .delete(
7: .insert()
8: end if
9:end for
10:Compute . Sort in the increasing order of level value and break the ties in the increasing order of time of insertion.
11:for in do
12: (,)
13: if then
14: continue
15: end if
16: delete()
17: insert()
18: for in do
19: if .intersection( then
20: .delete(
21: .insert()
22: end if
23: .insert()
24: .delete()
25: end for
26:
27: ()
28:end for
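The candidate set built in Step 4 (Line 10 of Algorithm 6) and its processing order can be sketched as below. The function name `candidates_after_delete` and the tuple layout `(lo, hi, level, time)` are hypothetical, intervals are assumed closed, and a linear scan stands in for the interval tree query.

```python
def candidates_after_delete(intervals, deleted):
    """Intervals that may need a lower level once `deleted` is removed.

    intervals: list of (lo, hi, level, time) tuples still stored
    deleted:   the (lo, hi, level, time) tuple being removed
    Returns the intervals intersecting `deleted` with a strictly larger level,
    ordered by level and then by insertion time.
    """
    d_lo, d_hi, d_level, _ = deleted
    hit = [iv for iv in intervals
           if iv[0] <= d_hi and d_lo <= iv[1] and iv[2] > d_level]
    return sorted(hit, key=lambda iv: (iv[2], iv[3]))
```

Processing lower levels first means that by the time an interval is examined, every interval below it has already settled at its final level.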
4.2 Correctness of
The first crucial property maintained by Algorithm is that at the end of the update, for each interval , there exists a point in such that the height of the SLS at is at least . We refer to this property as Invariant C. The second crucial property maintained is property P. The following lemma proves a bound on the number of colors used.
Lemma 12**.**
At the end of each update, the number of colors used is at most , where is the size of the maximum clique in the interval graph just after the update.
Proof.
We start by assuming that prior to the update invariant C and Property P are satisfied by the coloring. We show that after the update they continue to be satisfied. For update Insert invariant C and Property P are satisfied by the coloring at the end of the update. This follows from Lemma 5, which shows that invariant C is maintained by Algorithm , and Lemma 6, which proves that Algorithm maintains Property P.
For update Delete we know from Lemma 13 that the set consists of those intervals whose level values are changed by Algorithm to ensure that invariant C is maintained. Algorithm iterates over each interval in and ensures, by modifying if necessary, that there is a point such that . Whenever is modified, it is modified to be the maximum , over all . This choice of also ensures that has at most two neighbors in the level and none of the neighbors with level number has a containment relationship with . The proof uses the same argument in Lemma 6. Thus Property P is maintained by Algorithm .
Therefore, it follows that after the update, Property is satisfied by the level values of the intervals and from invariant C it follows that the largest level value of any interval is at most . Further, the intervals whose level values are [math] form an independent set. Therefore, the number of colors used by the algorithm at the end of an update is . Consequently, the algorithm uses at most colors after each update step. Hence the Lemma. ∎
We now prove that on the update Delete, it is sufficient for Algorithm to consider the set which is defined to be = . On update Delete, an interval is called dirty if invariant C is violated for that interval during the execution of Algorithm . An interval which is not dirty during the execution of Algorithm is called clean. In Lemma 13, we show that is a superset of all intervals which become dirty.
Lemma 13**.**
During the execution of Algorithm on the update Delete, if an interval becomes dirty, then .
Proof.
On deletion of , the supporting line segments are naturally classified into two sets: those whose height reduces and those whose height does not reduce. Let be a point for which the height does not reduce. First we consider an interval which contains , , and , and show that does not become dirty during the execution of Algorithm . This is because, during the execution of the algorithm, only intervals with level value greater than are considered for a reduction in level value. Therefore, the height of will remain at least throughout the execution of the algorithm. Therefore, does not become dirty during the execution of Algorithm .
Next, let us consider an interval which contains , , and . The algorithm considers the intervals in increasing order of level number. Thus, it follows that the level value of an interval which contains and for which will not reduce during the execution of Algorithm . Therefore, the height of the SLS at does not change throughout the execution of Algorithm , and thus does not become dirty during the execution of Algorithm . Therefore, an interval which becomes dirty during the execution of Algorithm must contain a whose height reduces on the deletion of . Thus, intersects with at , and as we have proved above, must have level value more than . In other words it must be an element of . Hence the Lemma. ∎
4.3 Worst-case analysis of runtime of and
Lemma 14**.**
Algorithm 6 implements in time.
Proof.
It is clear from the description that Algorithm 6 implements each of the steps of . The most expensive steps in Algorithm 6 are the computation and sorting of , and updating the level values of the intervals in , if necessary, in lines 11-28. is computed by an intersection query to the interval tree and the worst-case running time is , where is the number of endpoints in which is the number of intervals intersecting with . Subsequently, sorting takes time. Each iteration in lines 11-28 is for an interval , and the running time of an iteration is dominated by the iteration in lines 18-25. The number of times lines 18-25 are executed is , which is the number of endpoints in , where is . In each of these iterations, the Red-Black trees are updated to reflect the height of the corresponding supporting line segment, and this takes time, using the fact that the number of values in each Red-Black tree is at most . Thus the running time of Algorithm 6 is . Hence the Lemma. ∎
We next analyze which is implemented by Algorithm 1 in the fully-dynamic setting by representing supporting line segments as Red-Black trees. Recall that in the incremental case they were represented by dynamic arrays and a doubly linked list.
Lemma 15**.**
Algorithm 1 with supporting line segments represented as Red-Black trees implements in the fully-dynamic setting. Algorithm 1 inserts an interval in worst case time.
Proof.
In Lemma 17, Lemma 18, and Lemma 19, respectively, we prove that in the fully-dynamic setting, with SLS represented as Red-Black trees, , , and correctly implement Step 1, Step 2, and Step 3 of . Therefore, it follows that is implemented by Algorithm 1 correctly. Further, these Lemmas also show that the worst-case running times of these functions are , , , respectively. Therefore, the worst-case running time of , implemented by Algorithm 1 with SLS represented as Red-Black trees, is . Hence the Lemma. ∎
Our fully-dynamic algorithm for interval coloring follows by combining Lemma 14 and Lemma 15.
Theorem 16**.**
There exists a fully dynamic algorithm which supports insertion of an interval in and deletion of an interval in worst case time.
4.4 Procedures used in and
The procedures , , and , which are defined in Section 3.1, are redefined in this section with Red-Black trees used to represent supporting line segments. The worst-case running times of these procedures differ from their running times in Section 3.1. The data structures used in designing these procedures are listed in Table 2.
Lemma 17**.**
Procedure takes as input the set of intervals and endpoint , maintains SLS at as Red-Black trees and , and takes time in the worst case.
Proof.
performs an intersection query on with (Line 2 in Algorithm 7). The query returns all the intervals in which contain endpoint . Let denote the set returned by the intersection query. Set = and height = are computed (Line 5-8 in Algorithm 7). For every in the range , is inserted to (Line 11-13 in Algorithm 7). For every in the set , is deleted from and inserted to (Line 14-17 in Algorithm 7). returns and .
Running time of is dominated by the intersection query in Line 2, and loops in Line 11-13 and Line 14-17. At any level, SLS intersects with at most intervals and we have many levels. Hence, = . Again, . Therefore, intersection query takes . Further, a single insertion in and takes time. Therefore, total time taken by the loops is . This implies that the worst case time taken by is . Hence the Lemma. ∎
Lemma 18**.**
Procedure takes as input set of endpoints and interval , computes the maximum height of SLS contained in interval , and takes time in the worst case.
Proof.
works as follows (Algorithm 8): an intersection query is performed on with (Line 2). Let be the set returned by the intersection query. For every endpoint , to compute the height of SLS , the following steps are used: If is non empty then the minimum value in is assigned to (Line 10). Otherwise, is assigned a value which is one more than the maximum value in (Line 8). The maximum value of the height of an SLS at any endpoint in is computed as = (Line 12). The procedure returns the set and value .
Running time of is dominated by the intersection query in Line 2 and the loop in Line 4-13. We know that and . Therefore, worst case time taken by is . Hence the Lemma. ∎
Lemma 19**.**
Procedure takes set of endpoints and as input, updates the SLS at the endpoints contained in set , and takes time in the worst case.
Proof.
The procedure works as follows (Algorithm 9): for every , is deleted from and is inserted to . For one SLS it takes time and . Therefore, worst case time taken by is . Hence the Lemma. ∎
Procedure $\mathsf{compute\text{-}Offset}(I)$: This procedure is the same as the one described in Section 3.1.
5 Quadratic lower bound for induced neighborhood subgraph computation
In Section 2.2 we showed that a direct implementation of the KT-algorithm will not run in sub-quadratic time. From Section 2.1 the crucial step is to compute maximum clique in an induced subgraph of the neighborhood of the interval inserted during an update. In this section we explore an interesting connection between computing the induced subgraph of the neighborhood of a vertex in a graph and the well-known OMv conjecture due to Henzinger et al., [2]. Formally, we define the following problem:
**Induced Neighborhood Subgraph Computation:** The input to the Induced Neighborhood Subgraph Computation problem consists of the adjacency matrix of a directed graph and a set of vertices. The goal is to compute the graph induced by and output the subgraph as adjacency lists. Here is the set of those vertices which have a directed edge from some vertex in . In other words, there is a directed edge from to iff the entry is .
We show that Induced Neighborhood Subgraph Computation problem is at least as hard as the following problem.
Online Boolean Matrix-Vector Multiplication (OMv) [2]: The input for this online problem consists of an $n \times n$ Boolean matrix $M$, and a sequence of $n$ Boolean column vectors $v_1, \ldots, v_n$, presented one after another to the algorithm. For each $i$, the online algorithm should output $M v_i$ before $v_{i+1}$ is presented to the algorithm. Note that in this product, a multiplication is an AND operation and the addition is an OR operation.
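As a point of reference, the Boolean product and the online protocol can be sketched naively. The names `bool_matvec` and `omv` are hypothetical, and this baseline spends time quadratic in the dimension per vector; it is not an algorithm from the paper.

```python
def bool_matvec(matrix, vector):
    """Boolean product: AND for multiplication, OR for addition."""
    return [int(any(a and b for a, b in zip(row, vector))) for row in matrix]


def omv(matrix, vectors):
    """Yield each product before the next vector is read from `vectors`."""
    for vector in vectors:
        yield bool_matvec(matrix, vector)
```

Writing `omv` as a generator mirrors the online constraint: the answer for one vector is produced before the next vector is consumed.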
The current best algorithm for the OMv problem has an expected running time of [21]. The following conjecture, due to Henzinger et al., [2], is well known about the OMv problem.
**OMv conjecture:** The Online Boolean Matrix-Vector Multiplication (OMv) problem does not have an $O(n^{3-\epsilon})$ time algorithm for any constant $\epsilon > 0$.
In Theorem 20 we reduce the OMv problem to the Induced Neighborhood Subgraph Computation problem. As a consequence of our reduction, an efficient algorithm for the Induced Neighborhood Subgraph Computation problem implies an efficient algorithm for the OMv problem.
Theorem 20**.**
Any algorithm for Induced Neighborhood Subgraph Computation problem needs at least quadratic time unless OMv conjecture is false.
Proof.
We show that an algorithm to solve the Induced Neighborhood Subgraph Computation problem can be used to solve the Online Boolean Matrix-Vector Multiplication problem. Let be an algorithm for the Induced Neighborhood Subgraph Computation problem with the running time of being , for some . We use algorithm to solve the Online Boolean Matrix-Vector Multiplication problem in time as follows : Let be the input matrix for the Online Boolean Matrix-Vector Multiplication problem and let be the column vectors presented to the algorithm one after the other. For the column vector , let set . To compute , we invoke on input . Let denote the induced subgraph on computed by the algorithm . Note that is an induced subgraph of the directed graph whose adjacency matrix is . To output the column vector , we observe that the -th row in the output column vector is 1 if and only if and there is an edge in such that . Given that has been computed in time, it follows that the number of edges in is and consequently the column vector can be computed in time. Therefore, using the algorithm we can solve Boolean Matrix-Vector Multiplication problem in time. If we believe that the OMv conjecture is indeed true, then it follows that the Induced Neighborhood Subgraph Computation problem cannot have an algorithm for any . Hence the Theorem. ∎
5.1 OMv conjecture is false for instances with the consecutive ones property
A 0-1 matrix is said to have the consecutive ones property if in each row, the column indices which have a 1 form an interval. A 0-1 column vector satisfies the consecutive ones property if the row indices which have a 1 in the column form an interval. We consider a special case of the OMv problem where the input matrix and the sequence of online vectors satisfy consecutive ones property. Each row in the matrix corresponds to an interval and every column index is a point on the number line. In particular, in the -th row if and are the least and largest column index, respectively, such that , then the -th row corresponds to the interval . For each , if and are the least and largest indices in such that then the vector is interpreted as the interval .
Now, using the data structures described in Table 2 in Section 2.4 we design an algorithm to solve OMv problem in quadratic time for this special case.
Theorem 21**.**
OMv conjecture is false if the input matrix and the vectors in the online vector sequence have the consecutive ones property.
Proof.
The proof is by presenting an algorithm to solve the OMv problem. The algorithm has a preprocessing step in which the intervals corresponding to the rows of the matrix are maintained in an interval tree. Subsequently, the matrix vector product is computed using queries to the interval tree.
**Preprocessing step:** The intervals corresponding to the rows in are computed. Let = denote the set of intervals corresponding to the rows in . Computing the set takes time. An interval tree is constructed using the set . The construction of takes time. Therefore, total time required in the preprocessing step is .
**Computing :** For , when vector is presented, interval corresponding to is computed in time. Let be the set returned by the intersection query . Since , from Table 2 the time required by the query is . is now computed as follows: for each , if interval is present in then the -th position in the vector is set to $1$, otherwise to $0$. Thus can be computed in time. Therefore, total time required for computing is . Thus the OMv problem on such instances can be solved in time . Hence the Theorem. ∎
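The algorithm in this proof can be sketched as follows, assuming every row and every vector has the consecutive ones property. The names `as_interval` and `omv_consecutive_ones` are hypothetical, all-zero rows or vectors are treated as having no interval, and a linear scan over the row intervals stands in for the interval tree query.

```python
def as_interval(bits):
    """Return (first, last) index of the ones in `bits`, or None if all zero."""
    ones = [k for k, b in enumerate(bits) if b]
    return (ones[0], ones[-1]) if ones else None


def omv_consecutive_ones(matrix, vectors):
    """Boolean matrix-vector products via interval intersection."""
    rows = [as_interval(row) for row in matrix]   # preprocessing step
    products = []
    for vector in vectors:
        q = as_interval(vector)
        products.append([
            # Entry is 1 exactly when the row interval meets the vector interval.
            int(r is not None and q is not None and r[0] <= q[1] and q[0] <= r[1])
            for r in rows
        ])
    return products
```

Each product is obtained with work linear in the dimension, because a row contributes a 1 exactly when its interval of ones overlaps the vector's interval of ones.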
References
- [1] H. A. Kierstead, W. T. Trotter, An extremal problem in recursive combinatorics, Congressus Numerantium 33 (143-153) (1981) 98.
- [2] M. Henzinger, S. Krinninger, D. Nanongkai, T. Saranurak, Unifying and strengthening hardness for dynamic problems via the online matrix-vector multiplication conjecture, in: Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, STOC, 2015, pp. 21–30.
- [3] D. O. Antonie Dutot, Frederic Guinand, Y. Pign, On the decentralized dynamic graph coloring problem, in: Complex Systems and Self Organization Modelling, 2007, pp. 259–261.
- [4] L. Ouerfelli, H. Bouziri, Greedy algorithm for dynamic graph coloring, in: Communications, Computing and Control Applications, 2011, pp. 1–5.
- [5] S. P. M. G. M. R. Scott Sallinen, Keita Iwabuchi, R. A. Pearce, Graph coloring as a challenge problem for dynamic graph processing on distributed systems, in: International Conference for High Performance Computing, Networking, Storage and Analysis, 2016, pp. 347–358.
- [6] R. L. Bradley Hardy, J. Thompson, Tackling the edge dynamic graph coloring problem with and without future adjacency information, in: Journal of Heuristics, 2017, pp. 1–23.
- [7] M. Henzinger, P. Peng, Constant-time dynamic (Δ+1)-coloring and weight approximation for minimum spanning forest: Dynamic algorithms meet property testing, CoRR abs/1907.04745 (2019).
- [8] S. Bhattacharya, F. Grandoni, J. Kulkarni, Q. C. Liu, S. Solomon, Fully dynamic (Δ+1)-coloring in constant update time, CoRR abs/1910.02063 (2019).
