Approximation schemes for the generalized extensible bin packing problem
Asaf Levin

Faculty of Industrial Engineering and Management, The Technion, 32000 Haifa, Israel.
This research was supported by a grant from the GIF, the German-Israeli Foundation for Scientific Research and Development (grant number I-1366-407.6/2016) and by a grant from the ISF, the Israel Science Foundation (grant number 308/18).
Abstract
We present a new generalization of the extensible bin packing with unequal bin sizes problem. In our generalization the cost of exceeding the bin size depends on the index of the bin and not only on the amount by which the size of the bin is exceeded. This generalization does not satisfy the assumptions on the cost function that were used to present the existing polynomial time approximation scheme (PTAS) for the extensible bin packing with unequal bin sizes problem. In this work, we show the existence of an efficient PTAS (EPTAS) for this new generalization and thus in particular we improve the earlier PTAS for the extensible bin packing with unequal bin sizes problem into an EPTAS. Our new scheme is based on using the shifting technique followed by a solution of a polynomial number of n-fold programming instances. In addition, we present an asymptotic fully polynomial time approximation scheme (AFPTAS) for the related bin packing type variant of the problem.
1 Introduction
We define the following load balancing on parallel machines problem that we name the generalized extensible bin packing problem (gebp). The input consists of jobs, where job has size , there are machines where for all , machine is associated with three positive input numbers , such that the following assumption holds:
[TABLE]
Assigning the set of jobs to machine , incurs a load on machine that is the total size of jobs in . That is, the load of machine that is assigned the set of jobs is , and the cost of machine is
[TABLE]
The goal of gebp is to find a partition of the jobs to machines such that the total cost of the machines (in this solution) is minimized. In this definition of the cost function of machine , the value of is seen as a fixed cost of machine , the value of is the standard capacity of machine , and is the cost of extending the capacity of machine by one unit of overtime. The value of captures the speed in which increasing the total size of jobs assigned to causes the cost of to increase by one unit. This speed is similar to the roles of speeds in the environment of uniformly related machines that is widely studied in the scheduling literature. In our study it will be easier to refer to the reciprocal of the speed (i.e., to the values of ) and not to the speeds.
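The cost model described above can be illustrated with a short sketch. The names `fixed_cost`, `capacity`, and `overtime_rate` are labels chosen here for the fixed cost, the standard capacity, and the per-unit overtime cost of a machine; they are not the paper's notation. The sketch assumes that the cost of a machine is its fixed cost plus the overtime rate times any load above the standard capacity, which is how the text describes the three parameters.

```python
from fractions import Fraction


def machine_cost(load, fixed_cost, capacity, overtime_rate):
    """Assumed cost model: fixed cost plus overtime_rate per unit of load above capacity."""
    overtime = max(Fraction(0), Fraction(load) - Fraction(capacity))
    return Fraction(fixed_cost) + Fraction(overtime_rate) * overtime


def total_cost(assignment, job_sizes, machines):
    """assignment[j] is the machine index of job j;
    machines[i] is a (fixed_cost, capacity, overtime_rate) tuple (hypothetical layout)."""
    loads = [Fraction(0)] * len(machines)
    for job, machine in enumerate(assignment):
        loads[machine] += Fraction(job_sizes[job])
    return sum(machine_cost(load, *params) for load, params in zip(loads, machines))


# Two machines, three jobs: jobs 0 and 1 go to machine 0, job 2 to machine 1.
example = total_cost([0, 0, 1], [1, 1, 2], [(1, 1, 1), (3, 2, 1)])
```

In the example, machine 0 has load 2 against capacity 1 and pays one unit of overtime, while machine 1 stays within its capacity and pays only its fixed cost.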
The extensible bin packing problem (ebp) is the special case of gebp where for every machine we have (note that these values satisfy (1)). Even this special case is strongly NP-hard via the standard reduction from 3-Partition. This extensible bin packing problem was suggested by [22, 7]. Another special case of gebp that was considered before is the case of extensible bin packing with unequal bin sizes (ebp-ubs). This ebp-ubs is defined as the special case of gebp where for every machine , and (once again (1) holds for such values). Another interesting special case of gebp that generalizes ebp is the generalization from identical machines to uniformly related machines, that is, the special case of gebp where for all . Observe that this last case does not generalize ebp-ubs and it is not a special case of ebp-ubs. Our new model is defined in order to generalize all these special cases.
Before we state our main result and present the literature, we define the notion of approximation algorithms and the different types of approximation schemes. An -approximation algorithm for a minimization problem is a polynomial time algorithm that always finds a feasible solution of cost at most times the cost of an optimal solution. The infimum value of for which an algorithm is an -approximation is called the approximation ratio of the algorithm. A polynomial time approximation scheme (PTAS) is a family of approximation algorithms such that the family has a -approximation algorithm for any . An efficient polynomial time approximation scheme (EPTAS) is a PTAS whose time complexity is of the form where is some (not necessarily polynomial) computable function, and is a polynomial function of the length of the (binary) encoding of the input. A fully polynomial time approximation scheme (FPTAS) is a stronger concept, defined like an EPTAS, but the function must be a polynomial in . When we consider an EPTAS we say that an algorithm (for some problem) has a polynomial running time complexity if its time complexity is of the form . Note that while a PTAS may have time complexity of the form , where can be polynomial or even super-exponential, this cannot be the case for an EPTAS. The notion of an EPTAS is modern and finds its roots in the FPT (fixed parameter tractable) literature (see [4, 10, 14, 19]). It was introduced in order to distinguish practical from impractical running times of PTAS’s, for cases where a fully polynomial time approximation scheme (FPTAS) does not exist (unless P=NP). In this work, we design an EPTAS for gebp for which an FPTAS does not exist unless P=NP as gebp is strongly NP-hard.
In [22] Speranza and Tuza analyzed an online variant of ebp and considered the list scheduling heuristic showing that it is a -approximation while a slightly improved algorithm is suggested whose approximation ratio is and a lower bound of is established for this online variant. In [7] Dell’Olmo et al. showed that the longest processing time heuristic is a -approximation for ebp. The EPTAS of Alon et al. [1, 2] for load balancing on identical machines solves ebp and thus this special case admits an EPTAS prior to this work. The time complexity of this EPTAS for ebp (among other problems on identical machines) was improved in the work of Jansen, Klein, and Verschae [18]. The online problem was studied further in [24]. See also [9, 3, 20] for a study of this special case in the stochastic settings in the context of scheduling operating rooms, and [21] for a use of the approximation algorithms for this problem in PCM interface management arising in wireless switch design.
The study of ebp-ubs was initiated by Dell’Olmo and Speranza [8] who showed that the approximation ratio of the longest processing time heuristic is and that the approximation ratio of the online algorithm list scheduling is exactly . They also showed that any online algorithm has an approximation ratio of at least . The PTAS of Epstein and Tassa [13] for vector scheduling in asymmetric settings gives a PTAS for ebp-ubs. Their assumption that the cost functions of the machines have a common constant upper bound on the Lipschitz constants cannot be met for gebp as this means that the maximum ratio between the costs of extending a pair of machines by one unit of overtime is bounded by a constant (i.e., their scheme assumes that is bounded by a constant independent of the pair of machines ). The online problem of ebp-ubs was also studied in [23] who analyzed the performance guarantee of list scheduling as a function of the standard capacities of the machines and present an improved online algorithm for the cases .
Thus, with respect to the existence of approximation schemes, ebp was known to admit an EPTAS while ebp-ubs was known to have a PTAS (that is not an EPTAS). The approximability of gebp, as well as that of its special case of for all , was not studied before.
Our main result is an EPTAS for gebp. In particular, we improve upon the scheme of [13] for ebp-ubs and present the first EPTAS for this (previously studied) special case. Our scheme first applies preprocessing steps and then breaks the asymmetry between the machines in a two-step approach. In the first step, we use the machinery of the shifting technique in order to partition the instance into polynomially many sub-instances each of which has the additional property that the standard capacities of the machines in the sub-instance are similar. The resulting sub-instance still captures unbounded asymmetry between the machines, and in order to tackle the sub-instances we use the recent algorithms for n-fold programming. We refer to [17] for an earlier EPTAS for a different scheduling problem that is based on solving an n-fold programming instance. The time complexity of our scheme is a single exponential function of times a polynomial of .
We conclude this work in Section 5 by showing the existence of an asymptotic fully polynomial time approximation scheme for a related bin packing type variant of the problem similarly to the variant of ebp studied by [6, 5]. In this variant of gebp the number of machines of each type is not part of the input and is determined as part of the solution. Namely, for every we first decide how many machines of type with a fixed cost , a standard capacity , and the cost for overtime, to have where for all , . In a second stage we find a feasible allocation of the jobs to the machines we have (according to the decisions made in the first stage). We denote this variant gebp-bpv. In Section 5 we establish the existence of an asymptotic fully polynomial time approximation scheme (AFPTAS) for gebp-bpv. We note that the special case of one type of machines with was considered by Coffman and Lueker [6, 5] who presented an AFPTAS for this special case of gebp-bpv.
Paper outline.
We present our EPTAS for gebp in the main part of the paper. This exposition is partitioned into preprocessing steps and a characterization of near optimal solutions in Section 2, followed by an analysis of the shifting technique when applied to gebp in Section 3, and finally the use of the n-fold programming algorithm to solve the family of sub-instances resulting from the shifting step, described in Section 4. We establish the existence of an AFPTAS for gebp-bpv in Section 5.
2 Preprocessing steps and the structure of a near optimal solution
We assume that satisfies that is an integer. We use the fact that in order to establish the existence of an EPTAS for gebp, it suffices to establish for some integer constant , a -approximation algorithm whose time complexity is upper bounded by the product of a computable function of and a polynomial of the input length. When we state the time complexity of steps in our algorithm we ignore polynomial factors of .
Our preprocessing steps consist of scaling and rounding of the input parameters. Our goal is to assume that job sizes are rounded and that the minimum value of is .
First, we consider the scaling of the parameters that allows us to assume that . That is, we prove the following lemma.
Lemma 1
Without loss of generality, .
Proof
Assume that the input of gebp does not satisfy the claim. Then we let , and we do the following. For every machine we multiply by , and we divide by . In addition, for every job , we multiply the job size by . Observe that the new input satisfies for all .
Next, we show that for every solution for the original input, the cost of the solution in the new input is the same as it was in the original input. To see this fact, note that for every machine , its load, i.e., the value of is times its value in the solution for the original input, and thus it satisfies in the new input if and only if it is satisfied in the original input. Furthermore, the value of in the new input is exactly times its value in the original input, and thus the cost of machine is the same in the two inputs. ∎
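The invariance argued in the proof can be checked numerically. This sketch assumes that the scaling factor is the smallest overtime rate among the machines, that capacities and job sizes are multiplied by it, and that overtime rates are divided by it. The function `scale_instance`, the tuple layout, and the cost model (fixed cost plus overtime rate times the load above capacity) are assumptions made for illustration.

```python
from fractions import Fraction


def machine_cost(load, fixed_cost, capacity, overtime_rate):
    """Assumed cost model: fixed cost plus overtime_rate per unit of load above capacity."""
    return fixed_cost + overtime_rate * max(Fraction(0), load - capacity)


def scale_instance(job_sizes, machines):
    """Rescale so that the smallest overtime rate becomes 1 (assumed normalization).

    machines[i] is a (fixed_cost, capacity, overtime_rate) tuple of Fractions.
    """
    factor = min(rate for _, _, rate in machines)
    scaled_jobs = [size * factor for size in job_sizes]
    scaled_machines = [
        (fixed, capacity * factor, rate / factor)
        for fixed, capacity, rate in machines
    ]
    return scaled_jobs, scaled_machines


jobs = [Fraction(3), Fraction(5)]
machines = [
    (Fraction(4), Fraction(2), Fraction(2)),
    (Fraction(9), Fraction(3), Fraction(3)),
]
new_jobs, new_machines = scale_instance(jobs, machines)
```

Loads and capacities are scaled by the same factor, so a machine is over capacity in the new input exactly when it was in the original one, and the overtime term is unchanged because the rate is divided by that factor.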
Next, without loss of generality, we assume that machines are sorted according to their standard capacities, that is, we assume .
Throughout this work we use the following observation.
Observation 2.1
Let be two numbers such that and let be a machine, then .
Proof
The inequality holds by the fact that the cost function is monotone non-decreasing (as is at least , and thus non-negative). The inequality holds by the following argument. If , then and we have and the inequality holds. Otherwise, and by the assumption , we conclude and so establishing the required inequality. ∎
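The second inequality of the observation can be worked out explicitly under an assumed notation. The symbols $c_i$, $s_i$, $\sigma_i$ (fixed cost, standard capacity, overtime rate), the cost function $\mathrm{cost}_i(L) = c_i + \sigma_i \max\{0, L - s_i\}$, and the form $c_i \ge \sigma_i s_i$ of assumption (1) are all assumptions of this sketch, chosen to be consistent with the verbal description of the model.

```latex
% Assumed: \mathrm{cost}_i(L) = c_i + \sigma_i \max\{0, L - s_i\}, \quad c_i \ge \sigma_i s_i, \quad \alpha \ge 1.
\begin{align*}
\text{If } \alpha L \le s_i:\quad
  \mathrm{cost}_i(\alpha L) &= c_i \;\le\; \alpha c_i \;\le\; \alpha\,\mathrm{cost}_i(L). \\[4pt]
\text{If } \alpha L > s_i:\quad
  \alpha\,\mathrm{cost}_i(L) - \mathrm{cost}_i(\alpha L)
  &\ge \alpha\bigl(c_i + \sigma_i (L - s_i)\bigr) - \bigl(c_i + \sigma_i(\alpha L - s_i)\bigr) \\
  &= (\alpha - 1)\,(c_i - \sigma_i s_i) \;\ge\; 0 .
\end{align*}
```

The last step is where the assumed form of (1) is used: without $c_i \ge \sigma_i s_i$ the difference could be negative.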
Next, we consider the rounding of the job sizes and we use the following rounding method. This rounding method is motivated by the fact that the n-fold programming formulations which we use to solve sub-instances of the rounded problem later on, assume that all coefficients of the constraint matrix are (relatively small) integers. Thus, for every job , we let be the integer value such that . We let the rounded value of be
[TABLE]
The rounded instance is the instance of gebp in which the values of the input parameters are for all (such that ), and for all . In the sequel, we use the fact that in , for every integer value of there are at most distinct rounded job sizes in the interval .
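A minimal sketch of a rounding of this kind is given below. It assumes that a size lying between two consecutive powers of two, $2^t \le p < 2^{t+1}$, is rounded up to the next integer multiple of $2^t / k$, where $k$ stands for the reciprocal of the accuracy parameter. This is a standard geometric rounding and may differ in detail from the formula used in the paper; the function names are hypothetical.

```python
import math
from fractions import Fraction


def dyadic_exponent(size: Fraction) -> int:
    """Largest integer t with 2**t <= size (size must be positive)."""
    t = 0
    while Fraction(2) ** (t + 1) <= size:
        t += 1
    while Fraction(2) ** t > size:
        t -= 1
    return t


def round_up_size(size: Fraction, k: int) -> Fraction:
    """Round size up to a multiple of 2**t / k, where 2**t <= size < 2**(t+1)."""
    unit = Fraction(2) ** dyadic_exponent(size) / k
    return math.ceil(size / unit) * unit


rounded = round_up_size(Fraction(9, 7), 4)  # a multiple of 1/4, since 1 <= 9/7 < 2
```

Under this assumed scheme the rounded size exceeds the original by less than $2^t / k \le p / k$, and the sizes falling in one dyadic interval are rounded to at most $k + 1$ distinct values.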
For a solution Sol and an instance of gebp, we denote by its objective function value where the input parameters are according to (in particular we use this notation for and for ). Next we analyze the impact of the rounding step on the performance guarantee of our algorithm.
Lemma 2
Let Sol be a feasible solution. Then, .
Proof
For every job , we have as we round up the size of . For every , we have as by definition of , and when we round up we increase its size by at most , that is, we have
[TABLE]
as we argued.
For every machine , the total size of jobs assigned to in Sol as a solution to is at most the total size of jobs assigned to in Sol as a solution to and this is at most times the total size of jobs assigned to as a solution to . The claim follows by Observation 2.1 and summing the costs of all machines. ∎
In what follows, we assume that the original input of gebp satisfies the assumption of Lemma 1 and the job sizes are already rounded as they are in . With a slight abuse of notation, we let be the parameters of machine and be the size of job (for all ) in this input which we denote by . It is sufficient to provide an EPTAS for , and this would imply an EPTAS for the original input.
Next, we characterize near optimal solutions. We let be an index of a machine for which .
Definition 1
A solution Sol for gebp is called a nice solution if for every machine , the total size of jobs assigned to (in Sol) is at most .
The proof of the following lemma uses the observation that since , adding a set of jobs of total size to machine increases the cost of by at most .
Lemma 3
Let opt be an optimal solution (for the rounded instance) whose cost is denoted as . Then, there is a nice solution whose cost is at most .
Proof
Consider the solution opt. Let be the set of machines such that and opt assigns to a set of jobs of total size larger than . We create the solution by changing the assignment of jobs in so that these jobs are assigned to , while the assignment of other jobs is the same as in opt. That is, we reassign the jobs that were assigned to machine in so that they are assigned to . Observe that the new solution is indeed a nice solution.
Next, we upper bound the cost of the new solution. Let be the total size of jobs assigned by opt to machine (for all ). We have the following.
[TABLE]
however, , and thus it suffices to show that for every , we have . This follows as where the inequality holds as and the last equation holds by the definition of as . ∎
We let opt-nice be an optimal solution among all nice solutions for the rounded instance whose cost is . In what follows we show an algorithm that returns a solution Sol whose cost is at most and its time complexity is . This solution Sol is a feasible solution to the original instance of gebp that is obtained in time (as the rounding takes and the scaling takes ) whose approximation ratio as an approximation algorithm for the original instance of gebp is .
3 Using the shifting technique to obtain a family of instances with machines with similar standard capacities
We use the shifting technique [16, 15] to partition the rounded instance into a family of problems that can be solved almost independently. For every value of we let
[TABLE]
and we let be an index such that
[TABLE]
Observe that the value is easily computed in time.
Recall that opt-nice is the best solution for the rounded instance whose cost is . Let be the best solution among the nice solutions which allocate no job to machines in . Then, we next show that the cost of is close to .
Lemma 4
We have .
Proof
The first inequality holds by definition. We prove the second inequality by establishing a nice solution Sol which allocates no job to machines in and we bound its cost by . Consider opt-nice and let be the set of jobs which opt-nice assigns to machines in . In order to construct Sol we modify opt-nice by changing the allocation of and we assign these jobs to machine (and recall that ). Clearly by moving jobs from machines to machine the property of nice solutions cannot be hurt as the load in Sol of every machine which is not is not larger than it was in opt-nice. Furthermore Sol does not allocate jobs to machines in . Last, the cost of Sol denoted as satisfies
[TABLE]
where (2) holds as for every machine that opt-nice assigns a total load of (for ) its cost in opt-nice was at least and this extra load on machine increases the cost of by at most , and thus the increase of the cost of the solution due to the reallocation of jobs of size to machine is at most (the cost of in Sol); inequality (3) holds by definition of ; and (4) follows by the definition of the objective function of gebp. ∎
In what follows, we force the algorithm to allocate no job to machines in . Since is an optimal nice solution subject to this additional constraint, we conclude that it suffices to construct a feasible solution Sol whose cost is approximately . Next, we delete the set of machines from the instance. This deletion of machines does not hurt the feasibility of Sol and of (as these solutions do not allocate jobs to the deleted machines), however, it decreases the cost of both solutions by a common non-negative constant that is the total fixed cost of the deleted machines. Thus, it suffices to show that we can design an EPTAS for the instance resulting from by deleting the machines in . Once again, with a slight abuse of notation we assume that the instance is the instance resulting from this deletion of machines and denote by the set of machines in such that .
We partition the machine set of the instance resulting from the deletion of . This partition is obtained by letting each partition be a maximal (with respect to inclusion) set of consecutive indices of machines such that there are no two consecutive indices satisfying . We let be the number of partitions in this partition such that for every , and for every and we have and in fact we have , by the sorting of the machines. For , let and so the indices in are those between and . A crucial property for our algorithm is the following one.
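The partition into maximal runs of consecutive machines can be sketched as follows. The threshold `gap` is a hypothetical stand-in for the ratio condition between consecutive standard capacities, and `group_by_capacity_gap` is an illustrative name; the input is assumed to be sorted in non-decreasing order, as the machines are in the text.

```python
def group_by_capacity_gap(capacities, gap):
    """Split sorted capacities into maximal runs of consecutive indices in which
    each capacity is at most `gap` times the previous one (assumed condition)."""
    groups = []
    for index, capacity in enumerate(capacities):
        if groups and capacity <= gap * capacities[index - 1]:
            groups[-1].append(index)   # still within the current run
        else:
            groups.append([index])     # a large jump starts a new group
    return groups


groups = group_by_capacity_gap([1, 2, 3, 50, 60, 1000], gap=4)
```

Each returned group is a block of consecutive indices, so the groups are ordered by capacity in the same way the text orders the parts of the partition.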
Lemma 5
For every , and every pair of machines , we have
[TABLE]
Proof
Assume by contradiction that the claim does not hold for . Then, . Then, by the integrality of , we conclude that the following holds.
[TABLE]
Thus, by the pigeonhole principle, there are at least two integers which are equivalent to modulo such that . Next, we define to be either or according to the following rule. If then we let , and otherwise we let . Then, observe that when we deleted the set of machines , we deleted all machines with standard capacities in the interval and in particular and do not belong to this interval. Let be the maximum index of a machine with . Then, since by definition of we have , however the ratio between and is strictly larger than contradicting the assumption that so , and thus the claim follows. ∎
We next partition the job set as follows. For the job subset is defined as
[TABLE]
The set is and for every the set
[TABLE]
where . Observe that every nice solution allocates all jobs of to machine , and for every it allocates all jobs of to machines in .
For , we define a relaxation of the problem gebp where the set of machines is , the set of jobs is and in addition we have sand consisting of jobs of total size , and where we need to schedule all jobs and the sand on the machines but we are allowed to leave jobs and sand of total size at most unscheduled (these jobs are assigned to machines with indices smaller than or to machine ). The notion of sand means that the jobs that are part of the sand can be assigned fractionally to machines. We denote by the relaxation corresponding to the index together with the two numerical parameters .
We will show that if is an integer multiple of while is an integer multiple of , then can be approximated within a multiplicative factor of with time complexity that fits the assumptions of an EPTAS. That is, we will prove the following theorem in the next section.
Theorem 3.1
There exists an algorithm Alg that, given an instance of defined by such that is an integer multiple of while is an integer multiple of , returns a -approximated solution to , and the time complexity of Alg is upper bounded by where .
Before presenting the proof of Theorem 3.1, we show that the existence of the algorithm Alg is sufficient to guarantee the existence of an EPTAS for gebp.
Theorem 3.2
There is an algorithm with time complexity that given the rounded instance returns a solution whose cost is at most .
Proof
It suffices to construct a -approximation algorithm for the rounded instance after deleting the machines in .
The first step of the algorithm is to apply Alg on a family of inputs consisting of the following ones. For every , for every in the interval that is an integer multiple of , and for every in that is an integer multiple of , we apply Alg to solve approximately the instance . We denote by the cost of the solution returned by Alg when applied on the instance . The time complexity of the first step is as the number of inputs solved by Alg is at most using .
The second step is to use dynamic programming in order to concatenate a sequence of inputs in the family consisting of one input for each value of . We define the dynamic programming formulation as a shortest path computation in a directed layered graph . The graph consists of layers denoted as and one additional node . The nodes of layer are associated with the possible values of . This defines the nodes of layers , however to use this definition for layer we define a value as the total size of jobs in plus the value of , i.e., . Thus, in every layer there are nodes, and in total there are nodes in . We next describe the edge set of together with the length associated with each edge. For and a node in layer and node in layer we have an edge from the node in towards node in whose length is defined as follows. We compute a value that is the maximum integer multiple of such that together with the total size of jobs in the resulting size is at most . That is, for , we compute
[TABLE]
The value of is computed slightly differently. We subtract from the total size of jobs in and the resulting value is (note that this is already a rounded value). That is, . The length of the edge in the graph between these two nodes is defined as . For every node in layer we have an edge from this node directed to whose length is . This length of the edges directed to is motivated by the fact that assigning jobs of total size to machine costs at most . In the resulting directed graph we find a shortest path from the node in layer to node , where is defined as follows.
[TABLE]
The time complexity of the second step is determined by the number of edges in the graph that is at most . We denote by the node in layer that belongs to the shortest path computed by the algorithm, and we let be the corresponding value of that the algorithm computed using (5) for the sequence of .
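The second step amounts to a shortest path computation in a layered directed acyclic graph, which can be done layer by layer. The sketch below is generic: `layers`, `edge_length`, and `sink_length` are hypothetical interfaces standing in for the node values, the costs returned by Alg, and the lengths of the edges into the final node. It is not tied to the specific edge definitions of the paper.

```python
def layered_shortest_path(layers, start, edge_length, sink_length):
    """Shortest path from `start` (a node of layers[0]) through one node per layer to a sink.

    edge_length(level, u, v): length of the edge from u in layers[level] to v in
        layers[level + 1], or None if there is no such edge.
    sink_length(u): length of the edge from u in the last layer to the sink.
    Returns (total length, list with the chosen node of every layer).
    Raises ValueError if the sink is unreachable.
    """
    best = {start: (0, [start])}
    for level in range(len(layers) - 1):
        reached = {}
        for u, (dist, path) in best.items():
            for v in layers[level + 1]:
                length = edge_length(level, u, v)
                if length is None:
                    continue
                candidate = dist + length
                if v not in reached or candidate < reached[v][0]:
                    reached[v] = (candidate, path + [v])
        best = reached
    return min(
        ((dist + sink_length(u), path) for u, (dist, path) in best.items()),
        key=lambda item: item[0],
    )


# Toy instance: node values can only decrease from layer to layer.
total, path = layered_shortest_path(
    layers=[[10], [0, 5, 10], [0, 5]],
    start=10,
    edge_length=lambda level, u, v: (u - v) if v <= u else None,
    sink_length=lambda u: 2 * u,
)
```

The running time is proportional to the number of edges between consecutive layers, matching the way the text bounds the time complexity of this step.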
The third (and last) step of the algorithm is to compute a feasible solution for gebp whose cost is at most times the total length of . For , we show that we can assign (integrally) the jobs and small jobs of total size each of which of size at most such that a total size of at most is not assigned (such a solution is called feasible), and the cost of this feasible solution is at most .
Consider one specific value of . We say that the jobs in are large and the other jobs are small. The solution returned by Alg specifies the assignment of large jobs to machines in (some of these jobs might be unassigned) and for each machine it defines a volume of sand that is assigned to . We denote by the set of large jobs that the solution does not assign to machines in . The feasibility of the solution (for ) ensures the following inequalities.
[TABLE]
where the first inequality holds by the guarantee on the total size of jobs and sand that the solution does not assign, and the second inequality follows by the fact that the total size of sand in the instance is at most . In the solution that we create we assign the jobs in exactly as in , while for the assignment of small jobs we consider the list of small jobs and we process the machines in one by one in an arbitrary order as long as is not empty. When considering the current machine , we find a minimum prefix of whose total size is at least , this prefix of jobs is assigned to , we delete it from and move to the next machine in . If this prefix is undefined, it means that the total size of jobs in is smaller than and we assign all jobs in to (and stop the assignment process of small jobs to machines in ). The time complexity of this step is . Furthermore, if there is a machine such that when processing all jobs in are assigned to , then all small jobs are assigned and the feasibility of the solution we create to machines in follows by (6). Otherwise, every machine receives a total size of small jobs of at least , and once again by (6) the resulting solution we create is a feasible solution. We observe that for every machine , the total size of jobs assigned to is at most larger than the total size of jobs (and sand) assigned to in . By Observation 2.1, this increase of the total size of jobs assigned to may increase the cost of by a multiplicative factor of at most as we show next. If denotes the total size of jobs and sand assigned to in , then the cost of in that solution is and in our solution it is at most where the first inequality follows by the monotonicity of the cost function and the second inequality by Observation 2.1.
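The prefix-based assignment of small jobs described above can be sketched directly. The function name and the representation of the relaxed solution as a dictionary from machines to sand volumes are hypothetical; machines are processed in dictionary order, which plays the role of the arbitrary order in the text.

```python
def assign_small_jobs(small_jobs, sand_volume):
    """Replace sand by small jobs, machine by machine.

    small_jobs: list of job sizes.
    sand_volume: dict mapping each machine to the volume of sand it received.
    Each machine gets the shortest prefix of the remaining list whose total size
    is at least its sand volume; if no such prefix exists it gets all remaining jobs.
    Returns (dict machine -> list of assigned sizes, list of leftover jobs).
    """
    remaining = list(small_jobs)
    assignment = {machine: [] for machine in sand_volume}
    for machine, volume in sand_volume.items():
        if not remaining:
            break
        total, cut = 0, 0
        while cut < len(remaining) and total < volume:
            total += remaining[cut]
            cut += 1
        assignment[machine] = remaining[:cut]
        remaining = remaining[cut:]
    return assignment, remaining


assignment, leftover = assign_small_jobs([2, 1, 3, 1, 2], {"a": 3, "b": 4})
```

Because the prefix is minimal, the total size given to a machine exceeds its sand volume by less than the size of one small job, which is the slack the cost argument above accounts for.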
In order to use the induction (and decrease the value of by ), note that the total size of jobs not assigned to machines that are not and with indices at least which are of size at most is at most , and the total size of jobs with sizes in the interval is the total size of jobs in . Thus, by the definition of in terms of , we conclude that the total size of jobs of size at most that are still unscheduled is at most and indeed we guarantee the assumption on the recursive algorithm for . The claim follows as any set of jobs of total size at most can be assigned to machine increasing the cost of that machine by at most that is the length of the edge of adjacent to .
The theorem follows by showing that the graph has a path whose total length is at most where is a cheapest solution among all nice solutions which do not allocate jobs to machines in . Based on opt’ we define a fractional value of for all as follows where we let . For a given value of , the fractional value of is the total size of jobs in that opt’ assigns to machines in . Similarly, we define . By the definition of we conclude that the cost of an optimal solution to is at most the cost opt’ pays for machines in .
Next, for every , we round up to the next integer multiple of and we denote by this rounded up value. This may force us to increase and thus our next step is to round up (for all ) to the next integer multiple of and to add another to the rounded up value to get the value . The rounding of is different and we round down to the next value of the form of the total size of jobs in plus an integer multiple of .
When we compare the two instances (for ) of the auxiliary problem with , we can take a solution of the first one and add size of sand to machine to get a feasible solution of the second problem. This is sufficient even for to get a feasible solution for the instance we solved for the edge between in layer to node in layer . This additional sand increases the cost of machine by a multiplicative factor of at most but this input satisfies the assumptions for which Alg is a -approximation for aux. Thus, the length of the edge between in layer to in layer is at most times the total cost of the machines in that opt’ pays. Since is smaller than , we conclude that the total size of jobs which opt’ assigns to is larger than the one in our solution due to this edge directed to . ∎
4 Approximating via the use of n-fold programming
We assume that is an integer multiple of while is an integer multiple of (and hence also an integer multiple of ). We first show that by restricting ourselves to solutions of for which the total size of sand assigned to each machine is an integer multiple of the approximation ratio is multiplied by at most . We denote by the resulting auxiliary problem with this additional constraint.
Lemma 6
Let be an optimal solution for , then is a -approximation for .
Proof
is clearly a feasible solution to . It thus suffices to upper bound its cost. Let Sol be an optimal solution for . We modify the (total) size of sand assigned to each machine by rounding it up to the next integer multiple of . If the total size of sand which we allocate is larger than , then we remove integer multiples of from the size of sand assigned to some machines so that the total size of sand which we assigned is exactly . Observe that by rounding up the size of sand assigned to each machine we increase its load by at most . Let be the original load of (in Sol) and let be its load in the newly created solution, then we have . If , then the cost of machine is in both solutions. Otherwise, . Thus, the cost of is at most times the cost of Sol. ∎
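The modification used in the proof can be sketched as follows: round every machine's sand up to a multiple of a unit, then remove whole units while the total overshoots. The names `round_sand`, `unit`, and `total_sand` are hypothetical, and the sketch assumes `total_sand` is itself an integer multiple of `unit`, as in the setting of the lemma.

```python
import math
from fractions import Fraction


def round_sand(sand, unit, total_sand):
    """Round each machine's sand up to a multiple of `unit`, then trim any overshoot.

    sand: list of sand volumes per machine; unit: Fraction;
    total_sand: the total amount of sand available (assumed a multiple of unit).
    """
    rounded = [math.ceil(Fraction(v) / unit) * unit for v in sand]
    excess = sum(rounded) - total_sand
    for i in range(len(rounded)):
        if excess <= 0:
            break
        removable = min(rounded[i], excess)  # both are multiples of unit
        rounded[i] -= removable
        excess -= removable
    return rounded


original = [Fraction(1, 3), Fraction(5, 3), Fraction(1)]
adjusted = round_sand(original, Fraction(1, 2), total_sand=3)
```

Trimming only lowers loads, so every machine ends up with at most one extra unit of sand compared with the original solution, which is the bound the cost comparison in the proof relies on.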
Based on Lemma 6, the proof of Theorem 3.1 and thus also the proof of Theorem 3.2 follow by establishing an exact algorithm for solving (i.e., an algorithm for finding an optimal solution of ) whose time complexity is upper bounded by , such as the algorithm we present next.
The first step of the algorithm is to partition the sand of size into a set of dummy jobs, each of size . Observe that the total size of these dummy jobs is . An assignment of the jobs in and the dummy jobs to machines in such that the total size of unassigned jobs and dummy jobs is at most is a feasible solution to , and this characterizes the feasible solutions of . Let be the set of jobs and dummy jobs of this instance. In what follows, when we say a job we mean that is either a job or a dummy job, that is, we do not distinguish between jobs and dummy jobs of the same size. For every that is a size of a job in , we denote by the number of jobs of of size .
Note that all jobs in have sizes that are integer multiples of and have sizes of at most . Due to our rounding of the job sizes there are distinct sizes in every interval of sizes where the upper bound is at most times the lower bound of the interval. Thus, the number of distinct sizes of jobs in is at most where the inequality follows by Lemma 5. We let be the set of distinct sizes of jobs in .
For machine we define a configuration of machine as a vector consisting of components where the components are associated with the elements in in increasing order (of the sizes in this set). Each component corresponding to represents the number of jobs of size which are assigned to . This number is a non-negative integer that is at most (as the load of is at most ). Thus, the number of distinct configurations of machine is at most . Each such configuration of machine has a cost that is the value of when assigned the set of jobs of this configuration. We denote by the set of configurations of machine , and for each we let be the cost of this configuration.
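To make the notion of a configuration concrete, here is a minimal sketch that enumerates the configurations of a single machine. The function name, the parameters `sizes` and `max_load`, and the decision to discard vectors whose total size exceeds the load bound are illustrative assumptions, not notation from the text.

```python
from itertools import product


def enumerate_configurations(sizes, max_load):
    """Return all configurations of one machine.

    A configuration is a tuple with one component per distinct job size (in
    increasing order of size); component j counts the jobs of size sizes[j]
    on the machine. Hypothetical helper: `max_load` stands in for the upper
    bound on the load of the machine.
    """
    sizes = sorted(sizes)
    # Each component is at most max_load // size, since the load is bounded.
    ranges = [range(int(max_load // s) + 1) for s in sizes]
    return [
        counts
        for counts in product(*ranges)
        if sum(k * s for k, s in zip(counts, sizes)) <= max_load
    ]


configs = enumerate_configurations([1, 2], max_load=3)
```

The number of tuples examined is the product of the per-component ranges, which mirrors the counting argument used to bound the number of configurations.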
We next formulate as an integer linear program. The decision variables are for every machine and that is an indicator variable that equals 1 when machine is assigned a configuration and 0 otherwise, and the set of additional variables for every encoding the number of jobs in of size which are not assigned to machines in .
The objective function is clearly to minimize the total cost of the used configurations. That is,
[TABLE]
We have the following families of constraints:
The global constraints.
We have a constraint for each saying that every job of size is either assigned to one of the machines in or not assigned to any machine in . That is, for every we introduce the constraint:
[TABLE]
In addition we have a bound of on the total size of unassigned jobs. We divide this inequality by and obtain the following inequality as an additional global constraint.
[TABLE]
We use the division by this common factor to conclude that all coefficients of the global constraints are non-negative integers which are at most . Furthermore, observe that the number of global constraints which we denote by is a small constant .
Next, we group the decision variables in bricks where a brick is the collection of variables corresponding to one specific machine . The columns corresponding to variables of each brick are consecutive columns of the resulting constraint matrix.
The local constraints.
For every brick, namely for every machine , we have one local constraint involving (only) variables of that brick, namely the constraint that each machine is assigned exactly one configuration. That is, for every , the local constraint of brick is
[TABLE]
In addition, we have lower and upper bounds on the variables. In our setting, each is an indicator variable (so it lies between 0 and 1) while the are non-negative, and we can add the additional (meaningless) upper bound of . Thus we introduce the following bounds.
[TABLE]
Using these constraints and variables, the integer linear program formulating is to minimize the objective (7) subject to the constraints (8) for every , the constraint (9), the constraints (10) for every , and the constraints (11) (in addition to the requirement that all variables are integers).
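As a hedged sketch, the program described above can be written as follows. All symbols are illustrative assumptions rather than the paper's notation: $M$ is the set of machines, $C_i$ the set of configurations of machine $i$, $\mathrm{cost}_i(c)$ the cost of configuration $c$ on machine $i$, $S$ the set of distinct job sizes, $c_s$ the number of jobs of size $s$ in configuration $c$, $n_s$ the number of jobs of size $s$, $y_s$ the number of unassigned jobs of size $s$, $U$ the bound on the total size of unassigned jobs, and $\delta$ the common factor by which that bound is divided. The upper bound $n_s$ on $y_s$ is likewise an assumption.

```latex
\begin{align*}
\min\ & \sum_{i \in M} \sum_{c \in C_i} \mathrm{cost}_i(c)\, x_{i,c} \\
\text{s.t.}\ & \sum_{i \in M} \sum_{c \in C_i} c_s\, x_{i,c} + y_s = n_s
  && \forall s \in S, \\
& \sum_{s \in S} \frac{s}{\delta}\, y_s \le \frac{U}{\delta}, \\
& \sum_{c \in C_i} x_{i,c} = 1 && \forall i \in M, \\
& 0 \le x_{i,c} \le 1, \quad 0 \le y_s \le n_s, \quad
  x_{i,c},\, y_s \in \mathbb{Z}.
\end{align*}
```

The first two constraint families couple all machines and play the role of the global constraints, while the third involves the variables of a single machine only and plays the role of the local constraint of a brick.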
To apply the results for -fold programming, we note the following bounds.
- The number of global constraints is .
- The number of local constraints of a brick is .
- The maximum absolute value of a component of the constraint matrix (i.e., the infinity-norm of the matrix) is .
- The number of variables in every brick is .
- The number of bricks is while the right hand side is bounded by .
The problem we formulated is a special case of generalized -fold programming where the number of variables . The running time of the algorithm of Eisenbrand et al. [11] for solving such a problem (see the scaling, column of the linear objective case of Corollary 91 in [11]) is upper bounded by . Using our bounds and , this is upper bounded by . The coefficient is a function of that is upper bounded by and is a strongly polynomial bound independent of .
5 An asymptotic fully polynomial time approximation scheme for gebp-bpv
When considering asymptotic schemes that return a solution of cost at most for some function where is the optimal cost of the same instance, the additive term is not scalable. In order to use this definition of an asymptotic approximation scheme, we assume that .
Lemma 7
Without loss of generality , and .
Proof
Assume that the claim does not hold. We first scale all the by dividing these numbers by a common factor of . Observe that the cost of every feasible solution is scaled by this factor, and the first part of the claim holds. Furthermore, assumption (1) still holds. We apply Lemma 1 and observe that the transformation in the proof of that lemma does not change the fixed costs of machines. ∎
Let be the resulting instance after (perhaps) changing the input according to the proof of the last lemma. Let denote an optimal solution for this instance, and let denote its cost. We show the existence of an algorithm that returns a feasible solution for with cost at most and time complexity that is upper bounded by a polynomial in .
Lemma 8
Consider an optimal solution Sol for under the additional constraint that for every and every machine of type , either Sol assigns a unique job to , or the load of is at most . Then, the cost of Sol is at most .
Proof
Assume that the assumption is not satisfied with respect to the set of jobs that assigns to . We replace by a collection of machines of type and assign to these machines such that the total cost of these machines is at most times the cost of in . Applying this transformation for each machine establishes the claim of the lemma.
Consider . For every job of size at least , we add a dedicated machine of type for this job and assign it to its dedicated machine. Performing this step on all such jobs may increase the cost of the solution by at most an additive term of , but this additive term is at most an fraction of the cost of in . Moreover, if an increase of the cost occurred, it means that the resulting set of machines (the new dedicated machines as well as machine ) satisfies the conditions of the lemma.
If the conditions of the claim do not hold yet, then in particular it means that the remaining set of jobs that were not assigned to dedicated machines (by the previous modification) have total size larger than . Then, we pack these jobs into bins, each of capacity , using the next-fit heuristic. That is, we have an open machine of type , and we process the jobs one by one. When we process job , we try to assign it to the current open machine. If the resulting set of jobs assigned to the open machine has total size at most we do so, and continue to the next job; otherwise we close the open machine, open a new machine of type , and assign there. If is the total size of jobs in , we use at most machines to pack all the jobs in , and this may increase the cost of the resulting solution by the total fixed cost of these machines, that is, by at most using (1). This is at most times the cost of assigning to and the claim follows. ∎
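The next-fit heuristic used in this proof can be sketched as follows. The function name and the parameter `capacity` are illustrative assumptions; `capacity` stands in for the load bound of a machine of the given type.

```python
def next_fit(job_sizes, capacity):
    """Pack jobs into bins of the given capacity using next-fit.

    Jobs are processed in the given order. Only one bin is open at a time;
    when a job does not fit, the open bin is closed and a new one is opened.
    """
    bins = []
    current, load = [], 0
    for size in job_sizes:
        if size > capacity:
            raise ValueError("a job larger than the capacity cannot be packed")
        if current and load + size > capacity:
            # The job does not fit: close the open bin and open a new one.
            bins.append(current)
            current, load = [], 0
        current.append(size)
        load += size
    if current:
        bins.append(current)
    return bins


packing = next_fit([4, 3, 5, 2, 6], capacity=8)
# packing == [[4, 3], [5, 2], [6]]
```

A standard property of next-fit is that the combined load of any two consecutive bins exceeds the capacity, since a new bin is opened only when the next job does not fit in the open one. This is what limits the number of machines opened in terms of the total size of the jobs.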
Let be such that . Then, by Lemma 8, we conclude that every job of size larger than is allocated a dedicated machine. For each such job, we find the type for which the resulting cost of the dedicated machine is minimized and we use such a machine to process the job. In this way, we eliminate all jobs of size larger than . In the remaining instance that we denote as , the load of every machine (in Sol) is at most and for every collection of such jobs there is a type of machines such that if we assign it to such a machine, then the resulting cost would be at most where the inequality follows by Lemma 7.
It suffices to construct an asymptotic approximation scheme for where we modify the definition of the problem so that the load of every machine is at most . We next show that such a scheme was established by Epstein and Levin in [12]. To use the results of [12], we transform the instance into an instance of bin packing with bin utilization cost (BPUC) for which [12] designed an AFPTAS. In BPUC we are given a monotonically non-decreasing non-negative cost function , where its domain contains the interval , and a set of items , where item has a non-negative size (such that for all ). The goal is to partition the items into subsets such that the total size of items in each subset is at most and the cost, which is defined as , is minimized.
In order to transform into an instance of BPUC, we do the following. The set of items is the set of jobs. The size of item in the BPUC instance is (that is, the fraction of a largest load of a machine in a solution that satisfies the additional constraint), and to define the bin utilization cost we do the following. For , we set to be
[TABLE]
Observe the following simple properties. First, is a monotone non-decreasing function as for every the cost function is monotone non-decreasing. Second, for every we can evaluate in polynomial time as we can evaluate in constant time for every . Last, for every we can find a maximum value such that , since for every in constant time we can compute a maximum value of such that (if while if then there is no for which ). These properties guarantee the assumptions used by [12] to design their AFPTAS for BPUC.
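The following minimal sketch illustrates the two queries discussed above: evaluating a bin utilization cost as the minimum over machine types, and finding the largest utilization whose cost stays within a given budget. The class `MachineType`, its fields, and the specific cost form (a fixed cost plus a linear charge for load beyond the machine's size) are hypothetical and chosen only to give a concrete monotone non-decreasing cost per type; they are not the cost functions of the paper.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class MachineType:
    """Hypothetical machine type with a monotone non-decreasing cost."""
    fixed_cost: float
    size: float
    overload_rate: float  # assumed strictly positive

    def cost(self, load: float) -> float:
        return self.fixed_cost + self.overload_rate * max(0.0, load - self.size)

    def max_load_within(self, budget: float) -> Optional[float]:
        # Largest load whose cost is at most `budget`, or None if none exists.
        if budget < self.fixed_cost:
            return None
        return self.size + (budget - self.fixed_cost) / self.overload_rate


def utilization_cost(types: Sequence[MachineType], load_cap: float, x: float) -> float:
    """Cost of a bin with utilization x in [0, 1]: the cheapest type for load x * load_cap."""
    return min(t.cost(x * load_cap) for t in types)


def max_utilization_within(types: Sequence[MachineType], load_cap: float,
                           budget: float) -> Optional[float]:
    """Largest x in [0, 1] with utilization_cost(x) <= budget, or None."""
    best = None
    for t in types:
        m = t.max_load_within(budget)
        if m is not None:
            frac = min(m, load_cap) / load_cap
            best = frac if best is None else max(best, frac)
    return best


types = [MachineType(1.0, 2.0, 1.0), MachineType(2.0, 4.0, 1.0)]
```

Because the minimum of monotone non-decreasing functions is itself monotone non-decreasing, the inverse query reduces to taking the maximum of the per-type answers, which is the argument made in the text.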
The following theorem follows from the observation that partitioning the jobs of into subsets according to the solution obtained for BPUC, and then choosing for each subset the type of machine that minimizes the cost of assigning the jobs to that machine type, results in a solution for of the same cost as the cost of the solution for BPUC. This also holds in the other direction: if we are given an optimal solution for , we obtain an optimal solution for the input of BPUC. Thus, is equivalent to the instance we created for BPUC, which proves the following theorem.
Theorem 5.1
There is an AFPTAS for gebp-bpv.
- [1] N. Alon, Y. Azar, G. J. Woeginger, and T. Yadid. Approximation schemes for scheduling. In Proc. 8th Symposium on Discrete Algorithms (SODA 1997), pages 493–500, 1997.
- [2] N. Alon, Y. Azar, G. J. Woeginger, and T. Yadid. Approximation schemes for scheduling on parallel machines. Journal of Scheduling, 1(1):55–66, 1998.
- [3] B. P. Berg and B. T. Denton. Fast approximation methods for online scheduling of outpatient procedure centers. INFORMS Journal on Computing, 29(4):631–644, 2017.
- [4] M. Cesati and L. Trevisan. On the efficiency of polynomial time approximation schemes. Information Processing Letters, 64(4):165–171, 1997.
- [5] E. Coffman and G. S. Lueker. Approximation algorithms for extensible bin packing. Journal of Scheduling, 9(1):63–69, 2006.
- [6] E. G. Coffman Jr and G. S. Lueker. Approximation algorithms for extensible bin packing. In Proc. 12th Symposium on Discrete Algorithms (SODA 2001), pages 586–588, 2001.
- [7] P. Dell’Olmo, H. Kellerer, M. G. Speranza, and Z. Tuza. A 13/12 approximation algorithm for bin packing with extendable bins. Information Processing Letters, 65(5):229–233, 1998.
- [8] P. Dell’Olmo and M. G. Speranza. Approximation algorithms for partitioning small items in unequal bins to minimize the total size. Discrete Applied Mathematics, 94(1-3):181–191, 1999.
