See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/229026552. Article · April 2006. All content following this page was uploaded by Leo Liberti on 20 May 2014.
Integer Programming

Leo Liberti
LIX, École Polytechnique, F-91128 Palaiseau, France (liberti@lix.polytechnique.fr)

March 13, 2006

Abstract

A short introduction to Integer Programming (IP). Problems leading to IP models. Some modelling tricks and reformulations. Geometry of linear IP. TUM matrices. Brief notes on polyhedral analysis. Separation theory. Chvátal cut hierarchy, Gomory cuts, disjunctive cuts, RLT cut hierarchy. Iterative methods: Branch-and-Bound, Cutting Plane, Branch-and-Cut, Branch-and-Price. Lower bounds: Lagrangian relaxation and subgradient method. Upper bounds: heuristics.

Contents

1 Introduction
  1.1 Canonical and Standard Forms
  1.2 Modelling Tricks
2 Geometrical and Structural Properties of ILP
  2.1 Totally Unimodular Matrices
3 Polyhedral Analysis
  3.1 Separation Theory
  3.2 Chvátal Cut Hierarchy
  3.3 Gomory Cuts
    3.3.1 Gomory Cutting Plane Algorithm: Example
  3.4 Disjunctive Cuts
  3.5 Lifting
  3.6 RLT Cut Hierarchy
4 Enumerative Methods for Integer Programming
  4.1 Branch-and-Bound
    4.1.1 Example
  4.2 Branch-and-Cut
  4.3 Branch-and-Price
5 Heuristics and Bounds
  5.1 Lower Bounds: Lagrangian Relaxation and Subgradient Method
6 Conclusion

1 Introduction

Many real-world problems naturally require solutions of a discrete type. We might be asked how many containers we should route from a set of manufacturing plants to a number of customer sites, subject to production limits and delivery requirements, so that the transportation costs are minimized. In another setting, we might have to decide where to install some servicing facilities so that every neighbouring town is guaranteed service, at minimum cost. The first is an abstract description of a transportation problem, which calls for discrete variables (the number of containers routed from plant i to customer j). The second describes a set covering problem, which relies on binary variables indicating whether a servicing facility will be built on geographical region i.
A more abstract description of the set covering problem is that, given a set X and a cost function c : P(X) → R on its subsets, we want to find a subset system S ⊆ P(X) such that ⋃S = X and ∑_{s∈S} c(s) is minimized. The fact that integer decision variables are required to solve these problems makes it impossible to model them by linear programming (LP) methods alone, as appears from their formulations below.

1.1 Example
Transportation Problem. Let x_ij be the (discrete) number of product units transported from plant i to customer j. We model the problem as follows:

  min_x  ∑_{i=1}^m ∑_{j=1}^n c_ij x_ij
  s.t.   ∑_{j=1}^n x_ij ≤ l_i    ∀ i ≤ m                         (1)
         ∑_{i=1}^m x_ij ≥ d_j    ∀ j ≤ n
         x_ij ∈ Z_+              ∀ i ≤ m, j ≤ n,

where m is the number of plants, n is the number of customers, c_ij is the unit transportation cost from plant i to customer j, l_i are the production limits at the plants and d_j are the demands at the customer sites.

1.2 Example
Set Covering Problem. Let x_i = 1 if a servicing facility will be built on geographical region i (out of m possible regions), and 0 otherwise. We model the problem as follows:

  min_x  ∑_{i=1}^m c_i x_i
  s.t.   ∑_{i=1}^m a_ij x_i ≥ 1    ∀ j ≤ n                       (2)
         x_i ∈ {0, 1}              ∀ i ≤ m,

where m is the number of regions, n is the number of towns to be serviced, c_i is the cost of building a facility on region i, and a_ij = 1 if a facility on region i can serve town j, and 0 otherwise.

This essay describes the standard techniques for modelling Integer Programming (IP) problems and solving them. Notice that we deal almost exclusively with integer linear programs (ILP). The rest of this section is devoted to basic definitions, modelling tricks and reformulation. Section 2 deals with the geometrical and structural properties of an integer program. In Section 3 we give brief notes about polyhedral analysis. Section 4 is a résumé of the most popular iterative methods for solving ILPs. Finally, in Section 5 we discuss techniques for finding lower and upper bounds to integer programs. Most of the material has been taken from [Wol98, NW88, PS98, Fis99, KV00].
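To make the set covering model (2) concrete, here is a brute-force sketch (a hypothetical instance, not from the text) that enumerates subset systems and returns a minimum-cost cover. It is exponential in the number of candidate subsets and only meant to illustrate the model:

```python
from itertools import combinations

def set_covering(universe, subsets, costs):
    """Exhaustively search for a minimum-cost subset system S with
    union(S) = universe, i.e. an optimum of model (2). Exponential in
    len(subsets): for tiny illustrative instances only."""
    best = None
    for r in range(1, len(subsets) + 1):
        for choice in combinations(range(len(subsets)), r):
            if set().union(*(subsets[i] for i in choice)) == universe:
                cost = sum(costs[i] for i in choice)
                if best is None or cost < best[1]:
                    best = (choice, cost)
    return best  # (indices of chosen subsets, total cost), or None

# Hypothetical instance: 4 candidate regions covering towns 1..5.
towns = {1, 2, 3, 4, 5}
regions = [{1, 2}, {2, 3, 4}, {4, 5}, {1, 3, 5}]
costs = [3, 4, 2, 6]
print(set_covering(towns, regions, costs))  # → ((0, 1, 2), 9)
```

A real ILP solver replaces this enumeration with the techniques described in the rest of the essay.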
1.1 Canonical and Standard Forms

An ILP is in canonical form if it is modelled as follows:

  min_x  c⊤x
  s.t.   Ax ≥ b                                                  (3)
         x ≥ 0, x ∈ Z^n,

where x ∈ Z^n are the decision variables, c ∈ R^n is a cost vector, A is an m × n matrix with integral entries, and b an integral vector in Z^m. An ILP is in standard form if it is modelled as follows:

  min_x  c⊤x
  s.t.   Ax = b                                                  (4)
         x ≥ 0, x ∈ Z^n;

again, x ∈ Z^n are the decision variables, c ∈ R^n is a cost vector, A is an m × n matrix with integral entries (assume rank(A) = m), and b an integral vector in Z^m. As in LP theory, a canonical ILP can be put into standard form via the introduction of slack variables (one per constraint).

Notice that an ILP is nonlinear. In fact, every integer variable x ∈ Z can be replaced by a continuous variable together with the integrality-enforcing constraint sin(πx) = 0. A problem where both integer and continuous variables appear is called a mixed-integer linear problem (MILP) or mixed-integer nonlinear problem (MINLP). Since the 3-Sat problem (finding a feasible solution of a conjunctive normal form system of boolean clauses with 3 literals each) is NP-complete, and 3-Sat can be modelled as an ILP problem, it follows that the class of ILP problems is NP-hard.

1.2 Modelling Tricks

Whereas continuous LPs are used to model situations where quantitative decisions must be taken, ILPs express a qualitative type of decision.

1. To impose the fact that exactly one choice x_i is to be taken among many possibilities x_1, . . . , x_n, use the constraint

     ∑_{i=1}^n x_i = 1.

   Similar constraints hold for "at least one choice" (≥ 1) and "at most one choice" (≤ 1).

2. The disjunctive constraint

     either a_1⊤x ≤ b_1 or a_2⊤x ≤ b_2

   can be formulated as follows:

     a_1⊤x ≤ b_1 + y M_1
     a_2⊤x ≤ b_2 + (1 − y) M_2,

   where y ∈ {0, 1} and M_1, M_2 are upper bounds on the values attained by a_1⊤x − b_1 and a_2⊤x − b_2. Basically, when y = 0 the first constraint is active and the second one is inactive, due to the "big M" constant.
When y = 1 the situation is reversed.

3. "If-then" type constraints can be reformulated to disjunctive constraints by observing that in first-order logic φ → ψ is the same as ¬φ ∨ ψ. So, to express the condition "if a_1⊤x ≤ b_1 then a_2⊤x ≤ b_2", we use the disjunctive constraint reformulation of point 2 for a_1⊤x > b_1 and a_2⊤x ≤ b_2.

4. A product x_1 x_2 of binary variables can be linearized exactly as follows: substitute the product x_1 x_2 with a (continuous) variable y and introduce the constraints

     y ≤ x_1
     y ≤ x_2
     y ≥ x_1 + x_2 − 1
     y ∈ [0, 1].

2 Geometrical and Structural Properties of ILP

Since by relaxing the integrality constraints of an ILP we obtain an LP problem, it should come as no surprise that the basic geometrical definitions concerning ILPs are much the same as those of LP. The solutions of an ILP in form (3) or (4) are the integer vectors in Z^n inside the polyhedron P ⊆ R^n defined by the problem constraints.

2.1 Definition
A set S ⊆ R^n is convex if for any two points x, y ∈ S the segment between them is wholly contained in S, that is, ∀λ ∈ [0, 1] (λx + (1 − λ)y ∈ S).

A linear equation a⊤x = b where a ∈ R^n defines a hyperplane in R^n. The corresponding linear inequality a⊤x ≤ b defines a closed half-space. Both hyperplanes and closed half-spaces are convex sets. Since any intersection of convex sets is convex, the subset of R^n defined by the system Ax ≥ b, x ≥ 0 (an intersection of closed half-spaces) is convex.

2.2 Definition
A subset S ⊆ R^n defined by a finite number of closed half-spaces is called a polyhedron. A bounded, non-empty polyhedron is a polytope.

2.3 Definition
The set conv(S) of all convex combinations ∑_{i=1}^n λ_i x_i (with λ_i ≥ 0 for all i and ∑_{i=1}^n λ_i = 1) of a finite set S of points {x_i | i ≤ n} in R^n is called the convex hull of S. A convex combination is strict if for all i ≤ n we have 0 < λ_i < 1.

2.4 Definition
Consider a polyhedron P and a closed half-space H in R^n. Let aff(H ∩ P) be the affine closure of H ∩ P and d = dim(aff(H ∩ P)).
If d < n then H ∩ P is a face of P. If d = 0, H ∩ P is called a vertex of P, and if d = 1, it is called an edge. If d = n − 1 then H ∩ P is a facet of P.

2.1 Totally Unimodular Matrices

Consider the following LP problem:

  min_x  c⊤x
  s.t.   Ax = b                                                  (5)
         x ≥ 0, x ∈ R^n,

where c, A, b are as in problem (4). This problem is the continuous relaxation of the ILP (4). By LP theory, the solution of problem (5) is a basic feasible solution x* ∈ R^n corresponding to a square, invertible m × m matrix B composed of a set of basic columns of A, and we have x* = (B⁻¹b, 0). Furthermore, x* is a vertex of the polyhedron P_s = {x ≥ 0 | Ax = b}. The condition for x* to be a solution of (4) is that x* ∈ Z^n, that is, every component of B⁻¹b must be integer. Since b is an integer vector, A is an integer matrix, B is a submatrix of A, and B⁻¹ = (1/det(B)) C, where C (the adjugate of B) is an integer matrix whose entries are integral functions of the entries of B, a condition for B⁻¹b to have integral components is that det(B) = ±1.

2.5 Definition
An m × n integral matrix A whose m × m invertible square submatrices all have determinant ±1 is called unimodular.

We have just proved the following proposition.

2.6 Proposition
If A is unimodular, the vertices of the polyhedron P_s = {x ≥ 0 | Ax = b} all have integral components.

Thus, any solution of the LP relaxation (5) is a solution of the ILP problem (4). A similar condition can be imposed on A for a polyhedron P_c = {x ≥ 0 | Ax ≥ b} corresponding to the ILP in canonical form (3).

2.7 Definition
An m × n integral matrix A whose invertible square submatrices of any size all have determinant ±1 is called totally unimodular (TUM).

The following propositions give some characterizations of TUM matrices.

2.8 Proposition
A matrix A is TUM ⇔ A⊤ is TUM ⇔ (A, I) is TUM.
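Definition 2.7 can be checked directly, if expensively, by enumerating all square submatrices. The sketch below (not from the text) does exactly this on the node-edge incidence matrix of a small bipartite graph, which is TUM (it satisfies the conditions of Proposition 2.9 below), and on the incidence matrix of a triangle, which is not:

```python
from itertools import combinations

def det(B):
    """Determinant by cofactor expansion along the first row
    (fine for the tiny matrices used here)."""
    if len(B) == 1:
        return B[0][0]
    return sum((-1) ** j * B[0][j] * det([row[:j] + row[j + 1:] for row in B[1:]])
               for j in range(len(B)))

def is_totally_unimodular(A):
    """Check Definition 2.7 directly: every square submatrix must have
    determinant in {0, 1, -1}. There are exponentially many submatrices,
    so this is only usable on very small matrices."""
    m, n = len(A), len(A[0])
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                if det([[A[i][j] for j in cols] for i in rows]) not in (0, 1, -1):
                    return False
    return True

# Node-edge incidence matrix of a bipartite graph (rows = nodes,
# columns = edges): TUM.
A = [[1, 1, 0, 0],
     [0, 0, 1, 1],
     [1, 0, 1, 0],
     [0, 1, 0, 1]]
print(is_totally_unimodular(A))   # → True

# Node-edge incidence matrix of an odd cycle (a triangle): det = ±2.
print(is_totally_unimodular([[1, 1, 0],
                             [0, 1, 1],
                             [1, 0, 1]]))  # → False
```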
2.9 Proposition
An m × n matrix A is TUM if: (a) for all i ≤ m, j ≤ n we have a_ij ∈ {0, 1, −1}; (b) each column of A contains at most 2 nonzero coefficients; (c) there is a partition R_1, R_2 of the set of rows such that for all columns j having exactly 2 nonzero coefficients, we have ∑_{i∈R_1} a_ij − ∑_{i∈R_2} a_ij = 0.

Proof. Suppose that conditions (a), (b), (c) hold but A is not TUM. Let B be the smallest square submatrix of A such that det(B) ∉ {0, 1, −1}. B cannot contain a column with just one nonzero entry, otherwise B would not be minimal (just eliminate the row and column of B containing that single nonzero entry). Hence B must contain exactly 2 nonzero entries in every column. By condition (c), subtracting the sum of the rows of B in R_2 from the sum of those in R_1 gives the zero vector; hence the rows of B are linearly dependent and det(B) = 0, contradicting the hypothesis. □

In the following lemma and theorem, we show why TUM matrices are important.

2.10 Lemma
Let x* ∈ P = {x ≥ 0 | Ax ≥ b}. Then x* is a vertex of P if and only if, for any pair of distinct points x_1, x_2 ∈ P, x* is not a strict convex combination of x_1, x_2.

Proof. (⇒) Suppose x* is a vertex of P and there are points x_1, x_2 ∈ P such that x* = λx_1 + (1 − λ)x_2 for some λ ∈ (0, 1). Since x* is a vertex, there is a half-space H = {x ∈ R^n | h⊤x ≤ d} such that H ∩ P = {x*}. Thus x_1, x_2 ∉ H, whence h⊤x_1 > d and h⊤x_2 > d. Hence h⊤x* = h⊤(λx_1 + (1 − λ)x_2) > d, so that x* ∉ H, a contradiction. (⇐) Assume that x* is not a strict convex combination of any pair of distinct points x_1, x_2 of P, and suppose that x* is not a vertex of P. Since x* is not a vertex, there is a ball S of radius ε > 0 and center x* wholly contained in P. Let x_1 ∈ S be such that ‖x_1 − x*‖₂ = ε/2, and let x_2 = (x* − x_1) + x*. By construction, x_2 is the point symmetric to x_1 with respect to x* on the line through x_1 and x*. Thus ‖x_2 − x*‖₂ = ε/2, x_2 ≠ x_1, x_2 ∈ P, and x* = ½x_1 + ½x_2, which is a strict convex combination of x_1, x_2, a contradiction.
□

2.11 Theorem
If A is totally unimodular, the vertices of the polyhedron P_c = {x ≥ 0 | Ax ≥ b} all have integral components.

Proof. Let x* be a vertex of P_c, and s* = Ax* − b. Then (x*, s*) is a vertex of the standardized polyhedron P̄_c = {(x, s) ≥ 0 | Ax − s = b}: suppose it were not; then by Lemma 2.10 there would be two distinct points (x_1, s_1), (x_2, s_2) in P̄_c such that (x*, s*) is a strict convex combination of them, i.e. (x*, s*) = λ(x_1, s_1) + (1 − λ)(x_2, s_2) for some λ ∈ (0, 1). Since s_1 = Ax_1 − b ≥ 0 and s_2 = Ax_2 − b ≥ 0, both x_1 and x_2 are in P_c, and thus x* = λx_1 + (1 − λ)x_2 with 0 < λ < 1 is not a vertex of P_c, which is a contradiction. Now, since A is TUM, (A, −I) is unimodular, and by Proposition 2.6 (x*, s*) is an integral vector. □

The transportation problem (1) has a totally unimodular matrix, so that solving a continuous relaxation of the problem always yields an integral solution vector.

2.12 Example
Network Flow. One important class of problems with a TUM matrix is that of network flow problems. We have a network on a digraph G = (V, A) with a source node s, a destination node t, and capacities u_ij on each arc (i, j). We have to determine the maximum amount of material flow that can circulate on the network from s to t. The variables x_ij, defined for each arc (i, j) in the graph, denote the quantity of flow units:

  max_x  ∑_{i∈δ⁺(s)} x_si
  s.t.   ∑_{j∈δ⁺(i)} x_ij = ∑_{j∈δ⁻(i)} x_ji    ∀ i ∈ V, i ≠ s, i ≠ t        (6)
         0 ≤ x_ij ≤ u_ij                        ∀ (i, j) ∈ A.

Notice that the network flow problem is not strictly an ILP, since there is no integrality requirement on the amount of flow routed in the network. However, it does have a TUM matrix, and hence, provided the capacities u_ij are integral, the solution is an integral vector. This class of problems is especially important as many common graph problems can be modelled by network flow constraints (e.g. the shortest path problem).
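To make Example 2.12 concrete, here is a small max-flow sketch (not from the text) using BFS augmenting paths, i.e. the Edmonds–Karp variant of Ford–Fulkerson, on a hypothetical network with integral capacities; as the TUM theory predicts, the optimal flow value is integral:

```python
from collections import deque

def max_flow(capacity, s, t):
    """Edmonds-Karp: repeatedly augment along shortest residual paths.
    capacity: dict of dicts, capacity[u][v] = capacity of arc (u, v)."""
    nodes = set(capacity) | {v for u in capacity for v in capacity[u]}
    cap = {u: {v: capacity.get(u, {}).get(v, 0) for v in nodes} for u in nodes}
    flow = {u: {v: 0 for v in nodes} for u in nodes}
    total = 0
    while True:
        # BFS for an augmenting path in the residual network.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in nodes:
                if v not in parent and cap[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total        # no augmenting path: flow is maximum
        # Recover the path and its bottleneck residual capacity.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        delta = min(cap[u][v] - flow[u][v] for u, v in path)
        for u, v in path:
            flow[u][v] += delta
            flow[v][u] -= delta  # residual (reverse) flow
        total += delta

# Hypothetical network: s->a (3), s->b (2), a->b (1), a->t (2), b->t (3).
G = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}}
print(max_flow(G, "s", "t"))  # → 5
```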
Even when this is not possible, it sometimes helps to model part of a problem with flow constraints, so that some of the integer variables in the resulting formulation can be relaxed to be continuous [Mac03].

Note that whereas a TUM constraint matrix implies a polyhedron with integral vertices, the converse is not necessarily true for a particular right-hand side b. However, if the polyhedron {x ≥ 0 | Ax ≥ b} has integral vertices for all integral vectors b ∈ Z^m, then A is TUM.

3 Polyhedral Analysis

The feasible region of any ILP problem can be expressed as the convex hull of its feasible vectors. Thus, if the feasible region is replaced by this convex hull, relaxing the integrality constraints and solving the resulting problem by LP methods still yields the optimal solution of the ILP. Since LP is in P, there is a definite advantage in being able to express the feasible region of any ILP as the convex hull of its feasible space. Unfortunately, given a feasible region in canonical form Ax ≥ b, x ≥ 0, finding the convex hull P̄ of the feasible vectors is a hard task. Even though any polyhedron (hence even P̄) can be expressed by a finite number of inequalities, the required number might be exponential in the size of the instance, thus rendering the whole exercise useless. Nonetheless, this concept spawned a lot of research, from the 1970s onwards, based on finding the inequalities defining P̄. This field is called polyhedral analysis.

Consider an ILP min{c⊤x | x ∈ Z^n ∩ P} where P = {x ≥ 0 | Ax ≥ b}, and let P̄ be the convex hull of its feasible integral solutions (thus min{c⊤x | x ∈ P̄} has the same solution as the original ILP). We are interested in finding explicit linear constraints defining the facets of P̄. The polyhedral analysis approach is usually expressed as follows: given an integral vector set X ⊆ Z^n and a valid inequality h⊤x ≤ d for X, show that the inequality is a facet of conv(X). We present here two approaches.

1. Find n points x_1, . . .
, x_n ∈ X such that h⊤x_i = d for all i ≤ n, and show that these points are affinely independent (i.e. that the n − 1 directions x_2 − x_1, . . . , x_n − x_1 are linearly independent).

2. Select t ≥ n points x_1, . . . , x_t satisfying the inequality. Suppose that all these points are on a generic hyperplane µ⊤x = µ_0, and solve the equation system

     ∀ k ≤ t:   ∑_{j=1}^n x_kj µ_j = µ_0

   in the n + 1 unknowns (µ_1, . . . , µ_n, µ_0). If the solution set is {(λh, λd) | λ ≠ 0}, then the inequality h⊤x ≤ d is facet-defining.

There are various theoretical approaches to show that a given polyhedron is the convex hull of an integral vector set; however, most of these approaches are difficult to generalize. One of the most stringent limits of polyhedral analysis is that the theoretical devices used to find facets for a given ILP problem are often unique to that very problem.

3.1 Separation Theory

Finding all the facets of P̄ is overkill, however. Since the optimization direction points us towards a specific region of the feasible polyhedron, what we really need is an oracle that, given a point x′ ∈ P, either tells us that x′ ∈ P̄ or produces a separating hyperplane h⊤x = d such that h⊤x̄ ≤ d for all x̄ ∈ P̄ and h⊤x′ > d (adding the constraint h⊤x ≤ d to the ILP formulation then tightens the feasible region P). The problem of finding a valid separating hyperplane is the separation problem. It is desirable that the separation problem for a given NP-hard problem should have polynomial complexity.

The classic example of a polynomial separation problem for an NP-hard problem is the TSP. Consider the undirected TSP of finding the shortest Hamiltonian cycle in a complete undirected weighted graph K_n (with weights c_ij on the edges {i, j}) with n vertices. Since a Hamiltonian tour is also a 2-matching, the following ILP model has a feasible region which contains the set of Hamiltonian tours as a proper subset:

  min    ∑_{i≠j∈V} c_ij x_ij
  s.t.   ∑_{j∈V : j≠i} x_ij = 2    ∀ i ∈ V
         x_ij ∈ {0, 1}             ∀ i ≠ j ∈ V.
(7)

Some of the feasible solutions of (7) are unions of disjoint cycles. However, we can exclude them from the feasible region by requesting that each cut should have value at least 2:

  ∀ S ⊊ V, S ≠ ∅:   ∑_{i∈S, j∉S} x_ij ≥ 2.

There is an exponential number of such constraints (one for each proper subset S of V); however, we do not need them all: we can add them iteratively by identifying cuts δ(S) where ∑_{{i,j}∈δ(S)} x_ij is minimum. We can see this as the problem of finding a cut of minimum capacity in a flow network (the current values x_ij are used as arc capacities; each edge is replaced by a pair of opposing arcs). This problem is the LP dual of finding the maximum flow in the network, which we can solve with a polynomial algorithm (O(n³) if we use the Goldberg–Tarjan push-relabel algorithm). So if each cut has capacity at least 2, the solution is an optimal Hamiltonian cycle. Otherwise, supposing that δ(S) has capacity K < 2, the following is a valid cut:

  ∑_{{i,j}∈δ(S)} x_ij ≥ 2,

which can be added to the formulation. The problem is then re-solved iteratively to get a new current solution, until the optimality conditions are satisfied.

In the following sections (3.2–3.6) we shall give examples of four different cut families which can be used to separate the incumbent (fractional) solution from the convex hull of the integral feasible set.

3.2 Chvátal Cut Hierarchy

A cut hierarchy is a finite sequence of cut families that generate tighter and tighter relaxed feasible sets P = P_0 ⊇ P_1 ⊇ . . . ⊇ P_k = P̄, where P̄ = conv(X). Given the relaxed feasible region P = {x ≥ 0 | Ax = b} in standard form, define the Chvátal first-level closure of P as

  P_1 = {x ≥ 0 | Ax = b, ∀ u ∈ R^m_+ : ⌊u⊤A⌋x ≤ ⌊u⊤b⌋}.

Observe that although formally the number of inequalities defining P_1 is infinite, it can be shown that these can be reduced to a finite number.
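As a concrete illustration (not from the text), the sketch below generates a single Chvátal–Gomory cut ⌊u⊤A⌋x ≤ ⌊u⊤b⌋ from a system of valid inequalities Ax ≤ b with x ≥ 0 and multipliers u ≥ 0 (the same recipe as in the closure above, stated there for the standard equality form). The classical odd-cycle example is used: the edge constraints of a triangle with u = (1/2, 1/2, 1/2) yield the cut x_1 + x_2 + x_3 ≤ 1:

```python
from fractions import Fraction
from math import floor

def chvatal_gomory_cut(A, b, u):
    """From valid inequalities Ax <= b (x >= 0, x integral) and
    multipliers u >= 0, derive the CG cut floor(uA) x <= floor(ub)."""
    m, n = len(A), len(A[0])
    coeffs = [floor(sum(u[i] * A[i][j] for i in range(m))) for j in range(n)]
    rhs = floor(sum(u[i] * b[i] for i in range(m)))
    return coeffs, rhs

# Triangle: x1 + x2 <= 1, x2 + x3 <= 1, x1 + x3 <= 1, u = (1/2, 1/2, 1/2).
# The resulting cut x1 + x2 + x3 <= 1 holds for all 0/1 feasible points
# but cuts off the fractional point (1/2, 1/2, 1/2).
A = [[1, 1, 0],
     [0, 1, 1],
     [1, 0, 1]]
b = [1, 1, 1]
u = [Fraction(1, 2)] * 3
print(chvatal_gomory_cut(A, b, u))  # → ([1, 1, 1], 1)
```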
We can proceed in this fashion by transforming the Chvátal first-level inequalities into equations via the introduction of slack variables, and reiterating the process on P_1 to obtain the Chvátal second-level closure P_2 of P, and so on. It can be shown that any fractional vertex of P can be "cut" by a Chvátal inequality.

3.3 Gomory Cuts

Gomory cuts are in fact a special kind of Chvátal cuts. Their fundamental property is that they can be inserted in the simplex tableau very easily, which makes them very efficient in cutting plane algorithms, where cuts are generated iteratively as a function of the current incumbent. Suppose x* is the optimal solution found by the simplex algorithm running on the continuous relaxation of the ILP, and assume x*_h is fractional. Since x*_h ≠ 0, column h is a basic column; thus there corresponds a row t in the simplex tableau:

  x_h + ∑_{j∈ν} ā_tj x_j = b̄_t,                                  (8)

where ν is the set of nonbasic variable indices, ā_tj is a component of B⁻¹A (B being the nonsingular square matrix of the current basic columns of A) and b̄_t is a component of B⁻¹b. Since ⌊ā_tj⌋ ≤ ā_tj and x ≥ 0,

  x_h + ∑_{j∈ν} ⌊ā_tj⌋ x_j ≤ b̄_t.

Furthermore, since the LHS is integer for every integral feasible x, we can restrict the RHS to be integer too:

  x_h + ∑_{j∈ν} ⌊ā_tj⌋ x_j ≤ ⌊b̄_t⌋.                              (9)

We now subtract (8) from (9) to obtain the Gomory cut:

  ∑_{j∈ν} (⌊ā_tj⌋ − ā_tj) x_j ≤ ⌊b̄_t⌋ − b̄_t.                      (10)

We can subsequently add a slack variable to the Gomory cut, transform it into an equation, and easily add it back to the current dual simplex tableau as the last row, with the slack variable in the current basis.

3.3.1 Gomory Cutting Plane Algorithm: Example

In this section we illustrate the iterative application of Gomory cuts in a Cutting Plane algorithm. The cutting plane algorithm solves a continuous relaxation at each step.
If the continuous relaxation solution fails to be integral, a separating cutting plane (a valid Gomory cut) is generated and added to the formulation, and the process is repeated. The algorithm terminates when the continuous relaxation solution is integral.

Let us solve the following IP in standard form:

  min    x_1 − 2x_2
  s.t.   −4x_1 + 6x_2 + x_3 = 9
         x_1 + x_2 + x_4 = 4
         x ≥ 0, x_1, x_2 ∈ Z.

In this example, x_3, x_4 can be seen as slack variables added to an original problem in canonical form with inequality constraints expressed in x_1, x_2 only. We identify x_B = (x_3, x_4) as an initial feasible basis and apply the simplex algorithm, obtaining the following tableau sequence (the first column contains the right-hand sides b̄, with the opposite of the objective function value in the top row next to the reduced costs; the pivot element is bracketed):

           x1      x2      x3      x4
   0  |     1      -2       0       0
   9  |    -4      [6]      1       0
   4  |     1       1       0       1

           x1      x2      x3      x4
   3  |   -1/3      0      1/3      0
  3/2 |   -2/3      1      1/6      0
  5/2 |   [5/3]     0     -1/6      1

           x1      x2      x3      x4
  7/2 |     0       0      3/10    1/5
  5/2 |     0       1      1/10    2/5
  3/2 |     1       0     -1/10    3/5

The solution of the continuous relaxation is x̄ = (3/2, 5/2), where x_3 = x_4 = 0.

[Figure: the feasible region delimited by constraints (1) and (2), with the fractional optimum x̄.]

We derive a Gomory cut from the first row of the optimal tableau: x_2 + (1/10)x_3 + (2/5)x_4 = 5/2. The cut is formulated as follows:

  x_i + ∑_{j∈N} ⌊ā_ij⌋ x_j ≤ ⌊b̄_i⌋,                              (11)

where N is the set of nonbasic variable indices and i is the index of the chosen row. In this case we obtain the constraint x_2 ≤ 2. We introduce this Gomory cut into the current tableau. Note that if we insert a valid cut into a simplex tableau, the current basis becomes primal infeasible, so a dual simplex iteration is needed. First of all, we express x_2 ≤ 2 in terms of the current nonbasic variables x_3, x_4, by subtracting the i-th optimal tableau row x_i + ∑_{j∈N} ā_ij x_j = b̄_i from (11):

  ∑_{j∈N} (⌊ā_ij⌋ − ā_ij) x_j ≤ ⌊b̄_i⌋ − b̄_i   ⇒   −(1/10)x_3 − (2/5)x_4 ≤ −1/2.

Recall that the simplex algorithm requires the constraints in equation (not inequality) form, so we introduce a slack variable x_5 ≥ 0 in the problem:

  −(1/10)x_3 − (2/5)x_4 + x_5 = −1/2.

We add this constraint as the bottom row of the optimal tableau.
We now have a current tableau with an added row and an added column, corresponding to the new cut and the new slack variable (which is in the basis):

           x1      x2      x3       x4      x5
  7/2 |     0       0      3/10     1/5      0
  5/2 |     0       1      1/10     2/5      0
  3/2 |     1       0     -1/10     3/5      0
 -1/2 |     0       0     -1/10   [-2/5]     1

The new row corresponds to the Gomory cut x_2 ≤ 2 (labelled "constraint (3)" in the figure below).

[Figure: constraints (1), (2) and the cut (3), with the old optimum x̄ and the new optimum x̃.]

We carry out an iteration of the dual simplex algorithm on this modified tableau. The reduced costs are all non-negative, but b̄_3 = −1/2 < 0 implies that x_5 = b̄_3 has negative value, so the basis is not primal feasible (as x_5 ≥ 0 is now a valid problem constraint). We pick x_5 to exit the basis. The variable j entering the basis is given by:

  j = argmin{ c̄_j / |ā_ij| : j ≤ n, ā_ij < 0 },

where i is the exiting row. In this case j = argmin{3, 1/2}, corresponding to j = 4. Thus x_4 enters the basis replacing x_5 (the pivot element is bracketed in the above tableau). The new tableau is:

            x1      x2      x3      x4      x5
  13/4 |     0       0      1/4      0      1/2
   2   |     0       1       0       0       1
  3/4  |     1       0     -1/4      0      3/2
  5/4  |     0       0      1/4      1     -5/2

The optimal solution is x̃ = (3/4, 2). Since this solution is not integral, we continue. We pick the second tableau row:

  x_1 − (1/4)x_3 + (3/2)x_5 = 3/4,

to generate the Gomory cut x_1 − x_3 + x_5 ≤ 0 which, written in terms of the variables x_1, x_2, is −3x_1 + 5x_2 ≤ 7. In terms of the nonbasic variables, this cut reads:

  −(3/4)x_3 − (1/2)x_5 ≤ −3/4.

The new tableau is:

            x1      x2      x3       x4      x5      x6
  13/4 |     0       0      1/4       0      1/2      0
   2   |     0       1       0        0       1       0
  3/4  |     1       0     -1/4       0      3/2      0
  5/4  |     0       0      1/4       1     -5/2      0
  -3/4 |     0       0    [-3/4]      0     -1/2      1

The pivot is bracketed; row 4 was chosen to exit since b̄_4 < 0, and column 3 enters since c̄_3/|ā_43| = 1/3 < 1 = c̄_5/|ā_45|. Pivoting, we obtain:

           x1      x2      x3      x4      x5       x6
   3  |     0       0       0       0      1/3      1/3
   2  |     0       1       0       0       1        0
   1  |     1       0       0       0      5/3     -1/3
   1  |     0       0       0       1     -8/3      1/3
   1  |     0       0       1       0      2/3     -4/3

This tableau has optimal solution x* = (1, 2), which is integral (and hence optimal). The figure below shows the optimal solution and the last Gomory cut to be generated.
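The cut computation used twice in this example amounts to taking fractional parts of a tableau row, as in equation (10). A minimal sketch in exact rational arithmetic, reproducing the first cut of the example:

```python
from fractions import Fraction
from math import floor

def gomory_cut(abar, bbar):
    """Given tableau row x_h + sum_j abar[j] x_j = bbar (j nonbasic),
    return the Gomory cut of equation (10): coefficients
    floor(abar_j) - abar_j and right-hand side floor(bbar) - bbar."""
    coeffs = {j: floor(a) - a for j, a in abar.items()}
    rhs = floor(bbar) - bbar
    return coeffs, rhs

# First cut of the example: row x2 + 1/10 x3 + 2/5 x4 = 5/2 yields
# -1/10 x3 - 2/5 x4 <= -1/2, i.e. x2 <= 2.
coeffs, rhs = gomory_cut({3: Fraction(1, 10), 4: Fraction(2, 5)}, Fraction(5, 2))
print(coeffs, rhs)
```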
[Figure: constraints (1)-(4), the fractional optima x̄, x̃ and the integral optimum x*.]

3.4 Disjunctive Cuts

We saw in Section 1.2, point 2, that disjunctive constraints can be modelled by resorting to ILP techniques. Suppose we are working in a feasible region consisting of a disjunction of two polyhedra P_1, P_2 (both contained in the nonnegative orthant), and consider two inequalities h_1⊤x ≤ d_1 (valid in P_1) and h_2⊤x ≤ d_2 (valid in P_2). Choose h such that h_i ≤ min(h_1i, h_2i) for all i ≤ n, and d such that d ≥ max(d_1, d_2). Then the inequality h⊤x ≤ d is valid for P_1 ∪ P_2 (it is called a disjunctive cut).

3.5 Lifting

It is possible to derive valid inequalities for a given ILP

  min{c⊤x | Ax ≤ b, x ∈ {0, 1}^n}                               (12)

by lifting lower-dimensional inequalities to a higher-dimensional space. Suppose the inequality ∑_{j=2}^n π_j x_j ≤ π_0 (with π_j ≥ 0 for all j ∈ {2, . . . , n}) is valid for the restricted ILP

  min{ ∑_{j=2}^n c_j x_j | ∑_{j=2}^n a_ij x_j ≤ b_i ∀ i ≤ m, x_j ∈ {0, 1} ∀ j ∈ {2, . . . , n} }.

Then, letting

  π_1 = π_0 − max{ ∑_{j=2}^n π_j x_j | ∑_{j=2}^n a_ij x_j ≤ b_i − a_i1 ∀ i ≤ m, x_j ∈ {0, 1} ∀ j ∈ {2, . . . , n} },

we obtain the inequality ∑_{j=1}^n π_j x_j ≤ π_0, which is valid for (12), as long as the above maximization problem is not infeasible; if it is, then x_1 = 0 is a valid inequality. Of course the lifting process can be carried out with respect to any variable index (not just 1).

3.6 RLT Cut Hierarchy

This cut hierarchy, described in [SD00], is a nonlinear lifting. RLT stands for Reformulation and Linearization Technique. One advantage of this cut hierarchy with respect to the Chvátal one is that the RLT can be applied to Mixed-Integer Linear Programs (MILP) very effectively. From each constraint a_i⊤x ≥ b_i we derive the i-th constraint factor γ(i) = a_i⊤x − b_i (i ≤ m). We then form the bound products

  x(J_1, J_2) = ∏_{i∈J_1} x_i ∏_{i∈J_2} (1 − x_i)

for all pairs of disjoint index sets J_1, J_2 with |J_1 ∪ J_2| = d. The RLT d-th level closure is obtained by multiplying each constraint factor γ(i) by all the bound products.
For all i ≤ m and all such pairs J_1, J_2, we derive the valid inequality

  x(J_1, J_2) γ(i) ≥ 0.                                          (13)

Notice that these inequalities are nonlinear. After having reduced all occurrences of x_i² via the valid integrality equation x_i² = x_i (x_i being binary), we linearize the inequalities (13) by lifting them into a higher-dimensional space: substitute each product ∏_{i∈J} x_i by a new added variable w_J = ∏_{i∈J} x_i, to obtain linearized cuts of the form

  l^i_{J_1,J_2}(x, w) ≥ 0,                                       (14)

where l^i_{J_1,J_2} is a linear function of x, w. We define P_d = {x | l^i_{J_1,J_2}(x, w) ≥ 0 for some w, for all i ≤ m and all pairs J_1, J_2 as above}. This cut hierarchy has two main shortcomings: (a) the high number of added variables and (b) the huge number of generated constraints. It is sometimes used heuristically to generate tight continuous relaxations of the ILP problem for use in Branch-and-Bound algorithms.

4 Enumerative Methods for Integer Programming

In this section we shall very briefly describe some of the most popular methods for the solution of ILP problems.

4.1 Branch-and-Bound

The first iterative method developed for solving ILPs is the Branch-and-Bound algorithm, which relies on a list of relaxed subproblems to be solved. The list initially contains the continuous relaxation of the ILP. In a typical iteration, the subproblem in the list having the lowest lower bound is chosen, removed from the list, and solved by LP methods, yielding a solution x* and a lower-bounding objective function value z*. If x* is integral, the subproblem is fathomed and a new incumbent solution is found; every subproblem in the list having a lower bound greater than the incumbent objective function value is then removed from the list and discarded, as it cannot contain the optimum.
Otherwise, the variable x∗i whose fractional part x∗i − ⌊x∗i⌋ is nearest to 1/2 is chosen for branching, and two new subproblems are created: one with the added constraint xi ≤ ⌊x∗i⌋ and the other with xi ≥ ⌈x∗i⌉; both are assigned the lower bound z∗ and inserted in the list. The algorithm terminates when the list of subproblems is empty, and it is guaranteed to terminate in a finite number of steps.

4.1.1 Example

Initially, we let x∗ = (0, 0) and f∗ = −∞. Let L be the unsolved problem list, initially containing the continuous relaxation of the original IP N1:

  max 2x1 + 3x2
  s.t. x1 + 2x2 ≤ 3    (1)
       6x1 + 8x2 ≤ 15   (2)
       x1, x2 ∈ Z+.

The solution of the continuous relaxation is P (the intersection of line (1) x1 + 2x2 = 3 with line (2) 6x1 + 8x2 = 15), as depicted in the figure below.

  [Figure: feasible region bounded by constraints (1)–(4), with objective gradient ∇f and relaxation optima P, Q, R.]

We obtain an upper bound¹ f̄ = 21/4 corresponding to the fractional solution x̄ = P = (3/2, 3/4). We choose the fractional component x1 as the branching variable, and form two subproblems N2, N3; we add the constraint x1 ≤ 1 (labelled (3)) to N2 and x1 ≥ 2 (labelled (4)) to N3. Finally, we push N2 and N3 on the problem list L.

We choose N2 and remove it from L. The solution of the continuous relaxation of N2 is at the intersection Q of (1) and (3): we obtain x̄ = Q = (1, 1) and f̄ = 5; since the solution is integral, we do not branch; further, since f̄ > f∗, we update x∗ = x̄ and f∗ = f̄.

¹ Notice we are solving a maximization problem, so the lower bound is replaced by an upper bound.

Now, we choose N3 and remove it from L. The solution of the continuous relaxation of N3 is at the intersection R of (2) and (4): we obtain x̄ = R = (2, 3/8) and f̄ = 41/8. Since the upper bound is 41/8 and ⌊41/8⌋ = 5, no integral solution in the feasible region of N3 can attain a better objective function value than f∗; hence, again, we do not branch. Since the list L is now empty, the algorithm terminates with solution x∗ = (1, 1).
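Since N1 has only finitely many feasible integer points, the conclusion of the example can be double-checked by plain enumeration. The sketch below is for verification only (a real Branch-and-Bound solver would branch on LP relaxations rather than enumerate):

```python
# Brute-force check of the Branch-and-Bound example:
# max 2*x1 + 3*x2  s.t.  x1 + 2*x2 <= 3,  6*x1 + 8*x2 <= 15,  x1, x2 in Z+.

def solve_by_enumeration():
    best_x, best_f = None, float("-inf")
    # x1 <= 3 and x2 <= 1 follow from x1 + 2*x2 <= 3 with x1, x2 >= 0
    for x1 in range(4):
        for x2 in range(2):
            if x1 + 2 * x2 <= 3 and 6 * x1 + 8 * x2 <= 15:
                f = 2 * x1 + 3 * x2
                if f > best_f:
                    best_x, best_f = (x1, x2), f
    return best_x, best_f

print(solve_by_enumeration())  # prints ((1, 1), 5)
```

As expected, enumeration confirms the optimum x∗ = (1, 1) with objective value 5 found by the Branch-and-Bound run above.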
The algorithmic process can be summarized by the following tree:

          N1
         /  \
  x1 ≤ 1     x1 ≥ 2
       /      \
     N2        N3

4.2 Branch-and-Cut

Branch-and-Cut algorithms are a mixture of Branch-and-Bound and cutting plane algorithms. In practice, they follow a Branch-and-Bound iterative scheme where valid cuts are generated at each subproblem.

4.3 Branch-and-Price

Branch-and-Price might be described as a mixture of Branch-and-Bound and column generation algorithms. Extremely large LP models are solved at each Branch-and-Bound iteration by column generation. The branching rules are sometimes modified to take into account certain numerical difficulties arising from the use of column generation to solve each subproblem.

5 Heuristics and Bounds

When optimal methods take too long to solve a difficult ILP, we must resort to heuristic methods. These methods do not guarantee optimality, yet they strive to build a good solution in an acceptable amount of time. The trade-off between the computational time spent searching for a solution and the solution quality can often be controlled by fine-tuning the algorithmic parameters. As heuristic algorithms are not guaranteed to find the optimal solution, they are often constructed with a special eye to implementation practicalities, so that they run fast. A posteriori analysis and modification of these algorithms is also especially useful. The "heuristic construction process" is often: (a) think of a possible heuristic, (b) implement it, (c) see how it performs on a number of test instances and record its failures, and (d) improve it, learning from experience. For example, a possible heuristic for the set covering problem (2) is to iteratively choose the region covering the largest number of still-uncovered towns. This type of heuristic is called greedy. Since the primary reason to employ a heuristic algorithm is that we are not able to solve the problem to optimality, estimating the heuristic solution quality is a hard challenge in itself.
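The greedy heuristic for set covering just described can be sketched in a few lines. The instance below (towns 1–5 and regions "A"–"D") is made up for illustration; it is not taken from the text:

```python
# Greedy heuristic for set covering (a sketch on a made-up instance).
# At each step, pick the region covering the most still-uncovered towns.

def greedy_set_cover(towns, regions):
    """towns: set of town ids; regions: dict region_id -> set of covered towns."""
    uncovered = set(towns)
    chosen = []
    while uncovered:
        # region covering the largest number of still-uncovered towns
        best = max(regions, key=lambda r: len(regions[r] & uncovered))
        if not regions[best] & uncovered:
            raise ValueError("some towns cannot be covered")
        chosen.append(best)
        uncovered -= regions[best]
    return chosen

towns = {1, 2, 3, 4, 5}
regions = {"A": {1, 2, 3}, "B": {2, 4}, "C": {3, 4, 5}, "D": {5}}
print(greedy_set_cover(towns, regions))  # prints ['A', 'C']
```

On this instance the greedy choice happens to find a minimum cover of two regions; in general, greedy set covering is only guaranteed to be within a logarithmic factor of the optimum.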
In certain cases (as with Christofides' algorithm for the Symmetric Travelling Salesman Problem) we are able to prove that the heuristic solution is within a certain, pre-defined range of the optimal solution. The techniques for these proofs belong to the field of approximation theory. As in polyhedral analysis, a different approach is called for by each heuristic and problem. In the absence of approximation results, the only way to assess the heuristic solution quality is to find a lower bound on the optimal solution and compare it with the heuristic results. The tighter the bound, the better our quality assessment will be.

Another reason why heuristics and lower bounds are important in ILP methods is that heuristics provide an upper bound to the solution. Thus, Branch-and-Bound convergence can be accelerated by finding good incumbents via heuristic methods. Tight lower bounding techniques also help speed up convergence considerably.

5.1 Lower Bounds: Lagrangian Relaxation and Subgradient Method

The classic way to obtain a lower bound for an ILP is to solve its continuous relaxation, as has been said above. Other techniques exist, however, which make use of some problem structure to improve the bound. Consider a "nearly decomposable" problem as follows:

  min_{x,y}  c1⊤x + c2⊤y
  s.t.  A1 x ≥ b1
        A2 y ≥ b2
        B x + D y ≥ d
        x ∈ Z+^{n1}, y ∈ Z+^{n2},   (15)

where x, y are two sets of integer variables, c1, c2, b1, b2, d are vectors and A1, A2, B, D are matrices of appropriate sizes. Observe that were it not for the complicating constraints Bx + Dy ≥ d, we might decompose problem (15) into two smaller independent subproblems and solve each separately (usually a much easier task). A tight lower bound for this kind of problem can be obtained by solving a Lagrangian relaxation of the problem.
The complicating constraints become part of the objective function, each constraint weighted by a Lagrange multiplier:

  L(λ) = min_{x,y}  c1⊤x + c2⊤y + λ⊤(d − Bx − Dy)
         s.t.  A1 x ≥ b1
               A2 y ≥ b2
               x ∈ Z+^{n1}, y ∈ Z+^{n2}.   (16)

For each real-valued vector λ ≥ 0, L(λ) is a lower bound on the optimal value of the original problem: since the relaxed constraints require Bx + Dy ≥ d, the penalty term λ⊤(d − Bx − Dy) is nonpositive at every feasible point. Since we want to find the tightest possible lower bound, we maximize L(λ) with respect to the Lagrange multipliers λ:

  max_{λ≥0} L(λ).   (17)

The lower bound obtained by solving (17) is guaranteed to be at least as tight as that obtained by solving the continuous relaxation of the original problem. Indeed, there exists a stronger result about the strength of the Lagrangian relaxation. For simplicity of notation, let us consider the following problem:

  min_x  c⊤x
  s.t.  Ax ≤ b
        x ∈ X,   (18)

where X is a discrete and finite set X = {x¹, …, x^T}. Consider the Lagrangian relaxation

  max_{λ≥0} min_{x∈X}  c⊤x + λ⊤(Ax − b),   (19)

and let p∗ be the optimal value of (19). By definition of X, (19) can be written as:

  max_{λ,u}  u
  s.t.  u ≤ c⊤x^t + λ⊤(Ax^t − b)  ∀t ≤ T
        λ ≥ 0.

The above problem is linear, so its dual:

  min_v  ∑_{t=1}^T v_t (c⊤x^t)
  s.t.  ∑_{t=1}^T v_t (A_i x^t − b_i) ≤ 0  ∀i ≤ m
        ∑_{t=1}^T v_t = 1
        v_t ≥ 0  ∀t ≤ T

(where A_i is the i-th row of A) has the same optimal objective function value p∗. Notice now that ∑_{t=1}^T v_t x^t, subject to ∑_{t=1}^T v_t = 1 and v_t ≥ 0 for all t ≤ T, ranges over the convex hull of {x^t | t ≤ T}, and so the above dual program can be written as:

  min_x  c⊤x
  s.t.  Ax ≤ b
        x ∈ conv(X),

where conv(X) is the convex hull of X. Thus, the bound given by the Lagrangian relaxation is at least as strong as that given by the continuous relaxation restricted to the convex hull of the original discrete domain. For problems with the integrality property, the Lagrangian relaxation is no tighter than the continuous relaxation. On the other hand, the above result also suggests that Lagrangian relaxations are usually very tight. Solving (17) is not straightforward, as L(λ) is a piecewise linear (and not everywhere differentiable) function.
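The lower-bound property of L(λ) in (19) can be checked by brute force on a tiny instance of the form (18). The data below (c, A, b) are made up for illustration, not taken from the text:

```python
# Lagrangian bound check for a tiny instance of (18):
#   min c.x  s.t.  A x <= b,  x in X = {0,1}^3.
# For any lambda >= 0, L(lambda) = min_{x in X} c.x + lambda.(A x - b)
# is a lower bound on the optimal value.
from itertools import product

c = [-3, -5, -4]         # made-up costs (negative, i.e. a maximization in disguise)
A = [[2, 4, 3]]          # one complicating constraint: 2x1 + 4x2 + 3x3 <= 5
b = [5]
X = list(product([0, 1], repeat=3))

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def L(lam):
    # inner minimization over the finite set X, for fixed multipliers lam
    return min(dot(c, x) + sum(lam[i] * (dot(A[i], x) - b[i])
                               for i in range(len(b))) for x in X)

# true optimum, by enumerating the feasible points of X
opt = min(dot(c, x) for x in X if all(dot(A[i], x) <= b[i] for i in range(len(b))))
for lam in ([0.0], [0.5], [1.0], [2.0]):
    assert L(lam) <= opt   # every multiplier vector yields a valid lower bound
print(opt, L([1.0]))       # prints: -7 -8
```

Here λ = 0 recovers the (weak) bound obtained by dropping the complicating constraint altogether, and larger multipliers trade off the objective against constraint violation; the subgradient method described next searches for the λ maximizing this bound.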
The most popular method for solving such problems is the subgradient method. Its description is very simple, although the implementation details are not.

1. Start with an initial vector λ∗ ≥ 0.
2. If a pre-set termination condition is verified, exit.
3. Evaluate L(λ∗) by solving the decomposable problem (16) with λ fixed at λ∗, yielding a solution x∗, y∗.
4. Choose a subgradient vector d of L at λ∗ and a step length t, and set λ′ = max{0, λ∗ + t d} (componentwise).
5. Update λ∗ = λ′ and go back to step 2.

Choosing appropriate subgradient vectors and step lengths at each iteration so as to accelerate convergence is a non-trivial task.

6 Conclusion

In this essay we have presented some of the most well-known tools for the solution of ILP problems. The topics discussed range from modelling concepts to geometrical insights and algorithmic descriptions.

References

[Fis99] M. Fischetti. Lezioni di Ricerca Operativa (in Italian). Edizioni Libreria Progetto, Padova, 1999.

[KV00] B. Korte and J. Vygen. Combinatorial Optimization: Theory and Algorithms. Springer-Verlag, Berlin, 2000.

[Mac03] N. Maculan. Integer programming problems using a polynomial number of variables and constraints for combinatorial optimization problems in graphs. In N. Mladenović and Dj. Dugošija, editors, SYM-OP-IS Conference Proceedings, Herceg-Novi, pages 23–26, Beograd, September 2003. Mathematical Institute, Academy of Sciences.

[NW88] G.L. Nemhauser and L.A. Wolsey. Integer and Combinatorial Optimization. Wiley, New York, 1988.

[PS98] C.H. Papadimitriou and K. Steiglitz. Combinatorial Optimization: Algorithms and Complexity. Dover, New York, 1998.

[SD00] H. Sherali and P. Driscoll. Evolution and state-of-the-art in integer programming. Journal of Computational and Applied Mathematics, 124:319–340, 2000.

[Wol98] L.A. Wolsey. Integer Programming. Wiley, New York, 1998.