Home AI/ML Mixed-Integer Programming (MIP) Explained: Python Optimization Guide

Mixed-Integer Programming (MIP) Explained: Python Optimization Guide

Last updated: May 27, 2026
k
Published April 15, 2026 · Updated May 27, 2026 · 33 min read

Summary

What this post covers: A practical introduction to Mixed-Integer Programming, including how to formulate decision problems, how branch-and-cut solvers operate internally, and how to implement realistic models in Python using PuLP, Pyomo and OR-Tools.

Key insights:

  • MIP underpins UPS ORION, airline crew scheduling and Amazon same-day routing. It saves these companies hundreds of millions of dollars annually and is considerably more important to industry than the more widely publicised deep-learning methods.
  • MIP is NP-hard in theory, yet modern branch-and-cut solvers, which apply cutting planes, presolve and primal heuristics, routinely handle millions of variables because real-world problem structure is substantially friendlier than the worst case.
  • Formulation quality dominates solver choice. A tight LP relaxation, supported by appropriate big-M values, strong cuts and symmetry breaking, often produces a 100-fold speedup, considerably more than the gain from upgrading from CBC to Gurobi.
  • Open-source solvers such as CBC, HiGHS and SCIP close more than 95 percent of optimality gaps on most problems with fewer than 100,000 variables. Commercial solvers such as Gurobi and CPLEX justify their licence fees only on the largest or most adversarial instances.
  • MIP is the appropriate tool when constraints are strict and decisions are discrete. Genetic algorithms, constraint programming and reinforcement learning each prevail in narrow niches but rarely match MIP’s guaranteed optimality bounds.

Main topics: The Big Idea Behind MIP, Formulating a MIP Step by Step, How MIP Solvers Actually Work, Python Implementation: Full Working Examples, Solvers Compared: Open Source vs Commercial, Real-World Applications, Practical Tips and Common Pitfalls, MIP vs Alternatives: GA, CP, RL, Frequently Asked Questions, Related Reading, References.

UPS’s ORION routing system saves the company approximately 100 million miles of driving each year, reduces fuel consumption by 10 million gallons, and eliminates roughly 100,000 metric tons of CO2 emissions. It is not powered by a neural network or a reinforcement-learning system. ORION is a substantial Mixed-Integer Program, a mathematical optimisation model containing yes/no decisions, integer counts and linear relationships, solved to near-optimality day after day. Airlines such as American and Delta use the same class of model to schedule crews across tens of thousands of flights, saving hundreds of millions of dollars annually. Amazon’s same-day delivery network is essentially a single, very large MIP that is re-solved every few minutes.

Mixed-Integer Programming is arguably the most valuable area of applied mathematics for which most software engineers have never written a line of code. A practitioner who has encountered a problem of the form “select which actions to take, how many of each, and in what order, so as to minimise cost or maximise profit” has almost certainly encountered a MIP without recognising it. The remainder of this article examines what MIP is, how problems are formulated within it, how the solvers operate internally, and how to write Python code that runs in practice.

The Big Idea Behind MIP

Consider a small delivery business that must decide which of five warehouses to open and which customers should be served from each. Opening a warehouse is a yes/no decision. The number of trucks purchased is an integer. The daily shipping volume is a continuous quantity. Total cost depends on each of these in a largely linear manner: fixed costs for opening, variable costs for shipping. The objective is to minimise total cost subject to satisfying customer demand. This situation describes a Mixed-Integer Linear Program.

A MIP is an optimisation problem in which some variables must take integer or binary values, others may be continuous, the objective is linear, and the constraints are linear. The “mixed” qualifier refers to the combination of integer and continuous variables. When every variable is continuous, the problem is a Linear Program, which is solvable in polynomial time by the simplex or interior-point method. When every variable is integer, the problem is a pure Integer Program. In practice, most real problems are MIPs, because business decisions typically combine discrete choices with continuous quantities.

LP vs IP vs MIP: What Actually Changes

The theoretical step from LP to MIP is large. LP is solvable in polynomial time; MIP is NP-hard. As problem size grows, solution time can therefore expand sharply. In practice, however, modern MIP solvers routinely handle problems with millions of variables, because the structure of real problems is typically far more tractable than the worst case.

Aspect LP IP (Pure Integer) MIP
Variable types All continuous All integer/binary Mix of continuous and integer
Complexity Polynomial (P) NP-hard NP-hard
Typical size solvable Millions of variables Thousands to millions Thousands to millions
Algorithm Simplex / Interior point Branch and cut Branch and cut
Use case Resource allocation, blending Pure combinatorial Most real business problems
Example Refinery product mix TSP, graph coloring Facility location, scheduling

 

Why Rounding the LP Solution Fails

A tempting shortcut is to solve the LP relaxation, treating the integer variables as continuous, and then round to the nearest integer. This approach is almost always incorrect and can fail dramatically. Consider a simple example: maximise x + y subject to x + y ≤ 1.5 with x, y ∈ {0, 1}. The LP relaxation produces x = 0.5, y = 1.0 with an objective of 1.5. Naive rounding may yield (1, 1), which is infeasible, or (0, 1) with an objective of 1, or (1, 0) with an objective of 1. The true MIP optimum is 1. Now consider a constraint of the form “x + y + z + … ≤ 1″ representing the opening of one warehouse out of 100. Rounding the fractional LP solution produces meaningless results.

The gap between the LP relaxation’s optimal value and the true MIP optimal value is termed the integrality gap. A formulation with a small integrality gap is described as tight or strong. A substantial portion of the craft of MIP modelling consists of making this gap as small as possible without expanding the problem size unmanageably.

MIP Geometry: LP Relaxation vs Integer Feasibility x y LP feasible region LP optimum (fractional) (x=4.2, y=6.8), obj=11.0 MIP optimum (integer) (x=4, y=6), obj=10.0 integrality gap = 10% Legend Integer feasible point LP optimum (corner, fractional) MIP optimum (best integer) LP feasible region Key insight: Rounding the LP optimum (4.2, 6.8) does NOT give the MIP optimum. The best integer point may lie deep inside — not on the boundary. Tighter formulations shrink the LP polygon toward the integer hull — faster solves.

When MIP Is Effective and When It Is Not

MIP is the appropriate tool when a problem has a clear discrete structure, a largely linear cost model, and when a provable guarantee of optimality, or a bounded optimality gap, is valuable. Classic applications include assignment (matching workers to jobs), scheduling (deciding which tasks run on which machines and in what order), routing (vehicle paths through customers), facility location (depot placement), network design (deciding which links to build), capacity planning (deciding how much to invest), and portfolio optimisation with discrete constraints (cardinality limits and round-lot purchases).

MIP is not the appropriate tool when the problem is entirely continuous, in which case LP or QP suffices; when the cost function is highly nonlinear and cannot reasonably be linearised, in which case nonlinear solvers or genetic algorithms may be preferable; when no clear discrete structure can be exploited; or when answers are required in milliseconds on problems that would take a solver minutes. Real-time control, for example, often relies on a heuristic or learned policy, sometimes trained by solving many MIPs offline.

Key Takeaway: MIP delivers a provable optimum, or a proven gap, for problems with discrete decisions. It scales substantially further in practice than its theoretical complexity suggests, thanks to decades of algorithmic engineering. It is most beneficial when the underlying problem genuinely contains a yes/no, integer-count structure.

Formulating a MIP Step by Step

Formulating a MIP is in part a craft and in part an engineering exercise. The modeller defines decision variables, writes an objective, and encodes business rules as linear constraints. The same problem may be modelled in many ways, and the differences materially affect solve time.

Decision Variables

MIPs typically employ three categories of variable.

  • Continuous (for example, litres of fuel or dollars invested): any real number within a range.
  • Integer (for example, the number of trucks or workers): non-negative integers.
  • Binary (for example, opening a warehouse yes or no, or buying a stock yes or no): 0 or 1. Binary variables are by far the most common in modelling because they encode logical choices.

Objective Function

The objective is a linear combination of the decision variables. For example, minimising total cost may be expressed as the sum of fixed cost multiplied by open_i, plus the sum of unit cost multiplied by shipment_ij. Maintaining a linear objective is a soft rule, since many nonlinear costs can be linearised by introducing auxiliary variables and constraints.

Linear Constraints and Logical Constraints

Constraints are ≤, ≥ or = relations between linear expressions. Their expressive power derives from the use of binary variables to encode logic.

  • At most k:i xi ≤ k
  • At least k:i xi ≥ k
  • Exactly one:i xi = 1 (assignment)
  • Implication (if x=1 then y=1): y ≥ x
  • Mutual exclusion (x and y cannot both be 1): x + y ≤ 1

The Big-M Method for If-Then Logic

One of the oldest and most frequently misused techniques in MIP is the Big-M method. Consider an investor who wishes to express the following: if a binary y = 0, then a continuous x must be 0; if y = 1, x may rise up to its natural upper bound. The corresponding constraint is written as follows:

x ≤ M * y     # where M is a sufficiently large number

If y = 0, the constraint forces x ≤ 0, so x = 0. If y = 1, the constraint becomes xM, which is effectively no upper bound. The mechanism is simple. However, Big-M is hazardous: selecting M too large weakens the LP relaxation, increases the integrality gap, and introduces numerical instability. Modern solvers such as Gurobi and CPLEX support indicator constraints (y = 1 ⇒ x ≤ c) natively, which are both tighter and numerically safer.

Caution: A common error is setting M = 1e9 as a precaution. Doing so undermines numerical stability and renders the LP relaxation useless. The smallest valid upper bound on the quantity involved should be selected.

Worked Example: The 0/1 Knapsack

Consider a bag with capacity W and n items, each with weight wi and value vi. The objective is to select a subset of items that maximises total value without exceeding capacity.

Variables: xi ∈ {0, 1} = 1 if item i is chosen.

Objective: maximise ∑i vi xi

Constraints:i wi xiW

The formulation is complete. Two lines of mathematics translate, as the implementation below illustrates, into roughly five lines of Python.

Worked Example: Uncapacitated Facility Location

Consider m candidate warehouse sites and n customers. Opening warehouse i costs fi. Serving customer j from warehouse i costs cij. Each customer must be served by exactly one open warehouse.

Variables:

  • yi ∈ {0, 1} = 1 if warehouse i is open.
  • xij ∈ [0, 1] = fraction of customer j‘s demand served from i (often also binary in assignment form).

Objective: minimize ∑i fi yi + ∑i, j cij xij

Constraints:

  • i xij = 1 for all j (each customer served fully)
  • xijyi for all i, j (can only ship from an open warehouse)

The final constraint deserves attention. The naive Big-M version would be ∑j xij ≤ M · yi, a single aggregated constraint per warehouse. The disaggregated form, xijyi, instead produces one constraint per customer-warehouse pair. The constraint count rises, but the LP relaxation is substantially tighter and solves run considerably faster. This is a canonical example of why formulation matters.

How MIP Solvers Actually Work

Understanding the internals of a MIP solver is not solely an academic exercise. It influences how a modeller writes formulations, how solver logs are interpreted, and why small-looking reformulations can change solve time by two orders of magnitude.

Branch and Bound

The core algorithm is branch and bound. The procedure begins by solving the LP relaxation, with the integrality requirements dropped. If the LP solution is already integer, the procedure terminates. Otherwise, a fractional variable, for example x = 2.7, is selected and two subproblems are created: one with x ≤ 2 and one with x ≥ 3. Each LP relaxation is solved, and the procedure recurses. The tree of subproblems grows, but entire branches may be pruned under three rules.

  • Infeasibility: the LP of a subproblem has no feasible solution.
  • Bound dominance: the LP bound of a subproblem is worse than the best integer solution found so far, referred to as the incumbent. No solution in this branch can improve upon the incumbent.
  • Integer feasibility: the LP solution of a subproblem is already integer, in which case the incumbent is updated if the new solution is better.

Branch and Bound Tree x ≤ 2 x ≥ 3 y ≤ 3 y ≥ 4 y ≤ 3 y ≥ 4 x ≤ 1 x = 2 ROOT (LP relax) x=2.7, y=3.4 obj=18.2 Node A x=2, y=3.6 obj=17.4 Node B x=3, y=3.1 obj=17.8 A1: integer feas. x=2, y=3 obj=16 (incumbent) A2 x=1.5, y=4 obj=17.0 B1: INFEASIBLE pruned B2: bound 15.5 < 16, pruned A2a: bound 15.8 < incumbent, pruned A2b: OPTIMAL x=2, y=4 obj=17 ★ LP node (fractional) Integer feasible Infeasible (pruned) Bound-dominated (pruned) Optimal solution

Cutting Planes

Pure branch and bound can grow unmanageably. The breakthrough that made modern MIP practical was the introduction of cutting planes: additional linear inequalities added to the LP relaxation that remain valid for all integer solutions but exclude the fractional LP optimum. Classical Gomory cuts, derived from the simplex tableau, were the first systematic family. Modern solvers apply dozens of families, including mixed-integer rounding cuts, flow cover cuts, knapsack cover cuts, clique cuts and lift-and-project cuts. Combining cuts with branching produces branch and cut, the dominant paradigm since the 1990s.

Heuristics Inside the Solver

A strong upper bound, in the case of a minimisation, allows the solver to prune aggressively. Modern solvers incorporate sophisticated primal heuristics. The feasibility pump rounds the LP solution and projects back toward feasibility. RINS (Relaxation Induced Neighbourhood Search) fixes the variables that agree between the LP relaxation and the incumbent and then solves a smaller MIP in the remaining space. Local branching defines a Hamming-distance neighbourhood around the incumbent. These methods routinely find feasible solutions within seconds on problems that pure branch and bound would struggle to address.

Presolve: the underlying mechanism

Before any branching occurs, the solver runs presolve, a suite of transformations that tighten bounds, eliminate redundant constraints, fix variables, detect implied integralities, and identify special structures such as set covering or packing. On real-world models, presolve often shrinks the problem by 30 to 70 percent before the first LP is solved. When Gurobi appears to solve a million-variable MIP almost instantaneously, presolve is typically the reason.

Warm Starts and Incumbents

A feasible solution from a heuristic, a previous solve, or a human expert can be supplied to the solver as a MIP start. The solver immediately holds an incumbent for pruning, and the search concentrates on proving optimality or improving on that incumbent. This single practice can convert a one-hour solve into a one-minute solve.

Python Implementation: Full Working Examples

The examples below use PuLP for the simpler cases and Pyomo for more advanced ones. Both are open source, and both allow easy switching between solvers. Installation is performed via pip install pulp pyomo. PuLP ships with the CBC solver by default.

Example 1: 0/1 Knapsack

from pulp import LpProblem, LpVariable, LpMaximize, lpSum, LpBinary, value

items = ['A', 'B', 'C', 'D', 'E']
weights = {'A': 2, 'B': 3, 'C': 4, 'D': 5, 'E': 9}
values  = {'A': 3, 'B': 4, 'C': 5, 'D': 8, 'E': 10}
capacity = 10

prob = LpProblem("Knapsack", LpMaximize)
x = LpVariable.dicts("item", items, cat=LpBinary)

# Objective: maximize total value
prob += lpSum(values[i] * x[i] for i in items)

# Constraint: total weight ≤ capacity
prob += lpSum(weights[i] * x[i] for i in items) <= capacity

prob.solve()

print(f"Status: {prob.status}")
print(f"Total value: {value(prob.objective)}")
for i in items:
    if x[i].value() > 0.5:
        print(f"  Take {i} (w={weights[i]}, v={values[i]})")

Running the code prints items A, B, D and C, or whichever subset the solver identifies, with a total value of 20 and a total weight of 9. CBC handles the problem in milliseconds.

Example 2: TSP with MTZ Subtour Elimination

The Travelling Salesman Problem is the classic routing benchmark. The subtle challenge in a MIP formulation is to forbid subtours, that is, disconnected loops. The Miller-Tucker-Zemlin formulation introduces auxiliary order variables ui and the constraint uiuj + n · xijn − 1 for all i ≠ j (except node 0). MTZ is weaker than the exponential family of subtour elimination constraints but fits within a compact formulation.

from pulp import LpProblem, LpVariable, LpMinimize, lpSum, LpBinary, LpInteger
import math, random

random.seed(42)
n = 8
coords = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(n)]
d = [[math.hypot(coords[i][0]-coords[j][0], coords[i][1]-coords[j][1])
      for j in range(n)] for i in range(n)]

prob = LpProblem("TSP", LpMinimize)
x = [[LpVariable(f"x_{i}_{j}", cat=LpBinary) if i != j else None
      for j in range(n)] for i in range(n)]
u = [LpVariable(f"u_{i}", lowBound=0, upBound=n-1, cat=LpInteger) for i in range(n)]

# Objective: total distance
prob += lpSum(d[i][j] * x[i][j] for i in range(n) for j in range(n) if i != j)

# Each node entered and left exactly once
for i in range(n):
    prob += lpSum(x[i][j] for j in range(n) if j != i) == 1
    prob += lpSum(x[j][i] for j in range(n) if j != i) == 1

# MTZ subtour elimination (fix u[0] = 0)
prob += u[0] == 0
for i in range(1, n):
    for j in range(1, n):
        if i != j:
            prob += u[i] - u[j] + n * x[i][j] <= n - 1

prob.solve()
tour = [0]
cur = 0
for _ in range(n - 1):
    for j in range(n):
        if j != cur and x[cur][j].value() > 0.5:
            tour.append(j)
            cur = j
            break
print("Tour:", tour, "length:", prob.objective.value())

For 8 cities the example is a toy. For 50 to 100 cities, MTZ combined with a good solver remains workable. Beyond that scale, practitioners use lazy subtour-elimination callbacks, which add cuts only when violated and scale to thousands of cities.

Example 3: Production Scheduling with Setup Times

Consider three machines and six jobs. Each job must run on one machine. Each machine has a processing time per job and a setup time per (predecessor, job) pair. The objective is to minimise makespan, defined as the time at which the last machine finishes.

from pulp import LpProblem, LpVariable, LpMinimize, lpSum, LpBinary, LpContinuous

jobs = list(range(6))
machines = list(range(3))
proc = {(j, m): 5 + ((j + m) % 4) for j in jobs for m in machines}
setup = {(i, j): 1 + ((i * 3 + j) % 3) for i in jobs for j in jobs if i != j}
BIG_M = sum(proc.values())

prob = LpProblem("SchedWithSetup", LpMinimize)

y = {(j, m): LpVariable(f"y_{j}_{m}", cat=LpBinary)
     for j in jobs for m in machines}          # job assignment
s = {j: LpVariable(f"s_{j}", lowBound=0, cat=LpContinuous) for j in jobs}  # start time
# z[i,j,m] = 1 if i precedes j on machine m
z = {(i, j, m): LpVariable(f"z_{i}_{j}_{m}", cat=LpBinary)
     for i in jobs for j in jobs if i != j for m in machines}
C_max = LpVariable("Cmax", lowBound=0, cat=LpContinuous)

# Each job on exactly one machine
for j in jobs:
    prob += lpSum(y[j, m] for m in machines) == 1

# Completion time ≤ makespan
for j in jobs:
    prob += s[j] + lpSum(proc[j, m] * y[j, m] for m in machines) <= C_max

# Disjunctive: if i and j both on machine m, one before the other
for i in jobs:
    for j in jobs:
        if i >= j:
            continue
        for m in machines:
            prob += z[i, j, m] + z[j, i, m] >= y[i, m] + y[j, m] - 1
            prob += s[j] >= s[i] + proc[i, m] + setup[i, j] - BIG_M * (1 - z[i, j, m])
            prob += s[i] >= s[j] + proc[j, m] + setup[j, i] - BIG_M * (1 - z[j, i, m])

prob += C_max                                   # minimize makespan
prob.solve()

print("Makespan:", C_max.value())
for m in machines:
    assigned = sorted([j for j in jobs if y[j, m].value() > 0.5],
                      key=lambda j: s[j].value())
    print(f"Machine {m}: " +
          " -> ".join(f"J{j}(s={s[j].value():.1f})" for j in assigned))

This represents a miniature version of real job-shop scheduling. The Big-M disjunctive constraints are precisely where indicator constraints in Gurobi or CPLEX would be cleaner. With six jobs, CBC solves the model in under a second. With 50 jobs, performance begins to degrade and a commercial solver becomes valuable.

Example 4: Multi-Period Facility Location

from pulp import LpProblem, LpVariable, LpMinimize, lpSum, LpBinary, LpContinuous

warehouses = ['W1', 'W2', 'W3', 'W4']
customers  = ['C1', 'C2', 'C3', 'C4', 'C5', 'C6']
periods    = [1, 2, 3]

fixed_cost  = {'W1': 1000, 'W2': 1500, 'W3': 1200, 'W4': 900}
capacity    = {'W1': 80,   'W2': 120,  'W3': 100,  'W4': 70}
demand      = {(c, t): 15 + (hash((c, t)) % 10) for c in customers for t in periods}
ship_cost   = {(w, c): 2 + ((hash((w, c)) % 7)) for w in warehouses for c in customers}

prob = LpProblem("MultiPeriodFL", LpMinimize)

y = {(w, t): LpVariable(f"y_{w}_{t}", cat=LpBinary)
     for w in warehouses for t in periods}      # open warehouse w at time t
x = {(w, c, t): LpVariable(f"x_{w}_{c}_{t}", lowBound=0, cat=LpContinuous)
     for w in warehouses for c in customers for t in periods}

# Objective
prob += (lpSum(fixed_cost[w] * y[w, t] for w in warehouses for t in periods)
         + lpSum(ship_cost[w, c] * x[w, c, t]
                 for w in warehouses for c in customers for t in periods))

# Demand satisfaction
for c in customers:
    for t in periods:
        prob += lpSum(x[w, c, t] for w in warehouses) >= demand[c, t]

# Capacity & open-only-then-ship
for w in warehouses:
    for t in periods:
        prob += lpSum(x[w, c, t] for c in customers) <= capacity[w] * y[w, t]

# Commitment: once open, stay open (y non-decreasing)
for w in warehouses:
    for t in periods[:-1]:
        prob += y[w, t + 1] >= y[w, t]

prob.solve()
print("Total cost:", prob.objective.value())
for t in periods:
    opens = [w for w in warehouses if y[w, t].value() > 0.5]
    print(f"Period {t}: open => {opens}")

This pattern, comprising binary open/close decisions, continuous flows, demand and capacity constraints, and time coupling, forms the skeleton of countless supply-chain models, including those used by Amazon and Walmart. At enterprise scale, multi-echelon structure, stochastic demand and thousands of SKUs are added, but the mathematical shape remains the same.

Tip: For recurring jobs, such as a nightly re-solve of a supply-chain model, the pipeline can be orchestrated with Apache Airflow so that data ingestion, the MIP solve and result publication are versioned and retryable.

Solvers Compared: Open Source vs Commercial

Solver choice can alter solve time by two orders of magnitude. The current landscape is summarised below as of 2026.

Solver License Speed (relative) Best For
CBC Open source (EPL) 1x Default in PuLP, small/medium problems
GLPK Open source (GPL) 0.7x Teaching, tiny problems
HiGHS Open source (MIT) 3–5x Modern OSS default, fast LP
SCIP Academic/ZIB (free for research) 5–10x Research, mixed constraint/integer
Gurobi Commercial (free academic) 30–100x Industrial gold standard
CPLEX Commercial (free academic) 25–80x IBM ecosystem, enterprise
FICO Xpress Commercial 20–80x Finance, large models

 

The 10 to 100 times advantage of commercial solvers over CBC is genuine. It derives from decades of cutting-plane engineering, superior presolve, parallel branch and bound, and tuned heuristics. For organisations that solve MIPs as a core activity, a Gurobi or CPLEX licence pays for itself on the first serious project. Both vendors offer free academic licences, and researchers have no reason not to evaluate them.

For solver-agnostic code, Pyomo can be used with SolverFactory('gurobi'), SolverFactory('cbc') or SolverFactory('highs'), as can python-mip. PuLP also supports multiple backends, although with a thinner abstraction.

Real-World Applications

The abstract mathematics becomes more tangible when the applications are made explicit. The domains in which MIP underpins industrial operations are outlined below.

MIP in the Wild: Ten Domains MIP engine Airline Crew Scheduling AA, Delta: $100M+/yr savings Vehicle Routing UPS ORION: 100M miles/yr saved Facility Location Supply chains, warehousing Manufacturing Job shop, lot sizing Sports League Scheduling MLB, NBA (CMU research) Healthcare Rostering Nurse/doctor scheduling Portfolio Optimization Cardinality, round-lot Telecom Network Design Capacity & routing Energy Grid Unit Commitment PJM, ERCOT day-ahead Retail Assortment Inventory + shelf space

Airline Crew Scheduling

Every major airline solves two substantial MIPs daily: crew pairing, in which sequences of flights are constructed to form a round trip, and crew rostering, in which pairings are assigned to specific pilots and flight attendants subject to rest, qualification and base constraints. Sabre, American, Delta and United collectively attribute hundreds of millions of dollars in annual savings to these optimisations. The models contain millions of variables and rely heavily on column generation, a decomposition in which new columns (pairings) are priced in on demand rather than enumerated in advance.

UPS ORION

ORION (On-Road Integrated Optimization and Navigation) re-optimises delivery routes for more than 55,000 drivers. The system combines MIP with heuristics because solving a full vehicle routing problem with time windows at this scale would otherwise be intractable. The reported savings are 100 million miles per year, 10 million gallons of fuel, 100,000 tonnes of CO2 and 300 to 400 million dollars per year. Few software projects can claim comparable impact.

Energy Grid Unit Commitment

Regional transmission operators such as PJM, which serves 65 million people across the US East, solve unit commitment MIPs to decide which generators to start or stop and at what output for every hour of the following day. Binary variables capture on and off states, integer variables capture startup sequences, and continuous variables capture megawatt output. A single solve handles thousands of units subject to ramp, minimum up and down, and reserve constraints, and runs in under 20 minutes. Electricity market clearing prices emerge directly from the dual variables of these MIPs.

Healthcare Staff Scheduling

The nurse rostering problem is widely studied in the operations research literature. Each hospital imposes its own rules, including maximum consecutive nights, minimum rest, skill mix per shift, fairness and individual preferences. MIP serves as the principal tool, often combined with constraint programming for the feasibility components.

Sports League Scheduling

Researchers at Carnegie Mellon have constructed MLB and NBA schedules using MIP for many years. The constraints include travel distance, venue availability, television windows, traditional rivalries and competitive balance. Sports scheduling is a frequently used test bed because the constraints are well defined and the benefits, including television revenue and fan experience, are measurable.

Portfolio Optimisation with Discrete Constraints

Pure mean-variance portfolio optimisation is a QP with no integer variables. Real portfolios, however, often impose cardinality constraints, such as a limit of 40 names, and round-lot constraints, such as the requirement that shares be purchased in multiples of 100. These conditions require binary and integer variables, transforming the problem into a mixed-integer quadratic program. LP and QP alone cannot model them; MIP is required.

Other Notable Applications

Further applications include telecom network design (backbone capacity and protection routing), manufacturing job-shop scheduling, lot-sizing and assembly-line balancing, retail assortment and inventory optimisation, chip-design floorplanning, railway crew and rolling-stock scheduling, waste collection routing, and even protein design and kidney-exchange matching. The last application is particularly consequential: kidney-exchange programmes in the United States and United Kingdom use MIP to match donor-recipient pairs in cycles and chains, saving lives each week.

Domain Typical vars Typical constraints Typical solve
Airline crew rostering 1M–10M 100K–1M Hours (column gen)
Unit commitment 100K–500K 500K–2M 10–20 minutes
Multi-echelon supply chain 50K–500K 50K–500K Minutes
Job shop scheduling 10K–100K 50K–500K Seconds to minutes
Portfolio with cardinality 1K–10K 1K–20K Seconds
Nurse rostering 10K–50K 20K–100K Minutes

 

Practical Tips and Common Pitfalls

Experience with MIP is largely a matter of pattern recognition. The lessons that practitioners typically learn through direct experience are summarised below.

Prefer Tight Formulations Over Compact Ones

When in doubt, additional constraints should be written if they tighten the LP relaxation. The facility-location example above, in which xijyi with O(mn) constraints is preferred over ∑j xij ≤ M · yi with O(m) constraints, is the canonical illustration. The disaggregated form appears larger but solves 10 to 100 times faster.

Choose Big-M Carefully, or Avoid It

The smallest valid M should always be selected. If the quantity is a time, M may be the makespan upper bound, defined as the sum of all processing times. If the quantity is a flow, M is the capacity. In Gurobi, CPLEX and recent versions of SCIP, indicator constraints (model.addGenConstrIndicator in gurobipy) should be used. They are numerically safer and often tighter.

Set MIP Gap and Time Limits

In a business context, proving the final 0.1 percent of optimality is rarely worth ten hours of compute time. A MIP gap tolerance of 1 to 5 percent and an appropriate time limit should be set. Most solvers will return the best feasible solution found, together with a verified bound, when either condition is reached.

# In PuLP with CBC
solver = pulp.PULP_CBC_CMD(timeLimit=300, gapRel=0.02, msg=True)
prob.solve(solver)

# In Pyomo with Gurobi
from pyomo.environ import SolverFactory
opt = SolverFactory('gurobi')
opt.options['TimeLimit'] = 300
opt.options['MIPGap'] = 0.02
opt.solve(model, tee=True)

Warm Start From a Heuristic

Any feasible solution should be obtained first, whether by greedy assignment, a previous day’s plan or a quick metaheuristic, and passed in as a MIP start. Incumbent-driven pruning is the single largest costless speedup available.

Decomposition for Substantial Problems

When a monolithic MIP becomes excessively large, decomposition is required. Benders decomposition splits the problem into a master problem governing the discrete decisions and subproblems governing the continuous variables given those discrete choices, iterating with cuts. Dantzig-Wolfe decomposition and column generation address problems with a natural block structure, such as airline pairings and cutting stock. Lagrangian relaxation relaxes coupling constraints using penalty multipliers. Modern solvers automate some of these procedures, but the largest problems still require manual decomposition.

Read the Solver Log

Solver logs convey a narrative: the initial LP bound, the first primal solution, the rate of gap closure, cuts applied, node count and parallel thread usage. If the gap remains stuck after 80 percent of the time limit, a tighter formulation or a better heuristic is typically required rather than a larger machine.

Caution: Units must not be mixed indiscriminately. Variables in the range [0, 1] combined with coefficients in the range [0, 1e7] cause severe numerical difficulties. All quantities should be scaled into reasonable ranges, ideally between 1e-3 and 1e3. Poor scaling is the single most common cause of the situation in which Gurobi reports infeasibility on a problem the modeller is confident is feasible.

MIP vs Alternatives: GA, CP, RL

MIP is powerful but not universal. Knowing when to use an alternative is a mark of an experienced modeller. The companion article on Genetic Algorithms examines the black-box counterpart.

MIP vs Genetic Algorithms

A genetic algorithm is a metaheuristic that evolves a population of candidate solutions using selection, crossover and mutation. It handles black-box fitness functions, arbitrary nonlinearity, and does not require explicit constraints. It provides no optimality guarantee, however. GA is appropriate when the objective or constraints are highly nonlinear, when evaluating a candidate requires a simulation, or when a linear formulation cannot be written. MIP is appropriate when a linear formulation is feasible and a provable optimum, or a bounded gap, is required.

MIP vs Constraint Programming

Constraint Programming excels at pure feasibility and scheduling problems with complex logical structure, for example disjunctive scheduling involving hundreds of global constraints such as AllDifferent or Cumulative. CP does not require linearity and handles logical relationships elegantly. MIP outperforms CP when the objective is a linear cost and when strong LP-based bounds are useful. Some hybrid solvers, such as Google OR-Tools CP-SAT, blur the boundary effectively.

MIP vs Reinforcement Learning

Reinforcement learning learns a policy mapping state to action, typically for sequential decision problems under uncertainty. MIP solves a single deterministic instance to optimality. The two methods address different problems. MIP may be used to solve tomorrow’s nominal plan, while an RL policy reacts to disruptions in real time, trained offline on thousands of perturbed MIP solutions.

Criterion MIP GA CP RL
Optimality guarantee Yes (bounded gap) No Yes No
Needs linear structure Yes No No No
Best on pure discrete logic Good OK Excellent Poor
Best on continuous + discrete Excellent OK Weak OK
Real-time decisions (ms) Rarely Maybe Sometimes Yes
Requires training data No No No Yes
Handles uncertainty natively No (needs stochastic MIP) No No Yes

 

MIP composes well with other methods. Demand forecasts from time-series models feed MIP inputs. Solutions are stored in specialised databases, as discussed in the time-series database comparison. When models are deployed to production systems that also run classifiers such as one-class SVMs for anomaly detection, or graph models such as Graph Attention Networks for relational features, MIP ties the optimisation layer together. Clean engineering practice is important: solver code should be written with sound clean-code principles and versioned according to Git best practices.

Frequently Asked Questions

When does MIP vs LP actually matter?

The moment you have a decision that is inherently yes/no or integer, such as opening a facility, assigning a worker, or buying a discrete number of machines, LP alone cannot model it correctly. Rounding LP solutions is almost never safe. If all decisions are continuous quantities such as litres, dollars or percentages, LP suffices and is substantially faster. If any decisions are binary or integer, MIP is required.

Should I use Gurobi or stick with CBC?

Begin with CBC, which is free and ships with PuLP, to prototype. If your problem solves in seconds and time pressure is limited, CBC is sufficient. If solve times extend into minutes or hours on problems of business significance, a Gurobi or CPLEX licence typically pays for itself many times over. Academic users obtain both at no cost. HiGHS occupies a modern open-source middle ground that has closed much of the gap for many problem classes.

How big a MIP can solvers handle?

Modern solvers routinely handle millions of variables and constraints on ordinary servers. What matters more is structure: highly symmetric or poorly formulated problems with 10,000 variables can be more difficult than well-formulated problems with 1,000,000. Airline crew problems containing billions of potential columns are solved daily via column generation. As a heuristic, if presolve shrinks the model by 50 percent or more, the problem is likely tractable; if not, expect difficulty.

MIP vs Genetic Algorithm: which should I use?

If linear constraints and a linear objective can be written, MIP yields a provable optimum and typically solves faster than a well-tuned GA on the same problem. If the objective requires a black-box simulator, exhibits significant nonlinearity, or changes shape frequently, a GA or other metaheuristic is a better fit. The two approaches can also be combined: a GA may rapidly produce a feasible solution that is then supplied as a MIP start.

Can MIP solve scheduling problems with thousands of tasks?

Yes, but typically with decomposition. Monolithic MIPs on 10,000 or more tasks with intricate constraints tend to be impractical. Practitioners decompose by day, by machine group or by crew. Hybrid approaches, in which MIP handles the macro assignment while constraint programming or local search handles detailed sequencing, are common. Google OR-Tools CP-SAT also handles very large scheduling problems using embedded SAT technology that sometimes outperforms MIP on scheduling-heavy instances.

Tip: Many teams find that the largest gains come not from a faster solver but from a single engineer who can reformulate a weak MIP into a strong one. Formulation skill continues to outperform brute force in 2026.
Related Reading:

References

This post is for informational and educational purposes only; it is not investment, engineering, or business advice.

You Might Also Like

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *