Fourier-Motzkin Elimination

Recall that we want to prove

Main Theorem for Polyhedra (1.2)

A subset $P\subseteq\mathbb{R}^d$ is a sum of a convex hull of a finite set of points plus a conical combination of vectors (a $V$-polyhedron) \begin{align*} P=\operatorname{conv}(V)+\operatorname{cone}(Y) \quad \quad \text{ for some } V\in\mathbb{R}^{d\times n}, Y\in\mathbb{R}^{d\times n'} \end{align*} if and only if it is an intersection of closed halfspaces (an $H$-polyhedron) \begin{align*} P=P(A,z) \quad \quad \text{ for some } A\in\mathbb{R}^{m\times d}, z\in\mathbb{R}^m \end{align*}

Proof

For the forward direction, we want to show that every $\mathcal{V}$-polyhedron is an $\mathcal{H}$-polyhedron. Given some $\mathcal{V}$-polyhedron $P$. Let

$$ \begin{align*} V &= \{v_1,\ldots,v_n\},\\ Y &= \{y_1,\ldots,y_{n'}\}. \end{align*} $$

A point in $\operatorname{conv}(V)$ has the form $t_1v_1+\cdots+t_nv_n$ where $t_i \ge 0$ and $t_1+\cdots+t_n=1.$. Similarly, a point in $\operatorname{cone}(Y)$ has the form $u_1y_1+\cdots+u_{n'}y_{n'}$ where $u_j \ge 0.$. Therefore a point in $\operatorname{conv}(V)+\operatorname{cone}(Y)$ has the form

$$ \begin{align*} x = \sum_{i=1}^{n} t_i v_i + \sum_{j=1}^{n'} u_j y_j, \end{align*} $$

where $t_i\ge0, u_j\ge0$ and $\sum_{i=1}^{n} t_i=1.$. If we form the matrices

$$ \begin{align*} V &= \begin{bmatrix} v_1 & \cdots & v_n \end{bmatrix}, & Y &= \begin{bmatrix} y_1 & \cdots & y_{n'} \end{bmatrix}, \end{align*} $$

and the coefficient vectors

$$ \begin{align*} t &= \begin{bmatrix} t_1\\ \vdots\\ t_n \end{bmatrix}, & u &= \begin{bmatrix} u_1\\ \vdots\\ u_{n'} \end{bmatrix}, \end{align*} $$

then the expression above can be written compactly as

$$ \begin{align*} x=Vt+Yu. \end{align*} $$

The condition $\mathbf{1}t=1$ is simply $t_1+\cdots+t_n=1,$ where $\mathbf{1} = \begin{bmatrix} 1 & \cdots & 1 \end{bmatrix}.$ Hence

$$ \begin{align*} \operatorname{conv}(V)+\operatorname{cone}(Y) &= \left\{ x\in\mathbb{R}^d : \exists\, t\in\mathbb{R}^n,\, u\in\mathbb{R}^{n'} \text{ such that } x=Vt+Yu,\, t\ge0,\, u\ge0,\, \mathbf{1}t=1 \right\} \\ &= \left\{ x\in\mathbb{R}^d : x= \sum_{i=1}^{n} t_i v_i + \sum_{j=1}^{n'} u_j y_j,\, t_i\ge0,\, u_j\ge0,\, \sum_{i=1}^{n} t_i=1 \right\}. \end{align*} $$

Next define

$$ \begin{align*} Q = \left\{ (x,t,u)\in\mathbb{R}^{d+n+n'} : x=Vt+Yu,\, t\ge0,\, u\ge0,\, \mathbf{1}t=1 \right\}. \end{align*} $$

Rather than looking only at the points $x$, we introduce the coefficient vectors $t$ and $u$ as additional variables. Thus every point $x$ is lifted to the point $(x,t,u)$ in the higher-dimensional space $\mathbb{R}^{d+n+n'}$. The advantage of introducing $Q$ is that all of the defining conditions

$$ \begin{align*} x &= Vt+Yu,\\ t &\ge 0,\\ u &\ge 0,\\ \mathbf{1}t &= 1 \end{align*} $$

are linear equations and inequalities. Therefore $Q$ is naturally described as an intersection of finitely many affine hyperplanes and halfspaces, so $Q$ is an $\mathcal{H}$-polyhedron. The original set $\operatorname{conv}(V)+\operatorname{cone}(Y)$ can be recovered from $Q$ by forgetting the variables $t$ and $u$ and keeping only the $x$-coordinates.

So now we know that $Q$ is an $\mathcal{H}$-polyhedron but what we want is to show that the projection of $Q$ gives us an $\mathcal{H}$-polyhedron which is the original set. In other words, we want to show that

$$ \begin{align*} \operatorname{proj}(Q) = \operatorname{conv}(V)+\operatorname{cone}(Y). \end{align*} $$

where the projection is

$$ \begin{align*} \operatorname{proj}(Q) &= \{x : \text{there exist } t,u \text{ such that } (x,t,u)\in Q\} \\ &= \left\{ x : \exists\, t,u \text{ such that } x=Vt+Yu,\, t\ge0,\, u\ge0,\, \mathbf1 t=1 \right\}. \end{align*} $$

Though in the textbook, instead of proving the projection theorem in full generality, they project along a coordinate axis and eliminate one variable. So suppose

$$ \begin{align*} P = \{x\in\mathbb R^d : Ax\le z\}. \end{align*} $$

and suppose that we want to eliminate the variable $x_k$. The way to do this is using the Fourier-Motzkin Elimination method which goes like this: if we have a system of linear inequalities, isolate the variable we want to drop ($x_k$) and collect all the constraints where $x_k$ is on the “greater side” and all the constraints where $x_k$ is on the “less side”. The projection is then simply the system of inequalities that you get by pairing every single lower bound with every single uppoer bound. A small example is below

Consider a bounded triangular region defined by the following three linear inequalities:

$x + y \le 1$
$y \ge 0$
$x \ge -1$

To project this region onto the $x$-axis, we must systematically eliminate the variable $y$. We rewrite each inequality to isolate $y$ on one side. This allows us to separate the constraints into lower bounds and upper bounds for $y$. So

From (1), $y \le 1 - x$ (Upper Bound)
From (2) $y \ge 0 $ (Lower Bound)

Note that constraint (3) does not contain $y$. We set it aside as a independent constraint on $x$ that must be preserved. We combine our upper and lower bounds into a single inequality for $y$:

$$ \begin{align*} 0 \le y \le 1 - x \end{align*} $$

For a valid real number $y$ to exist within this interval, the lower bound must be less than or equal to the upper bound. We eliminate $y$ by setting up this compatibility condition: $0 \le 1 - x $. We now collect all remaining inequalities that only involve $x$. This includes the condition we just derived, alongside any original constraints that never contained $y$ (constraint 3):

$$ \begin{align*} 0 \le 1 - x \quad &\implies \quad x \le 1 \\ \text{From (3):} \quad &\implies \quad x \ge -1 \end{align*} $$

The projection of the polyhedron onto the $x$-axis is completely described by the final system of inequalities for $x$: $ -1 \le x \le 1 $. In interval notation, the projection is $[-1, 1]$.

The textbook then defines

$$ \begin{align*} \operatorname{proj}_k(P) = \{x - x_k e_k \mid x \in P\} \end{align*} $$

So given a point $x$ in the polyhedron, we just subtract its $k$th component. So the $k$th component just becomes $0$ while the remaining coordinates remain untouched. This is equivalent to

$$ \begin{align*} \operatorname{proj}_k(P) = \{x \in \mathbb{R}^d \mid x_k = 0, \exists y \in \mathbb{R} \mid x + ye_k \in P\} \end{align*} $$

So it’s the other way around. A point $x \in \mathbb{R}^d$ is in the set if its $x_k$ component is zero and there exists some $y \in \mathbb{R}$ such $x + ye_k$ is a point in $P$. Note that this projection is contained entirely in the hyperplane

$$ \begin{align*} H_k = \{x \in \mathbb{R}^d \mid x_k = 0\} \end{align*} $$

Moreover, consider the set

$$ \begin{align*} \operatorname{elim}_k(P) &= \{x - te_k \mid x \in P, t \in \mathbb{R}\} \\ &= \{x \in \mathbb{R}^d \mid \exists y \in \mathbb{R}, x + ye_k \in P\} \\ \end{align*} $$

unlike $\operatorname{proj}_k(P)$ which drops a dimension. $\operatorname{elim}_k(P)$ is still full dimensional. For the triangle example above, $\operatorname{proj}_k(P)$ is the line segment where $-1 \leq x \leq 1$ and $y=0$. However, $\operatorname{elim}_k(P)$ is the entire plane such that $-1 \leq x \leq 1$. Moreover, observe that

$$ \begin{align*} \operatorname{proj}_k(P) = \operatorname{elim}_k(P) \cap H_k \end{align*} $$

This leads to the isomorphism

$$ \begin{align*} \operatorname{elim}_k(P) \cong \operatorname{proj}_k(P) \times \mathbb{R} \end{align*} $$

The reason why we want to work with $\operatorname{elim}_k(P)$ is because we stay in the same dimension and avoid dropping a dimension every time we drop a coordinate.

So the goal or plan is take the original set and lift it to the set $Q$. Now we apply Fourier-Motzkin Elimination described in the triangle example above to get $elim_k(Q)$ which as we saw is an $\mathcal{H}$-polyhedron since it a set of linear inequalities. but as we saw above, $elim_k(Q)$ is isomorphic to $\operatorname{proj}_k(Q) \times \mathbb{R}$. Hence $\operatorname{proj}_k(Q)$ is also an $\mathcal{H}$-polyhedron.

To describe Fourier-Motzkin Elimination for eliminating $x_k$, we isolate the coordinate variable $x_k$. Let $a_i$ denote the $i$-th row of the matrix $A$. The dot product $a_i x$ expands to include the $k$-th coordinate explicitly: $a_i x = a_{i1}x_1 + \dots + a_{ik}x_k + \dots + a_{id}x_d$ The expression $a_i x - a_{ik}x_k$ represents a linear combination of all variables except $x_k$. This allows us to isolate $x_k$ for any pair of constraints.

First, we isolate $x_k$ into Upper and Lower Bounds. We choose an index $i$ where $x_k$ acts as an upper bound ($a_{ik} > 0$) and an index $j$ where $x_k$ acts as a lower bound ($a_{jk} < 0$).

Upper Bound ($a_{ik} > 0$):
$$ \begin{align*} a_i x \le z_i &\implies a_{ik}x_k + (a_i x - a_{ik}x_k) \le z_i \\ &\implies a_{ik}x_k \le a_{ik}x_k - a_i x + z_i \end{align*} $$
Lower Bound ($a_{jk} < 0$): Since $a_{jk}$ is negative, the coefficient $-a_{jk}$ is strictly positive. We rewrite the $j$-th constraint to isolate $x_k$ on the greater-than side:
$$ \begin{align*} a_j x \le z_j &\implies a_{jk}x_k + (a_j x - a_{jk}x_k) \le z_j \\ &\implies -z_j + (a_j x - a_{jk}x_k) \le -a_{jk}x_k \\ &\implies (-a_{jk})x_k \ge -a_{jk}x_k + a_j x - z_j \end{align*} $$

Next, to eliminate $x_k$, the leading coefficients on $x_k$ must match. We multiply the upper bound inequality by the positive scalar $(-a_{jk})$ and the lower bound inequality by the positive scalar $a_{ik}$:

$$ \begin{align*} \text{Scaled Upper Bound:} \quad & a_{ik}(-a_{jk})x_k \le (-a_{jk})\left(a_{ik}x_k - a_i x + z_i\right) \\ \text{Scaled Lower Bound:} \quad & a_{ik}(-a_{jk})x_k \ge a_{ik}\left(-a_{jk}x_k + a_j x - z_j\right) \end{align*} $$

A valid coordinate $x_k$ exists if and only if the scaled lower bound is less than or equal to the scaled upper bound:

$$ \begin{align*} a_{ik}\left(-a_{jk}x_k + a_j x - z_j\right) \le (-a_{jk})\left(a_{ik}x_k - a_i x + z_i\right) \end{align*} $$

Expanding both sides yields:

$$ \begin{align*} -a_{ik}a_{jk}x_k + a_{ik}a_j x - a_{ik}z_j \le -a_{jk}a_{ik}x_k - (-a_{jk})a_i x + (-a_{jk})z_i \end{align*} $$

Notice that the $-a_{ik}a_{jk}x_k$ terms appear identically on both sides and cancel completely, removing $x_k$ from the system to get

$$ \begin{align*} a_{ik}a_j x - a_{ik}z_j \le -(-a_{jk})a_i x + (-a_{jk})z_i \end{align*} $$

Gathering all terms involving the remaining vector $x$ to the left side and the constants to the right side gives the final eliminated linear inequality:

$$ \begin{align*} \left(a_{ik}a_j + (-a_{jk})a_i\right)x \le a_{ik}z_j + (-a_{jk})z_i \end{align*} $$

Because $\left(a_{ik}a_j + (-a_{jk})a_i\right)$ is simply a new constant row vector $A'$ and $a_{ik}z_j + (-a_{jk})z_i$ is a new scalar constant $z'$, this expression simplifies to:

$$ \begin{align*} A'x \le z' \end{align*} $$

Pairing every finite upper bound with every finite lower bound generates a new, finite system of linear inequalities. Thus, $\operatorname{elim}_k(P)$ remains an $\mathcal{H}$-polyhedron which is what we wanted to show.

References

Lectures on Polytopes