Recall that we want to prove

Main Theorem for Polyhedra (1.2)
A subset \(P\subseteq\mathbb{R}^d\) is a sum of a convex hull of a finite set of points plus a conical combination of vectors (a \(V\)-polyhedron) \begin{align*} P=\operatorname{conv}(V)+\operatorname{cone}(Y) \quad \quad \text{ for some } V\in\mathbb{R}^{d\times n}, Y\in\mathbb{R}^{d\times n'} \end{align*} if and only if it is an intersection of closed halfspaces (an \(H\)-polyhedron) \begin{align*} P=P(A,z) \quad \quad \text{ for some } A\in\mathbb{R}^{m\times d}, z\in\mathbb{R}^m \end{align*}

Proof

For the forward direction, we want to show that every \(\mathcal{V}\)-polyhedron is an \(\mathcal{H}\)-polyhedron. Given some \(\mathcal{V}\)-polyhedron \(P\). Let

$$ \begin{align*} V &= \{v_1,\ldots,v_n\},\\ Y &= \{y_1,\ldots,y_{n'}\}. \end{align*} $$

A point in \(\operatorname{conv}(V)\) has the form \(t_1v_1+\cdots+t_nv_n\) where \(t_i \ge 0\) and \(t_1+\cdots+t_n=1.\). Similarly, a point in \(\operatorname{cone}(Y)\) has the form \(u_1y_1+\cdots+u_{n'}y_{n'}\) where \(u_j \ge 0.\). Therefore a point in \(\operatorname{conv}(V)+\operatorname{cone}(Y)\) has the form

$$ \begin{align*} x = \sum_{i=1}^{n} t_i v_i + \sum_{j=1}^{n'} u_j y_j, \end{align*} $$

where \(t_i\ge0, u_j\ge0\) and \(\sum_{i=1}^{n} t_i=1.\). If we form the matrices

$$ \begin{align*} V &= \begin{bmatrix} v_1 & \cdots & v_n \end{bmatrix}, & Y &= \begin{bmatrix} y_1 & \cdots & y_{n'} \end{bmatrix}, \end{align*} $$

and the coefficient vectors

$$ \begin{align*} t &= \begin{bmatrix} t_1\\ \vdots\\ t_n \end{bmatrix}, & u &= \begin{bmatrix} u_1\\ \vdots\\ u_{n'} \end{bmatrix}, \end{align*} $$

then the expression above can be written compactly as

$$ \begin{align*} x=Vt+Yu. \end{align*} $$

The condition \(\mathbf{1}t=1\) is simply \(t_1+\cdots+t_n=1,\) where \(\mathbf{1} = \begin{bmatrix} 1 & \cdots & 1 \end{bmatrix}.\) Hence

$$ \begin{align*} \operatorname{conv}(V)+\operatorname{cone}(Y) &= \left\{ x\in\mathbb{R}^d : \exists\, t\in\mathbb{R}^n,\, u\in\mathbb{R}^{n'} \text{ such that } x=Vt+Yu,\, t\ge0,\, u\ge0,\, \mathbf{1}t=1 \right\} \\ &= \left\{ x\in\mathbb{R}^d : x= \sum_{i=1}^{n} t_i v_i + \sum_{j=1}^{n'} u_j y_j,\, t_i\ge0,\, u_j\ge0,\, \sum_{i=1}^{n} t_i=1 \right\}. \end{align*} $$

Next define

$$ \begin{align*} Q = \left\{ (x,t,u)\in\mathbb{R}^{d+n+n'} : x=Vt+Yu,\, t\ge0,\, u\ge0,\, \mathbf{1}t=1 \right\}. \end{align*} $$

Rather than looking only at the points \(x\), we introduce the coefficient vectors \(t\) and \(u\) as additional variables. Thus every point \(x\) is lifted to the point \((x,t,u)\) in the higher-dimensional space \(\mathbb{R}^{d+n+n'}\). The advantage of introducing \(Q\) is that all of the defining conditions

$$ \begin{align*} x &= Vt+Yu,\\ t &\ge 0,\\ u &\ge 0,\\ \mathbf{1}t &= 1 \end{align*} $$

are linear equations and inequalities. Therefore \(Q\) is naturally described as an intersection of finitely many affine hyperplanes and halfspaces, so \(Q\) is an \(\mathcal{H}\)-polyhedron. The original set \(\operatorname{conv}(V)+\operatorname{cone}(Y)\) can be recovered from \(Q\) by forgetting the variables \(t\) and \(u\) and keeping only the \(x\)-coordinates.

So now we know that \(Q\) is an \(\mathcal{H}\)-polyhedron but what we want is to show that the projection of \(Q\) gives us an \(\mathcal{H}\)-polyhedron which is the original set. In other words, we want to show that

$$ \begin{align*} \operatorname{proj}(Q) = \operatorname{conv}(V)+\operatorname{cone}(Y). \end{align*} $$

where the projection is

$$ \begin{align*} \operatorname{proj}(Q) &= \{x : \text{there exist } t,u \text{ such that } (x,t,u)\in Q\} \\ &= \left\{ x : \exists\, t,u \text{ such that } x=Vt+Yu,\, t\ge0,\, u\ge0,\, \mathbf1 t=1 \right\}. \end{align*} $$

Though in the textbook, instead of proving the projection theorem in full generality, they project along a coordinate axis and eliminate one variable. So suppose

$$ \begin{align*} P = \{x\in\mathbb R^d : Ax\le z\}. \end{align*} $$

and suppose that we want to eliminate the variable \(x_k\). The way to do this is using the Fourier-Motzkin Elimination method which goes like this: if we have a system of linear inequalities, isolate the variable we want to drop (\(x_k\)) and collect all the constraints where \(x_k\) is on the “greater side” and all the constraints where \(x_k\) is on the “less side”. The projection is then simply the system of inequalities that you get by pairing every single lower bound with every single uppoer bound. A small example is below

Consider a bounded triangular region defined by the following three linear inequalities:
  • \(x + y \le 1\)
  • \(y \ge 0\)
  • \(x \ge -1\)
To project this region onto the \(x\)-axis, we must systematically eliminate the variable \(y\). We rewrite each inequality to isolate \(y\) on one side. This allows us to separate the constraints into lower bounds and upper bounds for \(y\). So
  • From (1), \(y \le 1 - x\) (Upper Bound)
  • From (2) \(y \ge 0 \) (Lower Bound)
Note that constraint (3) does not contain \(y\). We set it aside as a independent constraint on \(x\) that must be preserved. We combine our upper and lower bounds into a single inequality for \(y\):
$$ \begin{align*} 0 \le y \le 1 - x \end{align*} $$
For a valid real number \(y\) to exist within this interval, the lower bound must be less than or equal to the upper bound. We eliminate \(y\) by setting up this compatibility condition: \(0 \le 1 - x \). We now collect all remaining inequalities that only involve \(x\). This includes the condition we just derived, alongside any original constraints that never contained \(y\) (constraint 3):
$$ \begin{align*} 0 \le 1 - x \quad &\implies \quad x \le 1 \\ \text{From (3):} \quad &\implies \quad x \ge -1 \end{align*} $$
The projection of the polyhedron onto the \(x\)-axis is completely described by the final system of inequalities for \(x\): \( -1 \le x \le 1 \). In interval notation, the projection is \([-1, 1]\).

The textbook then defines

$$ \begin{align*} \operatorname{proj}_k(P) = \{x - x_k e_k \mid x \in P\} \end{align*} $$

So given a point \(x\) in the polyhedron, we just subtract its \(k\)th component. So the \(k\)th component just becomes \(0\) while the remaining coordinates remain untouched. This is equivalent to

$$ \begin{align*} \operatorname{proj}_k(P) = \{x \in \mathbb{R}^d \mid x_k = 0, \exists y \in \mathbb{R} \mid x + ye_k \in P\} \end{align*} $$

So it’s the other way around. A point \(x \in \mathbb{R}^d\) is in the set if its \(x_k\) component is zero and there exists some \(y \in \mathbb{R}\) such \(x + ye_k\) is a point in \(P\). Note that this projection is contained entirely in the hyperplane

$$ \begin{align*} H_k = \{x \in \mathbb{R}^d \mid x_k = 0\} \end{align*} $$

Moreover, consider the set

$$ \begin{align*} \operatorname{elim}_k(P) &= \{x - te_k \mid x \in P, t \in \mathbb{R}\} \\ &= \{x \in \mathbb{R}^d \mid \exists y \in \mathbb{R}, x + ye_k \in P\} \\ \end{align*} $$

unlike \(\operatorname{proj}_k(P)\) which drops a dimension. \(\operatorname{elim}_k(P)\) is still full dimensional. For the triangle example above, \(\operatorname{proj}_k(P)\) is the line segment where \(-1 \leq x \leq 1\) and \(y=0\). However, \(\operatorname{elim}_k(P)\) is the entire plane such that \(-1 \leq x \leq 1\). Moreover, observe that

$$ \begin{align*} \operatorname{proj}_k(P) = \operatorname{elim}_k(P) \cap H_k \end{align*} $$

This leads to the isomorphism

$$ \begin{align*} \operatorname{elim}_k(P) \cong \operatorname{proj}_k(P) \times \mathbb{R} \end{align*} $$

The reason why we want to work with \(\operatorname{elim}_k(P)\) is because we stay in the same dimension and avoid dropping a dimension every time we drop a coordinate.


So the goal or plan is take the original set and lift it to the set \(Q\). Now we apply Fourier-Motzkin Elimination described in the triangle example above to get \(elim_k(Q)\) which as we saw is an \(\mathcal{H}\)-polyhedron since it a set of linear inequalities. but as we saw above, \(elim_k(Q)\) is isomorphic to \(\operatorname{proj}_k(Q) \times \mathbb{R}\). Hence \(\operatorname{proj}_k(Q)\) is also an \(\mathcal{H}\)-polyhedron.

To describe Fourier-Motzkin Elimination for eliminating \(x_k\), we isolate the coordinate variable \(x_k\). Let \(a_i\) denote the \(i\)-th row of the matrix \(A\). The dot product \(a_i x\) expands to include the \(k\)-th coordinate explicitly: \(a_i x = a_{i1}x_1 + \dots + a_{ik}x_k + \dots + a_{id}x_d\) The expression \(a_i x - a_{ik}x_k\) represents a linear combination of all variables except \(x_k\). This allows us to isolate \(x_k\) for any pair of constraints.

First, we isolate \(x_k\) into Upper and Lower Bounds. We choose an index \(i\) where \(x_k\) acts as an upper bound (\(a_{ik} > 0\)) and an index \(j\) where \(x_k\) acts as a lower bound (\(a_{jk} < 0\)).

  1. Upper Bound (\(a_{ik} > 0\)):
    $$ \begin{align*} a_i x \le z_i &\implies a_{ik}x_k + (a_i x - a_{ik}x_k) \le z_i \\ &\implies a_{ik}x_k \le a_{ik}x_k - a_i x + z_i \end{align*} $$
  2. Lower Bound (\(a_{jk} < 0\)): Since \(a_{jk}\) is negative, the coefficient \(-a_{jk}\) is strictly positive. We rewrite the \(j\)-th constraint to isolate \(x_k\) on the greater-than side:
    $$ \begin{align*} a_j x \le z_j &\implies a_{jk}x_k + (a_j x - a_{jk}x_k) \le z_j \\ &\implies -z_j + (a_j x - a_{jk}x_k) \le -a_{jk}x_k \\ &\implies (-a_{jk})x_k \ge -a_{jk}x_k + a_j x - z_j \end{align*} $$

Next, to eliminate \(x_k\), the leading coefficients on \(x_k\) must match. We multiply the upper bound inequality by the positive scalar \((-a_{jk})\) and the lower bound inequality by the positive scalar \(a_{ik}\):

$$ \begin{align*} \text{Scaled Upper Bound:} \quad & a_{ik}(-a_{jk})x_k \le (-a_{jk})\left(a_{ik}x_k - a_i x + z_i\right) \\ \text{Scaled Lower Bound:} \quad & a_{ik}(-a_{jk})x_k \ge a_{ik}\left(-a_{jk}x_k + a_j x - z_j\right) \end{align*} $$

A valid coordinate \(x_k\) exists if and only if the scaled lower bound is less than or equal to the scaled upper bound:

$$ \begin{align*} a_{ik}\left(-a_{jk}x_k + a_j x - z_j\right) \le (-a_{jk})\left(a_{ik}x_k - a_i x + z_i\right) \end{align*} $$

Expanding both sides yields:

$$ \begin{align*} -a_{ik}a_{jk}x_k + a_{ik}a_j x - a_{ik}z_j \le -a_{jk}a_{ik}x_k - (-a_{jk})a_i x + (-a_{jk})z_i \end{align*} $$

Notice that the \(-a_{ik}a_{jk}x_k\) terms appear identically on both sides and cancel completely, removing \(x_k\) from the system to get

$$ \begin{align*} a_{ik}a_j x - a_{ik}z_j \le -(-a_{jk})a_i x + (-a_{jk})z_i \end{align*} $$

Gathering all terms involving the remaining vector \(x\) to the left side and the constants to the right side gives the final eliminated linear inequality:

$$ \begin{align*} \left(a_{ik}a_j + (-a_{jk})a_i\right)x \le a_{ik}z_j + (-a_{jk})z_i \end{align*} $$

Because \(\left(a_{ik}a_j + (-a_{jk})a_i\right)\) is simply a new constant row vector \(A'\) and \(a_{ik}z_j + (-a_{jk})z_i\) is a new scalar constant \(z'\), this expression simplifies to:

$$ \begin{align*} A'x \le z' \end{align*} $$

Pairing every finite upper bound with every finite lower bound generates a new, finite system of linear inequalities. Thus, \(\operatorname{elim}_k(P)\) remains an \(\mathcal{H}\)-polyhedron which is what we wanted to show.


References