Farkas Lemma I

per the textbook, the Farkas lemma gives us a characterization for the solvability of linear inqualities. The first version is as follows:

Proposition 1.7 (Farkas Lemma (I))

Let $A\in\mathbb{R}^{m\times d}$ and $z\in\mathbb{R}^{m}$.
Either there exists a point $x\in\mathbb{R}^{d}$ with $Ax\leq z$,
or there exists a row vector $c\in(\mathbb{R}^{m})^*$ with $c\geq0$, $cA=0$, and $cz<0$, but not both.

Proof

Let

$$ \begin{align*} A= \begin{bmatrix} - & a_1 & -\\ - & a_2 & -\\ & \vdots & \\ - & a_m & - \end{bmatrix} \in \mathbb{R}^{m\times d}, \quad x= \begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_d \end{bmatrix} \in\mathbb{R}^d, \quad Ax= \begin{bmatrix} a_1x\\ a_2x\\ \vdots\\ a_mx \end{bmatrix}. \end{align*} $$

Hence the system $Ax\le z$ is equivalent to the collection of inequalities

$$ \begin{align*} a_1x&\le z_1,\\ a_2x&\le z_2,\\ &\vdots\\ a_mx&\le z_m. \end{align*} $$

Next, we show that only one condition can hold and not both. Suppose for the sake of contradiction that both conditions hold. Then, there exists $x \in \mathbb{R}^d$ with $Ax \leq z$ and there exists a row vector $c \geq 0$ such that $cA = 0$ and $cz < 0$. Observe that

$$ \begin{align} (cA)x = 0x = 0 \tag{1} \end{align} $$

Since $c \geq 0$ and $Ax \leq z$, multiply each inequality by the corresponding nonnegative coefficient of $c$ and add them together to get

$$ \begin{align*} c(Ax)\le cz \end{align*} $$

The left hand side is $0$ by (1) but $cz < 0$. Hence this is a contradiction and only one condition can be true.

Next we need to show that if condition one fails so there is no $x$ satisfying $Ax \leq z$, then condition two must hold. That is, there must exist a row vector $c \geq 0$ satisfying the second condition. That is,

$$ \begin{align*} cA=0 \qquad\text{and}\qquad cz<0. \end{align*} $$

But before we continue, what does this mean even? If there is no $x$ satisfying $Ax \leq z$, then the Polyhedron $P$ is empty. Now what is $Ax$? recall that each row of the matrix $A$ is a normal vector pointing away from the allowed side of the halfspace. why? take any inequality and re-write as a dot product as follows

$$ \begin{align*} 2x_1 + 3x_2 &\leq 6 \\ \begin{bmatrix} 2\\ 3 \end{bmatrix} \cdot \begin{bmatrix} x_1\\ x_2 \end{bmatrix} &\leq 6 \end{align*} $$

Hence $(2 \quad 3)^{T}$ is the normal vector which is prependicular to the surface. Since $6 > 0$, this hyperplane is shifted forward in the direction of the normal vector, hence the hyperspace will contain the origin now.

Now let's study what it means for $cA$ to be zero. When we multiply $c \geq 0$ by $A$, we are taking a linear combination of these vectors so

$$ \begin{align*} cA= \begin{bmatrix} c_1 & c_2 & \cdots & c_m \end{bmatrix} \begin{bmatrix} - & a_1 & -\\ - & a_2 & -\\ & \vdots & \\ - & a_m & - \end{bmatrix} = c_1a_1 + c_2a_2 + \cdots + c_ma_m \end{align*} $$

What kind of linear combination is this? Recall

Affine combination: if the coefficients $c_i$ sum to $1$.
Convex combination: The coefficients must sum to $1$ and be non-negative
Conical combination: The coefficients are only required to be non-negative. $\text{cone}(a_1,\cdots,a_m) = \{c_1a_1 + \cdots + c_ma_m \mid c_i \geq 0\}$

Hence this is a conical combination of the rows $a_1,a_2,\cdots,a_m$. What about $cA = 0$? this means find a point in the conical hull of the rows of $A$ that lands at the origin.

But let’s do another concrete example

Take the inqualities that defines the following halfspaces

$$ \begin{align*} x+y &\leq 2 \\ -x + y &< 2 \\ 0x -2y &\leq -6 \end{align*} $$

Clearly there are no solutions that satifies all three inequalities. So we expect condition two to hold. That is, we should be able to find a row vector $c \geq 0$ such that $Ac = 0$ and such that $cz < 0$. Take the vector $(1 \quad 1 \quad 1)$. Now observe that

$$ \begin{align*} cA = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ -1 & 1 \\ 0 & -2 \end{bmatrix} = \begin{bmatrix} 0 & 0 \end{bmatrix} \end{align*} $$

Moreover,

$$ \begin{align*} cA = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 2 \\ -6 \end{bmatrix} = -2 < 0 \end{align*} $$

So we can see that it at least works!

Back to the proof. Recall that we assumed that the first statement is false. That is, there is no $x$ satisfying $Ax \leq z$. We want to show that condition two must hold. That is, there must exist a row vector $c \geq 0$ satisfying $cA = 0$ and $cz < 0$. The textbook defines

$$ \begin{align*} P = P(A,z) &= \{x \in \mathbb{R}^d \mid Ax \leq z\} \end{align*} $$

and

$$ \begin{align*} Q = P((-z,A),0) &= \{ (x_0,x) \in \mathbb{R}^{d+1} \mid (-z, A) \begin{bmatrix} x_0 \\ x \end{bmatrix} \leq 0\} \\ &= \{ (x_0,x) \in \mathbb{R}^{d+1} \mid -x_0z + Ax \leq 0 \} \\ &= \{ (x_0,x) \in \mathbb{R}^{d+1} \mid Ax \leq x_0z \} \end{align*} $$

What is $Q$? Note above that for any $(x_0,x) \in Q$ and every $\lambda \geq 0$, we can multiply $\lambda$ on both sides since it’s not negative to get

$$ \begin{align*} Ax \leq x_0 z \\ \lambda Ax \leq \lambda x_0 z \end{align*} $$

Hence $(\lambda x_0, \lambda x) \in Q$. But this implies by definition that $Q$ is an $\mathcal{H}$-cone. Next, the textbook claims that

an $$x \in \mathbb{R}^d$$ exists with $$Ax \leq z$$ if and only of $$Q$$ contains a point with $$x_0 > 0$$.

Why? suppose that $P(A,z)$ is not empty. This implies that there exists some point $x \in \mathbf{R}^d$ such that $Ax \leq z$. Choosing $x_0=1$, we see that $(1,x) \in Q$ since $Ax \leq 1z$. Conversely, suppose $Q$ contains a point $(x_0,x)$ with $x_0 > 0$. Then by definition of $Q$

$$ \begin{align*} Ax &\leq x_0z \\ A \frac{x}{x_0} &\leq z \quad \text{(since $x_0 > 0$)}\\ \end{align*} $$

But this implies that $\frac{x}{x_0} \in P(A,z)$. This proves the claim. By assumption, we know that there is no $x$ satisfying $Ax \leq z$. By the claim, this means that $Q$ contains no point with $x_0 > 0$. Now, use Fourier-Motzkin elimination to eliminate $x_1,\cdots,x_d$. What’s left is just inequalities in terms of $x_0$ and since $x_0$ can’t be greater than $0$, then all the inqualities must have the form

$$ \begin{align*} \gamma_i x_0 \leq 0 \end{align*} $$

Since Fourier-Motzkin preseves the $\leq$ form. Moreover, only inequalities with at least one $\gamma > 0$ would force having $x_0 \leq 0$ (remember that $Q$ has no point with $x_0 > 0$). So fix this inequality with $\gamma > 0$ and $\gamma x_0 \leq 0$. Then, we want where this inequality came from during the elimination process. ….[TODO…]

References

Lectures on Polytopes