KKT与CQ

KKT已经成为了几乎所有的CS 、MAT学生都绕不开的知识、工具。浅写一下自己在几年的学习中对KKT和CQ的一些理解,方便自己日后复习。

1 Unconstrained Problem

这一部分是non-linear programming 的基础,这不是本篇文章想要讨论的重点,日后可能会继续做一次总结。对于unconstrained的问题大体有2个解法:

  1. Gradient Descent (1阶的算法)
  2. Newton's method / Quasi Neston's method(2阶的算法)

Unconstrained Problem 是解决constrained problem的基础。barrier method、penalty method 等都是将constrains 放到objective function之上,通过解决unconstrained problem来得到解(这些解法通常都可以与KKT point \ Duality 等结果相联系)

2 Constrained Problem and KKT condition

对于问题:

minxRnf(x) s.t. g(x)0,h(x)=0,\min _{x \in \mathbb{R}^{n}} f(x) \text { s.t. } g(x) \leq 0, \quad h(x)=0,

若x' 是该问题的local solution,且在x'处满足一种CQ,则在x'处必存在以下方程的一组解:

1st Order Karush-Kuhn-Tucker conditions (1阶KKT conditions)

{ (i) f(x)+g(x)λ+h(x)μ=0 (multiplier rule)  (ii) h(x)=0 (iii) λ0,g(x)0,(λ)g(x)=0 (complementarity conditions). \left\{\begin{array}{l} \text { (i) } \nabla f\left(x^{*}\right)+\nabla g\left(x^{*}\right) \lambda^{*}+\nabla h\left(x^{*}\right) \mu^{*}=0 \quad \text { (multiplier rule) } \\ \text { (ii) } h\left(x^{*}\right)=0 \\ \text { (iii) } \lambda^{*} \geq 0, \quad g\left(x^{*}\right) \leq 0, \quad\left(\lambda^{*}\right)^{\top} g\left(x^{*}\right)=0 \quad \text { (complementarity conditions). } \end{array}\right.

KKT condition通过 Farkas Lemma证明的,这一证明可以参考berserkas的教科书。

从以上的描述中我们再强调下这个定理说描述的事情,若在一个x*处满足一种“Cconstrained Qualification”,那么就一定有满足KKT的解。也就是说,在x处若有CQ,则KKT是必要条件

那么我们现在需要重新了解下:

什么是CQ?

3 Constrained Qualification

若要了解constrained qualification 需要先了解“cone”:

Feasible set and Active Constrain

Definition 1:Feasible Set and Active Constraints The set

X:={xRn:g(x)0,h(x)=0}X:=\left\{x \in \mathbb{R}^{n}: g(x) \leq 0, h(x)=0\right\}

is called feasible set of the problem (1). A point xRnx \in \mathbb{R}^{n} is called feasible if we have xXx \in X. For a feasible point xXx \in X, we define the index set of the active constraints A(x)\mathcal{A}(x) and the index set of the inactive constraints I(x)\mathcal{I}(x) as follows:

A(x):={i{1,,m}:gi(x)=0}I(x):={1,,m}\A(x)={i{1,,m}:gi(x)<0}\begin{aligned} \mathcal{A}(x) &:=\left\{i \in\{1, \ldots, m\}: g_{i}(x)=0\right\} \\ \mathcal{I}(x) &:=\{1, \ldots, m\} \backslash \mathcal{A}(x)=\left\{i \in\{1, \ldots, m\}: g_{i}(x)<0\right\} \end{aligned}

Definition 2: Tangent Cone Let XRnX \subset \mathbb{R}^{n} be a nonempty set. The tangent cone (or Bouligand cone) of XX at a point xXx \in X is given by

TX(x):={dRn:(ηk)kR++,(xk)kX such that xkx,ηk(xkx)d}T_{X}(x):=\left\{d \in \mathbb{R}^{n}: \exists\left(\eta_{k}\right)_{k} \in \mathbb{R}_{++},\left(x^{k}\right)_{k} \in X \text { such that } x^{k} \rightarrow x, \quad \eta_{k}\left(x^{k}-x\right) \rightarrow d\right\}

Definition 3: Linealized Tangent Cone

The set

T(g,h,x):={dRn:gi(x)d0iA(x),h(x)d=0}T_{\ell}(g, h, x):=\left\{d \in \mathbb{R}^{n}: \nabla g_{i}(x)^{\top} d \leq 0 \forall i \in \mathcal{A}(x), \quad \nabla h(x)^{\top} d=0\right\}

is called the linearized tangent cone at xXx \in X

几个CQ:

ACQ

T(g,h,x)=TX(x)T_{\ell}(g, h, x)=T_{X}(x)

GCQ

TX(x)=T(g,h,x)whereK:={vRn:vd0dK}T_{X}\left(x^{*}\right)^{\circ}=T_{\ell}\left(g, h, x^{*}\right)^{\circ}\\ \text{where} \\ K^{\circ}:=\left\{v \in \mathbb{R}^{n}: v^{\top} d \leq 0 \quad \forall d \in K\right\}

Definition 4: Constraint Qualification Let xXx \in X be given. A condition that implies the (GCQ) is called Constraint Qualification (CQ) for xx.