# Lowerbounds on the size of semidefinite relaxations

UCSD winter school on Sum-of-Squares, January 2017

# overview

### results

unconditional computational lower bounds

• for classical combinationatorial optimization problems
• examples: Max Cut, Traveling Salesman
• in restricted but powerful model of computational
• generalizes best known algorithms
• all possible linear and semidefinite relaxations
• first super-polynomial lower bounds in this model

general program goes back to Yannakakis’88 for refuting flawed P=NP proofs

proof strategy:

• show that sum-of-squares (SOS) algorithm is an “optimal algorithm” in this model
• use known lower bounds for SOS algorithm to derive lower bounds for all algorithms in this model [Grigoriev, Schoenebeck, Tulsiani, Barak-Chan-Kothari]

concrete theorem (“optimality of SOS”): For every approx ratio $$\alpha$$ and constraint satisfaction problem $$\Pi$$:
If any polynomial-size semidefinite relaxations has approx. ratio $$\alpha$$ for $$\Pi$$, then level-$$O(1)$$ SOS has approx. ratio $$\alpha$$

connection to optimization / convex geometry

• explicit $$0/1$$ polytopes in $${\mathbb{R}}^n$$ that are not (close to) linear images of spectrahedra in $${\mathbb{R}}^{N\times N}$$ for $$N\le \exp({n^{0.1}})$$ (intersection of affine subspace and set of $$N\times N$$ psd matrices)
• strong separation between rank and “positive-semidefinite rank” of matrices

# Yannakakis’s LP model

Yannakakis’s model: [Yannakakis’88]

• formalizes intuitive notion of LP relaxations for problems
• restricted enough to allow for unconditional lower bounds (indep. of P vs NP) [Fiorini-Massar-Pokutta-Tiwary de Wolf’12, Braun-Pokutta-S., Braverman-Moitra, Chan-Lee-Raghavendra-S.,Rothvoß]
• extends to SDP relaxations [Fiorini-Massar-Pokutta-Tiwary-de Wolf, Gouveia-Parrilo-Thomas]

### example: $$\ell_1$$-unit ball

$\left\{ \begin{gathered} \sum_i {\lvert x_i \rvert}\le 1\\ x\in {\mathbb{R}}^n \end{gathered} \right\}$

$$\ell_1$$-unit ball has $$2^n$$ facets $$\leadsto$$ no small direct LP formulation

but: $$\ell_1$$-unit ball is projection of polytope with $$2n+1$$ facets

$\left\{\begin{gathered} -y \le x \le y\\ \sum_i y_i \le 1 \\ x,y\in {\mathbb{R}}^n \end{gathered} \right\}$

### LP formulation of Max Cut

maximize $$f_G(x)=\sum_{ij\in E(G)}(x_i-x_j)^2$$ over $${\{0,1\}}^n$$

Max Cut as linear optimization problem:
maximize $$\sum_{ij\in E(G)}(X_{ii}+X_{jj}-2X_{ij})$$
over $$X\in \mathrm{CUT}_n=\mathrm{convex hull}{\left\{ {x {x}^\intercal} \mid x \in {\{0,1\}}^n \right\}}$$

$$\mathrm{CUT}_n$$ has exponential number of facets
$$\leadsto$$ no small direct LP formulation

general size-$$S(n)$$ LP formulation of Max Cut
polytope $$P\subseteq {\mathbb{R}}^{S(n)}$$ defined by $$\le S(n)$$ linear inequalities that projects to $$\mathrm{CUT}_n$$

sometimes called: extended LP formulation for $$\mathrm{CUT}_n$$

often exponential savings by projection: $$\ell_1$$-norm unit ball, Held-Karp TSP relaxation, LP/SDP hierarchies

question: can we prove size lower bounds
for LP formulations for Max Cut?

(implied by NP$$\neq$$P/poly)

lower bounds on general LP formulations for Max Cut

exact formulation requires size $$S(n)\ge 2^{\Omega(n)}$$ [Fiorini-Massar-Pokutta-Tiwary-de Wolf’12]

approx. ratio $$\gt 1/2$$ requires size $$S(n)\ge 2^{n^{\Omega(1)}}$$ [Chan-Lee-Raghavendra-S.’13,Kothari-Meka-Raghavendra’16]

(best known algorithms for Max Cut based on SDP (sum-of-squares) relaxations)

### SDP formulations of Max Cut

general size-$$S(n)$$ SDP formulation of Max Cut
spectrahedron $$P\subseteq {\mathbb{R}}^{S(n)\times S(n)}$$ defined by intersecting affine subspace with psd cone that projects to $$\mathrm{CUT}_n$$

lower bounds for SDP formul’s for Max Cut [Lee-Raghavendra-S.’15]

exact formulation requires size $$S(n)\ge 2^{n^{\Omega(1)}}$$

approx. ratio $$\gt 16/17$$ requires $$S(n)\ge n^{\omega(1)}$$
(matches best NP-hardness)

for $$S(n)=n^\ell$$, approx ratio no better than for $$O(\ell)$$-deg SOS
$$\leadsto$$ SOS is optimal poly-size SDP approx algorithm for Max Cut

# proofs

### recall: SOS certificates

certify nonnegativity of function by decompositing it into components that are obviously nonnegative

def’n: deg-$$\ell$$ SOS certificate for $$f{\colon}{\{0,1\}}^n\to {\mathbb{R}}$$ consists of $$g_1,\ldots,g_r{\colon}{\{0,1\}}^n\to {\mathbb{R}}$$ with $$\deg g_i\le \ell/2$$ such that

$f=g_1^2+\cdots+g_r^2$

deg-2 SOS captures best known approximation alg’m for Max Cut

Goemans–Williamson approximation for Max Cut:
for every graph $$G$$ and $$c\gt0$$, if $$\max_{{\{0,1\}}^n} f_G\le c$$, then $$c -0.878 f_G$$ has deg-2 SOS certificate

### general subspace certificates

let $$U$$ be linear subspace of real-valued functions on $${\{0,1\}}^n$$
(instead of subspace of low-degree functions)

def’n $$U$$-SOS certificate for $$f{\colon}{\{0,1\}}^n\to{\mathbb{R}}$$ consists of $$g_1,\ldots,g_r\in U$$ such that $$f=g_1^2+\cdots+g_r^2$$

given $$f$$ and explicit basis for $$U$$, in time $$(\dim U)^{O(1)}$$ can find $$U$$-SOS certificate for $$f$$ if it exists

connection to projections of spectrahedra:

lemma: cut polytope $$\mathrm{CUT}_n$$ is projection of $$N\times N$$-dim. spectrahedron iff exists $$N^2$$-dim subspace $$U$$ such that every nonegative deg-2 $$f{\colon}{\{0,1\}}^n\to {\mathbb{R}}$$ has $$U$$-SOS certificate

### efficient subspace certificates

do low-dim certificates imply low-deg certificates?

issue: every nonneg. $$f$$ has $$1$$-dim SOS certif’te (take $$U$$ as span of $$\sqrt f$$)

solution: need to consider families of nonnegative functions

def’n: family of nonnegative functions $${\mathcal F}$$ has $$S(n)$$-dim. subspace certificates if for every $$n\in {\mathbb{N}}$$, exists $$S(n)$$-dim. subspace $$U_n$$ such that every $$f\in {\mathcal F}\cap {\mathbb{R}}^{{\{0,1\}}^n}$$ has $$U$$-SOS certificate

### general lower bound theorem

let $$f{\colon}{\{0,1\}}^m\to {\mathbb{R}}$$ be nonnegative quadratic function

$$\mathrm{extensions}(f)=$$ all $$g{\colon}{\{0,1\}}^n \to {\mathbb{R}}$$ such that $$g$$ computes $$f$$ on subset $$S\subseteq [n]$$ of coordinates ($$g$$ same as $$f$$ up to renaming variables)

SOS optimality theorem: for every $$f\in {\{0,1\}}^m\to {\mathbb{R}}$$ and $$\ell\in {\mathbb{N}}$$:
if $$\mathrm{extensions}(f)$$ have $$n^\ell$$-dimensional subspace certificates, then $$f$$ has deg-$$10\ell$$ SOS certificate

contrapositive allows us to translate lower bounds for SOS to general SDP formulations

e.g.: $$(\sum_i x_i -n/2)^2-1/4$$ has no deg-$$n/100$$ SOS certificate
$$\leadsto$$ $$\mathrm{CUT}_n$$ requires superpolynomial SDP formulations

also preserve problem structure: if $$f$$ is “Max Cut function”, then $$\mathrm{extensions}(f)$$ are also “Max Cut functions”

also get bounds for superconstant $$\ell$$ but

# proofs

proof: (direction $$\Rightarrow$$) suppose $$\mathrm{CUT}_n$$ is projection of spect’dron

$\textstyle S={\left\{ A_0 + \sum_{i=1}^k z_i\cdot A_i \succeq 0 \mid z\in {\mathbb{R}}^k \right\}}\subseteq {\mathbb{R}}^{N\times N}$

every vertex $${x {x}^\intercal}$$ of $$\mathrm{CUT}_n$$ is projection of some $$Q(x)\in S$$

for every quadratic $$f{\colon}{\{0,1\}}^n\to {\mathbb{R}}$$, exists $$Y_f\in {\mathbb{R}}^{N\times N}$$ such that

${\mathop{\mathrm{Tr}}}Y_f Q(x)=f(x)\color{white}{\text{ for all }}x\in {\{0,1\}}^n$

choose $$U$$ as span of $$x\mapsto Q^{1/2}(x)_{ij}$$

to show: every nonneg quadratic $$f$$ has $$U$$-SOS certificate

suppose quadratic $$f{\colon}{\{0,1\}}^n\to {\mathbb{R}}$$ is nonnegative

$$\leadsto$$ $$\min_{Q \in S}{\mathop{\mathrm{Tr}}}Y_f Q\ge 0$$ (because $$S$$ projects to $$\mathrm{CUT}$$_n)

$$\leadsto$$ $$\exists P\succeq 0$$ such that $${\mathop{\mathrm{Tr}}}Y_fQ={\mathop{\mathrm{Tr}}}PQ$$ for all $$Q\in S$$ (by SDP duality)

$$\leadsto$$ $$f(x)={\mathop{\mathrm{Tr}}}P Q(x)={\left\| P^{1/2} Q^{1/2}(x) \right\|}_F^2$$ for all $$x\in {\{0,1\}}^n$$

$$\leadsto$$ $$f$$ has $$U$$-SOS certificate

$${\square}$$

### positive matrix functions

matrix-valued function $$Q{\colon}{\{0,1\}}^n\to {\mathbb{R}}^{N\times N}$$ is positive
if $$Q(x)\succeq 0$$ for all $$x\in {\{0,1\}}^n$$

characterization of subspace certificates in terms of positive matrix functions

lemma for every $$N$$-dim subspace $$U$$ of $${\mathbb{R}}^{{\{0,1\}}^n}$$, exists positive $$Q{\colon}{\{0,1\}}^n\to {\mathbb{R}}^{N\times N}$$ such that $$f$$ has $$U$$-SOS certificate if and only if $$\exists P\succeq 0$$ with $$f(x)={\mathop{\mathrm{Tr}}}P Q(x)$$

converse also holds (up to quadratic loss in $$N$$)

### proof strategy

let $$f{\colon}{\{0,1\}}^m\to {\mathbb{R}}$$ be nonnegative

let $$Q{\colon}{\{0,1\}}^n\to {\mathbb{R}}^{n^\ell\times n^\ell}$$ be positive matrix function
let $${\{ P_S\succeq 0\mid S\subseteq [n], {\lvert S \rvert}=m \}}$$ be such that $$f(x_S)={\mathop{\mathrm{Tr}}}P_S Q(x)$$

to show: $$f+{\varepsilon}$$ has deg-$$10\ell$$ SOS certificate

enough: every level-$$10\ell$$ pseudo-distr’n $$\mu$$ has $${\tilde{\mathbb{E}}}_\mu f\ge -{\varepsilon}$$

relative density $$\mu'= \mu/2^{-m}$$ satisfies

${\mathbb{E}}_\mu f = {\mathbb{E}}_S {\mathbb{E}}_x \mu'(x_S) f(x_S) = \color{red}{{\mathbb{E}}_S {\mathbb{E}}_x \mu'(x_S) \cdot {\mathop{\mathrm{Tr}}}P_S Q(x)}$
where $$x\in {\{0,1\}}^n$$ and $$S\subseteq[n]$$ with $${\lvert S \rvert} = m$$ are random

enough: $$\color{red}{{\mathbb{E}}_S {\mathbb{E}}_x \mu'(x_S) \cdot {\mathop{\mathrm{Tr}}}P_S Q(x)}\ge -{\varepsilon}$$

($$f$$ disappeared)

let $$Q{\colon}{\{0,1\}}^n\to {\mathbb{R}}^{n^\ell\times n^\ell}$$ be positive matrix function
let $${\{ P_S\succeq 0\mid S\subseteq [n], {\lvert S \rvert}=m \}}$$
let $$\mu$$ be level-$$\ell$$ pseuod-distr’n on $${\{0,1\}}^m$$ and $$\mu'=\mu\cdot 2^{n}$$

enough: $$\color{red}{{\mathbb{E}}_S {\mathbb{E}}_x \mu'(x_S) \cdot {\mathop{\mathrm{Tr}}}P_S Q(x)}\ge -{\varepsilon}\quad (\ast)$$

claim: $$(\ast)$$ holds if $$x\mapsto Q^{1/2}(x)$$ has degree at most $$\ell$$

proof: $$x\mapsto {\mathop{\mathrm{Tr}}}P_S Q(x)={\| P^{1/2}_SQ^{1/2}(x) \|}_F^2$$ is deg-$$\ell$$ SOS for all $$S$$
and $$\mu$$ is level-$$\ell$$ pseudo-distr’n

wishful thinking: approximate $$Q$$ by $$\tilde Q$$ in $$(\ast)$$ such that $$x\mapsto \tilde Q^{1/2}(x)$$ has “small degree”?

theorem: $$\exists$$ $$\tilde Q.$$ $${\mathbb{E}}{\mathop{\mathrm{Tr}}}\tilde Q = {\mathbb{E}}{\mathop{\mathrm{Tr}}}Q$$, $$\deg \tilde Q^{1/2}\le O_{\mu,{\varepsilon}}(\log n)$$
$$\color{red}{{\mathbb{E}}_S {\mathbb{E}}_x \mu'(x_S) \cdot {\mathop{\mathrm{Tr}}}P_S Q(x)} \ge \color{green}{{\mathbb{E}}_S {\mathbb{E}}_x \mu'(x_S) \cdot {\mathop{\mathrm{Tr}}}P_S \tilde Q(x)} -{\varepsilon}/2$$

theme: approximator need not be more complex than test

### learning simple approximators

theorem: $$\exists$$ $$\tilde Q.$$ $${\mathbb{E}}{\mathop{\mathrm{Tr}}}\tilde Q = {\mathbb{E}}{\mathop{\mathrm{Tr}}}Q$$, $$\deg \tilde Q^{1/2}\le O_{\mu,{\varepsilon}}(\log n)$$,
$$\color{red}{{\mathbb{E}}_S {\mathbb{E}}_x \mu'(x_S) \cdot {\mathop{\mathrm{Tr}}}P_S Q(x)} \ge \color{green}{{\mathbb{E}}_S {\mathbb{E}}_x \mu'(x_S) \cdot {\mathop{\mathrm{Tr}}}P_S \tilde Q(x)} -{\varepsilon}/2$$ $$(*)$$

intuition:

• start with “simplest possible” choice $$\tilde Q(x)={\mathrm{Id}}$$
• as long as $$(*)$$ not satisfied update $$\tilde Q$$ in “simplest possible” way toward improving $$(*)$$

simplicity for $$\tilde Q$$: version of entropy (from quantum information)

formalizing intuition: let $$F(x)={\mathbb{E}}_S \mu'(x_S) P_S$$. then, $$(*)$$ is same as

$\color{red}{{\langle F,Q \rangle}}\ge\color{green}{ {\langle F,\tilde Q \rangle}} -{\varepsilon}/2\quad (*)$

simplicity of test: $$F$$ has degree $$O_\mu(1)$$ (independent of $$n$$!)

new goal: find maximum entropy $$\tilde Q$$ satisfying $$(*)$$

closed form sol’n: $$\tilde Q(x)\propto e^{-\lambda F(x)/{\varepsilon}}$$, where $$\lambda$$=entropy-defect($$Q$$)
(entropy defect $$O(\log n)$$ because $$Q(x)$$ only $$n^\ell$$-dimensional)

related: (matrix) multiplicative weights, exponential families, gradient / mirror descent

remember actual goal: $$\deg \tilde Q ^{1/2} \le O_{\mu,{\varepsilon}}(\log n)$$

### conclusion

SOS algorithm can simulate general SDP formulation

• can intepret general SDP formulations as quantum state with small entropy defect

• learn simplest quantum state via matrix multiplicative weights (entropy maximization)

• show that this quantum state admits approximation as low-degree sum-of-squares (matrix exponentials)

Opps, you cannot play draw N guess with this browser!