# sum-of-squares and sparsest cut

UCSD winter school on Sum-of-Squares, January 2017

# Cheeger’s bound and degree-2 sos

### sparsest cut with deg-2 sos

recall: sparsest cut value $$\displaystyle \min_{x\in {\{0,1\}}^n} \frac {\color{red}{f_G(x)}}{\color{green}{\frac d n {\lvert x \rvert} (n-{\lvert x \rvert})}}$$

claim: if sparsest cut value $$\ge \color{#d33682}{{\varepsilon}}$$, then

$\vdash_2{\left\{ \color{red}{f_G(x)} \ge \color{#d33682}{{\varepsilon}^2/2} \cdot \color{green}{\frac d n {\lvert x \rvert}(n-{\lvert x \rvert})} \right\}}$

$$\leadsto$$ poly-time algorithm to approximate sparsest cut

later:

• this guarantee is tight for degree-2 sos
• better guarantee with degree-4 sos

### Cheeger bound as SOS certificate

Laplacian matrix $$L_G = \frac 1 d \sum_{\{i,j\}\in E_G} {(e_i-e_j) {(e_i-e_j)}^\intercal}$$

eigenvalues $$0=\lambda_1\le \lambda_2 \le \cdots \le \lambda_n\le 2$$

Cheeger bound: $$\lambda_2 \le$$ sparsest cut value $$\le \sqrt{2\lambda_2}$$

SOS captures this bound

claim: $$\vdash_2 \{ f_G \ge \lambda_2 \cdot \frac d n {\lvert x \rvert}(n-{\lvert x \rvert})\}$$

good approximation because $$\lambda_2\ge {\varepsilon}^2/2$$ if sparsest cut $$\ge {\varepsilon}$$

idea: prove easy direction of Cheeger in SOS
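sanity check of the two inequalities as stated above, on one small graph. a minimal sketch, assuming $$f_G(x)=\sum_{\{i,j\}\in E_G}(x_i-x_j)^2$$ (consistent with the Laplacian above); the 8-cycle and the helper names `cycle_edges`, `laplacian`, `sparsest_cut_value` are choices made here for illustration, not part of the lecture.

```python
import itertools
import math
import numpy as np

def cycle_edges(n):
    """Edges of the n-cycle (2-regular); used here only as a small test graph."""
    return [(i, (i + 1) % n) for i in range(n)]

def laplacian(n, edges, d):
    """L_G = (1/d) * sum over edges of (e_i - e_j)(e_i - e_j)^T."""
    L = np.zeros((n, n))
    for i, j in edges:
        diff = np.zeros(n)
        diff[i], diff[j] = 1.0, -1.0
        L += np.outer(diff, diff)
    return L / d

def sparsest_cut_value(n, edges, d):
    """Brute force over all nontrivial x in {0,1}^n (exponential time)."""
    best = math.inf
    for x in itertools.product([0, 1], repeat=n):
        size = sum(x)
        if size in (0, n):
            continue  # denominator vanishes for the two trivial cuts
        f_G = sum((x[i] - x[j]) ** 2 for i, j in edges)
        best = min(best, f_G / (d / n * size * (n - size)))
    return best

n, d = 8, 2
edges = cycle_edges(n)
# eigvalsh returns eigenvalues in ascending order, so index 1 is lambda_2
lambda_2 = np.linalg.eigvalsh(laplacian(n, edges, d))[1]
phi = sparsest_cut_value(n, edges, d)
print(lambda_2, phi, math.sqrt(2 * lambda_2))
```

on this example the printed values are ordered as $$\lambda_2 \le$$ sparsest cut value $$\le \sqrt{2\lambda_2}$$; the brute-force search is only feasible for very small $$n$$.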

### SOS proof of Cheeger

proof: let $$L_K$$ be projector to space orthogonal to
all-ones vector $$(1,\ldots,1)$$, the eigenvector of $$L_G$$ with eigenvalue $$\lambda_1=0$$. then,

\begin{aligned} {\langle x,L_G x \rangle}& = \tfrac 1 d \cdot f_G(x) \\ {\langle x,L_K x \rangle}& = \tfrac 1 n \cdot {\lvert x \rvert}(n-{\lvert x \rvert}) \end{aligned}

to show: $$\vdash_2 \{ {\langle x,(L_G - \lambda_2 \cdot L_K)x \rangle} \ge 0\}$$

follows from $$L_G-\lambda_2 \cdot L_K \succeq 0$$
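one way to spell out this last step, as a sketch under the definitions above. the orthonormal eigenvectors $$u_1,\ldots,u_n$$ of $$L_G$$ (with $$u_1=\tfrac 1{\sqrt n}(1,\ldots,1)$$) are notation introduced here for illustration only.

$\begin{aligned} L_G & = \sum_{k=1}^n \lambda_k \cdot {u_k {u_k}^\intercal}\,, \qquad L_K = \sum_{k=2}^n {u_k {u_k}^\intercal} \\ L_G-\lambda_2\cdot L_K & = \sum_{k=2}^n (\lambda_k-\lambda_2)\cdot {u_k {u_k}^\intercal} \succeq 0 \\ {\langle x,(L_G - \lambda_2 \cdot L_K)x \rangle} & = \sum_{k=2}^n (\lambda_k-\lambda_2)\cdot {\langle u_k,x \rangle}^2 \end{aligned}$

the right-hand side is a nonnegative combination of squares of linear forms in $$x$$, which is a degree-2 SOS certificate. multiplying by $$d$$ and using the two identities for $$L_G$$ and $$L_K$$ gives $$f_G(x) \ge \lambda_2\cdot \frac d n {\lvert x \rvert}(n-{\lvert x \rvert})$$.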

# SOS via pseudo-probability

### pseudo-probability

• useful way to reason about SOS

• dual to SOS certificates

• generalization of classical probability

• uncertainty arises from complexity
(as opposed to lack of information)

### formalization

formal expectation with respect to $$\mu{\colon}{\{0,1\}}^n\to{\mathbb{R}}$$

${\tilde{\mathbb{E}}}_{\mu} f = \sum_{x} \mu(x)\cdot f(x)$
(values of $$f$$ weighted by $$\mu$$)

def’n: $$\mu{\colon}{\{0,1\}}^n\to{\mathbb{R}}$$ is level-$$\ell$$ pseudo-distribution if

• normalization $${\tilde{\mathbb{E}}}_\mu 1 = 1$$
• positivity $${\tilde{\mathbb{E}}}_\mu g^2 \ge 0$$ whenever $$\deg g\le \ell/2$$

level-$$2n$$ pseudo-distributions are pointwise nonnegative and thus actual distributions

### efficient algorithm

th’m: optimize over level-$$\ell$$ pseudo-distributions in time $$n^{O(\ell)}$$
[Parrilo’00, Lasserre’00]

idea: characterization in terms of positive semidefinite matrices

claim: $$\mu{\colon}{\{0,1\}}^n\to {\mathbb{R}}$$ with $${\tilde{\mathbb{E}}}_\mu 1=1$$ is level-$$\ell$$ pseudo-distr’n
iff following matrix is positive semidefinite

${\tilde{\mathbb{E}}}_{\mu(x)} {v_{\ell/2}(x) {v_{\ell/2}(x)}^\intercal} \succeq 0$
(where $$v_k(x)=(1,x)^{\otimes k}$$ is the Veronese map)

more formally: set of moments $${\tilde{\mathbb{E}}}_{\mu(x)}v_\ell(x)$$ has $$n^{O(\ell)}$$-time separation oracle
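the level-2 case of this characterization can be checked directly for small $$n$$. a minimal sketch, assuming $$\mu$$ is given explicitly as a table of $$2^n$$ weights (so it is exponential in $$n$$ and only illustrates the definition, not the $$n^{O(\ell)}$$-time algorithm); the names `moment_matrix` and `is_level2_pseudo_distribution` are hypothetical.

```python
import itertools
import numpy as np

def moment_matrix(mu, n):
    """Pseudo-expectation of v(x) v(x)^T with v(x) = (1, x), i.e. level 2."""
    matrix = np.zeros((n + 1, n + 1))
    for x, weight in mu.items():
        v = np.array((1,) + tuple(x), dtype=float)
        matrix += weight * np.outer(v, v)
    return matrix

def is_level2_pseudo_distribution(mu, n, tol=1e-9):
    """Check normalization and positive semidefiniteness of the moment matrix."""
    normalized = abs(sum(mu.values()) - 1.0) <= tol
    psd = np.linalg.eigvalsh(moment_matrix(mu, n)).min() >= -tol
    return bool(normalized and psd)

# an actual distribution: uniform over {0,1}^2
uniform = {x: 0.25 for x in itertools.product([0, 1], repeat=2)}
# a signed function on {0,1}^1 that sums to 1 but has a negative second moment
signed = {(0,): 2.0, (1,): -1.0}

print(is_level2_pseudo_distribution(uniform, 2))
print(is_level2_pseudo_distribution(signed, 1))
```

the uniform distribution passes; the signed function fails because its moment matrix $$\begin{pmatrix} 1 & -1 \\ -1 & -1\end{pmatrix}$$ has a negative eigenvalue.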

### duality of pseudo-distr’s and sos cert’s

th’m: either $$\vdash_\ell \{f\ge 0\}$$ or $$\exists$$ level-$$\ell$$ pseudo-distr’n $$\mu$$ with

${\tilde{\mathbb{E}}}_\mu f\lt 0$

### character’n of level 2 pseudo-distr’s

most classical algorithms that use semidefinite prog’ing (SDP) are captured by deg-2 SOS

claim: $$\exists$$ level-2 pseudo-distr’n with mean $$m$$ and 2nd moment $$M$$
iff $${\mathop{\mathrm{diag}}}M = m$$ and $$M-{m {m}^\intercal}\succeq 0$$

characterization useful for developing algorithms based on deg-2 SOS and showing limitations of deg-2 SOS

proof idea: consider linear system of equations in $$\mu$$

${\left\{ {\tilde{\mathbb{E}}}_{\mu(x)} x=m,~ {\tilde{\mathbb{E}}}_{\mu(x)} {x {x}^\intercal}=M \right\}}$
satisfiable iff $${\mathop{\mathrm{diag}}}M =m$$ (by linear indep’nce of multilinear monomials)

for every linear polynomial $$g(x)={\langle a,x \rangle} + b$$,

\begin{aligned}[t] {\tilde{\mathbb{E}}}_\mu g^2 & = {\langle a,M a \rangle} + 2b{\langle a,m \rangle} + b^2\\ & \ge {\langle a,Ma \rangle} - {\langle a,m \rangle}^2\\ & = {\langle a,(M-{m {m}^\intercal})a \rangle} \ge 0 \end{aligned}
(second line: minimize over $$b$$, attained at $$b=-{\langle a,m \rangle}$$)
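the two conditions of the claim are easy to test numerically. a minimal sketch; the function name `is_level2_moment_pair` and the 3-variable example are illustrative choices made here, not taken from the lecture. the example has $${\tilde{\mathbb{E}}}(x_1+x_2+x_3-\tfrac 32)^2=0$$, which no actual distribution over $${\{0,1\}}^3$$ can satisfy, yet it passes the level-2 test.

```python
import numpy as np

def is_level2_moment_pair(m, M, tol=1e-9):
    """Check diag(M) = m and M - m m^T positive semidefinite (M symmetric)."""
    m = np.asarray(m, dtype=float)
    M = np.asarray(M, dtype=float)
    symmetric = np.allclose(M, M.T, atol=tol)
    diag_ok = np.allclose(np.diag(M), m, atol=tol)
    cov = M - np.outer(m, m)
    psd = np.linalg.eigvalsh((cov + cov.T) / 2).min() >= -tol
    return bool(symmetric and diag_ok and psd)

def pseudo_expectation_of_square(a, b, m, M):
    """Pseudo-expectation of (<a,x> + b)^2 computed from the moments m and M."""
    a = np.asarray(a, dtype=float)
    return float(a @ np.asarray(M) @ a + 2 * b * (a @ np.asarray(m)) + b ** 2)

# illustrative example: mean 1/2 per coordinate, pairwise second moments 1/8
m = np.full(3, 0.5)
M = np.full((3, 3), 0.125)
np.fill_diagonal(M, 0.5)

ok = is_level2_moment_pair(m, M)
zero = pseudo_expectation_of_square(np.ones(3), -1.5, m, M)
# pairwise moment 0.6 is too large for means 1/2, so the covariance is not PSD
bad = is_level2_moment_pair([0.5, 0.5], [[0.5, 0.6], [0.6, 0.5]])
print(ok, zero, bad)
```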

### quadratic sampling

lemma: for every level-2 pseudo-distr’n $$\mu$$ over $${\{0,1\}}^n$$, there exists Gaussian vector $$X=(X_1,\ldots,X_n)$$ with matching first two moments, so that

${\tilde{\mathbb{E}}}_{\mu(x)} x = {\mathbb{E}}X \text{ and } {\tilde{\mathbb{E}}}_{\mu(x)} {x {x}^\intercal} = {\mathbb{E}}{X {X}^\intercal}$

proof: let $$m={\tilde{\mathbb{E}}}_\mu x$$ be mean of $$\mu$$ and $$\Sigma={\tilde{\mathbb{E}}}_\mu {x {x}^\intercal} - {m {m}^\intercal}$$ be covariance of $$\mu$$. let $$g$$ be standard $$n$$-dimensional Gaussian vector. choose Gaussian vector $$X$$ as

$X = m + \Sigma^{1/2} g\,.$
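the construction can be sketched in a few lines. this is an illustration under assumptions: the moments `m`, `M` below are a made-up 3-variable example that satisfies $${\mathop{\mathrm{diag}}}M=m$$ and $$M-{m {m}^\intercal}\succeq 0$$, the name `gaussian_with_moments` is hypothetical, and $$\Sigma^{1/2}$$ is computed by an eigendecomposition (any square root of $$\Sigma$$ would do).

```python
import numpy as np

def gaussian_with_moments(m, M, num_samples, rng):
    """Draw rows X = m + Sigma^{1/2} g, where Sigma = M - m m^T and g ~ N(0, I)."""
    m = np.asarray(m, dtype=float)
    M = np.asarray(M, dtype=float)
    cov = M - np.outer(m, m)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # clip tiny negative eigenvalues caused by floating-point rounding
    root = eigvecs @ np.diag(np.sqrt(np.clip(eigvals, 0.0, None))) @ eigvecs.T
    g = rng.standard_normal((num_samples, len(m)))
    # root is symmetric, so each row of g @ root equals root applied to that g
    return m + g @ root, root

m = np.full(3, 0.5)
M = np.full((3, 3), 0.125)
np.fill_diagonal(M, 0.5)

samples, root = gaussian_with_moments(m, M, 200_000, np.random.default_rng(0))
empirical_mean = samples.mean(axis=0)
empirical_second_moment = samples.T @ samples / len(samples)
print(empirical_mean)
print(empirical_second_moment)
```

the empirical moments approximate $$m$$ and $$M$$ up to sampling error. note that the samples are real-valued vectors, not points of $${\{0,1\}}^n$$; only the first two moments match.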

# SOS limitation at degree 2 for sparsest cut

### Cheeger bound is tight (even for deg-2 SOS)

will see: deg-2 SOS approximation for Sparsest Cut no better than what follows from Cheeger’s bound

th’m: for every $${\varepsilon}\gt0$$, exists graph $$G$$ and level-2 pseudo-distr’n $$\mu$$ such that $$G$$ has sparsest cut value $$\ge {\varepsilon}$$ but

${\tilde{\mathbb{E}}}_{\mu(x)}\underbrace{\color{red}{f_G}}_{\color{red}{\text{sparsity numerator}}} \le O({\varepsilon}^2)\cdot {\tilde{\mathbb{E}}}_{\mu(x)} \underbrace{\color{green}{\tfrac d n{\lvert x \rvert}(n-{\lvert x \rvert})}}_{\color{green}{\text{sparsity denominator}}}$

will choose $$G$$ to be $$n$$-vertex path with $$n=1/{\varepsilon}$$, plus self-loops for regul’ty

$$\leadsto$$ sparsest cut $$S={\{ 1,\ldots,n/2 \}}$$ (first half of path), value $$\approx 1/n$$

to construct: level-2 pseudo-distr’n that believes $$G$$ has sparsest cut value $$O(1/n^2)$$

claim: $$\exists$$ level-2 pseudo-distr’n $$\mu$$ over $${\{0,1\}}^n$$ with

$\textstyle {\tilde{\mathbb{E}}}_{\mu (x)} \color{red}{\sum_{i=1}^{n-1} (x_i-x_{i+1})^2} \le O{\left( \frac{1}{n^2} \right)} \cdot {\tilde{\mathbb{E}}}_{\mu(x)}\color{green}{\tfrac 1n \sum_{i\lt j} (x_i-x_j)^2}$

intuition: choose covariance of pseudo-distr’n as projector into space of low eigenvalues of cycle

proof: let $$\omega=e^{2\pi i/n}$$ be $$n$$-th root of unity. let $$u=(\omega^1,\ldots,\omega^n)$$.
let $$v,w$$ be real and imaginary part of $$u$$, so that $$u=v+i\cdot w$$.
let $$\mu$$ be level-2 pseudo-distr’n such that

${\tilde{\mathbb{E}}}_{\mu(x)} x = \tfrac 12 \cdot {\mathbf 1}\text{ and } {\tilde{\mathbb{E}}}_{\mu(x)} {x {x}^\intercal} = \tfrac 14 \cdot {\left( {{\mathbf 1}{{\mathbf 1}}^\intercal} + {v {v}^\intercal} + {w {w}^\intercal} \right)}$
(using character’n of level-2 pseudo-distr’s)
then,

$${\tilde{\mathbb{E}}}_\mu \color{green}{\tfrac 1n \sum_{i\lt j}(x_i-x_{j})^2 } = \tfrac 18 \sum_{j=1}^n {\lvert 1-\omega^j \rvert}^2 \ge n\cdot \Omega(1)$$

$${\tilde{\mathbb{E}}}_\mu \color{red}{\sum_{i=1}^{n-1}(x_i-x_{i+1})^2} = (n-1) \cdot \tfrac 14 {\lvert 1-\omega \rvert}^2 \le n \cdot O(1/n^2)$$

🞏
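numerical check of the construction, as a sketch: it builds the moments $$m$$ and $$M$$ from the proof, verifies the level-2 conditions $${\mathop{\mathrm{diag}}}M=m$$ and $$M-{m {m}^\intercal}\succeq 0$$, and evaluates both sides of the claim from $${\tilde{\mathbb{E}}}(x_i-x_j)^2 = M_{ii}+M_{jj}-2M_{ij}$$. the value $$n=50$$ and the helper names are arbitrary choices made here.

```python
import numpy as np

def path_gap_moments(n):
    """Mean and second moment from the proof: u = (omega^1, ..., omega^n)."""
    theta = 2 * np.pi * np.arange(1, n + 1) / n
    v, w = np.cos(theta), np.sin(theta)  # real and imaginary parts of u
    ones = np.ones(n)
    m = 0.5 * ones
    M = 0.25 * (np.outer(ones, ones) + np.outer(v, v) + np.outer(w, w))
    return m, M

def expected_sq_diff(M, i, j):
    """Pseudo-expectation of (x_i - x_j)^2, read off from the second moments."""
    return M[i, i] + M[j, j] - 2 * M[i, j]

n = 50
m, M = path_gap_moments(n)

diag_ok = np.allclose(np.diag(M), m)
min_cov_eig = np.linalg.eigvalsh(M - np.outer(m, m)).min()

# red side: edges of the path; green side: all pairs, scaled by 1/n
numerator = sum(expected_sq_diff(M, i, i + 1) for i in range(n - 1))
denominator = sum(
    expected_sq_diff(M, i, j) for i in range(n) for j in range(i + 1, n)
) / n
print(diag_ok, min_cov_eig, numerator, denominator, numerator / denominator)
```

the printed ratio shrinks like a constant times $$1/n^2$$ as $$n$$ grows, while the true sparsest cut value of the path is $$\approx 1/n$$.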
