Boaz Barak and David Steurer

UCSD winter school on Sum-of-Squares, January 2017

Input: \(d\)-regular graph \(G=(V,E)\) on \(n\) vertices.

Goal: Compute \(\max_{x\in\{0,1\}^n} f_G(x)\) where \(f_G(x) = \tfrac{1}{|E|}\sum_{\{i,j\}\in E} (x_i-x_j)^2\).
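As a concrete reference point, the objective can be evaluated directly, and for tiny graphs maximized by brute force. A minimal sketch (function names are illustrative; graphs are edge lists):

```python
from itertools import product

# f_G(x) = (1/|E|) * sum over edges {i,j} of (x_i - x_j)^2, for x in {0,1}^n.
def f_G(edges, x):
    return sum((x[i] - x[j]) ** 2 for i, j in edges) / len(edges)

# The 4-cycle is bipartite: the alternating assignment cuts every edge,
# so max f_G = 1 is attained.
c4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
assert f_G(c4, (0, 1, 0, 1)) == 1.0
assert max(f_G(c4, x) for x in product((0, 1), repeat=4)) == 1.0
```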

- Basic graph problem
- Great testbed for algorithmic techniques
- Related to practical questions including X-ray crystallography (phase recovery), cryo-electron microscopy (angular synchronization), and more

Can test for bipartiteness: \(\max f_G = 1\) if and only if \(G\) is bipartite, and in that case we can find a maximizing assignment.

Theorem (Erdős ’67): A random assignment \(x\) satisfies \(f_G(x) \geq 1/2\) in expectation.
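This can be verified exhaustively on a small example: each edge is cut by exactly half of all assignments, so averaging \(f_G\) over all \(2^n\) assignments gives exactly \(1/2\). A quick check on the 5-cycle:

```python
from itertools import product

# Average f_G over all 2^5 assignments of the 5-cycle: each edge is cut
# by exactly half of the assignments, so the average is exactly 1/2.
edges = [(i, (i + 1) % 5) for i in range(5)]
vals = [sum((x[i] - x[j]) ** 2 for i, j in edges) / len(edges)
        for x in product((0, 1), repeat=5)]
avg = sum(vals) / len(vals)
assert abs(avg - 0.5) < 1e-12
```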

*Can we beat this?* “Combinatorial” algorithms can’t distinguish between a *random* \(d\)-regular graph and a random *nearly bipartite* \(d\)-regular graph.

- **Goemans-Williamson algorithm** - nontrivial approximation using degree-\(2\) sos
- **Feige-Schechtman integrality gap** - the GW analysis is *tight*
- **Khot et al., Mossel et al., Raghavendra** - degree-\(2\) sos is *optimal* under the *Unique Games Conjecture*.

Input: Graph \(G\). Output: \(\max {\tilde{\mathbb{E}}}_\mu f_G\) over degree-\(2\) pseudo-distributions (\(2\)-pdists) \(\mu\).

\(= \max \mathrm{Tr}(MG)\) over \(M \succeq 0\) s.t. \(\forall_i M_{i,i}=1/2\)

Clearly \(GW(G) \geq \max f_G\).

GW Thm 1: If \(\max f_G = 1 - \epsilon\) then \(GW(G) \in [1-\epsilon,1-C\epsilon^2]\).

GW Thm 2: \(GW(G) \in [\max f_G , \max f_G / 0.878 ]\).

Suppose \({\mathbb{E}}_{i \sim j} {\tilde{\mathbb{E}}}_\mu (x_i-x_j)^2 \geq 1-\delta\).

Quadratic sampling lemma \(\Rightarrow\) sample a Gaussian \(Y\in {\mathbb{R}}^n\) s.t. \({\mathbb{E}}Y_iY_j = {\tilde{\mathbb{E}}}_\mu x_ix_j\) and \({\mathbb{E}}Y_i = 1/2\).

Exercise: If \(Y,Y'\) are jointly Gaussian with \({\mathbb{E}}Y = {\mathbb{E}}Y' = {\mathbb{E}}Y^2 = {\mathbb{E}}{Y'}^2 = 1/2\) and \({\mathbb{E}}(Y-Y')^2 = 1 -\delta\), then \(\Pr[(Y-1/2)(Y'-1/2) \geq 0] \leq C\sqrt{\delta}\).

(Hint: one can write \(Y' - 1/2 = \rho\,(Y-1/2) + \tfrac{1}{2}\sqrt{1-\rho^2}\,Z\) with \(\rho = 2\delta-1\) and \(Z\) an independent standard Gaussian.)
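The exercise can be sanity-checked numerically. For centered jointly Gaussian \(U,U'\) with correlation \(\rho\), Sheppard's formula (a classical fact) gives \(\Pr[UU' \geq 0] = 1/2 + \arcsin(\rho)/\pi\); the stated moments force \(\rho = 2\delta - 1\) for \(U = Y-1/2\), \(U' = Y'-1/2\), and the probability indeed vanishes like \(\sqrt{\delta}\):

```python
import math

# Check of the exercise via Sheppard's formula: for centered jointly
# Gaussian (U, U') with correlation rho, Pr[U*U' >= 0] = 1/2 + asin(rho)/pi.
# The stated moments give Var(Y) = Var(Y') = 1/4 and force rho = 2*delta - 1
# for U = Y - 1/2 and U' = Y' - 1/2.
def same_sign_prob(delta):
    rho = 2 * delta - 1
    return 0.5 + math.asin(rho) / math.pi

# The probability vanishes like sqrt(delta); here C = 1 already suffices.
for delta in (1e-2, 1e-4, 1e-6):
    assert same_sign_prob(delta) <= math.sqrt(delta)
```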

Corollary:If \(Z_i = \mathbb{I}_{Y_i \geq 1/2}\) and \({\tilde{\mathbb{E}}}_\mu (x_i-x_j)^2 = 1-\delta_{i,j}\) then \({\mathbb{E}}(Z_i-Z_j)^2 \geq 1-C\sqrt{\delta_{i,j}}\).

By concavity of \(\sqrt{\cdot}\), get \({\mathbb{E}}_{i\sim j} {\mathbb{E}}(Z_i-Z_j)^2 \geq 1 - C\sqrt{\delta}\). ∎
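The two steps of this proof (quadratic sampling, then threshold rounding) can be sketched end to end on the 5-cycle. The pentagon embedding and the moment matrix \(M_{ij} = \langle v_i,v_j\rangle/4 + 1/4\) are illustrative choices, made so that \(M_{ii} = 1/2\) and the covariance \(M - \tfrac14\) is PSD; they are not part of the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# 5-cycle with the "pentagon" embedding: unit vectors at angles 4*pi*i/5.
n = 5
edges = [(i, (i + 1) % n) for i in range(n)]
angles = 4 * np.pi * np.arange(n) / n
V = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # rows are unit vectors

def round_once():
    # Quadratic sampling: Y is Gaussian with E[Y_i] = 1/2, Cov(Y) = V V^T / 4,
    # i.e. E[Y_i Y_j] = <v_i, v_j>/4 + 1/4 and E[Y_i^2] = 1/2.
    Y = 0.5 + (V @ rng.standard_normal(2)) / 2
    # Threshold rounding: Z_i = 1[Y_i >= 1/2].
    Z = (Y >= 0.5).astype(int)
    return sum((Z[i] - Z[j]) ** 2 for i, j in edges) / len(edges)

# Each edge is cut with probability (angle between v_i, v_j)/pi = 4/5,
# so the rounded value concentrates around max f_{C_5} = 4/5.
avg = float(np.mean([round_once() for _ in range(20000)]))
```

On the 5-cycle the rounded value matches the true maximum, illustrating that the GW analysis is sharp on this instance.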

GW Thm 2 follows similarly.

The *square root* approximation in the GW Max Cut algorithm is related to many other settings:

- Cheeger’s inequality
- Relation between the \(\ell_1\) and \(\ell_2\) norms.
- *Pinsker’s inequality* relating the Kullback-Leibler divergence and total variation distance.
- The fact that \(1/\epsilon^2\) samples are required to detect an \(\epsilon\)-biased coin.
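The last item can be checked with exact binomial tails (the parameter \(\epsilon = 0.2\) and the error thresholds below are illustrative; the sum is done in log-space so the binomial coefficients never overflow):

```python
import math

# Exact binomial tails for distinguishing a fair coin from an eps-biased one.
def tail_below_half(N, p):
    # Pr[Binomial(N, p) <= N/2]: the biased coin looks "fair or worse".
    total = 0.0
    for k in range(N // 2 + 1):
        logterm = (math.lgamma(N + 1) - math.lgamma(k + 1)
                   - math.lgamma(N - k + 1)
                   + k * math.log(p) + (N - k) * math.log(1 - p))
        total += math.exp(logterm)
    return total

eps = 0.2
few = tail_below_half(round(1 / eps**2), 0.5 + eps)     # N = 25 flips
many = tail_below_half(round(100 / eps**2), 0.5 + eps)  # N = 2500 flips

assert few > 0.005   # ~1/eps^2 flips: the test still errs noticeably
assert many < 1e-6   # >> 1/eps^2 flips: the bias is unmistakable
```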

Big open question: Is this loss *inherent* in this setting?

Thm (Feige-Schechtman ’02): For every \(\delta > 0\) there exist a graph \(G\) and a \(2\)-pdist \(\mu\) s.t. \({\tilde{\mathbb{E}}}_\mu f_G = 1-\delta\) but \(\max f_G \leq 1 - C\sqrt{\delta}\).

Moreover, the constant \(C\) exactly matches the GW analysis.

(Otherwise one could use a \(\Theta(\tfrac{1}{\sqrt{\delta}})\)-long odd cycle.)
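The odd-cycle bound mentioned above can be checked by brute force for small \(L\); here \(\delta \sim 1/L^2\) is the scale at which the degree-\(2\) sos value separates from the true maximum \(1 - 1/L\):

```python
from itertools import product

# Brute-force max cut of the odd cycle C_L: the best assignment cuts all
# but one edge, so max f_{C_L} = 1 - 1/L.
def max_cut_fraction(L):
    edges = [(i, (i + 1) % L) for i in range(L)]
    return max(sum((x[i] - x[j]) ** 2 for i, j in edges)
               for x in product((0, 1), repeat=L)) / L

assert max_cut_fraction(5) == 4 / 5
assert max_cut_fraction(7) == 6 / 7
```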

Thm (Feige-Schechtman ’02): For every \(\delta > 0\) there exist a graph \(G\) and a \(2\)-pdist \(\mu\) s.t. \({\tilde{\mathbb{E}}}_\mu f_G = 1-\delta\) but \(\max f_G \leq 1 - C\sqrt{\delta}\).

Fix \(d=\mathrm{poly}(1/\delta)\) and let \(n \gg 2^d\) be a very large integer.

Pick \(v_1,\ldots,v_n \in {\mathbb{R}}^d\) to be random unit vectors, and define the graph \(G\) on \(n\) vertices such that \(i \sim j\) if \({\langle v_i,v_j \rangle} \leq -1+2\delta\).

**Claim 1:** If we define \({\tilde{\mathbb{E}}}x_ix_j = {\langle v_i,v_j \rangle}/2+1/2\) then **(i)** \({\tilde{\mathbb{E}}}f_G \geq 1-\delta\) and **(ii)** \({\tilde{\mathbb{E}}}f^2 \geq 0\) for every linear \(f\).

**Proof:** **(i)** is by inspection. For **(ii)** note that

\[
{\tilde{\mathbb{E}}}f(x)^2 = \sum_{i,j} f_if_j\,{\tilde{\mathbb{E}}}x_ix_j = \tfrac{1}{2}\Big(\sum_i f_i\Big)^2 + \tfrac{1}{2} {\langle w ,w \rangle} \geq 0
\]

where \(w = \sum_{i=1}^n f_iv_i\). ∎

**Claim 2:** The maximum cut of \(G\) is at most \(\Pr[ (Y-1/2)(Y'-1/2) \leq 0 ]+o(1)\) where \(Y,Y'\) are Gaussians satisfying \({\mathbb{E}}Y = {\mathbb{E}}Y' = {\mathbb{E}}Y^2 = {\mathbb{E}}{Y'}^2 = 1/2\)
and \({\mathbb{E}}(Y-Y')^2 = 1-\delta\).

Note that Claim 2 implies the theorem.

Claim 2 (restatement): Let \(v_1,\ldots,v_n \in {\mathbb{R}}^d\) and connect \(i\sim j\) if \({\langle v_i,v_j \rangle} \leq -1+2\delta\). Then, up to \(o(1)\), the maximum cut is obtained by a halfspace \(S = \{ i : \langle v_i, w \rangle > 0 \}\) for a unit vector \(w\).

Instantiation of *isoperimetry*:

**Classical isoperimetric inequality:** *Circles* are the shapes in the plane of given area that minimize the perimeter.

**Borell’s isoperimetric inequality for the sphere:** *Half spaces* are the subsets of the unit sphere in \({\mathbb{R}}^d\) of given measure that minimize the perimeter if we connect close vectors

\(\sim\) and maximize the number of cut edges if we connect nearly antipodal vectors.

Implies the claim. ∎

Given a universe \(U\) of objects (e.g., shapes, sets) and an objective \(\Psi:U\rightarrow {\mathbb{R}}\), we often want to show:

Optimality theorems: The objects that optimize \(\Psi\) are “nice” (e.g., circles in the classical isoperimetry case, halfspaces in Borell’s case, codewords in testing cases).

Stability theorems: If an object is *nearly optimal* for \(\Psi\) then it is close to a “nice” object (e.g., unique decoding).

Inverse theorems: If an object has a *better than random* value for \(\Psi\) then it is “somewhat correlated” with a “nice” object (e.g., *list* decoding).

Many examples in this course, TCS, math, and elsewhere.