Boaz Barak and David Steurer
UCSD winter school on Sum-of-Squares, January 2017
Input: \(d\)-regular graph \(G=(V,E)\) on \(n\) vertices.
Goal: Compute \(\max_{x\in\{0,1\}^n} f_G(x)\) where \(f_G(x) = \tfrac{1}{|E|}\sum_{\{i,j\}\in E} (x_i-x_j)^2\).
Can test for bipartiteness: \(\max f_G = 1\) iff \(G\) is bipartite, in which case we can efficiently find a maximizing assignment (two-color the graph).
Theorem (Erdős ’67): A uniformly random assignment \(x\) cuts each edge with probability \(1/2\), so \({\mathbb{E}} f_G(x) = 1/2\) and in particular \(\max f_G \geq 1/2\).
Can we beat this? “Combinatorial” algorithms can’t distinguish between a random \(d\)-regular graph and a random nearly bipartite \(d\)-regular graph.
Input: Graph \(G\).
Output: \(GW(G) := \max {\tilde{\mathbb{E}}}_\mu f_G\) over degree-\(2\) pseudo-distributions \(\mu\).
\(= \max \mathrm{Tr}(MG)\) over \(M \succeq 0\) s.t. \(M_{i,i}=1/2\) for all \(i\) (identifying \(G\) with a suitable matrix).
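A minimal sketch of computing \(GW(G)\) numerically, phrasing the degree-\(2\) pseudo-expectation as a moment matrix over the monomials \((1,x_1,\ldots,x_n)\); the cvxpy modeling library and the toy \(5\)-cycle instance are illustrative choices:

```python
import cvxpy as cp

# Toy instance: the 5-cycle (odd, so max f_G = 4/5 < 1).
n = 5
edges = [(i, (i + 1) % n) for i in range(n)]

# Moment matrix over the monomials (1, x_1, ..., x_n):
# N[0, 0] = ~E[1], N[0, i] = ~E[x_i], N[i, j] = ~E[x_i x_j].
N = cp.Variable((n + 1, n + 1), PSD=True)
constraints = [N[0, 0] == 1]
constraints += [N[i + 1, i + 1] == N[0, i + 1] for i in range(n)]  # x_i^2 = x_i

# ~E f_G = (1/|E|) * sum over edges of ~E[(x_i - x_j)^2]
obj = sum(
    N[i + 1, i + 1] + N[j + 1, j + 1] - 2 * N[i + 1, j + 1] for i, j in edges
) / len(edges)

prob = cp.Problem(cp.Maximize(obj), constraints)
prob.solve()
print("GW(G) =", prob.value)  # ≈ 0.9045 for the 5-cycle, vs. max f_G = 0.8
```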
Clearly \(GW(G) \geq \max f_G\).
GW Thm 1: If \(\max f_G = 1 - \epsilon\) then \(GW(G) \in [1-\epsilon,1-C\epsilon^2]\).
GW Thm 2: \(GW(G) \in [\max f_G , \max f_G / 0.878 ]\).
Proof of GW Thm 1: Suppose \({\mathbb{E}}_{i \sim j} {\tilde{\mathbb{E}}}_\mu (x_i-x_j)^2 \geq 1-\delta\).
Quadratic sampling lemma \(\Rightarrow\) we can sample a Gaussian \(Y\in {\mathbb{R}}^n\) s.t. \({\mathbb{E}}Y_iY_j = {\tilde{\mathbb{E}}}_\mu x_ix_j\) and \({\mathbb{E}}Y_i =1/2\).
Exercise: If \(Y,Y'\) are jointly Gaussian with \({\mathbb{E}}Y = {\mathbb{E}}Y' = {\mathbb{E}}Y^2 = {\mathbb{E}}{Y'}^2 = 1/2\) and \({\mathbb{E}}(Y-Y')^2 = 1 -\delta\), then \(\Pr[(Y-1/2)(Y'-1/2) \geq 0] \leq C\sqrt{\delta}\).
(Hint: the centered variables satisfy \(Y'-1/2 \sim -(1-2\delta)(Y-1/2) + \Theta(\sqrt{\delta})\,W\) for a Gaussian \(W\) independent of \(Y\).)
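A quick numerical sanity check of the exercise (a sketch; the correlation \(-(1-2\delta)\) between the centered variables follows from the moment conditions, since each has variance \(1/4\)):

```python
import numpy as np

rng = np.random.default_rng(0)
for delta in [0.1, 0.01, 0.001]:
    rho = -(1 - 2 * delta)  # correlation of Y - 1/2 and Y' - 1/2
    cov = 0.25 * np.array([[1, rho], [rho, 1]])  # both variances are 1/4
    U = rng.multivariate_normal([0, 0], cov, size=1_000_000)
    agree = np.mean(U[:, 0] * U[:, 1] >= 0)  # Pr[(Y-1/2)(Y'-1/2) >= 0]
    # Sheppard's formula gives the exact value 1/2 + arcsin(rho)/pi ≈ (2/pi)*sqrt(delta).
    print(delta, agree, 0.5 + np.arcsin(rho) / np.pi, (2 / np.pi) * np.sqrt(delta))
```

The empirical frequency matches \((2/\pi)\sqrt{\delta}\) closely, so the exercise in fact holds with \(C = 2/\pi + o(1)\).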
Corollary: If \(Z_i = \mathbb{I}_{Y_i \geq 1/2}\) and \({\tilde{\mathbb{E}}}_\mu (x_i-x_j)^2 = 1-\delta_{i,j}\) then \({\mathbb{E}}(Z_i-Z_j)^2 \geq 1-C\sqrt{\delta_{i,j}}\).
Since \({\mathbb{E}}_{i\sim j}\,\delta_{i,j} \leq \delta\), concavity of \(\sqrt{\cdot}\) (Jensen) gives \({\mathbb{E}}_{i\sim j} {\mathbb{E}}(Z_i-Z_j)^2 \geq 1 - C\sqrt{\delta}\). ∎
GW Thm 2 follows similarly.
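The proof translates directly into a rounding procedure; here is a hedged sketch (the function name is ours, and \(N\) is the \((n+1)\times(n+1)\) moment matrix from the SDP sketch above):

```python
import numpy as np

def round_pseudo_distribution(N, edges, trials=100, rng=None):
    """Quadratic sampling + threshold rounding, following the proof above.

    N is the (n+1) x (n+1) pseudo-moment matrix over (1, x_1, ..., x_n).
    """
    rng = rng or np.random.default_rng()
    mean = N[0, 1:]                             # ~E[x_i]
    cov = N[1:, 1:] - np.outer(mean, mean)      # PSD by the moment constraints
    best = 0.0
    for _ in range(trials):
        # Quadratic sampling: E[Y_i] = ~E[x_i] and E[Y_i Y_j] = ~E[x_i x_j].
        # (check_valid="ignore" tolerates tiny numerical negative eigenvalues.)
        Y = rng.multivariate_normal(mean, cov, check_valid="ignore")
        Z = (Y >= 0.5).astype(float)            # threshold rounding Z_i = 1[Y_i >= 1/2]
        best = max(best, sum((Z[i] - Z[j]) ** 2 for i, j in edges) / len(edges))
    return best
```

On the \(5\)-cycle instance above, `round_pseudo_distribution(N.value, edges)` typically recovers a cut of value \(0.8 = \max f_G\).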
The square-root approximation in the GW Max Cut algorithm is related to many other settings.
Big open question: Is this loss inherent in this setting?
Thm (Feige-Schechtman ’02): \(\forall \delta>0\), \(\exists\) \(G\) and a degree-\(2\) pseudo-distribution \(\mu\) s.t. \({\tilde{\mathbb{E}}}_\mu f_G = 1-\delta\) but \(\max f_G \leq 1 - C\sqrt{\delta}\).
Moreover, the constant \(C\) exactly matches the one in the GW analysis.
(A weaker version of this gap, without the matching constant, is already exhibited by an odd cycle of length \(\Theta(\tfrac{1}{\sqrt{\delta}})\).)
Thm (Feige-Schechtman ’02, restated): \(\forall \delta>0\), \(\exists\) \(G\) and a degree-\(2\) pseudo-distribution \(\mu\) s.t. \({\tilde{\mathbb{E}}}_\mu f_G = 1-\delta\) but \(\max f_G \leq 1 - C\sqrt{\delta}\).
Proof: Fix \(d=\mathrm{poly}(1/\delta)\) and let \(n \gg 2^d\) be a very large integer.
Pick \(v_1,\ldots,v_n \in {\mathbb{R}}^d\) to be random unit vectors, and define the graph \(G\) on \(n\) vertices such that \(i \sim j\) if \({\langle v_i,v_j \rangle} \leq -1+2\delta\).
Claim 1: If we define \({\tilde{\mathbb{E}}}\,x_i = 1/2\) and \({\tilde{\mathbb{E}}}\,x_ix_j = \tfrac{1}{4}(1+{\langle v_i,v_j \rangle})\) then (i) \({\tilde{\mathbb{E}}}f_G \geq 1-\delta\) and (ii) \({\tilde{\mathbb{E}}}f^2 \geq 0\) for every linear \(f\).
Proof: (i) is by inspection: on every edge, \({\tilde{\mathbb{E}}}(x_i-x_j)^2 = \tfrac{1}{2}(1-{\langle v_i,v_j \rangle}) \geq 1-\delta\). For (ii) note that for \(f = a_0 + \sum_i a_i x_i\) we have \({\tilde{\mathbb{E}}}f^2 = (a_0 + \tfrac{1}{2}\sum_i a_i)^2 + \tfrac{1}{4}\|\sum_i a_i v_i\|^2 \geq 0\).
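A numerical spot-check of the construction and of Claim 1 (a sketch; the parameters \(\delta, d, n\) are illustrative and far below the \(n \gg 2^d\) regime the proof calls for):

```python
import numpy as np

rng = np.random.default_rng(1)
delta, d, n = 0.05, 8, 2000  # illustrative only

# Random unit vectors in R^d.
V = rng.standard_normal((n, d))
V /= np.linalg.norm(V, axis=1, keepdims=True)

# Connect nearly antipodal pairs: <v_i, v_j> <= -1 + 2*delta.
G = V @ V.T
iu, ju = np.triu_indices(n, k=1)
mask = G[iu, ju] <= -1 + 2 * delta
edges = list(zip(iu[mask], ju[mask]))

# Claim 1(i): on every edge, ~E[(x_i - x_j)^2] = (1 - <v_i, v_j>)/2 >= 1 - delta.
print(len(edges), all((1 - G[i, j]) / 2 >= 1 - delta for i, j in edges))

# Claim 1(ii): the full moment matrix over (1, x_1, ..., x_n) is PSD,
# which is exactly the condition ~E[f^2] >= 0 for every linear f.
M = (1 + G) / 4                       # ~E[x_i x_j]
m = np.full((n, 1), 0.5)              # ~E[x_i]
N = np.block([[np.ones((1, 1)), m.T], [m, M]])
print(np.linalg.eigvalsh(N).min() >= -1e-7)
```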
Claim 2: The maximum cut of \(G\) is at most \(Pr[ (Y-1/2)(Y'-1/2) \leq 0 ]+o(1)\) where \(Y,Y'\) are Gaussians satisfying \({\mathbb{E}}Y = {\mathbb{E}}Y' = {\mathbb{E}}Y^2 = {\mathbb{E}}{Y'}^2 = 1/2\) and \({\mathbb{E}}(Y-Y')^2 = 1-\delta\).
Note that Claim 2 implies the theorem, since the Gaussian computation in the exercise is tight: \(\Pr[(Y-1/2)(Y'-1/2) \leq 0] = 1 - \Theta(\sqrt{\delta})\).
Claim 2 (restatement): Let \(v_1,\ldots,v_n \in {\mathbb{R}}^d\) be random unit vectors and connect \(i\sim j\) if \({\langle v_i,v_j \rangle} \leq -1+2\delta\). Then the maximum cut is obtained (up to \(o(1)\)) by a halfspace such as \(S = \{ i : (v_i)_1 \gt 0 \}\).
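Continuing the snippet above, we can evaluate the halfspace cut from Claim 2 and compare it to \(1 - C\sqrt{\delta}\) with the constant \(C = 2/\pi\) from the Gaussian computation (by symmetry of the random vectors, the fixed halfspace behaves like a random one):

```python
# Cut by the halfspace { v : v_1 > 0 } and evaluate f_G on that assignment.
x = (V[:, 0] > 0).astype(float)
cut = sum((x[i] - x[j]) ** 2 for i, j in edges) / len(edges)
print(cut, 1 - (2 / np.pi) * np.sqrt(delta))  # roughly 1 - C*sqrt(delta)
```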
The proof is an instantiation of isoperimetry:
Classical isoperimetric inequality: Circles are the shapes in the plane of a given area that minimize the perimeter.
Borell’s isoperimetric inequality for the unit sphere in \({\mathbb{R}}^d\): Halfspaces are the subsets of the sphere of a given measure that minimize the perimeter (the cut edges) if we connect close vectors, and, equivalently, maximize the cut edges if we connect nearly antipodal vectors.
Implies the claim. ∎
Given a universe \(U\) of objects (e.g., shapes, sets) and an objective \(\Psi:U\rightarrow {\mathbb{R}}\), we often want to show:
Optimality theorems: The objects that optimize \(\Psi\) are “nice” (e.g., circles in the classical isoperimetry case, halfspaces in Borell’s case, codewords in testing cases).
Stability theorems: If an object is nearly optimal for \(\Psi\) then it is close to a “nice” object (e.g., unique decoding).
Inverse theorems: If an object has a better-than-random value for \(\Psi\) then it is “somewhat correlated” with a “nice” object (e.g., list decoding).
Many examples in this course, TCS, math, and elsewhere.