Boaz Barak and David Steurer
UCSD winter school on Sum-of-Squares, January 2017
Input: \(d\)-regular graph \(G=(V,E)\) on \(n\) vertices.
Goal: Compute \(\max_{x\in\{0,1\}^n} f_G(x)\) where \(f_G(x) = \tfrac{1}{|E|}\sum_{\{i,j\}\in E} (x_i-x_j)^2\).
Can test for bipartiteness: \(\max f_G = 1\) iff \(G\) is bipartite, in which case we can efficiently find a maximizing assignment (two-color the graph).
Theorem (Erdős ’67): A uniformly random assignment \(x\) cuts each edge with probability \(1/2\), so \({\mathbb{E}} f_G(x) = 1/2\) and in particular \(\max f_G \geq 1/2\).
Can we beat this? “Combinatorial” algorithms can’t distinguish between a random \(d\)-regular graph and a random nearly bipartite \(d\)-regular graph.
Input: Graph \(G\).
Output: \(GW(G) := \max {\tilde{\mathbb{E}}}_\mu f_G\) over degree-\(2\) pseudo-distributions \(\mu\).
\(= \max \mathrm{Tr}(MG)\) over \(M \succeq 0\) s.t. \(M_{i,i}=1/2\) for all \(i\) (identifying \(G\) with a suitable matrix).
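A minimal sketch of computing \(GW(G)\) numerically, phrasing the degree-\(2\) pseudo-expectation as a moment matrix over the monomials \((1,x_1,\ldots,x_n)\); the cvxpy modeling library and the toy \(5\)-cycle instance are illustrative choices:

```python
import cvxpy as cp

# Toy instance: the 5-cycle (odd, so max f_G = 4/5 < 1).
n = 5
edges = [(i, (i + 1) % n) for i in range(n)]

# Moment matrix over the monomials (1, x_1, ..., x_n):
# N[0, 0] = ~E[1], N[0, i] = ~E[x_i], N[i, j] = ~E[x_i x_j].
N = cp.Variable((n + 1, n + 1), PSD=True)
constraints = [N[0, 0] == 1]
constraints += [N[i + 1, i + 1] == N[0, i + 1] for i in range(n)]  # x_i^2 = x_i

# ~E f_G = (1/|E|) * sum over edges of ~E[(x_i - x_j)^2]
obj = sum(
    N[i + 1, i + 1] + N[j + 1, j + 1] - 2 * N[i + 1, j + 1] for i, j in edges
) / len(edges)

prob = cp.Problem(cp.Maximize(obj), constraints)
prob.solve()
print("GW(G) =", prob.value)  # ≈ 0.9045 for the 5-cycle, vs. max f_G = 0.8
```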
Clearly \(GW(G) \geq \max f_G\).
GW Thm 1: If \(\max f_G = 1 - \epsilon\) then \(GW(G) \in [1-\epsilon,1-C\epsilon^2]\).
GW Thm 2: \(GW(G) \in [\max f_G , \max f_G / 0.878 ]\).
Proof of GW Thm 1: Suppose \({\mathbb{E}}_{i \sim j} {\tilde{\mathbb{E}}}_\mu (x_i-x_j)^2 \geq 1-\delta\).
Quadratic sampling lemma \(\Rightarrow\) we can sample a Gaussian \(Y\in {\mathbb{R}}^n\) s.t. \({\mathbb{E}}Y_iY_j = {\tilde{\mathbb{E}}}_\mu x_ix_j\) and \({\mathbb{E}}Y_i =1/2\).
Exercise: If \(Y,Y'\) are jointly Gaussian with \({\mathbb{E}}Y = {\mathbb{E}}Y' = {\mathbb{E}}Y^2 = {\mathbb{E}}{Y'}^2 = 1/2\) and \({\mathbb{E}}(Y-Y')^2 = 1 -\delta\), then \(\Pr[(Y-1/2)(Y'-1/2) \geq 0] \leq C\sqrt{\delta}\).
(Hint: the centered variables satisfy \(Y'-1/2 \sim -(1-2\delta)(Y-1/2) + \Theta(\sqrt{\delta})\,W\) for a Gaussian \(W\) independent of \(Y\).)
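A quick numerical sanity check of the exercise (a sketch; the correlation \(-(1-2\delta)\) between the centered variables follows from the moment conditions, since each has variance \(1/4\)):

```python
import numpy as np

rng = np.random.default_rng(0)
for delta in [0.1, 0.01, 0.001]:
    rho = -(1 - 2 * delta)  # correlation of Y - 1/2 and Y' - 1/2
    cov = 0.25 * np.array([[1, rho], [rho, 1]])  # both variances are 1/4
    U = rng.multivariate_normal([0, 0], cov, size=1_000_000)
    agree = np.mean(U[:, 0] * U[:, 1] >= 0)  # Pr[(Y-1/2)(Y'-1/2) >= 0]
    # Sheppard's formula gives the exact value 1/2 + arcsin(rho)/pi ≈ (2/pi)*sqrt(delta).
    print(delta, agree, 0.5 + np.arcsin(rho) / np.pi, (2 / np.pi) * np.sqrt(delta))
```

The empirical frequency matches \((2/\pi)\sqrt{\delta}\) closely, so the exercise in fact holds with \(C = 2/\pi + o(1)\).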
Corollary: If \(Z_i = \mathbb{I}_{Y_i \geq 1/2}\) and \({\tilde{\mathbb{E}}}_\mu (x_i-x_j)^2 = 1-\delta_{i,j}\) then \({\mathbb{E}}(Z_i-Z_j)^2 \geq 1-C\sqrt{\delta_{i,j}}\).
Since \({\mathbb{E}}_{i\sim j}\,\delta_{i,j} \leq \delta\), concavity of \(\sqrt{\cdot}\) (Jensen) gives \({\mathbb{E}}_{i\sim j} {\mathbb{E}}(Z_i-Z_j)^2 \geq 1 - C\sqrt{\delta}\). ∎
GW Thm 2 follows similarly.
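The proof translates directly into a rounding procedure; here is a hedged sketch (the function name is ours, and \(N\) is the \((n+1)\times(n+1)\) moment matrix from the SDP sketch above):

```python
import numpy as np

def round_pseudo_distribution(N, edges, trials=100, rng=None):
    """Quadratic sampling + threshold rounding, following the proof above.

    N is the (n+1) x (n+1) pseudo-moment matrix over (1, x_1, ..., x_n).
    """
    rng = rng or np.random.default_rng()
    mean = N[0, 1:]                             # ~E[x_i]
    cov = N[1:, 1:] - np.outer(mean, mean)      # PSD by the moment constraints
    best = 0.0
    for _ in range(trials):
        # Quadratic sampling: E[Y_i] = ~E[x_i] and E[Y_i Y_j] = ~E[x_i x_j].
        # (check_valid="ignore" tolerates tiny numerical negative eigenvalues.)
        Y = rng.multivariate_normal(mean, cov, check_valid="ignore")
        Z = (Y >= 0.5).astype(float)            # threshold rounding Z_i = 1[Y_i >= 1/2]
        best = max(best, sum((Z[i] - Z[j]) ** 2 for i, j in edges) / len(edges))
    return best
```

On the \(5\)-cycle instance above, `round_pseudo_distribution(N.value, edges)` typically recovers a cut of value \(0.8 = \max f_G\).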
The square-root approximation in the GW Max Cut algorithm is related to many other settings.
Big open question: Is this loss inherent in this setting?
Thm (Feige-Schechtman ’02): \(\forall \delta>0\), \(\exists\) \(G\) and a degree-\(2\) pseudo-distribution \(\mu\) s.t. \({\tilde{\mathbb{E}}}_\mu f_G = 1-\delta\) but \(\max f_G \leq 1 - C\sqrt{\delta}\).
Moreover, the constant \(C\) exactly matches the one in the GW analysis.
(A weaker version of this gap, without the matching constant, is already exhibited by an odd cycle of length \(\Theta(\tfrac{1}{\sqrt{\delta}})\).)
Thm (Feige-Schechtman ’02, restated): \(\forall \delta>0\), \(\exists\) \(G\) and a degree-\(2\) pseudo-distribution \(\mu\) s.t. \({\tilde{\mathbb{E}}}_\mu f_G = 1-\delta\) but \(\max f_G \leq 1 - C\sqrt{\delta}\).
Proof: Fix \(d=\mathrm{poly}(1/\delta)\) and let \(n \gg 2^d\) be a very large integer.
Pick \(v_1,\ldots,v_n \in {\mathbb{R}}^d\) to be random unit vectors, and define the graph \(G\) on \(n\) vertices such that \(i \sim j\) if \({\langle v_i,v_j \rangle} \leq -1+2\delta\).
Claim 1: If we define \({\tilde{\mathbb{E}}}\,x_i = 1/2\) and \({\tilde{\mathbb{E}}}\,x_ix_j = \tfrac{1}{4}(1+{\langle v_i,v_j \rangle})\) then (i) \({\tilde{\mathbb{E}}}f_G \geq 1-\delta\) and (ii) \({\tilde{\mathbb{E}}}f^2 \geq 0\) for every linear \(f\).
Proof: (i) is by inspection: on every edge, \({\tilde{\mathbb{E}}}(x_i-x_j)^2 = \tfrac{1}{2}(1-{\langle v_i,v_j \rangle}) \geq 1-\delta\). For (ii) note that for \(f = a_0 + \sum_i a_i x_i\) we have \({\tilde{\mathbb{E}}}f^2 = (a_0 + \tfrac{1}{2}\sum_i a_i)^2 + \tfrac{1}{4}\|\sum_i a_i v_i\|^2 \geq 0\).
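A numerical spot-check of the construction and of Claim 1 (a sketch; the parameters \(\delta, d, n\) are illustrative and far below the \(n \gg 2^d\) regime the proof calls for):

```python
import numpy as np

rng = np.random.default_rng(1)
delta, d, n = 0.05, 8, 2000  # illustrative only

# Random unit vectors in R^d.
V = rng.standard_normal((n, d))
V /= np.linalg.norm(V, axis=1, keepdims=True)

# Connect nearly antipodal pairs: <v_i, v_j> <= -1 + 2*delta.
G = V @ V.T
iu, ju = np.triu_indices(n, k=1)
mask = G[iu, ju] <= -1 + 2 * delta
edges = list(zip(iu[mask], ju[mask]))

# Claim 1(i): on every edge, ~E[(x_i - x_j)^2] = (1 - <v_i, v_j>)/2 >= 1 - delta.
print(len(edges), all((1 - G[i, j]) / 2 >= 1 - delta for i, j in edges))

# Claim 1(ii): the full moment matrix over (1, x_1, ..., x_n) is PSD,
# which is exactly the condition ~E[f^2] >= 0 for every linear f.
M = (1 + G) / 4                       # ~E[x_i x_j]
m = np.full((n, 1), 0.5)              # ~E[x_i]
N = np.block([[np.ones((1, 1)), m.T], [m, M]])
print(np.linalg.eigvalsh(N).min() >= -1e-7)
```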
Claim 2: The maximum cut of \(G\) is at most \(Pr[ (Y-1/2)(Y'-1/2) \leq 0 ]+o(1)\) where \(Y,Y'\) are Gaussians satisfying \({\mathbb{E}}Y = {\mathbb{E}}Y' = {\mathbb{E}}Y^2 = {\mathbb{E}}{Y'}^2 = 1/2\) and \({\mathbb{E}}(Y-Y')^2 = 1-\delta\).
Note that Claim 2 implies the theorem, since the Gaussian computation in the exercise is tight: \(\Pr[(Y-1/2)(Y'-1/2) \leq 0] = 1 - \Theta(\sqrt{\delta})\).
Claim 2 (restatement): Let \(v_1,\ldots,v_n \in {\mathbb{R}}^d\) be random unit vectors and connect \(i\sim j\) if \({\langle v_i,v_j \rangle} \leq -1+2\delta\). Then the maximum cut is obtained (up to \(o(1)\)) by a halfspace such as \(S = \{ i : (v_i)_1 \gt 0 \}\).
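Continuing the snippet above, we can evaluate the halfspace cut from Claim 2 and compare it to \(1 - C\sqrt{\delta}\) with the constant \(C = 2/\pi\) from the Gaussian computation (by symmetry of the random vectors, the fixed halfspace behaves like a random one):

```python
# Cut by the halfspace { v : v_1 > 0 } and evaluate f_G on that assignment.
x = (V[:, 0] > 0).astype(float)
cut = sum((x[i] - x[j]) ** 2 for i, j in edges) / len(edges)
print(cut, 1 - (2 / np.pi) * np.sqrt(delta))  # roughly 1 - C*sqrt(delta)
```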
The proof is an instantiation of isoperimetry:
Classical isoperimetric inequality: Circles are the shapes in the plane of a given area that minimize the perimeter.
Borell’s isoperimetric inequality for the unit sphere in \({\mathbb{R}}^d\): Halfspaces are the subsets of the sphere of a given measure that minimize the perimeter (the cut edges) if we connect close vectors, and, equivalently, maximize the cut edges if we connect nearly antipodal vectors.
Implies the claim. ∎
Given a universe \(U\) of objects (e.g., shapes, sets) and an objective \(\Psi:U\rightarrow {\mathbb{R}}\), we often want to show:
Optimality theorems: The objects that optimize \(\Psi\) are “nice” (e.g., circles in the classical isoperimetry case, halfspaces in Borell’s case, codewords in testing cases).
Stability theorems: If an object is nearly optimal for \(\Psi\) then it is close to a “nice” object (e.g., unique decoding).
Inverse theorems: If an object has a better-than-random value for \(\Psi\) then it is “somewhat correlated” with a “nice” object (e.g., list decoding).
Many examples in this course, TCS, math, and elsewhere.