Lower bound for planted clique problem

Boaz Barak and David Steurer

UCSD winter school on Sum-of-Squares, January 2017

Computational Bayesian probabilities.

Last time: Grigoriev pdist for 3XOR: \({\tilde{\mathbb{E}}}\chi_S \in \{ \pm 1 , 1/2 \}\).

Is there zero/one/half law for computational probabilities?
(Either prove, refute or random?)

\({\mathbb{P}}[ \color{red}{ P \neq NP} ] = 1/2\)?

\({\mathbb{P}}[ \color{red}{P\neq NP \;|\; 3SAT \not\in Dtime(n^2)} ] = 1/2\)?

Today: Natural example of pdist where \({\tilde{\mathbb{E}}}\chi_S \not\in \{ \pm 1, 1/2 \}\).

As degree \(d\) grows, expectations tend to \(\pm 1\).

Planted clique problem

Distinguish between
* \(G \sim G(n,1/2)\): each edge w.p. \(1/2\)
* \(G' \sim G(n,1/2,k)\): above + planted random \(k\)-clique

\({\mathbb{E}}|E_{G'}| = {\mathbb{E}}|E_G| + \Theta(k^2)\)

Cor: If \(k^2 \gg \sqrt{n^2}\) can distinguish w.h.p.

Can we do better?

If guess \(d\) vertices, can look at \(n/2^d\) sized common neighborhood

Best known tradeoff: \(k=O(\sqrt{n/2^d})\) in \(n^{O(d)}\) time.

SOS and planted clique

Distinguish \(G\sim G(n,1/2)\) from \(G'\sim G(n,1/2,k)\).

Thm: (B-Hopkins-Kelner-Kothari-Moitra-Potechin) If \(d=O(1)\) and \(k \lt n^{1/2-\epsilon}\), w.h.p. over \(G\), \(\exists\) \(d\)-pdist \(\mu\) over \({\{0,1\}}^n\) s.t. \({\tilde{\mathbb{E}}}_\mu |x| =k\) but \({\tilde{\mathbb{E}}}_\mu x_ix_j = 0\) for all \((i,j)\not\in E_G\).

Improving on (Meka-Potechin-Wigderson’14),(Deshpande-Montanari’15),(Hopkins-Kothari-Potechin-Raghavendra-Schramm’16)

Proof: If problem hard, same pdist for \(G\) and \(G'\).

Need to come up with bets for \({\mathbb{E}}_{G'} x_A = {\mathbb{P}}_{G'}[ A \subseteq CLIQUE ]\)

1st attempt: Naive pdist
\({\tilde{\mathbb{E}}}x_A = 0\) if \(A\) not a clique.
\({\tilde{\mathbb{E}}}x_A = c\left(\tfrac{k}{n}\right)^{|A|}\) otherwise.

\(c=c(|A|)\) chosen s.t. \({\tilde{\mathbb{E}}}(\sum x_i)^t = k^t\) for \(t=1..d\)

Naive pdist is “Bayesianially incorrect”

Naive pdist: \({\tilde{\mathbb{E}}}x_A = 0\) if \(A\) not clique, \({\tilde{\mathbb{E}}}x_A = c\left(\tfrac{k}{n}\right)^{|A|}\) o/w

Suppose \(k = \epsilon\sqrt{n}\) and \(deg(1)=\tfrac{n}{2}+t\sqrt{n}\) for \(t\gg 1\).

Naive bet: \({\mathbb{P}}[ 1 \in CLIQUE ] = {\tilde{\mathbb{E}}}x_1 = \tfrac{k}{n}\).

Bayes condition: \({\mathbb{P}}[ 1 \in CLIQUE | deg(1) = \tfrac{n}{2} + t\sqrt{n}] \sim \tfrac{k}{n}e^{\epsilon t}\)

\(x_1=0 \color{red}{\Rightarrow} deg(1) \sim N(n/2,n/2)\)
\(x_1=1 \color{red}{\Rightarrow} deg(1) \sim N(n/2 + \epsilon\sqrt{k},n/2)\)

\({\mathbb{P}}[ deg(1) = \tfrac{n}{2}+t\sqrt{n} | x_1 = 0] \sim \exp(-t^2)\)
\({\mathbb{P}}[ deg(1) = \tfrac{n}{2}+t\sqrt{n} | x_1 = 0] \sim \exp(-(t-\epsilon)^2)\)

ratio is \(\exp(2\epsilon t)\)!

Is Bayesian incorrect \(=\) non SOS?

NO!

Lemma: If \(k\lt\sqrt{n}/10\) then w.h.p. over \(G\sim G(n,1/2)\) Naive pdist is valid degree \(2\) pdist.

Proof: For naive \({\tilde{\mathbb{E}}}x_i^2 = {\tilde{\mathbb{E}}}x_i = \tfrac{n}{k}\) for all \(i\), \({\tilde{\mathbb{E}}}x_ix_j\) eqs \(0\) if \((i,j) \not\in E_G\) and \(2\tfrac{k^2}{n^2}\) o/w, and so:

\[ \left( {\tilde{\mathbb{E}}}x_i x_j \right)_{i,j} = \tfrac{k}{n}I + 2\tfrac{k^2}{n^2}A = \tfrac{k}{n}(I + \tfrac{k}{n}A) \]

where \(A=A(G)\) is adjacency matrix.

Theorem: w.h.p. \(\| A_G \|_{spectral} \leq 2\sqrt{n}\).

Deg 4 helps

In contrast:

Lemma: If \(k \gg n^{1/3}\) then w.h.p. over \(G\sim G(n,1/2)\) Naive pdist is not valid degree \(4\) pdist.

Naive deg 4 is not pdist

Proof: Let \(r(x)= \sum R_{1,i}x_i\) where \(R_{i,i}=0\) and \(R_{i,j}\) eqs \(+1\) if \(i\sim j\) and \(-1\) otherwise, and let \(p(x)=(x_1-\alpha r(x)^2)\).

We will show \({\mathbb{E}}_G {\tilde{\mathbb{E}}}p^2 \lt 0\).

\[{\tilde{\mathbb{E}}}p(x)^2 = \color{purple}{{\tilde{\mathbb{E}}}x_1^2} - 2\alpha \color{red}{{\tilde{\mathbb{E}}}x_1 r(x)^2} + \alpha^2\color{green}{{\tilde{\mathbb{E}}}r(x)^4}\]

Claim 0: \(\color{purple}{{\tilde{\mathbb{E}}}_{\mu(G)} x_1^1} = \tfrac{k}{n}\)

Claim 1: \(\color{red}{{\tilde{\mathbb{E}}}x_1 r(x)^2} \gtrsim \tfrac{k^3}{n}\)

Proof: \({\tilde{\mathbb{E}}}x_1 r(x)^2 = \sum_{i,j} R_{1,i}R_{1,j} {\tilde{\mathbb{E}}}x_1x_ix_j \sim n^2\tfrac{k^3}{n^3} = \tfrac{k^3}{n}\)

Since \({\tilde{\mathbb{E}}}x_1x_ix_j \neq 0 \;\Rightarrow\; R_{1,i}=R_{1,j}=1\) ∎

Naive deg 4 is not pdist

Proof: Let \(r(x)= \sum R_{1,i}x_i\) where \(R_{i,i}=0\) and \(R_{i,j}\) eqs \(+1\) if \(i\sim j\) and \(-1\) otherwise, and let \(p(x)=(x_1-\alpha r(x)^2)\).

\[{\tilde{\mathbb{E}}}p(x)^2 = \color{purple}{{\tilde{\mathbb{E}}}x_1^2} - 2\alpha \color{red}{{\tilde{\mathbb{E}}}x_1 r(x)^2} + \alpha^2\color{green}{{\tilde{\mathbb{E}}}r(x)^4}\]

Claim 0: \(\color{purple}{{\tilde{\mathbb{E}}}_{\mu(G)} x_1^1} = \tfrac{k}{n}\)

Claim 1: \(\color{red}{{\tilde{\mathbb{E}}}x_1 r(x)^2} \gtrsim \tfrac{k^3}{n}\)

Claim 2: \({\mathbb{E}}_G \color{green}{{\tilde{\mathbb{E}}}_{\mu (G)} r(x)^4} \lesssim k^2\)

Proof: \({\mathbb{E}}_G {\tilde{\mathbb{E}}}_{\mu(G)} r(x)^4 = {\mathbb{E}}_G \sum_{i_1,i_2,i_3,i_4\gt1} R_{1,i_1}R_{1,i_2} R_{1,i_3} R_{1,i_4} {\tilde{\mathbb{E}}}_{\mu(G)} x_{i_1}x_{i_2}x_{i_3}x_{i_4}\)

Key observation: Event \(\{i_1,i_2,i_3,i_4\}\) is clique is independent of \(R_{1,i_1},\ldots,R_{1,i_4}\).

\(\Rightarrow {\mathbb{E}}_G {\tilde{\mathbb{E}}}_{mu(G)} r(x)^4 \sim \sum_{i,j} {\tilde{\mathbb{E}}}_{mu(G)} x_i^2x_j^2 = \sim n^2\tfrac{k^2}{n^2} = k^2 \; ∎\)

Putting it all together

Proof: Let \(r(x)= \sum R_{1,i}x_i\) where \(R_{i,i}=0\) and \(R_{i,j}\) eqs \(+1\) if \(i\sim j\) and \(-1\) otherwise, and let \(p(x)=(x_1-\alpha r(x)^2)\).

\[{\tilde{\mathbb{E}}}p(x)^2 = \color{purple}{{\tilde{\mathbb{E}}}x_1^2} - 2\alpha \color{red}{{\tilde{\mathbb{E}}}x_1 r(x)^2} + \alpha^2\color{green}{{\tilde{\mathbb{E}}}r(x)^4}\]

Claim 0: \(\color{purple}{{\tilde{\mathbb{E}}}_{\mu(G)} x_1^1} = \tfrac{k}{n}\)

Claim 1: \(\color{red}{{\tilde{\mathbb{E}}}x_1 r(x)^2} \gtrsim \tfrac{k^3}{n}\)

Claim 2: \({\mathbb{E}}_G \color{green}{{\tilde{\mathbb{E}}}_{\mu (G)} r(x)^4} \lesssim k^2\)

To ensure \({\mathbb{E}}_G {\tilde{\mathbb{E}}}_{\mu(G)} p^2 \lt0\), want \(\alpha \color{red}{\tfrac{k^3}{n}} \gg \color{purple}{\tfrac{k}{n}} + \alpha^2\color{green}{k^2}\)

Set \(\alpha = 10/k^2\) and then need \(\tfrac{10}{k^2}\color{red}{\tfrac{k^3}{n}} \gg \tfrac{10^2}{k^4}\color{green}{k^2}\). . . .

Works whenever \(k^{1/3} \gg n\).

Recap

Naive pdist is Bayesianally incorrect and not valid for deg \(\geq 4\).

Problem: Only “Bayeisanally correct” dist is actual clique.

Our approach: Match only “simple” correlations.

Construct pdist \(\mu\) such that if \(F:{\mathbb{R}}^{n^2}\rightarrow {\mathbb{R}}\) is low degree in the graph and \(p:{\mathbb{R}}^n\rightarrow{\mathbb{R}}\) is low degree in the clique then:

\[ {\mathbb{E}}_{(G',x)\sim G(n,1/2,k)} F(G)p(x) = {\mathbb{E}}_G F(G) {\tilde{\mathbb{E}}}_{\mu (G)} p(x) \;\;(*) \]

Example: \(F(G) = deg_G(1)\), \(p(x)=x_1\).

Call this condition pseudo calibration

Pseudo calibrated pdist

\(\mu\) needs to satisfy:

\[ {\mathbb{E}}_{(G',x)\sim G(n,1/2,k)} F(G)p(x) = {\mathbb{E}}_G F(G) {\tilde{\mathbb{E}}}_{\mu (G)} p(x) \;\;(*) \]

for every edge-poly \(F\) of degree \(\leq \tau\) and vertex-poly \(p\) of deg \(\leq d\).

For every \(F,p\) we get a linear constraint on the map \(\mu:{\mathbb{R}}^{n^2+n^d}\rightarrow R\) mapping \(G,p\) to \({\tilde{\mathbb{E}}}_{\mu(G)} p\).

If we restrict \(\mu\) to degree at most \(\tau\), then have \(n^{\tau +d}\) constraints on \(n^{\tau + d}\) degrees of freedom.

\(\Rightarrow\) pseudo-distribution is fully determined!

Pseudo calibrated pdist

We use unique \(\mu\) s.t., map \(G \mapsto \mu(G)\) has edge-degree \(\leq \tau\) and

\[ {\mathbb{E}}_{(G',x)\sim G(n,1/2,k)} F(G)p(x) = {\mathbb{E}}_G F(G) {\tilde{\mathbb{E}}}_{\mu (G)} p(x) \;\;(*) \]

for every edge-poly \(F\) of degree \(\leq \tau\) and vertex-poly \(p\) of deg \(\leq d\).

Why restrict degree of \(\mu\)?
1. So it is efficiently computable.
2. Low degree map = higher entropy. (max entropy principle)
3. Because it works.

Hard part: Show that \(\mu\) is a valid \(d\)-pdist.

Proving psd’ness

Compute \(\mu\) based on pseudo-calibration

Need: \(M={\tilde{\mathbb{E}}}_\mu x^{\otimes d}\) is p.s.d

Naive approach: Show that \({\mathbb{E}}M \succeq \lambda I\), w.h.p \({\| M \|}_{spectral} \preceq \lambda\).

Not true.

Our approach: “Symbolic diagonalization” of \(M\).

Summary

SOS can’t do much better than \(\sqrt{n}\) for planted clique.

New “creativity free” approach to designing pseudo-distributions.

Open question: “Creativity free” approach for analyzing pseudo-distributions? Universal rounding algorithm?