Fact 2.6
Conditioning decreases entropy: for any two random variables A and B, H(A | B) ≤ H(A).
Another measure we will use (briefly) in our proof is that of mutual information, which informally captures the correlation between two random variables.
Definition 2.14
Mutual information and conditional mutual information. The mutual information between two random variables A and B, denoted by I(A; B), is defined as
I(A; B) ≜ H(A) − H(A | B) = H(B) − H(B | A).
The conditional mutual information between A and B given C, denoted by I(A; B | C), is defined as
I(A; B | C) ≜ H(A | C) − H(A | B, C) = H(B | C) − H(B | A, C).
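For example, if A and B are independent then I(A; B) = 0, whereas if A = B is a uniformly random bit then I(A; B) = H(A) = 1; in this sense mutual information quantifies the correlation alluded to above.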
The following is a well-known fact on mutual information.
Fact 2.7
Data processing inequality. Suppose we have the following Markov chain:
X → Y → Z,
where X ⊥ Z | Y. Then I(X; Y) ≥ I(X; Z) or, equivalently, H(X|Y) ≤ H(X|Z).
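One way to see this is via the chain rule for mutual information: expanding I(X; Y, Z) in two ways gives
I(X; Z) + I(X; Y | Z) = I(X; Y, Z) = I(X; Y) + I(X; Z | Y).
The Markov property X ⊥ Z | Y means I(X; Z | Y) = 0, so I(X; Y) = I(X; Z) + I(X; Y | Z) ≥ I(X; Z) by the non-negativity of conditional mutual information.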
Mutual information is related to the Kullback-Leibler divergence (also known as relative entropy).
Definition 2.15
Kullback-Leibler divergence. Given two probability distributions μ1 and μ2 on the same sample space Ω such that (∀ω ∈ Ω)(μ2(ω) = 0 ⇒ μ1(ω) = 0), the Kullback-Leibler divergence between them is defined as
D_KL(μ1 ‖ μ2) ≜ Σ_{ω∈Ω} μ1(ω) log (μ1(ω) / μ2(ω)).
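For example, if μ1 and μ2 are Bernoulli distributions with parameters p and q, respectively (with q ∈ (0, 1)), the definition specializes to
D_KL(μ1 ‖ μ2) = p log (p/q) + (1 − p) log ((1 − p)/(1 − q)),
with the usual convention that 0 log 0 = 0.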
The connection between the mutual information and the Kullback-Leibler divergence is provided by the following fact.
Fact 2.8
For random variables A, B, and C we have
I(A; B | C) = E_{(b,c)∼(B,C)} [ D_KL( (A | B = b, C = c) ‖ (A | C = c) ) ],
where (A | ·) denotes the distribution of A conditioned on the corresponding event.
2.7 Useful Lemmata
2.7.1 Concentration
Lemma 2.2
Chernoff bound. Let X1, …, Xn be independent and identically distributed (i.i.d.) random variables taking values in {0, 1} and let p be the probability that Xi = 1; then, for any δ > 0, we have
Pr[ | Σ_{i∈[n]} Xi − pn | > δpn ] ≤ 2e^{−δ²pn/(2+δ)}.
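For instance, setting δ = 1 shows that the probability that Σ_{i∈[n]} Xi deviates from its expectation pn by more than pn is at most 2e^{−pn/3}; more generally, for constant δ the bound is exponentially small in pn.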
2.7.2 Pseudorandomness
Theorem 2.4
k-wise independence Chernoff bound [Schmidt et al. 1995 (Theorem 5.I)]. Let x1, …, xn ∈ [0, 1] be k-wise independent random variables, and let X ≜ Σ_{i∈[n]} xi and μ ≜ E[X]. Then, for any δ ∈ (0, 1],
Pr[ |X − μ| ≥ δμ ] ≤ e^{−Ω(min{k, δ²μ})}.
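In particular, for constant δ and k = Θ(log n), the deviation probability is inverse-polynomial in n whenever μ = Ω(log n); this is the regime relevant to the derandomized partition of Section 2.7.4.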
2.7.3 λ-Biased Sets
Definition 2.16
λ-biased sets. Let G be a finite field, and t > 0 an integer. A multiset S ⊆ G^t is λ-biased if for every non-trivial character χ of G^t,
| E_{s∈S} [χ(s)] | ≤ λ.
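For intuition, when G = F2 the non-trivial characters of G^t are χ_y(x) = (−1)^⟨x,y⟩ for y ≠ 0, and the definition specializes to the familiar notion of a small-bias set over {0, 1}^t: for every y ≠ 0 we have |E_{s∈S}[(−1)^⟨s,y⟩]| ≤ λ, i.e., the inner product of a random element of S with y is nearly unbiased.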
Lemma 2.3
[Azar et al. 1998 (Theorem 3.2)]. A randomly chosen multiset S ⊆ G^t of cardinality Θ(t log |G| / λ²) is λ-biased with high probability.
For many applications, an explicit construction is necessary. In our case, however, we can enumerate over all sets S of sufficient cardinality in quasi-polynomial time. The following Sampling Lemma, due to Ben-Sasson et al. [2003], allows us to estimate the average of any function over G^t using only one line and (1 + o(1)) log₂ |G^t| bits of randomness:
Lemma 2.4
Sampling Lemma [Ben-Sasson et al. 2003 (Lemma 4.3)]. Let B : G^t → [0, 1]. Then, for any ϵ > 0,
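To get intuition for why a single line is a good sampler, consider first a truly uniform line, i.e., x and y drawn uniformly and independently from G^t. For any i ≠ j ∈ G, the pair (x + i·y, x + j·y) is uniform over G^t × G^t, so the |G| points of the line are pairwise independent samples; by Chebyshev's inequality, the empirical average of B over the line deviates from E[B] by more than ϵ with probability at most 1/(4ϵ²|G|). The randomness saving comes from choosing the line more frugally: roughly, only the base point needs to be uniform in G^t (costing log₂ |G^t| bits), while its direction can be drawn from a λ-biased set S as above, costing only log₂ |S| = O(log t + log log |G| + log(1/λ)) additional bits, which is o(log₂ |G^t|) whenever log(1/λ) = o(t log |G|).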
2.7.4 Partitions
Given a 2CSP formula, we provide a few techniques to deterministically partition its n variables into approximately √n subsets of approximately √n variables each.
Greedy Partition
Lemma 2.5
Let G = (V, E) be a d-regular graph and n ≜ |V|. We can partition V into n/k disjoint subsets {S1, …, Sn/k} of size at most 2k such that the number of edges between every pair of subsets satisfies
|E ∩ (Si × Sj)| = O(d²k²/n)  for all i ≠ j ∈ [n/k].   (2.1)
Proof
We assign vertices to subsets iteratively, and show by induction that we can always maintain (2.1) and the bound on the subset sizes. Since the average subset size is less than k, we have by Markov's inequality that at each step less than half of the subsets are full (i.e., already contain 2k vertices). The next vertex we want to assign, v, has neighbors in at most d subsets. By our induction hypothesis, each Si is of size at most 2k, so on average over j ∈ [n/k] it has less than 4dk²/n neighbors in each Sj. Applying Markov's inequality again, Si has at least 8d²k²/n neighbors in less than a (1/2d)-fraction of the subsets Sj. In total, we ruled out less than half of the subsets for being full, and less than half of the subsets for having too many edges to some subset that contains a neighbor of v. Therefore there always exists some subset Si to which we can add v while maintaining the induction hypothesis.
■
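The proof is constructive and translates directly into a greedy procedure. The following Python sketch mirrors it; the data structures, function name, and the exact capacity and edge thresholds are our own illustrative choices following the constants in the proof.

def greedy_partition(n, k, neighbors):
    """Greedily partition the vertices 0, ..., n-1 of a d-regular graph into
    n/k subsets of size at most 2k, mirroring the proof of Lemma 2.5.
    neighbors[v] is the list of v's neighbors."""
    d = len(neighbors[0])                    # regularity of the graph
    m = n // k                               # number of subsets
    cap = 2 * k                              # maximum subset size
    edge_cap = 8 * d * d * k * k / n         # per-pair edge threshold from the proof
    part = [-1] * n                          # part[v] = index of the subset containing v
    size = [0] * m                           # current subset sizes
    cross = [[0] * m for _ in range(m)]      # cross[i][j] = edges seen so far between S_i and S_j

    for v in range(n):
        # subsets that already contain a neighbor of v (at most d of them)
        nbr_parts = {part[u] for u in neighbors[v] if part[u] != -1}
        # pick a subset that is not full and has few edges to each of those
        # subsets; the proof of Lemma 2.5 shows such a subset always exists
        target = next(i for i in range(m)
                      if size[i] < cap
                      and all(cross[i][j] < edge_cap for j in nbr_parts))
        part[v] = target
        size[target] += 1
        for u in neighbors[v]:
            if part[u] != -1 and part[u] != target:
                cross[target][part[u]] += 1
                cross[part[u]][target] += 1
    return part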
Derandomized Partition
We use the Chernoff bound for Θ(log n)-wise independent variables (Theorem 2.4) to deterministically partition the variables into subsets of cardinality ≈ √n.
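For concreteness, here is one standard way such a partition can be realized and made deterministic (a sketch of the generic approach; the polynomial hash family, the parameter choices, and the seed-enumeration step below are illustrative choices rather than a verbatim account of the construction).

import random

def kwise_hash(k, p):
    """Sample a function from a k-wise independent hash family h : Z_p -> Z_p,
    realized as a uniformly random polynomial of degree < k over the prime
    field Z_p; p should be a prime larger than the domain size."""
    coeffs = [random.randrange(p) for _ in range(k)]
    def h(x):
        y = 0
        for c in reversed(coeffs):           # Horner evaluation mod p
            y = (y * x + c) % p
        return y
    return h

def partition_variables(n, num_subsets, k, p):
    """Assign each of n variables to one of num_subsets subsets via a k-wise
    independent hash.  With k = Theta(log n), the k-wise Chernoff bound
    (Theorem 2.4) implies every subset receives n/num_subsets variables up to
    a (1 + delta) factor, for any constant delta > 0, except with
    inverse-polynomial probability (assuming n/num_subsets = Omega(log n) and
    p much larger than num_subsets).  The seed has only k * log2(p) = O(log^2 n)
    bits for p = poly(n), so all seeds can be enumerated in quasi-polynomial
    time, which is one standard way to make the choice deterministic."""
    h = kwise_hash(k, p)
    subsets = [[] for _ in range(num_subsets)]
    for v in range(n):
        subsets[h(v) % num_subsets].append(v)
    return subsets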