Haitham Hassanieh

The Sparse Fourier Transform


Скачать книгу

let P be the set of all pairs (i, v) for which the command ŵiv was executed. Claims 4.1 and 4.2 and Lemma 4.2 together guarantee that for each iS the probability that P does not contain the pair (i, ()i) is at most 4|S|/B + α. We complement this observation with the following claim.

      Claim 4.3 For any jJ we have jhσ,b(S). Therefore, |J| = |P| ≤ |S|.

      Proof Consider any jhσ,b(S). From Equation (A.1) in the proof of Lemma 4.2 it follows that |ûj| ≤ δnL < 1/2.

      Lemma 4.3 Consider an execution of NOISELESSSPARSEFFTINNER, and let S = supp().If |S| ≤ k′, then

Image

      Proof Let e denote the number of coordinates iS for which either Ecoll(i) or Eoff (i) holds. Each such coordinate might not appear in P with the correct value, leading to an incorrect value of ŵi. In fact, it might result in an arbitrary pair (i′, v′) being added to P, which in turn could lead to an incorrect value of Image. By Claim 4.3 these are the only ways that ŵ can be assigned an incorrect value. Thus, we have

Image

      Since E[e] ≤ (4|S|/B + α)|S| ≤ (4β + α)|S|, the lemma follows.

      Analysis of NOISELESSSPARSEFFT

      Consider the tth iteration of the procedure, and define St = supp() where denotes the value of the variable at the beginning of loop. Note that |S0| = |supp()| ≤ k.

      We also define an indicator variable It which is equal to 0 iff |St|/|St−1| ≤ 1/8. If It = 1 we say the tth iteration was not successful. Let γ = 8 · 8(β + α). From Lemma 4.3 it follows that Image. From Claim 4.3 it follows that even if the tth iteration is not successful, then |St|/|St−1| ≤ 2.

      For any t ≤ 1, define an event E(t) that occurs iff Image. Observe that if none of the events E(1) … E(t) holds then |St| ≤ k/2t.

      Lemma 4.4 Let E = E(1) ∪ … ∪ E (λ) for λ = 1 + log k. Assume that (4γ)1\2 < 1/4. Then Pr[E] ≤ 1/3.

      Proof Let t′ = [t/2]. We have

Image

      Therefore,

Image

      Theorem 4.1 Suppose is k-sparse with entries from {−L, …, L} for some known L = nO(1). Then the algorithm NOISELESSSPARSEFFT runs in expected O(k log n) time and returns the correct vector with probability at least 2/3.

      Proof The correctness follows from Lemma 4.4. The running time is dominated by O(log k) executions of HASHTOBINS.

      Assuming a correct run, in every round t we have

Image

      Therefore,

Image

      so the expected running time of each execution of HASHTOBINS is Image. Setting α = Θ(2−t/2) and β = Θ(1), the expected running time in round t is Image. Therefore the total expected running time is O(k log n). ■

      For the general case, we only achieve Equation (4.1) for Image. In general, for any parameter δ > 0 we can add Image to the right hand side of Equation (4.1) and run in time O(k log(n/k) log(n/δ)).

      Pseudocode for the general algorithm SFT 4.0 is shown in Algorithms 4.2 and 4.3.

       4.3.1 Intuition

      Let S denote the “heavy” O(k/) coordinates of . The overarching algorithm SPARSEFFT (SFT 4.0) works by first “locating” a set L containing most of S, then “estimating” x̂L to get . It then repeats on Image. We will show that each heavy coordinate has a large constant probability of both being in L and being estimated well. As a result, Image is probably nearly k/4-sparse, so we can run the next iteration with kk/4. The later iterations then run faster and achieve a higher success probability, so the total running time is dominated by the time in the first iteration and the total error probability is bounded by a constant.

      Algorithm 4.2 SFT 4.0: General Sparse Fourier Transform for k = o(n), part 1 of 2

Image

      In the rest of this intuition, we will discuss the first iteration of SPARSEFFT with simplified constants. In this iteration, hashes are to B = O(k/∈) bins and, with 3/4 probability, we get so Image is nearly k/4-sparse. The actual algorithm will involve a parameter α in each iteration, roughly guaranteeing that with 1 − Image probability, we get so Image is nearly Imagek-sparse; the formal guarantee will be given by Lemma 4.11. For this intuition we only consider the first iteration where α is a constant.

       Location

      As in the noiseless case, to locate the “heavy” coordinates we consider the “bins” computed by HASHTOBINS with Pσ,a,b. This roughly corresponds to first permuting the coordinates according to the (almost) pairwise independent permutation Pσ,a,b, partitioning the coordinates into B = O(k/∊) “bins” of n/B consecutive indices, and observing the sum of values in each bin. We get that each heavy coordinate i has a large constant probability