IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 5, JULY 2001

The Optimal Transform for the Discrete Hirschman Uncertainty Principle

Tomasz Przebinda, Victor DeBrunner, Senior Member, IEEE, and Murad Özaydın

Abstract—We determine all signals giving equality in the discrete Hirschman uncertainty principle. We single out the case where the entropies of the time signal and its Fourier transform are equal. These signals (up to scalar multiples) form an orthonormal basis giving an orthogonal transform that optimally packs a finite-duration discrete-time signal. The transform may be computed via a fast algorithm due to its relationship to the discrete Fourier transform.

Index Terms—Entropy, information measures, orthogonal functions, signal representation theory.

Manuscript received October 5, 1999; revised January 25, 2001. This work was supported in part by the National Science Foundation under Grant DMS-9622610. T. Przebinda and M. Özaydın are with the Department of Mathematics, The University of Oklahoma, Norman, OK 73019 USA (e-mail: przebin@crystal.ou.edu; [email protected]). V. DeBrunner is with the School of Electrical and Computer Engineering, The University of Oklahoma, Norman, OK 73019 USA (e-mail: [email protected]). Communicated by J. A. O'Sullivan, Associate Editor for Detection and Estimation. Publisher Item Identifier S 0018-9448(01)04430-3.

I. INTRODUCTION

In [1], we introduced $H_p$, a weighted average of the entropies of a discrete-time signal and of its Fourier transform, which measures the concentration of a signal in the sample-frequency phase plane. This measure was used to show that discretized Gaussian pulses may not be the most compact basis [2], and a lower limit on the compaction in the phase plane was conjectured. We have since discovered that part of this conjectured lower limit was proven in [3] under the moniker of "a discrete Hirschman uncertainty principle." This principle states that $H_{1/2}$ is at least $\frac{1}{2}\log(N)$, where $N$ is the length of the discrete-time signal. However, that result did not describe the characteristics of the signals that meet the limit, as our conjecture did [1], [4]. We further argued in [5] that this measure indicates two possible "best basis" options: 1) the multitransform (nonorthogonal) option, and 2) the orthogonal discrete Hirschman uncertainty principle option. We have discussed many results concerning the first option (see [1] for pointers to many references). The second option is the focus of this correspondence. We have found a basis (transform) that is orthogonal and that uniquely attains the lower bound in the discrete Hirschman uncertainty principle.

II. STATEMENT OF THE MAIN THEOREM

Fix a positive integer $N$. Let $A$ denote the ring $\mathbb{Z}/N\mathbb{Z}$. Thus $A = \{0, 1, 2, \ldots, N-1\}$, with addition and multiplication modulo $N$. Often we shall view $A$ as a group with respect to the addition. The Heisenberg group of degree one, with coefficients in $A$, is the group $G_1(A)$ of all matrices of the form

$$\begin{pmatrix} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix} \qquad (x, y, z \in A).$$

We shall identify $G_1(A)$ with the Cartesian product $A \times A \times A$ via the map

$$G_1(A) \ni \begin{pmatrix} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix} \longmapsto (x, y, z) \in A \times A \times A.$$  (1)

In terms of (1), the matrix multiplication and the inverse look as follows:

$$(x, y, z)(x', y', z') = (x + x', y + y', z + z' + x y'), \qquad (x, y, z)^{-1} = (-x, -y, -z + x y) \qquad (x, y, z, x', y', z' \in A).$$

Let $\chi(a) = \exp(2\pi i a/N)$ $(a \in A)$. This is a unitary character of the (additive) group $A$. For a function $u\colon A \to \mathbb{C}$ let

$$\|u\|_2 = \Bigl(\sum_{a \in A} |u(a)|^2\Bigr)^{1/2}$$  (2)

and let $L^2(A)$ denote the Hilbert space of all such functions, with the norm (2). Let

$$\rho(x, y, z)u(a) = \chi(a y + z)\,u(a + x) \qquad (u \in L^2(A);\ a, x, y, z \in A).$$  (3)

It is easy to check that $\rho$ is a group homomorphism from $G_1(A)$ to the group of unitary operators on $L^2(A)$. In other words, $\rho$ is a unitary representation of $G_1(A)$ on the space $L^2(A)$.

Recall the discrete Fourier transform (DFT), defined with respect to the character $\chi$:

$$\mathcal{F}u(b) = \hat{u}(b) = |A|^{-1/2} \sum_{a \in A} u(a)\,\chi(-a b) \qquad (u \in L^2(A);\ b \in A).$$

Here $|A| = N$ is the cardinality of the set $A$.
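As a concrete illustration of these definitions, the following MATLAB sketch (our own, not part of the original correspondence; the length N = 8 and all variable names are arbitrary choices) constructs the character $\chi$ and the normalized DFT matrix defined above and verifies numerically that the transform is unitary.

    % Numerical sketch of the character chi and the normalized DFT (assumed example, N = 8)
    N    = 8;                              % arbitrary signal length for the illustration
    a    = 0:N-1;
    chi  = @(m) exp(2i*pi*m/N);            % chi(a) = exp(2*pi*i*a/N), a unitary character of Z/NZ
    F    = chi(-(a.')*a)/sqrt(N);          % F(b,a) = |A|^(-1/2)*chi(-a*b), the DFT matrix above
    u    = randn(N,1) + 1i*randn(N,1);     % an arbitrary test signal in L^2(A)
    uhat = F*u;                            % its DFT
    disp(norm(F'*F - eye(N)))              % ~ 0, so F is unitary
    disp(norm(u - F'*uhat))                % ~ 0, so the conjugate transpose inverts F

Since the matrix is unitary, its conjugate transpose implements the inverse transform recalled next.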
The inverse Fourier transform is given by

$$u(a) = |A|^{-1/2} \sum_{b \in A} \hat{u}(b)\,\chi(a b) \qquad (u \in L^2(A);\ a \in A).$$

A straightforward calculation shows that

$$\mathcal{F}\,\rho(x, y, z)\,\mathcal{F}^{-1} = \rho(-y,\ x,\ z - x y) \qquad (x, y, z \in A).$$  (4)

In other words, the Fourier transform normalizes the group $\rho(G_1(A))$.

For $u \in L^2(A)$ with $\|u\|_2 = 1$, let

$$H(u) = -\sum_{a \in A} |u(a)|^2 \log |u(a)|^2$$

and let

$$H_p(u) = p\,H(u) + (1 - p)\,H(\hat{u}) \qquad (0 \le p \le 1).$$

It is easy to see that

$$H_p(\rho(h)u) = H_p(u) \qquad (h \in G_1(A);\ 0 \le p \le 1).$$

We would like to consider $u \in L^2(A)$ with $\|u\|_2 = 1$ equivalent to $\lambda u$ where $|\lambda| = 1$. As $H(u) = H(v)$ and $H_p(u) = H_p(v)$ for equivalent $u$ and $v$, both $H$ and $H_p$ are defined on the equivalence classes. This set of equivalence classes forms a complex projective space, which we denote by $P(A)$. Note that orthogonality is well defined on the equivalence classes, so it makes sense for a subset of $P(A)$ to be orthonormal. There is an induced action of the Heisenberg group $G_1(A)$ on $P(A)$, defined via (3) at the level of representatives of the equivalence classes. Below we use the same symbol $u$ for an element of $L^2(A)$ with $\|u\|_2 = 1$ and for the equivalence class it represents in $P(A)$.

If $B$ is a subset of $A$, let $\mathbb{1}_B$ denote the indicator function of $B$. Thus $\mathbb{1}_B(a) = 1$ if $a \in B$, and $\mathbb{1}_B(a) = 0$ if $a \in A \setminus B$. Here is our main theorem.

Theorem 1 (Main Theorem):

a) If $u \in P(A)$, then $H_{1/2}(u) \ge \frac{1}{2}\log(|A|)$.

b) The set of vectors $u \in P(A)$ with $H_{1/2}(u) = \frac{1}{2}\log(|A|)$ coincides with the union of the orbits

$$\rho(G_1(A))\,\frac{1}{\sqrt{|B|}}\,\mathbb{1}_B \qquad (B \text{ a subgroup of } A).$$  (5)

c) Each orbit (5) is an orthonormal basis of $L^2(A)$.

d) The set of vectors $u \in P(A)$ with $H_p(u) = \frac{1}{2}\log(|A|)$ for all $0 \le p \le 1$ is not empty if and only if $|A|$ is a square. In this case, this set coincides with the orbit (5) for the unique subgroup $B \subseteq A$ of cardinality $|B| = \sqrt{|A|}$.

Part a) of the above theorem has been proven by Dembo, Cover, and Thomas [3]. The idea of their proof is based on Hirschman's work [6]. In fact, those authors name the inequality in a) "the discrete Hirschman uncertainty principle." Following this line, we have chosen the title of this correspondence. While unaware of the work in [3], we conjectured a result close to the above theorem in [1]. The conjecture was refined in [4]. The strategy of the proof of part b) is to reduce it to a result of Donoho and Stark [7, Theorem 13]. In order to keep the presentation self-contained, we give proofs of this as well as of part a). Part c) suggests a close connection of the functions listed in b) with wavelets, along the lines explored partially in [8]. A generalization of parts a) and b) of the Main Theorem, in which the finite cyclic group $A$ is replaced with a compactly generated, locally compact abelian group, is available [9]. This includes the multidimensional finite ($A$ a product of finite cyclic groups), continuous ($A = \mathbb{R}^N$), and periodic ($A = (\mathbb{R}/\mathbb{Z})^N$) cases, as well as their products.
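Before turning to the proof, a small numerical check of the Main Theorem may be useful. The MATLAB sketch below is ours and is not part of the correspondence; the choices $|A| = 16$ and $B = \{0, 4, 8, 12\}$ (the unique subgroup of cardinality $\sqrt{|A|} = 4$) are arbitrary. It evaluates $H(u)$ and $H(\hat{u})$ for $u = |B|^{-1/2}\mathbb{1}_B$ and compares them with $\frac{1}{2}\log(|A|)$.

    % Numerical check of parts a), b), and d) of the Main Theorem (assumed example)
    N  = 16;
    B  = (0:4:N-1).';                                 % the subgroup B = {0,4,8,12} of Z/16Z
    u  = zeros(N,1);  u(B+1) = 1/sqrt(numel(B));      % u = |B|^(-1/2) * indicator of B
    F  = exp(-2i*pi*((0:N-1).')*(0:N-1)/N)/sqrt(N);   % normalized DFT matrix
    uh = F*u;                                         % Fourier transform of u
    H  = @(w) -sum(abs(w(abs(w) > 1e-12)).^2 .* log(abs(w(abs(w) > 1e-12)).^2));  % entropy
    fprintf('H(u) = %.4f   H(uhat) = %.4f   (1/2)log|A| = %.4f\n', H(u), H(uh), 0.5*log(N))

For this $u$ both entropies equal $\frac{1}{2}\log(|A|)$, so $H_p(u) = \frac{1}{2}\log(|A|)$ for every $p$, as part d) predicts when $|A|$ is a square; applying any $\rho(h)$, $h \in G_1(A)$, leaves both entropies unchanged, so every element of the orbit (5) attains the bound.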
III. PROOF OF THE MAIN THEOREM

For a function $u\colon A \to \mathbb{C}$ and a number $0 < p < \infty$ let

$$\|u\|_p = \Bigl(\sum_{a \in A} |u(a)|^p\Bigr)^{1/p}.$$

Also, let $\|u\|_\infty = \max\{|u(a)|;\ a \in A\}$.

A straightforward calculation shows that for a nonzero function $u\colon A \to \mathbb{C}$ and for a number $0 < t < 1$ the following formulas hold:

$$\frac{d}{dt}\log\|u\|_{1/t} = -\sum_{a \in A} \frac{|u(a)|^{1/t}}{\|u\|_{1/t}^{1/t}}\,\log\frac{|u(a)|^{1/t}}{\|u\|_{1/t}^{1/t}}$$  (6)

and

$$\frac{d^2}{dt^2}\log\|u\|_{1/t} = \frac{1}{t}\left[\sum_{a \in A} \frac{|u(a)|^{1/t}}{\|u\|_{1/t}^{1/t}}\left(\log\frac{|u(a)|^{1/t}}{\|u\|_{1/t}^{1/t}}\right)^{2} - \left(\sum_{b \in A} \frac{|u(b)|^{1/t}}{\|u\|_{1/t}^{1/t}}\,\log\frac{|u(b)|^{1/t}}{\|u\|_{1/t}^{1/t}}\right)^{2}\right].$$

Since the second derivative is nonnegative, the function $\log\|u\|_{1/t}$, $0 < t < 1$, is convex. Hence, for $u \in L^2(A)$ with $\|u\|_2 = 1$,

$$H(u) = \frac{d}{dt}\log\|u\|_{1/t}\Big|_{t=1/2} \;\le\; \lim_{t \to 1^-} \frac{d}{dt}\log\|u\|_{1/t} \;\le\; \log(|\operatorname{supp} u|)$$  (7)

where $|\operatorname{supp} u|$ stands for the cardinality of $\operatorname{supp} u$, the support of $u$. The inequality (7) is of course well known.

Since $\|\hat{u}\|_2 = \|u\|_2$, and since $\|\hat{u}\|_\infty \le |A|^{-1/2}\|u\|_1$, the Riesz–Thorin theorem [10, Ch. 12, eq. (1.11)] implies

$$\|\hat{u}\|_{1/(1-t)} \le |A|^{1/2 - t}\,\|u\|_{1/t} \qquad (\tfrac{1}{2} \le t \le 1;\ u \ne 0).$$  (8)

By applying the negative logarithm to both sides of (8) we obtain the following inequality:

$$\log(\|u\|_{1/t}) - \log(\|\hat{u}\|_{1/(1-t)}) \ge \bigl(t - \tfrac{1}{2}\bigr)\log(|A|).$$  (9)

As an aside, notice that the left-hand side of (9) is a difference of two convex functions. We assume from now on that $\|u\|_2 = 1$. Then both sides of (9) are equal to zero for $t = \frac{1}{2}$. Hence, (6) and (9) imply

$$H(u) + H(\hat{u}) \ge \log(|A|).$$  (10)

(Indeed, (9) holds for $t \ge \frac{1}{2}$ and both sides vanish at $t = \frac{1}{2}$, so the right derivative of the left-hand side at $t = \frac{1}{2}$, which by (6) equals $H(u) + H(\hat{u})$, is at least the derivative of the right-hand side, $\log(|A|)$.) This verifies part a) of the theorem, as in [3].

We are interested in functions $u$ for which equality holds in (10). We are going to use some ideas of Zygmund [10, Ch. 12, eqs. (1.20)–(1.24)]. For a complex number $z \in \mathbb{C}$ define

$$f(z) = |A|^{-\frac{1}{2}+z} \sum_{b \in A} \mathcal{F}\Bigl(|u|^{2z}\,\frac{u}{|u|}\Bigr)(b)\; |\hat{u}(b)|^{2z}\,\frac{\overline{\hat{u}(b)}}{|\hat{u}(b)|}.$$

Here $\frac{u}{|u|} = 0$ outside the support of $u$ and, similarly, for $\frac{\hat{u}}{|\hat{u}|}$. Notice that for $y \in \mathbb{R}$,

$$\Bigl|f\bigl(\tfrac{1}{2} + i y\bigr)\Bigr| \le \bigl\||u|^{1+i2y}\bigr\|_2 \cdot \bigl\||\hat{u}|^{1+i2y}\bigr\|_2 = \|u\|_2 \cdot \|\hat{u}\|_2 = 1$$

and

$$|f(1 + i y)| \le |A|^{1/2}\,\Bigl\|\mathcal{F}\Bigl(|u|^{2+i2y}\,\frac{u}{|u|}\Bigr)\Bigr\|_\infty \cdot \bigl\||\hat{u}|^{2+i2y}\bigr\|_1 \le \|u\|_2^2 \cdot \|\hat{u}\|_2^2 = 1.$$

Hence, by the Phragmén–Lindelöf theorem [10, Ch. 12, eq. (1.1)],

$$|f(z)| \le 1 \qquad \bigl(\tfrac{1}{2} \le \operatorname{Re}(z) \le 1\bigr).$$

A straightforward calculation shows that

$$f'(z) = f(z)\log(|A|) + |A|^{-\frac{1}{2}+z} \sum_{b \in A} \Bigl[\mathcal{F}\Bigl(|u|^{2z}\,\frac{u}{|u|}\,\log(|u|^2)\Bigr)(b)\; |\hat{u}(b)|^{2z}\,\frac{\overline{\hat{u}(b)}}{|\hat{u}(b)|} + \mathcal{F}\Bigl(|u|^{2z}\,\frac{u}{|u|}\Bigr)(b)\; |\hat{u}(b)|^{2z}\,\log(|\hat{u}(b)|^2)\,\frac{\overline{\hat{u}(b)}}{|\hat{u}(b)|}\Bigr]$$

and, therefore, by Plancherel's formula,

$$f'\bigl(\tfrac{1}{2}\bigr) = \log(|A|) - H(u) - H(\hat{u}).$$

Thus, the equality in (10) is equivalent to $f'(\frac{1}{2}) = 0$. Altogether, we have checked that the function $f(z)$ has the following properties: $f(z)$ is an entire function; $|f(z)| \le 1$ for $\frac{1}{2} \le \operatorname{Re}(z) \le 1$; $f(\frac{1}{2}) = \sum_{b \in A} \hat{u}(b)\overline{\hat{u}(b)} = \|\hat{u}\|_2^2 = 1$; and $f'(\frac{1}{2}) = 0$. In particular, $\operatorname{Re}(f(z))$ is a real-valued harmonic function in the disc of radius $\frac{1}{4}$ centered at $z = \frac{3}{4}$. This harmonic function achieves its maximum at $z = \frac{1}{2}$ and has derivative equal to zero at this point. Hence, Hopf's maximum principle [11, Theorem 3.1.6'] implies that $\operatorname{Re}(f(z))$ is constant on this disc. Hence, by standard properties of entire functions, $f(z) = 1$ for all $z \in \mathbb{C}$. This equation coincides with the formula [10, Ch. 12, eq. (1.24)], which has been obtained there under a slightly stronger assumption [10, Ch. 12, eq. (1.20)]. In particular, for $z = 1$ we obtain

$$1 = f(1) = |A|^{1/2} \sum_{b \in A} \mathcal{F}(|u|\,u)(b)\; |\hat{u}(b)|\,\frac{\overline{\hat{u}(b)}}{|\hat{u}(b)|}.$$  (11)

Now we follow Zygmund's proof of [10, Ch. 12, eq. (2.18)]. The formula (11) may be rewritten as

$$1 = \sum_{a, b \in A} |u(a)|^2\,|\hat{u}(b)|^2\,\chi(-a b)\,\frac{u(a)}{|u(a)|}\,\frac{\overline{\hat{u}(b)}}{|\hat{u}(b)|}.$$  (12)

Since

$$\sum_{a, b \in A} |u(a)|^2\,|\hat{u}(b)|^2 = \|u\|_2^2 \cdot \|\hat{u}\|_2^2 = 1,$$

(12) implies

$$\chi(-a b)\,\frac{u(a)}{|u(a)|}\,\frac{\overline{\hat{u}(b)}}{|\hat{u}(b)|} = 1 \qquad (a \in \operatorname{supp} u;\ b \in \operatorname{supp}\hat{u}).$$

Hence, for $b \in \operatorname{supp}\hat{u}$,

$$\hat{u}(b) = |A|^{-1/2} \sum_{a \in \operatorname{supp} u} u(a)\,\chi(-a b) = |A|^{-1/2}\,\frac{\hat{u}(b)}{|\hat{u}(b)|} \sum_{a \in \operatorname{supp} u} |u(a)|.$$

By taking the absolute value of the extreme left and right sides of the above equations, we get

$$|\hat{u}(b)| = |A|^{-1/2} \sum_{a \in \operatorname{supp} u} |u(a)| \qquad (b \in \operatorname{supp}\hat{u}).$$  (13)

Similarly, for $a \in \operatorname{supp} u$,

$$u(a) = |A|^{-1/2} \sum_{b \in \operatorname{supp}\hat{u}} \hat{u}(b)\,\chi(a b) = |A|^{-1/2}\,\frac{u(a)}{|u(a)|} \sum_{b \in \operatorname{supp}\hat{u}} |\hat{u}(b)|$$

and, therefore,

$$|u(a)| = |A|^{-1/2} \sum_{b \in \operatorname{supp}\hat{u}} |\hat{u}(b)| \qquad (a \in \operatorname{supp} u).$$  (14)

The statements (13) and (14) mean that the functions $|u|$ and $|\hat{u}|$ are constant on their supports. Since $\|u\|_2 = 1$, it follows that

$$|u(a)|^2 = |\operatorname{supp} u|^{-1} \quad\text{and}\quad |\hat{u}(b)|^2 = |\operatorname{supp}\hat{u}|^{-1} \qquad (a \in \operatorname{supp} u;\ b \in \operatorname{supp}\hat{u}).$$

Hence,

$$H(u) + H(\hat{u}) = \log(|\operatorname{supp} u|) + \log(|\operatorname{supp}\hat{u}|).$$

Thus, the equality in (10) implies

$$|\operatorname{supp} u| \cdot |\operatorname{supp}\hat{u}| = |A|.$$

Thus, part b) of the theorem will follow as soon as we verify the following theorem of Donoho and Stark [7].

Theorem 2: Let $v \in L^2(A)$. Then the equation $|\operatorname{supp} v| \cdot |\operatorname{supp}\hat{v}| = |A|$ holds if and only if there is a subgroup $B \subseteq A$, an element $h \in G_1(A)$, and a constant "const" such that $v = \text{const}\;\rho(h)\,\mathbb{1}_B$.

Lemma 3 [7]: Let $v \in L^2(A)$, $v \ne 0$, and let $m = |\operatorname{supp} v|$. Then $\hat{v}$ cannot have $m$ consecutive zeros.

Proof: Since a translation of $\hat{v}$ corresponds to a modulation of $v$ and therefore does not affect the support of $v$, it suffices to show that

$$(\hat{v}(0), \hat{v}(1), \ldots, \hat{v}(m-1)) \ne (0, 0, \ldots, 0).$$

Let $\operatorname{supp} v = \{a_1, a_2, \ldots, a_m\}$. Then

$$\begin{pmatrix} \hat{v}(0) \\ \hat{v}(1) \\ \hat{v}(2) \\ \vdots \\ \hat{v}(m-1) \end{pmatrix} = |A|^{-1/2} \begin{pmatrix} 1 & 1 & \cdots & 1 \\ \chi(-a_1) & \chi(-a_2) & \cdots & \chi(-a_m) \\ \chi(-a_1)^2 & \chi(-a_2)^2 & \cdots & \chi(-a_m)^2 \\ \vdots & \vdots & & \vdots \\ \chi(-a_1)^{m-1} & \chi(-a_2)^{m-1} & \cdots & \chi(-a_m)^{m-1} \end{pmatrix} \begin{pmatrix} v(a_1) \\ v(a_2) \\ v(a_3) \\ \vdots \\ v(a_m) \end{pmatrix}.$$  (15)

Since, by Vandermonde, the above $m \times m$ matrix is invertible (the numbers $\chi(-a_1), \ldots, \chi(-a_m)$ are distinct) and the vector $(v(a_1), \ldots, v(a_m))$ is nonzero, the left-hand side of (15) cannot vanish, and we are done.

Next we recall a few facts concerning the Fourier transform. For a subset $S \subseteq A$ let

$$S^\perp = \{a \in A:\ a b = 0 \text{ for all } b \in S\}.$$

It is easy to see that $S^\perp$ is a subgroup of $A$, and that $S^{\perp\perp}$ is the smallest subgroup of $A$ containing $S$. Furthermore, $v$ is invariant under translations by $(\operatorname{supp}\hat{v})^\perp$, i.e.,

$$v(a + b) = v(a) \qquad (a \in A;\ b \in (\operatorname{supp}\hat{v})^\perp).$$  (16)

Here is a statement dual to (16): for a subgroup $B \subseteq A$, if $v(a + b) = v(a)$ for all $a \in A$ and $b \in B$, then $\operatorname{supp}\hat{v} \subseteq B^\perp$. An elementary counting argument shows that for any subgroup $B \subseteq A$

$$|A| = |B|\,|B^\perp|$$  (17)

and

$$\mathcal{F}\Bigl(\frac{1}{\sqrt{|B|}}\,\mathbb{1}_B\Bigr) = \frac{1}{\sqrt{|B^\perp|}}\,\mathbb{1}_{B^\perp}, \qquad H\Bigl(\frac{1}{\sqrt{|B|}}\,\mathbb{1}_B\Bigr) = \log(|B|).$$  (18)

Proof of Theorem 2: Let $B \subseteq A$ be a subgroup and let $h \in G_1(A)$. We know from (4) and (18) that there is $h' \in G_1(A)$ such that

$$\mathcal{F}\,\rho(h)\,\frac{1}{\sqrt{|B|}}\,\mathbb{1}_B = \rho(h')\,\mathcal{F}\,\frac{1}{\sqrt{|B|}}\,\mathbb{1}_B = \rho(h')\,\frac{1}{\sqrt{|B^\perp|}}\,\mathbb{1}_{B^\perp}.$$

Hence, by (17),

$$\Bigl|\operatorname{supp}\rho(h)\,\frac{1}{\sqrt{|B|}}\,\mathbb{1}_B\Bigr| \cdot \Bigl|\operatorname{supp}\mathcal{F}\,\rho(h)\,\frac{1}{\sqrt{|B|}}\,\mathbb{1}_B\Bigr| = |B| \cdot |B^\perp| = |A|.$$

Conversely, suppose $v \in L^2(A)$ is such that $|\operatorname{supp} v| \cdot |\operatorname{supp}\hat{v}| = |A|$. Then Lemma 3 implies that the elements of $\operatorname{supp}\hat{v}$ are equally spaced (its $|A|/m$ elements have gaps averaging $m = |\operatorname{supp} v|$, and a gap exceeding $m$ would produce $m$ consecutive zeros of $\hat{v}$). Hence, there is $h \in G_1(A)$ such that the support of $\widehat{\rho(h)v}$ is a subgroup of $A$. Thus, we may assume that $\operatorname{supp}\hat{v}$ is a subgroup of $A$. Let $B$ be the unique subgroup of $A$ such that $B^\perp = \operatorname{supp}\hat{v}$. Then $v$ is invariant under translations by elements of $B$, by (16). In particular, $|\operatorname{supp} v|$ is a multiple of $|B|$. But our assumption implies that $|\operatorname{supp} v| = |A|/|B^\perp| = |B|$. Hence, up to a constant multiple, $v$ is a translation of $\mathbb{1}_B$. This completes the proof of part b) of the Main Theorem.

Part d) of the Main Theorem is immediate from part b), because the equation

$$H_p(u) = \frac{1}{2}\log(|A|) \qquad (0 \le p \le 1)$$

is equivalent to

$$H(u) = H(\hat{u}) = \frac{1}{2}\log(|A|)$$

which, for $u = \frac{1}{\sqrt{|B|}}\,\mathbb{1}_B$, becomes $|B| = \sqrt{|A|}$, by (18).

It remains to verify part c) of the Main Theorem. A straightforward argument shows that, under the isomorphism (1), the stabilizer of the complex line $\mathbb{C}\,\mathbb{1}_B$ in $G_1(A)$ is given by

$$\operatorname{Stab}_{G_1(A)}(\mathbb{C}\,\mathbb{1}_B) = B \times B^\perp \times A.$$

This is a normal subgroup of $G_1(A)$; the quotient group $G_1(A)/\operatorname{Stab}_{G_1(A)}(\mathbb{C}\,\mathbb{1}_B)$ is isomorphic to $(A/B) \times (A/B^\perp)$ via (1) and, by (17), has $(|A|/|B|)(|A|/|B^\perp|) = |A|$ elements. Thus, the number of distinct elements in the orbit $\rho(G_1(A))\,\frac{1}{\sqrt{|B|}}\,\mathbb{1}_B$, (5), coincides with the dimension of the space $L^2(A)$. It remains to check that any two distinct elements of this orbit are orthogonal. Since the representation $\rho$ is unitary, it suffices to show that

$$\Bigl\langle \rho(x, y, z)\,\frac{1}{\sqrt{|B|}}\,\mathbb{1}_B,\ \frac{1}{\sqrt{|B|}}\,\mathbb{1}_B \Bigr\rangle = 0 \qquad (x \in A \setminus B \ \text{or}\ y \in A \setminus B^\perp).$$  (19)

The left-hand side of (19) is equal to

$$\frac{\chi(z)}{|B|} \sum_{a \in B \cap (-x + B)} \chi(a y).$$  (20)

If $x \in A \setminus B$, then $B \cap (-x + B)$ is empty, so the quantity (20) is zero. If $x \in B$, then $B \cap (-x + B) = B$, so the quantity (20) is equal to $\frac{\chi(z)\,|A|^{1/2}}{|B|}\,\mathcal{F}(\mathbb{1}_B)(-y) = 0$, by (18), since in this case $-y \in A \setminus B^\perp$. This completes our proof of part c) of the Main Theorem, and thus of the whole theorem.

IV. THE HIRSCHMAN OPTIMAL TRANSFORM

Now that we have established the theorem that defines the discrete Hirschman uncertainty principle optimal transform (HOT), we provide details regarding the transform.

A. The HOT Basis Functions

The basis functions that define the HOT are derived according to part b) of the Main Theorem and are of the form suggested in [7]. Consequently, we use the $K$-dimensional DFT as the originator signals for our $N = K^2$-dimensional HOT basis. Each of these basis functions must then be shifted and interpolated to produce the sufficient number of orthogonal basis functions that define the HOT. We note that the DFT basis can be extended in a similar manner to produce an $N = KL$-dimensional transform. This basis, however, does not yield a HOT.

To detail this process, consider the three-point DFT defined by

$$\begin{pmatrix} X(0) \\ X(1) \\ X(2) \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 \\ 1 & e^{-j2\pi/3} & e^{-j4\pi/3} \\ 1 & e^{-j4\pi/3} & e^{-j2\pi/3} \end{pmatrix} \begin{pmatrix} x[0] \\ x[1] \\ x[2] \end{pmatrix}.$$

This three-point DFT yields the nine-point HOT

$$\begin{pmatrix} H(0) \\ H(1) \\ H(2) \\ H(3) \\ H(4) \\ H(5) \\ H(6) \\ H(7) \\ H(8) \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 \\ 1 & 0 & 0 & e^{-j2\pi/3} & 0 & 0 & e^{-j4\pi/3} & 0 & 0 \\ 0 & 1 & 0 & 0 & e^{-j2\pi/3} & 0 & 0 & e^{-j4\pi/3} & 0 \\ 0 & 0 & 1 & 0 & 0 & e^{-j2\pi/3} & 0 & 0 & e^{-j4\pi/3} \\ 1 & 0 & 0 & e^{-j4\pi/3} & 0 & 0 & e^{-j2\pi/3} & 0 & 0 \\ 0 & 1 & 0 & 0 & e^{-j4\pi/3} & 0 & 0 & e^{-j2\pi/3} & 0 \\ 0 & 0 & 1 & 0 & 0 & e^{-j4\pi/3} & 0 & 0 & e^{-j2\pi/3} \end{pmatrix} \begin{pmatrix} x[0] \\ x[1] \\ x[2] \\ x[3] \\ x[4] \\ x[5] \\ x[6] \\ x[7] \\ x[8] \end{pmatrix}.$$

This organization is not unique; the rows can be reordered as desired. This representation would be consistent with the DFT. The MATLAB source that implements the general version of the HOT is shown below:

    function H = hot(x)
    % This function implements an N = K^2 Hirschman optimal transform
    %   H = hot(x);
    % Input:  x is a column sequence of length N = K^2
    % Output: H is the transform sequence
    [N, M] = size(x);
    K = sqrt(N);
    T = zeros(N);                 % N x N transform matrix
    W = fft(eye(K));              % K-point DFT matrix
    n = 1 : K : N;
    for tr = 0 : K - 1
        T(n + tr, n + tr) = W;    % place W on the subsequence x[K*n + tr]
    end
    T = (1/sqrt(K)) * T;
    H = T * x;
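The same transform can be computed without forming the $N \times N$ matrix $T$ at all. The following sketch (our own, not part of the original correspondence; the function name fasthot and the reshaping are our choices) sends each of the $K$ interleaved length-$K$ subsequences of the input through a $K$-point FFT, which anticipates the fast computation discussed in Section IV-B below.

    function H = fasthot(x)
    % FASTHOT  Hirschman optimal transform of a length-N = K^2 column sequence,
    % computed with K separate K-point FFTs rather than an N x N matrix.
    % (Save as fasthot.m.)
    x = x(:);
    K = round(sqrt(numel(x)));       % assumes numel(x) = K^2 for an integer K
    X = reshape(x, K, K);            % row l+1 holds the subsequence x[K*n + l], n = 0..K-1
    Y = fft(X, [], 2)/sqrt(K);       % one K-point DFT per subsequence
    H = Y(:);                        % column-major stacking places H(K*r + l) correctly
    end

For a column input of length $K^2$, fasthot(x) agrees with hot(x) above to rounding error, while costing only $K$ FFTs of length $K$, i.e., $O(N \log K)$ operations when $K$ is a power of two.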
For instance, in the N = 9-point HOT shown above, we can see that the HOT transform coefficients are determined from K 2 H (0) H (3) x[0] DFT = Of course, in practice, the square roots need not be carried out. This is, as is commonly done in the DFT; that is, by moving the one square root out of the analysis relationship and moving it into the synthesis re1 . The N -point HOT is computationally lationship to create the scale K more efficient than the N -point DFT, and increasingly more efficient as N ! 1. As we have mentioned above, this is somewhat simplistic because the squared integers are not, in general, powers of 2. Consequently, for any length N we should compare specific counts. ACKNOWLEDGMENT The authors would like to thank Dr. M. Doroslovacki at George Washington University in Washington, DC, for his comments on [1] that helped lead us to finding some example signals that met the conjectured minimum, and thus ultimately to this proof that defines all such signals that are optimal according to the discrete form of the Hirschman uncertainty principle. x[3] H (6) x[6] REFERENCES and H (1) H (4) x[1] DFT = [1] V. DeBrunner, M. Özaydın, and T. Przebinda, “Resolution in time-frequency,” IEEE Trans. Signal Processing, vol. 47, pp. 783–788, Mar. 1999. [2] V. DeBrunner, M. Özaydın, T. Przebinda, and J. Havlicek, “The optimal solutions to the continuous- and discrete-time versions of the Hirschman uncertainty principle,” in Proc. ICASSP’00, Istanbul, Turkey, June 5–9, 2000. [3] A. Dembo, T. M. Cover, and J. A. Thomas, “Information theoretic inequalities,” IEEE Trans Inform. Theory, vol. 37, pp. 1501–1518, Nov. 1991. [4] V. DeBrunner, M. Özaydın, and T. Przebinda, “Analysis in a finite time-frequency plane,” IEEE Trans. Signal Processing, vol. 48, pp. 3586–3587, Dec. 2000. [5] T. Przebinda, V. E. DeBrunner, and M. Özaydın, “Using a new uncertainty measure to determine optimal bases for signal representations,” in Proc. ICASSP’99, Phoenix, AZ, Mar. 1999, paper 1575. [6] I. I. Hirschman, “A Note on entropy,” Amer. J. Math., vol. 79, pp. 152–156, 1957. [7] D. L. Donoho and P. B. Stark, “Uncertainty principles and signal recovery,” SIAM J. Appl. Math., vol. 49, pp. 906–931, 1989. [8] M. Özaydın and T. Przebinda, “Platonic orthonormal wavelets,” Appl. Comput. Harmon. Anal., vol. 4, pp. 351–365, 1997. [9] , “An entropy-based uncertainty principle for a locally compact, abelian, compactly generated group,” paper, submitted for publication. [10] A. Zygmund, Trigonometric Series, 2nd ed. Cambridge, U.K.: Cambridge Univ. Press, 1990, vol. I and II. [11] L. Hörmander, Notions of Convexity. Basel, Switzerland: Birkhäuser, 1994. x[4] H (7) x[7] and, finally, that H (2) H (5) = x[2] DFT x[5] H (8) : x[8] This requires three separate three-point DFT computations. In general, we have the (unitary) transform relationship H (K r + l) = p1 K K 01 0j nr ; x[K n + l ]e H (K r + l)e n=0 0  r; l  01 K and its inverse x[K n + l] = p1 K K 01 r=0 j nr ; 0  n; l  01 K : H (0) 1 0 0 1 0 0 1 0 0 x[0] H (1) 0 1 0 0 1 0 0 1 0 x[1] H (2) 0 0 1 0 0 1 0 1 x[2] 1 0 0 0 0 0 x[3] 1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 x[6] 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0i e x[7] H (8) 0i e 0i e 0 H (7) 0i e 0i e 0 H (6) 0i e 0i e x[4] H (5) 0i e 0i e 0 0 0i e 0i e 0 H (3) 0i e H (4) = x[5] x[8] :