IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 5, JULY 2001

The Optimal Transform for the Discrete Hirschman Uncertainty Principle

Tomasz Przebinda, Victor DeBrunner, Senior Member, IEEE, and Murad Özaydın

Abstract—We determine all signals giving equality for the discrete Hirschman uncertainty principle. We single out the case where the entropies of the time signal and its Fourier transform are equal. These signals (up to scalar multiples) form an orthonormal basis giving an orthogonal transform that optimally packs a finite-duration discrete-time signal. The transform may be computed via a fast algorithm due to its relationship to the discrete Fourier transform.

Index Terms—Entropy, information measures, orthogonal functions, signal representation theory.

Manuscript received October 5, 1999; revised January 25, 2001. This work was supported in part by the National Science Foundation under Grant DMS-9622610. T. Przebinda and M. Özaydın are with the Department of Mathematics, The University of Oklahoma, Norman, OK 73019 USA (e-mail: [email protected]; [email protected]). V. DeBrunner is with the School of Electrical and Computer Engineering, The University of Oklahoma, Norman, OK 73019 USA (e-mail: [email protected]). Communicated by J. A. O'Sullivan, Associate Editor for Detection and Estimation. Publisher Item Identifier S 0018-9448(01)04430-3.

I. INTRODUCTION

In [1], we introduced the weighted average $H_p$ of the entropies of a discrete-time signal and its Fourier transform, which measures the concentration of a signal in the sample-frequency phase plane. This was used to show that discretized Gaussian pulses may not be the most compact basis [2], and a lower limit on the compaction in the phase plane was conjectured. We have since discovered that part of this conjectured lower limit was proven in [3] under the moniker of "a discrete Hirschman's uncertainty principle." This principle states that $H_{1/2}$ is at least $\frac{1}{2}\log(N)$, where $N$ is the length of the discrete-time signal. However, that result did not describe the characteristics of the signals that meet the limit, as our conjecture did [1], [4]. We further argued in [5] that this measure indicates two possible "best basis" options:

1) the multitransform (nonorthogonal) option;
2) the orthogonal discrete Hirschman uncertainty principle option.

We have discussed many results concerning the first option (see [1] for pointers to many references). The second option is the focus of this correspondence. We have found a basis (transform) that is orthogonal and that uniquely minimizes the discrete Hirschman uncertainty measure.

II. STATEMENT OF THE MAIN THEOREM

Fix a positive integer $N$. Let $A$ denote the ring $\mathbb{Z}/N\mathbb{Z}$. Thus $A = \{0, 1, 2, \ldots, N-1\}$, with the addition and multiplication modulo $N$. Often we shall view $A$ as a group with respect to the addition.

The Heisenberg group of degree one, with coefficients in $A$, is the group $G_1(A)$ of all matrices of the form

$$\begin{pmatrix} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix} \qquad (x, y, z \in A).$$

We shall identify $G_1(A)$ with the Cartesian product $A \times A \times A$ via the map

$$G_1(A) \ni \begin{pmatrix} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix} \mapsto (x, y, z) \in A \times A \times A. \tag{1}$$

In terms of (1), the matrix multiplication and the inverse look as follows:

$$(x, y, z)(x', y', z') = (x + x', y + y', z + z' + xy')$$
$$(x, y, z)^{-1} = (-x, -y, -z + xy) \qquad (x, y, z, x', y', z' \in A).$$

Let

$$\chi(a) = \exp(2\pi j a / N) \qquad (a \in A).$$

This is a unitary character of the (additive) group $A$. For a function $u\colon A \to \mathbb{C}$ let

$$\|u\|_2 = \left( \sum_{a \in A} |u(a)|^2 \right)^{1/2} \tag{2}$$

and let $L^2(A)$ denote the Hilbert space of all such functions, with the norm (2). Let

$$\rho(x, y, z)u(a) = \chi(ay + z)\, u(a + x) \qquad (u \in L^2(A);\ a, x, y, z \in A). \tag{3}$$

It is easy to check that $\rho$ is a group homomorphism from $G_1(A)$ to the group of unitary operators on $L^2(A)$. In other words, $\rho$ is a unitary representation of $G_1(A)$ on the space $L^2(A)$.

Recall the discrete Fourier transform (DFT), defined with respect to the character $\chi$:

$$\mathcal{F}u(b) = \hat{u}(b) = |A|^{-1/2} \sum_{a \in A} u(a)\chi(-ab) \qquad (u \in L^2(A);\ b \in A).$$

Here $|A| = N$ is the cardinality of the set $A$. The inverse Fourier transform is given by

$$u(a) = |A|^{-1/2} \sum_{b \in A} \hat{u}(b)\chi(ab) \qquad (u \in L^2(A);\ a \in A).$$

A straightforward calculation shows that

$$\mathcal{F}\rho(x, y, z)\mathcal{F}^{-1} = \rho(-y, x, z - xy) \qquad (x, y, z \in A). \tag{4}$$

In other words, the Fourier transform normalizes the group $\rho(G_1(A))$.

For $u \in L^2(A)$, with $\|u\|_2 = 1$, let

$$H(u) = -\sum_{a \in A} |u(a)|^2 \log |u(a)|^2$$

and let

$$H_p(u) = pH(u) + (1 - p)H(\hat{u}) \qquad (0 \le p \le 1).$$
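Since the objects defined in this section are all finite-dimensional, they are easy to check numerically. The following Python sketch (our own illustration, not part of the correspondence; all variable and function names are ours) builds the matrices of $\rho$ and $\mathcal{F}$ for a small $N$ and verifies the homomorphism property, the relation (4), and the invariance of $H_p$ under the Heisenberg action:

```python
import numpy as np

N = 12
A = np.arange(N)

def chi(a):
    # Unitary character chi(a) = exp(2*pi*j*a/N) of the additive group A = Z/NZ.
    return np.exp(2j * np.pi * a / N)

def rho(x, y, z):
    # Matrix of the operator rho(x, y, z)u(a) = chi(a*y + z) * u(a + x), eq. (3).
    M = np.zeros((N, N), dtype=complex)
    for a in A:
        M[a, (a + x) % N] = chi(a * y + z)
    return M

# Unitary DFT with respect to chi: Fu(b) = |A|^(-1/2) sum_a u(a) chi(-a*b).
F = np.array([[chi(-a * b) for a in A] for b in A]) / np.sqrt(N)

# rho is a homomorphism for the group law
# (x, y, z)(x', y', z') = (x + x', y + y', z + z' + x*y').
g, gp = (3, 5, 2), (7, 1, 4)
lhs = rho(*g) @ rho(*gp)
rhs = rho((g[0] + gp[0]) % N, (g[1] + gp[1]) % N,
          (g[2] + gp[2] + g[0] * gp[1]) % N)
assert np.allclose(lhs, rhs)

# Eq. (4): F rho(x, y, z) F^{-1} = rho(-y, x, z - x*y).
x, y, z = 3, 5, 2
assert np.allclose(F @ rho(x, y, z) @ np.linalg.inv(F),
                   rho((-y) % N, x, (z - x * y) % N))

def H(u):
    # Entropy of the distribution |u(a)|^2 for a normalized u.
    p = np.abs(u) ** 2
    p = p[p > 1e-12]
    return -np.sum(p * np.log(p))

def Hp(u, p):
    return p * H(u) + (1 - p) * H(F @ u)

# H_p(rho(h)u) = H_p(u): the action only permutes and rephases magnitudes.
u = np.random.default_rng(0).normal(size=N) + 0j
u /= np.linalg.norm(u)
assert np.isclose(Hp(rho(x, y, z) @ u, 0.3), Hp(u, 0.3))
```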
It is easy to see that

$$H_p(\rho(h)u) = H_p(u) \qquad (h \in G_1(A);\ 0 \le p \le 1).$$

We would like to consider $u \in L^2(A)$ with $\|u\|_2 = 1$ equivalent to $\lambda u$ where $|\lambda| = 1$. As $H(u) = H(v)$ and $H_p(u) = H_p(v)$ for equivalent $u$ and $v$, $H$ and $H_p$ are defined on the equivalence classes. This set of equivalence classes forms a complex projective space which we will denote by $P(A)$. Note that being orthogonal is well-defined on the equivalence classes, so a subset of $P(A)$ being orthonormal makes sense. There is an induced action of the Heisenberg group $G_1(A)$ on $P(A)$ defined via (3) at the level of representatives for the equivalence classes. Below we will use the same symbol $u$ for an element of $L^2(A)$ with $\|u\|_2 = 1$ and the equivalence class it represents in $P(A)$.

If $B$ is a subset of $A$, let $\mathbf{1}_B$ denote the indicator function of $B$. Thus, $\mathbf{1}_B(a) = 1$ if $a \in B$, and $\mathbf{1}_B(a) = 0$ if $a \in A \setminus B$. Here is our main theorem.

Theorem 1 (Main Theorem):

a) If $u \in P(A)$, then $H_{1/2}(u) \ge \frac{1}{2}\log(|A|)$.

b) The set of vectors $u \in P(A)$ with $H_{1/2}(u) = \frac{1}{2}\log(|A|)$ coincides with the union of the orbits
$$\rho(G_1(A))\, \frac{1}{\sqrt{|B|}}\mathbf{1}_B \qquad (B\ \text{a subgroup of}\ A). \tag{5}$$

c) Each orbit (5) is an orthonormal basis of $L^2(A)$.

d) The set of vectors $u \in P(A)$ with $H_p(u) = \frac{1}{2}\log(|A|)$ for all $0 \le p \le 1$ is not empty if and only if $|A|$ is a square. In this case, this set coincides with the orbit (5) for the unique subgroup $B \subseteq A$ of cardinality $|B| = \sqrt{|A|}$.

Part a) of the above theorem has been proven by Dembo, Cover, and Thomas [3]. The idea of their proof is based on Hirschman's work [6]. In fact, those authors name the inequality a) "the discrete Hirschman uncertainty principle." Following this line, we have chosen the title of this correspondence. While unaware of the work in [3], we conjectured a result close to the above theorem in [1]. The conjecture was refined in [4].

The strategy of the proof of part b) is to reduce it to a result of Donoho and Stark [7, Theorem 13]. In order to keep the presentation self-contained, we give proofs for this as well as part a). Part c) suggests a close connection of the functions listed in b) with wavelets, along the lines explored partially in [8].

A generalization of parts a) and b) of the Main Theorem, where the finite cyclic group $A$ is replaced with a compactly generated, locally compact abelian group, is available [9]. This includes multidimensional finite ($A$ a product of finite cyclic groups), continuous ($A = \mathbb{R}^N$), and periodic ($A = (\mathbb{R}/\mathbb{Z})^N$) cases, as well as their products.

III. PROOF OF THE MAIN THEOREM

For a function $u\colon A \to \mathbb{C}$ and a number $0 < p < \infty$ let

$$\|u\|_p = \left( \sum_{a \in A} |u(a)|^p \right)^{1/p}.$$

Also, let

$$\|u\|_\infty = \max\{|u(a)|;\ a \in A\}.$$

A straightforward calculation shows that for a nonzero function $u\colon A \to \mathbb{C}$ and for a number $0 < t < \infty$ the following formulas hold:

$$\frac{d}{dt}\log\|u\|_{1/t} = -\sum_{a \in A} \frac{|u(a)|^{1/t}}{\|u\|_{1/t}^{1/t}} \log \frac{|u(a)|^{1/t}}{\|u\|_{1/t}^{1/t}} \tag{6}$$

and

$$\frac{d^2}{dt^2}\log\|u\|_{1/t} = \frac{1}{t} \sum_{a \in A} \frac{|u(a)|^{1/t}}{\|u\|_{1/t}^{1/t}} \left[ \log\frac{|u(a)|^{1/t}}{\|u\|_{1/t}^{1/t}} - \sum_{b \in A} \frac{|u(b)|^{1/t}}{\|u\|_{1/t}^{1/t}} \log\frac{|u(b)|^{1/t}}{\|u\|_{1/t}^{1/t}} \right]^2.$$

Since the second derivative is nonnegative, the function $\log\|u\|_{1/t}$, $0 < t < \infty$, is convex. Hence, for $u \in L^2(A)$ with $\|u\|_2 = 1$

$$H(u) = \left. \frac{d}{dt}\log\|u\|_{1/t} \right|_{t = 1/2} \le \lim_{t \to \infty} \frac{d}{dt}\log\|u\|_{1/t} = \log(|\operatorname{supp} u|) \tag{7}$$

where $|\operatorname{supp} u|$ stands for the cardinality of $\operatorname{supp} u$, the support of $u$. The inequality (7) is of course well known.

Since $\|\hat{u}\|_2 = \|u\|_2$, and since

$$\|\hat{u}\|_\infty \le |A|^{-1/2}\|u\|_1$$

the Riesz–Thorin theorem [10, Ch. 12, eq. (1.11)] implies

$$\|\hat{u}\|_{1/(1-t)} \le |A|^{\frac{1}{2} - t}\, \|u\|_{1/t} \qquad \left( \tfrac{1}{2} \le t \le 1;\ u \ne 0 \right). \tag{8}$$

By applying the negative logarithm to both sides of (8) we obtain the following inequality:

$$\log(\|u\|_{1/t}) - \log(\|\hat{u}\|_{1/(1-t)}) \ge \left( t - \tfrac{1}{2} \right) \log(|A|) \qquad \left( \tfrac{1}{2} \le t \le 1 \right). \tag{9}$$

As an aside, notice that the left-hand side of (9) is a difference of two convex functions.

We assume from now on that $\|u\|_2 = 1$. Then both sides of (9) are equal to zero for $t = \frac{1}{2}$. Hence, (6) and (7) imply

$$H(u) + H(\hat{u}) \ge \log(|A|). \tag{10}$$

This verifies part a) of the theorem as in [3].

We are interested in functions $u$ for which the equality holds in (10). We are going to use some ideas of Zygmund [10, Ch. 12, eqs. (1.20)–(1.24)]. For a complex number $z \in \mathbb{C}$ define

$$f(z) = |A|^{-\frac{1}{2} + z} \sum_{b \in A} \mathcal{F}\!\left( |u|^{2z} \frac{u}{|u|} \right)\!(b)\; |\hat{u}(b)|^{2z}\, \frac{\overline{\hat{u}(b)}}{|\hat{u}(b)|}.$$

Here $\frac{u}{|u|} = 0$ outside the support of $u$, and, similarly, for $\frac{\hat{u}}{|\hat{u}|}$. Notice that for $y \in \mathbb{R}$

$$\left| f\left( \tfrac{1}{2} + iy \right) \right| \le \big\| |u|^{1+i2y} \big\|_2 \cdot \big\| |\hat{u}|^{1+i2y} \big\|_2 = \|u\|_2 \cdot \|\hat{u}\|_2 = 1 \cdot 1 = 1$$

and

$$|f(1 + iy)| \le |A|^{1/2} \left\| \mathcal{F}\!\left( |u|^{2+i2y}\frac{u}{|u|} \right) \right\|_\infty \cdot \big\| |\hat{u}|^{2+i2y} \big\|_1 \le |A|^{1/2} \cdot |A|^{-1/2}\, \big\| |u|^{2+i2y} \big\|_1 \cdot \|\hat{u}\|_2^2 = \|u\|_2^2 \cdot \|\hat{u}\|_2^2 = 1.$$

Hence, by the Phragmén–Lindelöf theorem [10, Ch. 12, eq. (1.1)]

$$|f(z)| \le 1 \qquad \left( \tfrac{1}{2} \le \operatorname{Re}(z) \le 1 \right).$$
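Theorem 1 can be illustrated numerically. The following small Python check (our own illustration; the choice $N = 16$ and the tolerances are ours) confirms the bound $H_{1/2}(u) \ge \frac{1}{2}\log N$ on random unit vectors and the equality case for the normalized indicator of a subgroup:

```python
import numpy as np

N = 16  # |A|; a square, so part d) applies with |B| = 4
F = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N) / np.sqrt(N)

def H(u):
    p = np.abs(u) ** 2
    p = p[p > 1e-12]
    return -np.sum(p * np.log(p))

def H_half(u):
    return 0.5 * H(u) + 0.5 * H(F @ u)

bound = 0.5 * np.log(N)

# Random unit vectors satisfy the inequality of part a).
rng = np.random.default_rng(1)
for _ in range(100):
    u = rng.normal(size=N) + 1j * rng.normal(size=N)
    u /= np.linalg.norm(u)
    assert H_half(u) >= bound - 1e-9

# Equality holds for the normalized indicator of the subgroup B = {0, 4, 8, 12}.
u = np.zeros(N, dtype=complex)
u[::4] = 0.5  # 1/sqrt(|B|) with |B| = 4
assert np.isclose(H_half(u), bound)
# Here B_perp = B, so the DFT of u is again an indicator and H(u) = H(u_hat),
# illustrating part d) with |B| = sqrt(|A|).
assert np.isclose(H(u), H(F @ u))
```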
A straightforward calculation shows that

$$\frac{d}{dz}f(z) = f'(z) = f(z)\log(|A|) + |A|^{-\frac{1}{2}+z} \sum_{b \in A} \left[ \mathcal{F}\!\left( |u|^{2z}\frac{u}{|u|}\log(|u|^2) \right)\!(b)\; |\hat{u}(b)|^{2z}\frac{\overline{\hat{u}(b)}}{|\hat{u}(b)|} + \mathcal{F}\!\left( |u|^{2z}\frac{u}{|u|} \right)\!(b)\; |\hat{u}(b)|^{2z}\log(|\hat{u}(b)|^2)\frac{\overline{\hat{u}(b)}}{|\hat{u}(b)|} \right].$$

Hence, by Plancherel's formula

$$f'\left(\tfrac{1}{2}\right) = \log(|A|) + \sum_{a \in A} |u(a)|^2\log(|u(a)|^2) + \sum_{b \in A} |\hat{u}(b)|^2\log(|\hat{u}(b)|^2) = \log(|A|) - H(u) - H(\hat{u}).$$

Thus, the equality in (10) is equivalent to $f'(\frac{1}{2}) = 0$. Altogether, we have checked that the function $f(z)$ has the following properties: $f(z)$ is an entire function and

$$|f(z)| \le 1 \quad \left( \tfrac{1}{2} \le \operatorname{Re}(z) \le 1 \right), \qquad f\left(\tfrac{1}{2}\right) = 1, \qquad f'\left(\tfrac{1}{2}\right) = 0.$$

In particular, $\operatorname{Re}(f(z))$ is a real-valued harmonic function in the disc of radius $\frac{1}{4}$ centered at $z = \frac{3}{4}$. This harmonic function achieves its maximum at $z = \frac{1}{2}$, and has derivative equal to zero at this point. Hence, Hopf's Maximum Principle [11, Theorem 3.1.6'] implies that $\operatorname{Re}(f(z))$ is constant on this disc. Hence, by standard properties of entire functions, $f(z) = 1$ for all $z \in \mathbb{C}$. This equation coincides with the formula [10, Ch. 12, eq. (1.24)], which has been obtained there under a slightly stronger assumption [10, Ch. 12, eq. (1.20)]. In particular, for $z = 1$ we obtain

$$1 = f(1) = |A|^{1/2} \sum_{b \in A} \mathcal{F}(|u|u)(b)\; |\hat{u}(b)|\, \overline{\hat{u}(b)}. \tag{11}$$

Now we follow Zygmund's proof of [10, Ch. 12, eq. (2.18)]. The formula (11) may be rewritten as

$$1 = \sum_{a, b \in A} |u(a)|^2\, |\hat{u}(b)|^2\, \chi(-ab)\, \frac{u(a)}{|u(a)|}\, \frac{\overline{\hat{u}(b)}}{|\hat{u}(b)|}.$$

Since

$$\sum_{a, b \in A} |u(a)|^2\, |\hat{u}(b)|^2 = \|u\|_2^2 \cdot \|\hat{u}\|_2^2 = 1$$

and each factor $\chi(-ab)\,\frac{u(a)}{|u(a)|}\,\frac{\overline{\hat{u}(b)}}{|\hat{u}(b)|}$ has absolute value one, it follows that

$$1 = \chi(-ab)\, \frac{u(a)}{|u(a)|}\, \frac{\overline{\hat{u}(b)}}{|\hat{u}(b)|} \qquad (a \in \operatorname{supp} u;\ b \in \operatorname{supp}\hat{u}). \tag{12}$$

Hence, for $b \in \operatorname{supp}\hat{u}$, (12) implies

$$\hat{u}(b) = |A|^{-1/2} \sum_{a \in \operatorname{supp} u} u(a)\chi(-ab) = |A|^{-1/2} \sum_{a \in \operatorname{supp} u} |u(a)|\, \frac{\hat{u}(b)}{|\hat{u}(b)|}.$$

By taking the absolute value of the extreme left and right sides of the above equations, we get

$$|\hat{u}(b)| = |A|^{-1/2} \sum_{a \in \operatorname{supp} u} |u(a)| \qquad (b \in \operatorname{supp}\hat{u}). \tag{13}$$

Similarly, for $a \in \operatorname{supp} u$, (12) implies

$$u(a) = |A|^{-1/2} \sum_{b \in \operatorname{supp}\hat{u}} \hat{u}(b)\chi(ab) = |A|^{-1/2} \sum_{b \in \operatorname{supp}\hat{u}} |\hat{u}(b)|\, \frac{u(a)}{|u(a)|}$$

and, therefore,

$$|u(a)| = |A|^{-1/2} \sum_{b \in \operatorname{supp}\hat{u}} |\hat{u}(b)| \qquad (a \in \operatorname{supp} u). \tag{14}$$

The statements (13) and (14) mean that the functions $|u|$ and $|\hat{u}|$ are constant on their support. Since $\|u\|_2 = 1$, it follows that

$$|u(a)| = |\operatorname{supp} u|^{-1/2} \quad\text{and}\quad |\hat{u}(b)| = |\operatorname{supp}\hat{u}|^{-1/2} \qquad (a \in \operatorname{supp} u;\ b \in \operatorname{supp}\hat{u}).$$

Hence,

$$H(u) + H(\hat{u}) = \log(|\operatorname{supp} u|) + \log(|\operatorname{supp}\hat{u}|).$$

Thus, the equality in (10) implies

$$|\operatorname{supp} u| \cdot |\operatorname{supp}\hat{u}| = |A|.$$

Thus, part b) of the theorem will follow as soon as we verify the following theorem of Donoho and Stark [7].

Theorem 2: Let $v \in L^2(A)$. Then the equation

$$|\operatorname{supp} v| \cdot |\operatorname{supp}\hat{v}| = |A|$$

holds if and only if there is a subgroup $B \subseteq A$, an element $h \in G_1(A)$, and a constant "const" such that $v = \text{const}\cdot\rho(h)\mathbf{1}_B$.

Lemma 3 [7]: Let $v \in L^2(A)$ and let $m = |\operatorname{supp} v|$. Then $\hat{v}$ cannot have $m$ consecutive zeros.

Proof: Since the translations of $\hat{v}$ do not affect the support of $v$, it shall suffice to show that

$$(\hat{v}(0), \hat{v}(1), \ldots, \hat{v}(m-1)) \ne (0, 0, \ldots, 0).$$

Let $\operatorname{supp} v = \{a_1, a_2, \ldots, a_m\}$. Then

$$\begin{pmatrix} \hat{v}(0) \\ \hat{v}(1) \\ \hat{v}(2) \\ \vdots \\ \hat{v}(m-1) \end{pmatrix} = |A|^{-1/2} \begin{pmatrix} 1 & 1 & \cdots & 1 \\ \chi(-a_1) & \chi(-a_2) & \cdots & \chi(-a_m) \\ \chi(-a_1)^2 & \chi(-a_2)^2 & \cdots & \chi(-a_m)^2 \\ \vdots & & & \vdots \\ \chi(-a_1)^{m-1} & \chi(-a_2)^{m-1} & \cdots & \chi(-a_m)^{m-1} \end{pmatrix} \begin{pmatrix} v(a_1) \\ v(a_2) \\ v(a_3) \\ \vdots \\ v(a_m) \end{pmatrix}. \tag{15}$$
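The support-product equation of Theorem 2 is easy to test numerically. The following sketch (our own illustration; names and the choice $N = 12$ are ours) checks that the indicator of a subgroup of $\mathbb{Z}/12\mathbb{Z}$ attains $|\operatorname{supp} v|\cdot|\operatorname{supp}\hat{v}| = |A|$, that the Heisenberg action of (3) preserves the equality, and that a generic vector exceeds the bound:

```python
import numpy as np

N = 12
F = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N) / np.sqrt(N)

def supp_size(v, tol=1e-9):
    return int(np.sum(np.abs(v) > tol))

# v = indicator of the subgroup B = {0, 3, 6, 9}: the support product is |A|.
v = np.zeros(N, dtype=complex)
v[::3] = 1.0
assert supp_size(v) * supp_size(F @ v) == N   # 4 * 3 == 12

# Modulating and translating v (the action of G1(A), eq. (3)) preserves equality.
a = np.arange(N)
w = np.exp(2j * np.pi * 5 * a / N) * np.roll(v, 2)
assert supp_size(w) * supp_size(F @ w) == N

# For a generic vector the product is strictly larger (Donoho-Stark: >= |A|).
rng = np.random.default_rng(0)
g = rng.normal(size=N) + 1j * rng.normal(size=N)
assert supp_size(g) * supp_size(F @ g) > N
```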
Since, by Vandermonde, the above $m \times m$ matrix is invertible, (15) implies that the vector $(\hat{v}(0), \hat{v}(1), \ldots, \hat{v}(m-1))$ is nonzero, and we are done.

Next we recall a few facts concerning the Fourier transform. For a subset $S \subseteq A$ let

$$S^\perp = \{a \in A;\ ab = 0 \text{ for all } b \in S\}.$$

It is easy to see that $S^\perp$ is a subgroup of $A$, and that $S^{\perp\perp}$ is the smallest subgroup of $A$ containing $S$. Furthermore, $v$ is invariant under translations by $(\operatorname{supp}\hat{v})^\perp$, i.e.,

$$v(a + b) = v(a) \qquad (a \in A;\ b \in (\operatorname{supp}\hat{v})^\perp). \tag{16}$$

Here is a statement dual to (16): for a subgroup $B \subseteq A$, if $v(a + b) = v(a)$ for $a \in A$ and $b \in B$, then $\operatorname{supp}\hat{v} \subseteq B^\perp$.

An elementary counting argument shows that for any subgroup $B \subseteq A$

$$|A| = |B|\,|B^\perp| \tag{17}$$

and

$$\mathcal{F}\,\frac{1}{\sqrt{|B|}}\mathbf{1}_B = \frac{1}{\sqrt{|B^\perp|}}\mathbf{1}_{B^\perp} \tag{18}$$

and

$$H\left( \frac{1}{\sqrt{|B|}}\mathbf{1}_B \right) = \log(|B|).$$

Proof of Theorem 2: Let $B \subseteq A$ be a subgroup and let $h \in G_1(A)$. We know from (4) and (18) that there is $h_0 \in G_1(A)$ such that

$$\mathcal{F}\rho(h)\,\frac{1}{\sqrt{|B|}}\mathbf{1}_B = \rho(h_0)\,\mathcal{F}\,\frac{1}{\sqrt{|B|}}\mathbf{1}_B = \rho(h_0)\,\frac{1}{\sqrt{|B^\perp|}}\mathbf{1}_{B^\perp}.$$

Hence, by (17)

$$\left| \operatorname{supp}\rho(h)\frac{1}{\sqrt{|B|}}\mathbf{1}_B \right| \cdot \left| \operatorname{supp}\mathcal{F}\rho(h)\frac{1}{\sqrt{|B|}}\mathbf{1}_B \right| = |B| \cdot |B^\perp| = |A|.$$

Conversely, suppose $v \in L^2(A)$ is such that $|\operatorname{supp} v| \cdot |\operatorname{supp}\hat{v}| = |A|$. Then the lemma implies that the elements of $\operatorname{supp}\hat{v}$ are equally spaced. Hence, there is $h \in G_1(A)$ such that $\operatorname{supp}(\rho(h)v)^{\widehat{\ }}$ is a subgroup of $A$. Thus, we may assume that $\operatorname{supp}\hat{v}$ is a subgroup of $A$. Let $B$ be the unique subgroup of $A$ such that $B^\perp = \operatorname{supp}\hat{v}$. Then $v$ is invariant under translations by elements of $B$, by (16). In particular, $|\operatorname{supp} v|$ is a multiple of $|B|$. But our assumption implies that $|\operatorname{supp} v| = |A|/|B^\perp| = |B|$. Hence, $v$ is a constant multiple of a translate of $\mathbf{1}_B$.

This completes the proof of part b) of the Main Theorem. Part d) of the Main Theorem is immediate from part b) because the equation

$$H_p(u) = \frac{1}{2}\log(|A|) \qquad (0 \le p \le 1)$$

is equivalent to

$$H(u) = H(\hat{u}) = \frac{1}{2}\log(|A|)$$

which, for $u = \frac{1}{\sqrt{|B|}}\mathbf{1}_B$, becomes $|B| = \sqrt{|A|}$, by (18).

It remains to verify part c) of the Main Theorem. A straightforward argument shows that, under the isomorphism (1), the stabilizer of the complex line $\mathbb{C}\,\mathbf{1}_B$, in $G_1(A)$, is given by

$$\operatorname{Stab}_{G_1(A)}(\mathbb{C}\,\mathbf{1}_B) = B \times B^\perp \times A.$$

This is a normal subgroup of $G_1(A)$, the quotient group $G_1(A)/\operatorname{Stab}_{G_1(A)}(\mathbb{C}\,\mathbf{1}_B)$ is isomorphic to $(A/B) \times (A/B^\perp)$, via (1), and, by (17), has $(|A|/|B|)(|A|/|B^\perp|) = |A|$ elements. Thus, the number of distinct elements in the orbit $\rho(G_1(A))\frac{1}{\sqrt{|B|}}\mathbf{1}_B$, (5), coincides with the dimension of the space $L^2(A)$.

It remains to check that any two distinct elements of this orbit are orthogonal. Since the representation $\rho$ is unitary, it shall suffice to show that

$$\sum_{a \in A} \rho(x, y, 0)\mathbf{1}_B(a)\, \mathbf{1}_B(a) = 0 \qquad (x \in A \setminus B \text{ or } y \in A \setminus B^\perp). \tag{19}$$

The left-hand side of (19) is equal to

$$\sum_{a \in B \cap (-x + B)} \chi(ay). \tag{20}$$

If $x \in A \setminus B$, then $B \cap (-x + B)$ is empty, so the quantity (20) is zero. If $x \in B$, then $B \cap (-x + B) = B$, so the quantity (20) is equal to $|A|^{1/2}\,\mathcal{F}(\mathbf{1}_B)(-y) = 0$, by (18). This completes our proof of part c) of the Main Theorem, and thus of the whole theorem.

IV. THE HIRSCHMAN OPTIMAL TRANSFORM

Now that we have seen the theorem that defines the discrete Hirschman uncertainty principle optimal transform (HOT), we provide details regarding the transform.

A. The HOT Basis Functions

The basis functions that define the HOT are derived according to part b) of the Main Theorem and the constructions suggested in [7]. Consequently, we use the $K$-dimensional DFT as the originator signals for our $N = K^2$-dimensional HOT basis. Each of these basis functions must then be shifted and interpolated to produce the sufficient number of orthogonal basis functions that define the HOT. We note that the DFT basis can be extended in a similar manner to produce an $N = KL$-dimensional transform. This basis, however, does not yield a HOT.

To detail this process, consider the three-point DFT defined by

$$\begin{pmatrix} X(0) \\ X(1) \\ X(2) \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 \\ 1 & e^{-j2\pi/3} & e^{-j4\pi/3} \\ 1 & e^{-j4\pi/3} & e^{-j8\pi/3} \end{pmatrix} \begin{pmatrix} x[0] \\ x[1] \\ x[2] \end{pmatrix}.$$

This three-point DFT yields the nine-point HOT shown at the end of this correspondence.

This organization is not unique; the rows can be reordered as desired. This representation would be consistent with the DFT. The Matlab source that implements the general version of the HOT is shown below:
function H = hot(x)
% This function implements an N = K^2 Hirschman optimal transform
%   H = hot(x);
% Input:  x is a sequence of length N = K^2
% Output: H is the transform sequence
[N, M] = size(x);
K = sqrt(N);
T = zeros(N);
W = fft(eye(K));
n = 1 : K : N;
for tr = 0 : K - 1
    T(n + tr, n + tr) = W;
end
T = (1/sqrt(K)) * T;
H = T * x;
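For readers working outside Matlab, the script can be ported directly. The following Python/NumPy version (our own port; the function name and error handling are ours) builds the same block-interleaved transform matrix:

```python
import numpy as np

def hot(x):
    """Hirschman optimal transform of a length N = K^2 sequence x.

    Port of the Matlab `hot` function above: the transform matrix T applies
    a K-point DFT to each of the K interleaved sample grids
    x[l], x[K + l], ..., x[K*(K - 1) + l], scaled by 1/sqrt(K).
    """
    x = np.asarray(x, dtype=complex)
    N = x.shape[0]
    K = int(round(np.sqrt(N)))
    assert K * K == N, "HOT requires N to be a perfect square"
    T = np.zeros((N, N), dtype=complex)
    W = np.fft.fft(np.eye(K))           # unnormalized K-point DFT matrix
    for l in range(K):                  # one K-point DFT per offset l
        idx = np.arange(l, N, K)        # samples l, K+l, 2K+l, ...
        T[np.ix_(idx, idx)] = W         # rows H(l), H(K+l), ..., cf. H(Kr+l)
    T = T / np.sqrt(K)
    return T @ x

# The transform matrix is unitary, so hot() preserves the 2-norm.
x = np.arange(9, dtype=float)
H = hot(x)
assert np.isclose(np.linalg.norm(H), np.linalg.norm(x))
```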
In this script, H is the transform sequence, T is the transform matrix, and x is the input sequence. The transform is unitary to a scale (just like the DFT), and so the inverse transform can be achieved by taking the conjugate transpose and scaling by $\sqrt{K}$.
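As a sketch of the inversion in Python (ours; it assumes the same construction as the script above, with the $1/\sqrt{K}$ factor kept in the analysis matrix, which makes $T$ exactly unitary so that the conjugate transpose alone recovers the input):

```python
import numpy as np

def hot_matrix(K):
    # Build the N = K^2 HOT matrix: a K-point DFT on each interleaved grid,
    # scaled by 1/sqrt(K). (Our own helper, mirroring the Matlab script.)
    N = K * K
    T = np.zeros((N, N), dtype=complex)
    W = np.fft.fft(np.eye(K))
    for l in range(K):
        idx = np.arange(l, N, K)
        T[np.ix_(idx, idx)] = W
    return T / np.sqrt(K)

T = hot_matrix(3)
x = np.random.default_rng(0).normal(size=9)
H = T @ x
# Inverse transform via the conjugate transpose of T.
x_rec = T.conj().T @ H
assert np.allclose(x_rec, x)
```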
B. Fast HOT Computation

Because the HOT is based on periodic shifts of the DFT, the $N = K^2$-point HOT can be accomplished using $K$ separate $K$-point DFT computations. Because the HOT requires lengths $N$ that are squares of integers, the efficiency of any computational procedure will depend on the exact length $N$. For $N = 4, 16, 64, 256$, etc., this provides a fast HOT that requires $O(N \log K)$ computations. For other lengths $N$, the efficiency is less. For instance, in the $N = 9$-point HOT shown above, we can see that the HOT transform coefficients are determined from

$$\begin{pmatrix} H(0) \\ H(3) \\ H(6) \end{pmatrix} = \text{DFT}\begin{pmatrix} x[0] \\ x[3] \\ x[6] \end{pmatrix}$$

and

$$\begin{pmatrix} H(1) \\ H(4) \\ H(7) \end{pmatrix} = \text{DFT}\begin{pmatrix} x[1] \\ x[4] \\ x[7] \end{pmatrix}$$

and, finally, that

$$\begin{pmatrix} H(2) \\ H(5) \\ H(8) \end{pmatrix} = \text{DFT}\begin{pmatrix} x[2] \\ x[5] \\ x[8] \end{pmatrix}.$$

This requires three separate three-point DFT computations. In general, we have the (unitary) transform relationship

$$H(Kr + l) = \frac{1}{\sqrt{K}} \sum_{n=0}^{K-1} x[Kn + l]\, e^{-j2\pi nr/K} \qquad (0 \le r, l \le K - 1)$$

and its inverse

$$x[Kn + l] = \frac{1}{\sqrt{K}} \sum_{r=0}^{K-1} H(Kr + l)\, e^{j2\pi nr/K} \qquad (0 \le n, l \le K - 1).$$

Of course, in practice, the square roots need not be carried out. This is commonly done with the DFT as well: the one square root is moved out of the analysis relationship and into the synthesis relationship to create the scale $\frac{1}{K}$. The $N$-point HOT is computationally more efficient than the $N$-point DFT, and increasingly more efficient as $N \to \infty$. As we have mentioned above, this comparison is somewhat simplistic because the squared integers are not, in general, powers of 2. Consequently, for any length $N$ we should compare specific operation counts.

ACKNOWLEDGMENT

The authors would like to thank Dr. M. Doroslovacki of George Washington University, Washington, DC, for his comments on [1], which helped lead them to example signals that met the conjectured minimum, and thus ultimately to this proof characterizing all signals that are optimal according to the discrete form of the Hirschman uncertainty principle.

REFERENCES

[1] V. DeBrunner, M. Özaydın, and T. Przebinda, "Resolution in time-frequency," IEEE Trans. Signal Processing, vol. 47, pp. 783–788, Mar. 1999.
[2] V. DeBrunner, M. Özaydın, T. Przebinda, and J. Havlicek, "The optimal solutions to the continuous- and discrete-time versions of the Hirschman uncertainty principle," in Proc. ICASSP'00, Istanbul, Turkey, June 5–9, 2000.
[3] A. Dembo, T. M. Cover, and J. A. Thomas, "Information theoretic inequalities," IEEE Trans. Inform. Theory, vol. 37, pp. 1501–1518, Nov. 1991.
[4] V. DeBrunner, M. Özaydın, and T. Przebinda, "Analysis in a finite time-frequency plane," IEEE Trans. Signal Processing, vol. 48, pp. 3586–3587, Dec. 2000.
[5] T. Przebinda, V. E. DeBrunner, and M. Özaydın, "Using a new uncertainty measure to determine optimal bases for signal representations," in Proc. ICASSP'99, Phoenix, AZ, Mar. 1999, paper 1575.
[6] I. I. Hirschman, "A note on entropy," Amer. J. Math., vol. 79, pp. 152–156, 1957.
[7] D. L. Donoho and P. B. Stark, "Uncertainty principles and signal recovery," SIAM J. Appl. Math., vol. 49, pp. 906–931, 1989.
[8] M. Özaydın and T. Przebinda, "Platonic orthonormal wavelets," Appl. Comput. Harmon. Anal., vol. 4, pp. 351–365, 1997.
[9] M. Özaydın and T. Przebinda, "An entropy-based uncertainty principle for a locally compact, abelian, compactly generated group," submitted for publication.
[10] A. Zygmund, Trigonometric Series, 2nd ed. Cambridge, U.K.: Cambridge Univ. Press, 1990, vols. I and II.
[11] L. Hörmander, Notions of Convexity. Basel, Switzerland: Birkhäuser, 1994.
The nine-point HOT referred to in Section IV-A is

$$\begin{pmatrix} H(0) \\ H(1) \\ H(2) \\ H(3) \\ H(4) \\ H(5) \\ H(6) \\ H(7) \\ H(8) \end{pmatrix} = \frac{1}{\sqrt{3}} \begin{pmatrix}
1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 \\
1 & 0 & 0 & e^{-j2\pi/3} & 0 & 0 & e^{-j4\pi/3} & 0 & 0 \\
0 & 1 & 0 & 0 & e^{-j2\pi/3} & 0 & 0 & e^{-j4\pi/3} & 0 \\
0 & 0 & 1 & 0 & 0 & e^{-j2\pi/3} & 0 & 0 & e^{-j4\pi/3} \\
1 & 0 & 0 & e^{-j4\pi/3} & 0 & 0 & e^{-j8\pi/3} & 0 & 0 \\
0 & 1 & 0 & 0 & e^{-j4\pi/3} & 0 & 0 & e^{-j8\pi/3} & 0 \\
0 & 0 & 1 & 0 & 0 & e^{-j4\pi/3} & 0 & 0 & e^{-j8\pi/3}
\end{pmatrix} \begin{pmatrix} x[0] \\ x[1] \\ x[2] \\ x[3] \\ x[4] \\ x[5] \\ x[6] \\ x[7] \\ x[8] \end{pmatrix}.$$
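As a closing sanity check (our own observation, not stated in the correspondence): with the row and column ordering displayed above, the nine-point HOT matrix equals $\frac{1}{\sqrt{3}}(F_3 \otimes I_3)$, where $F_3$ is the unnormalized three-point DFT matrix and $\otimes$ is the Kronecker product; in particular, it is unitary:

```python
import numpy as np

K = 3
N = K * K
F3 = np.fft.fft(np.eye(K))          # unnormalized three-point DFT matrix
# Row H(K*r + l), column x[K*n + l] carries the entry exp(-j*2*pi*n*r/K),
# which is exactly the Kronecker product F3 (x) I3, scaled by 1/sqrt(3).
T = np.kron(F3, np.eye(K)) / np.sqrt(K)

# Entry check against the displayed matrix: the H(3) row picks x[0], x[3], x[6]
# with weights 1, exp(-j*2*pi/3), exp(-j*4*pi/3).
assert np.isclose(T[3, 0], 1 / np.sqrt(3))
assert np.isclose(T[3, 3], np.exp(-2j * np.pi / 3) / np.sqrt(3))
assert np.isclose(T[3, 6], np.exp(-4j * np.pi / 3) / np.sqrt(3))

# Unitarity: T^H T = I.
assert np.allclose(T.conj().T @ T, np.eye(N))
```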