New Proofs
JENSEN-SHANNON DIVERGENCES
1. Introduction
For $0 \le a \le b$, consider the following least squares problem with respect to
some distance function $d$ on $\mathbb{R}$:
For the Euclidean distance $d_E(a, b) = |a - b|$, the arithmetic mean $(a + b)/2$ is clearly the unique solution of (1), whereas for the Riemannian distance $d_R(a, b) = |\log a - \log b| = d_E(\log a, \log b)$, the geometric mean $\sqrt{ab}$ is the unique solution of (1). The difference
\[ D(a, b) = \frac{a+b}{2} - \sqrt{ab} \tag{2} \]
is therefore nothing but the distance between the two solutions of problem (1) with respect to the different distance functions $d_E$ and $d_R$.
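As a numerical illustration (not part of the argument), the following Python sketch assumes that problem (1) asks for the minimizer over $x$ of $d(x, a)^2 + d(x, b)^2$; this form of the objective, the helper names, and the sample values $a = 1$, $b = 4$ are assumptions made for the example only. A brute-force grid search then recovers the arithmetic mean for $d_E$ and the geometric mean for $d_R$.

```python
import math

def sum_sq(d, x, a, b):
    """Objective assumed for problem (1): d(x, a)^2 + d(x, b)^2."""
    return d(x, a) ** 2 + d(x, b) ** 2

def d_euclid(s, t):
    """Euclidean distance d_E on the real line."""
    return abs(s - t)

def d_riem(s, t):
    """Riemannian distance d_R on the positive reals."""
    return abs(math.log(s) - math.log(t))

def grid_argmin(d, a, b, steps=20000):
    """Brute-force minimizer of the objective on a uniform grid over [a, b]."""
    best_x, best_val = a, sum_sq(d, a, a, b)
    for k in range(1, steps + 1):
        x = a + (b - a) * k / steps
        val = sum_sq(d, x, a, b)
        if val < best_val:
            best_x, best_val = x, val
    return best_x

a, b = 1.0, 4.0                        # hypothetical sample values
x_e = grid_argmin(d_euclid, a, b)      # should be close to (a + b) / 2
x_r = grid_argmin(d_riem, a, b)        # should be close to sqrt(a * b)
gap = (a + b) / 2 - math.sqrt(a * b)   # the difference D(a, b) in (2)
```

For these sample values the two minimizers should land near $2.5$ and $2$, so that $D(1, 4) = 0.5$.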
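Now let $A$ and $B$ be positive definite matrices in the algebra $M_n$ of $n \times n$ matrices over $\mathbb{C}$. The Euclidean distance $D_E(A, B)$ and the Riemannian distance $D_R(A, B)$ are defined as
\[ D_E(A, B) = \left(\operatorname{Tr}\,(A - B)^2\right)^{1/2} \quad\text{and}\quad D_R(A, B) = \left(\sum_{i=1}^{n} \log^2 \lambda_i(A^{-1}B)\right)^{1/2}, \]
where $\lambda_i(A^{-1}B)$, $i = 1, \dots, n$, are the eigenvalues of $A^{-1}B$.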
Note that the three distances mentioned above are matrix generalizations of the
difference (2) for scalars. In addition, the distances $D_B$, $D_H$ and $D_L$ are all metrics on
$P_n$ (see [1, 4, 6]). Interestingly, the proof of the metric property of $D_L$ was based
on the following scalar case.
Cherian et al. [3] claimed that $D_L$ might not be a metric, whereas Chebbi and Moakher [2] conjectured that it is. Finally, it was stated in [5] that $D_L$ is a metric, and the proof was given in [6]. The proof of Theorem 1 by Sra is short but not trivial: it requires proving the positive definiteness of the function $\left(\frac{x+y}{2}\right)^{-\beta}$ for $x, y > 0$ and $\beta > 0$, which is not elementary.
Recently, using the metric property of $D_L$, Virosztek [7] proved that for $f(t) = t \log t$ ($t > 0$) the square root of the quantum Jensen-Shannon divergence given by
\[ D_{JS}(A, B) = \frac{1}{2}\operatorname{Tr}\left(f(A) + f(B)\right) - \operatorname{Tr} f\!\left(\frac{A+B}{2}\right) \]
is a metric on the cone of positive matrices, and hence in particular on the quantum state space. Nevertheless, we have not seen any elementary proof of this fact, even for scalars.
Motivated by the works mentioned above, in this note we provide three elementary proofs of Theorem 1 using the AGM inequality and some basic properties of the logarithmic and exponential functions. Since we do not know of any elementary proof of Virosztek's result for scalars, we also show that the square root of the scalar Jensen-Shannon divergence
\[ d_{JS}(a, b) = \left(\frac{1}{2}\left(f(a) + f(b)\right) - f\!\left(\frac{a+b}{2}\right)\right)^{1/2} \tag{5} \]
satisfies the triangle inequality and hence is a metric on $(0, \infty)$.
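The claim can be probed numerically before reading the proof. The Python sketch below evaluates (5) with $f(t) = t \ln t$ and records the smallest value of $d_{JS}(a, c) + d_{JS}(c, b) - d_{JS}(a, b)$ over random positive triples; the sampling range, the trial count, and the function names are assumptions chosen for the example, and a random search of this kind is only a sanity check, not a proof.

```python
import math
import random

def f(t):
    """f(t) = t ln t for t > 0."""
    return t * math.log(t)

def d_js(a, b):
    """Square root of the scalar Jensen-Shannon divergence, as in (5)."""
    val = 0.5 * (f(a) + f(b)) - f((a + b) / 2)
    return math.sqrt(max(val, 0.0))  # clamp tiny negative rounding errors

def worst_triangle_slack(trials=10000, seed=0):
    """Smallest d(a,c) + d(c,b) - d(a,b) found over random positive triples."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        a, b, c = (rng.uniform(0.01, 10.0) for _ in range(3))
        worst = min(worst, d_js(a, c) + d_js(c, b) - d_js(a, b))
    return worst

slack = worst_triangle_slack()
```

If the triangle inequality holds, `slack` should be nonnegative up to floating-point rounding.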
2. Proofs
Without loss of generality, we use the natural logarithm in all proofs. We first
give two different proofs of Theorem 1.
The first proof of Theorem 1. Squaring both sides of (4) and collecting like terms, we obtain
\[ \ln\frac{a+b}{2\sqrt{ab}} \le \ln\frac{a+c}{2\sqrt{ac}} + \ln\frac{b+c}{2\sqrt{bc}} + 2\sqrt{\ln\frac{a+c}{2\sqrt{ac}} \cdot \ln\frac{b+c}{2\sqrt{bc}}}, \tag{6} \]
4 VAN-QUY NGUYEN, DUC-CHIN VAN, BA-CAN QUOC VO
or, equivalently,
\[ \ln\frac{2c(a+b)}{(a+c)(b+c)} \le 2\sqrt{\ln\frac{a+c}{2\sqrt{ac}} \cdot \ln\frac{b+c}{2\sqrt{bc}}}. \tag{7} \]
Inequality (7) clearly holds when $2c(a+b) \le (a+c)(b+c)$, since its left-hand side is then nonpositive. Therefore, it suffices to prove (7) in the case $2c(a+b) > (a+c)(b+c)$, which is equivalent to $(a-c)(c-b) > 0$. Note that for any $x \ge 0$,
\[ \frac{x}{x+1} \le \ln(1+x) \le x. \tag{8} \]
On account of (8) we have
\[ \ln\frac{2c(a+b)}{(a+c)(b+c)} = \ln\left(1 + \frac{(a-c)(c-b)}{(a+c)(b+c)}\right) \le \frac{(a-c)(c-b)}{(a+c)(b+c)}, \]
and
\[ \ln\frac{a+c}{2\sqrt{ac}} = \ln\left(1 + \frac{(\sqrt{a}-\sqrt{c})^2}{2\sqrt{ac}}\right) \ge \frac{(\sqrt{a}-\sqrt{c})^2}{a+c}. \]
Similarly, we also have
\[ \ln\frac{b+c}{2\sqrt{bc}} \ge \frac{(\sqrt{c}-\sqrt{b})^2}{b+c}. \]
Therefore, inequality (7) follows if
\[ \frac{(a-c)(c-b)}{(a+c)(b+c)} \le 2\sqrt{\frac{(\sqrt{a}-\sqrt{c})^2}{a+c} \cdot \frac{(\sqrt{c}-\sqrt{b})^2}{b+c}}, \]
or, equivalently,
\[ (\sqrt{a}+\sqrt{c})^2(\sqrt{b}+\sqrt{c})^2 \le 4(a+c)(b+c). \]
The last inequality follows from the facts that $(\sqrt{a}+\sqrt{c})^2 \le 2(a+c)$ and $(\sqrt{b}+\sqrt{c})^2 \le 2(b+c)$.
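The chain of estimates in the first proof can be traced numerically. The Python sketch below computes, for a triple with $(a-c)(c-b) > 0$, the left side of (7), the upper bound obtained from $\ln(1+x) \le x$, the lower bound obtained from $x/(x+1) \le \ln(1+x)$, and the right side of (7), and records the smallest gap between consecutive links. The sampling scheme (with $a < c < b$ kept well separated) and all names are assumptions made for the illustration.

```python
import math
import random

def proof_chain(a, b, c):
    """Return (lhs7, upper, lower, rhs7) for a triple with (a - c)(c - b) > 0."""
    prod = (a + c) * (b + c)
    lhs7 = math.log(2.0 * c * (a + b) / prod)        # left side of (7)
    upper = (a - c) * (c - b) / prod                 # bound from ln(1 + x) <= x
    ra, rb, rc = math.sqrt(a), math.sqrt(b), math.sqrt(c)
    lower = 2.0 * math.sqrt((ra - rc) ** 2 / (a + c) * (rc - rb) ** 2 / (b + c))
    # Clamp at 0 so rounding cannot produce a negative product under the root.
    log_ac = max(math.log((a + c) / (2.0 * ra * rc)), 0.0)
    log_bc = max(math.log((b + c) / (2.0 * rb * rc)), 0.0)
    rhs7 = 2.0 * math.sqrt(log_ac * log_bc)          # right side of (7)
    return lhs7, upper, lower, rhs7

def min_chain_slack(trials=5000, seed=1):
    """Smallest gap among upper - lhs7, lower - upper, rhs7 - lower."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        a = rng.uniform(0.1, 5.0)
        b = a + rng.uniform(0.5, 5.0)
        c = a + rng.uniform(0.1, 0.9) * (b - a)  # a < c < b, so (a - c)(c - b) > 0
        lhs7, upper, lower, rhs7 = proof_chain(a, b, c)
        worst = min(worst, upper - lhs7, lower - upper, rhs7 - lower)
    return worst

chain_slack = min_chain_slack()
```

Each link corresponds to one displayed inequality of the proof, so a negative `chain_slack` (beyond rounding) would point to the step that fails.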
In contrast to the first proof, the second proof applies the exponential function and two
basic facts about the natural logarithm.
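The second proof of Theorem 1. Applying the exponential function to both sides of (6), we obtain
\[ \frac{a+b}{2\sqrt{ab}} \le \frac{a+c}{2\sqrt{ac}} \cdot \frac{c+b}{2\sqrt{cb}} \cdot e^{2\left(\ln\frac{a+c}{2\sqrt{ac}} \, \ln\frac{c+b}{2\sqrt{cb}}\right)^{1/2}}. \tag{9} \]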
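Using the fact that $e^t \ge 1 + t$ ($t \ge 0$), we have
\[ e^{2\left(\ln\frac{a+c}{2\sqrt{ac}} \, \ln\frac{c+b}{2\sqrt{cb}}\right)^{1/2}} \ge 1 + 2\left(\ln\frac{a+c}{2\sqrt{ac}} \, \ln\frac{c+b}{2\sqrt{cb}}\right)^{1/2}. \]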
TRIANGLE INEQUALITIES FOR LOG-DETERMINANT AND JENSEN-SHANNON DIVERGENCES
Now, we give an elementary proof of the metric property for the Jensen-Shannon
divergence. Using the same idea, we give the third proof of Theorem 1.
We need the following lemma.
(12) x ln x + y ln y + ln x ln y ≤ 0.
Proof. Without loss of generality, we may assume that x ≤ y. The inequality (12)
can be rewritten as h(x) ≤ 0, where
h(x) ≤ h(1) = 0.
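Inequality (12) can be checked numerically. The Python sketch below assumes that the lemma's hypothesis is $x, y > 0$ with $x + y = 2$, which is how the lemma is applied later in the text; this hypothesis, the grid size, and the function names are assumptions of the example. It evaluates the left side of (12) with $y = 2 - x$ on a uniform grid inside $(0, 2)$.

```python
import math

def lemma_lhs(x):
    """Left side of (12) with y = 2 - x (assumed constraint), for 0 < x < 2."""
    y = 2.0 - x
    return x * math.log(x) + y * math.log(y) + math.log(x) * math.log(y)

def max_on_grid(n=2000):
    """Largest value of the left side of (12) on a uniform grid inside (0, 2)."""
    return max(lemma_lhs(2.0 * k / n) for k in range(1, n))

grid_max = max_on_grid()
```

Under the assumed constraint, the maximum should be attained at $x = y = 1$, where the left side of (12) vanishes.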
Now we are ready to show that the square root of the scalar Jensen-Shannon
divergence (5) satisfies the triangle inequality.
We now show that $g'(x) > 0$ for any $x > b$. For this, we need to show that $\phi(b) > \phi(c)$, where
\[ \phi(t) = \frac{\ln\frac{2x}{x+t}}{\sqrt{x\ln\frac{2x}{x+t} + t\ln\frac{2t}{x+t}}}. \]
Indeed, the function $\phi(t)$ is continuous on the
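Since $\frac{2t}{x+t} + \frac{2x}{x+t} = 2$, by Lemma 2 we have $\phi'(t) \ge 0$ for any $t > c$. Therefore, the function $\phi(t)$ is increasing on $[c, +\infty)$, and hence $\phi(b) > \phi(c)$. That is, the function $g(x)$ is increasing on the interval $[b, +\infty)$. Therefore,
\[ d_{JS}(b, b) - d_{JS}(b, c) = \frac{g(b)}{\sqrt{2}} \le \frac{g(a)}{\sqrt{2}} = d_{JS}(a, b) - d_{JS}(a, c). \]
Since $d_{JS}(b, b) = 0$, this rearranges to $d_{JS}(a, c) \le d_{JS}(a, b) + d_{JS}(b, c)$.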
is increasing on $[1, \infty)$ and decreasing on $(0, 1]$. Therefore, for $a \le b$, the function $\delta_l(a, b)$ is
decreasing in $a$ and increasing in $b$. Hence, for $c \le a$, we have $\delta_l(a, b) \le \delta_l(c, b)$.
Similarly, for $c \ge b$, we have $\delta_l(a, b) \le \delta_l(c, a)$. Thus, in these cases, the triangle
inequality (4) holds.
Now, suppose that a < c < b. If we show that the following function
\[ 2\ln\frac{t+x}{2\sqrt{xt}} - \frac{(t-x)^2}{4xt} = \ln\frac{u}{v} - \frac{u-v}{v} = \ln\frac{u}{v} + 1 - \frac{u}{v} \le 0, \]
where the inequality follows from (8). Therefore, the function $h(t)$ is increasing on
$[a, c]$. Hence, the function $H(x)$ is increasing on $[c, b]$. This completes the proof.
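Both the displayed inequality and the resulting triangle inequality in the middle case can be checked numerically. The Python sketch below assumes that $\delta_l(a, b) = \sqrt{\ln\frac{a+b}{2\sqrt{ab}}}$, which is the form suggested by (6); that definition, the sampling ranges, and the function names are assumptions of the example rather than statements taken from the text.

```python
import math
import random

def delta_l(a, b):
    """Assumed scalar distance: sqrt(ln((a + b) / (2 * sqrt(a * b))))."""
    val = math.log((a + b) / (2.0 * math.sqrt(a * b)))
    return math.sqrt(max(val, 0.0))  # clamp tiny negative rounding errors

def display_lhs(t, x):
    """2 ln((t + x) / (2 sqrt(x t))) - (t - x)^2 / (4 x t), claimed to be <= 0."""
    return (2.0 * math.log((t + x) / (2.0 * math.sqrt(x * t)))
            - (t - x) ** 2 / (4.0 * x * t))

def worst_cases(trials=5000, seed=2):
    """Largest displayed left side and smallest triangle slack over random triples."""
    rng = random.Random(seed)
    max_display, min_slack = float("-inf"), float("inf")
    for _ in range(trials):
        a = rng.uniform(0.05, 5.0)
        b = a + rng.uniform(0.1, 5.0)
        c = rng.uniform(a, b)  # middle case: c lies between a and b
        max_display = max(max_display, display_lhs(a, b))
        min_slack = min(min_slack, delta_l(a, c) + delta_l(c, b) - delta_l(a, b))
    return max_display, min_slack

max_display, min_slack = worst_cases()
```

One expects `max_display` to be nonpositive and `min_slack` to be nonnegative, up to floating-point rounding.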
References
[1] R. Bhatia, T. Jain, Y. Lim. On the Bures-Wasserstein distance between positive definite matrices. Expositiones Mathematicae. DOI: 10.1016/j.exmath.2018.01.002
[2] Z. Chebbi, M. Moakher. Means of Hermitian positive-definite matrices based on the log-determinant α-divergence function. Linear Algebra Appl. 436 (2012), 1872-1889.
[3] A. Cherian, S. Sra, A. Banerjee, N. Papanikolopoulos. Efficient Similarity Search for Covariance Matrices via the Jensen-Bregman Log-Det Divergence. International Conference on Computer Vision (ICCV), Nov. 2011.
[4] D. Spehner, F. Illuminati, M. Orszag, W. Roga. Geometric measures of quantum correlations with Bures and Hellinger distances. ArXiv e-prints, November 2016.
[5] S. Sra. A new metric on the manifold of kernel matrices with application to matrix geometric means. Advances in Neural Information Processing Systems (NIPS), Dec. 2012.
[6] S. Sra. Positive definite matrices and the S-divergence. Proc. Amer. Math. Soc. 144 (2016), 2787-2797.
[7] R. Virosztek. The metric property of the quantum Jensen-Shannon divergence. Advances in Math. 380 (2021), 107595.
Duc-Chin Van, Luong The Vinh Highschool, Yen Xa, Tan Trieu, Thanh Tri, Ha Noi,
Viet Nam,
Ba-Can Quoc Vo, Archimedes Academy, 10 Trung Yen, Trung Hoa, Cau Giay, Ha Noi,
Viet Nam,