In mathematics, particularly in linear algebra, the Schur product theorem states that the Hadamard product of two positive definite matrices is also a positive definite matrix.
The result is named after Issai Schur[1] (Schur 1911, p. 14, Theorem VII); note that Schur signed as J. Schur in the Journal für die reine und angewandte Mathematik.[2][3]
We remark that the converse of the theorem holds in the following sense: if $M$ is a symmetric matrix and the Hadamard product $M \circ N$ is positive definite for all positive definite matrices $N$, then $M$ itself is positive definite.
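As a quick numerical illustration of the statement, the following sketch (assuming NumPy; the size, seed, and the `random_pd` helper are arbitrary choices for the example) checks that the smallest eigenvalue of the Hadamard product of two random positive definite matrices is positive:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_pd(n):
    """Return a random symmetric positive definite n x n matrix (illustrative helper)."""
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)  # the shift by n*I keeps it safely definite

M, N = random_pd(5), random_pd(5)
H = M * N  # Hadamard (entrywise) product
print(np.linalg.eigvalsh(H).min() > 0)  # True: M ∘ N is positive definite
```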
Proof using the trace formula

For any matrices $M$ and $N$, the Hadamard product $M \circ N$ considered as a bilinear form acts on vectors $a, b$ as

$$a^{*}(M\circ N)b=\operatorname{tr}\left(M^{\textsf{T}}\operatorname{diag}\left(a^{*}\right)N\operatorname{diag}(b)\right)$$
where $\operatorname{tr}$ is the matrix trace and $\operatorname{diag}(a)$ is the diagonal matrix having as diagonal entries the elements of $a$.
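This identity is easy to verify numerically; the sketch below (a minimal check with NumPy, using arbitrary complex test data) compares both sides:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
cplx = lambda shape: rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
M, N, a, b = cplx((n, n)), cplx((n, n)), cplx(n), cplx(n)

lhs = a.conj() @ (M * N) @ b  # a*(M ∘ N)b as a bilinear form
rhs = np.trace(M.T @ np.diag(a.conj()) @ N @ np.diag(b))
print(np.isclose(lhs, rhs))  # True
```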
Suppose $M$ and $N$ are positive definite, and so Hermitian. We can consider their square roots $M^{\frac{1}{2}}$ and $N^{\frac{1}{2}}$, which are also Hermitian, and write
$$\operatorname{tr}\left(M^{\textsf{T}}\operatorname{diag}\left(a^{*}\right)N\operatorname{diag}(b)\right)=\operatorname{tr}\left({\overline{M}}^{\frac{1}{2}}{\overline{M}}^{\frac{1}{2}}\operatorname{diag}\left(a^{*}\right)N^{\frac{1}{2}}N^{\frac{1}{2}}\operatorname{diag}(b)\right)=\operatorname{tr}\left({\overline{M}}^{\frac{1}{2}}\operatorname{diag}\left(a^{*}\right)N^{\frac{1}{2}}N^{\frac{1}{2}}\operatorname{diag}(b){\overline{M}}^{\frac{1}{2}}\right)$$
Then, for $a = b$, this is written as $\operatorname{tr}\left(A^{*}A\right)$ for $A=N^{\frac{1}{2}}\operatorname{diag}(a)\,{\overline{M}}^{\frac{1}{2}}$, and thus is strictly positive for $A \neq 0$, which occurs if and only if $a \neq 0$. This shows that $M \circ N$ is a positive definite matrix.
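The step above can also be checked numerically; this sketch (assuming SciPy's `scipy.linalg.sqrtm` for the matrix square root; the test matrices are arbitrary) confirms that $a^{*}(M\circ N)a = \operatorname{tr}(A^{*}A)$:

```python
import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(2)
n = 4

def random_hpd(n):
    """Return a random Hermitian positive definite n x n matrix (illustrative helper)."""
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return A @ A.conj().T + n * np.eye(n)

M, N = random_hpd(n), random_hpd(n)
a = rng.standard_normal(n) + 1j * rng.standard_normal(n)

A = sqrtm(N) @ np.diag(a) @ sqrtm(M.conj())  # A = N^{1/2} diag(a) M̄^{1/2}
lhs = a.conj() @ (M * N) @ a                 # a*(M ∘ N)a
rhs = np.trace(A.conj().T @ A)               # tr(A*A) > 0 whenever a ≠ 0
print(np.isclose(lhs, rhs))  # True
```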
Proof using Gaussian integration
Let $X$ be an $n$-dimensional centered Gaussian random variable with covariance $\langle X_{i}X_{j}\rangle = M_{ij}$.
Then the covariance matrix of $X_{i}^{2}$ and $X_{j}^{2}$ is

$$\operatorname{Cov}\left(X_{i}^{2},X_{j}^{2}\right)=\left\langle X_{i}^{2}X_{j}^{2}\right\rangle -\left\langle X_{i}^{2}\right\rangle \left\langle X_{j}^{2}\right\rangle$$
Using Wick's theorem to develop

$$\left\langle X_{i}^{2}X_{j}^{2}\right\rangle =2\left\langle X_{i}X_{j}\right\rangle ^{2}+\left\langle X_{i}^{2}\right\rangle \left\langle X_{j}^{2}\right\rangle$$
we have

$$\operatorname{Cov}\left(X_{i}^{2},X_{j}^{2}\right)=2\left\langle X_{i}X_{j}\right\rangle ^{2}=2M_{ij}^{2}$$
Since a covariance matrix is positive definite, this proves that the matrix with elements $M_{ij}^{2}$ is a positive definite matrix.
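A Monte Carlo sanity check of this covariance computation (a sketch; the sample size, seed, and construction of $M$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
n, samples = 3, 10**6
A = rng.standard_normal((n, n))
M = A @ A.T + n * np.eye(n)  # covariance ⟨X_i X_j⟩ = M_ij

X = rng.multivariate_normal(np.zeros(n), M, size=samples)
emp = np.cov(X**2, rowvar=False)      # empirical Cov(X_i², X_j²)
print(np.abs(emp - 2 * M**2).max())   # small: matches 2 M_ij² up to sampling error
```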
Let $X$ and $Y$ be $n$-dimensional centered Gaussian random variables with covariances $\left\langle X_{i}X_{j}\right\rangle = M_{ij}$ and $\left\langle Y_{i}Y_{j}\right\rangle = N_{ij}$, independent from each other, so that we have $\left\langle X_{i}Y_{j}\right\rangle = 0$ for any $i, j$.
Then the covariance matrix of $X_{i}Y_{i}$ and $X_{j}Y_{j}$ is

$$\operatorname{Cov}\left(X_{i}Y_{i},X_{j}Y_{j}\right)=\left\langle X_{i}Y_{i}X_{j}Y_{j}\right\rangle -\left\langle X_{i}Y_{i}\right\rangle \left\langle X_{j}Y_{j}\right\rangle$$
Using Wick's theorem to develop

$$\left\langle X_{i}Y_{i}X_{j}Y_{j}\right\rangle =\left\langle X_{i}X_{j}\right\rangle \left\langle Y_{i}Y_{j}\right\rangle +\left\langle X_{i}Y_{i}\right\rangle \left\langle X_{j}Y_{j}\right\rangle +\left\langle X_{i}Y_{j}\right\rangle \left\langle X_{j}Y_{i}\right\rangle$$
and also using the independence of $X$ and $Y$, we have

$$\operatorname{Cov}\left(X_{i}Y_{i},X_{j}Y_{j}\right)=\left\langle X_{i}X_{j}\right\rangle \left\langle Y_{i}Y_{j}\right\rangle =M_{ij}N_{ij}$$
Since a covariance matrix is positive definite, this proves that the matrix with elements $M_{ij}N_{ij}$ is a positive definite matrix.
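The same kind of Monte Carlo check works here (again a sketch, with arbitrary sample size and test matrices):

```python
import numpy as np

rng = np.random.default_rng(4)
n, samples = 3, 10**6

def random_pd(n):
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

M, N = random_pd(n), random_pd(n)
X = rng.multivariate_normal(np.zeros(n), M, size=samples)
Y = rng.multivariate_normal(np.zeros(n), N, size=samples)  # independent of X

emp = np.cov(X * Y, rowvar=False)  # empirical Cov(X_i Y_i, X_j Y_j)
print(np.abs(emp - M * N).max())   # small: matches M_ij N_ij up to sampling error
```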
Proof using eigendecomposition
Proof of positive semidefiniteness
Let $M=\sum \mu _{i}m_{i}m_{i}^{\textsf{T}}$ and $N=\sum \nu _{i}n_{i}n_{i}^{\textsf{T}}$. Then

$$M\circ N=\sum _{ij}\mu _{i}\nu _{j}\left(m_{i}m_{i}^{\textsf{T}}\right)\circ \left(n_{j}n_{j}^{\textsf{T}}\right)=\sum _{ij}\mu _{i}\nu _{j}\left(m_{i}\circ n_{j}\right)\left(m_{i}\circ n_{j}\right)^{\textsf{T}}$$
Each $\left(m_{i}\circ n_{j}\right)\left(m_{i}\circ n_{j}\right)^{\textsf{T}}$ is positive semidefinite (but, except in the one-dimensional case, not positive definite, since they are rank-one matrices). Also, $\mu _{i}\nu _{j}>0$; thus the sum $M \circ N$ is also positive semidefinite.
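The rank-one expansion above can be reproduced directly from NumPy's eigendecomposition (a sketch; `eigh` returns eigenvalues `mu`, `nu` and eigenvector columns corresponding to $m_i$, $n_j$):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4

def random_pd(n):
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

M, N = random_pd(n), random_pd(n)
mu, m = np.linalg.eigh(M)  # columns m[:, i] are the eigenvectors m_i
nu, v = np.linalg.eigh(N)  # columns v[:, j] are the eigenvectors n_j

# Rebuild M ∘ N as Σ_ij μ_i ν_j (m_i ∘ n_j)(m_i ∘ n_j)^T
H = sum(mu[i] * nu[j] * np.outer(m[:, i] * v[:, j], m[:, i] * v[:, j])
        for i in range(n) for j in range(n))
print(np.allclose(H, M * N))  # True
```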
Proof of definiteness
Showing that the result is positive definite requires a further argument. We shall show that, for any vector $a \neq 0$, we have $a^{\textsf{T}}(M\circ N)a>0$.
Continuing as above, each $a^{\textsf{T}}\left(m_{i}\circ n_{j}\right)\left(m_{i}\circ n_{j}\right)^{\textsf{T}}a\geq 0$, so it remains to show that there exist $i$ and $j$ for which the corresponding term above is nonzero. For this we observe that
$$a^{\textsf{T}}(m_{i}\circ n_{j})(m_{i}\circ n_{j})^{\textsf{T}}a=\left(\sum _{k}m_{i,k}n_{j,k}a_{k}\right)^{2}$$
Since $N$ is positive definite, there is a $j$ for which $n_{j}\circ a\neq 0$ (since otherwise $n_{j}^{\textsf{T}}a=\sum _{k}(n_{j}\circ a)_{k}=0$ for all $j$, which would force $a=0$ because the $n_{j}$ span the space), and likewise, since $M$ is positive definite, there exists an $i$ for which

$$\sum _{k}m_{i,k}(n_{j}\circ a)_{k}=m_{i}^{\textsf{T}}(n_{j}\circ a)\neq 0.$$
However, this last sum is just $\sum _{k}m_{i,k}n_{j,k}a_{k}$. Thus its square is positive. This completes the proof.
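Finally, the sum-of-squares expansion underlying this argument can be checked numerically (a sketch reusing the same illustrative `random_pd` helper; any nonzero vector $a$ works):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4

def random_pd(n):
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

M, N = random_pd(n), random_pd(n)
mu, m = np.linalg.eigh(M)
nu, v = np.linalg.eigh(N)
a = rng.standard_normal(n)  # a nonzero test vector

# a^T (M ∘ N) a = Σ_ij μ_i ν_j (Σ_k m_{i,k} n_{j,k} a_k)² > 0
quad = sum(mu[i] * nu[j] * (m[:, i] * v[:, j] @ a) ** 2
           for i in range(n) for j in range(n))
print(np.isclose(quad, a @ (M * N) @ a), quad > 0)  # True True
```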
References

^ Schur, J. (1911). "Bemerkungen zur Theorie der beschränkten Bilinearformen mit unendlich vielen Veränderlichen". Journal für die reine und angewandte Mathematik. 1911 (140): 1–28. doi:10.1515/crll.1911.140.1. S2CID 120411177.
^ Zhang, Fuzhen, ed. (2005). The Schur Complement and Its Applications. Numerical Methods and Algorithms. Vol. 4. doi:10.1007/b105056. ISBN 0-387-24271-6. Page 9, Ch. 0.6: publication under J. Schur.
^ Ledermann, W. (1983). "Issai Schur and His School in Berlin". Bulletin of the London Mathematical Society. 15 (2): 97–106. doi:10.1112/blms/15.2.97.