Diagonal matrix

A diagonal matrix is sometimes called a scaling matrix, since matrix multiplication with it results in changing scale (size). Its determinant is the product of its diagonal values.
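As a small worked illustration (the entries 2 and 3 are chosen here for the example, not taken from the source), consider D = diag(2, 3) acting on a vector in the plane:

$$D \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2x \\ 3y \end{bmatrix}, \qquad \det D = 2 \cdot 3 = 6.$$

The x-coordinate is stretched by 2 and the y-coordinate by 3, so the unit square is mapped to a 2-by-3 rectangle whose area, 6, equals the determinant.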

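As stated above, a diagonal matrix is a matrix in which all off-diagonal entries are zero. That is, the matrix $D = (d_{i,j})$ with n columns and n rows is diagonal if

$$\forall i, j \in \{1, 2, \ldots, n\}, \quad i \ne j \implies d_{i,j} = 0.$$

The entries on the main diagonal themselves are unrestricted.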

The term diagonal matrix may sometimes refer to a rectangular diagonal matrix, which is an m-by-n matrix in which every entry not of the form $d_{i,i}$ is zero.
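Two rectangular diagonal matrices might look as follows; the sizes (4-by-3 and 3-by-5) and the entries are illustrative choices, not values given in the source:

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & -3 \\ 0 & 0 & 0 \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 & 0 \\ 0 & 0 & -3 & 0 & 0 \end{bmatrix}$$

In both cases the only entries allowed to be nonzero are those whose row index equals their column index.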

More often, however, diagonal matrix refers to square matrices, which can be specified explicitly as a square diagonal matrix. A square diagonal matrix is a symmetric matrix, so this can also be called a symmetric diagonal matrix.

If the entries are real numbers or complex numbers, then it is a normal matrix as well.

In the remainder of this article we will consider only square diagonal matrices, and refer to them simply as "diagonal matrices".

A diagonal matrix with equal diagonal entries is a scalar matrix; that is, a scalar multiple λ of the identity matrix I. Its effect on a vector is scalar multiplication by λ. For example, a 3×3 scalar matrix has the form:

$$\begin{bmatrix} \lambda & 0 & 0 \\ 0 & \lambda & 0 \\ 0 & 0 & \lambda \end{bmatrix} = \lambda I_3$$

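Multiplying a vector v by a diagonal matrix diag(a1, ..., an) multiplies each entry of v by the corresponding diagonal entry, so the product can also be written as the Hadamard (entrywise) product of the vector d = (a1, ..., an) with v. This is mathematically equivalent, but avoids storing all the zero terms of this sparse matrix. This product is thus used in machine learning, such as computing products of derivatives in backpropagation or multiplying IDF weights in TF-IDF,[2] since some BLAS frameworks, which multiply matrices efficiently, do not include Hadamard product capability directly.[3]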
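The equivalence between multiplying by a diagonal matrix and taking an entrywise product can be checked with a minimal sketch in plain Python. The helper names `dense_diag`, `matvec` and `hadamard`, and the sample values, are assumptions made for this illustration.

```python
def dense_diag(d):
    """Build the full n-by-n matrix diag(d) as a list of rows."""
    n = len(d)
    return [[d[i] if i == j else 0 for j in range(n)] for i in range(n)]


def matvec(m, v):
    """Ordinary matrix-vector product."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def hadamard(d, v):
    """Entrywise (Hadamard) product of two equal-length vectors."""
    return [di * vi for di, vi in zip(d, v)]


d = [2, 3, 5]    # diagonal entries (illustrative values)
v = [1, 4, -2]

via_matrix = matvec(dense_diag(d), v)  # stores n*n numbers, mostly zeros
via_hadamard = hadamard(d, v)          # stores only the n diagonal entries

print(via_matrix, via_hadamard)        # both are [2, 12, -10]
```

The dense route needs n² stored entries and n² multiplications, while the entrywise route needs only n of each.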

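The operations of matrix addition and matrix multiplication are especially simple for diagonal matrices. Write diag(a1, ..., an) for a diagonal matrix whose diagonal entries starting in the upper left corner are a1, ..., an. Then, for addition, we have

$$\operatorname{diag}(a_1, \ldots, a_n) + \operatorname{diag}(b_1, \ldots, b_n) = \operatorname{diag}(a_1 + b_1, \ldots, a_n + b_n)$$

and for matrix multiplication,

$$\operatorname{diag}(a_1, \ldots, a_n) \operatorname{diag}(b_1, \ldots, b_n) = \operatorname{diag}(a_1 b_1, \ldots, a_n b_n).$$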

The diagonal matrix diag(a1, ..., an) is invertible if and only if the entries a1, ..., an are all nonzero. In this case, we have

$$\operatorname{diag}(a_1, \ldots, a_n)^{-1} = \operatorname{diag}(a_1^{-1}, \ldots, a_n^{-1}).$$

In particular, the diagonal matrices form a subring of the ring of all n-by-n matrices.
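Because sums, products and inverses of diagonal matrices are again diagonal, each of these operations reduces to arithmetic on the lists of diagonal entries. The following is a minimal sketch under that representation; the function names and sample values are hypothetical, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction


def diag_add(a, b):
    """Diagonal of diag(a) + diag(b)."""
    return [x + y for x, y in zip(a, b)]


def diag_mul(a, b):
    """Diagonal of diag(a) * diag(b)."""
    return [x * y for x, y in zip(a, b)]


def diag_inv(a):
    """Diagonal of diag(a)^-1; requires every entry to be nonzero."""
    if any(x == 0 for x in a):
        raise ZeroDivisionError("diag(a) is singular: a diagonal entry is zero")
    return [Fraction(1) / x for x in a]


a = [Fraction(2), Fraction(-1), Fraction(4)]
b = [Fraction(3), Fraction(5), Fraction(1, 2)]

total = diag_add(a, b)               # entries 5, 4, 9/2
product = diag_mul(a, b)             # entries 6, -5, 2
identity = diag_mul(a, diag_inv(a))  # entries 1, 1, 1
```

Each operation costs n scalar operations, compared with up to n³ for a general dense matrix product.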

Multiplying an n-by-n matrix A from the left with diag(a1, ..., an) amounts to multiplying the ith row of A by ai for all i; multiplying the matrix A from the right with diag(a1, ..., an) amounts to multiplying the ith column of A by ai for all i.
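The row and column scaling described above can be verified on a small case. This is a sketch with illustrative values; `matmul` and `dense_diag` are helper names introduced here, not part of any library.

```python
def dense_diag(d):
    """Build the full n-by-n matrix diag(d) as a list of rows."""
    n = len(d)
    return [[d[i] if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(x, y):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


A = [[1, 2],
     [3, 4]]
d = [10, 100]

left = matmul(dense_diag(d), A)   # [[10, 20], [300, 400]]: row i scaled by d[i]
right = matmul(A, dense_diag(d))  # [[10, 200], [30, 400]]: column j scaled by d[j]

# The same results without forming the diagonal matrix at all.
row_scaled = [[d[i] * a for a in row] for i, row in enumerate(A)]
col_scaled = [[a * d[j] for j, a in enumerate(row)] for row in A]
```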

For the standard basis vectors e1, …, en, the product diag(λ1, …, λn) ei equals λi ei. In other words, the eigenvalues of diag(λ1, …, λn) are λ1, …, λn with associated eigenvectors e1, …, en.

Diagonal matrices occur in many areas of linear algebra. Because of the simple description of the matrix operation and eigenvalues/eigenvectors given above, it is typically desirable to represent a given matrix or linear map by a diagonal matrix.

In fact, a given n-by-n matrix A is similar to a diagonal matrix (meaning that there is an invertible matrix X such that $X^{-1}AX$ is diagonal) if and only if it has n linearly independent eigenvectors. Such matrices are said to be diagonalizable.
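A concrete 2-by-2 case may help. The matrix A below and its eigenvectors (1, 1) and (1, -1) are chosen for illustration; placing the eigenvectors in the columns of X gives a diagonal X⁻¹AX. The helpers `matmul` and `inverse_2x2` are hypothetical names, and exact fractions avoid rounding.

```python
from fractions import Fraction


def matmul(x, y):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def inverse_2x2(m):
    """Inverse of a 2-by-2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


A = [[Fraction(2), Fraction(1)],
     [Fraction(1), Fraction(2)]]

# Columns of X are the eigenvectors (1, 1) and (1, -1) of A.
X = [[Fraction(1), Fraction(1)],
     [Fraction(1), Fraction(-1)]]

D = matmul(matmul(inverse_2x2(X), A), X)  # diag(3, 1)
```

The diagonal entries 3 and 1 of D are the eigenvalues of A, listed in the same order as the eigenvector columns of X.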

Over the field of real or complex numbers, more is true. The spectral theorem says that every normal matrix is unitarily similar to a diagonal matrix (if $AA^* = A^*A$ then there exists a unitary matrix U such that $UAU^*$ is diagonal). Furthermore, the singular value decomposition implies that for any matrix A, there exist unitary matrices U and V such that $U^*AV$ is diagonal with nonnegative entries.

In operator theory, particularly the study of PDEs, operators are particularly easy to understand and PDEs easy to solve if the operator is diagonal with respect to the basis with which one is working; this corresponds to a separable partial differential equation. Therefore, a key technique for understanding operators is a change of coordinates (in the language of operators, an integral transform) that changes the basis to an eigenbasis of eigenfunctions, which makes the equation separable. An important example of this is the Fourier transform, which diagonalizes constant-coefficient differentiation operators (or, more generally, translation-invariant operators), such as the Laplacian operator in the heat equation.
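To sketch how the Fourier transform diagonalizes differentiation, assume the convention $\hat f(\xi) = \int f(x)\, e^{-i \xi \cdot x}\, dx$ and a function f that is smooth and decays fast enough for integration by parts; both are assumptions made for this illustration. Then

$$\widehat{\partial_{x_j} f}(\xi) = i \xi_j \, \hat f(\xi), \qquad \widehat{\Delta f}(\xi) = -|\xi|^2 \, \hat f(\xi),$$

so in the frequency variable the Laplacian acts as multiplication by $-|\xi|^2$. Applied to the heat equation $\partial_t u = \Delta u$, this gives an independent ordinary differential equation for each frequency:

$$\partial_t \hat u(\xi, t) = -|\xi|^2 \, \hat u(\xi, t) \quad \Longrightarrow \quad \hat u(\xi, t) = e^{-|\xi|^2 t} \, \hat u(\xi, 0).$$

The values $-|\xi|^2$ play the role of the diagonal entries, with the frequency ξ taking the place of the index i.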

Especially easy are multiplication operators, which are defined as multiplication by (the values of) a fixed function: the values of the function at each point correspond to the diagonal entries of a matrix.