Trace (linear algebra)


In linear algebra, the trace of a square matrix A, denoted tr(A),[1] is defined to be the sum of elements on the main diagonal (from the upper left to the lower right) of A.
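The definition can be sketched in a few lines of plain Python, assuming a matrix is stored as a list of rows. The function name `trace` and the sample matrix are illustrative choices, not taken from any particular library.

```python
def trace(matrix):
    """Sum of the main-diagonal entries of a square matrix (list of rows)."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("the trace is only defined for square matrices")
    return sum(matrix[i][i] for i in range(n))


# Sample 3 x 3 matrix (arbitrary values chosen for illustration).
A = [[1, 0, 3],
     [11, 5, 2],
     [6, 12, -5]]

print(trace(A))  # diagonal entries are 1, 5 and -5
```

A non-square input raises `ValueError`, mirroring the fact that the trace is defined only for square matrices.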

The trace of a matrix is the sum of its (complex) eigenvalues (counted with multiplicities), and it is invariant with respect to a change of basis. This characterization can be used to define the trace of a linear operator in general. The trace is only defined for a square matrix (n × n).

The trace is related to the derivative of the determinant (see Jacobi's formula).

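The trace is a linear mapping. That is, for all square matrices A and B of the same size and all scalars c,

$$\operatorname{tr}(A + B) = \operatorname{tr}(A) + \operatorname{tr}(B), \qquad \operatorname{tr}(cA) = c\,\operatorname{tr}(A).$$

A matrix and its transpose have the same trace:

$$\operatorname{tr}(A) = \operatorname{tr}(A^{\mathsf T}).$$

This follows immediately from the fact that transposing a square matrix does not affect elements along the main diagonal.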

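The trace of a square matrix which is the product of two matrices can be rewritten as the sum of entry-wise products of their elements. More precisely, if A and B are two m × n matrices, then:

$$\operatorname{tr}(A^{\mathsf T}B) = \operatorname{tr}(AB^{\mathsf T}) = \operatorname{tr}(B^{\mathsf T}A) = \operatorname{tr}(BA^{\mathsf T}) = \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}b_{ij}.$$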

This means that the trace of a product of equal-sized matrices functions in a similar way to a dot product of vectors (imagine A and B as long vectors with columns stacked on each other). For this reason, generalizations of vector operations to matrices (e.g. in matrix calculus and statistics) often involve a trace of matrix products.
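The dot-product analogy can be checked on a small example. The following is a minimal sketch in plain Python; the helper names (`transpose`, `matmul`, `trace`) and the two 2 × 3 sample matrices are assumptions made for illustration.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


A = [[1, 2, 3],
     [4, 5, 6]]
B = [[7, 8, 9],
     [10, 11, 12]]

# Trace of the 3 x 3 product (transpose of A) times B.
via_trace = trace(matmul(transpose(A), B))

# Sum of entry-wise products, i.e. the dot product of the flattened matrices.
entrywise = sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))

print(via_trace, entrywise)
```

Both quantities agree, which is why the trace of such a product is often used as an inner product on matrices.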

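For real matrices A and B of the same size, the trace of a product can also be written in the following forms:

$$\operatorname{tr}(A^{\mathsf T}B) = \sum_{i,j} (A \circ B)_{ij} \quad \text{(using the Hadamard, or entry-wise, product)},$$

$$\operatorname{tr}(A^{\mathsf T}B) = \operatorname{vec}(B)^{\mathsf T}\operatorname{vec}(A) = \operatorname{vec}(A)^{\mathsf T}\operatorname{vec}(B) \quad \text{(using the vectorization operator)}.$$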

The matrices in a trace of a product can be switched without changing the result: if A is an m × n matrix and B is an n × m matrix, then[1][2][3]:34[note 1]

$$\operatorname{tr}(AB) = \operatorname{tr}(BA).$$

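More generally, the trace is invariant under cyclic permutations, that is,

$$\operatorname{tr}(ABCD) = \operatorname{tr}(BCDA) = \operatorname{tr}(CDAB) = \operatorname{tr}(DABC).$$

Arbitrary permutations are not allowed: in general,

$$\operatorname{tr}(ABC) \neq \operatorname{tr}(ACB).$$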

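However, if products of three symmetric matrices are considered, any permutation is allowed, since:

$$\operatorname{tr}(ABC) = \operatorname{tr}\left((ABC)^{\mathsf T}\right) = \operatorname{tr}(CBA) = \operatorname{tr}(ACB),$$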

where the first equality is because the traces of a matrix and its transpose are equal. Note that this is not true in general for more than three factors.
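The cyclic property and its limits can be checked numerically. This is a minimal sketch in plain Python; the helper names and the sample matrices are assumptions chosen for illustration, with A, B, C deliberately not all symmetric and S1, S2, S3 symmetric.

```python
from itertools import permutations


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def trace_of_product(*factors):
    """Trace of the product of the given matrices, multiplied left to right."""
    product = factors[0]
    for factor in factors[1:]:
        product = matmul(product, factor)
    return trace(product)


A = [[1, 2], [0, 1]]
B = [[1, 0], [3, 1]]
C = [[1, 1], [0, 2]]

# Cyclic shifts of the factors leave the trace unchanged.
cyclic = [trace_of_product(A, B, C),
          trace_of_product(B, C, A),
          trace_of_product(C, A, B)]

# A non-cyclic reordering can change it.
swapped = trace_of_product(A, C, B)

# For three symmetric matrices every ordering gives the same trace.
S1, S2, S3 = [[1, 2], [2, 3]], [[0, 1], [1, 4]], [[2, 1], [1, 5]]
symmetric_traces = {trace_of_product(*order) for order in permutations([S1, S2, S3])}

print(cyclic, swapped, symmetric_traces)
```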

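Unlike the determinant, the trace of a product is not the product of the traces; that is, there exist matrices A and B such that

$$\operatorname{tr}(AB) \neq \operatorname{tr}(A)\operatorname{tr}(B).$$

For example, if

$$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},$$

then the product is

$$AB = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},$$

and the traces are tr(AB) = 1 and tr(A)tr(B) = 0.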

The trace of the Kronecker product of two matrices is the product of their traces:

$$\operatorname{tr}(A \otimes B) = \operatorname{tr}(A)\operatorname{tr}(B).$$
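The Kronecker product rule can be verified on small matrices. The sketch below is plain Python; the helper `kron` is a hypothetical name written for this example and builds the block matrix whose (i, j) block is the entry a_ij times B.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


A = [[1, 2],
     [3, 4]]
B = [[0, 5],
     [6, 7]]

K = kron(A, B)  # 4 x 4 matrix

print(trace(K), trace(A) * trace(B))
```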

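The following three properties:

$$\operatorname{tr}(A + B) = \operatorname{tr}(A) + \operatorname{tr}(B), \qquad \operatorname{tr}(cA) = c\,\operatorname{tr}(A), \qquad \operatorname{tr}(AB) = \operatorname{tr}(BA),$$

characterize the trace up to a scalar multiple in the following sense: if f is a linear functional on the space of square matrices that satisfies f(xy) = f(yx) for all x and y, then f and tr are proportional.[note 2]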

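The trace is similarity-invariant, which means that for any square matrix A and any invertible matrix P of the same dimensions, the matrices A and P⁻¹AP have the same trace. This is because

$$\operatorname{tr}\left(P^{-1}(AP)\right) = \operatorname{tr}\left((AP)P^{-1}\right) = \operatorname{tr}(A).$$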
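Similarity invariance can be checked exactly with rational arithmetic. The following is a minimal sketch restricted to 2 × 2 matrices; the helper `inverse_2x2` and the sample matrices A and P are assumptions made for this example.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def inverse_2x2(p):
    """Exact inverse of an invertible 2 x 2 integer matrix."""
    (a, b), (c, d) = p
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


A = [[2, 1],
     [0, 3]]
P = [[1, 2],
     [3, 5]]  # determinant -1, so P is invertible

similar = matmul(matmul(inverse_2x2(P), A), P)  # P^{-1} A P

print(trace(similar), trace(A))
```

The entries of the conjugated matrix differ from those of A, but the two traces coincide.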

The trace of the n × n identity matrix is the dimension of the space, namely n.

The trace of an idempotent matrix A (a matrix for which A² = A) is equal to the rank of A.

The trace of a nilpotent matrix is zero. When the characteristic of the base field is zero, the converse also holds: if tr(Aᵏ) = 0 for all positive integers k, then A is nilpotent.

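More generally, if the characteristic polynomial of a matrix A factors as

$$f(x) = (x - \lambda_1)^{d_1} \cdots (x - \lambda_k)^{d_k},$$

then

$$\operatorname{tr}(A) = d_1\lambda_1 + \cdots + d_k\lambda_k,$$

that is, the trace of a square matrix equals the sum of the eigenvalues counted with multiplicities.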

When both A and B are n × n matrices, the trace of the (ring-theoretic) commutator of A and B vanishes: tr([A, B]) = 0, because tr(AB) = tr(BA) and tr is linear. One can state this as "the trace is a map of Lie algebras gl_n → k from operators to scalars", as the commutator of scalars is trivial (it is an Abelian Lie algebra). In particular, using similarity invariance, it follows that the identity matrix is never similar to the commutator of any pair of matrices.
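The vanishing trace of a commutator is easy to confirm on an example. This is a minimal sketch in plain Python, with hypothetical helper names and arbitrary sample matrices.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def commutator(a, b):
    """Ring-theoretic commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(row_ab, row_ba)] for row_ab, row_ba in zip(ab, ba)]


A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [5, -2]]

C = commutator(A, B)

print(C, trace(C))
```

The commutator itself is not the zero matrix here, yet its diagonal entries cancel.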

Conversely, any square matrix with zero trace is a linear combination of the commutators of pairs of matrices.[note 3] Moreover, any square matrix with zero trace is unitarily equivalent to a square matrix with diagonal consisting of all zeros.

The trace of a Hermitian matrix is real, because the elements on the diagonal are real.

The trace of a permutation matrix is the number of fixed points of the corresponding permutation, because the diagonal term a_ii is 1 if the i-th point is fixed and 0 otherwise.
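The fixed-point count can be illustrated as follows. The sketch assumes a permutation is given as a list `perm` in which point i is sent to `perm[i]`, and that row i of the matrix has its single 1 in column `perm[i]`; both conventions are choices made for this example.

```python
def permutation_matrix(perm):
    """Matrix with a 1 in row i, column perm[i], and 0 elsewhere."""
    n = len(perm)
    return [[1 if perm[i] == j else 0 for j in range(n)] for i in range(n)]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


perm = [0, 2, 1, 3]  # swaps points 1 and 2, fixes points 0 and 3
M = permutation_matrix(perm)

fixed_points = sum(1 for i, image in enumerate(perm) if image == i)

print(trace(M), fixed_points)
```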

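The trace of a projection matrix is the dimension of the target space. For a matrix X with linearly independent columns,

$$P_X = X\left(X^{\mathsf T}X\right)^{-1}X^{\mathsf T} \quad\Longrightarrow\quad \operatorname{tr}(P_X) = \operatorname{rank}(X).$$

The matrix P_X is idempotent, and more generally, the trace of any idempotent matrix equals its own rank.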
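As a small check of the trace-equals-rank property, the sketch below builds the orthogonal projection onto the line spanned by a single vector x, namely x xᵀ / (xᵀ x), using exact rational arithmetic. The vector, the helper names, and the elimination-based `rank` function are assumptions made for illustration.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def rank(matrix):
    """Rank via Gaussian elimination (exact when entries are Fractions)."""
    rows = [list(r) for r in matrix]
    pivots = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(pivots, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivots], rows[pivot] = rows[pivot], rows[pivots]
        for r in range(pivots + 1, len(rows)):
            factor = rows[r][col] / rows[pivots][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivots])]
        pivots += 1
    return pivots


x = [1, 2, 2]
norm_sq = sum(c * c for c in x)

# Projection onto the line spanned by x.
P = [[Fraction(xi * xj, norm_sq) for xj in x] for xi in x]

print(matmul(P, P) == P, trace(P), rank(P))
```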

Expressions like tr(exp(A)), where A is a square matrix, occur so often in some fields (e.g. multivariate statistical theory) that a shorthand notation has become common:

$$\operatorname{tre}(A) := \operatorname{tr}(\exp(A)).$$

tre is sometimes referred to as the exponential trace function; it is used in the Golden–Thompson inequality.

In general, given some linear map f : V → V (where V is a finite-dimensional vector space), we can define the trace of this map by considering the trace of a matrix representation of f, that is, choosing a basis for V and describing f as a matrix relative to this basis, and taking the trace of this square matrix. The result will not depend on the basis chosen, since different bases will give rise to similar matrices, allowing for the possibility of a basis-independent definition for the trace of a linear map.

Such a definition can be given using the canonical isomorphism between the space End(V) of linear maps on V and V ⊗ V*, where V* is the dual space of V. Let v be in V and let f be in V*. Then the trace of the indecomposable element v ⊗ f is defined to be f(v); the trace of a general element is defined by linearity. Using an explicit basis for V and the corresponding dual basis for V*, one can show that this gives the same definition of the trace as given above.

If A is a linear operator represented by a square matrix with real or complex entries and if λ₁, …, λₙ are the eigenvalues of A (listed according to their algebraic multiplicities), then

$$\operatorname{tr}(A) = \sum_{i=1}^{n} \lambda_i.$$

This follows from the fact that A is always similar to its Jordan form, an upper triangular matrix having λ₁, …, λₙ on the main diagonal. In contrast, the determinant of A is the product of its eigenvalues; that is,

$$\det(A) = \prod_{i=1}^{n} \lambda_i.$$
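The relation between eigenvalues, trace, and determinant can be illustrated for a 2 × 2 matrix, where the eigenvalues are the roots of a quadratic. The following is a minimal sketch; the function `eigenvalues_2x2` is a hypothetical helper, and the sample matrix is chosen so that the eigenvalues are complex.

```python
import cmath


def eigenvalues_2x2(m):
    """Roots of det(m - t*I) = t^2 - (a + d) t + (a d - b c) by the quadratic formula."""
    (a, b), (c, d) = m
    linear = -(a + d)
    constant = a * d - b * c
    root = cmath.sqrt(linear * linear - 4 * constant)
    return (-linear + root) / 2, (-linear - root) / 2


A = [[2, -1],
     [1, 2]]

lam1, lam2 = eigenvalues_2x2(A)

trace_A = A[0][0] + A[1][1]
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]

# The sum of the eigenvalues matches the trace, and their product the determinant.
print(lam1 + lam2, trace_A)
print(lam1 * lam2, det_A)
```

Although the eigenvalues are not real, their sum and product are, in agreement with the real trace and determinant.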

The trace corresponds to the derivative of the determinant: it is the Lie algebra analog of the (Lie group) map of the determinant. This is made precise in Jacobi's formula for the derivative of the determinant.

As a particular case, at the identity, the derivative of the determinant actually amounts to the trace: tr = det′_I. From this (or from the connection between the trace and the eigenvalues), one can derive a connection between the trace function, the exponential map between a Lie algebra and its Lie group (or concretely, the matrix exponential function), and the determinant:

$$\det(\exp(A)) = \exp(\operatorname{tr}(A)).$$
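The identity det(exp(A)) = exp(tr(A)) relating the determinant, the matrix exponential, and the trace can be checked numerically. The sketch below approximates the matrix exponential with a truncated power series, which is adequate for small matrices with modest entries but is not a production-quality method; the function `matrix_exp`, the number of terms, and the sample matrix are assumptions.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matrix_exp(a, terms=30):
    """Truncated power series I + A + A^2/2! + ... (illustrative only)."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term now holds A^k / k!
        term = [[entry / k for entry in row] for row in matmul(term, a)]
        result = [[r + t for r, t in zip(row_r, row_t)] for row_r, row_t in zip(result, term)]
    return result


def det_2x2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


A = [[0.5, 1.0],
     [0.25, -0.2]]

lhs = det_2x2(matrix_exp(A))       # det(exp(A))
rhs = math.exp(A[0][0] + A[1][1])  # exp(tr(A))

print(lhs, rhs)
```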

For example, consider the one-parameter family of linear transformations given by rotation through angle θ,

$$R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$

These transformations all have determinant 1, so they preserve area. The derivative of this family at θ = 0, the identity rotation, is the antisymmetric matrix

$$A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},$$

which clearly has trace zero, indicating that this matrix represents an infinitesimal transformation which preserves area.

A related characterization of the trace applies to linear vector fields. Given a matrix A, define a vector field F on Rⁿ by F(x) = Ax. The components of this vector field are linear functions (given by the rows of A). Its divergence div F is a constant function, whose value is equal to tr(A).

By the divergence theorem, one can interpret this in terms of flows: if F(x) represents the velocity of a fluid at location x and U is a region in Rⁿ, the net flow of the fluid out of U is given by tr(A) · vol(U), where vol(U) is the volume of U.
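The statement that the divergence of the linear field F(x) = Ax is the constant tr(A) can be checked with finite differences. This is a minimal sketch; the step size `h`, the helper names, and the sample matrix and points are assumptions made for the example.

```python
import math


def field(a, x):
    """Linear vector field F(x) = A x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def divergence(a, x, h=1e-6):
    """Central-difference estimate of the divergence of F at the point x."""
    total = 0.0
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        total += (field(a, forward)[i] - field(a, backward)[i]) / (2 * h)
    return total


A = [[2.0, -1.0, 0.5],
     [4.0, 3.0, 1.0],
     [0.0, 7.0, -1.5]]
trace_A = sum(A[i][i] for i in range(3))

# The estimate is the same at every point, up to rounding error.
div_at_origin = divergence(A, [0.0, 0.0, 0.0])
div_elsewhere = divergence(A, [1.0, -2.0, 0.5])

print(trace_A, div_at_origin, div_elsewhere)
```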

The trace is a linear operator, hence it commutes with the derivative:

$$d\operatorname{tr}(X) = \operatorname{tr}(dX).$$

The trace of a 2 × 2 complex matrix is used to classify Möbius transformations. First, the matrix is normalized to make its determinant equal to one. Then, if the square of the trace is 4, the corresponding transformation is parabolic. If the square is in the interval [0,4), it is elliptic. Finally, if the square is greater than 4 (or, more generally, lies outside the interval [0,4]), the transformation is loxodromic. See classification of Möbius transformations.
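The classification procedure can be sketched as a short function. This is an illustrative sketch only: the function name `classify_mobius`, the tolerance `tol`, the separate handling of the identity map, and the choice to label every remaining case (including non-real squared traces) as loxodromic are assumptions of the example. Normalizing the determinant to one divides the squared trace by the original determinant, so the matrix itself never needs to be rescaled.

```python
def classify_mobius(a, b, c, d, tol=1e-12):
    """Classify z -> (a z + b) / (c z + d) from its normalized squared trace."""
    det = a * d - b * c
    if abs(det) < tol:
        raise ValueError("the matrix must be invertible")
    if abs(b) < tol and abs(c) < tol and abs(a - d) < tol:
        return "identity"
    # Squared trace after rescaling the matrix to have determinant one.
    trace_squared = complex((a + d) ** 2 / det)
    if abs(trace_squared.imag) < tol:
        t = trace_squared.real
        if abs(t - 4) < tol:
            return "parabolic"
        if -tol <= t < 4:
            return "elliptic"
    return "loxodromic"


print(classify_mobius(1, 1, 0, 1))   # translation z -> z + 1
print(classify_mobius(0, -1, 1, 0))  # inversion z -> -1/z
print(classify_mobius(2, 0, 0, 1))   # dilation z -> 2z
```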

The trace is used to define characters of group representations. Two representations A, B : G → GL(V) of a group G are equivalent (up to change of basis on V) if tr(A(g)) = tr(B(g)) for all g ∈ G.

The trace also plays a central role in the distribution of quadratic forms.

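The bilinear form (where X and Y are square matrices)

$$B(X, Y) = \operatorname{tr}\left(\operatorname{ad}(X)\operatorname{ad}(Y)\right), \qquad \text{where } \operatorname{ad}(X)Y = [X, Y] = XY - YX,$$

is called the Killing form, which is used for the classification of Lie algebras.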

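The trace also defines the bilinear form (X, Y) ↦ tr(XY), where X and Y are square matrices. This form is symmetric, non-degenerate[note 4] and associative in the sense that:

$$\operatorname{tr}(X[Y, Z]) = \operatorname{tr}([X, Y]Z).$$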

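For an m × n matrix A with complex (or real) entries and H being the conjugate transpose, we have

$$\operatorname{tr}\left(A^{\mathsf H}A\right) \geq 0,$$

with equality if and only if A = 0. The assignment

$$\langle A, B \rangle = \operatorname{tr}\left(A^{\mathsf H}B\right)$$

yields an inner product on the space of all complex (or real) m × n matrices.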

The norm derived from the above inner product is called the Frobenius norm, which satisfies the submultiplicative property as a matrix norm. Indeed, it is simply the Euclidean norm if the matrix is considered as a vector of length mn.
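The two descriptions of the Frobenius norm can be compared directly: as the square root of the trace of the conjugate transpose of A times A, and as the Euclidean norm of the flattened entries. The sketch below uses plain Python complex numbers; the helper names and the 3 × 2 sample matrix are assumptions.

```python
import math


def conjugate_transpose(m):
    return [[entry.conjugate() for entry in col] for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


A = [[1 + 1j, 2],
     [0, 3j],
     [1, 0]]

# Norm from the trace inner product; the trace is real up to rounding.
gram_trace = trace(matmul(conjugate_transpose(A), A))
norm_from_trace = math.sqrt(gram_trace.real)

# Euclidean norm of the matrix viewed as a vector of its entries.
norm_from_entries = math.sqrt(sum(abs(entry) ** 2 for row in A for entry in row))

print(norm_from_trace, norm_from_entries)
```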

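It follows that if A and B are real positive semi-definite matrices of the same size then

$$0 \leq \left[\operatorname{tr}(AB)\right]^2 \leq \operatorname{tr}\left(A^2\right)\operatorname{tr}\left(B^2\right) \leq \left[\operatorname{tr}(A)\right]^2\left[\operatorname{tr}(B)\right]^2.$$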

The concept of trace of a matrix is generalized to the trace class of compact operators on Hilbert spaces, and the analog of the Frobenius norm is called the Hilbert–Schmidt norm.

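The partial trace is another generalization of the trace that is operator-valued. The trace of a linear operator Z which lives on a product space A ⊗ B can be obtained by taking the partial traces over A and B in either order:

$$\operatorname{tr}(Z) = \operatorname{tr}_A\left(\operatorname{tr}_B(Z)\right) = \operatorname{tr}_B\left(\operatorname{tr}_A(Z)\right).$$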
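For matrices, the partial traces can be computed by summing over one pair of indices. The sketch below assumes Z acts on a tensor product of spaces of dimensions `dim_a` and `dim_b`, with the row and column index of Z formed as `i * dim_b + k` (first factor major); this index convention and the function names are assumptions made for illustration.

```python
def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def partial_trace_b(z, dim_a, dim_b):
    """Trace out the second factor, leaving a dim_a x dim_a matrix."""
    return [[sum(z[i * dim_b + k][j * dim_b + k] for k in range(dim_b))
             for j in range(dim_a)] for i in range(dim_a)]


def partial_trace_a(z, dim_a, dim_b):
    """Trace out the first factor, leaving a dim_b x dim_b matrix."""
    return [[sum(z[i * dim_b + k][i * dim_b + l] for i in range(dim_a))
             for l in range(dim_b)] for k in range(dim_b)]


# A 4 x 4 operator on a product of two 2-dimensional spaces (arbitrary entries).
Z = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]

reduced_a = partial_trace_b(Z, 2, 2)  # operator on the first factor
reduced_b = partial_trace_a(Z, 2, 2)  # operator on the second factor

print(trace(Z), trace(reduced_a), trace(reduced_b))
```

Both reduced matrices have the same trace as Z, although they are different matrices.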

For more properties and a generalization of the partial trace, see traced monoidal categories.

If A is a general associative algebra over a field k, then a trace on A is often defined to be any map tr : A → k which vanishes on commutators: tr([a, b]) = 0 for all a, b ∈ A. Such a trace is not uniquely defined; it can always at least be modified by multiplication by a nonzero scalar.

A supertrace is the generalization of a trace to the setting of superalgebras.

The operation of tensor contraction generalizes the trace to arbitrary tensors.

The trace can also be approached in a coordinate-free manner, i.e., without referring to a choice of basis, as follows: the space of linear operators on a finite-dimensional vector space V (defined over the field F) is isomorphic to the space V ⊗ V* via the linear map

$$V \otimes V^{*} \to L(V, V), \qquad v \otimes h \mapsto \left(w \mapsto h(w)\,v\right).$$

There is also a canonical bilinear function t : V × V* → F that consists of applying an element w* of V* to an element v of V to get an element of F:

$$t(v, w^{*}) := w^{*}(v) \in F.$$

This induces a linear function on the tensor product (by its universal property) t : V ⊗ V* → F, which, as it turns out, when that tensor product is viewed as the space of operators, is equal to the trace.

This also clarifies why tr(AB) = tr(BA) and why tr(AB) ≠ tr(A)tr(B), as composition of operators (multiplication of matrices) and trace can be interpreted as the same pairing. Viewing

$$\operatorname{End}(V) \cong V \otimes V^{*},$$

one may interpret the composition map

$$\operatorname{End}(V) \times \operatorname{End}(V) \to \operatorname{End}(V)$$

as

$$(V \otimes V^{*}) \times (V \otimes V^{*}) \to (V \otimes V^{*})$$

coming from the pairing V* × V → F on the middle terms. Taking the trace of the product then comes from pairing on the outer terms, while taking the product in the opposite order and then taking the trace just switches which pairing is applied first. On the other hand, taking the trace of A and the trace of B corresponds to applying the pairing on the left terms and on the right terms (rather than on inner and outer), and is thus different.

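In coordinates, this corresponds to indexes: multiplication is given by

$$(AB)_{ik} = \sum_{j} a_{ij}b_{jk},$$

so

$$\operatorname{tr}(AB) = \sum_{i,j} a_{ij}b_{ji} \quad \text{and} \quad \operatorname{tr}(BA) = \sum_{i,j} b_{ij}a_{ji},$$

which is the same, while

$$\operatorname{tr}(A) \cdot \operatorname{tr}(B) = \sum_{i} a_{ii} \cdot \sum_{j} b_{jj},$$

which is different.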

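For finite-dimensional V with basis {e_i} and dual basis {e^j}, the element e_i ⊗ e^j corresponds to the ij-entry of the matrix of an operator with respect to that basis. Any operator A is therefore a sum of the form A = Σ a_ij e_i ⊗ e^j, and with t defined as above,

$$t(A) = \sum_{i,j} a_{ij}\, t\left(e_i \otimes e^{j}\right).$$

The latter factor, t(e_i ⊗ e^j) = e^j(e_i), however, is just the Kronecker delta, being 1 if i = j and 0 otherwise. This shows that tr(A) is simply the sum of the coefficients along the diagonal. This method, however, makes coordinate invariance an immediate consequence of the definition.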

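Further, one may dualize the trace map t : V ⊗ V* → F, obtaining a map

$$F^{*} = F \to V \otimes V^{*} \cong \operatorname{End}(V).$$

This map is precisely the inclusion of scalars, sending 1 ∈ F to the identity matrix: "trace is dual to scalars". In the language of bialgebras, scalars are the unit, while trace is the counit.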

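One can then compose the inclusion of scalars with the trace,

$$F \xrightarrow{\;I\;} \operatorname{End}(V) \xrightarrow{\;\operatorname{tr}\;} F,$$

which yields multiplication by n, as the trace of the identity is the dimension of the vector space.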

Using the notion of dualizable objects and categorical traces, this approach to traces can be fruitfully axiomatized and applied to other mathematical areas.