Determinant

In mathematics, the determinant is a scalar value that is a function of the entries of a square matrix. It allows characterizing some properties of the matrix and the linear map represented by the matrix. In particular, the determinant is nonzero if and only if the matrix is invertible and the linear map represented by the matrix is an isomorphism. The determinant of a product of matrices is the product of their determinants (the preceding property is a corollary of this one). The determinant of a matrix A is denoted det(A), det A, or |A|.

Each determinant of a 2 × 2 matrix in this equation is called a minor of the matrix A. This procedure can be extended to give a recursive definition for the determinant of an n × n matrix, known as Laplace expansion.
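The recursive definition can be sketched in a few lines of Python. This is a minimal illustration, assuming the matrix is given as a list of row lists and expanding along the first row; the names `minor` and `det_laplace` are chosen here for illustration.

```python
def minor(matrix, row, col):
    """Return the submatrix obtained by deleting the given row and column."""
    return [r[:col] + r[col + 1:] for i, r in enumerate(matrix) if i != row]


def det_laplace(matrix):
    """Determinant by Laplace (cofactor) expansion along the first row."""
    n = len(matrix)
    if n == 0:
        return 1  # convention: the empty matrix has determinant 1
    if n == 1:
        return matrix[0][0]
    total = 0
    for j in range(n):
        sign = -1 if j % 2 else 1  # cofactor sign (-1)^(0 + j)
        total += sign * matrix[0][j] * det_laplace(minor(matrix, 0, j))
    return total


example = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
print(det_laplace(example))  # -3
```

The running time grows like n!, so a sketch of this kind is only practical for small matrices.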

Determinants occur throughout mathematics. For example, a matrix is often used to represent the coefficients in a system of linear equations, and determinants can be used to solve these equations (Cramer's rule), although other methods of solution are computationally much more efficient. Determinants are used for defining the characteristic polynomial of a matrix, whose roots are the eigenvalues. In geometry, the signed n-dimensional volume of an n-dimensional parallelepiped is expressed by a determinant. This is used in calculus with exterior differential forms and the Jacobian determinant, in particular for changes of variables in multiple integrals.

The area of the parallelogram is the absolute value of the determinant of the matrix formed by the vectors representing the parallelogram's sides.

If the matrix entries are real numbers, the matrix A can be used to represent two linear maps: one that maps the standard basis vectors to the rows of A, and one that maps them to the columns of A. In either case, the images of the basis vectors form a parallelogram that represents the image of the unit square under the mapping. The parallelogram defined by the rows of the above matrix is the one with vertices at (0, 0), (a, b), (a + c, b + d), and (c, d), as shown in the accompanying diagram.

The absolute value of ad − bc is the area of the parallelogram, and thus represents the scale factor by which areas are transformed by A. (The parallelogram formed by the columns of A is in general a different parallelogram, but since the determinant is symmetric with respect to rows and columns, the area will be the same.)

The absolute value of the determinant together with the sign becomes the oriented area of the parallelogram. The oriented area is the same as the usual area, except that it is negative when the angle from the first to the second vector defining the parallelogram turns in a clockwise direction (which is opposite to the direction one would get for the identity matrix).
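The relation between the determinant and the oriented area can be checked numerically. The sketch below assumes the parallelogram with vertices (0, 0), (a, b), (a + c, b + d), (c, d) described above, and compares ad − bc with the signed area from the shoelace formula, which is brought in here only as an independent check.

```python
def signed_area(u, v):
    """Determinant ad - bc of the 2 x 2 matrix with rows u = (a, b), v = (c, d)."""
    (a, b), (c, d) = u, v
    return a * d - b * c


def shoelace_signed_area(vertices):
    """Signed polygon area (positive for counterclockwise vertex order)."""
    total = 0
    count = len(vertices)
    for k in range(count):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % count]
        total += x0 * y1 - x1 * y0
    return total / 2


u, v = (3, 1), (1, 2)
corners = [(0, 0), u, (u[0] + v[0], u[1] + v[1]), v]
print(signed_area(u, v), shoelace_signed_area(corners))  # 5 5.0
print(signed_area(v, u))  # -5: swapping the vectors reverses the orientation
```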

To show that ad − bc is the signed area, one may consider a matrix containing two vectors u ≡ (a, b) and v ≡ (c, d) representing the parallelogram's sides. The signed area can be expressed as |u| |v| sin θ for the angle θ between the vectors, which is simply base times height: the length of one vector times the perpendicular component of the other. Because of the sine, this is already the signed area, yet it may be expressed more conveniently using the cosine of the complementary angle θ′ to a perpendicular vector, e.g. u⊥ = (−b, a), so that |u⊥| |v| cos θ′ becomes the signed area in question, which the pattern of the scalar product shows to be equal to ad − bc:

$$\text{Signed area} = |u|\,|v|\,\sin\theta = |u^{\perp}|\,|v|\,\cos\theta' = \begin{pmatrix} -b \\ a \end{pmatrix} \cdot \begin{pmatrix} c \\ d \end{pmatrix} = ad - bc.$$

The volume of this parallelepiped is the absolute value of the determinant of the matrix formed by the columns constructed from the vectors r1, r2, and r3.

Thus the determinant gives the scaling factor and the orientation induced by the mapping represented by A. When the determinant is equal to one, the linear mapping defined by the matrix is equi-areal and orientation-preserving.

The object known as the bivector is related to these ideas. In 2D, it can be interpreted as an oriented plane segment formed by imagining two vectors each with origin (0, 0), and coordinates (a, b) and (c, d). The bivector magnitude (denoted by (a, b) ∧ (c, d)) is the signed area, which is also the determinant ad − bc.[2]

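In the sequel, A is a square matrix with n rows and n columns, so that it can be written as

$$A = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n,1} & a_{n,2} & \cdots & a_{n,n} \end{bmatrix}.$$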

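The determinant of A is denoted by det(A), or it can be denoted directly in terms of the matrix entries by writing enclosing bars instead of brackets:

$$\begin{vmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n,1} & a_{n,2} & \cdots & a_{n,n} \end{vmatrix}.$$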

There are various equivalent ways to define the determinant of a square matrix A, i.e. one with the same number of rows and columns: the determinant can be defined via the Leibniz formula, an explicit formula involving sums of products of certain entries of the matrix. The determinant can also be characterized as the unique function depending on the entries of the matrix satisfying certain properties. This approach can also be used to compute determinants by simplifying the matrices in question.

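The Leibniz formula for the determinant of a 3 × 3 matrix is the following:

$$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = aei + bfg + cdh - ceg - bdi - afh.$$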

The rule of Sarrus is a mnemonic for this formula: the sum of the products of three diagonal north-west to south-east lines of matrix elements, minus the sum of the products of three diagonal south-west to north-east lines of elements, when the copies of the first two columns of the matrix are written beside it as in the illustration:

This scheme for calculating the determinant of a 3 × 3 matrix does not carry over into higher dimensions.
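As an illustration, the rule of Sarrus translates directly into code. The sketch below assumes a 3 × 3 matrix given as three rows of three numbers; the function name `det_sarrus` and the entry names a to i are chosen here for illustration.

```python
def det_sarrus(matrix):
    """Determinant of a 3 x 3 matrix by the rule of Sarrus."""
    (a, b, c), (d, e, f), (g, h, i) = matrix
    # Products along the three north-west to south-east diagonals.
    down = a * e * i + b * f * g + c * d * h
    # Products along the three south-west to north-east diagonals.
    up = g * e * c + h * f * a + i * d * b
    return down - up


print(det_sarrus([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))  # -3
```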

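Writing S_n for the set of all permutations of {1, ..., n} and sgn(σ) for the sign of a permutation σ (+1 if σ is even, −1 if it is odd), the definition of the determinant using the Leibniz formula is then

$$\det(A) = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \prod_{i=1}^{n} a_{i,\sigma(i)},$$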

a sum involving all permutations, where each summand is a product of entries of the matrix, multiplied with a sign depending on the permutation.
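The sum over permutations can be written out directly. The sketch below is a minimal illustration, assuming a square matrix given as a list of rows and computing the sign of each permutation by counting inversions; the helper names are chosen here for illustration.

```python
from itertools import permutations
from math import prod


def permutation_sign(perm):
    """Sign of a permutation: +1 for an even number of inversions, -1 for odd."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def det_leibniz(matrix):
    """Determinant as a signed sum over all n! permutations of the columns."""
    n = len(matrix)
    return sum(
        permutation_sign(p) * prod(matrix[i][p[i]] for i in range(n))
        for p in permutations(range(n))
    )


print(det_leibniz([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))  # -3
```

With n! summands, this is useful as a definition and for small matrices rather than as a practical algorithm.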

To see this it suffices to expand the determinant by multi-linearity in the columns into a (huge) linear combination of determinants of matrices in which each column is a standard basis vector. These determinants are either 0 (by property 9) or else ±1 (by properties 1 and 12 below), so the linear combination gives the expression above in terms of the Levi-Civita symbol. While less technical in appearance, this characterization cannot entirely replace the Leibniz formula in defining the determinant, since without it the existence of an appropriate function is not clear.[citation needed]

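This formula can be applied iteratively when several columns are swapped. For example, writing |a_1, a_2, a_3| for the determinant of the matrix with columns a_1, a_2, a_3,

$$|a_3, a_1, a_2| = -|a_1, a_3, a_2| = |a_1, a_2, a_3|.$$

Yet more generally, any permutation of the columns multiplies the determinant by the sign of the permutation.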

This can be proven by inspecting the Leibniz formula.[7] This implies that in all the properties mentioned above, the word "column" can be replaced by "row" throughout. For example, viewing an n × n matrix as being composed of n rows, the determinant is an n-linear function.

The Cauchy–Binet formula is a generalization of that product formula for rectangular matrices. This formula can also be recast as a multiplicative formula for compound matrices whose entries are the determinants of all square submatrices of a given matrix.[9][10]
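A small numerical check may make the Cauchy–Binet formula concrete. The sketch below assumes the usual statement for an m × n matrix A and an n × m matrix B: det(AB) equals the sum, over all choices S of m column indices, of det(A restricted to columns S) times det(B restricted to rows S). The helper names are illustrative.

```python
from itertools import combinations


def det(matrix):
    """Determinant by cofactor expansion along the first row (small matrices)."""
    if not matrix:
        return 1
    return sum(
        (-1) ** j * matrix[0][j] * det([row[:j] + row[j + 1:] for row in matrix[1:]])
        for j in range(len(matrix))
    )


def mat_mul(a, b):
    """Product of an m x n matrix and an n x p matrix."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def cauchy_binet_sum(a, b):
    """Sum of det(A[:, S]) * det(B[S, :]) over all m-element index sets S."""
    m, n = len(a), len(b)
    total = 0
    for cols in combinations(range(n), m):
        a_sub = [[row[j] for j in cols] for row in a]
        b_sub = [b[j] for j in cols]
        total += det(a_sub) * det(b_sub)
    return total


A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 0], [2, 1], [0, 3]]
print(det(mat_mul(A, B)), cauchy_binet_sum(A, B))  # -39 -39
```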

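Laplace expansion can be used iteratively for computing determinants, but this approach is inefficient for large matrices. However, it is useful for computing the determinants of highly symmetric matrices such as the Vandermonde matrix

$$\begin{vmatrix} 1 & 1 & 1 & \cdots & 1 \\ x_1 & x_2 & x_3 & \cdots & x_n \\ x_1^2 & x_2^2 & x_3^2 & \cdots & x_n^2 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_1^{n-1} & x_2^{n-1} & x_3^{n-1} & \cdots & x_n^{n-1} \end{vmatrix} = \prod_{1 \le i < j \le n} (x_j - x_i).$$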

This determinant has been applied, for example, in the proof of Baker's theorem in the theory of transcendental numbers.
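The closed form of the Vandermonde determinant, the product of all differences x_j − x_i with i < j, can be compared against a direct expansion. The sketch below builds the matrix with rows (1, x, x², ...), which has the same determinant as its transpose; the function names are illustrative.

```python
from math import prod


def det(matrix):
    """Determinant by cofactor expansion along the first row (small matrices)."""
    if not matrix:
        return 1
    return sum(
        (-1) ** j * matrix[0][j] * det([row[:j] + row[j + 1:] for row in matrix[1:]])
        for j in range(len(matrix))
    )


def vandermonde(xs):
    """Matrix whose k-th row is (1, x_k, x_k^2, ..., x_k^(n-1))."""
    return [[x ** k for k in range(len(xs))] for x in xs]


def vandermonde_product(xs):
    """Product of (x_j - x_i) over all index pairs i < j."""
    return prod(
        xs[j] - xs[i] for i in range(len(xs)) for j in range(i + 1, len(xs))
    )


xs = [1, 2, 4]
print(det(vandermonde(xs)), vandermonde_product(xs))  # 6 6
```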

Thus the adjugate matrix can be used for expressing the inverse of a nonsingular matrix:

$$A^{-1} = \frac{1}{\det(A)} \operatorname{adj}(A).$$
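A minimal sketch of this use of the adjugate, assuming integer entries and exact arithmetic with `fractions.Fraction`: the (i, j) entry of the adjugate is (−1)^(i+j) times the determinant of the matrix with row j and column i deleted. The function names are illustrative.

```python
from fractions import Fraction


def minor(matrix, row, col):
    """Submatrix with the given row and column removed."""
    return [r[:col] + r[col + 1:] for i, r in enumerate(matrix) if i != row]


def det(matrix):
    """Determinant by cofactor expansion along the first row."""
    if not matrix:
        return 1
    return sum(
        (-1) ** j * matrix[0][j] * det(minor(matrix, 0, j))
        for j in range(len(matrix))
    )


def adjugate(matrix):
    """Transpose of the cofactor matrix."""
    n = len(matrix)
    return [
        [(-1) ** (i + j) * det(minor(matrix, j, i)) for j in range(n)]
        for i in range(n)
    ]


def inverse(matrix):
    """Inverse as adj(A) / det(A); raises ValueError for a singular matrix."""
    d = det(matrix)
    if d == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(entry, d) for entry in row] for row in adjugate(matrix)]


print(inverse([[2, 1], [5, 3]]))  # entries equal to [[3, -1], [-5, 2]]
```

For numerical work, elimination-based methods are normally preferred; the adjugate formula is mainly of theoretical interest.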

Sylvester's determinant theorem states that for A, an m × n matrix, and B, an n × m matrix (so that A and B have dimensions allowing them to be multiplied in either order forming a square matrix):

$$\det(I_m + AB) = \det(I_n + BA),$$

where I_m and I_n are the m × m and n × n identity matrices, respectively.
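The theorem can be checked on a small example. The sketch below assumes a 2 × 3 matrix A and a 3 × 2 matrix B with integer entries and compares det(I_2 + AB) with det(I_3 + BA); the helper names are illustrative.

```python
def det(matrix):
    """Determinant by cofactor expansion along the first row (small matrices)."""
    if not matrix:
        return 1
    return sum(
        (-1) ** j * matrix[0][j] * det([row[:j] + row[j + 1:] for row in matrix[1:]])
        for j in range(len(matrix))
    )


def mat_mul(a, b):
    """Product of an m x n matrix and an n x p matrix."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def add_identity(square):
    """Return I + M for a square matrix M."""
    return [
        [entry + (1 if i == j else 0) for j, entry in enumerate(row)]
        for i, row in enumerate(square)
    ]


A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 0], [2, 1], [0, 3]]
left = det(add_identity(mat_mul(A, B)))   # determinant of a 2 x 2 matrix
right = det(add_identity(mat_mul(B, A)))  # determinant of a 3 x 3 matrix
print(left, right)  # -10 -10
```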

The product of all non-zero eigenvalues is referred to as the pseudo-determinant.

A Hermitian matrix is positive definite if all its eigenvalues are positive. Sylvester's criterion asserts that this is equivalent to the determinants of the leading principal submatrices A_k (the upper-left k × k blocks of A) being positive for all k between 1 and n.

The trace tr(A) is by definition the sum of the diagonal entries of A and also equals the sum of the eigenvalues. Thus, for complex matrices A,

$$\det(\exp(A)) = \exp(\operatorname{tr}(A)).$$

Here exp(A) denotes the matrix exponential of A; the identity holds because every eigenvalue λ of A corresponds to the eigenvalue exp(λ) of exp(A). In particular, given any logarithm of A, that is, any matrix L satisfying

$$\exp(L) = A,$$

the determinant of A is given by

$$\det(A) = \exp(\operatorname{tr}(L)).$$
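The identity det(exp(A)) = exp(tr(A)) can be illustrated numerically. The sketch below assumes a 2 × 2 real matrix with small entries and approximates the matrix exponential by a truncated Taylor series, which is adequate for this example but is not a robust general-purpose method; the function names are illustrative.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def mat_exp(a, terms=40):
    """Truncated Taylor series I + A + A^2/2! + ... (fine for small entries)."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, a)]  # A^k / k!
        result = [
            [result[i][j] + term[i][j] for j in range(n)] for i in range(n)
        ]
    return result


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


A = [[0.5, 1.0], [-1.0, 0.25]]
lhs = det2(mat_exp(A))
rhs = math.exp(A[0][0] + A[1][1])
print(math.isclose(lhs, rhs, rel_tol=1e-9))  # True
```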

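The determinant can also be expressed through traces of powers of A. For example, for n = 2, det(A) = ((tr A)^2 − tr(A^2))/2; cf. the Cayley–Hamilton theorem. Such expressions are deducible from combinatorial arguments, Newton's identities, or the Faddeev–LeVerrier algorithm. That is, for generic n, det A = (−1)^n c_0, the signed constant term of the characteristic polynomial, determined recursively from

$$c_n = 1; \qquad c_{n-m} = -\frac{1}{m} \sum_{k=1}^{m} c_{n-m+k} \operatorname{tr}(A^k) \quad (1 \le m \le n).$$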
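A compact sketch of the Faddeev–LeVerrier algorithm follows. It assumes the common formulation with auxiliary matrices M_k = A M_(k−1) + c_(n−k+1) I and coefficients c_(n−k) = −tr(A M_k)/k of the monic characteristic polynomial det(λI − A), and uses exact rational arithmetic; the function names are illustrative.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def char_poly_coefficients(matrix):
    """Coefficients [c_0, ..., c_n] of det(lambda*I - A), with c_n = 1."""
    n = len(matrix)
    a = [[Fraction(x) for x in row] for row in matrix]
    coeffs = [Fraction(0)] * (n + 1)
    coeffs[n] = Fraction(1)
    m = [[Fraction(0)] * n for _ in range(n)]  # M_0 is the zero matrix
    for k in range(1, n + 1):
        m = mat_mul(a, m)
        for i in range(n):
            m[i][i] += coeffs[n - k + 1]       # M_k = A M_(k-1) + c_(n-k+1) I
        am = mat_mul(a, m)
        coeffs[n - k] = -sum(am[i][i] for i in range(n)) / k
    return coeffs


def det_from_char_poly(matrix):
    """det(A) = (-1)^n c_0."""
    n = len(matrix)
    return (-1) ** n * char_poly_coefficients(matrix)[0]


A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
print(det_from_char_poly(A))  # -3
```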

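In the general case, the determinant may also be obtained from

$$\det(A) = \sum_{\substack{k_1, k_2, \ldots, k_n \ge 0 \\ k_1 + 2k_2 + \cdots + nk_n = n}} \prod_{l=1}^{n} \frac{(-1)^{k_l + 1}}{l^{k_l} \, k_l!} \operatorname{tr}(A^l)^{k_l},$$

where the sum is taken over the set of all integers k_l ≥ 0 satisfying the equation

$$\sum_{l=1}^{n} l \, k_l = n.$$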

The formula can be expressed in terms of the complete exponential Bell polynomial of n arguments s_l = −(l − 1)! tr(A^l) as

$$\det(A) = \frac{(-1)^n}{n!} B_n(s_1, s_2, \ldots, s_n).$$

This formula can also be used to find the determinant of a matrix A^I_J with multidimensional indices I = (i_1, i_2, ..., i_r) and J = (j_1, j_2, ..., j_r). The product and trace of such matrices are defined in a natural way as

$$(AB)^I_J = \sum_K A^I_K B^K_J, \qquad \operatorname{tr}(A) = \sum_I A^I_I.$$

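An important identity, valid in arbitrary dimension n, can be obtained from the Mercator series expansion of the logarithm when the expansion converges. If every eigenvalue of A is less than 1 in absolute value,

$$\det(I + A) = \sum_{k=0}^{\infty} \frac{1}{k!} \left( -\sum_{j=1}^{\infty} \frac{(-1)^j}{j} \operatorname{tr}(A^j) \right)^{k},$$

where I is the identity matrix. More generally, if

$$\sum_{k=0}^{\infty} \frac{1}{k!} \left( -\sum_{j=1}^{\infty} \frac{(-1)^j s^j}{j} \operatorname{tr}(A^j) \right)^{k}$$

is expanded as a formal power series in s, then all coefficients of s^m for m > n are zero and the remaining polynomial is det(I + sA).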

For a positive definite matrix A, the trace operator gives the following tight lower and upper bounds on the log determinant

$$\operatorname{tr}(I - A^{-1}) \le \log\det(A) \le \operatorname{tr}(A - I)$$

with equality if and only if A = I. This relationship can be derived via the formula for the KL-divergence between two multivariate normal distributions.
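These bounds are easy to check by hand in the 2 × 2 case. The sketch below assumes a symmetric positive definite 2 × 2 matrix and uses the fact that tr(A^(−1)) = tr(A)/det(A) for 2 × 2 matrices; the function name is illustrative.

```python
import math


def logdet_bounds_2x2(a):
    """Return (tr(I - A^-1), log det A, tr(A - I)) for a positive definite 2 x 2 A."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    trace = a[0][0] + a[1][1]
    trace_of_inverse = trace / det  # valid for 2 x 2 matrices only
    return 2 - trace_of_inverse, math.log(det), trace - 2


lower, logdet, upper = logdet_bounds_2x2([[2.0, 1.0], [1.0, 2.0]])
print(lower <= logdet <= upper)  # True

# For the identity matrix all three quantities coincide (they are zero).
print(logdet_bounds_2x2([[1.0, 0.0], [0.0, 1.0]]))  # (0.0, 0.0, 0.0)
```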

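Also, for a positive definite n × n matrix A,

$$\frac{n}{\operatorname{tr}(A^{-1})} \le \det(A)^{1/n} \le \frac{1}{n} \operatorname{tr}(A) \le \sqrt{\frac{1}{n} \operatorname{tr}(A^2)}.$$

These inequalities can be proved by bringing the matrix A to diagonal form, that is, by expressing the traces and the determinant in terms of the eigenvalues. As such, they represent the well-known fact that the harmonic mean is at most the geometric mean, which is at most the arithmetic mean, which is, in turn, at most the root mean square.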

Historically, determinants were used long before matrices: A determinant was originally defined as a property of a system of linear equations. The determinant "determines" whether the system has a unique solution (which occurs precisely if the determinant is non-zero). In this sense, determinants were first used in the Chinese mathematics textbook The Nine Chapters on the Mathematical Art (九章算術, Chinese scholars, around the 3rd century BCE). In Europe, solutions of linear systems of two equations were expressed by Cardano in 1545 by a determinant-like entity.[22]

Determinants proper originated from the work of Seki Takakazu in 1683 in Japan and, in parallel, from that of Leibniz in 1693.[23][24][25][26] Cramer (1750) stated, without proof, Cramer's rule.[27] Both Cramer and Bézout (1779) were led to determinants by the question of plane curves passing through a given set of points.[28]

Vandermonde (1771) first recognized determinants as independent functions.[24] Laplace (1772) gave the general method of expanding a determinant in terms of its complementary minors; Vandermonde had already given a special case.[29] Immediately following, Lagrange (1773) treated determinants of the second and third order and applied them to questions of elimination theory; he proved many special cases of general identities.

Gauss (1801) made the next advance. Like Lagrange, he made much use of determinants in the theory of numbers. He introduced the word "determinant" (Laplace had used "resultant"), though not in the present signification, but rather as applied to the discriminant of a quantic.[30] Gauss also arrived at the notion of reciprocal (inverse) determinants, and came very near the multiplication theorem.

The next contributor of importance is Binet (1811, 1812), who formally stated the theorem relating to the product of two matrices of m columns and n rows, which for the special case of m = n reduces to the multiplication theorem. On the same day (November 30, 1812) that Binet presented his paper to the Academy, Cauchy also presented one on the subject. (See Cauchy–Binet formula.) In this he used the word "determinant" in its present sense,[31][32] summarized and simplified what was then known on the subject, improved the notation, and gave the multiplication theorem with a proof more satisfactory than Binet's.[24][33] With him begins the theory in its generality.

Jacobi (1841) used the functional determinant which Sylvester later called the Jacobian.[34] In his memoirs in Crelle's Journal for 1841 he specially treats this subject, as well as the class of alternating functions which Sylvester has called alternants. About the time of Jacobi's last memoirs, Sylvester (1839) and Cayley began their work. Cayley (1841) introduced the modern notation for the determinant using vertical bars.[35][36]

The study of special forms of determinants has been the natural result of the completion of the general theory. Axisymmetric determinants have been studied by Lebesgue, Hesse, and Sylvester; persymmetric determinants by Sylvester and Hankel; circulants by Catalan, Spottiswoode, Glaisher, and Scott; skew determinants and Pfaffians, in connection with the theory of orthogonal transformation, by Cayley; continuants by Sylvester; Wronskians (so called by Muir) by Christoffel and Frobenius; compound determinants by Sylvester, Reiss, and Picquet; Jacobians and Hessians by Sylvester; and symmetric gauche determinants by Trudi. Of the textbooks on the subject Spottiswoode's was the first. In America, Hanus (1886), Weld (1893), and Muir/Metzler (1933) published treatises.

The determinant can be thought of as assigning a number to every sequence of n vectors in R^n, by using the square matrix whose columns are the given vectors. For instance, an orthogonal n × n matrix with real entries represents an orthonormal basis of Euclidean space, formed by its columns. The determinant of such a matrix determines whether the orientation of the basis is consistent with or opposite to the orientation of the standard basis. If the determinant is +1, the basis has the same orientation. If it is −1, the basis has the opposite orientation.

More generally, if the determinant of A is positive, A represents an orientation-preserving linear transformation (if A is an orthogonal 2 × 2 or 3 × 3 matrix, this is a rotation), while if it is negative, A switches the orientation of the basis.

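For a general differentiable function, much of the above carries over by considering the Jacobian matrix of f. For

$$f : \mathbf{R}^n \to \mathbf{R}^n,$$

the Jacobian matrix is the n × n matrix whose entries are given by the partial derivatives

$$D(f) = \left( \frac{\partial f_i}{\partial x_j} \right)_{1 \le i, j \le n}.$$

Its determinant, the Jacobian determinant, appears in the higher-dimensional version of integration by substitution: for suitable functions f and an open subset U of R^n (the domain of f), the integral over f(U) of some other function φ : R^n → R^m is given by

$$\int_{f(U)} \varphi(\mathbf{v}) \, d\mathbf{v} = \int_U \varphi(f(\mathbf{u})) \left| \det(D f)(\mathbf{u}) \right| \, d\mathbf{u}.$$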
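A familiar instance is the change to polar coordinates, where the Jacobian determinant of (r, θ) ↦ (r cos θ, r sin θ) equals r. The sketch below estimates the Jacobian determinant of a map of the plane by central finite differences; the step size `h` and the function names are assumptions made for illustration.

```python
import math


def polar_to_cartesian(r, theta):
    """The map (r, theta) -> (x, y)."""
    return (r * math.cos(theta), r * math.sin(theta))


def jacobian_det_2d(f, x, y, h=1e-6):
    """Central-difference estimate of the Jacobian determinant of f at (x, y)."""
    fx_plus, fx_minus = f(x + h, y), f(x - h, y)
    fy_plus, fy_minus = f(x, y + h), f(x, y - h)
    d11 = (fx_plus[0] - fx_minus[0]) / (2 * h)  # df_1/dx
    d21 = (fx_plus[1] - fx_minus[1]) / (2 * h)  # df_2/dx
    d12 = (fy_plus[0] - fy_minus[0]) / (2 * h)  # df_1/dy
    d22 = (fy_plus[1] - fy_minus[1]) / (2 * h)  # df_2/dy
    return d11 * d22 - d12 * d21


estimate = jacobian_det_2d(polar_to_cartesian, 2.0, 0.7)
print(math.isclose(estimate, 2.0, abs_tol=1e-6))  # True: the determinant is r
```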

The above identities concerning the determinant of products and inverses of matrices imply that similar matrices have the same determinant: two matrices A and B are similar if there exists an invertible matrix X such that A = X^{−1}BX. Indeed, repeatedly applying the above identities yields

$$\det(A) = \det(X)^{-1} \det(B) \det(X) = \det(B) \det(X)^{-1} \det(X) = \det(B).$$

The determinant is therefore also called a similarity invariant. The determinant of a linear transformation

$$T : V \to V$$

for some finite-dimensional vector space V is defined to be the determinant of the matrix describing it, with respect to an arbitrary choice of basis in V. By the similarity invariance, this determinant is independent of the choice of the basis for V and therefore only depends on the endomorphism T.

For matrices with an infinite number of rows and columns, the above definitions of the determinant do not carry over directly. For example, in the Leibniz formula, an infinite sum (all of whose terms are infinite products) would have to be calculated. Functional analysis provides different extensions of the determinant for such infinite-dimensional situations, which however only work for particular kinds of operators.

The Fredholm determinant defines the determinant for operators known as trace class operators by an appropriate generalization of the formula

$$\det(I + A) = \exp(\operatorname{tr}(\log(I + A))).$$

Another infinite-dimensional notion of determinant is the functional determinant.

For operators in a finite factor, one may define a positive real-valued determinant called the Fuglede–Kadison determinant using the canonical trace. In fact, corresponding to every tracial state on a von Neumann algebra there is a notion of Fuglede–Kadison determinant.

For matrices over non-commutative rings, multilinearity and alternating properties are incompatible for n ≥ 2,[47] so there is no good definition of the determinant in this setting.

Determinants are mainly used as a theoretical tool. They are rarely calculated explicitly in numerical linear algebra, where for applications like checking invertibility and finding eigenvalues the determinant has largely been supplanted by other techniques.[49] Computational geometry, however, does frequently use calculations related to determinants.[50]

If the determinant of A and the inverse of A have already been computed, the matrix determinant lemma allows rapid calculation of the determinant of A + uv^T, where u and v are column vectors.
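The lemma states that det(A + uv^T) = (1 + v^T A^(−1) u) det(A). The sketch below checks this on a 2 × 2 example with exact rational arithmetic; restricting to 2 × 2 matrices and the function names are choices made here for brevity.

```python
from fractions import Fraction


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inverse2(m):
    """Inverse of an invertible 2 x 2 matrix via the adjugate."""
    d = Fraction(det2(m))
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


def det_rank_one_update(det_a, inv_a, u, v):
    """det(A + u v^T) from a precomputed det(A) and A^-1 (2 x 2 case)."""
    inv_a_u = [sum(inv_a[i][k] * u[k] for k in range(2)) for i in range(2)]
    return (1 + sum(v[i] * inv_a_u[i] for i in range(2))) * det_a


A = [[2, 1], [1, 3]]
u, v = [1, 2], [3, 1]
updated = [[A[i][j] + u[i] * v[j] for j in range(2)] for i in range(2)]
print(det2(updated), det_rank_one_update(det2(A), inverse2(A), u, v))  # 11 11
```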

Charles Dodgson (i.e. Lewis Carroll of Alice's Adventures in Wonderland fame) invented a method for computing determinants called Dodgson condensation. Unfortunately this interesting method does not always work in its original form.[citation needed]
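A minimal sketch of the condensation step follows, assuming the usual formulation: each entry of the next, smaller matrix is a 2 × 2 connected minor of the current matrix divided by the corresponding interior entry of the matrix from two steps earlier. A zero interior entry causes a division by zero, which is the sense in which the original form can fail. The function name is illustrative.

```python
from fractions import Fraction


def det_dodgson(matrix):
    """Determinant by Dodgson condensation (no row exchanges attempted)."""
    current = [[Fraction(x) for x in row] for row in matrix]
    n = len(current)
    # Before the first step, the "previous" matrix is taken to be all ones.
    previous = [[Fraction(1)] * (n + 1) for _ in range(n + 1)]
    while len(current) > 1:
        size = len(current) - 1
        condensed = [
            [
                (current[i][j] * current[i + 1][j + 1]
                 - current[i][j + 1] * current[i + 1][j])
                / previous[i + 1][j + 1]  # interior entry; may be zero
                for j in range(size)
            ]
            for i in range(size)
        ]
        previous, current = current, condensed
    return current[0][0]


print(det_dodgson([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))  # -3

try:
    det_dodgson([[1, 2, 3], [4, 0, 6], [7, 8, 10]])  # interior entry is 0
except ZeroDivisionError:
    print("condensation failed: zero interior entry")
```

The second matrix has determinant 52, so the failure reflects the method rather than the matrix being singular.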