Euclidean distance

Using the Pythagorean theorem to compute two-dimensional Euclidean distance

In mathematics, the Euclidean distance between two points in Euclidean space is the length of a line segment between the two points. It can be calculated from the Cartesian coordinates of the points using the Pythagorean theorem, therefore occasionally being called the Pythagorean distance. These names come from the ancient Greek mathematicians Euclid and Pythagoras, although Euclid did not represent distances as numbers, and the connection from the Pythagorean theorem to distance calculation was not made until the 18th century.

The distance between two objects that are not points is usually defined to be the smallest distance among pairs of points from the two objects. Formulas are known for computing distances between different types of objects, such as the distance from a point to a line. In advanced mathematics, the concept of distance has been generalized to abstract metric spaces, and other distances than Euclidean have been studied. In some applications in statistics and optimization, the square of the Euclidean distance is used instead of the distance itself.

In three dimensions, for points given by their Cartesian coordinates, the distance is

For pairs of objects that are not both points, the distance can most simply be defined as the smallest distance between any two points from the two objects, although more complicated generalizations from points to sets such as Hausdorff distance are also commonly used.[6] Formulas for computing distances between different types of objects include:

The Euclidean distance is the prototypical example of the distance in a metric space,[9] and obeys all the defining properties of a metric space:[10]

In many applications, and in particular when comparing distances, it may be more convenient to omit the final square root in the calculation of Euclidean distances. The value resulting from this omission is the square of the Euclidean distance, and is called the squared Euclidean distance.[13] As an equation, it can be expressed as a sum of squares:

Beyond its application to distance comparison, squared Euclidean distance is of central importance in statistics, where it is used in the method of least squares, a standard method of fitting statistical estimates to data by minimizing the average of the squared distances between observed and estimated values.[14] The addition of squared distances to each other, as is done in least squares fitting, corresponds to an operation on (unsquared) distances called Pythagorean addition.[15] In cluster analysis, squared distances can be used to strengthen the effect of longer distances.[13]

Squared Euclidean distance does not form a metric space, as it does not satisfy the triangle inequality.[16] However it is a smooth, strictly convex function of the two points, unlike the distance, which is non-smooth (near pairs of equal points) and convex but not strictly convex. The squared distance is thus preferred in optimization theory, since it allows convex analysis to be used. Since squaring is a monotonic function of non-negative values, minimizing squared distance is equivalent to minimizing the Euclidean distance, so the optimization problem is equivalent in terms of either, but easier to solve using squared distance.[17]

The collection of all squared distances between pairs of points from a finite set may be stored in a Euclidean distance matrix, and is used in this form in distance geometry.[18]

In more advanced areas of mathematics, when viewing Euclidean space as a vector space, its distance is associated with a norm called the Euclidean norm, defined as the distance of each vector from the origin. One of the important properties of this norm, relative to other norms, is that it remains unchanged under arbitrary rotations of space around the origin.[19] By Dvoretzky's theorem, every finite-dimensional normed vector space has a high-dimensional subspace on which the norm is approximately Euclidean; the Euclidean norm is the only norm with this property.[20] It can be extended to infinite-dimensional vector spaces as the L2 norm or L2 distance.[21] The Euclidean distance gives Euclidean space the structure of a topological space, the Euclidean topology, with the open balls (subsets of points at less than a given distance from a given point) as its neighborhoods.[22]

Other common distances on Euclidean spaces and low-dimensional vector spaces include:[23]

For points on surfaces in three dimensions, the Euclidean distance should be distinguished from the geodesic distance, the length of a shortest curve that belongs to the surface. In particular, for measuring great-circle distances on the earth or other spherical or near-spherical surfaces, distances that have been used include the haversine distance giving great-circle distances between two points on a sphere from their longitudes and latitudes, and Vincenty's formulae also known as "Vincent distance" for distance on a spheroid.[24]

Euclidean distance is the distance in Euclidean space; both concepts are named after ancient Greek mathematician Euclid, whose Elements became a standard textbook in geometry for many centuries.[25] Concepts of length and distance are widespread across cultures, can be dated to the earliest surviving "protoliterate" bureaucratic documents from Sumer in the fourth millennium BC (far before Euclid),[26] and have been hypothesized to develop in children earlier than the related concepts of speed and time.[27] But the notion of a distance, as a number defined from two points, does not actually appear in Euclid's Elements. Instead, Euclid approaches this concept implicitly, through the congruence of line segments, through the comparison of lengths of line segments, and through the concept of proportionality.[28]

The Pythagorean theorem is also ancient, but it could only take its central role in the measurement of distances after the invention of Cartesian coordinates by René Descartes in 1637. The distance formula itself was first published in 1731 by Alexis Clairaut.[29] Because of this formula, Euclidean distance is also sometimes called Pythagorean distance.[30] Although accurate measurements of long distances on the earth's surface, which are not Euclidean, had again been studied in many cultures since ancient times (see history of geodesy), the idea that Euclidean distance might not be the only way of measuring distances between points in mathematical spaces came even later, with the 19th-century formulation of non-Euclidean geometry.[31] The definition of the Euclidean norm and Euclidean distance for geometries of more than three dimensions also first appeared in the 19th century, in the work of Augustin-Louis Cauchy.[32]