Probability space

A probability space consists of three elements: a sample space Ω, the set of all possible outcomes; a σ-algebra F of subsets of Ω, whose members are called events; and a probability measure P that assigns a probability to each event. In order to provide a sensible model of probability, these elements must satisfy the probability axioms, detailed in this article.
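Concretely, the axioms require the probability measure P to satisfy the following three conditions (this is the standard Kolmogorov formulation):

    (1) Non-negativity: P(A) ≥ 0 for every event A ∈ F.
    (2) Normalization: P(Ω) = 1.
    (3) Countable additivity: P(A₁ ∪ A₂ ∪ …) = P(A₁) + P(A₂) + … for every countable sequence of pairwise disjoint events A₁, A₂, … in F.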

The Russian mathematician Andrey Kolmogorov introduced the notion of a probability space, together with other axioms of probability, in the 1930s. In modern probability theory there are a number of alternative approaches to axiomatization, for example the algebra of random variables.

In short, a probability space is a measure space such that the measure of the whole space is equal to one.
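To make this concrete, here is a minimal Python sketch of a finite probability space; the class name FiniteProbabilitySpace and the numerical tolerance are illustrative choices, not taken from any particular library.

    class FiniteProbabilitySpace:
        """A finite probability space in which every subset of Ω is an event."""

        def __init__(self, masses):
            # masses maps each outcome to its probability.
            if any(m < 0 for m in masses.values()):
                raise ValueError("probabilities must be non-negative")
            if abs(sum(masses.values()) - 1.0) > 1e-9:
                raise ValueError("the measure of the whole space must equal one")
            self.masses = dict(masses)

        def prob(self, event):
            # P(A) is the sum of the masses of the outcomes in A.
            return sum(self.masses[w] for w in event)

    # A fair six-sided die: Ω = {1, ..., 6}, each outcome with mass 1/6.
    die = FiniteProbabilitySpace({w: 1 / 6 for w in range(1, 7)})
    print(die.prob({2, 4, 6}))  # P(even) = 0.5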

If Ω is uncountable, it may still happen that P({ω}) ≠ 0 for some ω; such points ω are called atoms. They form an at most countable (possibly empty) set, whose probability is the sum of the probabilities of all atoms. If this sum equals 1, then all other points can safely be excluded from the sample space, returning us to the discrete case. Otherwise, if the sum of the probabilities of all atoms is strictly between 0 and 1, the probability space decomposes into a discrete (atomic) part (possibly empty) and a non-atomic part.
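As an illustration, the following sketch samples from a hypothetical mixed distribution; the atom location (0) and its mass (0.3) are arbitrary values chosen for this example.

    import random

    ATOM, ATOM_MASS = 0.0, 0.3   # one atom at 0 carrying mass 0.3 (illustrative)

    def sample():
        # Discrete (atomic) part with probability 0.3,
        # non-atomic part (uniform on (0, 1)) with probability 0.7.
        if random.random() < ATOM_MASS:
            return ATOM
        return random.uniform(0.0, 1.0)

    draws = [sample() for _ in range(100_000)]
    # Only the atom is hit with positive frequency:
    print(sum(d == ATOM for d in draws) / len(draws))  # ≈ 0.3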

When Ω is the unit interval [0,1], the open intervals of the form (a,b), where 0 < a < b < 1, can be taken as the generator sets. Each such set is ascribed the probability P((a,b)) = b − a, which generates the Lebesgue measure on [0,1] and the Borel σ-algebra on Ω.
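A quick Monte Carlo check of this identity might look as follows; the endpoints a = 0.25 and b = 0.65 are illustrative. Uniform draws from [0,1] should fall in (a,b) with frequency close to b − a.

    import random

    a, b = 0.25, 0.65
    n = 100_000
    hits = sum(a < random.random() < b for _ in range(n))
    print(hits / n, b - a)  # both ≈ 0.40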

A random variable X is a measurable function X: Ω → S from the sample space Ω to another measurable space S called the state space.
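For instance, with Ω the 36 equally likely outcomes of two fair dice and X the sum of the two faces, the distribution of X (the pushforward of P under X) can be tabulated directly; the setup below is a sketch for this example only.

    from collections import defaultdict

    omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
    p = {w: 1 / 36 for w in omega}    # uniform measure on Ω

    def X(w):
        return w[0] + w[1]            # the random variable: sum of two dice

    dist = defaultdict(float)         # distribution of X on the state space S
    for w in omega:
        dist[X(w)] += p[w]

    print(dist[7])  # P(X = 7) = 6/36 ≈ 0.167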

Kolmogorov's definition of probability spaces gives rise to the natural concept of conditional probability. Every set A with non-zero probability (that is, P(A) > 0) defines another probability measure on the space, namely P(B | A) = P(A ∩ B) / P(A), usually read as "the probability of B given A".

For any event B such that P(B) > 0, the function Q defined by Q(A) = P(A | B) for all events A is itself a probability measure.
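On a finite space this is easy to verify directly. In the sketch below (a fair die, with B the event "the roll is even", both illustrative choices), Q assigns each even face probability 1/3 and gives the whole space probability 1.

    p = {w: 1 / 6 for w in range(1, 7)}    # a fair die

    def prob(event):
        return sum(p[w] for w in event)

    B = {2, 4, 6}                          # condition on "the roll is even"

    def Q(A):
        return prob(A & B) / prob(B)       # Q(A) = P(A | B)

    print(Q({2}))               # 1/3
    print(Q(set(range(1, 7))))  # Q(Ω) = 1, as required of a probability measure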

Two events, A and B, are said to be independent if P(A ∩ B) = P(A) P(B).

Two random variables, X and Y, are said to be independent if any event defined in terms of X is independent of any event defined in terms of Y. Formally, they generate independent σ-algebras, where two σ-algebras G and H, which are subsets of F, are said to be independent if any element of G is independent of any element of H.
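Independence can be checked numerically on a small example. Below, Ω is the 36 equally likely outcomes of two fair dice; an event determined by the first die is independent of an event determined by the second (the particular events are illustrative).

    omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
    p = {w: 1 / 36 for w in omega}

    def prob(event):
        return sum(p[w] for w in event)

    A = {w for w in omega if w[0] <= 2}      # first die shows 1 or 2
    B = {w for w in omega if w[1] % 2 == 0}  # second die is even

    print(prob(A & B), prob(A) * prob(B))    # both 1/6, so A and B are independent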

Two events, A and B, are said to be mutually exclusive or disjoint if the occurrence of one implies the non-occurrence of the other, i.e., their intersection is empty. This is a stronger condition than merely requiring the probability of their intersection to be zero.

If A and B are disjoint events, then P(A ∪ B) = P(A) + P(B). This extends to any (finite or countably infinite) sequence of pairwise disjoint events. However, the probability of the union of an uncountable set of events is not in general the sum of their probabilities. For example, if Z is a normally distributed random variable, then P(Z = x) = 0 for every x, yet P(Z ∈ ℝ) = 1.
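The finite case is easy to verify on a fair die. In the sketch below, A and B are disjoint, and the probability of their union equals the sum of their probabilities; note that Python's | here denotes set union, i.e., the event A ∪ B.

    p = {w: 1 / 6 for w in range(1, 7)}    # a fair die

    def prob(event):
        return sum(p[w] for w in event)

    A, B = {1, 2}, {5}                     # disjoint: A ∩ B is empty
    assert not (A & B)
    print(prob(A | B), prob(A) + prob(B))  # both 0.5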

The event A ∩ B is referred to as "A and B", and the event A ∪ B as "A or B".

The first major treatise blending calculus with probability theory; originally published in French as Théorie analytique des probabilités.
The modern measure-theoretic foundation of probability theory; the original German version (Grundbegriffe der Wahrscheinlichkeitsrechnung) appeared in 1933.
An empiricist, Bayesian approach to the foundations of probability theory.
Foundations of probability theory based on nonstandard analysis. Downloadable.
A lively introduction to probability theory for the beginner, Cambridge Univ. Press.
An undergraduate introduction to measure-theoretic probability, Cambridge Univ. Press.