Redundancy (information theory)

In describing the redundancy of raw data, the rate of a source of information is the average entropy per symbol. For memoryless sources, this is merely the entropy of each symbol, while, in the most general case of a stochastic process, it is

r = \lim_{n \to \infty} \frac{1}{n} H(M_1, M_2, \dots, M_n),

the limit, as n goes to infinity, of the joint entropy of the first n symbols divided by n.

The absolute rate of a language or source is simply

R = \log |\mathbb{M}|,

the logarithm of the cardinality of the message space, or alphabet. (This formula is sometimes called the Hartley function.) This is the maximum possible rate of information that can be transmitted with that alphabet. (The logarithm should be taken to a base appropriate for the unit of measurement in use.) The absolute rate is equal to the actual rate if the source is memoryless and has a uniform distribution. The absolute redundancy can then be defined as

D = R - r,

the difference between the absolute rate and the rate.
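As a concrete illustration, the following minimal sketch computes the rate, absolute rate, and absolute redundancy for a hypothetical memoryless source over a four-symbol alphabet. The distribution is an assumption chosen for illustration; for a memoryless source the rate reduces to the single-symbol entropy H(X).

```python
import math

def entropy(probs, base=2):
    """Shannon entropy H(X) = -sum(p * log(p)), in bits for base 2."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# Hypothetical memoryless source over a 4-symbol alphabet (illustrative values).
probs = [0.5, 0.25, 0.125, 0.125]

r = entropy(probs)           # rate of a memoryless source: H(X) = 1.75 bits/symbol
R = math.log2(len(probs))    # absolute rate (Hartley function): log2 |M| = 2 bits/symbol
D = R - r                    # absolute redundancy: 0.25 bits/symbol

print(f"rate r = {r} bits/symbol, absolute rate R = {R}, redundancy D = {D}")
```

Note that D vanishes exactly when the distribution is uniform, matching the statement above that the absolute rate equals the actual rate for a memoryless source with a uniform distribution.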

A measure of redundancy between two variables is the mutual information or a normalized variant. A measure of redundancy among many variables is given by the total correlation.
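As a sketch of these two measures, the code below computes the mutual information I(X;Y) = H(X) + H(Y) - H(X,Y) from a small hypothetical joint distribution; the same expression, the sum of the marginal entropies minus the joint entropy, is the total correlation when extended to more than two variables, and the two quantities coincide in the two-variable case. The joint distribution and the choice of normalization are assumptions for illustration; several normalized variants of mutual information exist.

```python
import math

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical joint distribution p(x, y) over two binary variables.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

# Marginal distributions p(x) and p(y).
px, py = {}, {}
for (x, y), p in joint.items():
    px[x] = px.get(x, 0.0) + p
    py[y] = py.get(y, 0.0) + p

H_x, H_y = entropy(px.values()), entropy(py.values())
H_xy = entropy(joint.values())

# Mutual information: I(X;Y) = H(X) + H(Y) - H(X,Y).
# For n variables, sum_i H(X_i) - H(X_1, ..., X_n) is the total correlation.
mi = H_x + H_y - H_xy

# One possible normalized variant (an assumption; other normalizations
# divide by sqrt(H(X) * H(Y)) or by H(X,Y) instead).
nmi = mi / min(H_x, H_y)

print(f"I(X;Y) = {mi:.4f} bits, normalized = {nmi:.4f}")
```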