Mises Develops the Frequency Theory of Probability

Richard von Mises used the concept of relative frequency to develop the first precise empirical interpretation of probability.

Summary of Event

The classical theory of probability, the earliest detailed analysis of probability, was developed by Pierre-Simon Laplace in 1820. Laplace proposed his principle of indifference as the foundation for understanding probabilities. This principle stipulates that the probability of an event is the number of favorable outcomes of the event divided by the total number of cases possible, given no particular reason to prefer any of those cases. Thus the probability of heads on a coin toss is ½, as there is one favorable case out of two “equipossible” cases. Laplace’s theory served quite well in systematizing earlier work by Blaise Pascal and others on the probabilities of gambling devices. Over the course of the nineteenth century, however, it was gradually realized that games of chance formed an artificial set of problems, where the outcomes of, say, the roll of a die were already fixed to be equiprobable. When the principle of indifference is applied to other situations, a variety of absurdities may result. For example, consider the probability that it will rain tomorrow. The two outcomes of interest are rain and no rain, so the indifference principle suggests an assignment of equal probabilities to each, namely, ½. If one then asks “What is the probability that it will snow tomorrow?” the answer must come out the same. Treating rain and snow as mutually exclusive, the probability of rain or snow is therefore ½ + ½ = 1, which leaves no room for a dry day.

Such difficulties led John Venn (in his Logic of Chance, 1866), Charles Sanders Peirce, and other nineteenth-century thinkers to identify probabilities with the observed frequencies of an outcome relative to a reference class, or relative frequencies. Thus the probability of a sunny day tomorrow would be identified with the empirical frequency of sunny days in the given location. Probabilities cannot be identified with frequencies within a finite sample, however. For example, if a coin is tossed ten times, one may get a frequency of heads of 4/10. A second series of ten tosses may yield six heads. Clearly, the probability of heads is not both 4/10 and 6/10; indeed, it is most likely neither. Frequency theorists instead equated the probability with the relative frequency in “the long run,” that is, what happens if the sampling is continued without limit. Unfortunately, the frequentists of the nineteenth century did not provide an unambiguous account of long-run frequencies. Consequently, in 1921, John Maynard Keynes regretted that he could not find a careful analysis of the frequency theory to criticize. About that time, Richard von Mises was developing such a detailed exposition, initially in technical papers (1919) and subsequently in the book Probability, Statistics, and Truth (1928).

Mises conceived of his task as replacing a commonsense, vague notion of probability with a scientific, precise concept. For this purpose, he restricted himself to considering “mass phenomena,” that is, the infinite sequences that would be formed by indefinitely repeating an experiment or event. Such a sequence is a collective only if it satisfies two requirements: the axiom of convergence and the axiom of randomness. The axiom of convergence postulates that as a sequence is extended, the proportion of outcomes having a specified property shall tend toward a definite mathematical limit. This corresponds to what Mises called the empirical law of large numbers: Repeating experiments tends to produce stable frequencies of outcomes.
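
In modern notation, the axiom of convergence can be sketched as follows. The symbols are introduced here for illustration and are not Mises's own: x_1, x_2, ... is the infinite sequence of outcomes, A is the property of interest, and n_A(n) counts how many of the first n outcomes have A.

```latex
\[
  P(A) \;=\; \lim_{n \to \infty} \frac{n_A(n)}{n},
  \qquad
  n_A(n) \;=\; \#\{\, i \le n : x_i \text{ has property } A \,\}.
\]
```

The axiom asserts that this limit exists. The probability of A within the collective is then defined to be its value.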

The axiom of randomness was Mises’s most original contribution to the definition of probability. The idea behind it is that probabilities depend on more than merely stable frequencies in the limit. Thus the sequence of heads and tails (T, H, T, H, T, H, . . .) will yield the limiting frequency of ½ for heads; however, the probability of an H in the Nth location is not ½, but either 1 or 0, depending on whether the index N is even or odd. To eliminate such trivialization, one must disallow sequences open to useful predictions based on their initial outcomes; in other words, one must rule out the possibility of successful gambling systems. The axiom of randomness is designed to do so. It requires that the limits of the relative frequencies be unaltered by any place selection, where a place selection picks out a subsequence whose members are chosen only on the basis of their index numbers in the original sequence and knowledge of prior members of the sequence. The alternating sequence above is not a collective, because the place selection of even-numbered outcomes has a different limiting frequency of heads (namely, one) from that of the original sequence.
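
The failure of the alternating sequence can be checked on a finite prefix. The following is a minimal sketch rather than a formal treatment: the helper names `relative_frequency` and `select_places` are hypothetical, and a prefix of 1,000 tosses stands in for the infinite sequence.

```python
from fractions import Fraction


def relative_frequency(outcomes, target="H"):
    """Fraction of the outcomes in a finite prefix that equal `target`."""
    return Fraction(sum(1 for o in outcomes if o == target), len(outcomes))


def select_places(outcomes, rule):
    """Apply a place selection to a finite prefix.

    `rule(n, prior)` decides whether to keep the outcome at 1-based index n,
    seeing only the index and the list of earlier outcomes, never the
    outcome at n itself.
    """
    return [o for n, o in enumerate(outcomes, start=1)
            if rule(n, outcomes[:n - 1])]


# Finite prefix of the alternating sequence T, H, T, H, ...
alternating = ["T", "H"] * 500

# Place selection by index alone: keep the even-numbered tosses.
even_places = select_places(alternating, lambda n, prior: n % 2 == 0)

# Place selection by prior outcomes: bet only right after a tail.
after_tails = select_places(
    alternating, lambda n, prior: bool(prior) and prior[-1] == "T"
)

print(relative_frequency(alternating))  # 1/2
print(relative_frequency(even_places))  # 1
print(relative_frequency(after_tails))  # 1
```

Both selections amount to a winning gambling system: each changes the frequency of heads from ½ to 1, which is exactly what the axiom of randomness forbids in a collective.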

A problem with this definition of place selection is that it imposes no clear restriction on which subsequences count as a place selection and which do not. Any subsequence will be selected by some infinite binary number, where each digit “1” is used to indicate that the corresponding element of the original sequence goes into the subsequence and the digit “0” indicates the element is excluded. If every such binary number counted as a place selection, then one of them would pick out exactly the heads, and no sequence with a limiting frequency strictly between 0 and 1 could be a collective. Alonzo Church answered this objection in 1940. He restricted the allowable place selections to those that are Turing computable (calculable by a computer), given the index of the element and prior outcomes. Thus a sequence is a collective only if none of the computable place selections alters its limit frequencies.
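
The difficulty with unrestricted binary masks can be illustrated on a short, made-up run of tosses. This is a sketch under stated assumptions: the ten-toss string and the names `apply_mask` and `heads_frequency` are hypothetical, and the finite list stands in for an infinite sequence.

```python
from fractions import Fraction


def apply_mask(outcomes, mask):
    """Keep the outcomes whose mask digit is 1, drop those whose digit is 0."""
    return [o for o, bit in zip(outcomes, mask) if bit == 1]


def heads_frequency(outcomes):
    """Relative frequency of heads in a finite list of tosses."""
    return Fraction(outcomes.count("H"), len(outcomes))


tosses = list("HTTHTHHTTH")  # arbitrary illustrative data

# A mask that "knows" each outcome in advance. It exists as a string of
# binary digits, but it is NOT computed from the index and prior outcomes.
cheating_mask = [1 if o == "H" else 0 for o in tosses]
selected = apply_mask(tosses, cheating_mask)

print(heads_frequency(tosses))    # 1/2
print(heads_frequency(selected))  # 1
```

Because such a mask exists for any sequence, allowing all masks would leave no collectives. Church's restriction excludes this one: its digit at each position depends on the outcome at that very position, not on something computable from the index and earlier outcomes.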

Limiting frequencies within collectives are provably probabilities in the sense of the modern mathematical theory; that is, they satisfy the axioms of probability theory put forward by Andrey Nikolayevich Kolmogorov in 1933. Mises had succeeded in providing an objective and empirical conception of probability: Rather than attempting to identify probabilities with a priori analyses of the state of ignorance, like Laplace, he defined them in terms of an idealization from observed frequencies. The task of statistics would be to identify what kinds of experiments are capable of generating collectives and to educe the associated probabilities.

Mises’s theory, however, has remained highly controversial. If a probability is identified with a limit in an infinite sequence, then any finite sequence can be attached to the front of the sequence without affecting the probability in the least. For example, if the probability of heads in a collective is ½, then it will remain ½ if one puts any number of heads, say one trillion, in front of the sequence. The unfortunate consequence of this mathematical result is that any observation of finite frequencies is compatible with any probability at all, so it seems that statistics can find no foundation in Mises’s theory.
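
The arithmetic behind this objection can be sketched directly. The example assumes, as an idealization, that the original sequence shows exactly half heads in its first n tosses (n even); the function name `heads_frequency` is hypothetical.

```python
from fractions import Fraction


def heads_frequency(prefix_heads, tail_tosses):
    """Frequency of heads after `prefix_heads` heads are placed in front of
    a sequence, once `tail_tosses` tosses of that sequence have been seen.

    Assumes the original sequence has exactly tail_tosses / 2 heads among
    its first tail_tosses outcomes (tail_tosses even).
    """
    total = prefix_heads + tail_tosses
    heads = prefix_heads + tail_tosses // 2
    return Fraction(heads, total)


TRILLION = 10**12

# The prepended heads dominate at first, then wash out in the limit.
for exponent in (12, 15, 18, 21):
    f = heads_frequency(TRILLION, 10**exponent)
    print(f"tail length 10^{exponent}: frequency of heads = {float(f):.9f}")
```

After a tail as long as the prefix, the observed frequency of heads is still 3/4, yet it approaches ½ as the tail grows without bound. No finite stretch of observations therefore fixes the limit.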

Mises attempted to defend his theory by invoking the law of large numbers: Assuming a hypothetical probability (limit frequency) for a sequence, then given a large enough sample from that sequence, it can be deduced that there is a high probability that the sample frequency will be very close to the hypothetical probability—that the sample estimate will be good. This argument has not been accepted because it leads immediately to an infinite regress: The “high probability” of getting a good sample must itself be understood as a limit frequency on Mises’s account, and so is again compatible with any finite number of bad samples that diverge arbitrarily from the probability being estimated.
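
The deduction Mises appeals to can be sketched with Chebyshev's inequality, under the added assumption (not stated in the text above) that the n trials are independent and share the same probability p of the outcome. The symbols f_n for the sample frequency and ε for the tolerated error are introduced here for illustration.

```latex
\[
  \Pr\bigl( \lvert f_n - p \rvert \ge \varepsilon \bigr)
  \;\le\; \frac{p(1-p)}{n\,\varepsilon^{2}}
  \;\le\; \frac{1}{4\,n\,\varepsilon^{2}} .
\]
```

With n = 10,000 and ε = 0.05, for instance, the right-hand bound is 0.01, so a bad sample is improbable. The regress is visible in the notation: the outer Pr is itself a probability, which on Mises's account must again be read as a limit frequency.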

The price of Mises’s insistence on considering only mass phenomena was that it left probability statements impossible to confirm or falsify. It also led Mises to the implausible view that ascriptions of probability to individual events are strictly meaningless.

Significance

Problems of this kind led Sir Karl Raimund Popper to introduce (in the 1950s) a related conception, the propensity interpretation of probability. He criticized Mises’s theory on the ground that if one introduced even one toss of a biased coin into an infinite sequence of fair tosses, then the probability of heads for that toss must be different from ½, regardless of the limit frequency. Popper then identified probabilities with the individual experiments, rather than infinite repetitions of experiments. The individual experiment has the disposition or propensity to produce a limiting relative frequency if it were to be repeated indefinitely. Popper’s theory is thus empirical in the same sense as Mises’s theory, but it directly applies to finite cases. Objections have been raised to this theory as well, among them the problem of the reference class: It is unclear what to count as a repetition of the same experiment. Different descriptions of an experiment lead to very different infinite sequences and therefore conflicting propensities for a single experiment.

Another influential competitor to the frequency view is the subjective interpretation of probability (Bayesianism). This view identifies probabilities with rational degrees of belief. It is possible to discover how strongly a person believes something by finding the odds that person would accept for a bet on that proposition. Furthermore, using certain natural assumptions, it has been proved that rational degrees of belief obey Kolmogorov’s axioms of probability. Nevertheless, there remain many objections to simply identifying probabilities with the strength of beliefs. Subjectivism fails to do justice to the common intuition that the flip of a coin has an objective probability of ½ of heads. In 1937, Bruno de Finetti attempted to answer this with a convergence theorem: In the long run, the probabilities that scientists ascribe to hypotheses will converge, as long as they update those probabilities with empirical evidence using Bayes’s theorem. If two scientists start out with radically divergent probabilities, however, then even an enormous number of experiments will fail to bring their assessments together. In actual practice, agreement is normally reached within decades rather than millennia, so there must be some rational constraint on prior probabilities that subjectivism does not acknowledge. The subjective Bayesian view is at least incomplete.
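
Both halves of this argument, convergence from moderate starting points and its failure from radical ones, can be sketched with a simple model. The Beta prior over a coin's heads probability is an assumption chosen for illustration, not de Finetti's own construction, and the agent names and the function `posterior_mean` are hypothetical.

```python
from fractions import Fraction


def posterior_mean(prior_heads, prior_tails, heads, tails):
    """Posterior mean of the heads probability for an agent who starts from
    a Beta(prior_heads, prior_tails) prior and then observes the given
    numbers of heads and tails, updating by Bayes's theorem."""
    return Fraction(prior_heads + heads,
                    prior_heads + prior_tails + heads + tails)


# Shared evidence: 1,000 tosses, half of them heads.
heads, tails = 500, 500

open_minded = posterior_mean(1, 1, heads, tails)     # uniform prior
mildly_biased = posterior_mean(9, 1, heads, tails)   # starts at 0.9
dogmatic = posterior_mean(10**9, 1, heads, tails)    # starts near 1

for name, mean in [("open-minded", open_minded),
                   ("mildly biased", mildly_biased),
                   ("dogmatic", dogmatic)]:
    print(f"{name}: posterior mean for heads = {float(mean):.6f}")
```

The first two agents end up within half a percentage point of each other, while the dogmatic agent has barely moved after the same thousand tosses. In this model the evidence eventually swamps any prior, but how long that takes depends entirely on how extreme the prior was.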

The frequency theory arose in the context of the logical positivist philosophy of science propounded by the “Vienna Circle,” of which Mises was a member. Frequentism has continued to appeal to empiricist philosophers who wish to find a foundation for scientific knowledge in empirical observation and has been defended by, for example, Hans Reichenbach and Wesley Charles Salmon. The frequency interpretation has been attractive to statisticians also, satisfying their intuition that there should be a connection between frequencies and probability. In practice, however, statisticians use any function obeying Kolmogorov’s probability axioms, ignoring the question of whether it corresponds to limiting relative frequencies.

Despite the intellectual achievements of Mises and others, a consensus has not emerged. How to understand probabilities and their relation to the world remains an open and perplexing issue in philosophy and the foundations of statistics.
