New paper: “Logical induction”

MIRI is releasing a paper introducing a new model of deductively limited reasoning: “Logical induction,” authored by Scott Garrabrant, Tsvi Benson-Tilsen, Andrew Critch, myself, and Jessica Taylor. Readers may wish to start with the abridged version.

Consider a setting where a reasoner is observing a deductive process (such as a community of mathematicians and computer programmers) and waiting for proofs of various logical claims (such as the abc conjecture, or “this computer program has a bug in it”), while making guesses about which claims will turn out to be true. Roughly speaking, our paper presents a computable (though inefficient) algorithm that outpaces deduction, assigning high subjective probabilities to provable conjectures and low probabilities to disprovable conjectures long before the proofs can be produced.
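For readers who like to think in code, here is one way to picture that setting. This is a hypothetical sketch in notation of our own choosing (the paper works with abstract sequences of belief states, not any particular interface):

```python
# A hypothetical interface for the setting above (the names are ours, not the
# paper's). Each day the deductive process settles some new sentences, and the
# reasoner responds with revised subjective probabilities.
from typing import Protocol

class DeductivelyLimitedReasoner(Protocol):
    def observe(self, proved: set[str], disproved: set[str]) -> None:
        """Ingest the sentences that the deductive process settled today."""

    def probability(self, sentence: str) -> float:
        """Return the current credence, in [0, 1], that `sentence` is true."""
```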

This algorithm has a large number of nice theoretical properties. Still speaking roughly, the algorithm learns to assign probabilities to sentences in ways that respect any logical or statistical pattern that can be described in polynomial time. Additionally, it learns to reason well about its own beliefs and trust its future beliefs while avoiding paradox. Quoting from the abstract:

These properties and many others all follow from a single logical induction criterion, which is motivated by a series of stock trading analogies. Roughly speaking, each logical sentence φ is associated with a stock that is worth $1 per share if φ is true and nothing otherwise, and we interpret the belief-state of a logically uncertain reasoner as a set of market prices, where ℙ_n(φ) = 50% means that on day n, shares of φ may be bought or sold from the reasoner for 50¢. The logical induction criterion says (very roughly) that there should not be any polynomial-time computable trading strategy with finite risk tolerance that earns unbounded profits in that market over time.

This criterion is analogous to the “no Dutch book” criterion used to support other theories of ideal reasoning, such as Bayesian probability theory and expected utility theory. We believe that the logical induction criterion may serve a similar role for reasoners with deductive limitations, capturing some of what we mean by “good reasoning” in these cases.
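To spell out the trading metaphor, the sketch below shows how a single trade against the market's prices pays out. The function and its names are toy notation of our own, not the paper's construction:

```python
# Minimal sketch of the market reading of the criterion (toy notation, not the
# paper's construction). A share of sentence phi pays $1 if phi is true and $0
# otherwise; the day-n price of phi is the reasoner's probability P_n(phi).

def trade_profit(shares: float, price: float, phi_is_true: bool) -> float:
    """Profit from buying `shares` shares of phi at `price` dollars per share.

    Negative `shares` corresponds to selling (shorting) phi.
    """
    payout = 1.0 if phi_is_true else 0.0
    return shares * (payout - price)

# If the market prices phi at 50 cents, buying 10 shares nets $5 when phi
# turns out true and loses $5 when phi turns out false.
print(trade_profit(10, 0.50, True))   # 5.0
print(trade_profit(10, 0.50, False))  # -5.0
```

The criterion then demands that no polynomial-time trading strategy with finite risk tolerance can string such trades together into unbounded profits.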

The logical induction algorithm that we provide is theoretical rather than practical. It can be thought of as a counterpart to Ray Solomonoff’s theory of inductive inference, which gave an uncomputable ideal for managing empirical uncertainty but no corresponding method for managing uncertainty about logical or mathematical sentences. Logical induction closes this gap.

Any algorithm that satisfies the logical induction criterion will exhibit the following properties, among others:

1. Limit convergence and limit coherence: The beliefs of a logical inductor are perfectly consistent in the limit. (Every provably true sentence eventually gets probability 1, every provably false sentence eventually gets probability 0, if φ provably implies ψ then the probability of φ converges to some value no higher than the probability of ψ, and so on.)

2. Provability induction: Logical inductors learn to recognize any pattern in theorems (or contradictions) that can be identified in polynomial time.

◦ Consider a sequence of conjectures generated by a brilliant mathematician, such as Ramanujan, that are difficult to prove but keep turning out to be true. A logical inductor will recognize this pattern and start assigning Ramanujan’s conjectures high probabilities well before it has enough resources to verify them.

◦ As another example, consider the sequence of claims “on input n, this long-running computation outputs a natural number between 0 and 9.” If those claims are all true, then (roughly speaking) a logical inductor learns to assign high probabilities to them as fast as they can be generated. If they’re all false, a logical inductor learns to assign them low probabilities as fast as they can be generated. In this sense, it learns inductively to predict how computer programs will behave.

◦ Similarly, given any polynomial-time method for writing down computer programs that halt, logical inductors learn to believe that they will halt roughly as fast as the source codes can be generated. Furthermore, given any polynomial-time method for writing down computer programs that provably fail to halt, logical inductors learn to believe that they will fail to halt roughly as fast as the source codes can be generated. When it comes to computer programs that fail to halt but for which there is no proof of this fact, logical inductors will learn not to anticipate that the program is going to halt anytime soon, even though they can’t tell whether the program is going to halt in the long run. In this way, logical inductors give some formal backing to the intuition of many computer scientists that while the halting problem is undecidable in full generality, this rarely interferes with reasoning about computer programs in practice.

3. Affine coherence: Logical inductors learn to respect logical relationships between different sentences’ truth-values, often long before the sentences can be proven. (E.g., they learn that “this program outputs 3” and “this program outputs 4” are mutually exclusive for arbitrary programs, typically well before they can evaluate the program in question; a toy version of this price constraint is sketched after this list.)

4. Learning pseudorandom frequencies: When faced with a sufficiently pseudorandom sequence, logical inductors learn to use appropriate statistical summaries. For example, if the Ackermann(n,n)th digit in the decimal expansion of π is hard to predict for large n, a logical inductor will learn to assign ~10% subjective probability to the claim “the Ackermann(n,n)th digit in the decimal expansion of π is a 7.” (A small empirical check of this ~10% figure appears after this list.)

5. Calibration and unbiasedness: Take the sentences to which a logical inductor assigns ~30% probability: if the average frequency of truth among those sentences converges, it converges to ~30%. In fact, on any subsequence where the average frequency of truth converges, there is no efficient method for finding a bias in the logical inductor’s beliefs.

6. Scientific induction: Logical inductors can be used to do sequence prediction, and when doing so, they dominate the universal semimeasure.

7. Closure under conditioning: Conditional probabilities in this framework are well-defined, and conditionalized logical inductors are also logical inductors.

8. Introspection: Logical inductors have accurate beliefs about their own beliefs, in a manner that avoids the standard paradoxes of self-reference.

◦ For instance, consider a sequence of sentences in which the nth one says “my probability on the nth day is less than 50%.” The probabilities a logical inductor assigns to these sentences get extremely close to 50% and oscillate pseudorandomly, such that there is no polynomial-time method for telling whether the nth one is slightly above or slightly below 50%. (A toy simulation of these dynamics appears below this list.)

9. Self-trust: Logical inductors learn to trust their future beliefs more than their current beliefs. This gives some formal backing to the intuition that real-world probabilistic agents can often be reasonably confident in their future reasoning in practice, even though Gödel’s incompleteness theorems place strong limits on reflective reasoning in full generality.
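As promised in property 3, here is a toy version of the mutual-exclusivity constraint, phrased as a check on market prices. It is our own illustration, not the paper’s machinery:

```python
# Toy check of the mutual-exclusivity constraint in property 3 (our own
# illustration, not the paper's machinery). If "this program outputs 3" and
# "this program outputs 4" cannot both be true, then coherent prices must
# satisfy P(outputs 3) + P(outputs 4) <= 1.

def respects_mutual_exclusivity(p_a: float, p_b: float, tol: float = 1e-9) -> bool:
    """Check the price constraint for two mutually exclusive sentences."""
    return p_a + p_b <= 1.0 + tol

print(respects_mutual_exclusivity(0.6, 0.3))  # True: a coherent assignment
print(respects_mutual_exclusivity(0.7, 0.5))  # False: selling one share of each
                                              # earns $1.20 now against at most
                                              # $1 owed later, a guaranteed profit
```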

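And here is the small empirical check promised in property 4, using the mpmath library to confirm that the digit 7 appears about 10% of the time in an initial stretch of π’s decimal expansion. This only motivates the ~10% statistical summary; it is not the paper’s algorithm:

```python
# Empirical motivation for the ~10% figure in property 4: in the first ~10,000
# decimal digits of pi, each digit appears roughly 10% of the time.
from mpmath import mp

mp.dps = 10_010                     # working precision, in decimal places
pi = +mp.pi                         # force evaluation of pi at this precision
digits = mp.nstr(pi, 10_001)[2:]    # drop the leading "3.", keep ~10,000 digits

freq = digits.count("7") / len(digits)
print(f"frequency of the digit 7 in {len(digits)} digits of pi: {freq:.4f}")
# prints a value close to 0.1000
```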
The above claims are all quite vague; for the precise statements, refer to the paper.
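Finally, the toy simulation promised in property 8. The update rule below is an ad-hoc cartoon of market pressure, not the paper’s algorithm: while the price sits below 50% the sentence is true and gets bought up, and at or above 50% it is false and gets sold off:

```python
# Toy dynamics for the self-referential sentence "my probability today is less
# than 50%" (an ad-hoc cartoon, not the paper's algorithm). When the price is
# below 0.5 the sentence is true, so traders buy and push the price up; when
# it is at or above 0.5 the sentence is false, so traders sell and push it down.
price = 0.9
for day in range(1, 21):
    sentence_true = price < 0.5
    payout = 1.0 if sentence_true else 0.0
    price += 0.2 * (payout - price)  # partial adjustment toward the payout
    print(f"day {day:2d}: price = {price:.3f}")
# The price is driven toward 0.5 and then oscillates around it without ever
# settling on either side, a crude version of the behavior described above.
```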

Logical induction was developed by Scott Garrabrant in an effort to solve an open problem we spoke about six months ago. Roughly speaking, we had formalized two different desiderata for good reasoning under logical uncertainty: the ability to recognize patterns in what is provable (such as mutual exclusivity relationships between claims about computer programs), and the ability to recognize statistical patterns in sequences of logical claims (such as recognizing that the decimal digits of π seem pretty pseudorandom). Neither was too difficult to achieve in isolation, but we were surprised to learn that our simple algorithms for achieving one seemed quite incompatible with our simple algorithms for achieving the other. Logical inductors were born of Scott’s attempts to achieve both simultaneously.

I think there’s a good chance that this framework will open up new avenues of study in questions of metamathematics, game theory, and computational reflection that have long seemed intractable. I’m also cautiously optimistic that it will improve our understanding of decision theory and counterfactual reasoning, along with other problems related to AI value alignment.

We’ve posted a talk online that provides more background for our work on logical induction.