Here, I solve the problem of induction for you.*
[ *Based on: “Explanationist Aid for the Theory of Inductive Logic,” British Journal for the Philosophy of Science 60 (2009): 1-31. ]
This was a long and somewhat technical paper, so I’m going to give a baby version.
1. The Problem of Induction
1.1. Three Views About an Induction
Say you’ve observed n honey badgers, and all of them were mean. You’re tempted to conclude that the next one you observe will also be mean.
David Hume comes along and says you have no justification for inferring that (Hume is annoying like that), because you don’t have any evidence that “the course of nature is uniform”, or that unobserved objects tend to be like observed objects. This is inductive skepticism. Notice that Hume’s view is not that you can’t be certain that the next badger will be mean; it is that you have no reason at all, either to think the next badger will be mean, or to think it won’t; your experience is evidentially irrelevant.
This contrasts with inductivism, the view that you have reason to think the next badger will be mean, and counter-inductivism, the view that you have reason to think the next badger won’t be mean (not a common view).
Here’s a probabilistic formulation: Let Un be the proposition that the first n observed badgers were all mean, and let Mn+1 be the proposition that badger number n+1 is mean. These are the 3 views, again:
Inductivism: P(Mn+1|Un) > P(Mn+1) [Read: “The probability of M sub n+1 given U sub n is greater than the initial probability of M sub n+1”.]
Skepticism: P(Mn+1|Un) = P(Mn+1)
Counter-inductivism: P(Mn+1|Un) < P(Mn+1)
There is a tradition in which people try to use intuitive principles of probability to vindicate inductivism. (See Laplace, Bayes, Carnap, and David Stove.) This is the tradition of the “Theory of Logical Probability” (because these people treat probabilities as logical properties of propositions or logical relations among propositions).
1.2. A Skeptic’s Probability Distribution
The skeptic’s probability distribution is perfectly coherent. It’s the probability distribution you would assign to outcomes of flips of a coin that you are completely certain is fair: if you get heads 10 times in a row, the probability of the next flip coming up heads is still 50%, the same as it was before you saw the 10 heads.
That doesn’t explain, though, why anyone would think this was the correct probability distribution. Here’s an explanation of that:
Fans of logical probability commonly endorse the Principle of Indifference, namely, that if you have no reason to favor A over B or vice versa, then P(A) (on your evidence) = P(B) (on your evidence). Roughly speaking, you start out assigning all possibilities the same probability, before gathering evidence for or against particular alternatives.
So here’s a way of interpreting that: assign every possible sequence of observed events the same initial probability. In the case of flipping coins, each possible sequence of heads & tails gets the same probability. In the case of honey badgers, you assign the same probability to every possible sequence of observed badger properties. For simplicity, assume that each badger has exactly one of 2 properties, mean or nice. Then if you observe n badgers, there are 2^n possible sequences of observed niceness/meanness properties.
This gives you inductive skepticism. The initial probability of a randomly chosen badger being mean is 1/2. The probability of the 100th badger being mean given that the first 99 were mean = [the number of sequences in which the first 99 are mean and the 100th is mean] / [the number of sequences in which the first 99 are mean] = 1/2. (If the first 99 are mean, there is exactly one continuation in which the 100th is mean, and one continuation in which the 100th is nice.)
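If you like, you can check that by brute-force enumeration. Here’s a minimal sketch in Python (the number of badgers is set small only to keep the enumeration fast; nothing depends on that choice):

```python
from itertools import product

n = 10  # number of observed badgers; small only to keep enumeration fast

# The skeptic's prior: every length-n sequence of mean (1) / nice (0)
# observations gets the same probability, 1/2**n.
sequences = list(product([0, 1], repeat=n))

# P(nth badger mean | first n-1 badgers mean), computed by counting sequences.
first_part_mean = [s for s in sequences if all(s[:n - 1])]
also_last_mean = [s for s in first_part_mean if s[-1] == 1]
print(len(also_last_mean) / len(first_part_mean))  # 0.5, for any n
```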
1.3. An Inductivist Probability Distribution
But here’s another way of applying the Principle of Indifference. We could say that each possible proportion of badger meanness, or each possible number of mean badgers, is equally likely. If you have 99 honey badgers, then there are 100 possibilities:
None are mean.
One is mean.
Two are mean.
…
Ninety-nine are mean.
So you could give each of those an initial probability of 1/100.
This gives you inductivism. Suppose you’re going to observe 99 badgers. The initial probability of the first 98 all being mean is 1/99 (consider just those first 98 badgers: there are 99 possible numbers of mean ones among them, and “all 98 are mean” is one of those possibilities). The initial probability of all 99 being mean is 1/100. It’s an axiom that in general P(A|B) = P(A&B)/P(B). So the probability that the first 99 are all mean given that the first 98 are all mean =
P(the first 99 are mean and the first 98 are mean) / P(the first 98 are mean)
= P(the first 99 are mean) / P(the first 98 are mean)
= (1/100) / (1/99)
= 99/100.
So if you observe 98 mean badgers in a row, there is a 99% chance that the next one will be mean.
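Here’s a quick sanity check of that calculation, a sketch under one natural way of filling in the model (an assumption, since the text above doesn’t spell it out): the number of mean badgers among the 99 is uniformly distributed over 0–99, and, given that number, every arrangement of mean and nice badgers is equally likely.

```python
from fractions import Fraction
from math import comb

N = 99  # total badgers to be observed

def p_first_k_all_mean(k):
    """P(first k badgers are all mean), under a uniform prior over the number
    K of mean badgers among N, with all arrangements of a given K equally likely."""
    total = Fraction(0)
    for K in range(k, N + 1):
        # P(first k positions all mean | exactly K mean among N) = C(N-k, K-k) / C(N, K)
        total += Fraction(1, N + 1) * Fraction(comb(N - k, K - k), comb(N, K))
    return total

p98 = p_first_k_all_mean(98)
p99 = p_first_k_all_mean(99)
print(p98, p99, p99 / p98)  # 1/99  1/100  99/100
```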
This was how Laplace calculated the odds that the sun will rise tomorrow at 1,826,214 to 1 in favor.
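That figure comes from the same recipe, if (as the story is usually told) you take recorded history to span roughly 5,000 years, i.e., about n = 1,826,213 days of observed sunrises: P(sunrise tomorrow | n sunrises so far) = (1/(n+2)) / (1/(n+1)) = (n+1)/(n+2), which is odds of (n+1) to 1 in favor, i.e., 1,826,214 to 1.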
1.4. The Problem
So here’s a restatement of the problem of induction, phrased in terms of the logical probability framework: Why is the inductivist’s probability distribution more correct than the skeptic’s probability distribution? Carnap (a great advocate of logical probability) thought about this but never found a good answer.
This makes the problem of induction a special case of the problem with the principle of indifference: there are different ways of describing a given set of possibilities, and you can get incompatible probability distributions by applying the Principle of Indifference (PoI) to these different descriptions. How should we interpret the PoI to get a unique correct probability distribution?
2. How to Interpret the Principle of Indifference
2.1. The Puzzles About the PoI
There are many other cases, not related to induction, where people have pointed to the ambiguity/inconsistency in the PoI. E.g., consider these two problems:
a. Suppose you know only that there is a cube factory that makes cubes whose sides are between 0 and 2 inches. Given this, what is the probability that a cube from the factory has a side between 0 and 1 inch? Applying the PoI, the answer seems to be 1/2.
b. Suppose you know only that there is a cube factory that makes cubes whose volumes are between 0 and 8 cubic inches. What is the probability that a cube from the factory has a volume between 0 and 1 cubic inch? Applying the PoI, the answer seems to be 1/8.
But (a) and (b) are actually the same problem. You seemingly get different answers depending on how you describe the problem.
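To make the clash vivid, here’s a minimal simulation, assuming “uniform” in each description means a uniform distribution over the quantity named (side length in (a), volume in (b)):

```python
import random

random.seed(0)
N = 100_000

# Reading (a): side length is uniform on (0, 2) inches.
sides = [random.uniform(0, 2) for _ in range(N)]
print(sum(s < 1 for s in sides) / N)      # ~0.5

# Reading (b): volume is uniform on (0, 8) cubic inches.
volumes = [random.uniform(0, 8) for _ in range(N)]
print(sum(v < 1 for v in volumes) / N)    # ~0.125

# "Side under 1 inch" and "volume under 1 cubic inch" are the same event,
# yet the two "uniform" descriptions give it different probabilities.
```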
There are many other cases like that, but I’ll leave it there for brevity. People have long wondered how to resolve this: many argue that the PoI must be rejected as inconsistent, while others try to find ways of specifying the correct description of the problem for purposes of applying the PoI.
2.2. The Explanatory Priority Proviso
This is my idea: if you have multiple descriptions of the problem, you want to select the most explanatorily basic description, the description in terms of the properties that would explain the other properties.
I take explanatory basicness here as a metaphysical (and not an epistemological) feature. So what matters is what set of facts would ground the other facts in reality, not what set of facts enables you to know about or understand or describe the other facts. So causal priority matters; definitional priority doesn’t matter (it doesn’t matter if you would explain the concept of X in terms of the concept of Y, but it matters if X would cause Y).
In the cube case, the number of molecules in a cube determines its size, including both its side length and its volume (given the material it is made of). The number of molecules is (for normal-sized objects, almost exactly) proportional to the volume of the cube. So the 1/8 solution is the correct solution.
I have similar stories about other puzzle cases for the Principle of Indifference.
2.3. The Skeptic’s Argument
Let’s apply this to the case of induction on badgers. Actually, the skeptic has a pretty reasonable argument on the face of it. Facts about proportions are determined by (explained by) facts about individuals. E.g., suppose you have four badgers, A, B, C, and D; A and C are nice while B and D are mean. Then you have a fact about the proportion of mean badgers, namely,
Proportion Fact: [Half of the badgers are mean]
And you also have a conjunctive fact about the individuals, namely
Individuals Fact: [A is nice, B is mean, C is nice, and D is mean]
The Individuals Fact grounds the Proportion Fact, not vice versa. The fact about the individuals makes the proposition about the overall proportion true, not vice versa. (It’s false that B and D are mean because half the badgers are mean.)
So on the face of it, the Explanatory Priority Proviso doesn’t solve the problem of induction; it gives comfort to the Skeptic, rather than the Inductivist! The Skeptic will say that we should assign equal probabilities to every possible assignment of meanness/niceness properties to the individual badgers, because that is explanatorily more fundamental than the proportion of mean badgers.
2.4. Laws & Objective Chances
But wait. There might be things even more explanatorily basic than either the Proportion Fact or the Individuals Fact. Maybe there are Laws of Nature and other broad features of the world that exist prior to all the badgers, which determine the objective chance of a given badger being mean.
Facts about objective chances would be prior to facts about either individual outcomes or proportions. Notice that you could explain why a particular coin came up heads about half the time in 1000 throws by citing the fact that it’s a fair coin, i.e., the objective chance of heads for this coin is 50%. This objective chance holds in virtue of the laws of physics plus general features of the coin (its shape, weight balance, etc.) that existed before it was ever flipped.
So the really correct application of the Principle of Indifference would be to the possible values of the objective chance of a badger being mean. This objective chance is some number between 0 and 1. So you want to assign a uniform probability density (this is for your epistemic or logical probability) across the possible values of the objective chance between 0 and 1. (I.e., p(c) = 1 for c between 0 and 1, and p(c) = 0 otherwise.)
Note: I guess this is a good time to comment on epistemic probability vs. objective chance. Epistemic probability is a measure of how much justification you have to believe something — so probability 1 means you should be absolutely certain of the thing, and probability 0 means you should be absolutely certain of its negation. This isn’t a physical property, it’s just a feature of your epistemic position. That’s why you don’t have to gather empirical evidence before assigning an epistemic probability. (Being ignorant of the facts surrounding x shouldn’t prevent you from describing the state of your own ignorance.)
Objective chances are features of the physical situation, e.g., the balance and shape of a coin. Objective chance is the strength of a causal tendency — the tendency for a given situation to produce a given outcome. You need empirical evidence to know these. Which is why you would be assigning epistemic probabilities to possible objective chances.
When you observe an actual sequence of events, that gives you (probabilistic) information about the objective chances. You can then make (probabilistic) predictions about future events based on your information about objective chances. That’s the basic idea of this solution to the problem of induction.
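Here’s a toy numerical sketch of that updating step, under the assumptions already in play (a uniform prior over the chance c, and badgers that are independent of one another given c); the discretization of c onto a grid is just for convenience:

```python
import numpy as np

# Discretize the possible objective chances c of a badger being mean,
# with a uniform epistemic probability over them (the PoI applied to chances).
cs = np.linspace(0.0, 1.0, 10_001)
prior = np.full(cs.shape, 1.0 / len(cs))

n = 98  # observed badgers so far, all of them mean

# Bayes' theorem: the posterior over c is proportional to the prior times
# the likelihood of observing n mean badgers in a row, namely c**n.
posterior = prior * cs**n
posterior /= posterior.sum()

# Prediction: P(next badger is mean) = expected value of c under the posterior.
print((posterior * cs).sum())  # ~0.99, matching the 99/100 of sec. 1.3
```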
This way of assigning probabilities, it turns out, gives the same result as the method in sec. 1.3 above, i.e., if you observe n mean badgers in a row, the probability of the next one being mean is (n+1)/(n+2). You need a small amount of calculus to show that, which I’m omitting here for brevity & readability.
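(In case you want that step anyway, here is its core, as a sketch, assuming that, given the chance c, each badger is mean with probability c independently of the others:
P(Un) = ∫₀¹ c^n dc = 1/(n+1), and P(Un+1) = ∫₀¹ c^(n+1) dc = 1/(n+2).
Since Mn+1 & Un is just Un+1, P(Mn+1|Un) = P(Un+1)/P(Un) = [1/(n+2)] / [1/(n+1)] = (n+1)/(n+2).)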
3. Comments
That’s the basic solution to the problem of induction. (For more detail & qualifications, see the paper.)
This approach vindicates something that people like Michael Tooley and David Armstrong had been saying: that if you don’t have a realist view of laws of nature (esp. if you’re a Humean), then you have to be an inductive skeptic. That’s true because competing views about laws of nature affect explanatory priority judgments. If laws of nature are fundamental features of the world, or relations between universals, then they are explanatorily prior to sequences of particular events, which means that one can apply the Principle of Indifference at the level of possible laws. If, however, laws are just convenient summaries of the particular facts (as in Hume’s and David Lewis’s views), then they are explanatorily posterior to the sequences of particular facts, in which case you have to apply the Principle of Indifference at the level of particular facts, which gives you inductive skepticism.
Do you have an easier version to understand than the baby version?
I take issue with your cube example. I don't think the number of molecules in the cube is causally basic. More causally basic is the manufacturer's intent to produce cubes in which certain measures are within specified bounds.
But then your (a) and (b) cases still should yield different answers if all you know is that in one case they're measuring side lengths and in the other they're measuring volume.
Actually, you could take this a little further (I understand this drifts off topic) and say that 1/2 is the better answer for both cases, because in order to make the assumption we’re asked to make (in either case), the manufacturer would *need* to measure the lengths of edges (and indeed angles) to ensure that it is actually a cube rather than some other parallelepiped. And then case (b) degenerates into case (a): if the manufacturer is sure it’s a cube, there’s no point in measuring the volume, since the side-length constraint (between 0 and 2 inches) is already sufficient to meet the volume condition.