According to Wikipedia...
"With Bayesian probability interpretation, [Bayes'] theorem expresses how a degree of belief, expressed as a probability, should rationally change to account for the availability of related evidence."
Suppose I want to relate Pr(W ∣ C), a degree of rational belief in W after evidence C becomes available, back to Pr(W), a degree of rational belief in W before evidence C becomes available.
Bayes' Theorem is just a double application of the conditional probability formula, thus:
$$\Pr(W \mid C) = \frac{\Pr(W \cap C)}{\Pr(C)} = \frac{\Pr(W)\,\Pr(C \mid W)}{\Pr(C)}$$
Alternatively, however, I could apply the conditional probability formula once, and then apply the inclusion-exclusion formula to the numerator, to obtain:
$$\Pr(W \mid C) = \frac{\Pr(W) + \Pr(C) - \Pr(W \cup C)}{\Pr(C)}$$
This formula also relates Pr(W ∣ C) back to Pr(W). My question is, what warrants embracing the top relation (Bayes' Theorem) and rejecting the bottom relation as the correct mathematical representation of belief updating?
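As a sanity check that the two relations really are the same quantity written two ways, here is a minimal Python sketch. The joint distribution over (W, C) is made up purely for illustration, and the helper `prob` is a hypothetical name, not part of any library.

```python
from fractions import Fraction

# Hypothetical joint distribution over (W, C); the numbers are invented
# for illustration only. Keys are (W occurs, C occurs).
joint = {
    (True, True): Fraction(3, 20),
    (True, False): Fraction(5, 20),
    (False, True): Fraction(2, 20),
    (False, False): Fraction(10, 20),
}

def prob(event):
    """Sum the joint probability over outcomes (w, c) that satisfy `event`."""
    return sum(p for (w, c), p in joint.items() if event(w, c))

p_w = prob(lambda w, c: w)                # Pr(W)
p_c = prob(lambda w, c: c)                # Pr(C)
p_w_and_c = prob(lambda w, c: w and c)    # Pr(W ∩ C)
p_w_or_c = prob(lambda w, c: w or c)      # Pr(W ∪ C)
p_c_given_w = p_w_and_c / p_w             # Pr(C | W)

top = p_w * p_c_given_w / p_c             # top relation (Bayes' Theorem)
bottom = (p_w + p_c - p_w_or_c) / p_c     # bottom relation (inclusion-exclusion)
direct = p_w_and_c / p_c                  # conditional probability formula

print(top, bottom, direct)
```

All three values coincide for this distribution, as they must for any distribution with Pr(W) > 0 and Pr(C) > 0. The check only confirms that the relations are algebraically equivalent; it says nothing about which one is the better representation of belief updating. What differs is the extra input each one needs besides Pr(W) and Pr(C): Pr(C ∣ W) for the top relation, Pr(W ∪ C) for the bottom one.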
My first attempt at an answer was that the bottom relation is unhelpful because Pr(W ∪ C) is not algebraically independent of Pr(W), so the appearance of Pr(W) may be "artificial," analogous to a more blatantly artificial introduction of Pr(W) such as $\Pr(W \mid C) = \frac{\Pr(W \cap C) + \Pr(W) - \Pr(W)}{\Pr(C)}$. However, the Pr(C ∣ W) that appears in the top relation is also not algebraically independent of Pr(W): the conditional probability formula, already applied, is the bridge between the two.
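To make the dependence explicit, both auxiliary quantities can be written in terms of Pr(W) and Pr(W ∩ C), using only the conditional probability and inclusion-exclusion formulas (assuming Pr(W) > 0):

```latex
\begin{align*}
\Pr(C \mid W) &= \frac{\Pr(W \cap C)}{\Pr(W)}, \\
\Pr(W \cup C) &= \Pr(W) + \Pr(C) - \Pr(W \cap C).
\end{align*}
```

Substituting either expression back into its relation cancels Pr(W) and leaves Pr(W ∩ C) / Pr(C). In that purely algebraic sense, Pr(W) is no more essential to the top relation than to the bottom one.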
Another thought I had is that the top and bottom relations may actually be comparably good mathematical representations of belief updating, and that the top relation is merely preferred because it acts on Pr(W) via multiplication alone, whereas the bottom relation acts on Pr(W) via both addition and multiplication. However, this seems too weak to explain the ubiquity of the top relation, since in general the most useful mathematical expression is far from always the simplest.
Why should I herald Bayes' Theorem as the unique intuition for belief updating, and not the other expression?