
Bayes-type reasoning

Comprehensive study notes on Bayes-type reasoning for CMI BS Hons preparation. This chapter covers key concepts, formulas, and examples needed for your exam.


This chapter rigorously explores Bayes' theorem and its application to conditional probability problems. Mastery of the tree and table methods, along with an understanding of reverse probability and diagnostic test scenarios, is critical for success in advanced probability examinations.

---

Chapter Contents

| # | Topic |
|---|-------|
| 1 | Tree method |
| 2 | Table method |
| 3 | Reverse probability |
| 4 | Diagnostic test problems |

---

We begin with Tree method.

Part 1: Tree method

Tree Method

Overview

The tree method is one of the clearest ways to organise conditional probability. It is especially useful when events happen in stages, when a probability changes after an earlier event, or when we want to compute a reverse probability using Bayes-type reasoning. In exam problems, the main goal is not drawing a decorative tree, but using the tree to track all branches correctly and read probabilities from it with precision.

---

Learning Objectives

❗ By the End of This Topic

After studying this topic, you will be able to:

  • Draw a probability tree for a multi-stage experiment.

  • Label branch probabilities correctly.

  • Compute the probability of a complete path by multiplying along the branches.

  • Compute the probability of an event by adding relevant path probabilities.

  • Use the tree method to solve Bayes-type reverse-probability questions.

---

Core Idea

📖 Probability Tree

A probability tree is a branching diagram used to represent sequential events.

Each level of the tree represents a stage of the experiment, and each branch is labeled with the conditional probability of that outcome given the previous stage.

For example, if an experiment happens in two stages:
  • first choose a source/type/category
  • then observe the result/outcome
then the tree keeps track of both the first-stage probabilities and the conditional second-stage probabilities.

---

Main Rules

πŸ“ Path Multiplication Rule

The probability of a complete path is the product of the probabilities written along that path.

If a path is Aβ†’B\qquad A \to B then P(A∩B)=P(A) P(B∣A)\qquad P(A \cap B) = P(A)\,P(B \mid A) :::
πŸ“ Event Addition Rule from a Tree

If an event can happen through several disjoint paths, then its probability is the sum of the probabilities of those paths.

So if event EE happens through paths 1,2,…,k1,2,\dots,k, then P(E)=P(pathΒ 1)+P(pathΒ 2)+β‹―+P(pathΒ k)\qquad P(E) = P(\text{path }1)+P(\text{path }2)+\cdots+P(\text{path }k) ::: ---
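As a quick check, the two rules above can be sketched in Python with exact fractions (a minimal illustration; the function names are ours, not part of the chapter):

```python
from fractions import Fraction

def path_probability(branches):
    """Multiply the branch probabilities along one complete path."""
    p = Fraction(1)
    for b in branches:
        p *= b
    return p

def event_probability(paths):
    """Add the probabilities of the disjoint paths that make up the event."""
    return sum(path_probability(path) for path in paths)

# A -> B with P(A) = 1/3 and P(B | A) = 3/5 gives P(A and B) = 1/5
print(path_probability([Fraction(1, 3), Fraction(3, 5)]))  # 1/5
```

Using `Fraction` keeps every path probability exact, so the results match the hand computations digit for digit.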

Why the Tree Method Works Well

💡 What the Tree Gives You

A good tree makes three things visible:

  • stage-by-stage dependence

  • complete-path probabilities

  • all possible ways an event can happen

This is why it is ideal for:
  • conditional probability
  • repeated draws
  • test accuracy problems
  • disease-screening problems
  • manufacturing/source-identification problems

---

Bayes-Type Reasoning from a Tree

πŸ“ Reverse Probability

Suppose an outcome BB can arise from multiple starting categories A1,A2,…,AnA_1,A_2,\dots,A_n.

Then

P(Ai∣B)=P(Ai∩B)P(B)\qquad P(A_i \mid B) = \dfrac{P(A_i \cap B)}{P(B)}

Using the tree:
  • numerator = one specific path probability
  • denominator = sum of all paths leading to BB
::: This is the most common tree-method use in Bayes-type problems. ---

Minimal Worked Example

Example 1

A box is chosen from two boxes:
  • Box 1 with probability $\dfrac{1}{3}$
  • Box 2 with probability $\dfrac{2}{3}$
Then a ball is drawn.
  • From Box 1, the probability of red is $\dfrac{3}{5}$
  • From Box 2, the probability of red is $\dfrac{1}{4}$
Find the probability of getting a red ball.

Using the tree:

Path 1: $\text{Box 1} \to \text{Red}$, with probability $\dfrac{1}{3}\cdot \dfrac{3}{5} = \dfrac{1}{5}$

Path 2: $\text{Box 2} \to \text{Red}$, with probability $\dfrac{2}{3}\cdot \dfrac{1}{4} = \dfrac{1}{6}$

Add the disjoint red paths:

$$P(\text{Red}) = \dfrac{1}{5}+\dfrac{1}{6}=\dfrac{11}{30}$$

So the answer is $\boxed{\dfrac{11}{30}}$.

---
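The arithmetic in Example 1 can be verified in a few lines of Python using exact fractions (an illustrative check, not part of the original notes):

```python
from fractions import Fraction

# Path 1: Box 1 then Red; Path 2: Box 2 then Red.
path1 = Fraction(1, 3) * Fraction(3, 5)  # multiply along path 1
path2 = Fraction(2, 3) * Fraction(1, 4)  # multiply along path 2
p_red = path1 + path2                    # the paths are disjoint, so add
print(p_red)  # 11/30
```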

Common Tree Structures

💡 Recognize These Patterns

  • source → outcome

  • disease status → test result

  • first draw → second draw

  • biased choice → success/failure

  • machine chosen → defective/non-defective

---

Drawing the Tree Correctly

❗ Good Tree Discipline

When drawing a tree:

  • every stage must be clearly separated

  • branch probabilities leaving a node must add to $1$

  • conditional labels must match the branch's parent node

  • final answers should come from path multiplication and path addition

---

Common Mistakes

⚠️ Avoid These Errors
    • ❌ multiplying probabilities from different paths together
✅ multiply only along a single path
    • ❌ forgetting that second-stage probabilities may be conditional
✅ read each branch from its parent node
    • ❌ adding probabilities of non-disjoint events without care
✅ only add separate final paths for the target event
    • ❌ using a tree when the stages are not actually sequential
✅ the tree method is for multi-stage or condition-based structure
---

CMI Strategy

💡 How to Attack Tree-Method Questions

  • Identify the stages clearly.

  • Put first-stage probabilities on the first split.

  • Put conditional probabilities on the next split.

  • Multiply along paths.

  • Add the relevant final paths.

  • For reverse probability, divide the wanted path by the total probability of the observed event.

---

Practice Questions

:::question type="MCQ" question="In a probability tree, the probability of a complete path is found by" options=["adding the branch probabilities on that path","multiplying the branch probabilities on that path","subtracting the branch probabilities on that path","taking the average of the branch probabilities on that path"] answer="B" hint="Use the multiplication rule for sequential events." solution="For sequential events, the probability of a full path is the product of the probabilities along that path. Therefore the correct option is $\boxed{B}$." :::

:::question type="NAT" question="A box is chosen: Box 1 with probability $\dfrac{1}{2}$ and Box 2 with probability $\dfrac{1}{2}$. From Box 1, the probability of a red ball is $\dfrac{3}{4}$; from Box 2, it is $\dfrac{1}{4}$. Find the probability of drawing a red ball." answer="1/2" hint="Add the red paths." solution="There are two red paths. From Box 1: $\dfrac{1}{2}\cdot \dfrac{3}{4}=\dfrac{3}{8}$. From Box 2: $\dfrac{1}{2}\cdot \dfrac{1}{4}=\dfrac{1}{8}$. So $P(\text{Red})=\dfrac{3}{8}+\dfrac{1}{8}=\dfrac{1}{2}$. Hence the answer is $\boxed{\dfrac{1}{2}}$." :::

:::question type="MSQ" question="Which of the following statements are true?" options=["Branch probabilities leaving the same node should add to $1$","A path probability is found by multiplying along the path","A tree method is useful for Bayes-type problems","In a tree, all probabilities must be equal"] answer="A,B,C" hint="Think about how a probability tree is built." solution="A: True. B: True. C: True. D: False; tree probabilities need not be equal. Hence the correct answer is $\boxed{A,B,C}$." :::

:::question type="SUB" question="A factory has two machines. Machine $A$ produces $40\%$ of the items and Machine $B$ produces $60\%$ of the items. The defective rates are $5\%$ for $A$ and $2\%$ for $B$. Using a probability tree, find the probability that a randomly chosen item is defective." answer="0.032" hint="Compute the defective path from each machine and add." solution="There are two defective paths. From Machine $A$: $P(A \cap D)=0.4\times 0.05 = 0.02$. From Machine $B$: $P(B \cap D)=0.6\times 0.02 = 0.012$. Therefore $P(D)=0.02+0.012=0.032$. Hence the required probability is $\boxed{0.032}$." :::

---

Summary

❗ Key Takeaways for CMI

  • The tree method organizes conditional probability stage by stage.

  • Multiply along a path and add across relevant disjoint paths.

  • Branches from the same node must total $1$.

  • Tree diagrams are especially effective in Bayes-type reasoning.

  • A correct tree prevents logical mixing of cases.

---

💡 Next Up

Proceeding to Table method.

---

Part 2: Table method

Table Method

Overview

The table method is a compact and powerful way to solve conditional probability problems when the information naturally falls into categories. It is especially effective in Bayes-type reasoning, test-result problems, and classification problems. Instead of following paths stage by stage as in a tree, the table method organizes outcomes into rows and columns and lets us read totals, intersections, and conditional probabilities directly.

---

Learning Objectives

❗ By the End of This Topic

After studying this topic, you will be able to:

  • Construct a probability or frequency table from given data.

  • Fill row totals, column totals, and internal cells correctly.

  • Use the table to compute conditional probabilities.

  • Solve Bayes-type reverse-probability questions using table entries.

  • Move cleanly between percentages, frequencies, and probabilities.

---

Core Idea

📖 Probability Table

A probability table organizes events into categories so that:

  • rows represent one classification

  • columns represent another classification

  • each cell represents an intersection event

  • row and column totals represent marginal probabilities

For example, in a medical-testing problem:
  • rows may represent Disease / No Disease
  • columns may represent Positive / Negative test result

---

Why the Table Method Works

💡 What the Table Gives You

The table method makes three things easy:

  • seeing intersection probabilities such as $P(A \cap B)$

  • seeing marginal totals such as $P(B)$

  • computing conditional probabilities such as

$$P(A \mid B)=\dfrac{P(A \cap B)}{P(B)}$$

This is why it is very useful when the problem has a classification structure rather than a natural time order.

---

Main Formula in Table Language

📝 Conditional Probability from a Table

If a table gives you:

  • the intersection entry $P(A \cap B)$

  • the column or row total $P(B)$

then

$$P(A \mid B)=\dfrac{P(A \cap B)}{P(B)}$$

In words:
  • numerator = the favorable cell
  • denominator = the total of the conditioning category

---

Frequencies Often Make Tables Easier

❗ Use Convenient Totals

In many problems with percentages, it is easier to imagine a sample of:

  • $100$

  • $1000$

  • $10000$

Then fill the table with frequencies instead of decimals.

At the end, convert back to probability if needed.

This often makes Bayes-type questions much clearer.

---

Minimal Worked Example

Example 1

A disease occurs in $10\%$ of a population. A test is positive:
  • in $80\%$ of diseased people
  • in $20\%$ of non-diseased people
Find the probability that a person actually has the disease given that the test is positive.

Take a population of $100$.

Disease / No Disease counts:
  • Disease: $10$
  • No Disease: $90$

Positive counts:
  • Diseased and positive: $0.8\times 10 = 8$
  • Non-diseased and positive: $0.2\times 90 = 18$

So the total positive count is $8+18=26$. Thus

$$P(\text{Disease} \mid \text{Positive})=\dfrac{8}{26}=\dfrac{4}{13}$$

So the answer is $\boxed{\dfrac{4}{13}}$.

---
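The frequency-table computation in Example 1 can be replayed in Python (a sketch; the variable names are ours, not part of the notes):

```python
from fractions import Fraction

population = 100
diseased = Fraction(10, 100) * population      # 10 people have the disease
healthy = population - diseased                # 90 do not
true_pos = Fraction(80, 100) * diseased        # 8 diseased test positive
false_pos = Fraction(20, 100) * healthy        # 18 healthy test positive
# Conditional probability: favorable cell / total of conditioning category.
posterior = true_pos / (true_pos + false_pos)
print(posterior)  # 4/13
```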

Table Structure Example

📝 Typical Layout

A standard $2\times 2$ table looks like this:

| Category | Positive | Negative | Total |
|---|---:|---:|---:|
| Disease | $P(D \cap P)$ | $P(D \cap N)$ | $P(D)$ |
| No Disease | $P(D^c \cap P)$ | $P(D^c \cap N)$ | $P(D^c)$ |
| Total | $P(P)$ | $P(N)$ | $1$ |

This is often the fastest way to organize the information.

---

Table Method vs Tree Method

❗ When a Table is Better

Use the table method when:

  • the problem is classification-based

  • you want totals and subtotals quickly

  • Bayes-type reverse probability is asked

  • data is already presented in percentage or count form

Use the tree method when:
  • the experiment is sequential
  • stages happen one after another

---

Common Patterns

💡 Typical Exam Patterns

  • disease / test result

  • machine / defective status

  • class membership / success-failure

  • source / observed outcome

  • frequency table completion

---

Common Mistakes

⚠️ Avoid These Errors
  • ❌ mixing row totals and column totals
✅ label the table clearly before filling it
  • ❌ using percentages directly without a consistent base
✅ choose a base of $100$ or $1000$ if needed
  • ❌ dividing by the wrong total in conditional probability
✅ the denominator must match the condition
  • ❌ forcing a tree when a table is simpler
✅ use the structure that matches the data

---

CMI Strategy

💡 How to Attack Table-Method Questions

  • Identify the two classifications.

  • Draw the table with clear row and column labels.

  • Fill the easy totals first.

  • Fill intersection cells using the given rates.

  • Use row/column totals for conditional probability.

  • Check that the full total is consistent.

---

Practice Questions

:::question type="MCQ" question="In a conditional probability table, the denominator of $P(A \mid B)$ should be" options=["the grand total","the total corresponding to $B$","the total corresponding to $A$","the sum of all unfavorable cells"] answer="B" hint="Use the definition of conditional probability." solution="By definition, $P(A \mid B)=\dfrac{P(A \cap B)}{P(B)}$. So the denominator is the total corresponding to $B$. Hence the correct option is $\boxed{B}$." :::

:::question type="NAT" question="In a school of $100$ students, $40$ are girls. Among the girls, $30$ play chess. Among the boys, $20$ play chess. Find the probability that a randomly chosen student plays chess." answer="1/2" hint="Fill in the chess counts and divide by $100$." solution="Girls who play chess: $30$. Boys in the school: $100-40=60$. Boys who play chess: $20$. Total students who play chess: $30+20=50$. Therefore the required probability is $\dfrac{50}{100}=\dfrac{1}{2}$. Hence the answer is $\boxed{\dfrac{1}{2}}$." :::

:::question type="MSQ" question="Which of the following statements are true?" options=["A table method is useful for Bayes-type questions","A table can organize intersection events and totals together","Conditional probability is obtained by dividing a relevant cell by the relevant total","A table method can never use frequencies"] answer="A,B,C" hint="Think about what a probability table records." solution="A: True. B: True. C: True. D: False; in fact, frequencies are often the easiest way to use the table method. Hence the correct answer is $\boxed{A,B,C}$." :::

:::question type="SUB" question="A population has $20\%$ smokers and $80\%$ non-smokers. Among smokers, $15\%$ have a condition. Among non-smokers, $5\%$ have the condition. Using a table, find the probability that a randomly chosen person has the condition." answer="0.07" hint="Use a base of $100$ people." solution="Take a population of $100$. Smokers: $20$. Non-smokers: $80$. Condition among smokers: $0.15\times 20 = 3$. Condition among non-smokers: $0.05\times 80 = 4$. So the total with the condition is $3+4=7$. Hence the required probability is $\dfrac{7}{100}=0.07$. Therefore the answer is $\boxed{0.07}$." :::

---

Summary

❗ Key Takeaways for CMI

  • The table method is ideal for category-based conditional probability problems.

  • Cells represent intersections; row and column totals represent marginals.

  • Conditional probability is a cell divided by the relevant row or column total.

  • Frequencies often make tables simpler than raw percentages.

  • The right structure makes Bayes-type reasoning much easier.

---

💡 Next Up

Proceeding to Reverse probability.

---

Part 3: Reverse probability

Reverse Probability

Overview

Reverse probability problems ask you to work backward from observed information to the hidden cause that produced it. This is the logic behind Bayes-type reasoning. In CMI-style questions, such problems often look simple but are dangerous because human intuition overweights the observed event and underweights the prior chances.

---

Learning Objectives

❗ By the End of This Topic

After studying this topic, you will be able to:

  • Interpret reverse probability questions correctly.

  • Apply Bayes' theorem in simple and multi-case situations.

  • Compute posterior probabilities from prior probabilities and likelihoods.

  • Handle box-selection, test-diagnosis, and coin-selection problems.

  • Avoid base-rate neglect.

---

Core Idea

📖 Reverse Probability

A reverse probability problem asks for

$$P(\text{cause} \mid \text{observed effect})$$

rather than the forward probability

$$P(\text{effect} \mid \text{cause})$$

This reversal is the central difficulty.

---

Bayes' Theorem

📝 Two-Event Form

If $P(B)>0$, then

$$P(A\mid B)=\dfrac{P(B\mid A)P(A)}{P(B)}$$

Here:
  • $P(A)$ is the prior probability,
  • $P(B\mid A)$ is the likelihood,
  • $P(A\mid B)$ is the posterior probability.

---

Partition Form

📝 Multiple-Cause Form

If $A_1,A_2,\dots,A_n$ form a partition of the sample space and $P(B)>0$, then

$$P(A_i\mid B)=\dfrac{P(B\mid A_i)P(A_i)}{\sum_{j=1}^n P(B\mid A_j)P(A_j)}$$

This is the practical form used in most exam problems with several boxes, coins, machines, or hypotheses.

---
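The partition form translates directly into a small helper function (a minimal sketch; `posteriors` is a name we introduce for illustration):

```python
from fractions import Fraction

def posteriors(priors, likelihoods):
    """Posterior P(A_i | B) for each cause, via the partition form of Bayes."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(B | A_j) P(A_j)
    total = sum(joint)                                    # P(B), by total probability
    return [j / total for j in joint]

# Two boxes chosen uniformly; P(red | Box 1) = 2/5, P(red | Box 2) = 4/5.
post = posteriors([Fraction(1, 2), Fraction(1, 2)],
                  [Fraction(2, 5), Fraction(4, 5)])
print(post)  # [Fraction(1, 3), Fraction(2, 3)]
```

Note that the posteriors always sum to $1$, which is a useful sanity check in exam work as well.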

Standard Bayes Pattern

💡 How Bayes Problems Are Structured

  • Choose a hidden cause:

    box, coin, machine, disease status, route, source

  • Observe some event:

    red ball, head, defective item, positive test

  • Work backward using Bayes' theorem.

---

Base Rate Warning

⚠️ Very Common Trap

A highly likely observation under one cause does not automatically make that cause the most probable.

You must also account for how common the cause was before the observation.

This is called the base-rate effect.

---

Minimal Worked Examples

Example 1

A box is chosen uniformly from:
  • Box 1: $2$ red, $3$ blue
  • Box 2: $4$ red, $1$ blue
A red ball is drawn. Find the probability that Box 2 was chosen.

Let $R$ be the event "red ball drawn". Then

$$P(\text{Box 2}\mid R) = \dfrac{P(R\mid \text{Box 2})P(\text{Box 2})}{P(R)}$$

Now

$$P(R\mid \text{Box 1})=\dfrac25,\qquad P(R\mid \text{Box 2})=\dfrac45$$

and each box was chosen with probability $\dfrac12$. So

$$P(R)=\dfrac12\cdot \dfrac25 + \dfrac12\cdot \dfrac45 = \dfrac35$$

Hence

$$P(\text{Box 2}\mid R) = \dfrac{\frac12\cdot \frac45}{\frac35} = \dfrac{2}{3}$$

So the answer is $\boxed{\dfrac23}$.

---

Example 2

One coin is chosen uniformly from:
  • a fair coin,
  • a two-headed coin,
  • a coin with $P(H)=\dfrac34$.
If the chosen coin is tossed twice and both tosses are heads, then the probability that the chosen coin was the two-headed coin is

$$\dfrac{1\cdot \frac13}{\frac14\cdot \frac13 + 1\cdot \frac13 + \left(\frac34\right)^2\cdot \frac13} = \dfrac{1}{\frac14 + 1 + \frac{9}{16}} = \dfrac{16}{29}$$

So the posterior is $\boxed{\dfrac{16}{29}}$.

---
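Example 2 can be checked numerically with exact fractions (an independent verification, not part of the chapter):

```python
from fractions import Fraction

# Three coins, uniform prior; observe two heads in two tosses.
priors = [Fraction(1, 3)] * 3
likelihoods = [Fraction(1, 2) ** 2,   # fair coin: P(HH) = 1/4
               Fraction(1),           # two-headed coin: P(HH) = 1
               Fraction(3, 4) ** 2]   # biased coin: P(HH) = 9/16
joint = [p * l for p, l in zip(priors, likelihoods)]
posterior_two_headed = joint[1] / sum(joint)
print(posterior_two_headed)  # 16/29
```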

Common Mistakes

⚠️ Avoid These Errors
  • ❌ Confusing $P(A\mid B)$ with $P(B\mid A)$.
  • ❌ Forgetting to compute the total probability of the observed event.
  • ❌ Ignoring prior probabilities.
  • ❌ Using intuition instead of the formula in base-rate problems.

---

CMI Strategy

💡 How to Solve Reverse Probability Problems

  • Name the hidden causes clearly.

  • Write their prior probabilities.

  • Compute the probability of the observed event under each cause.

  • Use Bayes' formula carefully.

  • Simplify only at the end.

---

Practice Questions

:::question type="MCQ" question="In a Bayes-type problem, the quantity $P(\text{cause}\mid \text{evidence})$ is called the" options=["likelihood","prior probability","posterior probability","sample probability"] answer="C" hint="It is the probability after the evidence is observed." solution="The probability of the hidden cause after seeing the evidence is called the posterior probability. Hence the correct option is $\boxed{C}$." :::

:::question type="NAT" question="A box is chosen uniformly from two boxes. Box 1 has $2$ red and $3$ blue balls, and Box 2 has $4$ red and $1$ blue ball. A red ball is drawn. Find the probability that Box 2 was chosen." answer="2/3" hint="Apply Bayes' theorem." solution="Let $R$ be the event that a red ball is drawn. Then $P(R\mid B_1)=\dfrac25$, $P(R\mid B_2)=\dfrac45$, and $P(B_1)=P(B_2)=\dfrac12$. So $P(R)=\dfrac12\cdot \dfrac25 + \dfrac12\cdot \dfrac45 = \dfrac35$. Therefore $P(B_2\mid R)=\dfrac{P(R\mid B_2)P(B_2)}{P(R)} = \dfrac{\frac45\cdot \frac12}{\frac35} = \dfrac23$. Hence the answer is $\boxed{\dfrac23}$." :::

:::question type="MSQ" question="Which of the following statements are true?" options=["Bayes' theorem computes $P(\text{cause}\mid \text{evidence})$ from forward probabilities","Reverse probability problems often require prior probabilities","$P(A\mid B)$ and $P(B\mid A)$ are always equal","A rare cause can still have a small posterior probability even if the evidence is likely under that cause"] answer="A,B,D" hint="One statement incorrectly treats conditional probabilities as symmetric." solution="A: True. B: True. C: False; conditional probabilities are not symmetric in general. D: True; this is the base-rate phenomenon. Hence the correct answer is $\boxed{A,B,D}$." :::

:::question type="SUB" question="One coin is chosen uniformly from a fair coin, a two-headed coin, and a coin with probability of heads $3/4$. The chosen coin is tossed twice and both tosses are heads. Find the probability that the chosen coin was the two-headed coin." answer="16/29" hint="Use Bayes' theorem with the three possible coins as the hidden causes." solution="Let the three possible coins be $C_1$ (fair), $C_2$ (two-headed), and $C_3$ (biased with $P(H)=\dfrac34$). Each is chosen with prior probability $\dfrac13$. Let $E$ be the event that two tosses both show heads. Then $P(E\mid C_1)=\left(\dfrac12\right)^2=\dfrac14$, $P(E\mid C_2)=1$, and $P(E\mid C_3)=\left(\dfrac34\right)^2=\dfrac{9}{16}$. So the total probability of $E$ is $P(E)=\dfrac13\left(\dfrac14+1+\dfrac{9}{16}\right) = \dfrac13\cdot \dfrac{29}{16} = \dfrac{29}{48}$. Now apply Bayes' theorem: $P(C_2\mid E)=\dfrac{P(E\mid C_2)P(C_2)}{P(E)} = \dfrac{1\cdot \frac13}{\frac{29}{48}} = \dfrac{16}{29}$. Hence the required probability is $\boxed{\dfrac{16}{29}}$." :::

---

Summary

❗ Key Takeaways for CMI

  • Reverse probability means working from observed evidence back to a hidden cause.

  • Bayes' theorem is the standard tool.

  • Posterior probability depends on both the likelihood and the prior probability.

  • Base-rate effects can make intuition unreliable.

  • Good Bayes solutions start by naming the hidden causes clearly.

---

💡 Next Up

Proceeding to Diagnostic test problems.

---

Part 4: Diagnostic test problems

Diagnostic Test Problems

Overview

Diagnostic test problems are one of the most important applications of conditional probability and Bayes' theorem. The main difficulty is that people often confuse:
  • the probability of testing positive given disease, and
  • the probability of having the disease given a positive test.
These are usually very different. In exam problems, the decisive idea is to combine prevalence, sensitivity, and specificity correctly.

---

Learning Objectives

❗ By the End of This Topic

After studying this topic, you will be able to:

  • Interpret sensitivity, specificity, false positive rate, and false negative rate correctly.

  • Compute the probability of a positive or negative test using total probability.

  • Apply Bayes' theorem to find the probability of disease given a test result.

  • Understand base-rate effects in rare-disease testing.

  • Avoid the common mistake of confusing $P(+\mid D)$ with $P(D\mid +)$.

---

Core Setup

📖 Basic Events

Let

  • $D$ = the event that a person has the disease

  • $D^c$ = the event that a person does not have the disease

  • $+$ = the event that the test result is positive

  • $-$ = the event that the test result is negative

📝 Prevalence

The prevalence of the disease is $P(D)$, and the probability that a person does not have the disease is

$$P(D^c)=1-P(D)$$

---

Main Test Quantities

📝 Sensitivity and Specificity
  • Sensitivity: $P(+\mid D)$, the probability that the test correctly identifies a diseased person.
  • Specificity: $P(-\mid D^c)$, the probability that the test correctly identifies a non-diseased person.

📝 False Positive and False Negative Rates
  • False positive rate: $P(+\mid D^c)=1-\text{specificity}$
  • False negative rate: $P(-\mid D)=1-\text{sensitivity}$

---

The Most Important Distinction

⚠️ Do Not Confuse These

These are different quantities:

  • $P(+\mid D)$ = sensitivity

  • $P(D\mid +)$ = probability that a person has the disease given a positive result

The first is about test performance on diseased people.

The second is about what a positive result means for a person.

This distinction is the heart of Bayes-type reasoning.

---

Total Probability for Test Outcomes

📝 Probability of a Positive Test

To compute the overall chance of a positive test, split into diseased and non-diseased cases:

$$P(+)=P(+\mid D)P(D)+P(+\mid D^c)P(D^c)$$

📝 Probability of a Negative Test

Similarly,

$$P(-)=P(-\mid D)P(D)+P(-\mid D^c)P(D^c)$$

---

Bayes' Theorem

📝 Positive Predictive Value

The probability that a person has the disease given a positive test is

$$P(D\mid +)=\dfrac{P(+\mid D)P(D)}{P(+)}$$

Using the total probability formula for $P(+)$, this becomes

$$P(D\mid +)=\dfrac{P(+\mid D)P(D)}{P(+\mid D)P(D)+P(+\mid D^c)P(D^c)}$$

📝 Negative Predictive Value

The probability that a person does not have the disease given a negative test is

$$P(D^c\mid -)=\dfrac{P(-\mid D^c)P(D^c)}{P(-)}$$

---

Standard Formula in Parameters

📝 General Formula

Let

  • prevalence = $p$

  • sensitivity = $s$

  • specificity = $c$

Then

$$P(D)=p,\quad P(D^c)=1-p$$

$$P(+\mid D)=s,\quad P(+\mid D^c)=1-c$$

So

$$P(+)=sp+(1-c)(1-p)$$

and

$$P(D\mid +)=\dfrac{sp}{sp+(1-c)(1-p)}$$

This is the main formula for this topic.

---
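The general formula is easy to experiment with in Python (a sketch; `ppv` is our own name for the positive predictive value, not something from the notes):

```python
def ppv(p, s, c):
    """P(D | +) from prevalence p, sensitivity s, specificity c."""
    return s * p / (s * p + (1 - c) * (1 - p))

# Example: 1% prevalence, 90% sensitivity, 95% specificity.
print(round(ppv(0.01, 0.90, 0.95), 4))  # 0.1538
```

Varying `p` while holding `s` and `c` fixed makes the base-rate effect visible: as the disease gets rarer, $P(D\mid +)$ drops sharply even though the test itself is unchanged.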

Table Method

📝 1000-People or 100000-People Method

In many diagnostic test problems, it is easiest to imagine a sample population.

For example, if

  • prevalence = $1\%$

  • sensitivity = $90\%$

  • specificity = $95\%$

then among $1000$ people:

  • diseased: $10$

  • non-diseased: $990$

Among the $10$ diseased:
  • true positives: $0.90\times10=9$
  • false negatives: $1$

Among the $990$ non-diseased:
  • true negatives: $0.95\times990=940.5$
  • false positives: $49.5$

Then

$$P(D\mid +)=\dfrac{\text{true positives}}{\text{all positives}}$$

This method is often faster than symbolic algebra.

---
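The 1000-people count method above can be written as a short script (illustrative; it reproduces the counts just listed):

```python
population = 1000
diseased = 0.01 * population          # 10 people have the disease
healthy = population - diseased       # 990 do not
true_pos = 0.90 * diseased            # 9 true positives
false_pos = (1 - 0.95) * healthy      # 49.5 false positives
p_disease_given_pos = true_pos / (true_pos + false_pos)
print(round(p_disease_given_pos, 4))  # 0.1538
```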

Why Rare Diseases Are Tricky

❗ Base-Rate Effect

Even a very accurate test can have a surprisingly low value of $P(D\mid +)$ when the disease is rare.

Reason:

  • the diseased group is tiny

  • the healthy group is huge

  • even a small false positive rate applied to a huge healthy group may create many false positives

This is one of the most important conceptual lessons in probability.

---

Minimal Worked Examples

Example 1

A disease affects $1\%$ of a population. A test has sensitivity $90\%$ and specificity $95\%$. Find the probability of a positive test.

We have

$$P(D)=0.01,\quad P(D^c)=0.99$$

$$P(+\mid D)=0.90,\quad P(+\mid D^c)=0.05$$

So

$$P(+)=0.90\cdot0.01+0.05\cdot0.99 =0.009+0.0495=0.0585$$

Hence $P(+)=\boxed{0.0585}$.

---

Example 2

Using the same data, find the probability that a person has the disease given a positive test.

By Bayes' theorem,

$$P(D\mid +)=\dfrac{0.90\cdot0.01}{0.0585} =\dfrac{0.009}{0.0585}=\dfrac{2}{13}\approx0.1538$$

So $P(D\mid +)\approx \boxed{0.1538}$. This is only about $15.38\%$, even though the test is quite accurate.

---

Common Derived Quantities

📝 Useful Probabilities

  • True positive probability: $P(\text{TP})=P(+\mid D)P(D)$

  • False positive probability: $P(\text{FP})=P(+\mid D^c)P(D^c)$

  • True negative probability: $P(\text{TN})=P(-\mid D^c)P(D^c)$

  • False negative probability: $P(\text{FN})=P(-\mid D)P(D)$

These help compare how often each outcome occurs in the whole population.

---

Common Mistakes

⚠️ Avoid These Errors
  • ❌ Using sensitivity in place of $P(D\mid +)$
  • ❌ Forgetting to include false positives when computing total positives
  • ❌ Ignoring prevalence
  • ❌ Mixing up specificity with the false positive rate
  • ❌ Forgetting that $P(+\mid D^c)=1-\text{specificity}$

---

    CMI Strategy

    πŸ’‘ How to Attack Diagnostic Test Questions

• Define the events $D$, $D^c$, $+$, and $-$ clearly.

• Write down prevalence, sensitivity, and specificity first.

• Compute $P(+)$ or $P(-)$ using total probability.

• Then apply Bayes' theorem.

• If the numbers are awkward, use a $1000$-person or $100000$-person table.

    • Always check whether the question is asking for:

    - $P(+\mid D)$
    - $P(D\mid +)$
    - $P(+)$
    - $P(-)$

    ---
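The "$100000$-person table" trick from the strategy above can be sketched numerically: convert probabilities into expected counts, then read the posterior as a simple fraction. The numbers below are assumed from Example 1 (prevalence $1\%$, sensitivity $90\%$, specificity $95\%$).

```python
# Natural-frequency version of Bayes' theorem: work with expected counts
# out of 100000 people instead of raw probabilities.
N = 100_000
prevalence, sensitivity, specificity = 0.01, 0.90, 0.95

diseased = round(N * prevalence)                  # 1000 people have the disease
healthy = N - diseased                            # 99000 do not
true_pos = round(diseased * sensitivity)          # 900 of the diseased test positive
false_pos = round(healthy * (1 - specificity))    # 4950 of the healthy test positive

ppv = true_pos / (true_pos + false_pos)           # P(D | +) as a count ratio
print(f"{true_pos} / {true_pos + false_pos} = {ppv:.4f}")  # 900 / 5850 = 0.1538
```

The count form makes the result easy to sanity-check in an exam: positives are dominated by the $4950$ false positives from the large healthy group.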

    Practice Questions

:::question type="MCQ" question="Which of the following equals the sensitivity of a test?" options=["$P(D\mid +)$","$P(+\mid D)$","$P(-\mid D^c)$","$P(D^c\mid -)$"] answer="B" hint="Sensitivity is the probability of a positive test among diseased people." solution="By definition, sensitivity is $P(+\mid D)$. Hence the correct option is $\boxed{B}$."
:::

:::question type="NAT" question="A disease affects $10\%$ of a population. A test has sensitivity $80\%$ and specificity $90\%$. Find the probability of a positive test." answer="0.17" hint="Use total probability." solution="We have $P(D)=0.10,\quad P(D^c)=0.90$ and $P(+\mid D)=0.80,\quad P(+\mid D^c)=0.10$. So $P(+)=0.80\cdot0.10+0.10\cdot0.90=0.08+0.09=0.17$. Hence the answer is $\boxed{0.17}$."
:::

:::question type="MSQ" question="Which of the following statements are true?" options=["Specificity is $P(-\mid D^c)$","False positive rate is $P(+\mid D^c)$","Positive predictive value is $P(D\mid +)$","Sensitivity is $P(D\mid +)$"] answer="A,B,C" hint="Separate test-quality quantities from posterior probabilities." solution="A: true; this is the definition of specificity. B: true; this is the definition of the false positive rate. C: true; positive predictive value is the probability of disease given a positive test. D: false; sensitivity is $P(+\mid D)$, not $P(D\mid +)$. Hence the correct answer is $\boxed{A,B,C}$."
:::

:::question type="SUB" question="A disease affects $1\%$ of a population. A test has sensitivity $99\%$ and specificity $99\%$. Compute the probability that a person has the disease given that the test is positive." answer="$0.5$" hint="Use Bayes' theorem." solution="We have $P(D)=0.01,\quad P(D^c)=0.99$, with $P(+\mid D)=0.99$ and $P(+\mid D^c)=0.01$. So $P(+)=0.99\cdot0.01+0.01\cdot0.99=0.0198$. By Bayes' theorem, $P(D\mid +)=\dfrac{0.99\cdot0.01}{0.0198}=\dfrac{0.0099}{0.0198}=0.5$. Hence the answer is $\boxed{0.5}$."
:::

---

    Summary

    ❗ Key Takeaways for CMI

• Sensitivity is $P(+\mid D)$ and specificity is $P(-\mid D^c)$.

• Use total probability to compute $P(+)$ and $P(-)$.

• Use Bayes' theorem to compute $P(D\mid +)$ and $P(D^c\mid -)$.

• Positive predictive value can be much smaller than sensitivity when prevalence is low.

• Diagnostic test questions are really conditional probability questions with careful interpretation.

    Chapter Summary

    Bayes-type reasoning β€” Key Points

    * Conditional Probability Foundation: Conditional probability $P(A\mid B) = P(A \cap B)/P(B)$ quantifies the likelihood of event $A$ occurring given that event $B$ has already occurred. Understanding the distinction between $P(A\mid B)$ and $P(B\mid A)$ is crucial.
    * Law of Total Probability: This theorem, $P(B) = \sum_{i} P(B\mid A_i)P(A_i)$ for a partition $\{A_i\}$, is fundamental for calculating the marginal probability of an event $B$ and often forms the denominator in Bayes' Theorem.
    * Bayes' Theorem: $P(A\mid B) = \dfrac{P(B\mid A)P(A)}{P(B)}$ provides a rigorous framework for updating prior beliefs $P(A)$ to posterior beliefs $P(A\mid B)$ based on new evidence $B$. This notion of "reverse probability" is central to the chapter.
    * Tree Diagrams: An indispensable tool for visualizing sequential events, partitioning sample spaces, and systematically calculating joint and conditional probabilities, especially useful for following the flow of events in multi-stage problems.
    * Contingency Tables: For problems involving multiple categorizations (e.g., disease status and test results), a contingency table organizes the data, clarifies relationships, and simplifies the calculation of the various conditional probabilities.
    * Diagnostic Test Problems: A common application where sensitivity ($P(\text{positive test}\mid\text{disease})$) and specificity ($P(\text{negative test}\mid\text{no disease})$) must be carefully distinguished from the positive predictive value ($P(\text{disease}\mid\text{positive test})$) and negative predictive value ($P(\text{no disease}\mid\text{negative test})$).
    * Interpretation of Results: Beyond calculation, interpreting the posterior probabilities in the context of the problem is vital for drawing meaningful conclusions and demonstrating conceptual understanding.

    Chapter Review Questions

:::question type="MCQ" question="A rare disease affects 0.1% of the population. A diagnostic test for this disease has a sensitivity of 99% and a specificity of 95%. If a randomly selected person tests positive, what is the probability that they actually have the disease?" options=["Approximately 1.94%", "Approximately 0.099%", "Approximately 99%", "Approximately 5%"] answer="Approximately 1.94%" hint="Use Bayes' Theorem. Let $D$ be the event of having the disease and $T^+$ the event of testing positive. You need to find $P(D\mid T^+)$. Use the prevalence, sensitivity, and specificity to obtain $P(T^+\mid D)$, $P(D)$, $P(T^+\mid D^c)$, and $P(D^c)$." solution="Let $D$ be the event that a person has the disease, and $T^+$ the event that they test positive.
    Given:
    $P(D) = 0.001$ (prevalence)
    $P(D^c) = 1 - P(D) = 0.999$
    $P(T^+\mid D) = 0.99$ (sensitivity)
    $P(T^-\mid D^c) = 0.95$ (specificity)
    From specificity, $P(T^+\mid D^c) = 1 - P(T^-\mid D^c) = 1 - 0.95 = 0.05$.

    We want $P(D\mid T^+)$. Using Bayes' Theorem:

    $P(D\mid T^+) = \dfrac{P(T^+\mid D)P(D)}{P(T^+)}$

    First, calculate $P(T^+)$ using the Law of Total Probability:

    $P(T^+) = P(T^+\mid D)P(D) + P(T^+\mid D^c)P(D^c) = (0.99)(0.001) + (0.05)(0.999) = 0.00099 + 0.04995 = 0.05094$

    Now substitute into Bayes' Theorem:

    $P(D\mid T^+) = \dfrac{(0.99)(0.001)}{0.05094} = \dfrac{0.00099}{0.05094} \approx 0.019434$

    Converting to a percentage, this is approximately 1.94%."
    :::

:::question type="NAT" question="Urn A contains 4 red and 6 blue balls. Urn B contains 7 red and 3 blue balls. A fair coin is flipped; if it lands heads, a ball is drawn from Urn A, and if tails, from Urn B. What is the probability that the ball drawn is red?" answer="0.55" hint="Use the Law of Total Probability. Define events for selecting each urn and drawing a red ball from each." solution="Let $A$ be the event that Urn A is chosen, and $B$ the event that Urn B is chosen.
    Since a fair coin is flipped:
    $P(A) = 0.5,\quad P(B) = 0.5$

    Let $R$ be the event that a red ball is drawn.
    From Urn A: $P(R\mid A) = \dfrac{4}{4+6} = \dfrac{4}{10} = 0.4$
    From Urn B: $P(R\mid B) = \dfrac{7}{7+3} = \dfrac{7}{10} = 0.7$

    Using the Law of Total Probability:

    $P(R) = P(R\mid A)P(A) + P(R\mid B)P(B) = (0.4)(0.5) + (0.7)(0.5) = 0.20 + 0.35 = 0.55$
    "
    :::
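The urn calculation generalises to any finite partition; this is a small sketch (the function name `total_probability` is my own) that checks the answer above.

```python
# Law of total probability over a finite partition {A_i}:
#   P(R) = sum_i P(R | A_i) P(A_i)
def total_probability(priors, conditionals):
    """priors: P(A_i) for a partition; conditionals: P(R | A_i)."""
    assert abs(sum(priors) - 1.0) < 1e-9, "priors must sum to 1"
    return sum(p * c for p, c in zip(priors, conditionals))

# Urn question: P(A) = P(B) = 0.5, P(R|A) = 0.4, P(R|B) = 0.7
p_red = total_probability([0.5, 0.5], [0.4, 0.7])
print(round(p_red, 2))  # 0.55
```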

:::question type="MCQ" question="A factory has two machines, M1 and M2, which produce 60% and 40% of the total output, respectively. Machine M1 produces 3% defective items, while Machine M2 produces 5% defective items. If a randomly selected item is found to be defective, what is the probability that it was produced by Machine M2?" options=["0.05", "0.40", "0.5263", "0.02"] answer="0.5263" hint="Apply Bayes' Theorem. Let $D$ be the event that an item is defective. You need to find $P(M_2\mid D)$." solution="Let $M_1$ be the event that an item is from Machine 1, and $M_2$ the event that it is from Machine 2.
    Let $D$ be the event that an item is defective.

    Given:
    $P(M_1) = 0.60,\quad P(M_2) = 0.40$
    $P(D\mid M_1) = 0.03,\quad P(D\mid M_2) = 0.05$

    We want $P(M_2\mid D)$. Using Bayes' Theorem:

    $P(M_2\mid D) = \dfrac{P(D\mid M_2)P(M_2)}{P(D)}$

    First, calculate $P(D)$ using the Law of Total Probability:

    $P(D) = P(D\mid M_1)P(M_1) + P(D\mid M_2)P(M_2) = (0.03)(0.60) + (0.05)(0.40) = 0.018 + 0.020 = 0.038$

    Now substitute into Bayes' Theorem:

    $P(M_2\mid D) = \dfrac{(0.05)(0.40)}{0.038} = \dfrac{0.020}{0.038} \approx 0.526315$

    Rounding to four decimal places, $P(M_2\mid D) \approx 0.5263$."
    :::
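The factory question condenses to a two-line Bayes computation; a minimal sketch using the numbers stated in the question:

```python
# Reverse probability for the factory question:
# which machine produced a defective item?
priors = {"M1": 0.60, "M2": 0.40}        # share of total output per machine
defect_rates = {"M1": 0.03, "M2": 0.05}  # P(D | machine)

# Law of total probability: P(D)
p_defect = sum(priors[m] * defect_rates[m] for m in priors)

# Bayes' theorem: P(M2 | D)
posterior_m2 = priors["M2"] * defect_rates["M2"] / p_defect

print(round(p_defect, 3), round(posterior_m2, 4))  # 0.038 0.5263
```

Note that M2 produces only 40% of the output yet accounts for over half of the defectives, because its defect rate is higher.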

    What's Next?

    Continue Your CMI Journey

    With a solid understanding of Bayes-type reasoning and conditional probability, you are well-prepared to delve into the broader landscape of probability theory. The concepts learned here, particularly the foundational idea of updating beliefs with new information, are crucial for future chapters. You should now proceed to explore Discrete Random Variables and their Probability Distributions, followed by Continuous Random Variables and their Probability Distributions. These topics build directly upon the principles of probability to introduce methods for quantifying uncertainty and variability, leading naturally into Expected Value and Variance and eventually Sampling Distributions and Statistical Inference.

    🎯 Key Points to Remember

    • βœ“ Master the core concepts in Bayes-type reasoning before moving to advanced topics
    • βœ“ Practice with previous year questions to understand exam patterns
    • βœ“ Review short notes regularly for quick revision before exams
