
Bayes-type reasoning

Comprehensive study notes on Bayes-type reasoning for CMI BS Hons preparation. This chapter covers key concepts, formulas, and examples needed for your exam.


This chapter rigorously explores Bayes' theorem and its application to conditional probability problems. Mastery of the tree and table methods, along with an understanding of reverse probability and diagnostic test scenarios, is critical for success in advanced probability examinations.

---

Chapter Contents

| # | Topic |
|---|-------|
| 1 | Tree method |
| 2 | Table method |
| 3 | Reverse probability |
| 4 | Diagnostic test problems |

---

We begin with Tree method.

Part 1: Tree method

Tree Method

Overview

The tree method is one of the clearest ways to organise conditional probability. It is especially useful when events happen in stages, when a probability changes after an earlier event, or when we want to compute a reverse probability using Bayes-type reasoning. In exam problems, the main goal is not drawing a decorative tree, but using the tree to track all branches correctly and read probabilities from it with precision.

---

Learning Objectives

❗ By the End of This Topic

After studying this topic, you will be able to:

  • Draw a probability tree for a multi-stage experiment.

  • Label branch probabilities correctly.

  • Compute the probability of a complete path by multiplying along the branches.

  • Compute the probability of an event by adding relevant path probabilities.

  • Use the tree method to solve Bayes-type reverse-probability questions.

---

Core Idea

📖 Probability Tree

A probability tree is a branching diagram used to represent sequential events.

Each level of the tree represents a stage of the experiment, and each branch is labeled with the conditional probability of that outcome given the previous stage.

For example, if an experiment happens in two stages:
  • first choose a source/type/category
  • then observe the result/outcome
then the tree keeps track of both the first-stage probabilities and the conditional second-stage probabilities.

---

Main Rules

πŸ“ Path Multiplication Rule

The probability of a complete path is the product of the probabilities written along that path.

If a path is Aβ†’B\qquad A \to B then P(A∩B)=P(A) P(B∣A)\qquad P(A \cap B) = P(A)\,P(B \mid A) :::
πŸ“ Event Addition Rule from a Tree

If an event can happen through several disjoint paths, then its probability is the sum of the probabilities of those paths.

So if event EE happens through paths 1,2,…,k1,2,\dots,k, then P(E)=P(pathΒ 1)+P(pathΒ 2)+β‹―+P(pathΒ k)\qquad P(E) = P(\text{path }1)+P(\text{path }2)+\cdots+P(\text{path }k) ::: ---
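As a quick check, the two rules above can be sketched in Python with exact fractions (a minimal illustration; the function names are ours, not part of the chapter):

```python
from fractions import Fraction

def path_probability(branches):
    """Multiply the branch probabilities along one complete path."""
    p = Fraction(1)
    for b in branches:
        p *= b
    return p

def event_probability(paths):
    """Add the probabilities of the disjoint paths that make up the event."""
    return sum(path_probability(path) for path in paths)

# A -> B with P(A) = 1/3 and P(B | A) = 3/5 gives P(A and B) = 1/5
print(path_probability([Fraction(1, 3), Fraction(3, 5)]))  # 1/5
```

Using `Fraction` keeps every path probability exact, so the results match the hand computations digit for digit.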

Why the Tree Method Works Well

💡 What the Tree Gives You

A good tree makes three things visible:

  • stage-by-stage dependence

  • complete-path probabilities

  • all possible ways an event can happen

This is why it is ideal for:
  • conditional probability
  • repeated draws
  • test accuracy problems
  • disease-screening problems
  • manufacturing/source-identification problems

---

Bayes-Type Reasoning from a Tree

πŸ“ Reverse Probability

Suppose an outcome BB can arise from multiple starting categories A1,A2,…,AnA_1,A_2,\dots,A_n.

Then

P(Ai∣B)=P(Ai∩B)P(B)\qquad P(A_i \mid B) = \dfrac{P(A_i \cap B)}{P(B)}

Using the tree:
  • numerator = one specific path probability
  • denominator = sum of all paths leading to BB
::: This is the most common tree-method use in Bayes-type problems. ---

Minimal Worked Example

Example 1

A box is chosen from two boxes:
  • Box 1 with probability $\dfrac{1}{3}$
  • Box 2 with probability $\dfrac{2}{3}$
Then a ball is drawn.
  • From Box 1, the probability of red is $\dfrac{3}{5}$
  • From Box 2, the probability of red is $\dfrac{1}{4}$
Find the probability of getting a red ball.

Using the tree:

Path 1: $\text{Box 1} \to \text{Red}$, with probability $\dfrac{1}{3}\cdot \dfrac{3}{5} = \dfrac{1}{5}$

Path 2: $\text{Box 2} \to \text{Red}$, with probability $\dfrac{2}{3}\cdot \dfrac{1}{4} = \dfrac{1}{6}$

Add the disjoint red paths:

$$P(\text{Red}) = \dfrac{1}{5}+\dfrac{1}{6}=\dfrac{11}{30}$$

So the answer is $\boxed{\dfrac{11}{30}}$.

---
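The arithmetic in Example 1 can be verified in a few lines of Python using exact fractions (an illustrative check, not part of the original notes):

```python
from fractions import Fraction

# Path 1: Box 1 then Red; Path 2: Box 2 then Red.
path1 = Fraction(1, 3) * Fraction(3, 5)  # multiply along path 1
path2 = Fraction(2, 3) * Fraction(1, 4)  # multiply along path 2
p_red = path1 + path2                    # the paths are disjoint, so add
print(p_red)  # 11/30
```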

Common Tree Structures

💡 Recognize These Patterns

  • source → outcome

  • disease status → test result

  • first draw → second draw

  • biased choice → success/failure

  • machine chosen → defective/non-defective

---

Drawing the Tree Correctly

❗ Good Tree Discipline

When drawing a tree:

  • every stage must be clearly separated

  • branch probabilities leaving a node must add to $1$

  • conditional labels must match the branch's parent node

  • final answers should come from path multiplication and path addition

---

Common Mistakes

⚠️ Avoid These Errors
    • ❌ multiplying probabilities from different paths together
✅ multiply only along a single path
    • ❌ forgetting that second-stage probabilities may be conditional
✅ read each branch from its parent node
    • ❌ adding probabilities of non-disjoint events without care
✅ only add separate final paths for the target event
    • ❌ using a tree when the stages are not actually sequential
✅ the tree method is for multi-stage or condition-based structure
---

CMI Strategy

💡 How to Attack Tree-Method Questions

  • Identify the stages clearly.

  • Put first-stage probabilities on the first split.

  • Put conditional probabilities on the next split.

  • Multiply along paths.

  • Add the relevant final paths.

  • For reverse probability, divide the wanted path by the total probability of the observed event.

---

Practice Questions

:::question type="MCQ" question="In a probability tree, the probability of a complete path is found by" options=["adding the branch probabilities on that path","multiplying the branch probabilities on that path","subtracting the branch probabilities on that path","taking the average of the branch probabilities on that path"] answer="B" hint="Use the multiplication rule for sequential events." solution="For sequential events, the probability of a full path is the product of the probabilities along that path. Therefore the correct option is $\boxed{B}$." :::

:::question type="NAT" question="A box is chosen: Box 1 with probability $\dfrac{1}{2}$ and Box 2 with probability $\dfrac{1}{2}$. From Box 1, the probability of a red ball is $\dfrac{3}{4}$; from Box 2, it is $\dfrac{1}{4}$. Find the probability of drawing a red ball." answer="1/2" hint="Add the red paths." solution="There are two red paths. From Box 1: $\dfrac{1}{2}\cdot \dfrac{3}{4}=\dfrac{3}{8}$. From Box 2: $\dfrac{1}{2}\cdot \dfrac{1}{4}=\dfrac{1}{8}$. So $P(\text{Red})=\dfrac{3}{8}+\dfrac{1}{8}=\dfrac{1}{2}$. Hence the answer is $\boxed{\dfrac{1}{2}}$." :::

:::question type="MSQ" question="Which of the following statements are true?" options=["Branch probabilities leaving the same node should add to $1$","A path probability is found by multiplying along the path","A tree method is useful for Bayes-type problems","In a tree, all probabilities must be equal"] answer="A,B,C" hint="Think about how a probability tree is built." solution="A: True. B: True. C: True. D: False; tree probabilities need not be equal. Hence the correct answer is $\boxed{A,B,C}$." :::

:::question type="SUB" question="A factory has two machines. Machine $A$ produces $40\%$ of the items and Machine $B$ produces $60\%$ of the items. The defective rates are $5\%$ for $A$ and $2\%$ for $B$. Using a probability tree, find the probability that a randomly chosen item is defective." answer="0.032" hint="Compute the defective path from each machine and add." solution="There are two defective paths. From Machine $A$: $P(A \cap D)=0.4\times 0.05 = 0.02$. From Machine $B$: $P(B \cap D)=0.6\times 0.02 = 0.012$. Therefore $P(D)=0.02+0.012=0.032$. Hence the required probability is $\boxed{0.032}$." :::

---

Summary

❗ Key Takeaways for CMI

  • The tree method organizes conditional probability stage by stage.

  • Multiply along a path and add across relevant disjoint paths.

  • Branches from the same node must total $1$.

  • Tree diagrams are especially effective in Bayes-type reasoning.

  • A correct tree prevents logical mixing of cases.

---

💡 Next Up

Proceeding to Table method.

---

Part 2: Table method

Table Method

Overview

The table method is a compact and powerful way to solve conditional probability problems when the information naturally falls into categories. It is especially effective in Bayes-type reasoning, test-result problems, and classification problems. Instead of following paths stage by stage as in a tree, the table method organizes outcomes into rows and columns and lets us read totals, intersections, and conditional probabilities directly.

---

Learning Objectives

❗ By the End of This Topic

After studying this topic, you will be able to:

  • Construct a probability or frequency table from given data.

  • Fill row totals, column totals, and internal cells correctly.

  • Use the table to compute conditional probabilities.

  • Solve Bayes-type reverse-probability questions using table entries.

  • Move cleanly between percentages, frequencies, and probabilities.

---

Core Idea

📖 Probability Table

A probability table organizes events into categories so that:

  • rows represent one classification

  • columns represent another classification

  • each cell represents an intersection event

  • row and column totals represent marginal probabilities

For example, in a medical-testing problem:
  • rows may represent Disease / No Disease
  • columns may represent Positive / Negative test result

---

Why the Table Method Works

💡 What the Table Gives You

The table method makes three things easy:

  • seeing intersection probabilities such as $P(A \cap B)$

  • seeing marginal totals such as $P(B)$

  • computing conditional probabilities such as

$$P(A \mid B)=\dfrac{P(A \cap B)}{P(B)}$$

This is why it is very useful when the problem has a classification structure rather than a natural time order.

---

Main Formula in Table Language

📝 Conditional Probability from a Table

If a table gives you:

  • the intersection entry $P(A \cap B)$

  • the column or row total $P(B)$

then

$$P(A \mid B)=\dfrac{P(A \cap B)}{P(B)}$$

In words:
  • numerator = the favorable cell
  • denominator = the total of the conditioning category

---

Frequencies Often Make Tables Easier

❗ Use Convenient Totals

In many problems with percentages, it is easier to imagine a sample of:

  • $100$

  • $1000$

  • $10000$

Then fill the table with frequencies instead of decimals.

At the end, convert back to probability if needed.

This often makes Bayes-type questions much clearer.

---

Minimal Worked Example

Example 1

A disease occurs in $10\%$ of a population. A test is positive:
  • in $80\%$ of diseased people
  • in $20\%$ of non-diseased people
Find the probability that a person actually has the disease given that the test is positive.

Take a population of $100$.

Disease / No Disease counts:
  • Disease: $10$
  • No Disease: $90$

Positive counts:
  • Diseased and positive: $0.8\times 10 = 8$
  • Non-diseased and positive: $0.2\times 90 = 18$

So the total positive count is $8+18=26$. Thus

$$P(\text{Disease} \mid \text{Positive})=\dfrac{8}{26}=\dfrac{4}{13}$$

So the answer is $\boxed{\dfrac{4}{13}}$.

---
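The frequency-table computation in Example 1 can be replayed in Python (a sketch; the variable names are ours, not part of the notes):

```python
from fractions import Fraction

population = 100
diseased = Fraction(10, 100) * population      # 10 people have the disease
healthy = population - diseased                # 90 do not
true_pos = Fraction(80, 100) * diseased        # 8 diseased test positive
false_pos = Fraction(20, 100) * healthy        # 18 healthy test positive
# Conditional probability: favorable cell / total of conditioning category.
posterior = true_pos / (true_pos + false_pos)
print(posterior)  # 4/13
```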

Table Structure Example

📝 Typical Layout

A standard $2\times 2$ table looks like this:

| Category | Positive | Negative | Total |
|---|---:|---:|---:|
| Disease | $P(D \cap P)$ | $P(D \cap N)$ | $P(D)$ |
| No Disease | $P(D^c \cap P)$ | $P(D^c \cap N)$ | $P(D^c)$ |
| Total | $P(P)$ | $P(N)$ | $1$ |

This is often the fastest way to organize the information.

---

Table Method vs Tree Method

❗ When a Table is Better

Use the table method when:

  • the problem is classification-based

  • you want totals and subtotals quickly

  • Bayes-type reverse probability is asked

  • data is already presented in percentage or count form

Use the tree method when:
  • the experiment is sequential
  • stages happen one after another

---

Common Patterns

💡 Typical Exam Patterns

  • disease / test result

  • machine / defective status

  • class membership / success-failure

  • source / observed outcome

  • frequency table completion

---

Common Mistakes

⚠️ Avoid These Errors
  • ❌ mixing row totals and column totals
✅ label the table clearly before filling it
  • ❌ using percentages directly without a consistent base
✅ choose a base of $100$ or $1000$ if needed
  • ❌ dividing by the wrong total in conditional probability
✅ the denominator must match the condition
  • ❌ forcing a tree when a table is simpler
✅ use the structure that matches the data

---

CMI Strategy

💡 How to Attack Table-Method Questions

  • Identify the two classifications.

  • Draw the table with clear row and column labels.

  • Fill the easy totals first.

  • Fill intersection cells using the given rates.

  • Use row/column totals for conditional probability.

  • Check that the full total is consistent.

---

Practice Questions

:::question type="MCQ" question="In a conditional probability table, the denominator of $P(A \mid B)$ should be" options=["the grand total","the total corresponding to $B$","the total corresponding to $A$","the sum of all unfavorable cells"] answer="B" hint="Use the definition of conditional probability." solution="By definition, $P(A \mid B)=\dfrac{P(A \cap B)}{P(B)}$. So the denominator is the total corresponding to $B$. Hence the correct option is $\boxed{B}$." :::

:::question type="NAT" question="In a school of $100$ students, $40$ are girls. Among the girls, $30$ play chess. Among the boys, $20$ play chess. Find the probability that a randomly chosen student plays chess." answer="1/2" hint="Fill in the chess counts and divide by $100$." solution="Girls who play chess: $30$. Boys in the school: $100-40=60$. Boys who play chess: $20$. Total students who play chess: $30+20=50$. Therefore the required probability is $\dfrac{50}{100}=\dfrac{1}{2}$. Hence the answer is $\boxed{\dfrac{1}{2}}$." :::

:::question type="MSQ" question="Which of the following statements are true?" options=["A table method is useful for Bayes-type questions","A table can organize intersection events and totals together","Conditional probability is obtained by dividing a relevant cell by the relevant total","A table method can never use frequencies"] answer="A,B,C" hint="Think about what a probability table records." solution="A: True. B: True. C: True. D: False; in fact, frequencies are often the easiest way to use the table method. Hence the correct answer is $\boxed{A,B,C}$." :::

:::question type="SUB" question="A population has $20\%$ smokers and $80\%$ non-smokers. Among smokers, $15\%$ have a condition. Among non-smokers, $5\%$ have the condition. Using a table, find the probability that a randomly chosen person has the condition." answer="0.07" hint="Use a base of $100$ people." solution="Take a population of $100$. Smokers: $20$. Non-smokers: $80$. Condition among smokers: $0.15\times 20 = 3$. Condition among non-smokers: $0.05\times 80 = 4$. So the total with the condition is $3+4=7$. Hence the required probability is $\dfrac{7}{100}=0.07$. Therefore the answer is $\boxed{0.07}$." :::

---

Summary

❗ Key Takeaways for CMI

  • The table method is ideal for category-based conditional probability problems.

  • Cells represent intersections; row and column totals represent marginals.

  • Conditional probability is a cell divided by the relevant row or column total.

  • Frequencies often make tables simpler than raw percentages.

  • The right structure makes Bayes-type reasoning much easier.

---

💡 Next Up

Proceeding to Reverse probability.

---

Part 3: Reverse probability

Reverse Probability

Overview

Reverse probability problems ask you to work backward from observed information to the hidden cause that produced it. This is the logic behind Bayes-type reasoning. In CMI-style questions, such problems often look simple but are dangerous because human intuition overweights the observed event and underweights the prior chances.

---

Learning Objectives

❗ By the End of This Topic

After studying this topic, you will be able to:

  • Interpret reverse probability questions correctly.

  • Apply Bayes' theorem in simple and multi-case situations.

  • Compute posterior probabilities from prior probabilities and likelihoods.

  • Handle box-selection, test-diagnosis, and coin-selection problems.

  • Avoid base-rate neglect.

---

Core Idea

📖 Reverse Probability

A reverse probability problem asks for

$$P(\text{cause} \mid \text{observed effect})$$

rather than the forward probability

$$P(\text{effect} \mid \text{cause})$$

This reversal is the central difficulty.

---

Bayes' Theorem

📝 Two-Event Form

If $P(B)>0$, then

$$P(A\mid B)=\dfrac{P(B\mid A)P(A)}{P(B)}$$

Here:
  • $P(A)$ is the prior probability,
  • $P(B\mid A)$ is the likelihood,
  • $P(A\mid B)$ is the posterior probability.

---

Partition Form

📝 Multiple-Cause Form

If $A_1,A_2,\dots,A_n$ form a partition of the sample space and $P(B)>0$, then

$$P(A_i\mid B)=\dfrac{P(B\mid A_i)P(A_i)}{\sum_{j=1}^n P(B\mid A_j)P(A_j)}$$

This is the practical form used in most exam problems with several boxes, coins, machines, or hypotheses.

---
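The partition form translates directly into a small helper function (a minimal sketch; `posteriors` is a name we introduce for illustration):

```python
from fractions import Fraction

def posteriors(priors, likelihoods):
    """Posterior P(A_i | B) for each cause, via the partition form of Bayes."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(B | A_j) P(A_j)
    total = sum(joint)                                    # P(B), by total probability
    return [j / total for j in joint]

# Two boxes chosen uniformly; P(red | Box 1) = 2/5, P(red | Box 2) = 4/5.
post = posteriors([Fraction(1, 2), Fraction(1, 2)],
                  [Fraction(2, 5), Fraction(4, 5)])
print(post)  # [Fraction(1, 3), Fraction(2, 3)]
```

Note that the posteriors always sum to $1$, which is a useful sanity check in exam work as well.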

Standard Bayes Pattern

💡 How Bayes Problems Are Structured

  • Choose a hidden cause:

    box, coin, machine, disease status, route, source

  • Observe some event:

    red ball, head, defective item, positive test

  • Work backward using Bayes' theorem.

---

Base Rate Warning

⚠️ Very Common Trap

A highly likely observation under one cause does not automatically make that cause the most probable.

You must also account for how common the cause was before the observation.

This is called the base-rate effect.

---

Minimal Worked Examples

Example 1

A box is chosen uniformly from:
  • Box 1: $2$ red, $3$ blue
  • Box 2: $4$ red, $1$ blue
A red ball is drawn. Find the probability that Box 2 was chosen.

Let $R$ be the event "red ball drawn". Then

$$P(\text{Box 2}\mid R) = \dfrac{P(R\mid \text{Box 2})P(\text{Box 2})}{P(R)}$$

Now

$$P(R\mid \text{Box 1})=\dfrac25,\qquad P(R\mid \text{Box 2})=\dfrac45$$

and each box was chosen with probability $\dfrac12$. So

$$P(R)=\dfrac12\cdot \dfrac25 + \dfrac12\cdot \dfrac45 = \dfrac35$$

Hence

$$P(\text{Box 2}\mid R) = \dfrac{\frac12\cdot \frac45}{\frac35} = \dfrac{2}{3}$$

So the answer is $\boxed{\dfrac23}$.

---

Example 2

One coin is chosen uniformly from:
  • a fair coin,
  • a two-headed coin,
  • a coin with $P(H)=\dfrac34$.
If the chosen coin is tossed twice and both tosses are heads, then the probability that the chosen coin was the two-headed coin is

$$\dfrac{1\cdot \frac13}{\frac14\cdot \frac13 + 1\cdot \frac13 + \left(\frac34\right)^2\cdot \frac13} = \dfrac{1}{\frac14 + 1 + \frac{9}{16}} = \dfrac{16}{29}$$

So the posterior is $\boxed{\dfrac{16}{29}}$.

---
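Example 2 can be checked numerically with exact fractions (an independent verification, not part of the chapter):

```python
from fractions import Fraction

# Three coins, uniform prior; observe two heads in two tosses.
priors = [Fraction(1, 3)] * 3
likelihoods = [Fraction(1, 2) ** 2,   # fair coin: P(HH) = 1/4
               Fraction(1),           # two-headed coin: P(HH) = 1
               Fraction(3, 4) ** 2]   # biased coin: P(HH) = 9/16
joint = [p * l for p, l in zip(priors, likelihoods)]
posterior_two_headed = joint[1] / sum(joint)
print(posterior_two_headed)  # 16/29
```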

Common Mistakes

⚠️ Avoid These Errors
  • ❌ Confusing $P(A\mid B)$ with $P(B\mid A)$.
  • ❌ Forgetting to compute the total probability of the observed event.
  • ❌ Ignoring prior probabilities.
  • ❌ Using intuition instead of the formula in base-rate problems.

---

CMI Strategy

💡 How to Solve Reverse Probability Problems

  • Name the hidden causes clearly.

  • Write their prior probabilities.

  • Compute the probability of the observed event under each cause.

  • Use Bayes' formula carefully.

  • Simplify only at the end.

---

Practice Questions

:::question type="MCQ" question="In a Bayes-type problem, the quantity $P(\text{cause}\mid \text{evidence})$ is called the" options=["likelihood","prior probability","posterior probability","sample probability"] answer="C" hint="It is the probability after the evidence is observed." solution="The probability of the hidden cause after seeing the evidence is called the posterior probability. Hence the correct option is $\boxed{C}$." :::

:::question type="NAT" question="A box is chosen uniformly from two boxes. Box 1 has $2$ red and $3$ blue balls, and Box 2 has $4$ red and $1$ blue ball. A red ball is drawn. Find the probability that Box 2 was chosen." answer="2/3" hint="Apply Bayes' theorem." solution="Let $R$ be the event that a red ball is drawn. Then $P(R\mid B_1)=\dfrac25$, $P(R\mid B_2)=\dfrac45$, and $P(B_1)=P(B_2)=\dfrac12$. So $P(R)=\dfrac12\cdot \dfrac25 + \dfrac12\cdot \dfrac45 = \dfrac35$. Therefore $P(B_2\mid R)=\dfrac{P(R\mid B_2)P(B_2)}{P(R)} = \dfrac{\frac45\cdot \frac12}{\frac35} = \dfrac23$. Hence the answer is $\boxed{\dfrac23}$." :::

:::question type="MSQ" question="Which of the following statements are true?" options=["Bayes' theorem computes $P(\text{cause}\mid \text{evidence})$ from forward probabilities","Reverse probability problems often require prior probabilities","$P(A\mid B)$ and $P(B\mid A)$ are always equal","A rare cause can still have a small posterior probability even if the evidence is likely under that cause"] answer="A,B,D" hint="One statement incorrectly treats conditional probabilities as symmetric." solution="A: True. B: True. C: False; conditional probabilities are not symmetric in general. D: True; this is the base-rate phenomenon. Hence the correct answer is $\boxed{A,B,D}$." :::

:::question type="SUB" question="One coin is chosen uniformly from a fair coin, a two-headed coin, and a coin with probability of heads $3/4$. The chosen coin is tossed twice and both tosses are heads. Find the probability that the chosen coin was the two-headed coin." answer="16/29" hint="Use Bayes' theorem with the three possible coins as the hidden causes." solution="Let the three possible coins be $C_1$ (fair), $C_2$ (two-headed), and $C_3$ (biased with $P(H)=\dfrac34$). Each is chosen with prior probability $\dfrac13$. Let $E$ be the event that two tosses both show heads. Then $P(E\mid C_1)=\left(\dfrac12\right)^2=\dfrac14$, $P(E\mid C_2)=1$, and $P(E\mid C_3)=\left(\dfrac34\right)^2=\dfrac{9}{16}$. So the total probability of $E$ is $P(E)=\dfrac13\left(\dfrac14+1+\dfrac{9}{16}\right) = \dfrac13\cdot \dfrac{29}{16} = \dfrac{29}{48}$. Now apply Bayes' theorem: $P(C_2\mid E)=\dfrac{P(E\mid C_2)P(C_2)}{P(E)} = \dfrac{1\cdot \frac13}{\frac{29}{48}} = \dfrac{16}{29}$. Hence the required probability is $\boxed{\dfrac{16}{29}}$." :::

---

Summary

❗ Key Takeaways for CMI

  • Reverse probability means working from observed evidence back to a hidden cause.

  • Bayes' theorem is the standard tool.

  • Posterior probability depends on both the likelihood and the prior probability.

  • Base-rate effects can make intuition unreliable.

  • Good Bayes solutions start by naming the hidden causes clearly.

---

💡 Next Up

Proceeding to Diagnostic test problems.

---

Part 4: Diagnostic test problems

Diagnostic Test Problems

Overview

Diagnostic test problems are one of the most important applications of conditional probability and Bayes' theorem. The main difficulty is that people often confuse:
  • the probability of testing positive given disease, and
  • the probability of having the disease given a positive test.
These are usually very different. In exam problems, the decisive idea is to combine prevalence, sensitivity, and specificity correctly.

---

Learning Objectives

❗ By the End of This Topic

After studying this topic, you will be able to:

  • Interpret sensitivity, specificity, false positive rate, and false negative rate correctly.

  • Compute the probability of a positive or negative test using total probability.

  • Apply Bayes' theorem to find the probability of disease given a test result.

  • Understand base-rate effects in rare-disease testing.

  • Avoid the common mistake of confusing $P(+\mid D)$ with $P(D\mid +)$.

---

Core Setup

📖 Basic Events

Let

  • $D$ = the event that a person has the disease

  • $D^c$ = the event that a person does not have the disease

  • $+$ = the event that the test result is positive

  • $-$ = the event that the test result is negative

📝 Prevalence

The prevalence of the disease is $P(D)$, and the probability that a person does not have the disease is

$$P(D^c)=1-P(D)$$

---

Main Test Quantities

📝 Sensitivity and Specificity
  • Sensitivity: $P(+\mid D)$, the probability that the test correctly identifies a diseased person.
  • Specificity: $P(-\mid D^c)$, the probability that the test correctly identifies a non-diseased person.

📝 False Positive and False Negative Rates
  • False positive rate: $P(+\mid D^c)=1-\text{specificity}$
  • False negative rate: $P(-\mid D)=1-\text{sensitivity}$

---

The Most Important Distinction

⚠️ Do Not Confuse These

These are different quantities:

  • $P(+\mid D)$ = sensitivity

  • $P(D\mid +)$ = probability that a person has the disease given a positive result

The first is about test performance on diseased people.

The second is about what a positive result means for a person.

This distinction is the heart of Bayes-type reasoning.

---

Total Probability for Test Outcomes

📝 Probability of a Positive Test

To compute the overall chance of a positive test, split into diseased and non-diseased cases:

$$P(+)=P(+\mid D)P(D)+P(+\mid D^c)P(D^c)$$

📝 Probability of a Negative Test

Similarly,

$$P(-)=P(-\mid D)P(D)+P(-\mid D^c)P(D^c)$$

---

Bayes' Theorem

📝 Positive Predictive Value

The probability that a person has the disease given a positive test is

$$P(D\mid +)=\dfrac{P(+\mid D)P(D)}{P(+)}$$

Using the total probability formula for $P(+)$, this becomes

$$P(D\mid +)=\dfrac{P(+\mid D)P(D)}{P(+\mid D)P(D)+P(+\mid D^c)P(D^c)}$$

📝 Negative Predictive Value

The probability that a person does not have the disease given a negative test is

$$P(D^c\mid -)=\dfrac{P(-\mid D^c)P(D^c)}{P(-)}$$

---

Standard Formula in Parameters

📝 General Formula

Let

  • prevalence = $p$

  • sensitivity = $s$

  • specificity = $c$

Then

$$P(D)=p,\quad P(D^c)=1-p$$

$$P(+\mid D)=s,\quad P(+\mid D^c)=1-c$$

So

$$P(+)=sp+(1-c)(1-p)$$

and

$$P(D\mid +)=\dfrac{sp}{sp+(1-c)(1-p)}$$

This is the main formula for this topic.

---
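The general formula is easy to experiment with in Python (a sketch; `ppv` is our own name for the positive predictive value, not something from the notes):

```python
def ppv(p, s, c):
    """P(D | +) from prevalence p, sensitivity s, specificity c."""
    return s * p / (s * p + (1 - c) * (1 - p))

# Example: 1% prevalence, 90% sensitivity, 95% specificity.
print(round(ppv(0.01, 0.90, 0.95), 4))  # 0.1538
```

Varying `p` while holding `s` and `c` fixed makes the base-rate effect visible: as the disease gets rarer, $P(D\mid +)$ drops sharply even though the test itself is unchanged.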

Table Method

📝 1000-People or 100000-People Method

In many diagnostic test problems, it is easiest to imagine a sample population.

For example, if

  • prevalence = $1\%$

  • sensitivity = $90\%$

  • specificity = $95\%$

then among $1000$ people:

  • diseased: $10$

  • non-diseased: $990$

Among the $10$ diseased:
  • true positives: $0.90\times10=9$
  • false negatives: $1$

Among the $990$ non-diseased:
  • true negatives: $0.95\times990=940.5$
  • false positives: $49.5$

Then

$$P(D\mid +)=\dfrac{\text{true positives}}{\text{all positives}}$$

This method is often faster than symbolic algebra.

---
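The 1000-people count method above can be written as a short script (illustrative; it reproduces the counts just listed):

```python
population = 1000
diseased = 0.01 * population          # 10 people have the disease
healthy = population - diseased       # 990 do not
true_pos = 0.90 * diseased            # 9 true positives
false_pos = (1 - 0.95) * healthy      # 49.5 false positives
p_disease_given_pos = true_pos / (true_pos + false_pos)
print(round(p_disease_given_pos, 4))  # 0.1538
```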

Why Rare Diseases Are Tricky

❗ Base-Rate Effect

Even a very accurate test can have a surprisingly low value of $P(D\mid +)$ when the disease is rare.

Reason:

  • the diseased group is tiny

  • the healthy group is huge

  • even a small false positive rate applied to a huge healthy group may create many false positives

This is one of the most important conceptual lessons in probability.

---

Minimal Worked Examples

Example 1

A disease affects $1\%$ of a population. A test has sensitivity $90\%$ and specificity $95\%$. Find the probability of a positive test.

We have

$$P(D)=0.01,\quad P(D^c)=0.99$$

$$P(+\mid D)=0.90,\quad P(+\mid D^c)=0.05$$

So

$$P(+)=0.90\cdot0.01+0.05\cdot0.99 =0.009+0.0495=0.0585$$

Hence $P(+)=\boxed{0.0585}$.

---

Example 2

Using the same data, find the probability that a person has the disease given a positive test.

By Bayes' theorem,

$$P(D\mid +)=\dfrac{0.90\cdot0.01}{0.0585} =\dfrac{0.009}{0.0585}=\dfrac{2}{13}\approx0.1538$$

So $P(D\mid +)\approx \boxed{0.1538}$. This is only about $15.38\%$, even though the test is quite accurate.

---

Common Derived Quantities

📝 Useful Probabilities

  • True positive probability: $P(\text{TP})=P(+\mid D)P(D)$

  • False positive probability: $P(\text{FP})=P(+\mid D^c)P(D^c)$

  • True negative probability: $P(\text{TN})=P(-\mid D^c)P(D^c)$

  • False negative probability: $P(\text{FN})=P(-\mid D)P(D)$

These help compare how often each outcome occurs in the whole population.

---

Common Mistakes

⚠️ Avoid These Errors
  • ❌ Using sensitivity in place of $P(D\mid +)$
  • ❌ Forgetting to include false positives when computing total positives
  • ❌ Ignoring prevalence
  • ❌ Mixing up specificity with the false positive rate
  • ❌ Forgetting that $P(+\mid D^c)=1-\text{specificity}$

---

    CMI Strategy

    πŸ’‘ How to Attack Diagnostic Test Questions

• Define the events $D$, $D^c$, $+$, and $-$ clearly.

• Write down prevalence, sensitivity, and specificity first.

• Compute $P(+)$ or $P(-)$ using total probability.

• Then apply Bayes' theorem.

• If the numbers are awkward, use a $1000$-person or $100000$-person table.

    • Always check whether the question is asking for:

    - $P(+\mid D)$
    - $P(D\mid +)$
    - $P(+)$
    - $P(-)$

    ---
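The "$100000$-person table" trick from the strategy above can be sketched numerically: convert probabilities into expected counts, then read the posterior as a simple fraction. The numbers below are assumed from Example 1 (prevalence $1\%$, sensitivity $90\%$, specificity $95\%$).

```python
# Natural-frequency version of Bayes' theorem: work with expected counts
# out of 100000 people instead of raw probabilities.
N = 100_000
prevalence, sensitivity, specificity = 0.01, 0.90, 0.95

diseased = round(N * prevalence)                  # 1000 people have the disease
healthy = N - diseased                            # 99000 do not
true_pos = round(diseased * sensitivity)          # 900 of the diseased test positive
false_pos = round(healthy * (1 - specificity))    # 4950 of the healthy test positive

ppv = true_pos / (true_pos + false_pos)           # P(D | +) as a count ratio
print(f"{true_pos} / {true_pos + false_pos} = {ppv:.4f}")  # 900 / 5850 = 0.1538
```

The count form makes the result easy to sanity-check in an exam: positives are dominated by the $4950$ false positives from the large healthy group.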

    Practice Questions

:::question type="MCQ" question="Which of the following equals the sensitivity of a test?" options=["$P(D\mid +)$","$P(+\mid D)$","$P(-\mid D^c)$","$P(D^c\mid -)$"] answer="B" hint="Sensitivity is the probability of a positive test among diseased people." solution="By definition, sensitivity is $P(+\mid D)$. Hence the correct option is $\boxed{B}$."
:::

:::question type="NAT" question="A disease affects $10\%$ of a population. A test has sensitivity $80\%$ and specificity $90\%$. Find the probability of a positive test." answer="0.17" hint="Use total probability." solution="We have $P(D)=0.10,\quad P(D^c)=0.90$ and $P(+\mid D)=0.80,\quad P(+\mid D^c)=0.10$. So $P(+)=0.80\cdot0.10+0.10\cdot0.90=0.08+0.09=0.17$. Hence the answer is $\boxed{0.17}$."
:::

:::question type="MSQ" question="Which of the following statements are true?" options=["Specificity is $P(-\mid D^c)$","False positive rate is $P(+\mid D^c)$","Positive predictive value is $P(D\mid +)$","Sensitivity is $P(D\mid +)$"] answer="A,B,C" hint="Separate test-quality quantities from posterior probabilities." solution="A: true; this is the definition of specificity. B: true; this is the definition of the false positive rate. C: true; positive predictive value is the probability of disease given a positive test. D: false; sensitivity is $P(+\mid D)$, not $P(D\mid +)$. Hence the correct answer is $\boxed{A,B,C}$."
:::

:::question type="SUB" question="A disease affects $1\%$ of a population. A test has sensitivity $99\%$ and specificity $99\%$. Compute the probability that a person has the disease given that the test is positive." answer="$0.5$" hint="Use Bayes' theorem." solution="We have $P(D)=0.01,\quad P(D^c)=0.99$, with $P(+\mid D)=0.99$ and $P(+\mid D^c)=0.01$. So $P(+)=0.99\cdot0.01+0.01\cdot0.99=0.0198$. By Bayes' theorem, $P(D\mid +)=\dfrac{0.99\cdot0.01}{0.0198}=\dfrac{0.0099}{0.0198}=0.5$. Hence the answer is $\boxed{0.5}$."
:::

---

    Summary

    ❗ Key Takeaways for CMI

• Sensitivity is $P(+\mid D)$ and specificity is $P(-\mid D^c)$.

• Use total probability to compute $P(+)$ and $P(-)$.

• Use Bayes' theorem to compute $P(D\mid +)$ and $P(D^c\mid -)$.

• Positive predictive value can be much smaller than sensitivity when prevalence is low.

• Diagnostic test questions are really conditional probability questions with careful interpretation.

    Chapter Summary

    Bayes-type reasoning β€” Key Points

    * Conditional Probability Foundation: Conditional probability $P(A\mid B) = P(A \cap B)/P(B)$ quantifies the likelihood of event $A$ occurring given that event $B$ has already occurred. Understanding the distinction between $P(A\mid B)$ and $P(B\mid A)$ is crucial.
    * Law of Total Probability: This theorem, $P(B) = \sum_{i} P(B\mid A_i)P(A_i)$ for a partition $\{A_i\}$, is fundamental for calculating the marginal probability of an event $B$ and often forms the denominator in Bayes' Theorem.
    * Bayes' Theorem: $P(A\mid B) = \dfrac{P(B\mid A)P(A)}{P(B)}$ provides a rigorous framework for updating prior beliefs $P(A)$ to posterior beliefs $P(A\mid B)$ based on new evidence $B$. This notion of "reverse probability" is central to the chapter.
    * Tree Diagrams: An indispensable tool for visualizing sequential events, partitioning sample spaces, and systematically calculating joint and conditional probabilities, especially useful for following the flow of events in multi-stage problems.
    * Contingency Tables: For problems involving multiple categorizations (e.g., disease status and test results), a contingency table organizes the data, clarifies relationships, and simplifies the calculation of the various conditional probabilities.
    * Diagnostic Test Problems: A common application where sensitivity ($P(\text{positive test}\mid\text{disease})$) and specificity ($P(\text{negative test}\mid\text{no disease})$) must be carefully distinguished from the positive predictive value ($P(\text{disease}\mid\text{positive test})$) and negative predictive value ($P(\text{no disease}\mid\text{negative test})$).
    * Interpretation of Results: Beyond calculation, interpreting the posterior probabilities in the context of the problem is vital for drawing meaningful conclusions and demonstrating conceptual understanding.

    Chapter Review Questions

:::question type="MCQ" question="A rare disease affects 0.1% of the population. A diagnostic test for this disease has a sensitivity of 99% and a specificity of 95%. If a randomly selected person tests positive, what is the probability that they actually have the disease?" options=["Approximately 1.94%", "Approximately 0.099%", "Approximately 99%", "Approximately 5%"] answer="Approximately 1.94%" hint="Use Bayes' Theorem. Let $D$ be the event of having the disease and $T^+$ the event of testing positive. You need to find $P(D\mid T^+)$. Use the prevalence, sensitivity, and specificity to obtain $P(T^+\mid D)$, $P(D)$, $P(T^+\mid D^c)$, and $P(D^c)$." solution="Let $D$ be the event that a person has the disease, and $T^+$ the event that they test positive.
    Given:
    $P(D) = 0.001$ (prevalence)
    $P(D^c) = 1 - P(D) = 0.999$
    $P(T^+\mid D) = 0.99$ (sensitivity)
    $P(T^-\mid D^c) = 0.95$ (specificity)
    From specificity, $P(T^+\mid D^c) = 1 - P(T^-\mid D^c) = 1 - 0.95 = 0.05$.

    We want $P(D\mid T^+)$. Using Bayes' Theorem:

    $P(D\mid T^+) = \dfrac{P(T^+\mid D)P(D)}{P(T^+)}$

    First, calculate $P(T^+)$ using the Law of Total Probability:

    $P(T^+) = P(T^+\mid D)P(D) + P(T^+\mid D^c)P(D^c) = (0.99)(0.001) + (0.05)(0.999) = 0.00099 + 0.04995 = 0.05094$

    Now substitute into Bayes' Theorem:

    $P(D\mid T^+) = \dfrac{(0.99)(0.001)}{0.05094} = \dfrac{0.00099}{0.05094} \approx 0.019434$

    Converting to a percentage, this is approximately 1.94%."
    :::

:::question type="NAT" question="Urn A contains 4 red and 6 blue balls. Urn B contains 7 red and 3 blue balls. A fair coin is flipped; if it lands heads, a ball is drawn from Urn A, and if tails, from Urn B. What is the probability that the ball drawn is red?" answer="0.55" hint="Use the Law of Total Probability. Define events for selecting each urn and drawing a red ball from each." solution="Let $A$ be the event that Urn A is chosen, and $B$ the event that Urn B is chosen.
    Since a fair coin is flipped:
    $P(A) = 0.5,\quad P(B) = 0.5$

    Let $R$ be the event that a red ball is drawn.
    From Urn A: $P(R\mid A) = \dfrac{4}{4+6} = \dfrac{4}{10} = 0.4$
    From Urn B: $P(R\mid B) = \dfrac{7}{7+3} = \dfrac{7}{10} = 0.7$

    Using the Law of Total Probability:

    $P(R) = P(R\mid A)P(A) + P(R\mid B)P(B) = (0.4)(0.5) + (0.7)(0.5) = 0.20 + 0.35 = 0.55$
    "
    :::
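The urn calculation generalises to any finite partition; this is a small sketch (the function name `total_probability` is my own) that checks the answer above.

```python
# Law of total probability over a finite partition {A_i}:
#   P(R) = sum_i P(R | A_i) P(A_i)
def total_probability(priors, conditionals):
    """priors: P(A_i) for a partition; conditionals: P(R | A_i)."""
    assert abs(sum(priors) - 1.0) < 1e-9, "priors must sum to 1"
    return sum(p * c for p, c in zip(priors, conditionals))

# Urn question: P(A) = P(B) = 0.5, P(R|A) = 0.4, P(R|B) = 0.7
p_red = total_probability([0.5, 0.5], [0.4, 0.7])
print(round(p_red, 2))  # 0.55
```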

:::question type="MCQ" question="A factory has two machines, M1 and M2, which produce 60% and 40% of the total output, respectively. Machine M1 produces 3% defective items, while Machine M2 produces 5% defective items. If a randomly selected item is found to be defective, what is the probability that it was produced by Machine M2?" options=["0.05", "0.40", "0.5263", "0.02"] answer="0.5263" hint="Apply Bayes' Theorem. Let $D$ be the event that an item is defective. You need to find $P(M_2\mid D)$." solution="Let $M_1$ be the event that an item is from Machine 1, and $M_2$ the event that it is from Machine 2.
    Let $D$ be the event that an item is defective.

    Given:
    $P(M_1) = 0.60,\quad P(M_2) = 0.40$
    $P(D\mid M_1) = 0.03,\quad P(D\mid M_2) = 0.05$

    We want $P(M_2\mid D)$. Using Bayes' Theorem:

    $P(M_2\mid D) = \dfrac{P(D\mid M_2)P(M_2)}{P(D)}$

    First, calculate $P(D)$ using the Law of Total Probability:

    $P(D) = P(D\mid M_1)P(M_1) + P(D\mid M_2)P(M_2) = (0.03)(0.60) + (0.05)(0.40) = 0.018 + 0.020 = 0.038$

    Now substitute into Bayes' Theorem:

    $P(M_2\mid D) = \dfrac{(0.05)(0.40)}{0.038} = \dfrac{0.020}{0.038} \approx 0.526315$

    Rounding to four decimal places, $P(M_2\mid D) \approx 0.5263$."
    :::
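The factory question condenses to a two-line Bayes computation; a minimal sketch using the numbers stated in the question:

```python
# Reverse probability for the factory question:
# which machine produced a defective item?
priors = {"M1": 0.60, "M2": 0.40}        # share of total output per machine
defect_rates = {"M1": 0.03, "M2": 0.05}  # P(D | machine)

# Law of total probability: P(D)
p_defect = sum(priors[m] * defect_rates[m] for m in priors)

# Bayes' theorem: P(M2 | D)
posterior_m2 = priors["M2"] * defect_rates["M2"] / p_defect

print(round(p_defect, 3), round(posterior_m2, 4))  # 0.038 0.5263
```

Note that M2 produces only 40% of the output yet accounts for over half of the defectives, because its defect rate is higher.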

    What's Next?

    Continue Your CMI Journey

    With a solid understanding of Bayes-type reasoning and conditional probability, you are well-prepared to delve into the broader landscape of probability theory. The concepts learned here, particularly the foundational idea of updating beliefs with new information, are crucial for future chapters. You should now proceed to explore Discrete Random Variables and their Probability Distributions, followed by Continuous Random Variables and their Probability Distributions. These topics build directly upon the principles of probability to introduce methods for quantifying uncertainty and variability, leading naturally into Expected Value and Variance and eventually Sampling Distributions and Statistical Inference.

    🎯 Key Points to Remember

    • βœ“ Master the core concepts in Bayes-type reasoning before moving to advanced topics
    • βœ“ Practice with previous year questions to understand exam patterns
    • βœ“ Review short notes regularly for quick revision before exams
