Originally written by John Webb

tw: fictionalised statistics of disease rates.

———

Today there are many tests that are widely used to detect life-threatening diseases early. How effective are they? Should they be believed?

At a routine checkup, your doctor tells you that there is a simple and inexpensive blood test that can detect a rare but particularly
nasty form of cancer. You agree to have the test done, and the doctor takes a blood sample and sends it off to the pathology laboratory.

Two days later the doctor calls to tell you that the test has come up positive. The good news is that the cancer can be cured since it
has been caught at an early stage. The bad news is that the treatment, though effective, is very expensive and has a number of unpleasant side-effects.

Before agreeing to treatment you need to do a little bit of basic arithmetic.

The first thing to ask the doctor about is the accuracy of the cancer test when applied to people like you (of your age, gender, race, etc.). The doctor tells you that when the test is administered to somebody like you with this form of cancer, it will detect the disease 99.9\% of the time. That means that the test misses the cancer in one out of a thousand patients who have it.

On the other hand, the test can give a positive result when the patient does not have the cancer. This is very rare: it has been found to give a positive reading in 0.1\% of all subjects who do not have the cancer.

That means the test registers a false positive result for one in a thousand people who do not have the cancer.

So far, so good. The test sounds accurate. But there is one more important factor: the actual prevalence of this rare form of cancer, which is said to be 0.4\%. That means that four in every thousand people have it.

The time has come to do a little arithmetic.

Suppose that the test is administered to a random sample of a million people like you. Of that group, 0.4\% have the disease, which comes to $0.004 \times 1~000~000 = 4~000$ people.

Of those 4~000, the test detects 99.9\%, i.e. $0.999 \times 4~000 = 3~996$ people. The remaining four are missed.

But of the sample group of a million, 996~000 do not have the disease, and the test will register false positives on 0.1\% of this group, i.e. $0.001 \times 996~000 = 996$ people.

So when your test has come out positive, how do you know whether you are one of the 3996 who have the disease and were correctly identified, or one of the 996 who don’t have the disease but have been incorrectly diagnosed?

You don’t. All you can say is that you are one of 4992 people who tested positive, of whom 3996 have the disease. That gives you roughly an 80\% chance of having it, so about one positive result in five is a false alarm.
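The counting argument can be reproduced in a few lines of Python. This is only a sketch using the figures quoted in the text (a 99.9\% detection rate, a 0.1\% false positive rate, a prevalence of 0.4\% and a sample of a million people). The function name `positive_test_breakdown` and its parameter names are made up for illustration, and exact fractions are used so that no rounding error creeps in.

```python
from fractions import Fraction


def positive_test_breakdown(population, prevalence, detection_rate, false_positive_rate):
    """Count true and false positives and return the chance that a
    person with a positive result actually has the disease.

    All names here are illustrative; the rates are passed as Fractions.
    """
    diseased = population * prevalence            # people who have the disease
    healthy = population - diseased               # people who do not
    true_positives = diseased * detection_rate    # diseased and flagged by the test
    false_positives = healthy * false_positive_rate  # healthy but flagged anyway
    chance = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, chance


# Figures quoted in the text.
tp, fp, chance = positive_test_breakdown(
    population=1_000_000,
    prevalence=Fraction(4, 1000),           # 0.4%
    detection_rate=Fraction(999, 1000),     # 99.9%
    false_positive_rate=Fraction(1, 1000),  # 0.1%
)

print(f"{tp} true positives, {fp} false positives, {float(chance):.1%} chance")
# prints: 3996 true positives, 996 false positives, 80.0% chance
```

Changing `prevalence` shows how strongly the answer depends on how rare the disease is: with `Fraction(1, 1000)` the same test gives a chance of exactly one half.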
