Saturday, December 02, 2006

The Interpretation Of Statistical Tests

by Albert Frank

In this article, we adopt the following hypothesis: if the reliability of a dichotomous test is f, then the probability that it gives a wrong result is 1-f.

The following question arises: below what reliability does a positive test result have a probability of less than 0.5 of being correct?

Let P be the number of elements in the population, a the known probability that an element of this population has a given feature K, and f the reliability of the test. The number of K-elements correctly detected by the test equals a f P. The number of non-K elements wrongly detected is (1-a)(1-f)P. The probability that an element flagged by the test is actually a K-element is therefore a f P / [a f P + (1-a)(1-f)P]. This equals 0.5 when a f P = (1-a)(1-f)P, which reduces to f = 1-a. So, as soon as f ≤ 1-a, a positive result is more likely to be wrong than right, and the test becomes nonsense.
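As a minimal sketch in Python (the function name is ours, chosen for illustration), the probability that a detection is valid follows directly from a and f:

    # Probability that a positive result is a true positive,
    # given base rate a and test reliability f.
    def prob_detection_valid(a, f):
        true_positives = a * f                  # K-elements correctly flagged
        false_positives = (1 - a) * (1 - f)    # non-K elements wrongly flagged
        return true_positives / (true_positives + false_positives)

    # At f = 1 - a the probability is exactly 0.5:
    print(prob_detection_valid(0.01, 0.99))  # -> 0.5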

The rarer the feature a test attempts to detect, the more reliable the test must be.

This simple fact is very often neglected.

Let's take an example: the alcohol test. We assume that one driver in 100 is at "0.8 or more" (the European norm for a serious offence is a blood alcohol level in excess of 0.8 g/l). In the following table, we examine, for several reliabilities of the test, the probability that somebody with a positive test is actually positive. We take a population of 100,000 persons, of whom 1,000 are assumed to be "at 0.8 or more."

Reliability of the test | Valid detections | Invalid detections | Probability a "detection" is valid
0.9999                  | 1,000            | 10                 | 0.99
0.999                   | 999              | 99                 | 0.91
0.99                    | 990              | 990                | 0.50
0.95                    | 950              | 4,950              | 0.16
0.9                     | 900              | 9,900              | 0.08
0.8                     | 800              | 19,800             | 0.04
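The table can be reproduced with a few lines of Python (a sketch using the assumptions above: a population of 100,000 and a base rate of 1 in 100):

    population = 100_000
    a = 0.01  # assumed: one driver in 100 at 0.8 g/l or more

    for f in (0.9999, 0.999, 0.99, 0.95, 0.9, 0.8):
        valid = a * f * population                 # true positives
        invalid = (1 - a) * (1 - f) * population   # false positives
        print(f, round(valid), round(invalid), round(valid / (valid + invalid), 2))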

We can imagine the dangers of such misinterpretation of tests in, for example, the medical field.


1 comment:

Renaissance said...

It's not just medical tests that can result in incorrect interpretations.

The same applies to tests meant to identify someone as a terrorist, for example airport passenger screening, data-mining of citizen call records, or random roving wiretaps. In each case, the base rate P(terrorist) is known to be small ("optimistically" 1,000 out of 300 million, or about 0.00033%, but more likely smaller) and the reliability f(test) is either uncertain or not very high (optimistically on the order of 80-90%, but probably worse). The false positives swamp any true positives the test could ever detect. Worse, false positives further degrade f(test) because of the "boy who cried wolf" effect.
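Plugging these figures into the article's formula makes the point concrete (a Python sketch; the 1,000-in-300-million base rate and 90% reliability are the optimistic assumptions above):

    a = 1_000 / 300_000_000  # assumed base rate of terrorists in the population
    f = 0.90                 # assumed reliability of the screening test

    true_pos = a * f
    false_pos = (1 - a) * (1 - f)
    print(true_pos / (true_pos + false_pos))  # ~0.00003: roughly 1 in 33,000 positives is real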

As the cost of imposing such tests is demonstrably high (GDP, liberties, etc.), it raises the question: why do we use them, and how deluded are we in believing in their efficacy? The cost-benefit case is dubious at best.

This obvious result is what leads some to wonder whether the target isn't terrorists but rather anyone who doesn't agree with the Administration and/or GOP ideology. In that case, the base rate P is large enough to put such tests into the range of effectiveness.