Reason and Rhyme: The Statistics of Stereotyping

Tuesday, December 19, 2006

The Statistics of Stereotyping

In his article "The Interpretation of Statistical Tests," Albert Frank provides formulas and examples of the errors one can fall into when applying a test of known high reliability to determine whether subjects taken at random qualify as members of a select (generally negatively perceived) group. This is the problem with "racial profiling" that "Renaissance" identified in his posted comment, and it is one that many fail to understand properly.

Let us take as a given that a particular, readily identifiable racial type is represented at a much higher frequency among the perpetrators of some sort of crime. This could be theft, murder, terrorism, or whatever the statistics provide convincing "justification" for. And let us suppose further that the statistics used are completely valid, such that, for example, although one race constitutes only 10% of the total population, its members who perpetrate said crime outnumber those of the majority racial type who also perpetrate it. Why would profiling in such cases be unwarranted even (or especially) from a mathematical perspective?

Here's why.

Suppose there is a test that can be applied to individuals and that is extremely reliable (with reliability f, as defined in the original article) in determining whether an individual has already committed (or will in the future commit) said crime. Nothing in the justification statement given above has any direct bearing on the appropriateness of implementing such a program. Although those may be completely valid statistics, they are not sufficient to determine the efficacy of the program they attempt to justify. The appropriate statistic is the probability that an individual of the subject race will commit such a crime, which is the parameter a in Albert Frank's article. This number will always be small, much smaller than the probability that an individual who has already committed the crime is a member of the subject race. This may seem like a subtle difference, but it is not!

Suppose the racial mix of a population is only 10% A and 90% B, and that some fraction a_a of those who commit crime C are from the subset A. So far we have nothing to go on. We need to know the fraction of the entire population who commit crime C. Let c be that probability. Then we can determine the likelihood of a member of A or B committing that crime. If we define a and b as the probabilities that members of A and B will, respectively, commit the crime, we can then solve the problem using what we know as follows:

a * (0.1) + b * (0.9) = c, and

a * (0.1) / c = a_a

Given a_a (which is usually all that is given, and which is usually insinuated as though it were a itself, which it is not) and any one of a, b, or c, we can determine the effectiveness of profiling for a given reliability of testing.

Let's say by way of example that one in a thousand (c = 0.1% = 0.001) of the total population commit the crime. Then for a_a = 0.6 we would have:

a = a_a * c / 0.1 = a_a / 100 = 0.006 and b = ( 0.001 − 0.1 * a ) / 0.9 = 0.00044.
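The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name group_rates and its parameter share_a (the fraction of the population belonging to A, 0.1 in this example) are my own labels for illustration, not anything taken from Albert Frank's article.

```python
def group_rates(a_a, c, share_a=0.1):
    """Solve the two equations in the text for a and b.

    a_a     -- fraction of those who commit crime C who belong to group A
    c       -- fraction of the whole population who commit crime C
    share_a -- fraction of the population in group A (hypothetical name)
    """
    share_b = 1.0 - share_a
    # From a * share_a / c = a_a
    a = a_a * c / share_a
    # From a * share_a + b * share_b = c
    b = (c - a * share_a) / share_b
    return a, b


a, b = group_rates(a_a=0.6, c=0.001)
print(f"a = {a:.6f}, b = {b:.6f}")
```

Note that a comes out at 0.006, a hundred times smaller than the a_a = 0.6 that is usually quoted.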

Since in racial profiling the population is effectively reduced to that of A rather than the much larger A + B, it is a (as defined here) that corresponds to the same term in Albert Frank's article. So for profiling to avoid producing law enforcement nonsense, the test applied to an individual once he is singled out must be so reliable that 1 − f is much less than 0.006; otherwise there will be as many or more unlawful (false positive) arrests as lawful ones. And in law enforcement a reliability as high as even 0.9 (let alone the required 0.994) is unheard of!
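To see where the 0.994 threshold comes from, here is a rough sketch in Python. It rests on an assumption of mine that should be stated loudly: I take f to be both the chance that the test flags a real offender and the chance that it clears a non-offender. The function name fraction_lawful is hypothetical and used only for illustration.

```python
def fraction_lawful(f, a):
    """Fraction of positive test results that point at actual offenders.

    f -- test reliability (ASSUMED to apply equally to offenders and
         non-offenders; this is my reading, not a quoted definition)
    a -- probability that a tested individual commits the crime
    """
    true_positives = f * a                    # offenders correctly flagged
    false_positives = (1.0 - f) * (1.0 - a)   # non-offenders wrongly flagged
    return true_positives / (true_positives + false_positives)


a = 0.006
for f in (0.9, 0.99, 0.994, 0.999):
    print(f"f = {f}: lawful fraction of arrests = {fraction_lawful(f, a):.3f}")
```

Under that assumption, lawful and unlawful arrests break even exactly when f = 1 − a, which is 0.994 here; at f = 0.9 only about one arrest in twenty would be lawful.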

Therefore, quite aside from the impertinence of the practice, it is a very ineffective approach to fighting crime.

Doodles in Author's Anthropology Class Notes from 1962