The DBA team here at Coeo have spent a fair amount of time over the last few months discussing and debating the Netflix series ‘Making a Murderer’. If you haven't seen it already, I thoroughly recommend it. It’s a documentary based on the conviction of an American man, Steven Avery, for murder and, although I don't want to give too much away, the series does begin with a wrongful conviction.
Recently I’ve been revising for a data science statistics exam and have been pointed in the direction of a number of articles about wrongful convictions based on the misunderstanding of statistics.
Earthquake Experts Convicted of Manslaughter
The first example is of a group of scientists who, in summary, were convicted of manslaughter after stating that an earthquake was unlikely to occur, and then it did. In reality it was a little more complex: although a bright person may understand that a low-probability event can still occur, the judge's point was that the scientists had not conducted the due diligence they should have.
All but one of the scientists were later acquitted, but it raises the question: should data scientists be held responsible for their decisions?
Mother Wrongly Convicted of Killing Her Sons
The second case was of a woman who was convicted of murdering her two children, who she said had died of cot death. The doctor claimed that, as the probability of one child dying of SIDS was around 1 in 8,500, the probability of two children in the same family dying was that figure squared, around 1 in 73 million (numbers not exact). Squaring assumes that the two events are completely independent, but really we know that, in all likelihood, they are not. In fact, once one child has died, the chance of a second child dying is more like 1 in 200.
It does make you think about how easily people who wield statistics can mislead others, even when their figures are completely wrong.
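To see how much the independence assumption matters in the cot death case, here is a minimal sketch of the arithmetic. The single-death figure of 1 in 8,500 is an illustrative assumption for the example, the 1 in 200 figure is the one quoted above, and the variable names are my own.

```python
# Illustrative assumption: odds of a single cot death in one family (1 in N).
one_in_first = 8_500
# Figure quoted in the text: odds of a second death once a first has occurred.
one_in_second_given_first = 200

# Treating the two deaths as independent squares the single-death odds.
independent_odds = one_in_first ** 2
# Allowing for dependence multiplies by the conditional odds instead.
dependent_odds = one_in_first * one_in_second_given_first

# How many times rarer the independence assumption makes the event look.
overstatement = independent_odds / dependent_odds

print(f"Assuming independence: 1 in {independent_odds:,}")
print(f"Allowing dependence:   1 in {dependent_odds:,}")
print(f"Rarity overstated by a factor of {overstatement}")
```

Under these assumed figures the independent calculation makes a double death look dozens of times rarer than the dependent one does, which is the heart of the problem.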
The Prosecutor's Fallacy
The presentation of evidence based on probabilities has many pitfalls, but there is one in particular that has led to a number of acquittals: The Prosecutor’s Fallacy. The fallacy occurs when you confuse the probability that a randomly chosen person matches the evidence (a DNA profile in a database, say, or a description) with the probability that a person who does match is innocent.
An example of the Prosecutor’s Fallacy would be testing a defendant’s blood sample for a DNA match to a piece of evidence. Given the degraded nature of the sample, the probability of a random person matching is 1 in 1000. The prosecutor would claim that, given the defendant matched, there was a 99.9% chance of them being guilty.
In fact, if the crime took place in the city of Birmingham, which has around 2 million people, and we expect a match 1 in 1000 times, then there are around 2000 people in Birmingham who match. This would mean that, based on the DNA evidence alone, the defendant had a 1 in 2000 chance of being guilty. A 0.05% chance of being guilty is very different from a 99.9% chance.
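The Birmingham arithmetic can be sketched in a few lines. This is a rough illustration rather than a proper forensic calculation: it assumes exactly one guilty person, that everyone in the city is equally likely to be that person before the DNA evidence, and it uses the approximate population figure from the example. The function names are hypothetical.

```python
def prosecutor_claim(one_in_n: int) -> float:
    """The fallacious figure: the complement of the random-match probability."""
    return 1 - 1 / one_in_n


def chance_of_guilt_given_match(population: int, one_in_n: int) -> float:
    """Chance that a matching person is the guilty one, on DNA evidence alone.

    Assumes one guilty person and no other evidence, so every matching
    person in the population is an equally plausible suspect.
    """
    expected_matches = population / one_in_n
    return 1 / expected_matches


population = 2_000_000  # rough population used in the example
one_in_n = 1_000        # a random person matches 1 time in 1000

claimed = prosecutor_claim(one_in_n)
actual = chance_of_guilt_given_match(population, one_in_n)

print(f"Prosecutor's claim:         {claimed:.1%}")
print(f"Based on DNA evidence alone: {actual:.2%}")
```

The two numbers answer different questions: the first is how unlikely a random match is, the second is how likely a matching person is to be the culprit.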
This isn’t restricted to the prosecution: there’s also the defendant’s fallacy, which does make me wonder how common these mistakes are. Maybe some more probability work to come…