INVESTIGATIVE JOURNALISM

Counting the dead: How statistics can find unreported killings

When investigative journalist Sheila Coronel and her team began counting drug-related killings in the Philippines last year, they turned to statistician Patrick Ball to reveal the truth.

When investigative journalist Sheila Coronel and her team began counting drug-related killings in the Philippines last year, they encountered a problem: Many of the people who had been killed in President Rodrigo Duterte’s brutal war on drugs didn’t show up in police records or media reports. In some cases, even the local priests hadn’t heard about their deaths. One priest told the reporters he’d only learned about a homicide when he smelled a rotting corpse and followed the stench to it.

How many other killings had gone unreported? Coronel and her team wondered.

The journalists enlisted the help of Patrick Ball, a statistician with the San Francisco-based Human Rights Data Analysis Group.


Ball analyzed the data reporters had collected from a variety of sources – including on-the-ground interviews, police records, and human rights groups – and used a statistical technique called multiple systems estimation to estimate the number of unreported deaths in three areas of the capital city, Manila.
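To give a rough sense of how multiple systems estimation works, the sketch below shows its simplest two-list form, often called capture-recapture: the more two independent lists of victims overlap, the fewer deaths both are likely to have missed. This is not the model Ball used, which the article does not describe. The function name, the victim IDs, and the list sizes are all hypothetical, and the calculation assumes the two lists are independent and that records have already been matched across sources.

```python
def chapman_estimate(list_a: set, list_b: set) -> float:
    """Two-list capture-recapture estimate of the total population.

    Uses Chapman's variant of the Lincoln-Petersen estimator:
        N = (n_a + 1) * (n_b + 1) / (m + 1) - 1
    where n_a and n_b are the list sizes and m is their overlap.
    Assumes the lists are independent samples of the same population.
    """
    n_a = len(list_a)
    n_b = len(list_b)
    overlap = len(list_a & list_b)
    return (n_a + 1) * (n_b + 1) / (overlap + 1) - 1


# Hypothetical, already de-duplicated victim IDs (not real data).
police_records = {f"victim-{i}" for i in range(0, 60)}   # 60 names
press_reports = {f"victim-{i}" for i in range(40, 90)}   # 50 names, 20 shared

# Deaths that appear on at least one list.
documented = len(police_records | press_reports)

# Estimated total, including deaths that appear on neither list.
estimated_total = chapman_estimate(police_records, press_reports)
estimated_unreported = estimated_total - documented

print(f"documented: {documented}")
print(f"estimated total: {estimated_total:.0f}")
print(f"estimated unreported: {estimated_unreported:.0f}")
```

With three or more sources, as in the reporters' data, analysts generally turn to more elaborate models that can account for dependence between lists; the two-list case only illustrates the underlying idea.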

The team discovered that the number of drug-related killings was much higher than police had reported. The journalists, who published their findings last month in The Atlantic, documented 2,320 drug-linked killings over an 18-month period, approximately 1,400 more than the official number. Ball’s statistical analysis, which estimated the number of killings the reporters hadn’t heard about, found that close to 3,000 people could have been killed – more than three times the police figure.
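The figures in this paragraph can be checked with simple arithmetic. The short sketch below uses only the rounded numbers quoted above, so the derived police figure and ratio are approximations rather than reported values; the variable names are illustrative.

```python
# Rounded figures as quoted in the article.
documented_by_reporters = 2320   # killings the journalists documented
excess_over_official = 1400      # "approximately 1,400 more" than police
estimated_total = 3000           # "close to 3,000" from the statistical analysis

# Implied official police count (approximate).
implied_official = documented_by_reporters - excess_over_official

# How the statistical estimate compares with the police figure.
ratio_to_official = estimated_total / implied_official

# Killings the estimate suggests the reporters had not heard about.
implied_undocumented = estimated_total - documented_by_reporters

print(f"implied official count: about {implied_official}")
print(f"estimate vs. official: about {ratio_to_official:.1f}x")
print(f"implied undocumented killings: about {implied_undocumented}")
```

The implied police count of roughly 920 puts the estimate at a little over three times the official figure, consistent with the article's description.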

Ball said there are both moral and technical reasons for making sure everyone who has been killed in mass violence is counted.

“The moral reason is because everyone who has been murdered should be remembered,” he said. “A terrible thing happened to them and we have an obligation as a society to justice and to dignity to remember them.”

Patrick Ball explains a new model for multiple systems estimation to some visitors to the Human Rights Data Analysis Group.

Ball began applying data analysis to human rights violations in the early 1990s, when he traveled to El Salvador with Peace Brigades International, a group that accompanies local activists. A Salvadoran church asked Ball for help indexing files, and he ended up creating a database of crimes reported to the non-governmental Human Rights Commission. Ball was able to compare that database with the career histories of Salvadoran military officers to determine who was in charge in a particular area when a crime took place.

Since then, the Human Rights Data Analysis Group has used statistics to estimate the number of killings in conflicts around the world, including the civil war in Syria, and to calculate the number of homicides committed by police in the United States. Ball also served as an expert witness in the genocide case against Guatemala’s General Efraín Ríos Montt and had to leave the country when he and other witnesses faced threats.

Although Ball has worked with journalists in the past, his collaboration with Coronel and her team was his first experience being closely involved in a journalistic investigation. Ball said he sees an opportunity for more collaboration between reporters and statisticians.

“I always urge journalists not to try to do statistics on their own,” he said.

Sheila Coronel goes through records of drug-related killings in the Philippines.

Coronel, the director of the Toni Stabile Center for Investigative Journalism at Columbia University, agreed that data scientists can play an important role in investigative journalism.

“A lot of the work that investigative journalists do is trying to figure out the magnitude of the wrongdoing,” she said. “There are limits with what we can do with documents and data given the lack of documents and the lack of data. I think machine learning and statistical modeling provides a way for us to be able to get a bigger grasp of the problems we are investigating.”