An independent investigation found evidence of racial and economic bias in a crime-prevention model used by police departments in at least nine U.S. states.
What’s new: Geolitica, a service that forecasts where crimes will occur, disproportionately targeted Black, Latino, and low-income populations, according to an analysis of leaked internal data by Gizmodo and The Markup. The reporters found the data on an unsecured police website. Geolitica, formerly called PredPol, changed its name in March.
How it works: The model predicts where crimes are likely to occur, helping police departments allocate personnel. The company trains a separate model for each jurisdiction on two to five years of crime dates, locations, and types.
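The per-jurisdiction setup described above can be sketched roughly as follows. This is a minimal illustration with made-up records and a stand-in frequency "model" (the actual PredPol/Geolitica algorithm is proprietary); the field names and grid size are assumptions.

```python
# Sketch of per-jurisdiction training (hypothetical schema; the real
# PredPol model is proprietary). One model is fit per jurisdiction on
# that jurisdiction's crime dates, locations, and types.
from collections import Counter, defaultdict

# Each record: (jurisdiction, date, lat, lon, crime_type) — made-up data.
crimes = [
    ("TownA", "2019-03-01", 34.05, -118.24, "burglary"),
    ("TownA", "2019-03-02", 34.06, -118.25, "theft"),
    ("TownB", "2020-07-10", 40.71, -74.00, "assault"),
]

# Partition the records so each jurisdiction gets its own training set.
by_jurisdiction = defaultdict(list)
for record in crimes:
    by_jurisdiction[record[0]].append(record[1:])

def train(records, cell=0.01):
    """Stand-in 'model': historical crime counts per grid cell, which a
    forecaster could use to rank cells for patrol."""
    counts = Counter()
    for _, lat, lon, _ in records:
        counts[(round(lat / cell), round(lon / cell))] += 1
    return counts

models = {j: train(r) for j, r in by_jurisdiction.items()}
```

Separate models mean a bias in one city's reporting data stays confined to that city's forecasts, but it also means each model inherits whatever disparities its own jurisdiction's records contain.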
- The reporters filtered out jurisdictions with fewer than six months of data, leaving 5.9 million crime predictions from 38 U.S. jurisdictions between February 15, 2018, and January 30, 2021.
- They compared the output with census data that shows the geographic distribution of racial and socioeconomic groups. PredPol was more likely to predict crimes in areas with high numbers of Black and Latino residents in 84 percent of jurisdictions. It was less likely to target areas with high numbers of White residents in 74 percent of jurisdictions. The most-targeted areas included a higher proportion of lower-income households in 71 percent of jurisdictions.
- The reporters found no strong correlation between the system’s predictions and arrest rates provided by 11 police departments.
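The census comparison described above can be sketched as follows. This is a simplified illustration with invented numbers, not the reporters' actual code: for each jurisdiction, it checks whether the most-targeted census block groups have a higher average share of Black and Latino residents than the least-targeted ones.

```python
# Sketch of the reporters' comparison (hypothetical data and field names):
# join prediction counts with census demographics, then see whether the
# most-targeted block groups skew toward Black and Latino residents.
from collections import defaultdict

# Each record: (jurisdiction, block_group, prediction_count, pct_black_latino)
records = [
    ("TownA", "bg1", 520, 0.61), ("TownA", "bg2", 310, 0.47),
    ("TownA", "bg3", 40, 0.12), ("TownA", "bg4", 15, 0.08),
    ("TownB", "bg1", 200, 0.22), ("TownB", "bg2", 180, 0.55),
    ("TownB", "bg3", 30, 0.30), ("TownB", "bg4", 10, 0.25),
]

by_jurisdiction = defaultdict(list)
for juris, _bg, preds, pct in records:
    by_jurisdiction[juris].append((preds, pct))

def targets_minority_areas(blocks):
    """True if the top half of block groups by prediction count has a
    higher mean Black/Latino share than the bottom half."""
    blocks = sorted(blocks, reverse=True)  # most-targeted first
    half = len(blocks) // 2
    top = sum(p for _, p in blocks[:half]) / half
    bottom = sum(p for _, p in blocks[half:]) / (len(blocks) - half)
    return top > bottom

flagged = {j: targets_minority_areas(b) for j, b in by_jurisdiction.items()}
```

With these invented numbers, both jurisdictions would be flagged; the investigation's analogous tally across real data is where the 84 percent figure comes from.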
Sources of bias: Critics point to pervasive biases in the models’ training data as well as potential adverse social effects of scheduling patrols according to automated crime predictions.
- The training data was drawn from crimes reported to police. The U.S. Bureau of Justice Statistics found that only around 40 percent of violent crimes and 33 percent of property crimes were reported in 2020, leaving many possible crimes unaccounted for. Moreover, people who earned $50,000 or more reported crimes 12 percent less frequently than those who earned $25,000 or less, which would skew the dataset toward less wealthy neighborhoods.
- Because the models are trained on historical data, they learn patterns that reflect documented disparities in police practices. Black people were more likely to be arrested than White people in 90 percent of jurisdictions in the study, according to an FBI report, the authors wrote.
- Such algorithms perpetuate patrols of areas that are already heavily patrolled, leading to arrests for minor offenses that tend to receive scant attention elsewhere, critics said.
The response: Geolitica confirmed that the data used in the investigation “appeared to be” authentic, but it took issue with the analysis:
- The data was “erroneous” and “incomplete,” the company said. One jurisdiction that showed extreme disparities had misused the software, leading to extra predictions.
- The models aren’t trained on demographic, ethnic, or socioeconomic information, which “eliminates the possibility for privacy or civil rights violations seen with other intelligence-led or predictive policing models,” the company said. However, research has shown that learning algorithms can absorb biases in datasets that don’t explicitly label biased features.
Why it matters: Over 70 U.S. law enforcement jurisdictions use Geolitica’s service, and it is used in other countries as well. Yet this report is the first independent analysis of the algorithm’s performance based on internal data. Its findings underscore concerns that predictive policing systems invite violations of civil liberties, which have prompted efforts to ban such applications.
We’re thinking: Predictive policing can have a profound impact on individuals and communities. Companies that offer such high-stakes systems should audit them for fairness and share the results proactively rather than waiting for data leaks and press reports.