AI Analysis of Body Camera Videos Offers a Data-Driven Approach to Police Reform

Examining body camera videos at scale reveals racial differences in how police treat drivers during traffic stops—and which corrective programs really work


A police officer in Santa Fe, N.M., wears a body camera.

Shiiko Alexander/Alamy Stock Photo

A decade ago, then President Barack Obama proposed spending $75 million over three years to help states buy police body cameras and expand their use. The move came in the wake of the killing of teenager Michael Brown, for which no body camera footage existed, and was designed to increase transparency and build trust between police and the people they served.

Since the first funds were allocated in 2015, tens of millions of traffic stops, accidents, street stops, arrests and other encounters have been recorded with these small digital devices, which police attach to their uniform or winter jacket. The footage has proved useful as evidence in disputed incidents such as the one that led to the death of George Floyd in Minneapolis in 2020. Use of the cameras may also deter bad behavior by police in their interactions with the public.

But unless something tragic happens, body camera footage generally goes unseen. “We spend so much money collecting and storing this data, but it’s almost never used for anything,” says Benjamin Graham, a political scientist at the University of Southern California.


Graham is among a small number of scientists who are reimagining this footage as data rather than just evidence. Their work leverages advances in natural language processing, which relies on artificial intelligence, to automate the analysis of video transcripts of citizen-police interactions. The findings have enabled police departments to spot policing problems, find ways to fix them and determine whether the fixes improve behavior.

Only a small number of police agencies have opened their databases to researchers so far. But if this footage were analyzed routinely, it would be a “real game changer,” says Jennifer Eberhardt, a Stanford University psychologist, who pioneered this line of research. “We can see beat-by-beat, moment-by-moment how an interaction unfolds.”

In papers published over the past seven years, Eberhardt and her colleagues have examined body camera footage to reveal how police speak to white and Black people differently and what type of talk is likely to either gain a person’s trust or portend an undesirable outcome, such as handcuffing or arrest. The findings have refined and enhanced police training. In a study published in PNAS Nexus in September, the researchers showed that the new training changed officers’ behavior.

“By taking on these types of studies and making improvements in your department, it helps actually to build trust in communities that have really low trust levels,” says LeRonne Armstrong, former chief of police of California’s Oakland Police Department, which has had a long-standing collaboration with the Stanford team.

The approach is slowly catching on. Inspired by the Stanford findings, the Los Angeles Board of Police Commissioners, which oversees the Los Angeles Police Department (LAPD), asked U.S.C. for help making sense of the department’s footage. A project to analyze 30,000 body camera videos spanning a year of traffic stops is now underway. And the Stanford group is also partnering with the San Francisco Police Department to use body camera footage to evaluate a program in which its officers travel to Birmingham, Ala., to learn about the Civil Rights Movement and the principles of nonviolence.

The Stanford work began in 2014 in the wake of a scandal involving the Oakland Police Department. Four Oakland, Calif., police officers known as “the Riders” had been accused of roughing up and arresting innocent people and planting drugs on them, among other crimes, back in the late 1990s. Of the 119 plaintiffs, 118 were Black. So as part of the $10.9-million settlement agreement, the department was required to collect data on vehicle and pedestrian stops and analyze them by race. More than a decade after the agreement was reached, the department’s federal monitor reached out to Eberhardt for help.

Plaintiffs’ attorneys told Eberhardt that what they most wanted to know was what happened after the cruiser lights came on—why officers were stopping people and how the interactions proceeded. The department was an early adopter of body cameras, which it had put into service about five years before. “You actually have footage,” Eberhardt recalls telling them, though no one at the department had thought to use it for that purpose.

Eberhardt recruited Dan Jurafsky, a Stanford linguist and computer scientist, and his then student Rob Voigt, now a computational linguist at Northwestern University, to develop an automated way to analyze video transcripts for nearly 1,000 traffic stops. The researchers decided to measure whether officers were speaking less respectfully to Black drivers than to white ones. They first had people rate the respectfulness of excerpts from the transcripts. Then they built a computational model that associated the ratings with various words or phrases and gave those utterances numerical weights. Expressing concern for the driver, for example, was rated as highly respectful, while addressing them by their first name was less respectful.
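The modeling step described here—human ratings tied to phrase-level features that receive numerical weights—can be sketched as a simple linear model. Everything below is illustrative: the excerpts, the tiny phrase lexicon and the ratings are invented stand-ins, not the study’s data or features.

```python
import numpy as np

# Hypothetical miniature training set: transcript excerpts paired with
# human "respect" ratings (the real study crowd-rated excerpts from
# nearly 1,000 Oakland traffic stops; these phrases are invented).
excerpts = [
    ("drive safe now, sorry for the delay", 5.0),
    ("thanks for your patience, ma'am", 4.5),
    ("do you mind if I see your license", 3.5),
    ("hands on the wheel", 1.5),
    ("listen, steve, keep your hands there", 1.0),  # first-name address
]

# Tiny illustrative lexicon: each utterance becomes a vector of phrase counts.
phrases = ["safe", "sorry", "thanks", "ma'am", "do you mind",
           "hands on the wheel", "steve"]

def featurize(text):
    t = text.lower()
    return np.array([t.count(p) for p in phrases], dtype=float)

X = np.stack([featurize(text) for text, _ in excerpts])
y = np.array([rating for _, rating in excerpts])

# A least-squares fit assigns each phrase a numerical weight: positive
# weights mark respectful language, low or negative weights the opposite.
weights, *_ = np.linalg.lstsq(X, y, rcond=None)

def respect_score(utterance):
    """Score a new officer utterance using the learned phrase weights."""
    return float(featurize(utterance) @ weights)
```

With weights in hand, every utterance in a month of stops can be scored automatically, which is what lets the analysis scale far beyond what human raters could read.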

The model then gave a respect score to all officer language in a month of traffic stops, and the researchers associated these scores with the race of the person pulled over, among other variables. They found a clear racial disparity in the respectfulness of officers’ language. When speaking to Black drivers, officers were less likely to state the reason for the stop, offer reassurance or express concern for the safety of the driver, for example. The respect gap existed throughout an interaction and did not depend on the race of the officer, the reason for the stop, or its location or outcome.

Those initial results, published in 2017, had a profound impact in Oakland. “When Stanford released the findings, it was almost like a sigh of relief for minority communities,” Armstrong says. “This validated the concerns that people had always felt, and it made the department reexamine how we train our officers to communicate with our community.”

The Stanford team used the findings to develop a “respect” module for a procedural justice training program that the department delivered. Procedural justice seeks to build fairness into policing procedures. In addition to emphasizing respect, it may involve police explaining their actions to others and giving those individuals a chance to provide their perspective. As part of that effort, the team used its computational model to pull out real interactions that were particularly respectful and disrespectful. “As a training example, that seems a lot more legitimate to someone being trained” than made-up scenarios, Jurafsky says. “[Officers] recognize their own language.”

After the training went into effect, the researchers did another body camera study to determine whether officers used what they had learned. The Stanford team compared key features of officer language in 313 stops that occurred up to four weeks before training with those in 302 stops made in the four weeks after training. The researchers found that officers who had gone through training were more likely to express concern for drivers’ safety, offer reassurance and provide explicit reasons for the stop, they reported in their September PNAS Nexus study.

Systematic analysis of body camera footage, Eberhardt says, provides a promising way to understand what kinds of police training are effective. “A lot of those trainings that they have now are just not evaluated rigorously,” she says. “We don’t know whether whatever it is that they’re learning in those trainings… actually translates to real interactions with real people on the street.”

In a study published last year, the Stanford researchers analyzed body camera footage to find language associated with an “escalated outcome” for a traffic stop, such as handcuffing, search or arrest. Using footage from 577 stops of Black drivers in an undisclosed city, they found what Eberhardt calls a “linguistic signature” for escalation in the first 45 words spoken by an officer: giving orders to the driver from the start and not giving the reason for the stop. “The combination of those two was a good signal that the stop was going to end up with the driver being handcuffed, searched or arrested,” she says.
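That two-part signature—orders given, no reason stated, within the first 45 words—can be sketched as a toy check. The cue-phrase lists below are invented for illustration; the study derived its features from real transcripts, and nothing here reproduces its actual classifier.

```python
# Toy check for the "linguistic signature" of escalation: the officer's
# first 45 words include an order but no stated reason for the stop.
# These cue-phrase lists are invented stand-ins, not the study's features.
ORDER_CUES = ("put your", "keep your", "step out", "get out", "turn off")
REASON_CUES = ("the reason", "pulled you over", "stopped you", "because")

def shows_escalation_signature(officer_speech: str) -> bool:
    """Flag a stop whose opening 45 officer words fit the pattern."""
    opening = " ".join(officer_speech.lower().split()[:45])
    gives_order = any(cue in opening for cue in ORDER_CUES)
    states_reason = any(cue in opening for cue in REASON_CUES)
    return gives_order and not states_reason
```

An opening like “Put your hands on the wheel. Get out of the car.” would trip the check, while “The reason I stopped you is a broken taillight” would not.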

None of the stops in the study involved the use of force. But the researchers were curious whether the signature they found would be present in footage of the police interaction that led to Floyd’s death. It was. In the initial 27 seconds of the encounter (about the time it takes for police officers to produce 45 words during stops), the officer gave only orders and did not tell Floyd why he was stopped.

The U.S.C. team has recruited a diverse group of people, including some who have been previously incarcerated and retired cops, to judge interactions captured by LAPD body cameras for politeness, respect and other aspects of procedural justice. The team plans to use advancements in AI to capture these perspectives in ways that may reveal, for example, why a statement intended as funny or deferential may be perceived as sarcastic or disrespectful. “The biggest hope is that our work can improve LAPD officer training, to have a data-driven way of updating and changing the training procedure so that it better fits the populations that they’re serving,” says Morteza Dehghani, a U.S.C. cognitive scientist, who co-leads the project with Graham.

Politics may dissuade police departments from sharing footage with academics. In some cases, departments may be reluctant to surface systematic problems. In the future, however, departments may be able to analyze the footage themselves. Some private firms—such as TRULEO and Polis Solutions—already offer software for that purpose.

“We are getting closer to departments being able to use these tools and not just having it be an academic exercise,” says Nicholas Camp, a social psychologist at the University of Michigan, who has worked on Eberhardt’s team. But commercial models tend not to be fully transparent—users cannot inspect their component modules—so some academics, including Camp and Dehghani, are wary of their output.

The U.S.C. team plans to make the language models it builds, which will be open to inspection, available to the LAPD and other police departments so that they can routinely monitor officers’ interactions with the public. “We should have a lot more detailed information about how these everyday interactions are going. That’s a big part of democratic governance,” Graham says.