Deborah Raji, a fellow at the nonprofit Mozilla, and Genevieve Fried, who advises members of the US Congress on algorithmic accountability, examined over 130 facial-recognition data sets compiled over 43 years. They found that researchers, driven by the exploding data requirements of deep learning, gradually abandoned asking for people’s consent. As a result, more and more of people’s personal photos have been incorporated into systems of surveillance without their knowledge.
It has also led to far messier data sets: they may unintentionally include photos of minors, use racist and sexist labels, or have inconsistent quality and lighting. The trend could help explain the growing number of cases in which facial-recognition systems have failed with troubling consequences, such as the false arrests of two Black men in the Detroit area last year.
People were extremely cautious about collecting, documenting, and verifying face data in the early days, says Raji. “Now we don’t care anymore. All of that has been abandoned,” she says. “You just can’t keep track of a million faces. After a certain point, you can’t even pretend that you have control.”
A history of facial-recognition data
The researchers identified four major eras of facial recognition, each driven by an increasing desire to improve the technology. The first phase, which ran until the 1990s, was largely characterized by manually intensive and computationally slow methods.
But then, spurred by the realization that facial recognition could track and identify individuals more effectively than fingerprints, the US Department of Defense pumped $6.5 million into creating the first large-scale face data set. Over 15 photography sessions in three years, the project captured 14,126 images of 1,199 individuals. The Face Recognition Technology (FERET) database was released in 1996.
The following decade saw an uptick in academic and commercial facial-recognition research, and many more data sets were created. The vast majority were sourced through photo shoots like FERET’s and had full participant consent. Many also included meticulous metadata, Raji says, such as the age and ethnicity of subjects, or illumination information. But these early systems struggled in real-world settings, which drove researchers to seek larger and more diverse data sets.
In 2007, the release of the Labeled Faces in the Wild (LFW) data set opened the floodgates to data collection through web search. Researchers began downloading images directly from Google, Flickr, and Yahoo without concern for consent. LFW also relaxed standards around the inclusion of minors, using photos found with search terms like “baby,” “juvenile,” and “teen” to increase diversity. Web scraping made it possible to create significantly larger data sets in a short time, but facial recognition still faced many of the same challenges as before, which pushed researchers to seek yet more methods and data to overcome the technology’s poor performance.
Then, in 2014, Facebook used its user photos to train a deep-learning model called DeepFace. While the company never released the data set, the system’s superhuman performance elevated deep learning to the de facto method for analyzing faces. This is when manual verification and labeling became nearly impossible as data sets grew to tens of millions of photos, says Raji. It’s also when really strange phenomena started appearing, like auto-generated labels that include offensive terminology.