One of the largest open datasets for training face recognition systems has its roots in a popular photo-sharing service. Companies that have used this data could find themselves liable for millions in legal recompense.

What’s new: Many Flickr users were surprised and upset when reporters informed them that their likenesses, or those of their children and other family members, were part of a public database used to train face recognition algorithms, according to the New York Times. Such training may violate an Illinois digital privacy law that’s currently being tested in court.

Tracing the data: MegaFace, which depicts 672,000 individuals in nearly 4 million photos, comprises images from Flickr that their creators licensed for commercial use under the Creative Commons intellectual property license.

  • Yahoo owned Flickr between 2005 and 2017. In 2014, the web giant released 100 million Flickr photos for training image classifiers.
  • The following year, University of Washington researchers started distributing the MegaFace subset.
  • Since then, MegaFace has been used to train face recognition software by Amazon, Google, Mitsubishi, SenseTime, Tencent, and others.

Legal jeopardy: In 2008, Illinois passed the Biometric Information Privacy Act, which prevents commercial entities from capturing, purchasing, or otherwise obtaining a private individual’s likeness without the person’s consent. Individuals whose faces have been used without permission are entitled to between $1,000 and $5,000 per use.
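To see how quickly the per-use damages add up, here is a rough sketch of the arithmetic. It assumes every qualifying use falls within the $1,000 to $5,000 range cited above. The function name `exposure_range` and the count of 10,000 uses are hypothetical, chosen only for illustration. They don't come from any actual case.

```python
# Per-use damages range as stated in the article: $1,000 to $5,000.
MIN_DAMAGES_PER_USE = 1_000
MAX_DAMAGES_PER_USE = 5_000


def exposure_range(num_uses: int) -> tuple:
    """Return (low, high) potential damages in dollars for num_uses unconsented uses."""
    if num_uses < 0:
        raise ValueError("num_uses must be non-negative")
    return num_uses * MIN_DAMAGES_PER_USE, num_uses * MAX_DAMAGES_PER_USE


# Hypothetical count for illustration only; not a figure from any real lawsuit.
hypothetical_uses = 10_000
low, high = exposure_range(hypothetical_uses)
print(f"${low:,} to ${high:,}")  # prints "$10,000,000 to $50,000,000"
```

Under these assumptions, even a modest number of unconsented uses lands in the tens of millions of dollars, in line with the "millions in legal recompense" mentioned at the top.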

Court action: The Illinois law already is fueling a $35 billion class action lawsuit against Facebook for the way it stores and uses data to automatically identify faces in photos.

  • Facebook argued that the people pictured have no grounds to sue because its software didn’t cause them financial harm.
  • The 9th U.S. Circuit Court of Appeals rejected that argument, citing an earlier Illinois Supreme Court ruling that invasion of privacy alone is enough to break the law.
  • The case will be decided by a jury before the federal court, on a schedule that hasn’t yet been announced.

Why it matters: MegaFace is still available, and at least 300 organizations have used it to train their models, according to a 2016 University of Washington press release. Any group that has used this data to make money could be liable under the Illinois law.

We’re thinking: With 50 states in the U.S. and around 200 countries in the world, regulatory mismatches among various jurisdictions seem inevitable. User privacy and data rights are important, and legal requirements must be as clear and coherent as possible to advance the technology in a positive way.
