May 26, 2021

Conventional and data-centric benchmarks

May 26, 2021

The Batch: Face Recognition for the Masses, Labeling Libel, Documenting Datasets, What Machines Want to See

Benchmarks have been a significant driver of research progress in machine learning. But they've driven progress in model architecture rather than in approaches to building datasets, even though dataset construction can have a large impact on performance in practical applications.

PimEyes working with pictures of Andrew Ng

May 26, 2021

Face Recognition for the Masses: PimEyes is reverse image search for face recognition.

Face recognition tech tends to be marketed to government agencies, but PimEyes offers a web app that lets anyone scan the internet for photos of themselves, or of anyone whose picture they have. The company says it aims to help people control their online presence and fight identity theft.

May 26, 2021

Double Check for Defamation: CaliberAI uses NLP to scan for possible legal defamation.

A libel-detection system could help news outlets and social media companies stay out of legal hot water. CaliberAI, an Irish startup, scans text for statements that could be considered defamatory, Wired reported.

Walking through a narrow hallway in a library

May 26, 2021

Bias By the Book: Researchers find bias in influential NLP dataset BookCorpus.

Researchers found serious flaws in an influential language dataset, highlighting the need for better documentation of data used in machine learning.

A new method for compressing images and yielding better classification

May 26, 2021

What Machines Want to See: An image compressor for more accurate computer vision.

Researchers typically downsize images for vision networks to accommodate limited memory and accelerate processing. A new method not only compresses images but yields better classification.

May 26, 2021

Data-Centric AI Development: A New Kind of Benchmark

Benchmarks have been a significant driver of research progress in machine learning. But they've driven progress in model architecture rather than in approaches to building datasets, even though dataset construction can have a large impact on performance in practical applications.
