Build It Break It

Screen captures of online platform Dynabench
Dynamic Benchmarks: A platform for fooling language models

Benchmarks provide a scientific basis for evaluating model performance, but they don’t necessarily map well to human cognitive abilities. Facebook aims to close the gap through a dynamic benchmarking method that keeps humans in the loop.
