In June, I announced the first Data-centric AI Competition. The deadline for submissions was in early September, and today I’m thrilled to announce the winners!
A total of 489 individuals and teams submitted 2,458 unique datasets. By improving the data alone (the model architecture was fixed), many contestants were able to improve on the baseline performance of 64.4% by more than 20 percentage points. The winners in the Best Performance category achieved between 86.034% and 86.405%. The winners in the Most Innovative category, as well as the honorable mentions, achieved high performance using novel approaches.
Congratulations to Divakar Roy, Team Innotescus, and Team Synaptic-AnN, who took the top three spots for Best Performance. Congratulations also to Mohammad Motamedi, Johnson Kuan, and Team GoDataDriven, winners of the Most Innovative category. Pierre-Louis Bescond and Team KAIST-AIPRLab earned honorable mentions. I couldn’t be more proud of you all.
You can learn more about their approaches here. I hope you’ll apply these ideas to your own work.
The winners joined me at a private roundtable event to discuss how to grow the data-centric AI movement. I was surprised to learn that almost all of them, including some who have been involved in AI for a long time and some who have little AI background, have already seen positive effects of data-centric techniques in their own work.
We chatted about the potential benefits of data-centric AI development to entrepreneurs and startups that may not have access to large datasets, and how it opens machine learning to non-engineers who, although they may not have the skills to build models, can make important contributions by gathering and refining data.
We also discussed how working with data is often wrongly viewed as the boring part of machine learning even though, in reality, it’s a critical aspect of any project. I was reminded that, 10 years ago, working with neural networks was viewed in a similar light: people were more interested in hand-engineering features and viewed neural networks as uninteresting. I’m optimistic that the AI community will before long take as much interest in systematically improving data as in architecting models.
Thank you to all the participants for helping build a foundation for future data-centric AI benchmarks. I hope this competition spurs you to develop further systematic approaches to improving data. And I hope you’ll compete again in future data-centric AI challenges!