New Models Inherit Old Flaws: AI Models May Inherit Flaws From Previous Systems


Is AI becoming inbred?

The fear: Increasingly, the best models are fine-tuned versions of a small number of so-called foundation models that were pretrained on immense quantities of data scraped from the web. The web is a repository of much that’s noble in humanity, but also of much that’s lamentable, including social biases, ignorance, and cruelty. Consequently, while the fine-tuned models may attain state-of-the-art performance, they also exhibit a penchant for prejudice, misinformation, pornography, violence, and other undesirable traits.

Horror stories: Over 100 Stanford University researchers jointly published a paper that outlines some of the many ways foundation models could cause problems in fine-tuned implementations.

  • A foundation model may amplify biases in the data used for fine-tuning.
  • Engineers may train a foundation model on private data, then license the work to others who create systems that inadvertently expose personal details.
  • Malefactors could use a foundation model to fine-tune a system to, say, generate fake news articles.

How firm is the foundation? The Stanford paper stirred controversy as critics took issue with the authors’ definition of a foundation model and questioned the role of large, pretrained models in the future of AI. Stanford opened a center to study the issue.

Facing the fear: It’s not practical to expect every user of a foundation model to audit it fully for everything that might go wrong. We need research centers like Stanford’s — in both public and private institutions — to investigate the effects of AI systems, how harmful capabilities originate, and how they spread.
