Jan 30, 2026
Training For Engagement Can Degrade Alignment: Stanford researchers coin “Moloch’s Bargain,” show fine-tuning can affect social values
Individuals and organizations increasingly use large language models to produce media that helps them compete for attention. Does fine-tuning LLMs to encourage engagement, purchases, or votes affect their alignment with social values? Researchers found that it does.