Jan 30, 2026
Training For Engagement Can Degrade Alignment: “Moloch’s Bargain” shows fine-tuning can affect social values
Individuals and organizations increasingly use large language models to produce media that helps them compete for attention. Does fine-tuning LLMs to encourage engagement, purchases, or votes affect their alignment with social values? Researchers found that it does, and that it can degrade that alignment.