Reinforcement Learning from Human Feedback (RLHF)

The Politics of Language Models: AI's political opinions differ from most Americans'.

Do language models have their own opinions about politically charged issues? Yes — and they probably don’t match yours. Shibani Santurkar and colleagues at Stanford compared opinion-poll responses of large language models with those of various human groups.
