What should be AI’s role in moderating the millions of messages posted on social media every day? The volume of messages means that automation is required. But the question of what is appropriate moderation versus inappropriate censorship lingers.
AI is helpful for scaling up a moderation policy. But it doesn’t address the core challenge of defining a policy: Which expressions to permit and which to block. This is hard for both humans and AI.
Deciding what to block is hard because natural language is ambiguous.
- “Don’t let them get away with this” could be an incitement to violence or a call for justice.
- “The vaccine has dangerous side effects” could be a scientific fact or misinformation.
The meanings of words vary from person to person. My son says “wawa” when he wants water, and only his close family (and now you!) understand. At work, teams invent acronyms that others don’t understand. More problematically, lawbreakers and hate groups develop code words to discuss their activities.
If humans understand the same words differently, how can we train an AI to make such distinctions? If a piece of text has no fixed meaning, then enforcing policies based on the text is difficult. Should we hide it from user A if they would read it as promoting violence, but show it to user B if they would view it as benign? Or should hiding a message be based on the intent of the sender? None of these options is satisfying.
Further, getting the data to build an AI system to accomplish any of this is hard. How can the developers who gather the data understand its full range of meanings? Different communities have their own interpretations, making it impossible to keep track.
Even if the meaning are unambiguous, making the right decision is still hard. Fortunately, social media platforms can choose from a menu of options depending on how egregious a message is and the degree of confidence that it’s problematic. Choices include showing it to a smaller audience, adding a warning label, and suspending, temporarily or permanently, the user who posted it. Having a range of potential consequences helps social media platforms manage the tradeoff between silencing and protecting users (and society).
Despite their flaws, AI systems make social media better. Imagine email without AI-driven spam filtering; it would rapidly become unusable. Similarly, AI is critical for eliminating the most spammy or toxic social media messages. But the challenge of moderating any given message transcends AI.
It’s important to acknowledge this challenge openly, so we can debate the principles we would apply to this problem and recognize that there may be no perfect solution. Through transparent and robust debate, I believe that we can build trust around content moderation and make tradeoffs that maximize social media’s benefit.