OpenAI hasn't released the full version of its GPT-2 language model, fearing the system would create a dark tide of fake news masquerading as real reporting. Now researchers offer a way to detect such computer-generated fancies — but their approach essentially requires distributing language models that can generate made-up news.
What’s new: Researchers propose a framework for building verifiers, or classifiers that discriminate between human- and machine-authored articles, based on models that generate text. They introduce the fake-news generator GROVER (continuing the recent vogue for naming language models after Muppets) along with its complementary verifier. The new model is so good that human judges rated its Infowars-style propaganda output as more credible than examples produced by human writers. A limited version is available to try online.
How it works: Rowan Zellers and his University of Washington collaborators constructed verifiers by copying text-generator architectures up to their output layer and then substituting a classification layer. They initialized the verifiers using transfer learning before training on generated text. Key insights:
- A verifier can do well at spotting the output of generators of different architectures if it has more parameters and training examples.
- The more examples a verifier has from a given generator, the more accurately it can classify articles from that generator.
- The verifier based on GROVER identified GROVER articles with more than 90 percent accuracy. Verifiers that weren’t based on GROVER reached 78 percent accuracy on GROVER's output.
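To make the construction concrete, here is a minimal sketch in plain Python of turning a generator into a verifier: keep everything up to the output layer, drop the vocabulary-sized layer, and attach a two-class head. It is not the GROVER code. The names `ToyGenerator` and `ToyVerifier`, the averaged-embedding backbone, and the choice to train only the new head are all assumptions made for illustration.

```python
import math
import random


class ToyGenerator:
    """Hypothetical stand-in for a text generator: a backbone plus an output layer."""

    def __init__(self, vocab_size, hidden_size, seed=0):
        rng = random.Random(seed)
        # Backbone weights: one embedding vector per token.
        self.embed = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
                      for _ in range(vocab_size)]
        # Output layer that maps hidden states to vocabulary scores.
        # The verifier below never uses it.
        self.lm_head = [[rng.uniform(-0.1, 0.1) for _ in range(vocab_size)]
                        for _ in range(hidden_size)]

    def backbone(self, token_ids):
        # Toy "architecture up to the output layer": mean of token embeddings.
        hidden_size = len(self.embed[0])
        return [sum(self.embed[t][j] for t in token_ids) / len(token_ids)
                for j in range(hidden_size)]


class ToyVerifier:
    """Reuses the generator's backbone and swaps in a classification layer."""

    def __init__(self, generator, seed=1):
        self.generator = generator  # transfer learning: start from generator weights
        rng = random.Random(seed)
        hidden_size = len(generator.embed[0])
        # New classification head replacing the generator's output layer.
        self.w = [rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
        self.b = 0.0

    def prob_machine(self, token_ids):
        # Probability that the article was machine-written.
        h = self.generator.backbone(token_ids)
        z = sum(wi * hi for wi, hi in zip(self.w, h)) + self.b
        return 1.0 / (1.0 + math.exp(-z))

    def train_step(self, token_ids, label, lr=0.5):
        # One gradient step on the logistic loss; label is 1 for machine, 0 for human.
        # Simplification (assumption): only the new head is updated here.
        h = self.generator.backbone(token_ids)
        grad = self.prob_machine(token_ids) - label
        self.w = [wi - lr * grad * hi for wi, hi in zip(self.w, h)]
        self.b -= lr * grad


generator = ToyGenerator(vocab_size=20, hidden_size=4)
verifier = ToyVerifier(generator)
machine_article = [1, 5, 7]  # made-up token ids standing in for generated text
for _ in range(10):
    verifier.train_step(machine_article, label=1)
```

In a real system the backbone would be a large pretrained network and would typically be fine-tuned together with the new head. The sketch only shows where the classification layer replaces the generator's output layer.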
Why it matters: Systems like GROVER threaten to flood the world with highly believable hoaxes. Automated countermeasures are the only viable defense. Zellers argues in favor of releasing newly invented language models, since they're tantamount to verifiers anyway.
Yes, but: The fact that larger models can fool verifiers suggests that we’re in for a fake-news arms race in which dedicated mischief makers continually up the ante.
We’re thinking: While Zellers' method can recognize machine authorship, the real goal is an algorithm that distinguishes fact from fiction. Until someone invents that, hoaxers — both digital and flesh-and-blood — are likely to remain one step ahead.