Welcome to this special Halloween issue of The Batch!
In AI, we use many challenging technical terms. To help you keep things straight, I would like to offer some definitions that I definitely would not use. I hope you’ll find this alternative AI glossary a breath of fresh scare:
- Activation function: An incantation used to raise the dead
- Dropout: A portal to another dimension that suddenly appears underfoot
- Early stopping: When you’re tired of collecting candy and you go home to bed
- Feature extraction: Getting a vampire’s fangs out of your neck
- Greedy policy: Self-explanatory when trick-or-treating
- Hinge loss: When the squeaky door falls off of a haunted house
- Learning rate: How quickly werewolves realize they can’t break down your door but can climb through your window
- Mini-batch: The amount of candy you have after early stopping
- Overfit: When you’ve eaten so much Halloween candy you can’t button your clothes
- Random forest: Where random witches live
Happy Halloween to all who celebrate it. Now let’s get this party started!
Trick or Treat!
Skeletons in the (Server) Closet
As the days grow short, we peer into the gathering night to glimpse dark shapes amid the shadows. Last year at this season, we trembled before rogue AGI, ubiquitous surveillance, and the chill winds of AI winter. Those goblins still dance just beyond the jack-o'-lantern's candle — yet other shades now join them: algorithms that exploit our basest instincts, models that consume every watt we can generate, tribal drumbeats that divide our community. But we need not cower. Build the bonfire high! Face the dire omens! Let our very fears spur us to extinguish these demons forevermore!
AI Spreads Disinformation
Will AI promote lies that deepen social divisions?
The fear: Propagandists will bait online recommendation algorithms with sensationalized falsehoods. People who snap at the clickbait will be reeled into opposing ideological silos.
Behind the worries: Consumption of online content has skyrocketed since the pandemic began. Social media platforms, especially, are known to be vectors for disinformation. Bad actors have embraced algorithmically boosted disinformation campaigns to advance their agendas.
- This year alone, agitators have exploited these systems to widen political divisions, spread false data about Covid-19, and promote irrational prejudices.
- Russian operatives have been blamed for spreading misinformation on a vast range of topics since at least 2014, when the Kremlin flooded the internet with conspiracy theories about the shooting down of a Malaysian passenger jet over Ukraine. That campaign helped to cast doubt on official conclusions that Russian forces had destroyed the plane.
- YouTube’s recommendation engine is primarily responsible for the growing number of people who believe that Earth is a flat disc rather than a sphere, a 2019 study found.
How scared should you be: Social media networks are getting better at spotting and blocking coordinated disinformation campaigns. But they’re still playing cat-and-mouse with propagandists.
- Earlier this month, researchers found that Facebook users could slip previously flagged posts past the automated content moderation system by making simple alterations like changing the background color.
- Creators of social media bots are using portraits created by generative adversarial networks to make automated accounts look like they belong to human users.
- Efforts to control disinformation occasionally backfire. Conservative media outlets in the U.S. accused Twitter of left-wing bias after it removed a tweet by President Trump that contained falsehoods about coronavirus.
What to do: No company can tell fact from fiction definitively among the infinite shades of gray. AI-driven recommendation algorithms, which generally optimize for engagement, can be designed to limit the spread of disinformation. The industry is badly in need of transparent processes designed to reach reasonable decisions that most people can get behind (like free elections in a democracy). Meanwhile, we can all be more vigilant for signs of disinformation in our feeds.
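The idea that engagement-driven ranking can be redesigned to limit disinformation can be made concrete with a toy sketch. This is not any platform's actual system; the scoring fields and the penalty weight are hypothetical, purely for illustration:

```python
# Toy feed ranker: trade off predicted engagement against a
# misinformation-risk score. Real platforms use far more complex
# models; the field names and risk_weight here are invented.

def rank_feed(posts, risk_weight=2.0):
    """Sort posts by engagement minus a penalty for misinfo risk."""
    return sorted(
        posts,
        key=lambda p: p["engagement"] - risk_weight * p["misinfo_risk"],
        reverse=True,
    )

feed = rank_feed([
    {"id": "clickbait", "engagement": 0.9, "misinfo_risk": 0.4},
    {"id": "news",      "engagement": 0.6, "misinfo_risk": 0.05},
])
print([p["id"] for p in feed])  # the flagged post drops below the reliable one
```

With a large enough penalty, sensational but risky content falls in the ranking even though it scores higher on raw engagement — the design lever the paragraph above describes.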
The AI Community Splinters
Will national rivalries fragment international cooperation in machine learning?
The fear: Countries competing for AI dominance will lash out at competitors. Without the free flow of research, data, talent, and ideas, the field will slow down. Advances in the industry will only benefit the country where they originated, and the worldwide research community will dissolve into clusters of regional cliques.
Behind the worries: Restrictive immigration rules have prevented engineers, scientists, and students from pursuing opportunities across national borders. At the same time, global powers have moved to dominate AI through industrial and trade policy, and to limit its reach through social policy.
- Earlier this month, the U.S. government tightened restrictions on H-1B visas, on which many tech companies rely to recruit talented workers from overseas.
- AI researchers from developing countries have also complained about the difficulty of obtaining visas to attend conferences in the U.S. and Canada.
- China requires companies to physically house their data on servers within the country and to pass a regulatory review before moving any of it overseas.
- Last year, the U.S. government banned American firms from doing business with top Chinese AI companies. The U.S. has also intensified scrutiny of transactions between American and foreign companies that might have national security implications.
How scared should you be: AI is truly a global effort. The international AI community has a strong tradition of collaboration, and it has built an infrastructure of sharing — including open code, datasets, publications, and conferences — that transcends national boundaries. Yet the aspirations of sovereign states can put the spirit of cooperation at risk. It will take a concerted effort to keep the community alive and thriving, so we can bring the benefits of AI to all people.
What to do: Governments should heed calls by leading AI organizations to make it easier for researchers to gain visas. Conferences should consider meeting in countries with less restrictive borders. Widespread translations of research papers, particularly those that address AI governance, would be helpful. Efforts to develop international standards for data privacy and use, such as those advanced by the Organization for Economic Cooperation and Development and other groups, would help foster international collaboration in a way that respects individual rights.
Giant Models Bankrupt Research
What if AI requires so much computation that it becomes unaffordable?
The fear: Training ever more capable models will become too pricey for all but the richest corporations and government agencies. Rising costs will throttle progress as startups, academics, and students — especially in emerging economies — are left out in the cold. Customers will turn away from AI in search of less costly alternatives.
Behind the worries: Training a model to beat the top image classification and object detection benchmarks currently costs millions of dollars. And that cost is rising fast: The processing power required to train state-of-the-art models doubled every 3.4 months between 2012 and 2018, according to a study by OpenAI.
- The high cost of beating the state of the art has prompted some institutions to rethink their approach. OpenAI, founded as a nonprofit lab, has morphed into a for-profit company. Last month, the organization granted Microsoft an exclusive commercial license for its GPT-3 language model.
- A European grocery store chain recently decided against deploying an inventory tracking model due to the cost of cloud computing charges, Wired reported.
- AI’s environmental impact is growing as training consumes increasing quantities of energy. A 2019 paper from the University of Massachusetts concluded that training a large language model produced five times as much carbon dioxide as an average car spews over its entire working life.
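The doubling rate cited above compounds dramatically. A quick back-of-the-envelope calculation shows why (treating 2012–2018 as a full 72-month window is a simplifying assumption; OpenAI's own estimate, measured over a slightly different span, was roughly a 300,000x increase):

```python
# Back-of-the-envelope growth from a constant doubling time of
# 3.4 months (OpenAI's estimate for state-of-the-art training compute).

def compute_growth(months, doubling_months=3.4):
    """Multiplicative growth in compute after the given number of months."""
    return 2 ** (months / doubling_months)

# Over a 72-month window, the factor runs well past a million.
print(f"{compute_growth(72):,.0f}x")
```

Compare that with Moore's law, which doubles roughly every two years: exponential curves with short doubling times quickly outrun any hardware-cost decline, which is exactly why training budgets balloon.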
How scared should you be: The massive inflation in training costs arises from trying to beat the best models. If you can make do with something less, the price comes way down. The cost to train an image classification model with Top-5 accuracy of 93 percent on ImageNet fell from $2,323 in 2017 to $12 the following year, according to a Stanford report. Pretrained models like Hugging Face’s implementations of popular language models and APIs like the one OpenAI offers for GPT-3 make access to high-end AI even less expensive.
What to do: Researchers at the Allen Institute for AI and elsewhere argue that we should consider a model’s energy efficiency to be just as important as accuracy. Meanwhile, policymakers and executives who see the value in fostering competition should work to boost research funding and access to compute resources.
A MESSAGE FROM DEEPLEARNING.AI
Course 3 of the GANs Specialization from DeepLearning.AI is available now on Coursera! Enroll now
Unfair Outcomes Destroy Trust
Will AI that discriminates based on race, gender, or economic status undermine the public’s confidence in the technology?
The fear: Seduced by the promise of cost savings and data-driven decision making, organizations will deploy biased systems that end up doing real-world damage. Systems incorporating biased algorithms or trained on biased data will misdiagnose medical patients, bar consumers from loans or insurance, deny parole to reformed convicts, or grant it to unrepentant ones.
Behind the worries: Biased implementations have provoked public backlash as organizations, both private and public, figure out what AI can and can’t do, and how to use it properly.
- The UK recently abandoned an algorithm designed to streamline visa applications after human rights activists sued. The plaintiffs charged that the model discriminated against people from countries with large non-white populations.
- Financial regulators in New York last year launched an investigation into the algorithm behind Apple’s credit card. Users reported that women had received lower credit limits than men with comparable credit ratings.
- The Los Angeles Police Department adopted systems designed to forecast crimes, but it stopped using one and promised to revamp another after determining that they were flawed. Some people identified as high-risk offenders, for instance, had no apparent history of violent crime.
How scared should you be: Many organizations are attracted by AI’s promises to cut costs and streamline operations, but they may not be equipped to vet systems adequately. The biased systems that have made headlines are just the tip of the iceberg, according to Cathy O’Neil, author of the book Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Further reports of systems prone to unfair outcomes are bound to emerge.
What to do: AI systems won’t enjoy broad public trust until we demonstrate clearly that they perform well and pose minimal risk of unintended consequences. Much work remains to be done to establish guidelines and systematically audit systems for accuracy, reliability, and fairness.
The Black Box Has Dark Corners
Will we ever understand what goes on inside the mind of a neural network?
The fear: When AI systems go wrong, no one will be able to explain the reasoning behind their decisions. Imperceptible changes to a model’s input will lead unaccountably to fickle outputs. Seemingly well-designed systems will produce biased results without warning. People will suffer harm without explanation or recourse.
Behind the worries: Decisions made by neural networks are notoriously difficult to explain. In the real world, they have profoundly affected people’s lives. In the lab, their opacity has made their output hard to trust, even when they produce highly accurate results.
- Models deployed by U.S. state governments slashed healthcare benefits for thousands of people living in Arkansas and Idaho. The people affected typically couldn’t figure out why their care was cut. The process for appealing the decision wasn’t clear either.
- A study of six neural networks designed to enhance low-resolution medical images found that they often altered the input in ways that made them unreliable as diagnostic tools. Deep learning systems, the authors concluded, provide no clues about the quality of input they require, and developers must tease out the limits experimentally.
- A deep learning system accurately predicted the onset of psychiatric disorders like schizophrenia based on a patient’s medical record. However, the developers said their model wouldn’t be useful to doctors until they had a better idea about how it made its predictions.
How scared should you be: The inability to explain AI-driven decisions is keeping people from using the technology more broadly. For instance, in a recent survey of UK information technology workers in the financial services industry, 89 percent said that lack of transparency was the primary impediment to using AI. Europe’s General Data Protection Regulation gives citizens the right to obtain information on automated systems that make decisions affecting their lives. AI makers that can’t provide these details about their technology can face steep fines or outright bans.
What to do: Research into explaining neural network outputs has made substantial strides, but much more work is needed. Meanwhile, it’s imperative to establish standard procedures to ensure that models are built and deployed responsibly.