Things are really starting to get going in the field of AI. After many years (decades?!) of focusing on algorithms, the AI community is finally ready to accept the central role of data and the high-capacity models that are capable of taking advantage of this data. But when people talk about “AI,” they often mean very different things, from practical applications (such as self-driving cars, medical image analysis, robo-lawyers, image/video editing) to models of human cognition and consciousness. Therefore, it might be useful to distinguish two broad types of AI efforts: semantic or top-down AI versus ecological or bottom-up AI.
The goal of top-down AI is to match or exceed human performance on a specific human task, such as image labeling, driving, or text generation. The tasks are defined by explicit labels (supervised learning), by a set of rules (e.g., the rules of the road), or by a corpus of human-produced artifacts (for instance, GPT-3 is trained on human-written texts using human-invented words). Thus, top-down AI is necessarily subjective and anthropocentric. It is the type of AI where we have seen the most advances to date.
Bottom-up AI, on the other hand, aims to ignore humans, their tasks, and their labels. Its only goal is to predict the surrounding world given sensory inputs (passive and active). Because the world is continuously changing, this goal will never be reached. But the hope is that, along the way, a general, task-agnostic model of the world will emerge. Self-supervised learning on raw sensory data, various generative models such as GANs, and intrinsic motivation approaches (e.g., curiosity) are all attempts at bottom-up AI.
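To make the idea concrete, here is a minimal toy sketch of self-supervised prediction in the bottom-up spirit (my own illustration, not from any system mentioned above): there are no labels and no task; a synthetic "sensory stream" stands in for raw input, and the only training signal is the error in predicting the next observation from the current one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "sensory stream": a noisy rotating 2-D signal standing in for raw
# sensory input (hypothetical dynamics chosen just for this sketch).
def sensory_stream(steps):
    angle = 0.02  # rotation per time step
    R = np.array([[np.cos(angle), -np.sin(angle)],
                  [np.sin(angle),  np.cos(angle)]])
    x = np.array([1.0, 0.0])
    for _ in range(steps):
        x = R @ x + 0.01 * rng.standard_normal(2)
        yield x.copy()

data = np.array(list(sensory_stream(2000)))
inputs, targets = data[:-1], data[1:]  # predict x_{t+1} from x_t: no labels

# A linear "world model" W, fit by gradient descent on prediction error alone.
W = 0.1 * rng.standard_normal((2, 2))
lr = 0.1
for _ in range(200):
    err = inputs @ W.T - targets          # prediction residual
    W -= lr * (err.T @ inputs) / len(inputs)

mse = float(np.mean((inputs @ W.T - targets) ** 2))
# After training, W approximates the world's dynamics, and the remaining
# error is roughly the irreducible sensor noise.
```

The point of the sketch is only that the objective is task-agnostic: the model ends up recovering the dynamics of its world as a by-product of trying to predict its own future inputs.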
While top-down AI is currently king in industry as well as academia, its focus on imitating humans (via labels and tasks) points to its main limitation. It is like an undergraduate student who didn’t attend lectures all semester but still gets an A by cramming for the final exam — its knowledge is of a superficial nature. Real understanding must be built up slowly and patiently, from the raw sensory inputs upward. This is already starting to happen, and I hope the progress of bottom-up AI will continue in 2022.
As a teenager in the 1980s USSR, I spent a lot of time hanging out with young physicists (as one does) talking about computers. One of them gave a definition of artificial intelligence that I still find the most compelling: “AI is not when a computer can write poetry. AI is when a computer will want to write poetry.” By this definition, AI may be a tall order, but if we want to bring it closer, I suspect we will need to start from the bottom up.
Happy 2022! Bottoms up!
Alexei Efros is a professor of computer science at UC Berkeley.