Silicon Valley startup Cerebras shifted out of stealth mode to unveil its flagship product: an enormous chip designed from the ground up to accelerate neural networks.
What’s new: The Cerebras Wafer Scale Engine is aimed at data centers, where the company claims it will perform AI computations 100 to 1,000 times faster than alternatives. The chips will be housed in servers equipped with a special cooling system to dissipate the chip’s heat. They’re scheduled to reach the market next month for an undisclosed price.
Why it’s different: Where many chips are measured in millimeters, this monster is 56 times larger than Nvidia’s top-of-the-line GPU and bigger than a standard iPad. It comprises more than 400,000 cores and 18 gigabytes of memory right on the chip. That’s equivalent to 84 GPUs communicating with one another 150 times more efficiently than usual, with an additional boost thanks to the ability to handle sparse linear algebra.
How it works: Nvidia’s chip architecture is extraordinarily efficient at performing the predictable, repetitive matrix multiplications required by neural networks. Yet it has practical limitations: It must hold an entire neural network in off-chip memory and communicate with other chips through external interfaces that are far slower than communication on the chip itself.
- By putting all computing resources on a single piece of silicon, the new chip makes it possible to process neural networks at top speed.
- For even higher efficiency, it processes sparse networks by pruning unnecessary calculations.
Behind the news: Deep learning’s rapid growth has prompted a top-to-bottom redesign of computing systems to accelerate neural network training.
- Cerebras is a front runner among a plethora of startups working on AI chips.
- And not only startups: Amazon, Facebook, Google, and Tesla have all designed chips for in-house use.
- Among traditional chip companies, Nvidia has progressively retooled its GPUs to accelerate deep learning, Intel is rolling out its competing Nervana technology, and Qualcomm has been building inferencing engines into its smartphone chips.
- Cerebras is the only one to opt for a wafer-scale chip. Soon, it may become the first company to have overcome the considerable technical hurdles to putting a wafer-scale chip into production.
Why it matters: If the new hardware works as advertised, it will open virgin territory for neural networks several orders of magnitude bigger than today’s largest models. Larger models have been shown to yield higher accuracy, and the additional headroom may well allow new kinds of models that wouldn’t be practical otherwise.
We’re thinking: The advent of Nvidia GPUs two decades ago spurred innovations in model architecture that boosted the practical number of network layers from handfuls to 1,000-plus. Cerebras’ approach portends fresh architectures capable of solving problems that are currently out of reach. We don’t yet know what those models will look like, but we’re eager to find out!