New Horsepower for Neural Nets

UK startup Graphcore released its Colossus Mk2 chip for AI.

Colossus Mk2, processor by Graphcore

A high-profile semiconductor startup made a bid for the future of AI computation.

What’s new: UK startup Graphcore released the Colossus Mk2, a processor intended to perform the matrix math at the heart of deep learning more efficiently than other specialized processors or general-purpose chips from Intel and AMD. The company expects to ship at full volume in the fourth quarter.

How it works: The Mk2 comprises nearly 60 billion transistors. (Nvidia’s flagship A100 has 54 billion, while Cerebras’ gargantuan Wafer-Scale Engine boasts 1.2 trillion. Google doesn’t advertise its TPU transistor counts.) Girded by 900 megabytes of random access memory, the Mk2’s transistors are organized into nearly 1,500 independent cores capable of running nearly 9,000 parallel threads.

  • Graphcore is selling the new chips as part of a platform called IPU-Machine M200. Each M200 will hold four Mk2 chips to deliver a combined computational punch of 1 petaflop, or 10¹⁵ floating point operations per second.
  • Up to 64,000 Mk2 chips can be networked together for a total of 16 exaflops of compute. (An exaflop is 1,000 petaflops.) That’s a hefty claim, given that competing systems have yet to reach 1 exaflop.
  • The package includes software designed to manage a variety of machine learning frameworks. Developers can code directly using Python and C++.
  • J.P. Morgan, Lawrence Berkeley National Laboratory, and the University of Oxford are among the first users of the new chip.
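The aggregate-compute figure can be sanity-checked with quick arithmetic. A minimal sketch, assuming the article's numbers (four Mk2 chips per M200 delivering a combined 1 petaflop; the per-chip throughput is derived, not an official spec):

```python
# Sanity check of the compute figures cited above.
PETAFLOP = 1e15                  # floating point operations per second

chips_per_m200 = 4               # four Mk2 chips per IPU-Machine M200
m200_flops = 1 * PETAFLOP        # 1 petaflop per M200 (combined)
chip_flops = m200_flops / chips_per_m200  # ~0.25 petaflop per Mk2 chip (derived)

total_chips = 64_000             # maximum networked Mk2 chips
cluster_flops = total_chips * chip_flops

print(cluster_flops / 1e18, "exaflops")   # → 16.0 exaflops
```

At roughly a quarter petaflop per chip, 64,000 chips work out to the claimed 16 exaflops.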

Why it matters: AI’s demand for computational resources is insatiable. A recent study from researchers at MIT, the University of Brasilia, and Yonsei University suggests that progress in deep learning could stall for lack of processing power. Innovations in chip technology may make a difference.

We’re thinking: The fact that software evolves faster than hardware is a major challenge to building chips. Graphcore’s design is geared to accelerate large, sparse recurrent neural networks at a moment when transformer networks are beginning to supplant RNNs in some applications. Will some bold chip maker tune its next generation for transformers?
