I wrote earlier about how my team at AI Fund saw that GPT-3 set a new direction for building language applications, two years before ChatGPT was released. I’ll go out on a limb to make another prediction: I think we’ll see significant growth in AI, including Generative AI, applications running at the edge of the network (PC, laptop, mobile, and so on).
I realize this flies in the face of conventional wisdom. Most AI runs in data centers, not on edge devices. There are good reasons for this:
- The most powerful large language models have 100B+ parameters and require massive amounts of memory even for inference (100B parameters, stored using 8-bit quantization, requires 100GB of memory).
- Many businesses prefer to operate cloud-based, software-as-a-service (SaaS) products (which allows them to charge a recurring subscription fee) rather than software running at the edge (where customers tend to prefer paying a one-time fee). SaaS also gives the company access to data to improve the product and makes the product easier to upgrade.
- Many developers today have been trained to build SaaS products and prefer building cloud-hosted applications to desktop or other edge software.
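The memory arithmetic in the first point above is easy to sketch. Here is a minimal back-of-the-envelope helper (the model sizes in the comments are illustrative examples, and the estimate covers weights only, ignoring activations, KV cache, and framework overhead):

```python
def model_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed just to hold a model's weights.

    Real inference needs somewhat more than this (activations,
    KV cache, runtime overhead), so treat it as a lower bound.
    """
    bytes_per_param = bits_per_param / 8
    return num_params * bytes_per_param / 1e9  # decimal gigabytes

# 100B parameters at 8-bit quantization: ~100 GB of weights.
print(model_memory_gb(100e9, 8))  # 100.0
# The same model at 4-bit: ~50 GB.
print(model_memory_gb(100e9, 4))  # 50.0
# A 7B model at 4-bit: ~3.5 GB, well within a modern laptop's RAM.
print(model_memory_gb(7e9, 4))    # 3.5
```

This is why quantization and smaller models matter so much for the edge: halving the bits per parameter halves the footprint, and a 10x smaller model shrinks it another order of magnitude.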
Here’s why I think those factors won’t stop AI’s growth at the edge.
- AI applications are starting to run quite well on modern edge devices. For example, I regularly run models with around 1B to 10B parameters on my laptop. If I’m working on an airplane without WiFi access, I will occasionally run a small model to help me with my work.
- For many applications, a model of modest size works fine, especially if it’s fine-tuned to the task at hand. To help me find grammatical errors in my writing, do I really need a 175B parameter model that has broad knowledge of philosophy, history, astronomy, and every other topic under the sun?
- Many users, especially those from Gen Z (born around 1996 to 2010), whose behavior tends to be a leading indicator of future consumer trends, are increasingly sensitive to privacy. This has been a boon to Apple’s product sales, given the company’s reputation for privacy. Surely, to check my grammar, I don’t need to share my data with a big tech company?
- Similarly, for corporations worried about their own data privacy, edge computing (as well as on-premises and virtual private cloud options) could be appealing.
Further, strong commercial interests are propelling AI to the edge. Chip makers like Nvidia, AMD, and Intel sell chips both to data centers (where sales have grown rapidly) and for use in PCs and laptops (where sales have plummeted since the pandemic). Thus semiconductor manufacturers, PC/laptop makers, and Microsoft (whose sales of the Windows operating system depend on sales of new PCs and laptops) are highly motivated to encourage adoption of edge AI, which would likely require consumers to upgrade their devices to ones with modern AI accelerators. In short, many companies stand to benefit from the rise of edge AI and have a strong incentive to promote it.
AI Fund has been exploring a variety of edge AI applications, and I think the opportunities will be rich and varied. Interesting semiconductor technology will support them. For example, AMD's XDNA architecture, drawing on configurable cores designed by Xilinx (now part of AMD), is making it easier to run multiple AI models simultaneously. This enables a future in which one AI model adjusts image quality on our video call, another checks our grammar in real time, and a third pulls up relevant articles.
While it’s still early days for edge AI — in both consumer and industrial markets (for example, running in factories or on heavy machinery) — I think it’s worth investigating, in addition to the numerous opportunities in cloud-hosted AI applications.
P.S. My team at Landing AI will present a livestream, “Building Computer Vision Applications,” on Monday, November 6, 2023, at 10 a.m. Pacific Time. We’ll discuss the practical aspects of building vision applications including how to identify and scope vision projects, choose a project type and model, apply data-centric AI, and develop an MLOps pipeline. Register here!