The competitive landscape of large language models (LLMs) is evolving quickly. The ultimate winners are yet to be determined, but the current dynamics are already exciting. Let me share a few observations, focusing on direct-to-consumer chat interfaces and the LLM infrastructure and application layers.
First, ChatGPT is a new category of product. It’s not just a better search engine, auto-complete, or something else we already knew. It overlaps with other categories, but people also use it for entirely different purposes such as writing and brainstorming. Companies like Google and Microsoft that are integrating LLMs into existing products may find that the complexity of switching not only technologies but also product categories raises unique challenges.
OpenAI is clearly in the lead in offering this new product category, and ChatGPT is a compelling direct-to-consumer product. While competitors are emerging, OpenAI’s recent move to have ChatGPT support third-party plugins, if widely adopted, could make its business much more defensible, much like the app stores for iOS and Android helped make those platforms very defensible businesses.
Second, the LLM infrastructure layer, which enables developers to interact with LLMs via an API, looks extremely competitive. OpenAI/Microsoft leads in this area as well, but Google and Amazon have announced their own offerings, and players such as Hugging Face, Meta, Stability AI, and many academic institutions are busy training and releasing open source models. It remains to be seen how many applications will need the power of the largest models, such as GPT-4, versus smaller (and cheaper) models offered by cloud providers or even hosted locally, like gpt4all, which runs on a desktop.
Finally, the application layer, in which teams build on top of LLMs, looks less competitive and full of creativity. While many teams are piling onto “obvious” ideas — say, building question-answering bots or summarizers on top of online content — the sheer diversity of potential LLM-powered applications leaves many ideas relatively unexplored in verticals including specialized coaching and robotic process automation. AI Fund, the venture studio I lead, is working with entrepreneurs to build applications like this. Competition feels less intense when you can identify a meaningful use case and go deep to solve it.
LLMs are a general-purpose technology that’s making many new applications possible. Taking a lesson from an earlier era of tech: after the iPhone came out, I paid $1.99 for an app that turned my phone into a flashlight. It was a good idea, but that business didn’t last. The app was easy for others to replicate and sell for less, and eventually Apple integrated a flashlight into iOS. In contrast, other entrepreneurs built highly valuable and hard-to-build businesses such as Airbnb, Snapchat, Tinder, and Uber, and those apps are still with us. We may already have seen this phenomenon in generative AI: Lensa grew rapidly through last December, but its revenue run rate appears to have collapsed.
Today, in a weekend hackathon, you can build a shallow app that does amazing things by taking advantage of amazing APIs. But over the long term, what excites me are the valuable solutions to hard problems that LLMs make possible. Who will build generative AI’s lasting successes? Maybe you!
One challenge is that the know-how for building LLM products is still evolving. While academic studies are important, current research offers a limited view of how to use LLMs. As the InstructGPT paper says, “Public NLP datasets are not reflective of how our language models are used. . . . [They] are designed to capture tasks that are easy to evaluate with automatic metrics.”
In light of this, community is more important than ever. Talking to friends who are working on LLM products often teaches me non-intuitive tricks for improving how I use them. I will continue trying to help others wherever I can.
P.S. On Tuesday April 25, 2023, I’ll share early ideas on Visual Prompting in a livestream on behalf of my team Landing AI. LLMs let users enter a text prompt and quickly get a text output, which has transformed natural language processing. I’m excited about taking these ideas from text to computer vision so we can let users enter a visual prompt (labeling a few pixels) and quickly get a visual output. You can sign up for the livestream here.
The Music Industry Strikes Back
The music industry fired early shots in an impending war against AI-generated music.
What’s new: Universal Music Group, which owns labels including Deutsche Grammophon, EMI, Interscope, Motown, Polydor, and Virgin, is pressing Spotify and other streaming media services to counter the threat of AI-driven copycats, Financial Times reported.
How it works: Universal Music Group (UMG), which accounts for nearly one-third of the global music market and thus a substantial portion of revenue to distributors of digital music, is prevailing on top streaming services to protect its intellectual property.
- UMG asked Apple Music and Spotify, which license its recordings, to block AI developers from downloading them. It also asked them not to distribute AI-generated songs.
- The company issued takedown requests to numerous YouTube users who created AI-generated imitations of UMG artists such as Drake. Some channels shared the notices.
Behind the news: Music generators like Google’s MusicLM are in their infancy but likely to improve quickly. Hugging Face recently added two music generators to its offerings. Meanwhile, the question of whether AI developers have a right to train their models on works under copyright — images, so far, rather than music — is central to cases underway in United States courts.
Why it matters: The recording industry has significant economic and political clout, and its preferences may play a major role in determining whether AI developers can continue to train their systems on copyrighted works without permission. In the early years of the internet, recording companies helped shut down peer-to-peer music-sharing sites like Napster, which helped create the market for subscription streaming services like Apple Music and Spotify. The latest moves may portend a similar fight. One difference: While the copyright issues surrounding Napster were clear, they have yet to be established with respect to AI.
We’re thinking: Just as the music industry came to support on-demand digital music by way of streaming services, it can create opportunities — both commercial and creative — for AI models that generate music and form partnerships with AI developers to realize them.
Eyes on the Olympics
French lawmakers said “oui” to broad uses of AI-powered surveillance.
What’s new: France’s National Assembly authorized authorities to test systems that detect unlawful, dangerous, or unusual behavior at next year’s Summer Olympics in Paris, Reuters reported. The bill will become law unless the country’s top court blocks it.
How it works: The bill is part of broader legislation that regulates Olympic advertising, doping, and the route run by torch bearers.
- French authorities will be allowed to process video feeds from closed-circuit cameras and drones “on an experimental basis” at sporting, recreational, and cultural events until June 30, 2025, with oversight from the country’s data-privacy regulator.
- The system will send alerts upon detecting certain predetermined events. Lawmakers said the technology will monitor crowds for threats such as surges, abnormal behavior, and abandoned luggage.
- The system won’t include face recognition, collect biometric data, or query biometric information systems.
Behind the news: Technology that collects biometric data would be subject to strict monitoring and reporting requirements under the current draft of the European Union’s forthcoming AI Act, which is scheduled for a vote in May. If it passes, the European Parliament, European Council, and European Commission will negotiate a final version.
Yes, but: Amnesty International, Human Rights Watch, and 36 other nongovernmental organizations signed a letter opposing the French bill. The signatories contend that analyzing the behavior of individuals in a crowd requires collecting personal biometric data, although French authorities deny it.
Why it matters: France’s move is emblematic of broader tension between AI’s value in security applications and its potential for harm. If the bill clears legal hurdles, France will become the first EU country to formally legalize AI-powered surveillance.
We’re thinking: AI has great potential in crowd control. Engineers working on such applications should keep in mind that computer vision systems can be compromised by fluctuations in lighting, changes in physical surroundings, and the complexities of group behavior.
A MESSAGE FROM DEEPLEARNING.AI
Learn how to train and fine-tune large language models using the recently released PyTorch 2.0! Join us for an online workshop on Thursday, April 27, 2023 at 10:00 a.m. Pacific Time. RSVP
AWS Joins the Generative AI Race
Amazon joined big-tech peers Google, Meta, and Microsoft in rolling out services that provide generated text and images.
What’s new: The online retailer launched early access to Bedrock, a cloud platform that offers generative models built by Amazon and its partners.
How it works: Bedrock is aimed at business customers, who can select among image- and text-generation models and fine-tune them for proprietary uses. It’s available to selected customers of Amazon Web Services as a “limited preview.” The price has yet to be announced.
- The platform hosts Stability AI’s Stable Diffusion for image generation. This arrangement extends a partnership announced in November, when Stability AI named Amazon Web Services its preferred provider of cloud processing and storage.
- It offers two third-party language models: AI21’s Jurassic-2 for composing stand-alone text and Anthropic’s Claude for conversational applications such as answering questions.
- Bedrock also includes two language models in Amazon’s own Titan family: Titan Text, which generates and summarizes text, and Titan Embeddings, which produces text embeddings.
Behind the news: Amazon’s peers offer similar capabilities via their respective cloud services.
- Earlier this month, Meta announced plans to launch a tool, powered by an in-house language model, to help advertisers generate ad copy.
- In March, Google announced an API for the PaLM language model as well as tools for building generative text apps on Google Cloud.
- Microsoft Azure offers access to OpenAI models including GPT-4 for generating text and DALL·E 2 for generating images.
Why it matters: Between Amazon and other cloud computing providers, generative AI is rapidly becoming available to developers of all kinds.
We’re thinking: DALL·E 2 and ChatGPT debuted less than a year ago. Generative AI is gathering momentum at warp speed!
Goodbye Prompt Engineering, Hello Prompt Generation
When you’re looking for answers from a large language model, some prompts are better than others. So how can you come up with the best one? A new model automates the process.
What’s new: Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, and colleagues at University of Toronto, Vector Institute, and University of Waterloo developed a procedure for generating effective text to prompt large language models: Automatic Prompt Engineer (APE).
Key insight: Given a handful of input-output pairs, a large language model can generate a prompt that, applied to the same inputs, would produce similar outputs. Moreover, having produced a prompt, it can generate variations of that prompt that may yield outputs that match the examples even more closely.
How it works: APE requires two large language models: a prompt generator (which produces prompts) and a content generator (which, given a prompt, produces output). For the prompt generator, the authors tried both language models that complete inputs (such as GPT-3 and InstructGPT) and those that fill in blanks in inputs (such as T5, GLM, and InsertGPT). For the content generator, they used InstructGPT.
- The authors fed the prompt generator a prompt such as, “I gave a friend an instruction and five inputs. The friend read the instruction and wrote an output for every one of the inputs. Here are the input-output pairs:” followed by a small set of example inputs and outputs, such as the names of two animals and which one is larger, from Instruction Induction. After the example inputs and outputs, the prompt concluded, “The instruction was <COMPLETE>”. The prompt generator responded with a prompt such as “Choose the animal that is bigger.”
- They fed the generated prompt plus 50 example inputs from the dataset to the content generator, which generated outputs.
- They scored the prompt’s quality based on how often the content generator produced outputs that exactly matched the expected outputs.
- They sharpened the prompt by asking the prompt generator to produce a prompt similar to the highest-scoring one (“Generate a variation of the following instruction . . . ”) and repeated the process. They performed this step three times. For example, a higher-scoring variation of the earlier prompt example is “Identify which animal is larger”.
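The generate-score-refine loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the `generate` callable is a hypothetical stand-in for a call to a large language model API (the authors used models such as InstructGPT), and the prompt templates and parameter values are paraphrased from the description above, not the authors’ exact code.

```python
# Minimal sketch of the APE loop. `generate` is a hypothetical callable
# that wraps an LLM API call and returns the model's text completion.

def score_prompt(generate, instruction, examples):
    """Fraction of examples for which the content generator's output
    exactly matches the expected output."""
    hits = sum(
        generate(f"{instruction}\nInput: {x}\nOutput:").strip() == y
        for x, y in examples
    )
    return hits / len(examples)

def ape(generate, demos, eval_examples, n_candidates=10, n_rounds=3):
    """demos: input-output pairs shown to the prompt generator.
    eval_examples: pairs used to score each candidate instruction."""
    pairs = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in demos)
    seed = (
        "I gave a friend an instruction and several inputs. The friend read "
        "the instruction and wrote an output for every one of the inputs. "
        f"Here are the input-output pairs:\n{pairs}\nThe instruction was"
    )
    # Propose candidate instructions, then keep the best-scoring one.
    candidates = [generate(seed) for _ in range(n_candidates)]
    best = max(candidates,
               key=lambda c: score_prompt(generate, c, eval_examples))
    # Iteratively ask for variations of the best instruction so far.
    for _ in range(n_rounds):
        variants = [
            generate(f"Generate a variation of the following instruction: {best}")
            for _ in range(n_candidates)
        ]
        best = max([best] + variants,
                   key=lambda c: score_prompt(generate, c, eval_examples))
    return best
```

In practice the scoring step dominates the cost, since each candidate prompt is evaluated against dozens of held-out examples; the paper’s exact-match metric is only one choice, and softer scores (such as log-likelihood of the expected output) fit the same loop.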
Results: Earlier work on automated prompt engineering used large language models to generate prompts but didn’t iteratively refine them. On 19 of the 24 tasks in Instruction Induction, prompts generated by InstructGPT using APE outperformed both the earlier work and human-engineered prompts according to interquartile mean (IQM), the mean exact-match accuracy after discarding the lowest and highest 25 percent of scores. Across all 24 tasks, prompts produced by InstructGPT using APE achieved 0.765 IQM, while human prompts achieved 0.749 IQM. By optimizing measures of truthfulness and informativeness, the method produced prompts that steered the content generator toward outputs with those qualities. For instance, on TruthfulQA, a question-answering dataset that tests for truthful and informative answers, answers produced by InstructGPT using APE were rated both true and informative 40 percent of the time, while answers produced using human-composed prompts achieved 30 percent (although the APE-generated prompts often led to shortcuts such as “no comment,” which is highly truthful but carries little information).
Why it matters: As researchers develop new large language models, APE provides a systematic way to get the most out of them.
We’re thinking: Prompt engineers have only existed for a few years, and already robots are coming for their jobs!
Canada investigates OpenAI
The Canadian Office of the Privacy Commissioner announced a probe of ChatGPT’s maker in response to complaints about the chatbot’s collection and use of personal information. (Analytics Insight)
Fanfic writers accused of employing generative AI
Members of Archive of Our Own, an online repository for fan fiction, received dozens of anonymous comments that accuse them of publishing AI-generated content. (The Verge)
United States investors are funding Chinese AI startups
Institutional investors in the U.S. are indirectly financing Chinese AI startups through key Chinese venture capital firms such as Sequoia Capital China. U.S. government officials have expressed concerns about these investments. (The Information)
Medical startup Glass Health developed a chatbot for doctors
Glass AI suggests possible diagnoses and treatment options for patients. (NPR)
How AI is impacting historical research
Historians are adopting machine learning to study historical documents. Although skepticism exists about this new technology, the field is gradually accepting it. (MIT Technology Review)
Survey revealed what students and educators think about ChatGPT
Study.com's survey found that almost all students know of ChatGPT, but more than a third of educators believe that the chatbot should not be used in teaching. (Study.com)
Hollywood grapples with generative AI
The entertainment industry is leveraging models that generate text, images, and audio to streamline production, but it also faces a challenge as these tools can use copyrighted material to generate original scripts, images, and films. (The Wall Street Journal)
China takes steps to regulate generative AI
The Cyberspace Administration of China (CAC) proposed draft measures to manage generative AI services and mitigate possible risks in terms of personal data and inappropriate content. (Reuters)
AI art is shaking the video game industry in China
Game developers’ adoption of AI image generators is sparking anxiety among Chinese animators and illustrators. However, players and artists alike are not impressed with the AI-generated products. (Rest of World and Kotaku)