OpenAI added new features designed to help developers build applications using its generative models.
What’s new: OpenAI introduced a plethora of capabilities at its first developer conference in San Francisco.
Upgrades and more: The company rolled out the upgraded GPT-4 Turbo (which now underpins ChatGPT). It extended API access to its DALL·E 3 image generator, text-to-speech engine, speech recognition, and agent-style capabilities. And it showed off a new concept in chatbots called GPTs.
- GPT-4 Turbo expands the number of tokens (typically words or parts of words) the model can process at once to 128,000 — up from a previous maximum of 32,000. That enables the model to process context over the length of a book. API access costs between one-third and half the previous cost of GPT-4 Turbo’s predecessors (some of which got price cuts).
- GPT-4 Turbo includes a JSON mode that returns valid JSON, enabling developers to get usable structured data from a single API call. Reproducible outputs (in beta) make the model’s behavior more consistent from one use to another by letting users specify a random number seed. Log probabilities (available soon) will allow developers to build features like autocomplete by predicting which tokens are likely to appear next in a sequence.
- New API calls enable developers to take advantage of image input/output, text-to-speech, and speech recognition (coming soon). New calls are available for building agent-style applications that can reason about and execute sequences of actions to complete a task. They can also retrieve information external to the model and execute functions.
- The company introduced GPTs: custom chatbots that can be configured using a conversational interface and distributed in store, like mobile apps. For instance, Canva built a GPT that generates graphics to order through conversation.
Why it matters: OpenAI is enabling developers to build intelligence into an ever wider range of applications. GPT-4 Turbo's 128,000-token context window makes possible applications that require tracking information across huge volumes of input. The expanded APIs open up language, vision, and multimodal capabilities as well as agent-style applications that respond to changing conditions and behave in complex ways. The opportunities for developers are immense.
We’re thinking: It’s amazing to see cutting-edge AI developments become widely available so quickly. Early on, OpenAI withheld its work out of fear that it could be misused. But that policy clearly no longer holds. “We believe that gradual iterative deployment is the best way to address safety challenges of AI,” OpenAI CEO Sam Altman said in his keynote. Based on the evidence to date, we agree.