Google’s mid-sized Gemma 2 competes with Llama 3 and other open giants
Plus, ESM3’s new model can engineer proteins’ sequence, structure, and function

Published: Jul 1, 2024

Reading time: 3 min read

Twice a week, Data Points brings you the latest AI news, tools, models, and research in brief. Today’s edition includes:

MARS5, an open source voice cloning tool
A new study on the most common forms of AI misuse
Google finds new data sources to ground its agents
A prize for AI that can solve puzzles that baffle machines

But first:

Google launches Gemma 2, an open-source model available in 9 billion and 27 billion parameter sizes
The new model offers performance and efficiency gains over its predecessor, with the 27B version competing with Llama 3 and Grok 1 while running on a single GPU. Base and instruction-tuned versions of both model sizes and their weights are freely available through multiple platforms, including Google AI Studio, Kaggle, and Hugging Face Models. Gemma 2 also incorporates several technical advances, including sliding window attention, logit soft-capping, knowledge distillation, and model merging. At 27 billion parameters, the larger model is an unusual size: not quite small enough to run locally (except in heavily quantized versions) but not nearly as large as leading open or closed competitors. (Google and Hugging Face)
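
For readers curious about two of the techniques named above, here is a minimal sketch in plain Python. It shows the general idea of logit soft-capping (squashing values through a scaled tanh) and a causal sliding-window attention mask. The cap value, window size, and function names are illustrative assumptions, not Gemma 2's actual configuration or code.

```python
import math

def soft_cap(logits, cap):
    """Smoothly squash each logit into the range (-cap, cap) with a scaled tanh."""
    return [cap * math.tanh(x / cap) for x in logits]

def sliding_window_mask(seq_len, window):
    """Return mask[i][j] == True when query position i may attend to key position j.

    Causal: j must not come after i.
    Local: only the most recent `window` positions (including i itself) are visible.
    """
    return [[(j <= i) and (i - j < window) for j in range(seq_len)]
            for i in range(seq_len)]

# Illustrative values only (assumptions, not taken from the model's config).
capped = soft_cap([-200.0, -1.0, 0.0, 1.0, 200.0], cap=30.0)
mask = sliding_window_mask(seq_len=5, window=3)
```

Small logits pass through nearly unchanged while extreme ones are bounded by the cap, and each position in the mask sees only itself and the few positions just before it.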

Evolutionary Scale announces ESM3, an open model for protein engineering
Trained on billions of proteins, the model has potential applications for biology and medicine, and can also simulate evolution. ESM3 can reason over protein sequence, structure, and function as either input or output, one at a time or simultaneously. The model is currently available via an API, and the Amazon- and Nvidia-backed company plans to release open base and instruction-tuned versions at 1.4 billion, 7 billion, and 98 billion parameters to accelerate scientific research. (Evolutionary Scale)

CAMB.AI releases MARS5, an open source competitor to ElevenLabs
CAMB.AI’s MARS5, a new speech cloning model, can generate realistic speech for diverse, difficult-to-replicate scenarios like sports commentary and anime using just 5 seconds of audio and a text snippet. MARS5 uses a combination of a transformer encoder-decoder and diffusion inpainting to generate “deep cloned” speech output. The model allows users to guide variations in prosody by using punctuation, capitalization, and other text formatting. (GitHub and CAMB.AI)

Study reveals deepfakes as leading form of AI abuse
A new study by Google DeepMind and Jigsaw analyzed 200 real-world incidents of AI misuse from January 2023 to March 2024. The researchers found that creating and spreading deceptive deepfake media, especially targeting politicians and public figures, is the most common malicious use of AI. The study also identified using language models to generate disinformation as the second most frequent type of AI abuse. Influencing public opinion and political narratives was the primary motivation behind over a quarter of the cases analyzed, followed by the use of deepfakes or disinformation for financial gain, whether through monetization of services or outright fraud. (arXiv.org)

Google’s Agent Builder expands options for grounding agents in real-world data
Google announced new features for its Vertex AI Agent Builder, including improved grounding with Google Search, a high-fidelity mode to reduce hallucinations by drawing information only from the provided context, and upcoming support for third-party datasets from Moody’s, MSCI, Thomson Reuters, and ZoomInfo. Google is also expanding its Vector Search capabilities to include hybrid search, combining vector-based and keyword-based techniques for more relevant results. These changes address some of the limitations of grounding agents in Google Search, and aim to help developers and businesses build more accurate and capable AI agents by grounding them in reliable information. (Google)
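
To illustrate what combining vector-based and keyword-based results can look like, here is a minimal sketch using reciprocal rank fusion, a common generic way to merge ranked lists. The announcement does not say how Vector Search fuses results, so this is a stand-in technique rather than Google's method, and the document IDs and the constant `k=60` are hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so items ranked well by both retrievers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

vector_hits = ["doc_a", "doc_b", "doc_c"]   # hypothetical semantic (embedding) results
keyword_hits = ["doc_c", "doc_a", "doc_d"]  # hypothetical keyword results
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Documents that appear in both lists (`doc_a` and `doc_c` here) outrank those found by only one retriever.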

$1 million ARC prize fund offered for AI that can solve human-like reasoning puzzles
The Abstraction and Reasoning Corpus (ARC) test, designed to resist AI’s memorization abilities, challenges systems to deduce patterns in paired grids of pixelated shapes. To win the grand prize of $500,000, an AI must match or exceed average human performance within twelve hours using limited computing power. The prize’s backers, Zapier’s Mike Knoop and Google’s François Chollet, believe any winning model will have to demonstrate capabilities like object permanence and geometric reasoning that current large language models typically lack. (ARC Prize)
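
As a rough picture of what such a puzzle looks like, here is a toy sketch: a few input/output grid pairs demonstrate a hidden rule, and a solver must apply that rule to a new grid. The task below is invented for illustration (it is not from the actual corpus), and the candidate rule, a left-to-right mirror, is hand-picked where a real solver would have to discover it.

```python
def flip_horizontal(grid):
    """Candidate rule: mirror every row left to right."""
    return [list(reversed(row)) for row in grid]

def solves_all(rule, pairs):
    """Check that a candidate rule reproduces every demonstrated output."""
    return all(rule(pair["input"]) == pair["output"] for pair in pairs)

# A made-up task: grids are lists of rows, and each integer stands for a color.
task = {
    "train": [
        {"input": [[1, 0, 0], [0, 2, 0]], "output": [[0, 0, 1], [0, 2, 0]]},
        {"input": [[3, 3, 0], [0, 0, 4]], "output": [[0, 3, 3], [4, 0, 0]]},
    ],
    "test": [{"input": [[5, 0], [0, 6]]}],
}

prediction = None
if solves_all(flip_horizontal, task["train"]):
    # The rule explains every demonstration, so apply it to the unseen grid.
    prediction = flip_horizontal(task["test"][0]["input"])
```

The hard part of the real challenge is finding the rule from only a handful of examples, which is why memorization helps so little.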

Still want to know more about what matters in AI right now?

Read last week’s issue of The Batch for in-depth analysis of news and research.

This week, Andrew Ng discussed the contrasting views of AI as a tool versus a separate entity:

“When I was a high-school student in an internship job, I spent numerous hours photocopying, and I remember wishing I could automate that repetitive work. Humans do lots of valuable work, and AI, used as a tool to automate what we do, will create lots of value. I hope we can empower people to use tools to automate activities they’re allowed to do, and erect barriers to this only in extraordinary circumstances, when we have clear evidence that it creates more harm than benefit to society.”

Read Andrew's full letter here.

Other top AI news and research stories we covered in depth included the U.S. antitrust investigation into three AI giants, the new multilingual competitor to GPT-4, a growing market for lifelike avatars of deceased loved ones, and new benchmarks for agentic behaviors.
