Generative AI Calling

Google brings advanced computer vision and audio tech to Pixel 8 and 8 Pro phones.

(Image: excerpt from a Google Pixel 8 promotional video)

Google’s new mobile phones put advanced computer vision and audio research into consumers’ hands.

What’s new: The Alphabet division introduced its flagship Pixel 8 and Pixel 8 Pro smartphones at its annual hardware-launch event. Both units feature AI-powered tools for editing photos and videos.

How it works: Google’s new phones process images in distinctive ways using algorithms that run on the device itself. They raise the bar for Apple, the smartphone market leader, to turn its own internal research into products.

  • A feature called Best Take lets users select elements from multiple photos and stitches them into a single image. In a group photo, users might replace faces with closed eyes or grimaces with alternatives from other shots that show open eyes and wide smiles.
  • Magic Editor uses image-generation technology to edit or alter images. Users can move and resize individual elements and swap in preset backgrounds. They can also generate out-of-frame parts of an element — or an entire photo — on the fly.
  • Audio Magic Eraser splits a video’s audio into distinct sounds, enabling users to adjust their relative volume. This capability can be useful to reduce distracting noises or boost dialogue.
  • Video Boost, which will arrive later this year on the Pixel 8 Pro only, will improve the image quality of videos by automatically stabilizing motion and adjusting color, lighting, and grain.
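Google hasn’t published how these features work on-device, but the remixing step behind a tool like Audio Magic Eraser is conceptually simple: once a separation model has split a clip into stems, changing their relative volume is just a per-stem gain followed by a sum. The `remix` function, the stem names, and the sample values below are hypothetical stand-ins, a minimal sketch of that final mixing step only, not Google’s implementation:

```python
def remix(stems, gains):
    """Recombine separated audio stems with per-stem volume gains.

    stems: dict mapping stem name -> list of float samples (equal lengths)
    gains: dict mapping stem name -> linear gain (1.0 = unchanged, 0.0 = mute)
    """
    length = len(next(iter(stems.values())))
    out = [0.0] * length
    for name, samples in stems.items():
        gain = gains.get(name, 1.0)  # stems without a specified gain pass through
        for i, sample in enumerate(samples):
            out[i] += gain * sample
    return out

# Hypothetical example: keep the speech stem, cut the wind stem to half volume.
stems = {"speech": [0.5, -0.5], "wind": [0.25, 0.25]}
mixed = remix(stems, {"speech": 1.0, "wind": 0.5})
```

The hard part, of course, is the separation model that produces the stems in the first place; the mix-down itself is linear.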

Behind the news: Google researchers have actively pursued AI systems that alter or enhance images, video, and audio.

  • Best Take and Magic Editor resemble a system Google and Georgia Tech researchers described in an August 2023 paper, which uses diffusion models to segment and merge multiple images.
  • Magic Editor echoes Imagen, Google’s diffusion text-to-image generator. 
  • Audio Magic Eraser resembles capabilities described in a recent paper that proposes AudioScopeV2 to separate and recombine various audio and video tracks.

Why it matters: Smartphones produce most of the world’s photos and videos. Yet generative tools for editing them have been confined to the desktop, social-network photo filters notwithstanding. Google’s new phones bring the world closer to parity between the capabilities of desktop image editors and hand-held devices. And the audio-editing capabilities raise the bar all around.

We’re thinking: Earlier this year, Google agreed to uphold voluntary commitments on AI, including developing robust mechanisms, such as watermarks, that would identify generated media. Will Google apply such a mark to images edited by Pixel users? 
