Google introduced Gemini 3 Pro and Nano Banana Pro, its flagship vision-language and image-generation models, and deployed them to billions of users worldwide.
Gemini 3 Pro: A multimodal reasoning model, Gemini 3 Pro leads LMArena’s Text, WebDev, and Vision leaderboards as of this writing. The update replaces Gemini 2.5’s token budget for reasoning with a reasoning-level setting (low, medium, or high), which Google says is simpler to manage.
- Input/output: Text, images, PDFs, audio, and video in (up to 1 million tokens), text out (up to 64,000 tokens, 128 tokens per second)
- Architecture: Mixture-of-experts transformer
- Training: Pre-trained on data (text, code, images, video, audio) scraped from the web, licensed data, Google user data, synthetic data; fine-tuned to reason, follow instructions, and align with human preferences via unspecified reinforcement learning methods using data that represents multi-step reasoning, problem-solving, and theorem proofs
- Features: Tool use (Google search, URL context, Python code execution, file search, function calling), structured outputs, adjustable reasoning (low, medium, high)
- Performance: In Google’s tests, Gemini 3 Pro raised the state of the art on Humanity’s Last Exam (reasoning), GPQA Diamond (academic knowledge), AIME 2025 (competition math problems), MMMU-Pro (multimodal reasoning), and MRCR v2 (long-context performance), by substantial margins in some cases. For roughly a week — before Anthropic’s Claude Opus 4.5 swooped in — it also held the top spots on SWE-bench Verified (agentic coding), Terminal-Bench 2.0 (agentic terminal coding), and ARC-AGI-2 (visual reasoning puzzles).
- Availability: Free via the Gemini app and AI Overviews in Google Search; integrated with the paid services Google AI Studio, Vertex AI, and the Google Antigravity agentic coding tool; API $2/$0.20/$12 per million input/cached/output tokens for input contexts up to 200,000 tokens, $4/$0.40/$18 per million input/cached/output tokens for input contexts greater than 200,000 tokens (plus $4.50 per million cached tokens per hour of storage)
- Knowledge cutoff: January 2025
- Undisclosed: Parameter count, architecture details, training methods
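The tiered pricing above is easy to misread, so here is a minimal sketch of a per-request cost estimate (the function name is illustrative; rates come from the list above, and we assume the 200,000-token input-context threshold selects the tier, excluding the time-based cache-storage charge):

```python
def gemini_3_pro_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    """Estimate per-request cost in USD from the published per-million-token rates.
    Requests whose input context exceeds 200,000 tokens use the higher tier.
    Excludes the $4.50 per million cached tokens per hour storage charge."""
    if input_tokens <= 200_000:
        in_rate, cache_rate, out_rate = 2.00, 0.20, 12.00   # short-context tier
    else:
        in_rate, cache_rate, out_rate = 4.00, 0.40, 18.00   # long-context tier
    return (input_tokens * in_rate
            + cached_tokens * cache_rate
            + output_tokens * out_rate) / 1_000_000

# Example: 100,000 input tokens, no cache hits, 10,000 output tokens
print(round(gemini_3_pro_cost(100_000, 0, 10_000), 2))  # 0.32
```

Note that crossing the 200,000-token threshold doubles the input rate for the whole request, not just the tokens beyond the threshold.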
Yes, but: Gemini 3 Pro uses a lot of tokens to achieve its outstanding performance. Completing the Artificial Analysis Intelligence Index, a weighted average of 10 benchmarks, cost $1,201, second only to Grok 4 ($1,888). It also tends to answer incorrectly when it could decline. On the Artificial Analysis Omniscience Hallucination Rate, the proportion of wrong answers among all non-correct responses (wrong answers plus refusals), Gemini 3 Pro’s rate (88 percent) was far higher than those of Claude Sonnet 4.5 (48 percent) and GPT-5.1 High (5 percent).
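The hallucination-rate metric as described above reduces to a simple ratio; this sketch uses illustrative counts, not Artificial Analysis’ actual data:

```python
def hallucination_rate(wrong: int, refused: int) -> float:
    """Share of non-correct responses that are wrong answers rather than
    refusals. A lower rate means the model more often declines when unsure."""
    non_correct = wrong + refused
    return wrong / non_correct if non_correct else 0.0

# Illustrative: among 100 non-correct responses, 88 wrong answers and
# 12 refusals yields a rate of 0.88 (the 88 percent reported for Gemini 3 Pro).
print(hallucination_rate(88, 12))  # 0.88
```

By this definition, a high rate does not mean the model is usually wrong overall; it means that when the model fails, it confidently answers rather than refusing.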
Nano Banana Pro: Google also launched Nano Banana Pro (also known as Gemini 3 Pro Image), which currently tops Artificial Analysis’ Text-to-Image and Image Editing leaderboards. Nano Banana Pro uses Gemini 3 Pro’s reasoning and knowledge when producing and editing images, generating up to two intermediate images to refine composition and logic before producing the final image. It’s designed to excel at text generation and to maintain up to five consistent characters across multiple generations. It grounds images using Google search to produce factually accurate infographics, maps, and the like, and it translates or alters text within images while preserving artistic style.
- Input/output: Text or images in (up to 1 million tokens, up to 14 reference images), images out (up to 64,000 tokens; 1024x1024, 2048x2048, or 4096x4096 pixel resolution)
- Architecture: Based on Google Gemini 3 Pro
- Training: Same as Google Gemini 3 Pro
- Features: SynthID watermarking on all outputs; reasoning enabled by default to refine composition before final output; integration with Google search and creative tools from Adobe and Figma; editing of multiple characters, text, and doodles (user sketches on images)
- Performance: In Google’s human evaluations, Nano Banana Pro earned higher ratings than OpenAI GPT-Image 1, Gemini 2.5 Flash Image, ByteDance Seedream v4, and Black Forest Labs Flux Pro Kontext Max in all tasks tested. In a test of text rendering, Nano Banana Pro (1,198 Elo) outperformed the next-best GPT-Image 1 (1,150 Elo). Producing infographics, Nano Banana Pro (1,268 Elo) outperformed Gemini 2.5 Flash Image (1,162 Elo).
- Availability: Via Gemini app (globally) when selecting Thinking and Create Images (quotas based on tier, free tier included), AI Mode in Google Search (only for U.S.-based Google AI Pro and Ultra subscribers), Google Ads, Google Workspace (Slides and Vids), NotebookLM, Gemini API, Google AI Studio, Vertex AI, and Google Antigravity; API $0.0011 per input image, $0.134 (1024x1024 or 2048x2048 pixel resolution) or $0.24 (4096x4096 pixel resolution) per output image
- Knowledge cutoff: January 2025
- Undisclosed: Parameter count, architecture details, training methods
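Unlike Gemini 3 Pro’s per-token pricing, Nano Banana Pro bills per image, with the output rate depending on resolution. A minimal sketch (the function name is illustrative; rates come from the availability line above, which prices 1024x1024 and 2048x2048 outputs identically):

```python
def nano_banana_pro_cost(input_images: int, output_images: int, out_resolution: int) -> float:
    """Estimate USD cost from the listed per-image rates.
    out_resolution is the output edge length in pixels: 1024 or 2048
    ($0.134 per image) or 4096 ($0.24 per image)."""
    per_output = 0.24 if out_resolution == 4096 else 0.134
    return input_images * 0.0011 + output_images * per_output

# Example: 3 reference images in, one 2048x2048 image out
print(round(nano_banana_pro_cost(3, 1, 2048), 4))  # 0.1373
```

At these rates, input images are nearly free; the output resolution dominates the cost of a generation.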
Behind the news: Google rolled out Gemini 3 Pro and Nano Banana Pro more broadly than Anthropic’s August launch of Claude Opus 4.1 or OpenAI’s early-November launch of GPT-5.1. Rather than leading with an API and a handful of new apps, Google pushed its new models into services that reach over 2 billion people each month, including AI Overviews in Google Search, Gmail, Docs, Sheets, and Android. At the same time, it launched Antigravity, an agentic coding platform that competes with tools like Cursor and Claude Code.
Why it matters: After trailing OpenAI and Anthropic on many benchmarks for months, Google now leads on many of them (despite a partial upset by Claude Opus 4.5, which arrived a week later). For developers who are evaluating which model to use, this could change their default option. Broadly, benchmark leadership has shifted multiple times in 2025, which suggests that no single company has established a durable technical lead.
We’re thinking: While Gemini 3 Pro defines the state of the art for more than a dozen popular benchmarks — this week, at least! — Google’s market power and edge in distribution may matter more. Its ability to deploy to billions of users instantly through its established products provides a wide moat that most competitors, apart from Apple with its iPhone empire, may find difficult to cross purely by releasing better models.