Google/gemini-3-pro-image-preview
From $0.100500 / call
Google's flagship image generation and editing model built on Gemini 3 Pro, featuring up to 4K resolution, precise multilingual text rendering, real-time Google Search grounding, and studio-quality creative controls.
Nano Banana Pro (Gemini 3 Pro Image, model ID gemini-3-pro-image-preview) is Google DeepMind's flagship image generation and editing model released in November 2025, built on the Gemini 3 Pro foundation model. As the successor to Nano Banana (Gemini 2.5 Flash Image), it delivers significant improvements in multimodal reasoning, real-world grounding, and visual synthesis fidelity.
The model's core advance is bringing reasoning into the image generation pipeline: it "plans" a scene by working out physics, lighting, and compositional logic before rendering, which yields more logically consistent, professional-grade results. It can also pull real-time data from Google Search to ground generated content in real-world information (e.g., weather maps, sports data, biological diagrams), reducing factual hallucinations.
Key Capabilities
- High-Fidelity Text Rendering: Industry-leading in-image text generation — renders clear, correctly spelled, stylistically diverse text directly within images across Chinese, English, Japanese, Korean and more, ideal for posters, menus, infographics, and UI mockups.
- 4K Ultra-High Resolution Output: Supports native 1K, 2K, and 4K resolution output to meet needs ranging from social media to print publishing.
- Multi-Image Fusion & Character Consistency: Accepts up to 14 reference images as input, maintaining facial consistency for up to 5 people or blending up to 6 high-fidelity reference shots into a unified composition.
- Studio-Quality Creative Controls: Fine-tune camera angles, depth of field, focus, lighting (e.g., day-to-night, bokeh effects), and color grading for professional photography-grade image control.
- Google Search Grounding: Connects to real-time Google Search data to generate data-driven visual content such as maps, charts, and infographics based on factual information.
- Multilingual Localization: Understands the semantic context of text within images, enabling direct cross-language translation of menus, signs, and documents while preserving original layout and artistic style.
- Precise Localized Editing: Select, refine, and transform any region of an image through natural language descriptions to add, remove, or replace elements.
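The limits listed above (1K, 2K, or 4K output and at most 14 reference images) can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official SDK: `check_request`, `SUPPORTED_RESOLUTIONS`, and `MAX_REFERENCE_IMAGES` are hypothetical names introduced here for illustration, and the actual API may enforce further constraints that this README does not describe.

```python
# Hypothetical pre-flight check; the constants mirror the limits stated in this README.
SUPPORTED_RESOLUTIONS = ("1K", "2K", "4K")
MAX_REFERENCE_IMAGES = 14


def check_request(resolution, reference_images):
    """Return a list of human-readable problems; an empty list means no problem was found."""
    problems = []
    if resolution not in SUPPORTED_RESOLUTIONS:
        problems.append(
            f"unsupported resolution {resolution!r}; expected one of {SUPPORTED_RESOLUTIONS}"
        )
    if len(reference_images) > MAX_REFERENCE_IMAGES:
        problems.append(
            f"{len(reference_images)} reference images given; "
            f"at most {MAX_REFERENCE_IMAGES} are accepted"
        )
    return problems


if __name__ == "__main__":
    # Three reference images at 2K falls within the documented limits.
    print(check_request("2K", ["logo.png", "palette.png", "character.png"]))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.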
Technical Strengths
| Feature | Benefit |
|---|---|
| Reasoning-driven generation | Leverages Gemini 3 Pro's advanced reasoning to understand physics, lighting, and causal logic before rendering, producing more spatially and compositionally coherent results |
| Text rendering precision | Accurately renders everything from taglines to long paragraphs with multilingual and typographic style support, solving the "gibberish text" problem in AI-generated images |
| Native 4K resolution | Outputs 4K quality without post-processing upscaling — texture detail, color depth, and sharpness meet print and large-format display standards |
| World knowledge & real-time data | Combines Gemini 3 Pro's knowledge base with Google Search access for factually accurate and timely generated content |
| 14-image visual context window | Simultaneously loads complete brand style guides (logos, color palettes, character sheets, etc.) to ensure generated results precisely match brand identity |
| SynthID watermarking | All generated or edited images are automatically embedded with an invisible SynthID digital watermark for AI content provenance |
Pricing
| Quality | LinkAI Price ($ per call) | Official Price ($ per call) |
|---|---|---|
| 1K | 0.100500 | 0.134000 |
| 2K | 0.100500 | 0.134000 |
| 4K | 0.180000 | 0.240000 |
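To budget a batch of generations, the per-call prices in the table can be multiplied by the expected number of calls at each quality level. The sketch below assumes the table values are per-call prices in the currency shown in the listing ("$ ... / call"); the function names `estimate_cost` and `price_ratio` are illustrative and not part of any API.

```python
from decimal import Decimal

# Per-call prices copied from the pricing table above.
LINKAI_PRICE = {
    "1K": Decimal("0.100500"),
    "2K": Decimal("0.100500"),
    "4K": Decimal("0.180000"),
}
OFFICIAL_PRICE = {
    "1K": Decimal("0.134000"),
    "2K": Decimal("0.134000"),
    "4K": Decimal("0.240000"),
}


def estimate_cost(calls_by_quality, price_table=LINKAI_PRICE):
    """Sum price * calls over every quality level in calls_by_quality."""
    total = Decimal("0")
    for quality, calls in calls_by_quality.items():
        if quality not in price_table:
            raise ValueError(f"unknown quality: {quality!r}")
        if calls < 0:
            raise ValueError("call counts must be non-negative")
        total += price_table[quality] * calls
    return total


def price_ratio(quality):
    """LinkAI price divided by the official price for one quality level."""
    return LINKAI_PRICE[quality] / OFFICIAL_PRICE[quality]


if __name__ == "__main__":
    batch = {"1K": 100, "4K": 10}
    print("LinkAI:  ", estimate_cost(batch))
    print("Official:", estimate_cost(batch, OFFICIAL_PRICE))
    print("Ratio at 4K:", price_ratio("4K"))
```

`Decimal` is used instead of `float` so that six-decimal prices sum without binary rounding error. For every row in the table the LinkAI price works out to 75% of the official price.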