Google DeepMind Introduces Agentic Vision to Gemini 3 Flash

This week, Google DeepMind added agentic vision capabilities to its Gemini 3 Flash model, turning image analysis into an active rather than passive task.

Typical multimodal models process images in a single “glance.” By introducing agentic capabilities, Google allows its model to actively study a picture and home in on specific details, such as a street sign or a serial number on a microchip.

The new feature works by generating and running Python code that zooms, manipulates and inspects images methodically.
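As a rough illustration of the kind of zoom step described above, here is a minimal sketch in plain Python. It is not Google's generated code: the image is modeled as a nested list of grayscale values so the example runs without an imaging library, and the names `crop`, `zoom` and `zoom_into_region` are hypothetical.

```python
from typing import List

# Assumption for illustration: a grayscale image is a list of rows of ints.
Image = List[List[int]]


def crop(image: Image, left: int, top: int, right: int, bottom: int) -> Image:
    """Return the region covering columns left..right-1 and rows top..bottom-1."""
    return [row[left:right] for row in image[top:bottom]]


def zoom(image: Image, factor: int) -> Image:
    """Upscale by an integer factor using nearest-neighbor repetition."""
    zoomed: Image = []
    for row in image:
        # Repeat each pixel horizontally, then repeat the widened row vertically.
        wide_row = [pixel for pixel in row for _ in range(factor)]
        zoomed.extend(list(wide_row) for _ in range(factor))
    return zoomed


def zoom_into_region(image: Image, left: int, top: int,
                     right: int, bottom: int, factor: int = 2) -> Image:
    """Crop a region of interest and enlarge it for closer inspection."""
    return zoom(crop(image, left, top, right, bottom), factor)


if __name__ == "__main__":
    picture = [[r * 4 + c for c in range(4)] for r in range(4)]
    for row in zoom_into_region(picture, 1, 1, 3, 3, factor=2):
        print(row)
```

In practice a model would more likely call an imaging library for this step. The point of the sketch is only that "zooming" reduces to a crop followed by a resize, which code can perform deterministically.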

“By combining visual reasoning with code execution, one of the first tools supported by Agentic Vision, the model formulates plans to zoom in, inspect and manipulate images step-by-step, grounding answers in visual evidence,” Rohan Doshi, product manager at Google DeepMind, wrote in a blog post about the announcement.

The feature uses a Think-Act-Observe loop, whereby Gemini 3 Flash will study a user query and image and formulate a plan, use Python code to actively conduct an image analysis, and then inspect the results before generating its final response. 
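The control flow of such a loop can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Gemini's implementation: `think`, `act` and `answer` are hypothetical callables standing in for the model's planning, its code execution and its final response, and `max_steps` is an assumed safety cap.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class LoopState:
    """Hypothetical container for the query and everything observed so far."""
    query: str
    observations: List[str] = field(default_factory=list)


def think_act_observe(
    query: str,
    think: Callable[[LoopState], Optional[str]],
    act: Callable[[str], str],
    answer: Callable[[LoopState], str],
    max_steps: int = 5,
) -> str:
    """Run a bounded Think-Act-Observe loop and return the final answer."""
    state = LoopState(query)
    for _ in range(max_steps):
        action = think(state)               # Think: plan the next step, or None to stop
        if action is None:
            break
        result = act(action)                # Act: e.g. run code that inspects the image
        state.observations.append(result)   # Observe: record the result for the next plan
    return answer(state)


if __name__ == "__main__":
    # Stub behaviors for demonstration only.
    def plan(state: LoopState) -> Optional[str]:
        return "zoom on serial number" if not state.observations else None

    def run(action: str) -> str:
        return f"ran: {action}"

    def respond(state: LoopState) -> str:
        return f"{state.query} -> {state.observations}"

    print(think_act_observe("What is the serial number?", plan, run, respond))
```

The step cap is a design choice in this sketch, so that a planner that never returns `None` cannot loop forever.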

According to Google, the update delivers a quality improvement of between 5% and 10% across vision benchmarks.

Google said a range of new agentic behaviors enabled by the update have already been demonstrated via Google AI Studio, such as iterative zooming, direct image annotation and visual plotting. The latter is said to reduce hallucinations, a common problem with visual math tasks.
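To make the idea of direct image annotation concrete, the sketch below outlines a region of interest on a copy of an image. As with any illustration of this kind, it is an assumption-laden example rather than the model's actual output: the image is a nested list of grayscale values, and `draw_box` is a hypothetical helper.

```python
from typing import List

# Assumption for illustration: a grayscale image is a list of rows of ints.
Image = List[List[int]]


def draw_box(image: Image, left: int, top: int,
             right: int, bottom: int, value: int = 255) -> Image:
    """Return a copy of the image with a one-pixel box outline drawn on it.

    The box covers columns left..right-1 and rows top..bottom-1.
    """
    annotated = [list(row) for row in image]  # copy so the input stays unchanged
    for x in range(left, right):
        annotated[top][x] = value          # top edge
        annotated[bottom - 1][x] = value   # bottom edge
    for y in range(top, bottom):
        annotated[y][left] = value         # left edge
        annotated[y][right - 1] = value    # right edge
    return annotated


if __name__ == "__main__":
    blank = [[0] * 4 for _ in range(4)]
    for row in draw_box(blank, 1, 1, 4, 4, value=9):
        print(row)
```

Marking the region on the image itself gives the model something concrete to inspect in its next step, instead of relying on a remembered location.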

Looking ahead, the company said it plans to make more code-driven behaviors implicit in the model, meaning certain capabilities that currently require a specific prompt will be triggered autonomously.

More features, such as web and reverse image search, as well as a greater range of model sizes, are also expected to be rolled out in the future. 
