Ai2’s Computer Use Agent Can Execute Actions Online

With more enterprises interested in AI agents that run locally on their computers and devices, AI research lab Ai2 on Tuesday released its own open source web agent, MolmoWeb, a day after Anthropic introduced an update that gives Claude access to personal computers.

MolmoWeb is a visual web agent that automates browser tasks using multimodal AI. It is built on Ai2’s model family, Molmo 2, and is available in two sizes, 4B and 8B parameters. Ai2 also released its training data set, MolmoWebMix, along with evaluation tools and an inference library, so developers and researchers can self-host, fine-tune and improve the system.

MolmoWeb is similar to Anthropic’s computer use capability, which the AI lab introduced in 2024, in that it allows the AI agent to act on behalf of users. Using computer vision, the AI agent can perceive what is happening on a user’s computer and reason through a sequence of actions to achieve the user’s goals. Anthropic on March 24 revealed that Claude Cowork and Claude Code users can now allow Claude to complete tasks on their computers. Anthropic said that Claude can point, click and navigate what is on a user’s screen to perform tasks. It can open files, use the browser, and automatically run dev tools. Users can also use Dispatch, a feature within Anthropic’s Cowork, to assign Claude tasks from their phones. The feature is available in research preview to Claude Pro and Max subscribers.

MolmoWeb an Open Option

Both MolmoWeb and the Claude update highlight a trend in the AI market in which the AI agent is becoming more personal, and users are focusing on ways to put AI agents to use on their local computers. The difference between the two is that Ai2’s MolmoWeb is open to the community, and Anthropic’s offering is not.

“MolmoWeb is an innovation paradigm of computer use agents similar to the proprietary frontier model providers, but with an open approach to data sets and agents,” said Arun Chandrasekaran, an analyst at Gartner. “It lowers the barrier to entry for studying agentic behavior and understanding agent decision-making processes that are otherwise opaque.”

“This is critical for building safe systems in the future,” Chandrasekaran continued.

MolmoWeb is also an option for enterprises considering open source technology to explore AI agents, said Chris Callison-Burch, a professor of computer and information science at the University of Pennsylvania. Callison-Burch was a visiting research scientist at Ai2 from 2023 to 2024.

“The cost to develop models is quite high, but adopting open source models is potentially a buy-in strategy for a lot of businesses,” Callison-Burch said. 

He added that Ai2 generated a synthetic data set, MolmoWebMix, that enables the AI research lab and developers using MolmoWeb to train agents.

Challenges to Computer Use

While MolmoWeb seems like a good alternative to proprietary AI agents, it also poses challenges, especially because the agent’s computer vision technology means it sees and perceives what the human does. 

The research lab acknowledged that the AI agent can be thrown off track by actions such as scrolling before a web page has finished loading. It also has not been trained on tasks that require financial logins, and its performance degrades with ambiguous instructions.

However, Ai2’s approach of making the data openly available will let researchers and enterprises work through the limitations and overcome them, Callison-Burch said. 

MolmoWeb is available on Hugging Face and GitHub.
