Perplexity, CoreWeave Deal Boosts Inferencing

Neocloud provider CoreWeave and AI search vendor Perplexity have agreed to a multiyear deal to scale Perplexity’s AI search and inference capabilities. 

The agreement, financial terms of which were not disclosed, underscores the broad applicability of inferencing and the ongoing shift from AI training to AI inference.

The vendors revealed on March 4 that Perplexity will migrate its next AI inference workloads to CoreWeave Cloud. The partnership will use Nvidia’s GB200 NVL72 clusters to power Perplexity’s AI model, Sonar, and its Search API ecosystem. Perplexity will also use CKS (CoreWeave Kubernetes Service) and W&B (Weights & Biases) Models for model management and deployment. Both are core components of CoreWeave’s AI cloud platform: CKS is a managed service optimized for computationally intensive AI workloads, and W&B Models is a specialized “system of record” for managing the lifecycle of machine learning models.

Perplexity and CoreWeave’s deal is further evidence of the AI market’s shift toward inference, the process by which a trained AI model applies what it learned during training to new inputs. Several recent deals have been struck specifically for inference. For instance, OpenAI recently committed to using 2 gigawatts of capacity on AWS’ Trainium3 and Trainium4 chips, following an expansion of its partnership with the cloud provider. Meta also plans to deploy millions of Nvidia Blackwell and Rubin GPUs to run high-volume inference and agentic workloads.

More Inferencing 

“Inference is an ongoing, continuous workload,” said Nick Patience, an analyst at Futurum Group, adding that inference is nonstop. “Everybody in the whole ecosystem believes inference is the bigger opportunity by quite some scale.”

While it may seem that an emphasis on inference benefits only vendors (including AI labs, model makers, and hardware providers), enterprise customers that build AI into consumer-facing applications also benefit, according to Sandy Venugopal, CoreWeave’s CIO.

“For enterprise customers, they want to make sure when they’re building AI features or products or capabilities for their platforms, for their customers, inference does matter,” Venugopal said in an interview. “When somebody comes in and uses AI on your product or platform, they want quick responses. They want to see it in real time.”

The new emphasis on inference is also due to AI vendors such as Perplexity making AI-powered experiences accessible to a broader group of people and companies, leading to a surge in “real-world usage,” said Mike Leone, an analyst at Omdia, a division of Informa TechTarget. This growth provides an opportunity for vendors like CoreWeave to offer purpose-built AI computing.

Benefits and Challenges

Leone added that CoreWeave’s partnership with Perplexity is about the neocloud vendor diversifying its customer base, as its revenue is heavily concentrated in contracts with Microsoft, OpenAI, and Meta.

“Landing an AI application company like Perplexity shows the platform can attract a broader mix of customers with different workload profiles,” Leone said. He added that Perplexity, for its part, secures high-performance infrastructure that it does not have to build itself.

CoreWeave is also trying to make itself a more credible inference platform, Patience said.

“Winning a customer like Perplexity is quite a big deal because Perplexity is quite a demanding customer,” he said. He added that the APIs will run continuously in production, and if Perplexity is satisfied with CoreWeave, it is less likely to switch to another provider.

“Inference is even more important to Perplexity because that’s essentially its business,” Patience continued.

However, the challenge for CoreWeave is to prove itself in a market in which hyperscalers can compete with their own in-house custom chips.

“CoreWeave needs to keep proving that a purpose-built AI cloud delivers better performance and economics than what the hyperscalers offer natively,” Leone said. 
