In a significant move for the AI sector, Anthropic is currently in discussions with Microsoft to incorporate the tech giant's custom-designed AI chips into its inference workloads. This shift reveals a strategic direction among leading AI labs regarding the hardware that supports their models, with implications extending well beyond Silicon Valley.
The negotiations, first reported by The Information, follow a sweeping partnership involving Anthropic, Microsoft, and NVIDIA, which includes a $30 billion commitment from Anthropic for Azure computing resources. This partnership highlights the increasing reliance on cloud services for AI workloads while also emphasizing the complexities of cost management within AI operations.
A key factor in operating large language models like Claude, Anthropic’s flagship AI, is understanding the cost dynamics. While training these models incurs substantial expenses, the inference phase—the process of responding to user requests—generates ongoing costs. Each time a user prompts Claude to draft an email or summarize a paper, those costs increase significantly, especially when scaled across millions of users.
Anthropic's interest in utilizing Microsoft’s proprietary silicon is aimed at reducing these inference costs. The goal is clear: by using Microsoft’s chips, Anthropic seeks to lower the per-query expenses associated with operating Claude, thereby improving its scalability and efficiency.
However, this strategic shift does not indicate a departure from NVIDIA. Anthropic plans to maintain a strong relationship with NVIDIA, continuing to use its Grace Blackwell and Vera Rubin systems. Both companies are working together to optimize their models for efficient performance on NVIDIA’s hardware. Yet, reliance on a single chip supplier presents potential risks, prompting Anthropic to diversify its hardware options as a precaution.
Microsoft’s CEO, Satya Nadella, has characterized the evolving relationship as increasingly mutually beneficial. He stated that the two firms would "increasingly serve each other as customers," with Claude likely being integrated into Microsoft’s extensive product ecosystem. This arrangement enables Microsoft to use one of the most advanced AI models available while providing Anthropic with preferential access to Azure’s vast cloud resources and potentially its custom silicon.
In its pursuit of hardware diversification, Anthropic is also exploring AI inference chips from Fractile, a UK-based startup focused on creating specialized hardware for similar workloads. This trend reflects Anthropic’s desire for options and flexibility in its AI infrastructure.
This strategy aligns with approaches taken by other major tech players. Google has its Tensor Processing Units (TPUs), while Amazon provides Trainium and Inferentia chips. Microsoft has also made advancements in developing its AI accelerators, including the Maia 100 series, aimed at reducing dependency on NVIDIA’s GPUs and offering cost-effective alternatives.
For Anthropic, the rationale is straightforward. Although NVIDIA's GPUs are powerful, they come with a high price tag and have been in short supply for several years. As major AI labs, cloud providers, and national AI initiatives compete for these chips, Anthropic’s inclusion of Microsoft’s custom silicon and the potential for Fractile’s hardware enhances its negotiating position while increasing redundancy in its infrastructure. This multi-faceted approach could prove essential as the demand for AI capabilities continues to rise across industries.



