Ironwood’s Staggering Performance: 42.5 Exaflops of Inference Power

  • August 17, 2025
  • Technology

In a significant stride toward the future of artificial intelligence, Google has unveiled its latest custom-designed processor: the Ironwood TPU, the seventh generation of its Tensor Processing Unit line. The new chip is engineered to handle the growing computational demands of Google’s most advanced Gemini models, which require the complex reasoning abilities Google terms “thinking.”

Google consistently emphasizes how closely its AI models are co-designed with its infrastructure. Ironwood is a central element of this system, delivering substantial increases in inference speed and supporting broader contextual understanding for these advanced models. The company touts Ironwood as its most scalable and powerful TPU to date, positioning it for a new era in which AI systems actively interact with users by autonomously finding information and producing relevant responses. Google calls this proactive, user-centered approach “agentic AI,” and Ironwood stands as the critical force propelling this “age of inference.”

Ironwood: Powering the Next Generation of AI

Ironwood achieves significant throughput improvements over earlier TPU generations. Google’s deployment strategy centers on gigantic liquid-cooled clusters containing as many as 9,216 Ironwood chips, with an upgraded Inter-Chip Interconnect (ICI) that lets these vast computational arrays exchange data seamlessly at high bandwidth throughout the system.

Google will provide access to this processing power both to its internal R&D teams and to external developers via its cloud platform. Ironwood will be available in two distinct configurations: a 256-chip server for moderate AI workloads and a massive 9,216-chip cluster designed for the most challenging AI tasks.

The fully configured Ironwood pod demonstrates incredible processing power, achieving 42.5 Exaflops of inference compute. Each Ironwood chip delivers a peak of 4,614 TFLOPs, which Google states is a major improvement over earlier TPU versions. The memory architecture has also been significantly upgraded: each Ironwood chip carries 192GB of memory, a sixfold increase over the Trillium TPU, while memory bandwidth has risen four and a half times to a remarkable 7.2 TB/s.
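Assuming the pod figure is simply the per-chip peak multiplied by the chip count (a reasonable reading of Google’s numbers, though the company does not spell out the derivation), the headline specs are self-consistent; a quick sanity check:

```python
# Sanity check: does 9,216 chips x 4,614 TFLOPs per chip
# reproduce the quoted 42.5 Exaflops pod figure?
CHIPS_PER_POD = 9_216
PEAK_TFLOPS_PER_CHIP = 4_614  # per-chip peak, per Google's figures

pod_tflops = CHIPS_PER_POD * PEAK_TFLOPS_PER_CHIP
pod_exaflops = pod_tflops / 1e6  # 1 Exaflop = 1,000,000 TFLOPs

print(f"{pod_exaflops:.1f} Exaflops")  # → 42.5 Exaflops
```

The product comes out to 42,522,624 TFLOPs, which rounds to the quoted 42.5 Exaflops.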

Decoding the Performance Metrics

Performance evaluations of AI chips require careful analysis because benchmarking methods vary widely. Google evaluates Ironwood’s performance mainly at FP8 precision. The company claims its Ironwood “pods” deliver 24 times the performance of comparable segments of leading supercomputers, but readers should treat this statement with care, since many of those supercomputers lack native FP8 hardware support.

Google did not offer a direct peak-performance comparison against its TPU v6 (Trillium) hardware, though it does claim that Ironwood delivers double the performance per watt of that previous generation. A company spokesperson explained that Ironwood is the successor to the TPU v5p, while Trillium was the upgrade to the less powerful TPU v5e. Trillium reportedly reached peak processing levels of around 918 TFLOPS at FP8 precision.
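Taking the quoted peak numbers at face value, and keeping in mind the precision-format caveat above (the two figures may not be measured under identical conditions), the per-chip jump from Trillium to Ironwood works out to roughly fivefold; a rough sketch:

```python
# Rough per-chip generational comparison, assuming the two quoted
# peak figures are directly comparable (a simplification).
IRONWOOD_PEAK_TFLOPS = 4_614  # per Google's Ironwood figures
TRILLIUM_PEAK_TFLOPS = 918    # figure cited for Trillium

ratio = IRONWOOD_PEAK_TFLOPS / TRILLIUM_PEAK_TFLOPS
print(f"~{ratio:.1f}x per-chip peak throughput")  # → ~5.0x
```

Note that this raw-throughput ratio is distinct from Google’s separate claim of 2x performance per watt, which accounts for power draw rather than peak FLOPs alone.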

The Implications for the Future of AI

Despite the inherent complexities in benchmarking AI hardware, the underlying message is clear: Google’s AI infrastructure capabilities have significantly advanced with the development of Ironwood. The robust foundation of previous TPU generations has enabled rapid development in advanced models like Gemini 2.5, which Ironwood’s enhanced speed and efficiency now build upon.

Google expects Ironwood’s advanced inference capabilities and improved efficiency to create groundbreaking developments in artificial intelligence throughout the upcoming year. Ironwood delivers essential computational resources to build advanced models and true agentic capabilities, making it a crucial force in Google’s vision of the “age of inference,” where AI takes on a proactive and intelligent role in our digital world.