The 5 Most Powerful Supercomputer Chips Ever Created

An Exclusive Look at 5 Game-Changing Chips Powering the World’s Fastest Supercomputers

Imagine a machine that could simulate Earth’s climate or model the workings of the human brain down to the cellular level. For the specialized research teams pushing scientific frontiers, supercomputers bring the once inconceivable within reach. But calculations on such a grand scale require chips far beyond those powering consumer gadgets.

Benchmarks like LINPACK measure the maximum computational power these machines can sustain. The Top500 list, published twice a year, ranks supercomputers in descending order of demonstrated performance, measured in floating point operations per second (flops). Japan’s Fugaku system recently held the top spot with 442 petaflops, and the exascale barrier of 1,000 petaflops has since been crossed by the US’s Frontier system. These milestones drive progress across research ranging from personalized medicine to renewable energy. Powering such giants are highly specialized processor designs tuned for intensive parallel number crunching.
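
To put those figures on one scale, the short Python sketch below converts petaflops to raw flops and compares the two numbers quoted above. The constant and function names are illustrative choices, not part of any benchmark tooling.

```python
# Decimal unit prefixes for floating point operations per second (flops).
PETA = 10**15
EXA = 10**18


def petaflops_to_flops(petaflops: int) -> int:
    """Convert a petaflops figure to raw flops."""
    return petaflops * PETA


# Figures quoted in the text above.
FUGAKU_PETAFLOPS = 442
EXASCALE_PETAFLOPS = 1_000

# 1,000 petaflops is exactly one exaflop.
exascale_flops = petaflops_to_flops(EXASCALE_PETAFLOPS)

# How far the exascale threshold sits above Fugaku's LINPACK result.
ratio = EXASCALE_PETAFLOPS / FUGAKU_PETAFLOPS

print(f"Exascale threshold: {exascale_flops:.0e} flops")
print(f"Exascale / Fugaku: {ratio:.2f}x")
```

The exascale threshold works out to a little more than twice Fugaku's quoted result.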

In this piece we dive inside five record-breaking chips propelling the world’s mightiest supercomputers. Each architectural marvel reflects an obsessive commitment to performance honed through generations of development by leading chipmakers. Beyond showcasing mankind’s computing capabilities, documenting their details provides valuable perspective as we enter the exascale era.

Breaking Records: AMD’s Epyc 7A53

Having secured its spot as the world’s fastest with 1.1 exaflops on LINPACK, Frontier relies on AMD’s 3rd generation Epyc CPU as its central processing engine. The 7A53 is a custom part code-named Trento, derived from the Milan generation rather than the earlier Rome. Each of Frontier’s 9,408 nodes houses one 64-core 7A53 clocked at 2.0 GHz, working alongside four AMD Instinct MI250X GPUs that supply most of the floating point throughput.

But how did AMD’s Epyc line reach such computational heights? Architecturally, the 7nm design builds on improvements introduced across successive generations: higher instructions per clock, more memory bandwidth, and faster interconnect fabrics. Core counts doubled from 32 to 64 along the way, balanced against thermal limits and system cost. Together these gains pushed Epyc and Frontier past rivals. The table below summarizes generational milestones:

| Epyc Gen | Launch Year | Process Tech | Max Cores | Memory Support | TDP Envelope |
|---|---|---|---|---|---|
| Naples | 2017 | 14nm | 32 | 2 TB | 180W |
| Rome | 2019 | 7nm | 64 | 4 TB | 280W |
| Genoa | ~2022 | 5nm | 96 | 6 TB | 400W |

With experience optimizing processors for PCs and servers, both low-power and high-performance, AMD aggressively pursued the emerging exascale contracts. Focused execution and a series of sound architectural decisions have positioned Epyc CPUs as go-to processing engines for ambitious Top500 entrants. Having disrupted Intel’s dominance, AMD’s chips are now among those moving supercomputing forward.

Legend Continues: Fujitsu’s Custom A64FX

As the celebrated brains behind Japan’s formerly top-ranked Fugaku, Fujitsu’s A64FX represents another high point for custom silicon. Built for number crunching, the processor packs 48 high-performance compute cores plus 4 assistant cores that offload operating system and I/O work. Integrated HBM2 memory feeds the cores with ample bandwidth to avoid bottlenecks. For internode traffic, the Tofu Interconnect D links nodes in a 6D mesh/torus topology.

Architecturally, the A64FX refines practices proven during Fujitsu’s years developing the K Computer, itself a former Top500 leader. Both pair general-purpose cores with wide vector units to speed the floating point calculations at the heart of simulations. The A64FX updates that recipe for modern workloads with single instruction, multiple data (SIMD) engines built on Arm’s Scalable Vector Extension (SVE), which takes over the role that Fujitsu’s HPC-ACE extensions played on the K Computer’s SPARC-based chips. With a theoretical peak of roughly 3 teraflops of double precision throughput per chip, the many thousands of A64FX processors in Fugaku combine to reach computational scales not previously attained.
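
Theoretical peak throughput for a SIMD processor follows from core count, vector width, and clock speed. The sketch below is a rough estimate under assumptions not given in this article: 512-bit vectors, two fused multiply-add (FMA) pipelines per core, and a 2.0 GHz clock. The function and parameter names are illustrative.

```python
def peak_flops(cores: int, simd_bits: int, fma_pipes: int,
               clock_hz: int, element_bits: int = 64) -> int:
    """Estimate theoretical peak flops for a SIMD CPU.

    Each FMA pipeline performs one multiply and one add (2 flops)
    per vector lane per cycle.
    """
    lanes = simd_bits // element_bits          # double precision lanes per vector
    flops_per_cycle = lanes * fma_pipes * 2    # per core
    return cores * flops_per_cycle * clock_hz


# Assumed A64FX-like parameters (only the 48 compute cores come from the text).
a64fx_peak = peak_flops(cores=48, simd_bits=512, fma_pipes=2,
                        clock_hz=2_000_000_000)

print(f"Peak per chip: {a64fx_peak / 1e12:.3f} teraflops")
```

Under these assumptions a single chip lands at about 3 teraflops of double precision, so a system-level figure in the hundreds of petaflops implies on the order of a hundred thousand chips or more.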

Fast and efficient alike, Fugaku has also appeared among the Green500’s top 10 most power-efficient supercomputers. This highlights how the A64FX balances performance with economical operation, a factor likely to motivate Fujitsu to release successors. If these results are indicative, systems built on evolutions of the A64FX could continue to break records in the years ahead.

Legendary History: IBM’s Power9 Processor

As one of the pioneering institutions of computer technology itself, IBM has a richer history than virtually any competitor in crafting chips that shaped the field’s evolution. While no longer the broad semiconductor leader it once was, IBM still meets the technical demands of modern computational workloads through designs built on its signature POWER reduced instruction set computer (RISC) architecture, with CPUs that pair closely with GPUs and other accelerators.

The standard bearer here is IBM’s Power9 processor, released in 2017. With up to 24 cores per socket and as many as 96 hardware threads through simultaneous multithreading, Power9 emphasizes throughput: memory bandwidth tops out around 230 GBps, and NVIDIA’s high-speed NVLink interconnect provides a direct path to attached GPUs. PCIe Gen4 additionally doubles per-lane bandwidth to peripherals relative to Gen3. Together these interconnect advances keep the cores fed so that even the most data-hungry applications run well. The differences between the Power8 and Power9 generations are summarized below:

| | Power8 | Power9 |
|---|---|---|
| Process | 22 nm | 14 nm FinFET |
| Max Cores/Socket | 12 | 24 |
| Max Memory Bandwidth | 230 GBps | 230 GBps |
| NVLink | N/A | Yes |
| PCIe Gen | Gen3 | Gen4 |
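
The PCIe row in the table can be made concrete with a rough bandwidth estimate. The sketch below assumes values from the published PCIe specifications that are not given in this article: 8 GT/s per lane for Gen3, 16 GT/s per lane for Gen4, 128b/130b line encoding for both, and a x16 link. The function name is hypothetical.

```python
def pcie_bandwidth_gbytes(gt_per_s: float, lanes: int = 16,
                          encoding: float = 128 / 130) -> float:
    """Approximate one-direction PCIe bandwidth in GB/s.

    gt_per_s is the per-lane transfer rate in gigatransfers per second;
    each transfer carries one bit, and 128b/130b encoding spends
    2 of every 130 bits on overhead.
    """
    usable_gbits = gt_per_s * encoding * lanes
    return usable_gbits / 8  # bits to bytes


gen3 = pcie_bandwidth_gbytes(8.0)    # assumed Gen3 per-lane rate
gen4 = pcie_bandwidth_gbytes(16.0)   # assumed Gen4 per-lane rate

print(f"Gen3 x16: {gen3:.2f} GB/s")
print(f"Gen4 x16: {gen4:.2f} GB/s")
print(f"Gen4 / Gen3: {gen4 / gen3:.1f}x")
```

Because both generations share the same encoding, the step from Gen3 to Gen4 is a straight doubling of bandwidth per lane.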

Multiple Top500 systems use Power9, including the 4th-ranked Summit supercomputer at Oak Ridge National Laboratory, rated at 148.6 petaflops. Built in cooperation with NVIDIA and Mellanox, which supplied the Tesla GPUs and EDR InfiniBand that augment the CPUs, Summit owes much of its performance to careful optimization of node configuration and overall system packaging. Its sibling Sierra, in 5th place, follows a comparable philosophy. For both, Power9 gave IBM a continued leadership stake and helped the US reclaim the Top500 lead in 2018, after Chinese systems including Sunway TaihuLight had held the top spot for several years.

Extreme Parallelism: Sunway’s SW26010

Determined to grow domestic technology capabilities and avoid future reliance on foreign suppliers, China invested heavily in developing supercomputing technology locally through efforts such as the Sunway program. That work culminated in the Sunway TaihuLight, which claimed the TOP500 crown in 2016 with a LINPACK score of 93 petaflops, against a theoretical peak above 125 petaflops, using Sunway’s custom SW26010 processors.

While most server-grade chips like AMD’s Epyc integrate multiple processor cores (typically under 100), the SW26010 takes a manycore approach that packs 260 cores into a single socket. Though each is simpler than a conventional server core, the 260 cores together work through parallelized workloads in tandem. Trading sophisticated per-core logic for sheer core count proves more efficient for the extremely parallel tasks that supercomputers are built to accelerate.

Put simply, traditional multicore processors emphasize complexity within individual cores to support general-purpose computation. Manycore designs instead use a large number of simplified cores that cooperate on the specialized math operations simulations require everywhere. This horizontal scaling suits the matrix multiplication intrinsic to HPC workloads. Extending the idea, linking many SW26010 chips together gave Sunway TaihuLight more than 10 million cores.
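
The 10 million figure follows from simple multiplication. The sketch below assumes two details not stated above, taken from the publicly reported TaihuLight configuration: each SW26010 is organized as 4 core groups of 1 management core plus 64 compute cores, and the system contains 40,960 such chips. The constant names are illustrative.

```python
# Assumed SW26010 layout: 4 core groups, each with 1 management core
# and 64 simpler compute cores.
CORE_GROUPS_PER_CHIP = 4
MANAGEMENT_CORES_PER_GROUP = 1
COMPUTE_CORES_PER_GROUP = 64

# Assumed number of SW26010 chips in Sunway TaihuLight.
CHIPS_IN_SYSTEM = 40_960

cores_per_chip = CORE_GROUPS_PER_CHIP * (
    MANAGEMENT_CORES_PER_GROUP + COMPUTE_CORES_PER_GROUP
)
total_cores = cores_per_chip * CHIPS_IN_SYSTEM

print(f"Cores per chip: {cores_per_chip}")
print(f"Cores in system: {total_cores:,}")
```

Under these assumptions the per-chip count matches the 260 cores cited above, and the system total comes to roughly 10.6 million.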

Dominance Disrupted: Intel’s Xeon E5-2692 v2

Until recently, Intel’s Xeon family was the undisputed first choice for ambitious HPC architects. The E5-2692 v2, which powered the formerly #1-ranked Tianhe-2 and remains in its upgraded successor Tianhe-2A, now in 9th position, uses Intel’s workhorse Ivy Bridge-EP microarchitecture. As originally built, Tianhe-2 combined 16,000 nodes, each equipped with 2 Ivy Bridge CPUs and 3 Xeon Phi coprocessors, for over 3 million cores in total. Once an exemplar, it now highlights the technical stagnation that disrupted Intel’s silicon dominance.

Gradual improvements across subsequent Xeon generations, including the newer Skylake parts, failed to keep pace with the revised strategies of firms like AMD. Architecturally, Skylake builds on its forebears with a mesh interconnect and enhanced data transfer capabilities. Yet incremental gains in core counts, clock speeds, and memory capacity did not prevent a rapid slide down the rankings as competing installations adopted superior alternatives. The table below compares Tianhe’s chip with a later Skylake-generation Xeon to illustrate the generational changes:

| | E5-2692 v2 (Ivy Bridge) | Xeon Platinum 8160 (Skylake) |
|---|---|---|
| Process | 22 nm | 14 nm |
| Cores/Threads | 12/24 | 24/48 |
| Clock Speed | 2.2 GHz | 2.1 GHz |
| Memory | DDR3 1866 MHz | DDR4 2666 MHz |

Rather than revolutionize, the Platinum refinements aim for evolutionary gains, primarily through added scalability features via Ultra Path Interconnect (UPI). This leaves Intel an opportunity to rethink its architectural roadmap and prioritize its existing supercomputing clientele as competitors threaten its stronghold. Future Xeon variants must extract substantially greater efficiency from each cycle to meet the demands of exascale pioneers, or risk further relegation.

The Exascale Future Beckons

Pushing computing’s farthest limits and expanding humanity’s scientific capabilities, the 5 achievements profiled here offer perspective on the semiconductor feats behind the world’s fastest supercomputers. Each architectural milestone reflects precise engineering from an industry competing aggressively at the intersection of disruptive digital technologies and simulation platforms, which promise breakthroughs on pressing global challenges around climate, health, and energy security.

Moreover, as large-scale computing becomes further democratized through cloud-based offerings, processor efficiency grows paramount in limiting environmental footprints as specialized chips enter everyday toolchains and services. Much remains to be done to make computing sustainable, from chip manufacturing processes through holistic datacenter operations and recycling programs. But behind the lofty climate commitments made by leading computing providers stand the silicon advances that keep pushing technology forward responsibly.