At the International Supercomputing Conference 2022 on May 23, Los Alamos National Laboratory offered new details about its collaboration with Nvidia Corp. to develop high-performance computing systems to meet the Laboratory’s diverse mission needs.
The Laboratory will be the first U.S. customer to receive an Nvidia Grace central processing unit, with Hewlett Packard Enterprise (HPE) as the system provider. The system also represents the first U.S. installation of the Nvidia Grace and Grace Hopper Superchips using NVLink-C2C, an ultra-fast, chip-to-chip interconnect.
The new system will be named “Venado,” after Venado Peak, a mountain near Taos. That moniker follows the example of the Chicoma system at the Laboratory, named for the tallest peak in New Mexico’s Jemez Mountains.
“We’re finalizing the makeup of the Venado system, and eagerly anticipating its arrival,” said Irene Qualters, associate Laboratory director for Simulation and Computation. “This advanced system pushes new technical boundaries, enabling Los Alamos researchers and collaborators to make new discoveries, benefiting the nation and society as a whole.”
Venado designed to optimize developer productivity
Venado’s balanced architecture is well suited for complex, heterogeneous workloads. The system is built on HPE Shasta, and its Nvidia Grace Hopper Superchip tightly couples a reticle-sized Grace CPU with a reticle-sized Hopper architecture graphics processing unit. This combination is designed to deliver high application performance and flexibility.
Venado includes a cluster of CPU-only systems powered by the Grace CPU Superchip, which packs 144 Arm cores in a single socket to deliver an immediate performance boost to a wide range of HPC applications.
Venado represents an ongoing collaboration between Los Alamos, Nvidia and HPE to build a robust software ecosystem to optimize developer productivity. This collaboration will continue to focus on developing flexible, portable parallel programming models and driving the continued expansion of the Arm-based HPC and artificial intelligence development ecosystem.
“Venado’s heterogeneous architecture will feature a mix of Grace CPU Superchip nodes and Grace Hopper Superchip nodes and accelerate simulation, machine learning, scientific edge, and digital twin application use cases,” said Gary Grider, division leader of High Performance Computing at the Laboratory.
From materials science to firefighting
Venado is expected to enable research across the Laboratory’s portfolio.
- In materials science, the Vienna Ab initio Simulation Package supports density functional theory capabilities, enabled by AI. That AI capability may also be used in inertial confinement fusion research with transfer learning.
- In energy-related research, Venado’s digital twin application capabilities can enable power-grid modeling to simulate grid scenarios and the impacts of investment decisions.
- Venado could also allow for the modeling and simulation of unmanned vehicles with lidar systems, connected over 5G, to help stop the spread of wildfires.
Coming next year
Venado represents a milestone for the collaboration between Los Alamos, Nvidia and HPE. Los Alamos’ current premier institutional computing capability is Chicoma, an HPE Olympus system with a mixture of node types, including an Nvidia A100 partition that serves as a stepping stone toward the much-awaited Venado system.
When it is delivered in 2023, Venado will afford the Laboratory a leadership-class advanced technology supercomputer to conduct modeling, simulation and data analysis in support of Laboratory research and missions.
A multiyear collaboration focused on codesign for a broad spectrum of computing, memory and software technologies will build on the capabilities deployed with Venado. Codesign draws on the combined expertise of vendors, hardware architects, system software developers, domain scientists, computer scientists and applied mathematicians working together to make informed decisions about hardware and software components.
LA-UR-22-24939