System Specifications and Configuration
Nurion, KISTI's 5th supercomputer, is a Linux-based massively parallel cluster system with a theoretical peak performance (Rpeak) of 25.7 PFlops, approximately 70 times that of the 4th supercomputer, Tachyon2.
Nurion is an ultra-high-performance computing system designed in a cluster architecture, consisting of 8,305 many-core CPUs, each delivering about 3 TFlops per socket (roughly 1/100 of the total performance of the Tachyon2 system). With 68 cores per processor, Nurion can utilize hundreds of thousands of cores simultaneously. It is equipped with a high-performance interconnect, large-scale storage, and a burst buffer, which is designed to efficiently absorb the large-scale I/O requests that applications generate in short bursts.
Nurion aims to enhance the competitiveness of national research and development (R&D) through its large-scale computational capabilities. It is intended to tackle complex problems such as climate prediction and new material development, and to assist in fields such as drug discovery and aerospace testing, where experimental approaches are risky and costly. From the planning stage, Nurion has focused on identifying and addressing such challenges across a wide range of scientific and technological fields.
Additionally, Nurion is equipped with and continuously upgrades a wide range of software to support these research and development efforts, striving to maintain optimal productivity through seamless integration with high-performance hardware. Furthermore, through initiatives such as optimizing and parallelizing code, providing extensive user support, and identifying large-scale projects aimed at solving significant challenges, Nurion seeks to serve as a key infrastructure for national scientific and technological advancement and the realization of an intelligent information society.
The Nurion system consists of a total of 8,437 compute nodes, including 8,305 nodes equipped with many-core Intel Xeon Phi 7250 processors and 132 nodes equipped with Intel Xeon Gold 6148 processors.
The Intel Xeon Phi 7250 processor (code-named Knights Landing) installed in the KNL nodes operates at a base frequency of 1.4 GHz with 68 cores (hyperthreading off). The L2 cache memory is 34 MB, and the on-package MCDRAM (Multi-Channel DRAM) memory is 16 GB with a bandwidth of 490 GB/s. Each node has 96 GB of memory, configured with 16 GB DDR4-2400 memory across six channels. A 2U enclosure houses four compute nodes, with each standard 42U rack containing 72 compute nodes.
The SKL (CPU-only) nodes are equipped with two Intel Xeon Gold 6148 processors (code-named Skylake). The base frequency is 2.4 GHz, and each processor consists of 20 CPU cores (hyperthreading off). The L3 cache memory is 27.5 MB, and each CPU has 96 GB of memory (192 GB per node), configured with 16 GB DDR4-2666 memory across six channels. A 2U enclosure houses four compute nodes.
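As a sanity check, the 25.7 PFlops Rpeak can be reproduced from the per-node specifications above, assuming 32 double-precision flops per cycle per core (two AVX-512 FMA units per core; this flops-per-cycle figure is an assumption, not stated in this document):

```python
# Reconstruct the theoretical peak (Rpeak) from the node counts above.
# Assumption: 32 DP flops/cycle/core (2 AVX-512 FMA units x 8 lanes x 2 ops).
FLOPS_PER_CYCLE = 32

knl_node_tflops = 68 * 1.4 * FLOPS_PER_CYCLE / 1000      # 68 cores @ 1.4 GHz
skl_node_tflops = 2 * 20 * 2.4 * FLOPS_PER_CYCLE / 1000  # 2 sockets x 20 cores @ 2.4 GHz

rpeak_pflops = (8305 * knl_node_tflops + 132 * skl_node_tflops) / 1000
print(f"{rpeak_pflops:.1f} PFlops")  # -> 25.7 PFlops
```

The KNL term alone comes to about 3.05 TFlops per node, matching the "3 TFlops per socket" figure quoted earlier.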
The interconnect network for inter-node computation and file I/O communication utilizes Intel OPA (Omni-Path Architecture). The system is configured with a 100 Gbps OPA network in a Fat-Tree topology, featuring 50% blocking between compute nodes and a non-blocking network between storage nodes. Compute nodes and storage are connected to 277 48-port OPA Edge switches, with all Edge switches connected to eight 768-port OPA Director switches.
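The "50% blocking" figure corresponds to a 2:1 oversubscription at the edge switches. A minimal sketch, assuming each 48-port edge switch splits its ports 32 down (toward nodes) and 16 up (toward the director switches) — the exact port split is an assumption, not stated here:

```python
# Fat-tree oversubscription sketch for a 48-port edge switch.
# Assumed split: 32 downlinks to nodes, 16 uplinks to director switches.
downlinks, uplinks = 32, 16
oversubscription = downlinks / uplinks
print(oversubscription)  # 2.0: only half the injected traffic can leave
# the switch simultaneously, i.e. "50% blocking"
```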
Nurion provides a parallel file system and a Burst Buffer for I/O processing and data storage. To support the parallel file system, three Lustre file systems are provided: scratch (/scratch, 21PB), home (/home01, 760TB), and application (/app, 507TB).
Each file system is composed of a metadata server (MDS) based on SFA7700X storage and an object storage server (OSS) based on ES14KX storage. The Burst Buffer is an NVMe-based I/O cache that operates between the compute nodes and the parallel file system, offering approximately 844TB of capacity. The Lustre file system is mounted on compute nodes, login nodes, and archival servers (Data Movers) to provide I/O services.
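Summing the three Lustre file systems (using decimal units, 1 PB = 1000 TB, which is an assumption about how the capacities are quoted) gives the total parallel-storage capacity:

```python
# Total Lustre capacity across the three file systems, in TB.
scratch_tb = 21 * 1000   # /scratch, 21 PB
home_tb = 760            # /home01, 760 TB
app_tb = 507             # /app, 507 TB
total_tb = scratch_tb + home_tb + app_tb
print(total_tb, "TB")    # 22267 TB, i.e. roughly 22.3 PB in total
```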
The Burst Buffer (BB) is a newly introduced cache layer in the 5th system, designed to accelerate I/O between compute nodes and storage (/scratch). It enhances the performance of small or random I/O operations on the parallel file system and maximizes parallel I/O performance. It aims to improve the performance of user applications that are highly dependent on I/O and is supported across all queues that use KNL nodes.
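As a hedged illustration of how a user might exercise the burst buffer, the sketch below shows a batch job on a KNL queue that directs its I/O-heavy output to /scratch, the file system fronted by the BB. The queue name, resource selectors, and application path are assumptions for illustration, not Nurion-specific documentation:

```shell
#!/bin/bash
#PBS -N bb_io_job            # job name (arbitrary)
#PBS -q normal               # assumed name of a KNL queue
#PBS -l select=1:ncpus=68    # one KNL node, all 68 cores
#PBS -l walltime=01:00:00

# /scratch is served by the Lustre file system fronted by the IME burst
# buffer, so I/O-intensive output is directed there rather than to /home01.
cd $PBS_O_WORKDIR
./my_app --output /scratch/$USER/results   # my_app is a placeholder
```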
DDN's IME240 solution was applied for the BB configuration; the [Burst buffer server configuration] figure above shows the detailed configuration of the BB.
- System: DDN IME240, Infinite Memory Engine Appliance
- Software version: CentOS 7.4, IME 1.3
- Max. I/O performance: 20 GB/s
- Network interface: 2 x OPA
- SSD type: 1.2TB NVMe drive
- SSD quantity: 16 ea (1.2TB NVMe) + 1 ea (450GB SSD)
- Capacity (RAW): 19.2TB

[IME single node configuration]
- Total no. of systems: IME240 x 48
- Total capacity (RAW): 979.2 TB
- Total capacity (with parity applied): 816 TB (EC*, 10+2)
- Max. I/O performance: 800 GB/s

[Overall configuration of burst buffer]

* EC: Erasure Coding (settings may change)
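The usable capacity follows from the erasure-coding ratio: with EC 10+2, 10 of every 12 stripe units hold data and 2 hold parity. A quick check of the figures above:

```python
# Usable burst-buffer capacity under EC 10+2 (10 data + 2 parity units).
raw_tb = 979.2
data_units, parity_units = 10, 2
usable_tb = round(raw_tb * data_units / (data_units + parity_units), 1)
print(usable_tb)  # 816.0, matching the 816 TB quoted in the table
```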
Last updated on November 06, 2024.