Burst Buffer
The burst buffer (IME) serves as a cache for the Nurion /scratch file system. The figure below shows how data is accessed through IME.
IME is mounted on client nodes (all compute nodes and login nodes) using FUSE (Filesystem in Userspace), a user-level file system. Because IME acts as a cache, the /scratch file system must be mounted first. The IME directory path is /scratch_ime. When a user first accesses their directory there (/scratch_ime/$USER), they see the full directory and file structure of /scratch/$USER replicated.
These entries are not actual data stored on the IME device. Data is cached from /scratch to IME when a job uses the burst buffer. To use IME, specify the burst buffer project name (#PBS -P burst_buffer) in your job script. There are two main ways for an application to use IME, as outlined below:
e.g.) INPUT="/scratch_ime/$USER/input.dat", OUTPUT="/scratch_ime/$USER/output.dat"
e.g.) OUTFILE=ime:///scratch/$USER/output.dat (refer to a sample job script below)
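The two path forms in the examples above can both be derived from an ordinary /scratch path. The following is a minimal sketch; the function names to_ime_path and to_ime_uri are hypothetical helpers written for illustration and are not part of IME.

```shell
#!/bin/bash
# Hypothetical helpers (illustrative names, not provided by IME).

# Map a /scratch path to its FUSE-mounted counterpart under /scratch_ime.
to_ime_path() {
    case "$1" in
        /scratch/*) printf '/scratch_ime/%s\n' "${1#/scratch/}" ;;
        *) echo "not under /scratch: $1" >&2; return 1 ;;
    esac
}

# Build the ime:// URI form: the "ime://" prefix followed by the /scratch path.
to_ime_uri() {
    case "$1" in
        /scratch/*) printf 'ime://%s\n' "$1" ;;
        *) echo "not under /scratch: $1" >&2; return 1 ;;
    esac
}
```

For example, INPUT=$(to_ime_path "/scratch/$USER/input.dat") gives the first form, and OUTFILE=$(to_ime_uri "/scratch/$USER/output.dat") gives the second.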
Load the mvapich2/2.3.1 module as mentioned above, and write your job script as shown below.
※ Supported compilers: gcc/6.1.0, gcc/7.2.0, intel/17.0.5, intel/18.0.1, intel/18.0.3, intel/19.0.4, pgi/18.10
※ MPI-IO on IME is implemented through a custom ROMIO interface, and official ROMIO support for IME is included in MVAPICH2 2.3.1. (OpenMPI is not supported.)
※ The burst buffer (IME) can be used on all compute nodes of Nurion (SKL, KNL).
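As an illustration of how these requirements fit together, the sketch below writes a PBS job script to a file named ime_job.sh. Only the #PBS -P burst_buffer line, the mvapich2/2.3.1 module, the compiler version, and the ime:// output form come from this page. The job name, queue, resource request, walltime, the executable ./my_mpi_app, and its --output option are placeholder assumptions that you should replace with values valid for your own job.

```shell
#!/bin/bash
# Write a sketch of a PBS job script that uses IME through MPI-IO.
# Placeholders (assumptions, not from this page): job name, queue, select line,
# walltime, the executable ./my_mpi_app and its --output option.
cat > ime_job.sh <<'EOF'
#!/bin/bash
#PBS -N ime_test
#PBS -q normal
#PBS -l select=2:ncpus=64:mpiprocs=64
#PBS -l walltime=01:00:00
#PBS -P burst_buffer

cd "$PBS_O_WORKDIR"

# One of the supported compilers plus the MVAPICH2 build with IME support.
module purge
module load intel/18.0.3 mvapich2/2.3.1

# MPI-IO target written as an ime:// URI over the /scratch path.
OUTFILE="ime:///scratch/$USER/output.dat"

mpirun ./my_mpi_app --output "$OUTFILE"
EOF

# Submit from a login node with: qsub ime_job.sh
```

The quoted heredoc delimiter keeps $USER and $PBS_O_WORKDIR unexpanded, so they are resolved when the job runs rather than when the file is written.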
To manage data in IME, it is essential to understand the data lifecycle shown in the diagram below. IME data processing involves four stages (Prestage, Prefetch, Sync, and Release), each of which is managed with the IME-API command ime-ctl.
ime-ctl -i $INPUT_FILE
    Stage in job data to IME (cache data from /scratch to /scratch_ime)
ime-ctl -r $OUTPUT_FILE
    Synchronize IME data with the parallel file system (transfer data from /scratch_ime to /scratch)
ime-ctl -p $TMP_FILE
    Purge data in IME (purge the data in /scratch_ime)
ime-ctl -s $FILE
    Show status information for IME data
※ For detailed options, run ime-ctl --help.
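The commands above can be chained around an application run. The following is a minimal sketch, not an official workflow: the wrapper names ime and run_with_ime and the DRY_RUN switch are hypothetical, and passing /scratch_ime paths to ime-ctl is an assumption based on the INPUT and OUTPUT examples earlier on this page. With DRY_RUN=1 the commands are printed instead of executed, which lets you check the sequence on a machine without ime-ctl.

```shell
#!/bin/bash
# Hypothetical wrapper around the ime-ctl flags listed above.
# DRY_RUN=1 prints each command instead of running it.
ime() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "ime-ctl $*"
    else
        ime-ctl "$@"
    fi
}

# Stage in the input, run the command, sync the output, then show its status.
# usage: run_with_ime INPUT OUTPUT COMMAND [ARGS...]
run_with_ime() {
    local input=$1 output=$2
    shift 2
    ime -i "$input" || return 1      # stage in: /scratch -> /scratch_ime
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@" || return 1             # run the application itself
    fi
    ime -r "$output" || return 1     # sync: /scratch_ime -> /scratch
    ime -s "$output"                 # status of the cached data
}
```

Purging with ime-ctl -p is left out of the sequence on purpose. Whether it is safe to purge immediately after a sync request depends on behavior this page does not describe, so check ime-ctl --help before adding that step.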
The total capacity of IME is approximately 900 TB, and cached data is automatically flushed to the /scratch file system or removed from the cache depending on usage. IME automatically frees up cache space based on two threshold settings, as described below.
Running a job through IME adds overhead: data must be cached from the PFS to IME, and cached data must be flushed or synced back to the PFS. Performance gains are therefore most likely for applications with large numbers of small I/O operations, frequent checkpointing, or high I/O frequency, which the PFS (Lustre) handles relatively inefficiently.
In addition, IME (approximately 0.9 PB) is much smaller than the PFS (approximately 20 PB) that it caches. When IME capacity is fully used, data may be removed from the cache according to the threshold settings, so careful data management is required.
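To put the sizes in proportion, 0.9 PB is about 4.5% of 20 PB. The sketch below estimates what share of the IME cache a working set of a given size would occupy. The function name ime_share_percent is hypothetical, and treating the approximate 900 TB figure as an exact limit is an assumption made only for this rough estimate. The actual eviction thresholds are not modeled.

```shell
#!/bin/bash
# Rough share of the IME cache that a working set would occupy, in percent.
# 900 TB is the approximate total capacity quoted above (assumed exact here).
IME_CAPACITY_TB=900

# usage: ime_share_percent SIZE_IN_TB   (whole terabytes)
ime_share_percent() {
    local size_tb=$1
    # Integer arithmetic scaled by 10 to keep one (truncated) decimal place.
    local tenths=$(( size_tb * 1000 / IME_CAPACITY_TB ))
    printf '%d.%d\n' $(( tenths / 10 )) $(( tenths % 10 ))
}
```

For instance, ime_share_percent 45 reports that a 45 TB working set corresponds to 5.0 percent of the cache.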
※ Caution: To delete cached data in IME, you must use the provided IME-API commands. If you delete a file with the rm command under /scratch_ime, the actual data stored in /scratch is also deleted.
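One way to guard against this mistake in your own scripts is a small wrapper that refuses to remove anything under /scratch_ime. This is a minimal sketch: safe_rm is a hypothetical name, and the function only checks absolute paths given as arguments. It does not accept rm options, and it does not catch relative paths used from inside /scratch_ime.

```shell
#!/bin/bash
# Hypothetical guard: refuse rm on paths under /scratch_ime and point to
# ime-ctl -p (purge cached data) instead. Accepts file paths only, no options.
safe_rm() {
    local path
    for path in "$@"; do
        case "$path" in
            /scratch_ime|/scratch_ime/*)
                echo "refusing rm on $path: use 'ime-ctl -p $path' to purge the cache" >&2
                return 1 ;;
        esac
    done
    rm -- "$@"
}
```

Because the check runs over all arguments before rm is called, a single /scratch_ime path in the list prevents any of the listed files from being removed.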
Last updated on November 08, 2024.