Data Archiving
To support long-term storage (up to one year after contract expiration) of users' important and large files, Nurion operates a 10PB tape library (IBM TS4500).
The Tiered Storage Management Script Facility (TSMSF) is software installed and operated on Nurion's Datamover nodes to support user data backup and archiving. TSMSF archives user data automatically and lets users restore it manually when needed. Unlike the regular backup of the user home directory (/home01), which the administrator performs, archiving requires users to upload files themselves to a specific directory (/scratch/arcv/$USER). Currently, TSMSF is composed of a tape library and Datamover (DM) nodes, as shown in the diagram.
Nurion system TSMSF: Tape (10PB)
Nurion system Datamover Node: nurion-dm.ksc.re.kr
When users upload files to the designated directory on the Datamover (/scratch/arcv/$USER), the data is archived automatically after 7 days (subject to change). Files can be uploaded to the Datamover directly from external sources using FTP, SCP, SSH, or SFTP, or via the Nurion login node (note: FTP cannot be used on the login node). All services other than FTP require a one-time password (OTP), and their uploads are interrupted automatically once the session's resource limit (10 minutes of CPU time) is exhausted. FTP is therefore recommended for file uploads.
The storage capacity on TSMSF is 100TB per account (ID) with a maximum of 1,000,000 files, and the retention period is up to one year after the account usage ends.
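One rough way to see how close a directory is to the file-count limit is to count the regular files under it. The sketch below is not a TSMSF tool: count_files and MAX_FILES are illustrative names, "myid" is a placeholder account ID, and it assumes that each chunk file left behind by archiving counts as one file toward the limit.

```shell
# Sketch only: count_files and MAX_FILES are illustrative names, not TSMSF tools.
MAX_FILES=1000000                        # per-account file limit stated above
ARC_DIR="/scratch/arcv/${USER:-myid}"    # "myid" is a placeholder account ID

# Print the number of regular files under the given directory.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

if [ -d "$ARC_DIR" ]; then
    echo "$(count_files "$ARC_DIR") of $MAX_FILES files in $ARC_DIR"
else
    echo "$ARC_DIR not found; run this on a Nurion login or Datamover node"
fi
```

Because archived files leave only zero-size chunk files on disk, a disk-usage tool such as du run on this directory would not reflect the capacity used on tape.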
Because TSMSF primarily uses tape media, data may be difficult to recover if a tape fails. It is therefore advisable to keep a copy of important data on your local system as well.
Additionally, since tape media generally takes considerable time to load into the tape library for access, it is advisable to compress files using tools like tar and gzip, especially when there are many small source files, before backup/archiving.
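The bundling step described above can be sketched with tar and gzip as follows. The directory and file names (run01, sample_1.txt, and so on) are made up for illustration; the script creates them in a temporary directory so that it can be run anywhere.

```shell
# Sketch only: run01 and the sample_*.txt files are illustrative names.
WORK_DIR=$(mktemp -d)
mkdir -p "$WORK_DIR/run01"

# Create a few small files to stand in for many small source files.
for i in 1 2 3 4 5; do
    echo "sample $i" > "$WORK_DIR/run01/sample_$i.txt"
done

# Bundle and compress the whole directory into a single archive file.
ARCHIVE="$WORK_DIR/run01.tar.gz"
tar -czf "$ARCHIVE" -C "$WORK_DIR" run01

# List the archive contents to confirm that every file was included.
tar -tzf "$ARCHIVE"
```

The resulting single file (here run01.tar.gz) is what would then be uploaded to /scratch/arcv/$USER instead of the individual small files.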
TSMSF requires an environment setup for each account before it can be used. To apply, send an email to account@ksc.re.kr. No specific form is needed; simply provide your account information (ID).
For other general technical support inquiries, please use the email consult@ksc.re.kr.
To archive user data, please use the Nurion system's login node or Datamover node. Note that upon initial login, the directory location is set to ‘/home01/$USER’, which is the general home directory of the Nurion system, not the home directory within TSMSF. Therefore, use the cd command (cd /scratch/arcv/$USER) to change the path and move to your user directory for TSMSF.
(Method 1) Use ssh to access the Login node (nurion.ksc.re.kr) or Datamover node (nurion-dm.ksc.re.kr) and use the cp command.
(Method 2) Use scp from a remote location.
(Method 3) Use an FTP/SFTP client program to connect to the Datamover node (nurion-dm.ksc.re.kr) and upload files.
Enter the connection information for the Datamover node and click the Quick Connect button.
Host: nurion-dm.ksc.re.kr
Username: USER ID (user account)
Password: User password
Port: 21
The remote site initially shows the user's home directory (/home01/$USER), so navigate to the designated directory (/scratch/arcv/$USER).
※ You can easily navigate by entering the absolute path directly in the remote site path field.
Select the files or directories in the local site pane on the left and upload them.
※ For more detailed FileZilla instructions, refer to the website (https://filezilla-project.org/).
Click the 'Quick Connect' button to connect to the Datamover node.
Enter the connection information for the Datamover node and click the 'Connect' button.
Host: nurion-dm.ksc.re.kr
User name: USER ID (user account)
Port: 22
Authentication method: Keyboard Interactive
Enter the one-time password (OTP) and the password sequentially, and then click the “OK” button.
The right-hand pane initially shows the user's home directory (/home01/$USER), so navigate to the designated directory (/scratch/arcv/$USER).
※ You can easily navigate by entering the absolute path directly in the path field.
Select the files or directories in the local site pane on the left and upload them.
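For reference, the command-line forms of the upload methods above (Methods 1 and 2, plus a command-line SFTP session as a variant of Method 3) can be sketched as follows. The script only builds and prints the commands rather than running them; USER_ID and ARCHIVE are placeholder values, not real account or file names.

```shell
# Sketch only: USER_ID and ARCHIVE are placeholders; nothing is transferred.
USER_ID="myid"
ARCHIVE="results.tar.gz"
DM_HOST="nurion-dm.ksc.re.kr"
ARC_DIR="/scratch/arcv/${USER_ID}"

# Method 1: after logging in to a login or Datamover node with ssh,
# copy the file into the archive directory with cp.
METHOD1="cp ${ARCHIVE} ${ARC_DIR}/"

# Method 2: copy from a remote machine with scp (an OTP is requested).
METHOD2="scp ${ARCHIVE} ${USER_ID}@${DM_HOST}:${ARC_DIR}/"

# Method 3 variant: open an interactive sftp session that starts in the
# archive directory, then upload with "put results.tar.gz".
METHOD3="sftp ${USER_ID}@${DM_HOST}:${ARC_DIR}/"

printf '%s\n' "$METHOD1" "$METHOD2" "$METHOD3"
```

Since SCP and SFTP sessions are subject to the CPU time limit mentioned earlier, very large uploads may be better suited to an FTP client.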
Files within the designated directory (/scratch/arcv/$USER) that are larger than 10MB and have not been accessed for 3 days (subject to change) will be automatically archived. Archived files will only leave a temporary file (chunk file) with a size of 0 in the directory, while the actual data will be stored in the tape library. As the number of files increases, the time required for archiving to the tape library or restoring files to disk will increase, so it is recommended to compress (e.g., tar) the files to reduce the number of files.
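To get a rough idea of which files currently meet the archiving criteria above, the standard find command can approximate them. This is only a sketch, not how TSMSF itself selects files: list_archive_candidates is an illustrative name, 10MB is treated as 10 MiB, and the demo data is created in a temporary directory using GNU touch syntax.

```shell
# Sketch only: approximates the policy (larger than 10MB, not accessed for
# 3 days) with find. list_archive_candidates is an illustrative name.
list_archive_candidates() {
    # -size +10M : larger than 10 MiB
    # -atime +2  : last accessed at least 3 full days ago
    find "$1" -type f -size +10M -atime +2
}

# Demo data in a temporary directory (GNU touch -d syntax is assumed).
DEMO_DIR=$(mktemp -d)
head -c 11534336 /dev/zero > "$DEMO_DIR/big_old.dat"      # 11 MiB, old access time
head -c 11534336 /dev/zero > "$DEMO_DIR/big_recent.dat"   # 11 MiB, just created
echo "small" > "$DEMO_DIR/small_old.txt"                  # tiny, old access time
touch -a -d '5 days ago' "$DEMO_DIR/big_old.dat" "$DEMO_DIR/small_old.txt"

list_archive_candidates "$DEMO_DIR"
```

On Nurion, the function would be pointed at /scratch/arcv/$USER instead of the demo directory.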
Example
You can check detailed information about archived files with the arc_ls command.
The arc_ls command allows you to specify the relative or absolute path of a directory or file as an argument, and if needed, you can use the -r option to search subdirectories within the specified directory. In the output, DIR indicates a directory, NRM represents a regular file, and ARC denotes an archived file.
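A minimal usage sketch of arc_ls, based only on the description above, might look like this. The placement of the -r option before the path is an assumption, "myid" is a placeholder account ID, and the script checks for the command first because arc_ls exists only on the Nurion system.

```shell
# Sketch only: option placement is assumed; "myid" is a placeholder ID.
ARC_DIR="/scratch/arcv/${USER:-myid}"

if command -v arc_ls >/dev/null 2>&1; then
    arc_ls "$ARC_DIR"       # entries directly under the archive directory
    arc_ls -r "$ARC_DIR"    # also search subdirectories
else
    echo "arc_ls not found; run this on a Nurion login or Datamover node" >&2
fi
```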
Archived data must be restored before it can be used. The command provided for file restoration is arc_restore. Note that retrieving data files from the tape library can take a significant amount of time.
Example
Like arc_ls, the arc_restore command accepts the relative or absolute path of a directory or file as an argument, and the -r option can be used to search subdirectories within the specified directory. When specifying a file, you can use either the chunk file name (timestamp.[File Name].archived) or the original file name.
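A minimal usage sketch of arc_restore, based only on the description above, might look like this. The file and directory names (results.tar.gz, project1) are hypothetical, the placement of -r before the path is an assumption, and the script checks for the command first because arc_restore exists only on the Nurion system.

```shell
# Sketch only: results.tar.gz and project1 are hypothetical names, and the
# option placement is assumed.
ARC_DIR="/scratch/arcv/${USER:-myid}"
RESTORE_FILE="$ARC_DIR/results.tar.gz"   # original file name (not the chunk name)
RESTORE_DIR="$ARC_DIR/project1"

if command -v arc_restore >/dev/null 2>&1; then
    arc_restore "$RESTORE_FILE"       # restore a single archived file
    arc_restore -r "$RESTORE_DIR"     # restore a directory, including subdirectories
else
    echo "arc_restore not found; run this on a Nurion login or Datamover node" >&2
fi
```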
To delete some or all files within the designated directory (/scratch/arcv/$USER), you can use the standard Linux rm command.
Temporary files (chunk files with the .archived extension) can also be deleted with the rm command. Deleting a chunk file makes the corresponding file impossible to restore, and it may take approximately 3 days for the deletion to be synchronized and for the archived data to be permanently removed from the tape library. If a chunk file no longer exists at its original path, it is assumed that the user has deleted the file. Therefore, do not rename, move, or modify a chunk file directly; restore the file first, and then rename or move it.
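One way to guard against moving a chunk file by accident is a small wrapper around mv that refuses any source name ending in .archived. This is a sketch, not part of TSMSF: safe_mv is an illustrative name, and it relies only on the chunk-file naming pattern described above.

```shell
# Sketch only: safe_mv is an illustrative helper, not a TSMSF command.
# It refuses to rename or move chunk files (names ending in .archived).
safe_mv() {
    src=$1
    dst=$2
    case "$src" in
        *.archived)
            echo "refusing to move chunk file '$src'; restore it with arc_restore first" >&2
            return 1
            ;;
    esac
    mv -- "$src" "$dst"
}
```

For example, safe_mv notes.txt notes_old.txt behaves like mv, while a call whose source ends in .archived prints a message and leaves the file where it is.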
Last updated on November 08, 2024.