User Programming Environment
A. Programming Tool Installation Status
Compiler and library module
※ Abaqus resumed service on August 9, 2023.
Category | Item (Name/Version)
Architecture distinction module
∙ craype-mic-knl
∙ craype-x86-skylake
Compiler
∙ intel/17.0.5
∙ intel/18.0.1
∙ intel/18.0.3
∙ intel/19.0.1
∙ intel/19.0.4
∙ intel/19.0.5 ∙ intel/19.1.2 ∙ intel/oneapi_21.2
∙ cce/8.6.3
∙ gcc/6.1.0
∙ gcc/7.2.0
∙ gcc/8.3.0
∙ pgi/18.10
∙ pgi/19.1
Compiler-dependent library
∙ hdf4/4.2.13
∙ hdf5/1.10.2
∙ lapack/3.7.0
∙ ncl/6.7.0
∙ ncview/2.1.7
∙ NCO/4.7.4
∙ netcdf/4.6.1
MPI
∙ impi/17.0.5
∙ impi/18.0.1
∙ impi/18.0.3
∙ impi/19.0.1
∙ impi/19.0.4
∙ impi/19.0.5 ∙ impi/19.1.2 ∙ impi/oneapi_21.2
∙ mvapich2/2.3
∙ mvapich2/2.3.1
∙ openmpi/3.1.0
∙ mvapich-verbs/2.2.ddn1.4
∙ ime/mvapich-verbs/2.2.ddn1.4
MPI-dependent library
∙ fftw_mpi/2.1.5
∙ fftw_mpi/3.3.7
∙ hdf5-parallel/1.10.2
∙ netcdf-hdf5-parallel/4.6.1
∙ parallel-netcdf/1.10.0
∙ pio/2.3.1
Commercial software
∙ cfx/v145
∙ cfx/v170
∙ cfx/v181
∙ cfx/v191
∙ cfx/v192
∙ cfx/v195
∙ cfx/v201
∙ cfx/v202
∙ fluent/v145
∙ fluent/v170
∙ fluent/v181
∙ fluent/v191
∙ fluent/v192
∙ fluent/v195
∙ fluent/v201
∙ fluent/v202
∙ fluent/v221
∙ fluent/v222
∙ gaussian/g16.a03
∙ gaussian/g16.a03.linda
∙ gaussian/g16.b01.linda
∙ gaussian/g16.c01.linda
Application software
∙ cp2k/5.1.0
∙ cp2k/6.1.0
∙ ferret/7.4.3
∙ forge/18.1.2
∙ grads/2.2.0 ∙ gromacs/5.0.6
∙ gromacs/2016.4
∙ gromacs/2018.6
∙ gromacs/2020.2 ∙ gromacs/2021.7 ∙ gromacs/2022.6
∙ lammps/12Dec18
∙ lammps/8Mar18 ∙ lammps/3Mar20 ∙ lammps/23Jun22 ∙ lammps/2Aug23
∙ namd/2.12
∙ namd/2.13
∙ PETSc/3.8.4
∙ python/2.7.15
∙ python/3.7 ∙ python/3.9.5 ∙ python/3.12.5
∙ qe/6.1
∙ qe/6.4.1 ∙ qe/6.6 ∙ qe/7.2
∙ R/3.5.0 ∙ R/3.6.2 ∙ R/4.2.1
∙ siesta/4.0.2
∙ siesta/4.1-b3
Virtualization module
∙ singularity/3.6.4 ∙ singularity/3.9.7 ∙ singularity/3.11.0 ∙ singularity/4.1.0
∙ conda/pytorch_1.0
∙ conda/tensorflow_1.13
∙ conda/intel_caffe_1.1.5
Intel debugging module
∙ advisor/17.0.5
∙ advisor/18.0.1
∙ advisor/18.0.3
∙ advisor/18.0.3a
∙ vtune/17.0.5
∙ vtune/18.0.1
∙ vtune/18.0.2
∙ vtune/18.0.3
Cray module
∙ cdt/17.10
∙ cray-ccdb/3.0.3
∙ cray-cti/1.0.6
∙ cray-fftw/3.3.6.2
∙ cray-fftw_impi/3.3.6.2
∙ cray-impi/1.1.4
∙ cray-lgdb/3.0.7
∙ cray-libsci/17.09.1
∙ craype/2.5.13
∙ craypkg-gen/1.3.5
∙ chklimit/1.0
∙ mvapich2_cce/2.2rc1.0.3_noslurm
∙ mvapich2_gnu/2.2rc1.0.3_noslurm
∙ papi/5.5.1.3
∙ perftools/6.5.2
∙ perftools-base/6.5.2
∙ perftools-lite/6.5.2
∙ PrgEnv-cray/1.0.2
∙ libfabric/1.7.0
∙ pbs/trace
∙ pbs/tools
Others
∙ cmake/3.12.3
∙ cmake/3.17.4 ∙ cmake/3.26.2
∙ common/memkind-1.9.0
∙ git/1.8.3.4 ∙ git/2.35.1
∙ IGPROF/5.9.16
∙ ImageMagick/7.0.8-20
∙ perl/5.28.1
∙ qt/4.8.7
∙ qt/5.9.6
∙ subversion/1.7.19
∙ subversion/1.9.3
※ Modules under /apps/Modules/modulefiles/test are test modules.
User groups that can use ANSYS are limited to universities, industries (small and medium-sized enterprises), and research institutes.
Please note that the use of ANSYS by users who are not in the available user group or who have not applied for use may be subject to legal sanctions by ANSYS.
User groups that can use Abaqus are limited to universities, industries (small and medium-sized enterprises), and research institutes.
Please note that the use of Abaqus by users who are not in the available user group or who have not applied for use may be subject to legal sanctions by Dassault Systèmes.
For Gaussian, you must first obtain permission for use from the helpdesk account manager (account@ksc.re.kr).
Refer to [APPENDIX 8] for the installation status of shared libraries (e.g.: cairo, expat, jasper, libpng, udunits, etc.)
Commercial software information
B. How to Use Compiler
1. Compiler and MPI configuration settings (modules)
1) Default required module
The module corresponding to the compute node to be used must be added.
Computing node to be used | Module name
KNL | craype-mic-knl
SKL | craype-x86-skylake
2) Basic commands related to the module
Print a list of available modules: the list of available modules, such as compilers and libraries, can be checked.
Add modules to be used: modules to be used, such as compilers and libraries, can be added. All required modules can be added at once.
Delete modules in use: modules that are no longer needed can be removed. Multiple modules can be deleted at once.
Print a list of modules in use: the list of currently loaded modules can be checked.
Purge all modules in use: in this case, the default required modules are also removed at once and must be added again before reuse.
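A minimal sketch of these commands, assuming the standard Environment Modules syntax and illustrative module names (check module avail for the versions actually installed):
$ module avail                              # print the list of available modules
$ module add craype-mic-knl intel/18.0.3    # add (load) one or more modules at once
$ module del intel/18.0.3                   # delete (unload) a module that is no longer needed
$ module list                               # print the list of modules currently in use
$ module purge                              # purge all loaded modules (including the default required modules)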
Examples of using other modules
The module help command (view help)
The module show command (view configured environment variables)
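For example, assuming the intel/18.0.3 module is installed:
$ module help intel/18.0.3    # view the help for a module
$ module show intel/18.0.3    # view the environment variables the module sets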
2. Compiling sequential programs
A sequential program is a program that does not consider a parallel programming environment, that is, a program that does not use a parallel programming interface such as OpenMP or MPI and runs on a single processor of a single node. The compiler options used for compiling sequential programs are the same as those used for parallel programs, so this section is worth reviewing even if you are only interested in parallel programs.
1) Intel compiler
To use the Intel compiler, add the required version of the Intel compiler module. Available modules can be checked with the module avail command.
※ Check the available version by referring to the programming tool installation status table.
Compiler type
Compiler | Program | Source extensions
icc / icpc | C / C++ | .C, .cc, .cpp, .cxx, .c++
ifort | F77/F90 | .f, .for, .ftn, .f90, .fpp, .F, .FOR, .FTN, .FPP, .F90
Compiler option
Compiler option | Description
-O[1|2|3] | Object optimization. The number indicates the optimization level.
-ip, -ipo | Interprocedural optimization
-qopt-report=[0|1|2|3|4] | Adjusts the amount of vectorization diagnostic information
-xCORE-AVX512 | Generates code for CPUs with 512-bit registers (when computing on SKL nodes)
-xMIC-AVX512 | Generates code for MICs with 512-bit registers (when computing on KNL nodes)
-fast | Macro for -O3 -ipo -no-prec-div -static -fp-model fast=2
-static / -static-intel / -i-static | Does not link shared libraries
-shared / -shared-intel / -i-dynamic | Links shared libraries
-g -fp | Generates debugging information
-qopenmp | Uses OpenMP-based multi-threaded code
-openmp-report=[0|1|2] | Adjusts the OpenMP parallelization diagnostic level
-ax, -axS | Generates code optimized for a specific processor; -axS generates specialized code using the Streaming SIMD Extensions 4 (SSE4) vectorizing compiler and media acceleration instructions
-tcheck | Enables analysis of thread-based programs
-pthread | Adds the pthread library for multi-threading support
-msse3, -msse4.1 | Supports Streaming SIMD Extensions 3 / 4.1
-fPIC, -fpic | Generates position-independent code (PIC)
-p | Generates profiling information (gmon.out)
-unroll[=n] | Enables loop unrolling; n is the maximum unroll count (64-bit only)
-mcmodel=medium | Used when a memory allocation of 2 GB or more is required
-help | Prints the list of options
Example of using Intel compiler
The following is an example of creating an execution file test.exe by compiling a test sample file with the Intel compiler in the KNL compute node.
※ You can copy and use the job submission test example files from /apps/shell/home/job_examples
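A minimal sketch of such a compile on a KNL node, assuming the intel/18.0.3 module (any installed Intel version from the table above may be used):
$ module load craype-mic-knl intel/18.0.3
$ icc -O3 -fPIC -xMIC-AVX512 -o test.exe test.c        # C source
$ ifort -O3 -fPIC -xMIC-AVX512 -o test.exe test.f90    # Fortran source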
Recommended options
Computing node | Recommended options
SKL | -O3 -fPIC -xCORE-AVX512
KNL | -O3 -fPIC -xMIC-AVX512
SKL & KNL | -O3 -fPIC -xCOMMON-AVX512
2) GNU compiler
To use the GNU compiler, add the required GNU compiler module. Available modules can be checked with the module avail command.
※ Check the available version by referring to the programming tool installation status table.
※ Version gcc/6.1.0 or higher must be used.
Compiler type
Compiler | Program | Source extensions
gcc / g++ | C / C++ | .C, .cc, .cpp, .cxx, .c++
gfortran | F77/F90 | .f, .for, .ftn, .f90, .fpp, .F, .FOR, .FTN, .FPP, .F90
GNU compiler option
Compiler option | Description
-O[1|2|3] | Object optimization. The number indicates the optimization level.
-march=skylake-avx512 | Generates code for CPUs with 512-bit registers (when computing on SKL nodes)
-march=knl | Generates code for MICs with 512-bit registers (when computing on KNL nodes)
-Ofast | Macro for -O3 -ffast-math
-funroll-all-loops | Unrolls all loops
-ffast-math | Uses a fast floating-point model
-minline-all-stringops | Allows more inlining and improves the performance of code that depends on memcpy, strlen, and memset
-fopenmp | Uses OpenMP-based multi-threaded code
-g | Generates debugging information
-pg | Generates profiling information (gmon.out)
-fPIC | Generates position-independent code (PIC)
-help | Prints the list of options
Example of using GNU compiler
The following is an example of creating an execution file test.exe by compiling a test sample file with the GNU compiler in the KNL compute node.
※ Copy the test sample file for job submission in /apps/shell/home/job_examples
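A minimal sketch of such a compile on a KNL node, assuming the gcc/7.2.0 module (any installed GNU version of 6.1.0 or higher may be used):
$ module load craype-mic-knl gcc/7.2.0
$ gcc -O3 -fPIC -march=knl -o test.exe test.c           # C source
$ gfortran -O3 -fPIC -march=knl -o test.exe test.f90    # Fortran source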
Recommended options
Computing node | Recommended options
SKL | -O3 -fPIC -march=skylake-avx512
KNL | -O3 -fPIC -march=knl
SKL & KNL | -fPIC -mpku
3) PGI compiler
To use the PGI compiler, add the required version of the PGI compiler module. Available modules can be checked with the module avail command.
※ Check the available version by referring to the programming tool installation status table.
Compiler type
Compiler | Program | Source extensions
pgcc / pgc++ | C / C++ | .C, .cc, .cpp, .cxx, .c++
pgfortran | F77/F90 | .f, .for, .ftn, .f90, .fpp, .F, .FOR, .FTN, .FPP, .F90
PGI compiler option
Compiler option | Description
-O[1|2|3|4] | Object optimization. The number indicates the optimization level.
-Mipa=fast | Interprocedural optimization
-fast | Macro for -O2 -Munroll=c:1 -Mnoframe -Mlre -Mautoinline
-fastsse | Optimization supporting SSE and SSE2
-g, -gopt | Generates debugging information
-mp | Uses OpenMP-based multi-threaded code
-Minfo=mp,ipa | Reports OpenMP-related and interprocedural optimization information
-pg | Generates profiling information (gmon.out)
-Mprof=time | Generates a PGPROF output file; frequently used to generate time-based profiling information at the instruction level
-Mprof=func | Generates a PGPROF output file with profiling information at the function level
-Mprof=lines | Generates a PGPROF output file with profiling information at the line level (computation can become significantly slower owing to the added overhead)
-mcmodel=medium | Used when a memory allocation of 2 GB or more is required
-tp=skylake | Option specific to Skylake architecture processors
-tp=knl | Option specific to KNL architecture processors
-fPIC | Generates position-independent code (PIC)
-help | Prints the list of options
Example of using PGI compiler
The following is an example of creating an execution file test.exe by compiling a test sample file with the PGI compiler in the KNL compute node.
※ You can copy and use the job submission test example files from /apps/shell/home/job_examples
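A minimal sketch of such a compile on a KNL node, assuming the pgi/19.1 module (any installed PGI version may be used):
$ module load craype-mic-knl pgi/19.1
$ pgcc -fast -tp=knl -o test.exe test.c             # C source
$ pgfortran -fast -tp=knl -o test.exe test.f90      # Fortran source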
Recommended options
Computing node | Recommended options
SKL | -fast -tp=skylake
KNL | -fast -tp=knl
SKL & KNL | -fast -tp=skylake,knl
4) Cray compiler
To use the Cray compiler, load the appropriate version of the Cray compiler module. You can check available modules with the module avail command.
Compiler type
Compiler | Program | Source extensions
cc / CC | C / C++ | .C, .cc, .cpp, .cxx, .c++
ftn | F77/F90 | .f, .for, .ftn, .f90, .fpp, .F, .FOR, .FTN, .FPP, .F90
Compiler option
Compiler option | Description
-O[1|2|3] | Object optimization. The number indicates the optimization level.
-hcpu=mic-knl | Generates code for MICs with 512-bit registers
-Oipa[0|1|2|3|4|5] | Interprocedural optimization. The number indicates the level.
-hunroll[0|1|2] | Unrolling option. The default is 2, which unrolls all loops.
-hfp[0|1|2|3|4] | Floating-point optimization
-homp (default) | Uses OpenMP-based multi-threaded code
-g, -G0 | Generates debugging information
-h pic | Used when a static memory allocation of 2 GB or more is required (used together with -dynamic)
-dynamic | Links shared libraries
Example of using Cray compiler
The following is an example of creating an execution file test.exe by compiling a test sample file with the Cray compiler in the KNL compute node.
※ You can copy and use the job submission test example files from /apps/shell/home/job_examples
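A minimal sketch of such a compile, assuming the PrgEnv-cray and cce/8.6.3 modules (if another PrgEnv module is already loaded, swap it instead of loading):
$ module load craype-mic-knl PrgEnv-cray cce/8.6.3
$ cc -hcpu=mic-knl -o test.exe test.c        # C source
$ ftn -hcpu=mic-knl -o test.exe test.f90     # Fortran source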
Recommended options
Computing node | Recommended options
SKL | Default value
KNL | -hcpu=mic-knl
SKL & KNL | Default value
※ test.c and test.f90 for testing can be found in /apps/shell/home/job_examples (test by copying to the user directory)
※ For programs that use the KNL optimization options, it is recommended to connect to a KNL debug node through interactive job submission and then compile (refer to “Job execution through scheduler → B. Job submission monitoring → 2) Interactive job submission”).
3. Compiling parallel programs
1) OpenMP compile
OpenMP is a technique that enables multi-threading simply through compiler directives. Parallel programs using OpenMP are compiled with the same compilers as sequential programs; only an additional compiler option is needed for parallel compilation, and most current compilers support the OpenMP directives.
Compiler option
Compiler | Program | Option
icc / icpc / ifort | C / C++ / F77/F90 | -qopenmp
gcc / g++ / gfortran | C / C++ / F77/F90 | -fopenmp
cc / CC / ftn | C / C++ / F77/F90 | -homp
pgcc / pgc++ / pgfortran | C / C++ / F77/F90 | -mp
Example of OpenMP program compilation (Intel compiler)
The following is an example of creating an execution file test_omp.exe by compiling a test_omp sample file that uses OpenMP with the Intel compiler in the KNL compute node.
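A minimal sketch, assuming the intel/18.0.3 module and a C source file test_omp.c (a Fortran source would use ifort instead):
$ module load craype-mic-knl intel/18.0.3
$ icc -qopenmp -O3 -fPIC -xMIC-AVX512 -o test_omp.exe test_omp.c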
Example of OpenMP program compilation (GNU compiler)
The following is an example of creating an execution file test_omp.exe by compiling a test_omp sample file that uses OpenMP with the GNU compiler in the KNL compute node.
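A minimal sketch, assuming the gcc/7.2.0 module:
$ module load craype-mic-knl gcc/7.2.0
$ gcc -fopenmp -O3 -fPIC -march=knl -o test_omp.exe test_omp.c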
Example of OpenMP program compilation (PGI compiler)
The following is an example of creating an execution file test_omp.exe by compiling a test_omp sample file that uses OpenMP with the PGI compiler in the KNL compute node.
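A minimal sketch, assuming the pgi/19.1 module:
$ module load craype-mic-knl pgi/19.1
$ pgcc -mp -fast -tp=knl -o test_omp.exe test_omp.c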
Example of OpenMP program compilation (Cray compiler)
The following is an example of creating an execution file test_omp.exe by compiling a test_omp sample file that uses OpenMP with the Cray compiler in the KNL compute node.
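A minimal sketch, assuming the PrgEnv-cray and cce/8.6.3 modules:
$ module load craype-mic-knl PrgEnv-cray cce/8.6.3
$ cc -homp -hcpu=mic-knl -o test_omp.exe test_omp.c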
2) MPI compiler
Users can execute the MPI commands in the following table. These commands are wrappers: the compiler designated through the user environment (.bashrc) actually compiles the source.
Category | Intel | GNU | PGI | Cray
Fortran | ifort | gfortran | pgfortran | ftn
Fortran + MPI | mpiifort | mpif90 | mpif90 | ftn
C | icc | gcc | pgcc | cc
C + MPI | mpiicc | mpicc | mpicc | cc
C++ | icpc | g++ | pgc++ | CC
C++ + MPI | mpiicpc | mpicxx | mpicxx | CC
Even when compiled through mpicc, the options corresponding to the original compiler being wrapped must be used.
Example of MPI program compilation (Intel compiler)
The following is an example of creating an execution file test_mpi.exe by compiling a test_mpi sample file that uses MPI with the Intel compiler in the KNL compute node.
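A minimal sketch, assuming the intel/18.0.3 and impi/18.0.3 modules and a C source file test_mpi.c (a Fortran source would use mpiifort):
$ module load craype-mic-knl intel/18.0.3 impi/18.0.3
$ mpiicc -O3 -fPIC -xMIC-AVX512 -o test_mpi.exe test_mpi.c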
Example of MPI program compilation (GNU compiler)
The following is an example of creating an execution file test_mpi.exe by compiling a test_mpi sample file that uses MPI with the GNU compiler in the KNL compute node.
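A minimal sketch, assuming the gcc/7.2.0 and openmpi/3.1.0 modules (any installed MPI built for the chosen compiler may be used):
$ module load craype-mic-knl gcc/7.2.0 openmpi/3.1.0
$ mpicc -O3 -fPIC -march=knl -o test_mpi.exe test_mpi.c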
Example of MPI program compilation (PGI compiler)
The following is an example of creating an execution file test_mpi.exe by compiling a test_mpi sample file that uses MPI with the PGI compiler in the KNL compute node.
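A minimal sketch, assuming the pgi/19.1 module together with an MPI module built for PGI (the openmpi/3.1.0 pairing is an assumption):
$ module load craype-mic-knl pgi/19.1 openmpi/3.1.0
$ mpicc -fast -tp=knl -o test_mpi.exe test_mpi.c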
Example of MPI program compilation (Cray compiler)
The following is an example of creating an execution file test_mpi.exe by compiling a test_mpi sample file that uses MPI with the Cray compiler in the KNL compute node.
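A minimal sketch, assuming the PrgEnv-cray and cce/8.6.3 modules; the Cray cc/ftn wrappers link the MPI library automatically:
$ module load craype-mic-knl PrgEnv-cray cce/8.6.3
$ cc -hcpu=mic-knl -o test_mpi.exe test_mpi.c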
C. Debugger and Profiler
The 5th supercomputer Nurion beta service provides DDT for program debugging by users. Furthermore, two profilers, Intel VTune and CrayPat, are provided for user program profiling.
1. Example of using debugger DDT
To use DDT on the 5th supercomputer, select the architecture, compiler, and MPI modules to be used, and then select the module that provides DDT (forge).
This example was tested in the same environment as above.
As preparation for using DDT, compile the program with the -g -O0 options and select the resulting execution file.
After running Xming on the user's desktop and completing the SSH X11 forwarding settings, execute the DDT command.
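A minimal sketch of launching DDT, assuming the forge/18.1.2 module and a hypothetical executable test_debug.exe built with -g -O0:
$ module load forge/18.1.2
$ ddt ./test_debug.exe &      # opens the DDT GUI over X11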
Execute the command and check that the following pop-up execution window appears.
Select “RUN” among the listed commands, select the file to debug as shown below, and then click “RUN” in the new pop-up window.
Debugging begins when the selected execution file enters debugging mode, as shown below.
2. Example of using profiler Intel vtune Amplifier
To use the VTune profiler on this system, select the architecture, compiler, and MPI modules, and then select the vtune module.
This example was tested in the same environment as above.
How to use CLI
The command for executing the Intel vtune Amplifier in CLI mode is as follows.
Execute the program to be analyzed with the amplxe-cl command.
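A minimal sketch of the CLI workflow, assuming the vtune/18.0.3 module and a hotspots analysis (the r000hs result directory name comes from the hotspots collection):
$ module load vtune/18.0.3
$ amplxe-cl -collect hotspots ./test.exe     # run the analysis; creates the r000hs result directory
$ amplxe-cl -report summary -r ./r000hs      # generate a summary report from the result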
If a compiled execution file is prepared and executed according to the command, the r000hs directory is generated. After confirming that the directory has been generated, the command for generating a report is executed to output the result shown below.
How to check the result of using GUI
Intel vtune Amplifier also supports the GUI mode. Only the method for checking the result using GUI is explained here.
Run Xming on the user's desktop and launch the VTune GUI.
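Assuming the same vtune module is loaded, the GUI can be started over X11 with:
$ amplxe-gui &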
Click “New Analysis” in the screen below.
When the screen shown below appears, check the number of CPUs and click the Start button to begin the analysis.
Once completed, the analysis results are summarized in multiple tabs as shown below.
3. Example of using profiler CrayPat
To use the CrayPat profiler, the environment, including the architecture, is set as shown below before starting the example.
First, compile the test.c file to be used in the example.
As a result, an execution file a.out is generated.
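A minimal sketch of this step, assuming the Cray programming environment and the perftools/6.5.2 module (CrayPat) are loaded before compiling:
$ module load perftools/6.5.2
$ cc test.c        # produces a.out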
For analysis with CrayPat, use pat_build to generate a new execution file.
As a result, a.out+pat file is generated.
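For instance:
$ pat_build a.out      # generates a.out+pat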
The generated execution file is an MPI program; therefore, it is executed with mpirun.
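A sketch of the run (the process count is illustrative):
$ mpirun -np 4 ./a.out+pat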
Once execution is complete, the a.out+pat+378250-3s directory is generated, and the xf-files/002812.xf file is created in the directory.
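The collected data are then processed with pat_report, for example:
$ pat_report a.out+pat+378250-3s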
When pat_report is executed as above, .ap2 and .apa files are created in the a.out+pat+378250-3s directory.
When the execution file is created again using the .apa file, the file named a.out+apa is created.
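A sketch of this rebuild step, assuming the .apa file generated in the experiment directory (the file name below is illustrative):
$ pat_build -O a.out+pat+378250-3s/build-options.apa      # generates a.out+apa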
When the generated a.out+apa file is executed, a new xf file is generated in the a.out+pat+378250-3t directory.
Reuse pat_report to process the new data.
When pat_report is executed as above, the .ap2 file and a tracing report are created.
Cray Apprentice2 (app2) is provided for visualizing the collected data.
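For example (the .ap2 file name depends on the run and is illustrative here):
$ app2 a.out+pat+378250-3t.ap2 &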
Visualization results are produced as shown below.
Last updated on November 07, 2024.