HSMCompute
System Overview
HSMCompute is a cluster based on IBM Power9 processors, running a Linux operating system and interconnected via Ethernet.
It has the following configuration:
1 login node (hsmcompute1), with:
- 2 x IBM Power9 8335-GTH @ 2.3 GHz (3.8 GHz on turbo), 20 cores and 4 threads/core per processor, 160 threads in total
- 512 GB memory distributed in 16 RDIMM x 32GB DDR4 @ 2666 MHz
- GPFS via Ethernet, 200 Gb/s (2x100 Gb/s)
- 2 x 1 TB NVME mounted as RAID0
This node is divided in two using cgroups: 80 threads and 256 GB of memory for interactive sessions, and 80 threads and 256 GB of memory for Slurm, to ensure that Slurm always has enough resources.
2 compute nodes (hsmcompute[2,3]), each with:
- 2 x IBM Power9 8335-GTH @ 2.3 GHz (3.8 GHz on turbo), 16 cores and 4 threads/core per processor, 128 threads in total
- 1 TB memory
- 2 x GPU NVIDIA V100 (Volta) with 16GB HBM2
- 2 x 1 TB NVME mounted as RAID0
- GPFS via Ethernet, 50 Gb/s (2x25 Gb/s)
The operating system is Red Hat Enterprise Linux Server 8.6 alternative.
Compiling applications
At the moment the system provides only GCC as its C/C++ compiler: version 8.5.0 is the default, and versions 8.2.0, 9.2.0 and 11.2.0 are available as modules.
You can also find:
- GO/1.12.5
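As an illustrative example (the exact module name is an assumption based on the versions listed above), you could switch to GCC 11.2.0 and build an OpenMP program optimized for Power9 like this:

    module load gcc/11.2.0
    gcc -O2 -mcpu=power9 -fopenmp -o my_app my_app.c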
Connecting to HSM Compute
HSMCompute is not publicly accessible from the internet. If you want to access it from outside, you will need to send your public IP address to support@bsc.es so we can whitelist it (if you are a BSC employee, you can connect directly from the BSC VPN network).
You can connect to the login node using either of the following hostnames (both refer to the same node; hsmcompute1 is an alias):
- icc.bsc.es / hsmcompute1.bsc.es
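For example, once your IP has been whitelisted (or from the BSC VPN), a connection would look like this, where your_username is a placeholder for your actual account name:

    ssh your_username@hsmcompute1.bsc.es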
hsmcompute2 and hsmcompute3 are only accessible via Slurm, since they aren't login nodes. You can request one of the two with the salloc command, using the -x option to exclude hsmcompute1. Keep in mind that any interactive session requesting GPUs will land on hsmcompute2/3; GPUs can be requested with --gres=gpu:number on your salloc command, as shown in the example below.
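For instance, the following salloc command (partition and account options are omitted; add them if your allocation requires them) requests an interactive session with one task and one GPU while excluding the login node, so the job lands on hsmcompute2 or hsmcompute3:

    salloc -n 1 --gres=gpu:1 -x hsmcompute1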
You must use Secure Shell (ssh) tools to log in to the cluster or transfer files to it. We do not accept incoming connections using protocols such as telnet, ftp, rlogin, rcp, or rsh. Once you have logged into the cluster you cannot make outgoing connections, for security reasons.
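For file transfers you can rely on the standard ssh-based tools; for example (the file name and username are placeholders):

    scp my_data.tar.gz your_username@hsmcompute1.bsc.es:~/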