
MareNostrum 5

NEW essential changes!

info

This information is provisional and applies only during the pre-production period.

HPC user account management

  • Users will now have a unique username associated with their (institutional) email address:

    • Your username can now have resource assignments for multiple projects such as BSC, RES, EuroHPC, etc.
    • Your username belongs to a primary Unix group (typically corresponding to your institution but without any resource allocation) and will have an associated secondary group per project with resource allocation.
    • Therefore, you must use the secondary group with newgrp (Linux) and with the account option (Slurm) to manage each project's data and jobs (see the example after this list).
  • A slight modification is applied to existing BSC staff usernames:

    • bscXXYYY → bsc0XXYYY
  • A new bsc_command called bsc_project has been developed to easily switch between your projects.
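
For example, here is a minimal sketch of switching to a project and submitting on its behalf; <SECONDARY_GROUP> is a placeholder for your project's secondary group and job.sh for your own batch script:

    # switch your active Unix group to the project's secondary group
    newgrp <SECONDARY_GROUP>

    # charge a job to the same project by passing that group as the Slurm account
    sbatch --account=<SECONDARY_GROUP> job.sh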

Submitting jobs

  • When submitting a job, it is now mandatory to specify both the account (which will be the same as the secondary group associated with your project) and the Slurm queue; see the sketch below.

  • By specifying the queue, you can send jobs from any login node to any partition.
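
A minimal job-script header showing both requirements (a sketch: <SECONDARY_GROUP> and <QUEUE> are placeholders, the queue is assumed to be requested through Slurm's --qos option, and ./my_binary stands for your own executable):

    #!/bin/bash
    #SBATCH --job-name=example
    #SBATCH --account=<SECONDARY_GROUP>   # same as your project's secondary group
    #SBATCH --qos=<QUEUE>                 # the Slurm queue to submit to
    #SBATCH -n 1

    ./my_binary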

Available storage spaces

  • New (empty) filesystems:
    /gpfs/home       # one per user account (username)
    /gpfs/projects   # one per project (secondary group)
    /gpfs/scratch    # one per project (secondary group)
Backups

We keep incremental backups of /gpfs/home, /gpfs/apps and /gpfs/projects, the frequency of which depends on the amount of data. That said, it is your responsibility as a user of our facilities to back up all your critical data.

Filesystem        Time to complete copy
/gpfs/home        ~1 day
/gpfs/apps        ~1 day
/gpfs/projects    3~4 days

MareNostrum 4 and (old) Storage filesystems

  • The final location for your old MN4-Storage data is as follows:
    /gpfs/home/<PRIMARY_GROUP>/$USER/MN4/<MN4_USER>
    /gpfs/projects/<GROUP>/MN4/<GROUP>
    /gpfs/scratch/<GROUP>/MN4/<GROUP>/<MN4_USER>

Slurm changes with performance implications

  • Due to changes in the current Slurm version, srun no longer reads SLURM_CPUS_PER_TASK and does not inherit the --cpus-per-task option from sbatch.

  • Therefore, you must explicitly specify --cpus-per-task in your srun commands, or set the environment variable SRUN_CPUS_PER_TASK instead. For example:

    • Example 1:

      [...]
      #SBATCH -n 1
      #SBATCH -c 2

      srun --cpus-per-task=2 ./openmp_binary

    • Example 2:

      [...]
      #SBATCH -n 1
      #SBATCH -c 2

      export SRUN_CPUS_PER_TASK=${SLURM_CPUS_PER_TASK}

      srun ./openmp_binary

WARNING
  • This only applies to srun, not mpirun.
  • This becomes crucial when executing with more than one thread per process.
  • If this setting is omitted, thread pinning (thread affinity) will be adversely affected, resulting in threads overlapping on the same cores (hardware threads).
  • This will have a direct impact on the application's performance.
  • When using mpirun instead of srun, the SLURM_CPU_BIND variable must be set to "none" (a fuller job-script sketch follows this list):

    export SLURM_CPU_BIND=none
  • When using the NVIDIA HPC SDK in the accelerated partition, MPI binaries must be run with mpirun rather than srun. The Slurm support bundled with the NVIDIA SDK is not entirely compatible with MareNostrum 5's Slurm configuration, causing srun launches to fail.
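
For reference, here is a minimal job-script sketch for an mpirun launch (the task and CPU counts and the binary name mpi_openmp_binary are placeholders, and mpirun is assumed to pick up the allocation and process count from Slurm):

    [...]
    #SBATCH -n 4
    #SBATCH -c 2

    # required when launching with mpirun instead of srun
    export SLURM_CPU_BIND=none

    mpirun ./mpi_openmp_binary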

Other considerations

Remote Operation Error

If you run into an error similar to this one:

[gs15r1b68:2016180:0:2016180] ib_mlx5_log.c:179 Remote operation error on mlx5_0:1/IB (synd 0x14 vend 0x89 hw_synd 0/0)
[gs15r1b68:2016180:0:2016180] ib_mlx5_log.c:179 DCI QP 0x1b270 wqe[106]: SEND s-e [rqpn 0xce03 rlid 4285] [va 0x7f072bdf5400 len 65 lkey 0xb300f5]

Check that you are using a UCX module, as this error comes from a known bug in the system-wide installation of UCX. Running the following command should fix the issue:

module load ucx

Floating-Point Exception Error

Another error you might encounter is a floating-point exception, which appears as:

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

This error could also be related to the UCX module. To address it, try loading the UCX module with:

module load ucx

By loading the UCX module, you should be able to resolve both types of errors.

Hyper-Threading

All nodes in MareNostrum 5 come with Hyper-Threading capability. Unless you explicitly request to run with SMT, you don't need to do anything differently: you can continue configuring your jobs just as you did on MN4.

info

We'll soon provide guidance on effective utilization for those interested in leveraging this new functionality.
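
In the meantime, for those who want to experiment, the sketch below shows one possible way to request SMT using generic Slurm options; whether MareNostrum 5 honours --threads-per-core in this form is an assumption, so treat it as illustrative only:

    [...]
    #SBATCH -n 1
    #SBATCH -c 4
    #SBATCH --threads-per-core=2   # assumption: use both hardware threads of each core

    srun --cpus-per-task=${SLURM_CPUS_PER_TASK} ./openmp_binary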