MareNostrum ONA (Quantum)

Running jobs

Slurm is the utility used for batch processing support, so all jobs must be run through it. This section provides information for getting started with job execution at the cluster.

Submitting jobs

Jobs are submitted with the sbatch command, using Slurm batch directives to describe their requirements.

SBATCH commands

The basic Slurm directives for submitting and overseeing jobs using sbatch include the following (refer to Job directives for additional options):

  • Submit a job script to the queue system:

    sbatch -A, --account={account} -q, --qos={qos} {job_script} [args]
    Example
    sbatch -A ona01 -q bl_short myjob.sh
    WARNING

    The system will reject the submission with an error message if you attempt to submit a job without specifying a Slurm account and/or a queue (QoS).

    INFO

    You can get your available account/s (unixgroup) by running:

    bsc_project list

    And your QoS with:

    bsc_queues
  • Display all submitted jobs (from all your current accounts/projects):

    squeue
  • Display all submitted jobs from a specific account or project:

    squeue -A, --account={account}
  • Remove a job from the queue system, canceling the execution of the processes (if they were still running):

    scancel {jobid}
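
As a quick illustration of the submit/monitor/cancel workflow (the account and QoS are taken from the examples in this section; the job ID is a placeholder):

    sbatch -A ona01 -q bl_short myjob.sh   # submit; Slurm replies with the new job ID
    squeue -A ona01                        # check the job's state
    scancel 1234                           # cancel it, using the reported job ID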

To access more detailed information:

man sbatch
man srun
man salloc

Queues (QoS)

Several queues are available on the machine, and each user may have access to a different subset of them. Queues have different limits on the number of cores and the maximum duration of jobs.

You can check the queues you have access to, and their limits, at any time with:

bsc_queues
Standard queues

Standard queues (QoS) limits are as follows:

  • Partition Qblue:

    Queue  | Wallclock | Slurm QoS name | Priority
    Short  | 10 min    | bl_short       | 1000
    Long   | 1 h       | bl_long        | 10
    Extra* | 24 h      | bl_extra       | 1
  • Partition Qred:

    Queue  | Wallclock | Slurm QoS name | Priority
    Short  | 10 min    | rd_short       | 1000
    Long   | 1 h       | rd_long        | 10
    Extra* | 24 h      | rd_extra       | 1
  • Partition Interactive:

    Queue       | Wallclock | Slurm QoS name | Priority
    Interactive | 2 h       | interactive    | 100
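
For example, a job in the Qblue partition expected to run for about 30 minutes exceeds the 10-minute limit of bl_short, so it should be submitted with --qos=bl_long and must finish within 1 hour.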

Special queues

Additional queues, such as rd/bl_extra, can be provided for extended or larger executions. However, their allocation requires justification based on factors such as demand, the current workload of the machine, and other relevant conditions.

To request access to these specialized queues, please get in touch with us.

Job example

Simple circuit

  • Sbatch file myjob.sh:

    #!/bin/bash
    #SBATCH --job-name=quantum_job
    #SBATCH --account=ona01
    #SBATCH --qos=bl_short
    #SBATCH --time=00:05:00
    #SBATCH --output=quantum_%j.out
    #SBATCH --error=quantum_%j.err

    module load python
    python circuit.py

  • circuit.py

    import os
    import qililab as ql
    from qibo import Circuit, gates
    ql.logger.setLevel(40) # Set qililab's logger to a higher level so it only shows error messages
    # Load the runcard
    PLATFORM_PATH = os.getenv("RUNCARD")

    # Construct the circuit
    circuit = Circuit(2)

    # Add some gates
    circuit.add(gates.X(0))
    circuit.add(gates.H(1))
    circuit.add(gates.M(0,1))

    # Execute the circuit
    result = ql.execute(circuit,PLATFORM_PATH,nshots=1000)
    print(result.counts())

  • Send job:

    sbatch myjob.sh
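
Note that circuit.py reads the platform runcard path from the RUNCARD environment variable, which is not set in the job script above. A minimal way to provide it (the path below is only an illustrative placeholder) is to export it before submitting, since sbatch propagates the submission environment by default, or to add the export line to the job script itself:

    export RUNCARD=$HOME/runcards/my_platform.yml   # hypothetical runcard location
    sbatch myjob.sh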

Job directives

A job script must include a set of directives to convey the job's characteristics to the batch system. These directives appear as comments within the job script and must adhere to the syntax specified for sbatch. Additionally, the job script may contain a set of commands to execute.

sbatch syntax is of the form:

#SBATCH --directive=value

Below are some of the most frequently used directives.

Resource reservation

REMARK

All jobs in the Qblue or Qred partition will allocate the entire node (80 CPUs + 1 QPU).

  • Request the queue for the job:

    #SBATCH --qos={qos}
    REMARK
    • Remember that this parameter is mandatory when submitting jobs in the current MareNostrum Ona configuration.
  • Set a time limit for the total runtime of the job:

    #SBATCH --time={time} #DD-HH:MM:SS
    caution
    • This field is mandatory, and you should set it to a value greater than the execution time needed for your application while staying within the time limit of the chosen queue.
    • Please be aware that your job will be terminated once the specified time has elapsed.
    • The maximum execution time is determined by the queue's time limit, not by the number of shots or circuits in the job.
    Example 1 (job limited to 10 minutes)
    #SBATCH --time=00:10:00
    Example 2 (job limited to 1 day)
    #SBATCH --time=1-00:00:00
  • Sometimes, node reservations may be approved for executions exclusive to specific accounts, which is particularly beneficial for educational courses.

    To specify the reservation name for job allocation, assuming your account has access to that reservation:

    #SBATCH --reservation={name}
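
    For instance (the reservation name below is purely hypothetical):

    #SBATCH --reservation=quantum_course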

Job arrays

  • Submit a job array for the execution of multiple jobs with identical parameters:

    #SBATCH --array={indexes}
    REMARKS
    • The index specification determines which array index values to utilize.
    • Multiple values can be indicated using a comma-separated list and/or a range of values with a "-" separator.

    Job arrays will include the configuration of two additional environment variables:

    1. SLURM_ARRAY_JOB_ID: will be assigned the initial job ID of the array.
    2. SLURM_ARRAY_TASK_ID: will be assigned the job array index value.
    Example
    sbatch --array=1-3 job.cmd
    Submitted batch job 36

    This will create a job array consisting of three jobs; the environment variables for each job will then be set as follows:

    # Job 1
    SLURM_JOB_ID=36
    SLURM_ARRAY_JOB_ID=36
    SLURM_ARRAY_TASK_ID=1

    # Job 2
    SLURM_JOB_ID=37
    SLURM_ARRAY_JOB_ID=36
    SLURM_ARRAY_TASK_ID=2

    # Job 3
    SLURM_JOB_ID=38
    SLURM_ARRAY_JOB_ID=36
    SLURM_ARRAY_TASK_ID=3
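
    As a sketch, the corresponding job script could use SLURM_ARRAY_TASK_ID to select a different input for each array task (the script and input file names below are hypothetical):

    #!/bin/bash
    #SBATCH --job-name=array_example
    #SBATCH --account=ona01
    #SBATCH --qos=bl_short
    #SBATCH --time=00:05:00
    #SBATCH --array=1-3
    #SBATCH --output=array_%A_%a.out   # %A = array job ID, %a = array task index

    module load python
    # Each task processes its own input file: input_1.dat, input_2.dat, input_3.dat
    python my_script.py input_${SLURM_ARRAY_TASK_ID}.dat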

Working directory and job output/error files

  • Establish the working directory for your job, indicating the location where the job will be executed:

    #SBATCH --chdir={pathname}
    caution
    • If not explicitly specified, it defaults to the current working directory when the job is submitted.
  • Specify the filenames to which to redirect the job's standard output (stdout) and standard error output (stderr):

    #SBATCH --output={filename}
    #SBATCH --error={filename}
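
    The filenames accept Slurm substitution patterns; for instance, %j expands to the job ID, as in the job example earlier on this page:

    #SBATCH --output=quantum_%j.out
    #SBATCH --error=quantum_%j.err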

Email notifications

  • These two directives are presented together as they should be utilized simultaneously. They activate email notifications, which are triggered when a job commences its execution (begin), concludes its execution (end), or both (all):

    #SBATCH --mail-type={begin|end|all|none}
    #SBATCH --mail-user={email}
    Example (notified at the end of the job execution)
    #SBATCH --mail-type=end
    #SBATCH --mail-user=brucespringsteen@bsc.es
    REMARKS
    • The none option doesn't trigger any e-mail; it is equivalent to omitting the directives.
    • The only requirement is that the specified email address is valid and matches the one you use for the HPC User Portal (see the HPC User Portal documentation for more information about the portal).

Interactive jobs

Interactive sessions must be allocated through Slurm. Note that the interactive queue does not have access to the QPU (the chip).

salloc -A, --account={account} -q, --qos={qos} [ OPTIONS ]

Here are some of the parameters you can use with salloc (see also Job directives):

-J, --job-name={name}
-q, --qos={name}
-p, --partition={name}

-t, --time={time}
-n, --ntasks={number}
-c, --cpus-per-task={number}
-N, --nodes={number}

--exclusive
--x11
Examples
  • Request an interactive session for 10 minutes, 1 task, 4 CPUs (cores) per task, in the "Interactive" partition (interactive):
    salloc -A ona01 -t 00:10:00 -n 1 -c 4 -q interactive -J myjob
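
Once the allocation is granted, commands can be launched on the allocated resources with srun, for example (the script name below is just a placeholder):

    module load python
    srun python my_script.py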

Understanding job status and reason codes

When using the squeue command, Slurm provides information about the status of your submitted jobs. If a job is still waiting to start, its entry will also include a reason code. Slurm employs specific codes to convey this information, and the following sections explain the most relevant ones.

Job state codes

Common state codes for submitted jobs include:

  • COMPLETED (CD): The job has completed the execution.
  • COMPLETING (CG): The job is finishing, but some processes are still active.
  • FAILED (F): The job terminated with a non-zero exit code.
  • PENDING (PD): The job is waiting for resource allocation. This is the most common state right after running sbatch; the job will run eventually.
  • PREEMPTED (PR): The job was terminated because of preemption by another job.
  • RUNNING (R): The job is allocated and running.
  • SUSPENDED (S): A running job has been stopped with its cores released to other jobs.
  • STOPPED (ST): A running job has been stopped with its cores retained.

Job reason codes

The following list outlines the most frequently encountered reason codes for jobs that have been submitted but have not yet entered the running state:

  • Priority: One or more higher-priority jobs are queued ahead of yours. Your job will eventually run.
  • Dependency: This job is waiting for a dependent job to complete and will run afterwards.
  • Resources: The job is waiting for resources to become available and will eventually run.
  • InvalidAccount: The job’s account is invalid. Cancel the job and resubmit it with a correct account.
  • InvalidQoS: The job’s QoS is invalid. Cancel the job and resubmit it with a correct QoS.
  • QOSGrpCpuLimit: All CPUs assigned to your job’s specified QoS are in use; job will run eventually.
  • QOSGrpMaxJobsLimit: Maximum number of jobs for your job’s QoS have been met; job will run eventually.
  • QOSGrpNodeLimit: All nodes assigned to your job’s specified QoS are in use; job will run eventually.
  • PartitionCpuLimit: All CPUs assigned to your job’s specified partition are in use; job will run eventually.
  • PartitionMaxJobsLimit: Maximum number of jobs for your job’s partition have been met; job will run eventually.
  • PartitionNodeLimit: All nodes assigned to your job’s specified partition are in use; job will run eventually.
  • AssociationCpuLimit: All CPUs assigned to your job’s specified association are in use; job will run eventually.
  • AssociationMaxJobsLimit: Maximum number of jobs for your job’s association have been met; job will run eventually.
  • AssociationNodeLimit: All nodes assigned to your job’s specified association are in use; job will run eventually.

Resource usage and job priorities

Projects will be allocated a specific amount of compute hours or core hours for utilization. A single core hour represents the computational time of one core over one hour. In the case of Quantum nodes, which require the use of the entire node, a job will always consume 80 core hours per hour from the allocated budget.
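For example, a Quantum-node job that runs for 2 hours consumes 2 × 80 = 160 core hours from the project's budget.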

The accounting relies solely on the utilization of compute hours.

Various factors determine a job's priority and subsequent scheduling, the most significant being job size, queue waiting time, and fair share among groups:

  • MareNostrum ONA queues (QoS): shorter queues have higher priority.
  • The waiting time in queues is also considered, and jobs progressively gain more priority the longer they wait.
  • Additionally, our queue system incorporates a fair-share policy among groups. Users with fewer executed jobs and consumed compute hours receive higher priority for their jobs compared to groups with increased usage. This ensures a fair distribution of computing time, allowing users to run jobs without favouring one group over another.

You can review your current fair-share score using the command:

sshare -la

Notifications

Currently, receiving email notifications about job status is not supported. To monitor the execution or completion of your jobs, you must connect to the system and manually check their status. Automatic notifications are planned to be enabled in the future.