LSB_RESOURCE_ENFORCE

Syntax

LSB_RESOURCE_ENFORCE="resource [resource]"

Description

Controls resource enforcement through the Linux cgroup memory and cpuset subsystems on Linux systems with cgroup support. Memory and cpuset enforcement for Linux cgroups is supported on Red Hat Enterprise Linux (RHEL) 6.2 or above and SUSE Linux Enterprise Server (SLES) 11 SP2 or above.
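
Before enabling enforcement, it can help to confirm that the kernel exposes the memory and cpuset controllers. The following is a minimal sketch, not part of LSF: it assumes the standard Linux /proc/cgroups table (columns: subsystem name, hierarchy, number of cgroups, enabled flag), and the helper name cgroup_enabled is hypothetical.

```shell
# cgroup_enabled SUBSYSTEM [FILE]
# Succeeds if SUBSYSTEM is listed with enabled=1 in FILE
# (defaults to /proc/cgroups). Hypothetical helper, not an LSF command.
cgroup_enabled() {
    awk -v subsys="$1" '
        $1 == subsys && $4 == 1 { found = 1 }
        END { exit !found }
    ' "${2:-/proc/cgroups}" 2>/dev/null
}

# Report the two controllers that LSB_RESOURCE_ENFORCE relies on.
for subsys in memory cpuset; do
    if cgroup_enabled "$subsys"; then
        echo "$subsys: enabled"
    else
        echo "$subsys: not available"
    fi
done
```

A controller reported as not available suggests that the corresponding resource value would have no effect on that host.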

resource can be memory, cpu, or both, in either order.

LSF can impose strict host-level memory and swap limits on systems that support Linux cgroups. These limits cannot be exceeded. All LSF job processes are controlled by the Linux cgroup system. If job processes on a host use more memory than the defined limit, the Linux cgroup memory subsystem kills the job immediately. Memory is enforced per job on each host, not per task. If the host OS is Red Hat Enterprise Linux 6.3 or above, cgroup memory limits are enforced, and LSF is notified to terminate the job. Users are also notified through specific termination reasons displayed by bhist -l.

To enable memory enforcement, configure LSB_RESOURCE_ENFORCE="memory".

Note: If LSB_RESOURCE_ENFORCE="memory" is configured, all existing LSF parameters related to memory limits, such as LSF_HPC_EXTENSIONS="TASK_MEMLIMIT", LSF_HPC_EXTENSIONS="TASK_SWAPLIMIT", LSB_JOB_MEMLIMIT, and LSB_MEMLIMIT_ENFORCE, are ignored.

LSF can also enforce CPU affinity binding on systems that support the Linux cgroup cpuset subsystem. When CPU affinity binding through Linux cgroups is enabled, LSF will create a cpuset to contain job processes if the job has affinity resource requirements, so that the job processes cannot escape from the allocated CPUs. Each affinity job cpuset includes only the CPU and memory nodes that LSF distributes. Linux cgroup cpusets are only created for affinity jobs.

To enable CPU enforcement, configure LSB_RESOURCE_ENFORCE="cpu".

If you are enabling memory and CPU enforcement through the Linux cgroup memory and cpuset subsystems after upgrading an existing LSF cluster, make sure that the following parameters are set in lsf.conf:
  • LSF_PROCESS_TRACKING=Y

  • LSF_LINUX_CGROUP_ACCT=Y
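
Taken together, the settings described above might look like the following lsf.conf fragment. This is a sketch, assuming that both memory and CPU enforcement are wanted and that LSB_RESOURCE_ENFORCE is set in the same file as the two parameters listed above; remove either resource name to enable only the other.

```shell
# lsf.conf (fragment)
# Enforce both memory limits and CPU affinity binding through cgroups.
LSB_RESOURCE_ENFORCE="memory cpu"
# Required when enabling enforcement on an upgraded cluster.
LSF_PROCESS_TRACKING=Y
LSF_LINUX_CGROUP_ACCT=Y
```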

Examples

For a parallel job with 3 tasks and a memory limit of 100 MB, such as the following:

bsub -n 3 -M 100 -R "span[ptile=2]" blaunch ./mem_eater

The application mem_eater keeps increasing its memory usage. LSF will kill the job if it consumes more than 200 MB total memory on one host. For example, if hosta runs 2 tasks and hostb runs 1 task, the job will only be killed if total memory exceeds 200 MB on either hosta or hostb. If one of the tasks consumes more than 100 MB memory but less than 200 MB, and the other task doesn’t consume any memory, the job will not be killed. That is, LSF does not support per-task memory enforcement for cgroups.
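
The per-host rule can be illustrated with a small sketch. It is not LSF code: the function name host_over_limit is hypothetical, and the 200 MB host limit is taken directly from the example above rather than derived. The function sums the memory used by all tasks of the job on one host and compares the total against the host limit, which is why a single task may exceed 100 MB without the job being killed.

```shell
# host_over_limit LIMIT_MB USAGE_MB...
# Succeeds if the combined usage of all tasks on one host exceeds LIMIT_MB.
# Illustrative only; the actual enforcement is done by the cgroup memory subsystem.
host_over_limit() {
    limit_mb=$1
    shift
    total_mb=0
    for usage_mb in "$@"; do
        total_mb=$((total_mb + usage_mb))
    done
    [ "$total_mb" -gt "$limit_mb" ]
}

# hosta runs 2 tasks: one uses 150 MB, the other uses none.
if host_over_limit 200 150 0; then
    echo "job killed"
else
    echo "job keeps running"
fi
```

With usages of 150 MB and 60 MB on the same host, the total of 210 MB would exceed the limit and the check would succeed.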

For a job with affinity requirement, such as the following:

bsub -R "affinity[core:membind=localonly]" ./myapp

LSF will create a cpuset which contains one core and attach the process ID of the application ./myapp to this cpuset. The cpuset serves as a strict container for job processes, so that the application ./myapp cannot bind to other CPUs. LSF will add all memory nodes into the cpuset to make sure the job can access all memory nodes on the host, and will make sure job processes will access preferred memory nodes first.

Default

Not defined. Resource enforcement through the Linux cgroup system is not enabled.