Package guarantees

A package comprises some number of slots and some amount of memory, all on a single host. Administrators can configure an SLA of a number of packages for jobs of a particular class. The idea is that a package has all the slot and memory resources that a single job of that class needs to run. Each job running in a guarantee pool must occupy a whole multiple of packages. Best practice is to define the package size based on the resource requirements of the jobs for which you made the guarantees.
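As a rough illustration of the whole-multiple rule, the sketch below computes how many packages a job would occupy. The function name packages_needed and the rounding rule (round each resource up to whole packages, then take the larger count) are assumptions made for illustration, not a description of LSF internals.

```python
def packages_needed(job_slots: int, job_mem: int, pkg_slots: int, pkg_mem: int) -> int:
    """Smallest whole number of packages that covers both the job's slots
    and its memory (assumed rule, for illustration only)."""
    by_slots = (job_slots + pkg_slots - 1) // pkg_slots  # integer ceiling
    by_mem = (job_mem + pkg_mem - 1) // pkg_mem          # integer ceiling
    return max(by_slots, by_mem)


# Hypothetical job needing 3 slots and 5000 units of memory,
# with a package of 1 slot and 1000 units of memory.
print(packages_needed(3, 5000, 1, 1000))  # 5
```

Under this assumption, a package sized to match the typical job means each job occupies exactly one package and no reserved resources are wasted on rounding.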

Configuring guarantee package policies

Guarantee policies (pools) are configured in lsb.resources. For package guarantees, these policies specify:

  • A set (pool) of hosts

  • The resources in a package

  • How many packages to reserve for each set of service classes

  • Policies for loaning out reserved resources that are not immediately needed

Configuration is done in the same way as for a slot or host guarantee policy, with a GuaranteedResourcePool section in lsb.resources. The main difference is that the TYPE parameter is used to express the package resources. The following example is a guarantee package pool defined in lsb.resources:

Begin GuaranteedResourcePool
NAME = example_pool
TYPE = package[slots=1:mem=1000]
HOSTS = hgroup1
RES_SELECT = mem > 16000
DISTRIBUTION = ([sc1, 25%] [sc2, 25%] [sc3, 30%])
End GuaranteedResourcePool

A package need not have both slots and memory. Setting TYPE=package[slots=1] gives essentially the same result as a slot pool. It may be useful to have only slots in a package (and not mem) in order to provide guarantees for parallel jobs that require multiple CPUs on a single host, where memory is not an important resource. It is likely not useful to configure guarantees of only memory without slots, although the feature supports this.

Each host can belong to at most one slot/host/package guarantee pool. At startup, mbatchd goes through the hosts one by one. For each host, mbatchd goes through the list of guarantee pools in configuration order and assigns the host to the first pool for which the host meets the RES_SELECT and HOSTS criteria.
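The first-match assignment described above can be sketched as follows. This is a simplified model, not mbatchd code: the Host and Pool classes, the assign_pool function, and the pool named fallback_pool are hypothetical, HOSTS is reduced to a single host group name, and RES_SELECT is reduced to a Python predicate.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Host:
    name: str
    group: str   # host group the host belongs to (simplification)
    mem: int     # memory value tested by the selection predicate


@dataclass
class Pool:
    name: str
    hosts_group: str                    # stands in for HOSTS
    res_select: Callable[[Host], bool]  # stands in for RES_SELECT


def assign_pool(host: Host, pools: List[Pool]) -> Optional[str]:
    """Return the first pool, in configuration order, whose HOSTS and
    RES_SELECT criteria the host meets, or None if no pool matches."""
    for pool in pools:
        if host.group == pool.hosts_group and pool.res_select(host):
            return pool.name
    return None


# Pools listed in configuration order; fallback_pool is hypothetical.
pools = [
    Pool("example_pool", "hgroup1", lambda h: h.mem > 16000),
    Pool("fallback_pool", "hgroup1", lambda h: True),
]
hosts = [
    Host("hostA", "hgroup1", 32000),
    Host("hostB", "hgroup1", 8000),
    Host("hostC", "hgroup2", 64000),
]
assignment = {h.name: assign_pool(h, pools) for h in hosts}
print(assignment)
# {'hostA': 'example_pool', 'hostB': 'fallback_pool', 'hostC': None}
```

Because the first match wins, the order of the pool definitions matters whenever a host satisfies the criteria of more than one pool.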

Total packages of a pool

The total packages of a pool is intended to represent the number of packages that can be supplied by the pool if there are no jobs running in the pool. This total is used for:

  • Display purposes – bresources displays the total for each pool, as well as showing the pool status as overcommitted when the number guaranteed in the pool exceeds the total.

  • Determining the actual number of packages to reserve when guarantees are given as percentages instead of absolute numbers.
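Both uses of the total can be illustrated with a short sketch. The function names and the pool total of 40 packages are hypothetical, and rounding fractional results down is an assumption; the source does not state how LSF rounds percentage guarantees.

```python
def guaranteed_packages(total_packages: int, distribution: dict) -> dict:
    """Convert percentage shares into package counts.
    Rounding down is an assumption, not documented LSF behavior."""
    return {sc: total_packages * pct // 100 for sc, pct in distribution.items()}


def is_overcommitted(total_packages: int, guaranteed: dict) -> bool:
    """A pool is overcommitted when the guarantees exceed its total."""
    return sum(guaranteed.values()) > total_packages


# Percentages from the example pool, applied to a hypothetical total of 40.
shares = guaranteed_packages(40, {"sc1": 25, "sc2": 25, "sc3": 30})
print(shares)                        # {'sc1': 10, 'sc2': 10, 'sc3': 12}
print(is_overcommitted(40, shares))  # False

# Absolute guarantees that add up to more than the pool can supply.
print(is_overcommitted(40, {"sc1": 20, "sc2": 20, "sc3": 10}))  # True
```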

LSF calculates the total packages of a pool by summing the total packages of each host over all hosts in the pool. Hosts that are currently unavailable are not considered to be part of a pool. On each host in a pool, the total contributed by the host is the number of packages that fit into the MXJ and total memory of the host. For the purposes of computing the total packages of the host, mbschd uses the following two host totals, and the more limiting of the two determines the result:

  • The total slots of the host (MXJ), and

  • The maximum memory of the host, i.e. maxmem as reported by lshosts.

The total packages on a host is the number of packages that can fit into the total slots and maxmem of the host. This way, if there are processes on the host that do not belong to LSF jobs, the memory occupied by these processes does not affect the total packages for the host. Note that even if all the LSF jobs on the host are killed, LSF jobs may not be able to use memory all the way up to maxmem.
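The calculation can be sketched as below, using the package from the example pool (1 slot and 1000 units of memory). The function name, the host records, and their MXJ and maxmem values are hypothetical; a package resource that is not configured is passed as None and places no limit.

```python
from typing import Optional


def host_total_packages(mxj: int, maxmem: int,
                        pkg_slots: Optional[int], pkg_mem: Optional[int]) -> int:
    """Number of packages that fit into both the total slots (MXJ) and
    the maxmem of a host. A resource missing from the package is ignored."""
    limits = []
    if pkg_slots:
        limits.append(mxj // pkg_slots)
    if pkg_mem:
        limits.append(maxmem // pkg_mem)
    return min(limits) if limits else 0


# Hypothetical hosts; unavailable hosts are not part of the pool.
hosts = [
    {"name": "hostA", "mxj": 16, "maxmem": 32000, "available": True},
    {"name": "hostB", "mxj": 8, "maxmem": 4000, "available": True},
    {"name": "hostC", "mxj": 16, "maxmem": 64000, "available": False},
]

pool_total = sum(
    host_total_packages(h["mxj"], h["maxmem"], 1, 1000)
    for h in hosts
    if h["available"]
)
print(pool_total)  # 20: hostA contributes 16, hostB contributes 4
```

On hostA the slots are the limiting resource, while on hostB memory is, which is why each host is evaluated separately before summing.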

Memory on a host may also be used by processes outside of LSF jobs. As a result, even when there are no jobs running on a host, the number of free packages on the host can be less than the total packages of the host. The free packages are computed from the available slots and available memory (mem).

Currently available packages in a pool

So that LSF knows how many packages to reserve during scheduling, LSF must track the number of available packages in each package pool. The number of packages available on a host in the pool is equal to the number of packages that fit into the free resources on the host. The available packages of a pool is simply this amount summed over all hosts in the pool.

For example, suppose there are 5 slots and 5 GB free on the host. Each package contains 2 slots and 2 GB memory. Therefore, there are 2 packages currently available on the host.
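The arithmetic in this example, and the sum over a pool, can be checked with a few lines of Python. The function name free_packages and the two additional hosts are hypothetical and serve only to illustrate the summation.

```python
def free_packages(free_slots: int, free_mem_gb: int,
                  pkg_slots: int, pkg_mem_gb: int) -> int:
    """Packages that fit into the free slots and free memory of one host."""
    return min(free_slots // pkg_slots, free_mem_gb // pkg_mem_gb)


# The host from the example: 5 free slots, 5 GB free, package = 2 slots + 2 GB.
print(free_packages(5, 5, 2, 2))  # 2

# Hypothetical pool of three hosts, given as (free slots, free GB).
pool_hosts = [(5, 5), (4, 1), (8, 16)]
pool_available = sum(free_packages(s, m, 2, 2) for s, m in pool_hosts)
print(pool_available)  # 6: the hosts contribute 2, 0, and 4 packages
```

The leftover slot and 1 GB on the example host are not counted, because they cannot hold a complete package.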

Hosts in other states are temporarily excluded from the pool, and any SLA jobs running on hosts in other states are not counted towards the guarantee.