Typical slot allocation scenarios

3 queues with SLOT_SHARE 50%, 30%, 20%, with 15 job slots

This scenario has three phases:

  1. All three queues have jobs running, and LSF assigns slots to the queues as expected: 8, 5, and 2. Although a 20% share of 15 slots entitles queue Genova to 3 slots, the total slot assignment cannot exceed 15, so Genova is allocated only the 2 remaining slots:

    bqueues
    QUEUE_NAME    PRIO STATUS          MAX JL/U JL/P JL/H NJOBS  PEND   RUN  SUSP  
    Roma           50   Open:Active      -    -    -    -  1000   992     8     0 
    Verona         48   Open:Active      -    -    -    -   995   990     5     0 
    Genova         48   Open:Active      -    -    -    -   996   994     2     0

  2. When queue Verona has finished its work, queues Roma and Genova first get their respective shares of 8 and 3 slots. This leaves 4 slots to be redistributed according to the queues' shares: 50% (2 slots) to Roma and 20% (1 slot) to Genova. The one remaining slot is also assigned to queue Roma:

    bqueues
    QUEUE_NAME  PRIO STATUS          MAX JL/U JL/P JL/H NJOBS  PEND   RUN  SUSP  
    Roma         50   Open:Active      -    -    -    -   231   221    11     0 
    Verona       48   Open:Active      -    -    -    -     0     0     0     0 
    Genova       48   Open:Active      -    -    -    -   496   491     4     0

  3. When queues Roma and Verona have no more work to do, Genova can use all the available slots in the cluster:

    bqueues
    QUEUE_NAME   PRIO STATUS          MAX JL/U JL/P JL/H NJOBS  PEND   RUN  SUSP  
    Roma          50   Open:Active      -    -    -    -     0     0     0     0 
    Verona        48   Open:Active      -    -    -    -     0     0     0     0 
    Genova        48   Open:Active      -    -    -    -   475   460    15     0
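
The arithmetic in the three phases above can be reproduced with a short sketch. This is not LSF's actual scheduling code: the function name `allocate_slots`, the rounding rule (round each share up, then cap at the slots still unassigned), and the repeated passes over leftover slots are assumptions chosen only because they yield the slot counts shown in this scenario.

```python
def allocate_slots(total_slots, shares, active):
    """Split total_slots among the active queues.

    shares: list of (queue_name, percent) in priority order.
    active: set of queue names that currently have work.
    Hypothetical helper for illustration; not an LSF API.
    """
    alloc = {name: 0 for name, _ in shares}
    remaining = total_slots
    while remaining > 0:
        pool = remaining  # slots to split in this pass
        for name, pct in shares:
            if name not in active or remaining == 0:
                continue
            want = -(-pct * pool // 100)  # ceiling of pct% of pool
            grant = min(want, remaining)  # never exceed what is left
            alloc[name] += grant
            remaining -= grant
        if remaining == pool:  # no active queue took a slot; stop
            break
    return alloc


SHARES = [("Roma", 50), ("Verona", 30), ("Genova", 20)]

phase1 = allocate_slots(15, SHARES, {"Roma", "Verona", "Genova"})
phase2 = allocate_slots(15, SHARES, {"Roma", "Genova"})
phase3 = allocate_slots(15, SHARES, {"Genova"})
print(phase1)  # {'Roma': 8, 'Verona': 5, 'Genova': 2}
print(phase2)  # {'Roma': 11, 'Verona': 0, 'Genova': 4}
print(phase3)  # {'Roma': 0, 'Verona': 0, 'Genova': 15}
```

Under the same assumptions, a 21-slot pool with the same three shares gives 11, 7, and 3 slots.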
The following figure illustrates phases 1, 2, and 3:

2 pools, 30 job slots, and 2 queues outside any pool

  • poolA has 15 slots and contains queues Roma (50% share, 8 slots), Verona (30% share, 5 slots), and Genova (20% share, 2 remaining slots to total 15).

  • poolB has 15 slots and contains queues Pisa (30% share, 5 slots), Venezia (30% share, 5 slots), and Bologna (30% share, 5 slots).

  • Two other queues, Milano and Parma, do not belong to any pool, but they can use the hosts of poolB. The queues Milano, Parma, Venezia, and Bologna all have the same priority.

The queues Milano and Parma run very short jobs that get submitted periodically in bursts. When no jobs are running in them, the distribution of jobs looks like this:

QUEUE_NAME  PRIO STATUS          MAX JL/U JL/P JL/H NJOBS  PEND   RUN  SUSP  
Roma         50   Open:Active      -    -    -    -  1000   992     8     0 
Verona       48   Open:Active      -    -    -    -  1000   995     5     0 
Genova       48   Open:Active      -    -    -    -  1000   998     2     0 
Pisa         44   Open:Active      -    -    -    -  1000   995     5     0 
Milano       43   Open:Active      -    -    -    -     2     2     0     0 
Parma        43   Open:Active      -    -    -    -     2     2     0     0 
Venezia      43   Open:Active      -    -    -    -  1000   995     5     0 
Bologna      43   Open:Active      -    -    -    -  1000   995     5     0

When Milano and Parma have jobs, their higher priority reduces the number of slots available to, and in use by, Venezia and Bologna:

QUEUE_NAME   PRIO STATUS          MAX JL/U JL/P JL/H NJOBS  PEND   RUN  SUSP  
Roma          50   Open:Active      -    -    -    -   992   984     8     0 
Verona        48   Open:Active      -    -    -    -   993   990     3     0 
Genova        48   Open:Active      -    -    -    -   996   994     2     0 
Pisa          44   Open:Active      -    -    -    -   995   990     5     0 
Milano        43   Open:Active      -    -    -    -    10     7     3     0 
Parma         43   Open:Active      -    -    -    -    11     8     3     0 
Venezia       43   Open:Active      -    -    -    -   995   995     2     0 
Bologna       43   Open:Active      -    -    -    -   995   995     2     0

Round-robin slot distribution: 13 queues and 2 pools

  • Pool poolA has 3 hosts, each with 7 slots, for a total of 21 slots to be shared. The first 3 queues belong to poolA and share its slots in the proportions 50% (11 slots), 30% (7 slots), and 20% (3 remaining slots to total 21 slots).

  • The other 10 queues belong to pool poolB, which also has 3 hosts, each with 7 slots, for a total of 21 slots to be shared. Each queue has 10% of the pool (3 slots).

The initial slot distribution looks like this:

bqueues
QUEUE_NAME   PRIO STATUS          MAX JL/U JL/P JL/H NJOBS  PEND   RUN  SUSP  
Roma          50   Open:Active      -    -    -    -    15     6    11     0 
Verona        48   Open:Active      -    -    -    -    25    18     7     0 
Genova        47   Open:Active      -    -    -    -   460   455     3     0 
Pisa          44   Open:Active      -    -    -    -   264   261     3     0 
Milano        43   Open:Active      -    -    -    -   262   259     3     0 
Parma         42   Open:Active      -    -    -    -   260   257     3     0 
Bologna       40   Open:Active      -    -    -    -   260   257     3     0 
Sora          40   Open:Active      -    -    -    -   261   258     3     0 
Ferrara       40   Open:Active      -    -    -    -   258   255     3     0 
Napoli        40   Open:Active      -    -    -    -   259   256     3     0 
Livorno       40   Open:Active      -    -    -    -   258   258     0     0 
Palermo       40   Open:Active      -    -    -    -   256   256     0     0 
Venezia        4   Open:Active      -    -    -    -   255   255     0     0

Initially, queues Livorno, Palermo, and Venezia in poolB are not assigned any slots because the 7 higher-priority queues in poolB have used all 21 slots available for allocation.

As jobs run and each queue accumulates used slots, LSF favors queues that have not run jobs yet. As jobs finish in the first 7 queues of poolB, slots are redistributed to the queues that originally had no running jobs (queues Livorno, Palermo, and Venezia). The total slot count across all queues in poolB remains 21.

bqueues
QUEUE_NAME    PRIO STATUS          MAX JL/U JL/P JL/H NJOBS  PEND   RUN  SUSP  
Roma           50   Open:Active      -    -    -    -    15     6     9     0 
Verona         48   Open:Active      -    -    -    -    25    18     7     0 
Genova         47   Open:Active      -    -    -    -   460   455     5     0 
Pisa           44   Open:Active      -    -    -    -   263   261     2     0 
Milano         43   Open:Active      -    -    -    -   261   259     2     0 
Parma          42   Open:Active      -    -    -    -   259   257     2     0 
Bologna        40   Open:Active      -    -    -    -   259   257     2     0 
Sora           40   Open:Active      -    -    -    -   260   258     2     0 
Ferrara        40   Open:Active      -    -    -    -   257   255     2     0 
Napoli         40   Open:Active      -    -    -    -   258   256     2     0 
Livorno        40   Open:Active      -    -    -    -   258   256     2     0 
Palermo        40   Open:Active      -    -    -    -   256   253     3     0 
Venezia         4   Open:Active      -    -    -    -   255   253     2     0
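
The idea of favoring queues that have not yet run jobs can be sketched as follows. This is an illustration only: the function name `hand_out_free_slots` and the rule it uses (give each freed slot to the queue with the least accumulated slot usage, breaking ties by priority order) are assumptions, and the sketch does not try to reproduce the exact numbers in the table above.

```python
def hand_out_free_slots(queues, free_slots, accumulated=None):
    """Give free slots, one at a time, to the least-served queue.

    queues: queue names in priority order (highest first).
    accumulated: slots each queue has already used (default 0).
    Hypothetical helper for illustration; not an LSF API.
    """
    history = accumulated or {}
    used = {q: history.get(q, 0) for q in queues}
    granted = {q: 0 for q in queues}
    for _ in range(free_slots):
        # min() returns the first minimum, so ties go to the
        # queue that appears earlier (higher priority).
        target = min(queues, key=lambda q: used[q])
        used[target] += 1
        granted[target] += 1
    return granted, used


POOL_B = ["Pisa", "Milano", "Parma", "Bologna", "Sora",
          "Ferrara", "Napoli", "Livorno", "Palermo", "Venezia"]

# The first 7 queues have each used 3 slots; the last 3 have used none.
history = {q: 3 for q in POOL_B[:7]}

# Suppose 9 slots become free as jobs finish.
granted, used = hand_out_free_slots(POOL_B, 9, history)
print(granted["Livorno"], granted["Palermo"], granted["Venezia"])  # 3 3 3
```

Once every queue has the same accumulated usage, the next free slot in this sketch goes back to the highest-priority queue, and the cycle repeats.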

The following figure illustrates the round-robin distribution of slot allocations between queues Livorno and Palermo:

How LSF rebalances slot usage

In the following examples, job runtimes are not equal but vary randomly over time.

3 queues in one pool with 50%, 30%, 20% shares

A pool is configured with 3 queues:

  • queue1 50% with short-running jobs

  • queue2 20% with short-running jobs

  • queue3 30% with longer-running jobs

As queue1 and queue2 finish their jobs, the number of jobs in queue3 expands, and as queue1 and queue2 get more work, LSF rebalances the usage.

10 queues sharing 10% each of 50 slots

In this example, queue1 (the curve with the highest peaks) has the longer-running jobs and so has fewer accumulated slots in use over time. LSF accordingly rebalances the load when all queues compete for slots, to maintain the configured 10% usage share.
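
The rebalancing behavior described in this section can be imitated with a toy simulation. The model is an assumption, not LSF's implementation: time advances in ticks, every slot is reassigned each tick, and each slot goes to the active queue whose accumulated usage is lowest relative to its share. The names `run_ticks`, `shares`, and `used` are hypothetical. The shares are those of the 3-queue example (50%, 20%, 30%) with 10 slots.

```python
from fractions import Fraction


def run_ticks(shares, used, active, slots, ticks):
    """Advance the toy scheduler and return per-tick allocations.

    shares: {queue: percent}; used: accumulated slot-ticks (updated
    in place); active: queues that currently have work.
    """
    history = []
    for _ in range(ticks):
        tick_alloc = {q: 0 for q in shares}
        candidates = [q for q in shares if q in active]
        for _ in range(slots if candidates else 0):
            # Pick the queue furthest below its share; exact
            # fractions avoid floating-point ties.
            target = min(candidates,
                         key=lambda q: Fraction(used[q], shares[q]))
            used[target] += 1
            tick_alloc[target] += 1
        history.append(tick_alloc)
    return history


shares = {"queue1": 50, "queue2": 20, "queue3": 30}
used = {q: 0 for q in shares}

# Phase A: only queue3 has work, so it expands to all 10 slots.
phase_a = run_ticks(shares, used, {"queue3"}, slots=10, ticks=10)

# Phase B: queue1 and queue2 get more work; usage is rebalanced.
phase_b = run_ticks(shares, used, set(shares), slots=10, ticks=100)

total = sum(used.values())
for q in shares:
    print(q, f"{used[q] / total:.3f}")
```

In this model queue3 receives no slots at the start of phase B, because its accumulated usage is far above its 30% share; once queue1 and queue2 catch up, the accumulated usage settles close to 50%, 20%, and 30%.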