The lsb.queues file defines batch queues. Numerous controls are available at the queue level to allow cluster administrators to customize site policies.
This file is optional; if no queues are configured, LSF creates a queue named default, with all parameters set to default values.
This file is installed by default in LSB_CONFDIR/cluster_name/configdir.
After making any changes to lsb.queues, run badmin reconfig to reconfigure mbatchd.
Some parameters such as run window and run time limit do not take effect immediately for running jobs unless you run mbatchd restart or sbatchd restart on the job execution host.
Each queue definition begins with the line Begin Queue and ends with the line End Queue. The queue name must be specified; all other parameters are optional.
ADMINISTRATORS=user_name | user_group ...
List of queue administrators. To specify a Windows user account or user group, include the domain name in uppercase letters (DOMAIN_NAME\user_name or DOMAIN_NAME\user_group).
Queue administrators can perform operations on any user’s job in the queue, as well as on the queue itself.
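For example (the user and group names here are placeholders, not defaults):
ADMINISTRATORS = jsmith opsgroup
In this sketch, jsmith and every member of opsgroup can operate on any user's job in the queue and on the queue itself.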
Not defined. You must be a cluster administrator to operate on this queue.
APS_PRIORITY=WEIGHT[[factor, value] [subfactor, value]...] LIMIT[[factor, value] [subfactor, value]...] GRACE_PERIOD[[factor, value] [subfactor, value]...]
Specifies calculation factors for absolute priority scheduling (APS). Pending jobs in the queue are ordered according to the calculated APS value.
If the weight of a subfactor is defined but the weight of its parent factor is not, the parent factor weight is set to 1.
The WEIGHT and LIMIT factors are floating-point values. Specify a value for GRACE_PERIOD in seconds (values), minutes (valuem), or hours (valueh).
The default unit for grace period is hours.
GRACE_PERIOD[[MEM,10h] [JPRIORITY, 10m] [QPRIORITY,10s] [RSRC, 10]]
You cannot specify zero (0) for the WEIGHT, LIMIT, and GRACE_PERIOD of any factor or subfactor.
APS queues cannot configure cross-queue fairshare (FAIRSHARE_QUEUES). The QUEUE_GROUP parameter replaces FAIRSHARE_QUEUES, which is obsolete in LSF 7.0.
Suspended (bstop) jobs and migrated jobs (bmig) are always scheduled before pending jobs. For migrated jobs, LSF keeps the existing job priority information.
If LSB_REQUEUE_TO_BOTTOM and LSB_MIG2PEND are configured in lsf.conf, migrated jobs keep their APS information and must compete with other pending jobs based on the APS value. If you want to reset the APS value, use brequeue, not bmig.
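As an illustrative sketch (reusing the factor names from the GRACE_PERIOD example above; the weights and limit are arbitrary), a queue could favor job priority, add a smaller memory factor, and cap the memory factor's contribution:
APS_PRIORITY=WEIGHT[[JPRIORITY, 10] [MEM, 2]] LIMIT[[MEM, 1000]] GRACE_PERIOD[[MEM, 10h]]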
Not defined
BACKFILL=Y | N
If Y, enables backfill scheduling for the queue.
A possible conflict exists if BACKFILL and PREEMPTION are specified together. If PREEMPT_JOBTYPE = BACKFILL is set in the lsb.params file, a backfill queue can be preemptable. Otherwise a backfill queue cannot be preemptable. If BACKFILL is enabled do not also specify PREEMPTION = PREEMPTABLE.
BACKFILL is required for interruptible backfill queues (INTERRUPTIBLE_BACKFILL=seconds).
When MAX_SLOTS_IN_POOL, SLOT_RESERVE, and BACKFILL are defined for the same queue, jobs in the queue cannot backfill using slots reserved by other jobs in the same queue.
Not defined. No backfilling.
CHKPNT=chkpnt_dir [chkpnt_period]
Enables automatic checkpointing for the queue. All jobs submitted to the queue are checkpointable.
The checkpoint directory is the directory where the checkpoint files are created. Specify an absolute path or a path relative to CWD; do not use environment variables.
Specify the optional checkpoint period in minutes.
Only running members of a chunk job can be checkpointed.
If checkpoint-related configuration is specified in both the queue and an application profile, the application profile setting overrides queue level configuration.
To enable checkpointing of MultiCluster jobs, define a checkpoint directory in both the send-jobs and receive-jobs queues (CHKPNT in lsb.queues), or in an application profile (CHKPNT_DIR, CHKPNT_PERIOD, CHKPNT_INITPERIOD, CHKPNT_METHOD in lsb.applications) of both submission cluster and execution cluster. LSF uses the directory specified in the execution cluster.
To make a MultiCluster job checkpointable, both submission and execution queues must enable checkpointing, and the application profile or queue setting on the execution cluster determines the checkpoint directory. Checkpointing is not supported if a job runs on a leased host.
The file path of the checkpoint directory can contain up to 4000 characters for UNIX and Linux, or up to 255 characters for Windows, including the directory and file name.
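For example, to checkpoint all jobs in the queue to a hypothetical shared directory every 30 minutes:
CHKPNT=/share/checkpoints 30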
Not defined
CHUNK_JOB_SIZE=integer
Chunk jobs only. Enables job chunking and specifies the maximum number of jobs allowed to be dispatched together in a chunk. Specify a positive integer greater than 1.
The ideal candidates for job chunking are jobs that have the same host and resource requirements and typically take 1 to 2 minutes to run.
However, throughput can deteriorate if the chunk job size is too big. Performance may decrease on queues with CHUNK_JOB_SIZE greater than 30. You should evaluate the chunk job size on your own systems for best performance.
With MultiCluster job forwarding model, this parameter does not affect MultiCluster jobs that are forwarded to a remote cluster.
If CHUNK_JOB_DURATION is set in lsb.params, chunk jobs are accepted regardless of the value of CPULIMIT, RUNLIMIT or RUNTIME.
Begin Queue
QUEUE_NAME = chunk
PRIORITY = 50
CHUNK_JOB_SIZE = 4
End Queue
Not defined
COMMITTED_RUN_TIME_FACTOR=number
Used only with fairshare scheduling. Committed run time weighting factor.
In the calculation of a user’s dynamic priority, this factor determines the relative importance of the committed run time in the calculation. If the -W option of bsub is not specified at job submission and a RUNLIMIT has not been set for the queue, the committed run time is not considered.
If undefined, the cluster-wide value from the lsb.params parameter of the same name is used.
Any positive number between 0.0 and 1.0
Not defined.
CORELIMIT=integer
The per-process (hard) core file size limit (in KB) for all of the processes belonging to a job from this queue (see getrlimit(2)).
Unlimited
CPU_FREQUENCY=[float_number][unit]
Specifies the CPU frequency for a queue. All jobs submitted to the queue require the specified CPU frequency. The value is a positive float number with units (GHz, MHz, or KHz). If no unit is specified, the default is GHz.
This value can also be set using the command bsub -freq.
The job submission value overrides the application profile value, and the application profile value overrides the queue value.
Not defined (Nominal CPU frequency is used)
CPULIMIT=[default_limit] maximum_limit
where default_limit and maximum_limit are:
[hour:]minute[/host_name | /host_model]
Maximum normalized CPU time and optionally, the default normalized CPU time allowed for all processes of a job running in this queue. The name of a host or host model specifies the CPU time normalization host to use.
Limits the total CPU time the job can use. This parameter is useful for preventing runaway jobs or jobs that use up too many resources.
When the total CPU time for the whole job has reached the limit, a SIGXCPU signal is sent to all processes belonging to the job. If the job has no signal handler for SIGXCPU, the job is killed immediately. If the SIGXCPU signal is handled, blocked, or ignored by the application, then after the grace period expires, LSF sends SIGINT, SIGTERM, and SIGKILL to the job to kill it.
If a job dynamically spawns processes, the CPU time used by these processes is accumulated over the life of the job.
Processes that exist for fewer than 30 seconds may be ignored.
By default, if a default CPU limit is specified, jobs submitted to the queue without a job-level CPU limit are killed when the default CPU limit is reached.
If you specify only one limit, it is the maximum, or hard, CPU limit. If you specify two limits, the first one is the default, or soft, CPU limit, and the second one is the maximum CPU limit. The number of minutes may be greater than 59. Therefore, three and a half hours can be specified either as 3:30 or 210.
If no host or host model is given with the CPU time, LSF uses the default CPU time normalization host defined at the queue level (DEFAULT_HOST_SPEC in lsb.queues) if it has been configured, otherwise uses the default CPU time normalization host defined at the cluster level (DEFAULT_HOST_SPEC in lsb.params) if it has been configured, otherwise uses the host with the largest CPU factor (the fastest host in the cluster).
On Windows, a job that runs under a CPU time limit may exceed that limit by up to SBD_SLEEP_TIME. This is because sbatchd periodically checks if the limit has been exceeded.
On UNIX systems, the CPU limit can be enforced by the operating system at the process level.
You can define whether the CPU limit is a per-process limit enforced by the OS or a per-job limit enforced by LSF with LSB_JOB_CPULIMIT in lsf.conf.
Jobs submitted to a chunk job queue are not chunked if CPULIMIT is greater than 30 minutes.
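For example, to give jobs a default (soft) CPU limit of 1 hour and a maximum (hard) limit of 3.5 hours:
CPULIMIT=1:00 3:30
Jobs submitted without a job-level CPU limit are killed after 1 hour of normalized CPU time, and no job may use more than 3.5 hours. Because minutes may exceed 59, CPULIMIT=60 210 is equivalent.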
Unlimited
CPU_TIME_FACTOR=number
Used only with fairshare scheduling. CPU time weighting factor.
In the calculation of a user’s dynamic share priority, this factor determines the relative importance of the cumulative CPU time used by a user’s jobs.
If undefined, the cluster-wide value from the lsb.params parameter of the same name is used.
0.7
DATALIMIT=[default_limit] maximum_limit
The per-process data segment size limit (in KB) for all of the processes belonging to a job from this queue (see getrlimit(2)).
By default, if a default data limit is specified, jobs submitted to the queue without a job-level data limit are killed when the default data limit is reached.
If you specify only one limit, it is the maximum, or hard, data limit. If you specify two limits, the first one is the default, or soft, data limit, and the second one is the maximum data limit.
Unlimited
DEFAULT_EXTSCHED=external_scheduler_options
Specifies default external scheduling options for the queue.
-extsched options on the bsub command are merged with DEFAULT_EXTSCHED options, and -extsched options override any conflicting queue-level options set by DEFAULT_EXTSCHED.
Not defined
DEFAULT_HOST_SPEC=host_name | host_model
The default CPU time normalization host for the queue.
The CPU factor of the specified host or host model is used to normalize the CPU time limit of all jobs in the queue, unless the CPU time normalization host is specified at the job level.
Not defined. The queue uses the DEFAULT_HOST_SPEC defined in lsb.params. If DEFAULT_HOST_SPEC is not defined in either file, LSF uses the fastest host in the cluster.
DESCRIPTION=text
Description of the job queue displayed by bqueues -l.
This description should clearly describe the service features of this queue, to help users select the proper queue for each job.
The text can include any characters, including white space. The text can be extended to multiple lines by ending the preceding line with a backslash (\). The maximum length for the text is 512 characters.
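For example, a description continued across two lines with a backslash (the wording is a placeholder):
DESCRIPTION=For normal low-priority jobs that run \
only when hosts are lightly loaded.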
DISPATCH_BY_QUEUE=Y|y|N|n
Set this parameter to increase queue responsiveness. The scheduling decision for the specified queue will be published without waiting for the whole scheduling session to finish. The scheduling decision for the jobs in the specified queue is final and these jobs cannot be preempted within the same scheduling cycle.
Only set this parameter for your highest priority queue (such as for an interactive queue) to ensure that this queue has the highest responsiveness.
N
DISPATCH_ORDER=QUEUE
Defines an ordered cross-queue fairshare set. DISPATCH_ORDER indicates that jobs are dispatched according to the order of queue priorities first, then user fairshare priority.
By default, a user has the same priority across the master and slave queues. If the same user submits several jobs to these queues, user priority is calculated by taking into account all the jobs the user has submitted across the master-slave set.
If DISPATCH_ORDER=QUEUE is set in the master queue, jobs are dispatched according to queue priorities first, then user priority. Jobs from users with lower fairshare priorities who have pending jobs in higher priority queues are dispatched before jobs in lower priority queues. This avoids having users with higher fairshare priority getting jobs dispatched from low-priority queues.
Jobs in queues having the same priority are dispatched according to user priority.
Queues that are not part of the cross-queue fairshare set can have any priority; their priorities are not required to fall outside the priority range of the cross-queue fairshare queues.
Not defined
DISPATCH_WINDOW=time_window ...
The time windows in which jobs from this queue are dispatched. Once dispatched, jobs are no longer affected by the dispatch window.
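For example, to dispatch jobs from this queue only during off-peak hours (the window boundaries are illustrative):
DISPATCH_WINDOW = 20:00-8:30
Jobs are dispatched only between 8:00 PM and 8:30 AM; jobs already running when the window closes continue to run.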
Not defined. Dispatch window is always open.
ENABLE_HIST_RUN_TIME=y | Y | n | N
Used only with fairshare scheduling. If set, enables the use of historical run time in the calculation of fairshare scheduling priority.
If undefined, the cluster-wide value from the lsb.params parameter of the same name is used.
Not defined.
EXCLUSIVE=Y | N | CU[cu_type]
If Y, specifies an exclusive queue.
If CU, CU[], or CU[cu_type], specifies an exclusive queue as well as a queue exclusive to compute units of type cu_type (as defined in lsb.params). If no type is specified, the default compute unit type is used.
Jobs submitted to an exclusive queue with bsub -x are only dispatched to a host that has no other LSF jobs running. Jobs submitted to a compute unit exclusive queue with bsub -R "cu[excl]" only run on a compute unit that has no other jobs running.
For hosts shared under the MultiCluster resource leasing model, jobs are not dispatched to a host that has LSF jobs running, even if the jobs are from another cluster.
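For example, assuming a compute unit type named rack is defined in lsb.params:
EXCLUSIVE = CU[rack]
Jobs submitted to this queue with bsub -R "cu[excl]" then run only on a rack that has no other jobs running.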
N
FAIRSHARE=USER_SHARES[[user, number_shares] ...]
Enables queue-level user-based fairshare and specifies share assignments. Only users with share assignments can submit jobs to the queue.
Do not configure hosts in a cluster to use fairshare at both queue and host levels. However, you can configure user-based fairshare and queue-based fairshare together.
Not defined. No fairshare.
FAIRSHARE_ADJUSTMENT_FACTOR=number
Used only with fairshare scheduling. Fairshare adjustment plugin weighting factor.
In the calculation of a user’s dynamic share priority, this factor determines the relative importance of the user-defined adjustment made in the fairshare plugin (libfairshareadjust.*).
A positive float number both enables the fairshare plugin and acts as a weighting factor.
If undefined, the cluster-wide value from the lsb.params parameter of the same name is used.
Not defined.
FAIRSHARE_QUEUES=queue_name [queue_name ...]
Defines cross-queue fairshare. This parameter is obsolete in LSF 7.0; the QUEUE_GROUP parameter replaces it (see APS_PRIORITY).
Not defined
FILELIMIT=integer
The per-process (hard) file size limit (in KB) for all of the processes belonging to a job from this queue (see getrlimit(2)).
Unlimited
HIST_HOURS=hours
Used only with fairshare scheduling. Determines a rate of decay for cumulative CPU time, run time, and historical run time.
To calculate dynamic user priority, LSF scales the actual CPU time and the run time using a decay factor, so that 1 hour of recently-used time is equivalent to 0.1 hours after the specified number of hours has elapsed.
To calculate dynamic user priority with decayed run time and historical run time, LSF scales the accumulated run time of finished jobs and run time of running jobs using the same decay factor, so that 1 hour of recently-used time is equivalent to 0.1 hours after the specified number of hours has elapsed.
When HIST_HOURS=0, the CPU time and run time accumulated by running jobs are not decayed.
If undefined, the cluster-wide value from the lsb.params parameter of the same name is used.
Not defined.
HJOB_LIMIT=integer
Per-host job slot limit.
Maximum number of job slots that this queue can use on any host. This limit is configured per host, regardless of the number of processors it may have.
Begin Queue
...
HJOB_LIMIT = 1
HOSTS=hostA hostB hostC
...
End Queue
Unlimited
HOST_POST_EXEC=command
Enables host-based post-execution processing at the queue level. The HOST_POST_EXEC command runs on all execution hosts after the job finishes. If job-based post-execution (POST_EXEC) is defined at the queue, application, or job level, the HOST_POST_EXEC command runs after POST_EXEC of any level.
The supported command rule is the same as the existing POST_EXEC for the queue section. See the POST_EXEC topic for details.
The host-based post-execution command cannot be executed on Windows platforms. This parameter cannot be used to configure job-based post-execution processing.
Not defined.
HOST_PRE_EXEC=command
Enables host-based pre-execution processing at the queue level. The HOST_PRE_EXEC command runs on all execution hosts before the job starts. If job-based pre-execution (PRE_EXEC) is defined at the queue, application, or job level, the HOST_PRE_EXEC command runs before PRE_EXEC of any level.
The supported command rule is the same as the existing PRE_EXEC for the queue section. See the PRE_EXEC topic for details.
The host-based pre-execution command cannot be executed on Windows platforms. This parameter cannot be used to configure job-based pre-execution processing.
Not defined.
HOSTLIMIT_PER_JOB=integer
Per-job host limit.
The maximum number of hosts that a job in this queue can use. LSF verifies the host limit during the allocation phase of scheduling. If the number of hosts requested for a parallel job exceeds this limit and LSF cannot satisfy the minimum number of requested slots, the parallel job pends. However, this parameter does not prevent a resumed parallel job from resuming, even if the job's host allocation exceeds the per-job host limit.
Unlimited
HOSTS=host_list | none
A space-separated list of hosts on which jobs from this queue can be run.
If compute units, host groups, or host partitions are included in the list, the job can run on any host in the unit, group, or partition. All the members of the host list should either belong to a single host partition or not belong to any host partition. Otherwise, job scheduling may be affected.
Some items can be followed by a plus sign (+) and a positive number to indicate the preference for dispatching a job to that host. A higher number indicates a higher preference. If a host preference is not given, it is assumed to be 0. If there are multiple candidate hosts, LSF dispatches the job to the host with the highest preference; hosts at the same level of preference are ordered by load.
If compute units, host groups, or host partitions are assigned a preference, each host in the unit, group, or partition has the same preference.
Use the keyword others to include all hosts not explicitly listed.
Use the keyword all to include all hosts not explicitly excluded.
Use the keyword all@cluster_name hostgroup_name or allremote hostgroup_name to include lease-in hosts.
Use the not operator (~) to exclude hosts from the all specification in the queue. This is useful if you have a large cluster but only want to exclude a few hosts from the queue definition.
The not operator can only be used with the all keyword. It is not valid with the keywords others and none.
The not operator (~) can be used to exclude host groups.
For parallel jobs, specify first execution host candidates when you want to ensure that a host has the required resources or runtime environment to handle processes that run on the first execution host.
To specify one or more hosts, host groups, or compute units as first execution host candidates, add the exclamation point (!) symbol after the name.
With MultiCluster resource leasing model, use the format host_name@cluster_name to specify a borrowed host. LSF does not validate the names of remote hosts. The keyword others indicates all local hosts not explicitly listed. The keyword all indicates all local hosts not explicitly excluded. Use the keyword allremote to specify all hosts borrowed from all remote clusters. Use all@cluster_name to specify the group of all hosts borrowed from one remote cluster. You cannot specify a host group or partition that includes remote resources, unless it uses the keyword allremote to include all remote hosts. You cannot specify a compute unit that includes remote resources.
With MultiCluster resource leasing model, the not operator (~) can be used to exclude local hosts or host groups. You cannot use the not operator (~) with remote hosts.
Host preferences specified by bsub -m combine intelligently with the queue specification and advance reservation hosts. The jobs run on the hosts that are both specified at job submission and belong to the queue or have advance reservation.
HOSTS=hostA+1 hostB hostC+1 hostD+3
This example defines three levels of preferences: run jobs on hostD as much as possible, otherwise run on either hostA or hostC if possible, otherwise run on hostB. Jobs should not run on hostB unless all other hosts are too busy to accept more jobs.
HOSTS=hostD+1 others
Run jobs on hostD as much as possible, otherwise run jobs on the least-loaded host available.
With MultiCluster resource leasing model, this queue does not use borrowed hosts.
HOSTS=all ~hostA
Run jobs on all hosts in the cluster, except for hostA.
With MultiCluster resource leasing model, this queue does not use borrowed hosts.
HOSTS=Group1 ~hostA hostB hostC
Run jobs on hostB, hostC, and all hosts in Group1 except for hostA.
With MultiCluster resource leasing model, this queue uses borrowed hosts if Group1 uses the keyword allremote.
HOSTS=hostA! hostB+ hostC hostgroup1!
Runs parallel jobs using either hostA or a host defined in hostgroup1 as the first execution host. If the first execution host cannot run the entire job due to resource requirements, runs the rest of the job on hostB. If hostB is too busy to accept the job, or if hostB does not have enough resources to run the entire job, runs the rest of the job on hostC.
HOSTS=computeunit1! hostB hostC
Runs parallel jobs using a host in computeunit1 as the first execution host. If the first execution host cannot run the entire job due to resource requirements, runs the rest of the job on other hosts in computeunit1 followed by hostB and finally hostC.
HOSTS=hostgroup1! computeunitA computeunitB computeunitC
Runs parallel jobs using a host in hostgroup1 as the first execution host. If additional hosts are required, runs the rest of the job on other hosts in the same compute unit as the first execution host, followed by hosts in the remaining compute units in the order they are defined in the lsb.hosts ComputeUnit section.
all (the queue can use all hosts in the cluster, and every host has equal preference)
With MultiCluster resource leasing model, this queue can use all local hosts, but no borrowed hosts.
IGNORE_DEADLINE=Y
If Y, disables deadline constraint scheduling (starts all jobs regardless of deadline constraints).
IMPT_JOBBKLG=integer | infinit
MultiCluster job forwarding model only.
Specifies the MultiCluster pending job limit for a receive-jobs queue. This represents the maximum number of MultiCluster jobs that can be pending in the queue; once the limit has been reached, the queue stops accepting jobs from remote clusters.
Use the keyword infinit to make the queue accept an unlimited number of pending MultiCluster jobs.
50
IMPT_SLOTBKLG=integer | infinit
MultiCluster job forwarding model only.
Specifies the MultiCluster pending job slot limit for a receive-jobs queue. In the submission cluster, if the total of requested job slots and the number of imported pending slots in the receiving queue is greater than IMPT_SLOTBKLG, the queue stops accepting jobs from remote clusters, and the job is not forwarded to the receiving queue.
Specify an integer between 0 and 2147483646 for the number of slots.
Use the keyword infinit to make the queue accept an unlimited number of pending MultiCluster job slots.
Set IMPT_SLOTBKLG to 0 to forbid any job being forwarded to the receiving queue.
infinit (the queue accepts an unlimited number of pending MultiCluster job slots)
INTERACTIVE=YES | NO | ONLY
YES causes the queue to accept both interactive and non-interactive batch jobs, NO causes the queue to reject interactive batch jobs, and ONLY causes the queue to accept interactive batch jobs and reject non-interactive batch jobs.
Interactive batch jobs are submitted via bsub -I.
YES. The queue accepts both interactive and non-interactive jobs.
INTERRUPTIBLE_BACKFILL=seconds
Configures interruptible backfill scheduling policy, which allows reserved job slots to be used by low priority small jobs that are terminated when the higher priority large jobs are about to start.
There can be only one interruptible backfill queue. It should be the lowest priority queue in the cluster.
Specify the minimum number of seconds for the job to be considered for backfilling. This minimal time slice depends on the specific job properties; it must be longer than at least one useful iteration of the job. Multiple queues may be created if a site has jobs of distinctively different classes.
The queue RUNLIMIT corresponds to a maximum time slice for backfill, and should be configured so that the wait period for the new jobs submitted to the queue is acceptable to users. 10 minutes of runtime is a common value.
You should configure REQUEUE_EXIT_VALUES for interruptible backfill queues.
BACKFILL and RUNLIMIT must be configured in the queue. The queue is disabled if BACKFILL and RUNLIMIT are not configured.
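A sketch of an interruptible backfill queue that follows the guidelines above (the queue name, priority, requeue exit value, and 60-second time slice are placeholders; BACKFILL and RUNLIMIT are required):
Begin Queue
QUEUE_NAME = ib
PRIORITY = 1
BACKFILL = Y
RUNLIMIT = 10
INTERRUPTIBLE_BACKFILL = 60
REQUEUE_EXIT_VALUES = 125
End Queue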
Not defined. No interruptible backfilling.
JOB_ACCEPT_INTERVAL=integer
The number you specify is multiplied by the value of lsb.params MBD_SLEEP_TIME (60 seconds by default). The result of the calculation is the number of seconds to wait after dispatching a job to a host, before dispatching a second job to the same host.
If 0 (zero), a host may accept more than one job in each dispatch turn. By default, there is no limit to the total number of jobs that can run on a host, so if this parameter is set to 0, a very large number of jobs might be dispatched to a host all at once. This can overload your system to the point that it is unable to create any more processes. It is not recommended to set this parameter to 0.
JOB_ACCEPT_INTERVAL set at the queue level (lsb.queues) overrides JOB_ACCEPT_INTERVAL set at the cluster level (lsb.params).
The parameter JOB_ACCEPT_INTERVAL only applies when there are running jobs on a host. A host running a short job which finishes before JOB_ACCEPT_INTERVAL has elapsed is free to accept a new job without waiting.
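For example, with the default MBD_SLEEP_TIME of 60 seconds:
JOB_ACCEPT_INTERVAL = 2
After dispatching a job to a host, LSF waits 2 x 60 = 120 seconds before dispatching another job to the same host.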
Not defined. The queue uses JOB_ACCEPT_INTERVAL defined in lsb.params, which has a default value of 1.
JOB_ACTION_WARNING_TIME=[hour:]minute
Specifies the amount of time before a job control action occurs that a job warning action is to be taken. For example, 2 minutes before the job reaches runtime limit or termination deadline, or the queue's run window is closed, an URG signal is sent to the job.
Job action warning time is not normalized.
A job action warning time must be specified with a job warning action in order for job warning to take effect.
The warning time specified by the bsub -wt option overrides JOB_ACTION_WARNING_TIME in the queue. JOB_ACTION_WARNING_TIME is used as the default when no command line option is specified.
JOB_ACTION_WARNING_TIME=2
Not defined
JOB_CONTROLS=SUSPEND[signal | command | CHKPNT] RESUME[signal | command] TERMINATE[signal | command | CHKPNT]
Do not quote the command line inside an action definition. Do not specify a signal followed by an action that triggers the same signal. For example, do not specify JOB_CONTROLS=TERMINATE[bkill] or JOB_CONTROLS=TERMINATE[brequeue]. This causes a deadlock between the signal and the action.
On UNIX, by default, SUSPEND sends SIGTSTP for parallel or interactive jobs and SIGSTOP for other jobs. RESUME sends SIGCONT. TERMINATE sends SIGINT, SIGTERM and SIGKILL in that order.
On Windows, actions equivalent to the UNIX signals have been implemented to do the default job control actions. Job control messages replace the SIGINT and SIGTERM signals, but only customized applications are able to process them. Termination is implemented by the TerminateProcess( ) system call.
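A minimal sketch that replaces the default suspend and terminate signals with single signals (the signal choices are illustrative; remember not to pair an action with a command that triggers the same signal):
JOB_CONTROLS = SUSPEND[SIGTSTP] TERMINATE[SIGTERM]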
JOB_IDLE=number
Specifies a threshold for idle job exception handling. The value should be a number between 0.0 and 1.0 representing CPU time/runtime. If the job idle factor is less than the specified threshold, LSF invokes LSF_SERVERDIR/eadmin to trigger the action for a job idle exception.
The minimum job run time before mbatchd reports that the job is idle is defined as DETECT_IDLE_JOB_AFTER in lsb.params.
Any positive number between 0.0 and 1.0
JOB_IDLE=0.10
A job idle exception is triggered for jobs with an idle value (CPU time/runtime) less than 0.10.
Not defined. No job idle exceptions are detected.
JOB_OVERRUN=run_time
Specifies a threshold for job overrun exception handling. If a job runs longer than the specified run time, LSF invokes LSF_SERVERDIR/eadmin to trigger the action for a job overrun exception.
JOB_OVERRUN=5
A job overrun exception is triggered for jobs running longer than 5 minutes.
Not defined. No job overrun exceptions are detected.
JOB_STARTER=starter [starter] ["%USRCMD"] [starter]
Creates a specific environment for submitted jobs prior to execution.
starter is any executable that can be used to start the job (i.e., can accept the job as an input argument). Optionally, additional strings can be specified.
By default, the user commands run after the job starter. A special string, %USRCMD, can be used to represent the position of the user’s job in the job starter command line. The %USRCMD string and any additional commands must be enclosed in quotation marks (" ").
If your job starter script runs on a Windows execution host and includes symbols (like & or |), you can use the JOB_STARTER_EXTEND=preservestarter parameter in lsf.conf and set JOB_STARTER=preservestarter in lsb.queues. A customized userstarter can also be used.
JOB_STARTER=csh -c "%USRCMD;sleep 10"
In this queue, submitting a job with:
% bsub myjob arguments
causes LSF to run the job as:
csh -c "myjob arguments;sleep 10"
Not defined. No job starter is used.
JOB_UNDERRUN=run_time
Specifies a threshold for job underrun exception handling. If a job exits before the specified number of minutes, LSF invokes LSF_SERVERDIR/eadmin to trigger the action for a job underrun exception.
JOB_UNDERRUN=2
A job underrun exception is triggered for jobs running less than 2 minutes.
Not defined. No job underrun exceptions are detected.
JOB_WARNING_ACTION=signal
Specifies the job action to be taken before a job control action occurs. For example, 2 minutes before the job reaches runtime limit or termination deadline, or the queue's run window is closed, an URG signal is sent to the job.
A job warning action must be specified with a job action warning time in order for job warning to take effect.
If JOB_WARNING_ACTION is specified, LSF sends the warning action to the job before the actual control action is taken. This allows the job time to save its result before being terminated by the job control action.
The warning action specified by the bsub -wa option overrides JOB_WARNING_ACTION in the queue. JOB_WARNING_ACTION is used as the default when no command line option is specified.
JOB_WARNING_ACTION=URG
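Because a warning action takes effect only when paired with a warning time, a complete queue-level configuration looks like:
JOB_WARNING_ACTION = URG
JOB_ACTION_WARNING_TIME = 2
LSF sends the URG signal to each job 2 minutes before a job control action occurs.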
Not defined
load_index=loadSched[/loadStop]
Specify io, it, ls, mem, pg, r15s, r1m, r15m, swp, tmp, ut, or a non-shared custom external load index. Specify multiple lines to configure thresholds for multiple load indices.
Specify io, it, ls, mem, pg, r15s, r1m, r15m, swp, tmp, ut, or a non-shared custom external load index as a column. Specify multiple columns to configure thresholds for multiple load indices.
Scheduling and suspending thresholds for the specified dynamic load index.
The loadSched condition must be satisfied before a job is dispatched to the host. If a RESUME_COND is not specified, the loadSched condition must also be satisfied before a suspended job can be resumed.
If the loadStop condition is satisfied, a job on the host is suspended.
The loadSched and loadStop thresholds permit the specification of conditions using simple AND/OR logic. Any load index that does not have a configured threshold has no effect on job scheduling.
LSF does not suspend a job if the job is the only batch job running on the host and the machine is interactively idle (it>0).
The r15s, r1m, and r15m CPU run queue length conditions are compared to the effective queue length as reported by lsload -E, which is normalized for multiprocessor hosts. Thresholds for these parameters should be set at appropriate levels for single processor hosts.
MEM=100/10
SWAP=200/30
With these thresholds, a job is dispatched to a host only if mem>=100 && swap>=200, and a job on the host is suspended when mem < 10 || swap < 30.
Not defined
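The thresholds from the example above could appear in a queue definition as follows (a sketch in the vertical format; the values are illustrative only):

```
Begin Queue
QUEUE_NAME = normal
MEM        = 100/10
SWAP       = 200/30
End Queue
```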
LOCAL_MAX_PREEXEC_RETRY=integer
The maximum number of times to attempt the pre-execution command of a job on the local cluster.
0 < LOCAL_MAX_PREEXEC_RETRY < INFINIT_INT
INFINIT_INT is defined in lsf.h.
Not defined. The number of pre-execution retry times is unlimited.
MANDATORY_EXTSCHED=external_scheduler_options
Specifies mandatory external scheduling options for the queue.
-extsched options on the bsub command are merged with MANDATORY_EXTSCHED options, and MANDATORY_EXTSCHED options override any conflicting job-level options set by -extsched.
Not defined
MAX_JOB_PREEMPT=integer
The maximum number of times a job can be preempted. Applies to queue-based preemption only.
0 < MAX_JOB_PREEMPT < INFINIT_INT
INFINIT_INT is defined in lsf.h.
Not defined. The number of preemption times is unlimited.
MAX_JOB_REQUEUE=integer
The maximum number of times to requeue a job automatically.
0 < MAX_JOB_REQUEUE < INFINIT_INT
INFINIT_INT is defined in lsf.h.
Not defined. The number of requeue times is unlimited.
MAX_PREEXEC_RETRY=integer
Use REMOTE_MAX_PREEXEC_RETRY instead. This parameter is maintained for backwards compatibility.
MultiCluster job forwarding model only. The maximum number of times to attempt the pre-execution command of a job from a remote cluster.
If the job's pre-execution command fails all attempts, the job is returned to the submission cluster.
0 < MAX_PREEXEC_RETRY < INFINIT_INT
INFINIT_INT is defined in lsf.h.
5
MAX_PROTOCOL_INSTANCES=integer
For LSF IBM Parallel Environment (PE) integration. Specify the number of parallel communication paths (windows) available to the protocol on each network. If the number of windows specified for the job (with the instances option of bsub -network, or with the NETWORK_REQ parameter in lsb.queues or lsb.applications) is greater than the specified maximum value, LSF rejects the job.
Specify MAX_PROTOCOL_INSTANCES in a queue (lsb.queues) or cluster-wide in lsb.params. The value specified in a queue overrides the value specified in lsb.params.
LSF_PE_NETWORK_NUM must be defined to a non-zero value in lsf.conf for MAX_PROTOCOL_INSTANCES to take effect and for LSF to run PE jobs. If LSF_PE_NETWORK_NUM is not defined or is set to 0, the value of MAX_PROTOCOL_INSTANCES is ignored with a warning message.
For best performance, set MAX_PROTOCOL_INSTANCES so that the communication subsystem uses every available adapter before it reuses any of the adapters.
No default value
MAX_RSCHED_TIME=integer | infinit
MultiCluster job forwarding model only. Determines how long a MultiCluster job stays pending in the execution cluster before being returned to the submission cluster. The remote timeout limit in minutes is: MAX_RSCHED_TIME * MBD_SLEEP_TIME=timeout
Specify infinit to disable remote timeout (jobs always get dispatched in the correct FCFS order because MultiCluster jobs never get rescheduled, but MultiCluster jobs can be pending in the receive-jobs queue forever instead of being rescheduled to a better queue).
Applies only to the queue in the submission cluster; this parameter is ignored by the receiving queue.
The remote timeout limit never affects advance reservation jobs: jobs that use an advance reservation always behave as if remote timeout is disabled.
20 (20 minutes by default)
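Since the remote timeout is MAX_RSCHED_TIME * MBD_SLEEP_TIME, a send-jobs queue that lets forwarded jobs pend remotely for about one hour (assuming the default MBD_SLEEP_TIME of 60 seconds) could be sketched as follows; the queue name, remote cluster, and values are illustrative only:

```
Begin Queue
QUEUE_NAME      = mc_send
SNDJOBS_TO      = normal@cluster2
MAX_RSCHED_TIME = 60
End Queue
```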
MAX_SLOTS_IN_POOL=integer
Queue-based fairshare only. Maximum number of job slots available in the slot pool that the queue belongs to, for queue-based fairshare.
Defined in the first queue of the slot pool. Definitions in subsequent queues have no effect.
When defined together with other slot limits (QJOB_LIMIT, HJOB_LIMIT or UJOB_LIMIT in lsb.queues or queue limits in lsb.resources) the lowest limit defined applies.
When MAX_SLOTS_IN_POOL, SLOT_RESERVE, and BACKFILL are defined for the same queue, jobs in the queue cannot backfill using slots reserved by other jobs in the same queue.
MAX_SLOTS_IN_POOL can be any number from 0 to INFINIT_INT, where INFINIT_INT is defined in lsf.h.
Not defined
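As a sketch, a two-queue slot pool might look like the following; only the first queue's MAX_SLOTS_IN_POOL takes effect. SLOT_POOL and SLOT_SHARE are the related queue-based fairshare parameters; all names and values here are illustrative:

```
Begin Queue
QUEUE_NAME        = short
PRIORITY          = 50
SLOT_POOL         = poolA
SLOT_SHARE        = 60
MAX_SLOTS_IN_POOL = 100
End Queue

Begin Queue
QUEUE_NAME = long
PRIORITY   = 40
SLOT_POOL  = poolA
SLOT_SHARE = 40
End Queue
```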
MAX_TOTAL_TIME_PREEMPT=integer
The accumulated preemption time in minutes after which a job cannot be preempted again, where minutes is wall-clock time, not normalized time.
Setting the parameter of the same name in lsb.applications overrides this parameter; setting this parameter overrides the parameter of the same name in lsb.params.
Any positive integer greater than or equal to one (1)
Unlimited
MEMLIMIT=[default_limit] maximum_limit
The per-process (hard) process resident set size limit (in KB) for all of the processes belonging to a job from this queue (see getrlimit(2)).
Sets the maximum amount of physical memory (resident set size, RSS) that may be allocated to a process.
By default, if a default memory limit is specified, jobs submitted to the queue without a job-level memory limit are killed when the default memory limit is reached.
If you specify only one limit, it is the maximum, or hard, memory limit. If you specify two limits, the first one is the default, or soft, memory limit, and the second one is the maximum memory limit.
OS memory limit enforcement is the default MEMLIMIT behavior and does not require further configuration. OS enforcement usually allows the process to eventually run to completion. LSF passes MEMLIMIT to the OS that uses it as a guide for the system scheduler and memory allocator. The system may allocate more memory to a process if there is a surplus. When memory is low, the system takes memory from and lowers the scheduling priority (re-nice) of a process that has exceeded its declared MEMLIMIT. Only available on systems that support RLIMIT_RSS for setrlimit().
To enable LSF memory limit enforcement, set LSB_MEMLIMIT_ENFORCE in lsf.conf to y. LSF memory limit enforcement explicitly sends a signal to kill a running process once it has allocated memory past MEMLIMIT.
You can also enable LSF memory limit enforcement by setting LSB_JOB_MEMLIMIT in lsf.conf to y. The difference between LSB_JOB_MEMLIMIT set to y and LSB_MEMLIMIT_ENFORCE set to y is that with LSB_JOB_MEMLIMIT, only the per-job memory limit enforced by LSF is enabled. The per-process memory limit enforced by the OS is disabled. With LSB_MEMLIMIT_ENFORCE set to y, both the per-job memory limit enforced by LSF and the per-process memory limit enforced by the OS are enabled.
Available for all systems on which LSF collects total memory usage.
Begin Queue
QUEUE_NAME = default
DESCRIPTION = Queue with memory limit of 5000 kbytes
MEMLIMIT = 5000
End Queue
Unlimited
MIG=minutes
Enables automatic job migration and specifies the migration threshold for checkpointable or rerunnable jobs, in minutes.
LSF automatically migrates jobs that have been in the SSUSP state for more than the specified number of minutes. Specify a value of 0 to migrate jobs immediately upon suspension. The migration threshold applies to all jobs running on the host.
Job-level command line migration threshold overrides threshold configuration in application profile and queue. Application profile configuration overrides queue level configuration.
When a host migration threshold is specified and is lower than the value for the job, the queue, or the application, the host value is used.
Members of a chunk job can be migrated. Chunk jobs in WAIT state are removed from the job chunk and put into PEND state.
Does not affect MultiCluster jobs that are forwarded to a remote cluster.
Not defined. LSF does not migrate checkpointable or rerunnable jobs automatically.
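For example, a low-priority queue that migrates checkpointable or rerunnable jobs after they have been suspended for 30 minutes might be sketched as:

```
Begin Queue
QUEUE_NAME = idle
MIG        = 30
End Queue
```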
NETWORK_REQ=network_res_req
For LSF IBM Parallel Environment (PE) integration. Specifies the network resource requirements for a PE job.
If any network resource requirement is specified in the job, queue, or application profile, the job is treated as a PE job. PE jobs can only run on hosts where IBM PE pnsd daemon is running.
The network resource requirement string network_res_req has the same syntax as the bsub -network option.
The -network bsub option overrides the value of NETWORK_REQ defined in lsb.queues or lsb.applications. The value of NETWORK_REQ defined in lsb.applications overrides queue-level NETWORK_REQ defined in lsb.queues.
sn_single: When used for switch adapters, specifies that all windows are on a single network.
sn_all: Specifies that one or more windows are on each network, and that striped communication should be used over all available switch networks. The networks specified must be accessible by all hosts selected to run the PE job. See the Parallel Environment Runtime Edition for AIX: Operation and Use guide (SC23-6781-05) for more information about submitting jobs that use striping.
If mode is IP and type is specified as sn_all or sn_single, the job will only run on InfiniBand (IB) adapters (IPoIB). If mode is IP and type is not specified, the job will only run on Ethernet adapters (IPoEth). For IPoEth jobs, LSF ensures the job is running on hosts where pnsd is installed and running. For IPoIB jobs, LSF ensures the job is running on hosts where pnsd is installed and running, and that IB networks are up. Because IP jobs do not consume network windows, LSF does not check if all network windows are used up or the network is already occupied by a dedicated PE job.
Equivalent to the PE MP_EUIDEVICE environment variable and the -euidevice PE flag. See the Parallel Environment Runtime Edition for AIX: Operation and Use guide (SC23-6781-05) for more information. Only sn_all and sn_single are supported by LSF. The other types supported by PE are not supported for LSF jobs.
mpi: The application makes only MPI calls. This value applies to any MPI job regardless of the library that it was compiled with (PE MPI, MPICH2).
pami: The application makes only PAMI calls.
lapi: The application makes only LAPI calls.
shmem: The application makes only OpenSHMEM calls.
The application makes only calls from a parallel API that you define. For example: protocol=myAPI or protocol=charm.
The default value is mpi.
LSF also supports an optional protocol_number (for example, mpi(2)), which specifies the number of contexts (endpoints) per parallel API instance. The number must be a power of 2, but no greater than 128 (1, 2, 4, 8, 16, 32, 64, 128). LSF passes the communication protocols to PE without any change and reserves network windows for each protocol.
When you specify multiple parallel API protocols, you cannot make calls to both LAPI and PAMI (lapi, pami) or LAPI and OpenSHMEM (lapi, shmem) in the same application. Protocols can be specified in any order.
See the MP_MSG_API and MP_ENDPOINTS environment variables and the -msg_api and -endpoints PE flags in the Parallel Environment Runtime Edition for AIX: Operation and Use guide (SC23-6781-05) for more information about the communication protocols that are supported by IBM Parallel Edition.
The network communication system mode used by the specified communication protocol: US (User Space) or IP (Internet Protocol). A US job can only run with adapters that support user space communications, such as the IB adapter. IP jobs can run with either Ethernet adapters or IB adapters. When IP mode is specified, the instance number cannot be specified, and network usage must be unspecified or shared.
Each instance in US mode requested by a task running on switch adapters requires an adapter window. For example, if a task requests both the MPI and LAPI protocols such that both protocol instances require US mode, two adapter windows are used.
The default value is US.
Specifies whether the adapter can be shared with tasks of other job steps: dedicated or shared. Multiple tasks of the same job can share one network even if usage is dedicated.
The default usage is shared.
The number of parallel communication paths (windows) per task made available to the protocol on each network. The number actually used depends on the implementation of the protocol subsystem.
The default value is 1.
If the specified value is greater than MAX_PROTOCOL_INSTANCES in lsb.params or lsb.queues, LSF rejects the job.
LSF_PE_NETWORK_NUM must be defined to a non-zero value in lsf.conf for NETWORK_REQ to take effect. If LSF_PE_NETWORK_NUM is not defined or is set to 0, NETWORK_REQ is ignored with a warning message.
The following network resource requirement string specifies that the requirements for an sn_all job (one or more windows are on each network, and striped communication should be used over all available switch networks). The PE job uses MPI API calls (protocol), runs in user-space network communication system mode, and requires 1 parallel communication path (window) per task.
NETWORK_REQ = "protocol=mpi:mode=us:instance=1:type=sn_all"
No default value, but if you specify no value (NETWORK_REQ=""), the job uses the following: protocol=mpi:mode=US:usage=shared:instance=1 in the queue.
NEW_JOB_SCHED_DELAY=seconds
The number of seconds that a new job waits before being scheduled. A value of zero (0) means the job is scheduled without any delay; the scheduler still periodically fetches jobs from mbatchd, and once it gets them, it schedules them without delay. This can speed up job scheduling slightly, but it also generates some communication overhead, so you should only set it to 0 for high-priority, urgent, or interactive queues with small workloads.
If NEW_JOB_SCHED_DELAY is set to a non-zero value, scheduler will periodically fetch new jobs from mbatchd, after which it sets job scheduling time to job submission time + NEW_JOB_SCHED_DELAY.
0 seconds
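A sketch of a high-priority interactive queue that schedules new jobs without delay (the values are illustrative):

```
Begin Queue
QUEUE_NAME          = interactive
PRIORITY            = 80
NEW_JOB_SCHED_DELAY = 0
End Queue
```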
NICE=integer
Adjusts the UNIX scheduling priority at which jobs from this queue execute.
The default value of 0 (zero) maintains the default scheduling priority for UNIX interactive jobs. This value adjusts the run-time priorities for batch jobs on a queue-by-queue basis, to control their effect on other batch or interactive jobs. See the nice(1) manual page for more details.
LSF on Windows does not support HIGH or REAL-TIME priority classes.
This value is overwritten by the NICE setting in lsb.applications, if defined.
0 (zero)
NO_PREEMPT_INTERVAL=minutes
Prevents preemption of jobs for the specified number of minutes of uninterrupted run time, where minutes is wall-clock time, not normalized time. NO_PREEMPT_INTERVAL=0 allows immediate preemption of jobs as soon as they start or resume running.
Setting the parameter of the same name in lsb.applications overrides this parameter; setting this parameter overrides the parameter of the same name in lsb.params.
0
PJOB_LIMIT=float
Per-processor job slot limit for the queue.
Maximum number of job slots that this queue can use on any processor. This limit is configured per processor, so that multiprocessor hosts automatically run more jobs.
Unlimited
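For example, with PJOB_LIMIT=2.0 the queue can use at most 2 slots per processor, so a 4-processor host could run up to 8 jobs from this queue (a sketch; the values are illustrative):

```
Begin Queue
QUEUE_NAME = normal
PJOB_LIMIT = 2.0
End Queue
```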
POST_EXEC=command
Enables post-execution processing at the queue level. The POST_EXEC command runs on the execution host after the job finishes. Post-execution commands can be configured at the application and queue levels. Application-level post-execution commands run before queue-level post-execution commands.
The POST_EXEC command uses the same environment variable values as the job, and, by default, runs under the user account of the user who submits the job. To run post-execution commands under a different user account (such as root for privileged operations), configure the parameter LSB_PRE_POST_EXEC_USER in lsf.sudoers.
When a job exits with one of the queue’s REQUEUE_EXIT_VALUES, LSF requeues the job and sets the environment variable LSB_JOBPEND. The post-execution command runs after the requeued job finishes.
When the post-execution command is run, the environment variable LSB_JOBEXIT_STAT is set to the exit status of the job. If the execution environment for the job cannot be set up, LSB_JOBEXIT_STAT is set to 0 (zero).
The command path can contain up to 4094 characters for UNIX and Linux, or up to 255 characters for Windows, including the directory, file name, and expanded values for %J (job_ID) and %I (index_ID).
PRE_EXEC= /usr/share/lsf/misc/testq_pre >> /tmp/pre.out
POST_EXEC= /usr/share/lsf/misc/testq_post | grep -v "Hey!"
PATH='/bin /usr/bin /sbin /usr/sbin'
setenv USER_POSTEXEC /path_name
For post-execution commands that execute on a Windows Server 2003, x64 Edition platform, users must have read and execute privileges for cmd.exe.
Not defined. No post-execution commands are associated with the queue.
PRE_EXEC=command
Enables pre-execution processing at the queue level. The PRE_EXEC command runs on the execution host before the job starts. If the PRE_EXEC command exits with a non-zero exit code, LSF requeues the job to the front of the queue.
The PRE_EXEC command uses the same environment variable values as the job, and runs under the user account of the user who submits the job. To run pre-execution commands under a different user account (such as root for privileged operations), configure the parameter LSB_PRE_POST_EXEC_USER in lsf.sudoers.
The command path can contain up to 4094 characters for UNIX and Linux, or up to 255 characters for Windows, including the directory, file name, and expanded values for %J (job_ID) and %I (index_ID).
PRE_EXEC= /usr/share/lsf/misc/testq_pre >> /tmp/pre.out
POST_EXEC= /usr/share/lsf/misc/testq_post | grep -v "Hey!"
PATH='/bin /usr/bin /sbin /usr/sbin'
For pre-execution commands that execute on a Windows Server 2003, x64 Edition platform, users must have read and execute privileges for cmd.exe. This parameter cannot be used to configure host-based pre-execution processing.
Not defined. No pre-execution commands are associated with the queue.
PREEMPTION=PREEMPTIVE[[low_queue_name[+pref_level]] ...] [PREEMPTABLE[[hi_queue_name] ...]]
Enables preemptive scheduling and defines this queue as preemptive. Jobs in this queue preempt jobs from the specified lower-priority queues, or from all lower-priority queues if the parameter is specified with no queue names. PREEMPTIVE can be combined with PREEMPTABLE to specify that jobs in this queue can preempt jobs in lower-priority queues, and can be preempted by jobs in higher-priority queues.
Enables preemptive scheduling and defines this queue as preemptable. Jobs in this queue can be preempted by jobs from specified higher-priority queues, or from all higher-priority queues, even if the higher-priority queues are not preemptive. PREEMPTABLE can be combined with PREEMPTIVE to specify that jobs in this queue can be preempted by jobs in higher-priority queues, and can preempt jobs in lower-priority queues.
Specifies the names of lower-priority queues that can be preempted.
To specify multiple queues, separate the queue names with a space, and enclose the list in a single set of square brackets.
Specifies that this queue is preempted before other queues. When multiple queues are indicated with a preference level, an order of preference is indicated: queues with higher relative preference levels are preempted before queues with lower relative preference levels.
Specifies the names of higher-priority queues that can preempt jobs in this queue.
To specify multiple queues, separate the queue names with a space and enclose the list in a single set of square brackets.
Begin Queue
QUEUE_NAME=high
PREEMPTION=PREEMPTIVE
PRIORITY=99
End Queue
Begin Queue
QUEUE_NAME=medium
PREEMPTION=PREEMPTIVE[normal low+1]
PRIORITY=10
End Queue
Begin Queue
QUEUE_NAME=normal
PREEMPTION=PREEMPTIVE[low] PREEMPTABLE[high medium]
PRIORITY=5
End Queue
Begin Queue
QUEUE_NAME=low
PRIORITY=1
End Queue
PREEMPT_DELAY=seconds
Preemptive jobs wait the specified number of seconds from their submission time before preempting any lower-priority preemptable jobs. During this grace period, preemption is not triggered, but the job can still be scheduled and dispatched by other scheduling policies.
This feature provides flexibility to tune the system to reduce the number of preemptions and improve performance and job throughput. When low-priority jobs are short, letting high-priority jobs wait briefly for them to finish avoids preemption and improves cluster performance. If the job is still pending after the grace period has expired, preemption is triggered.
The waiting time applies to preemptive jobs in the pending state only; it does not affect preemptive jobs that are suspended.
The time is counted from the submission time of the job. The submission time is the time mbatchd accepts a job, which includes newly submitted jobs, restarted jobs (by brestart), and jobs forwarded from a remote cluster.
When the preemptive job is waiting, the pending reason is:
The preemptive job is allowing a grace period before preemption.
If you use an older version of bjobs, the pending reason is:
Unknown pending reason code <6701>;
The parameter is defined in lsb.params, lsb.queues (overrides lsb.params), and lsb.applications (overrides both lsb.params and lsb.queues).
Run badmin reconfig to make your changes take effect.
Not defined (if the parameter is not defined anywhere, preemption is immediate).
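A sketch of a preemptive queue that waits 5 minutes (300 seconds) after job submission before preempting lower-priority jobs (the queue names and values are illustrative):

```
Begin Queue
QUEUE_NAME    = urgent
PRIORITY      = 90
PREEMPTION    = PREEMPTIVE[normal]
PREEMPT_DELAY = 300
End Queue
```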
PRIORITY=integer
Specifies the relative queue priority for dispatching jobs. A higher value indicates a higher job-dispatching priority, relative to other queues.
LSF schedules jobs from one queue at a time, starting with the highest-priority queue. If multiple queues have the same priority, LSF schedules all the jobs from these queues in first-come, first-served order.
LSF queue priority is independent of the UNIX scheduler priority system for time-sharing processes. In LSF, the NICE parameter is used to set the UNIX time-sharing priority for batch jobs.
Specify a number greater than or equal to 1, where 1 is the lowest priority.
1
PROCESSLIMIT=[default_limit] maximum_limit
Limits the number of concurrent processes that can be part of a job.
By default, if a default process limit is specified, jobs submitted to the queue without a job-level process limit are killed when the default process limit is reached.
If you specify only one limit, it is the maximum, or hard, process limit. If you specify two limits, the first one is the default, or soft, process limit, and the second one is the maximum process limit.
Unlimited
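For example, the following sketch gives jobs a default (soft) limit of 5 concurrent processes and a maximum (hard) limit of 20:

```
Begin Queue
QUEUE_NAME   = normal
PROCESSLIMIT = 5 20
End Queue
```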
PROCLIMIT=[minimum_limit [default_limit]] maximum_limit
Maximum number of slots that can be allocated to a job. For parallel jobs, the maximum number of processors that can be allocated to the job.
Queue-level PROCLIMIT has the highest priority, overriding both application-level and job-level PROCLIMIT. Application-level PROCLIMIT has higher priority than job-level PROCLIMIT. Job-level limits must fall within the maximum and minimum limits of the application profile and the queue.
Optionally specifies the minimum and default number of job slots.
All limits must be positive numbers greater than or equal to 1 that satisfy the following relationship:
1 <= minimum <= default <= maximum
If RES_REQ in a queue is defined as a compound resource requirement with a block size (span[block=value]), the default value for PROCLIMIT should be a multiple of the block size.
For example, this configuration would be accepted:
Queue-level RES_REQ="1*{type==any } + {type==local span[block=4]}"
PROCLIMIT = 5 9 13
This configuration, for example, would not be accepted; an error message appears when you run badmin reconfig:
Queue-level RES_REQ="1*{type==any } + {type==local span[block=4]}"
PROCLIMIT = 4 10 12
Unlimited; the default number of slots is 1.
QJOB_LIMIT=integer
Job slot limit for the queue. Total number of job slots that this queue can use.
Unlimited
QUEUE_GROUP=queue1, queue2 ...
Configures absolute priority scheduling (APS) across multiple queues.
When APS is enabled in the queue with APS_PRIORITY, the FAIRSHARE_QUEUES parameter is ignored. The QUEUE_GROUP parameter replaces FAIRSHARE_QUEUES, which is obsolete in LSF 7.0.
Not defined
QUEUE_NAME=string
Required. Name of the queue.
Specify any ASCII string up to 59 characters long. You can use letters, digits, underscores (_) or dashes (-). You cannot use blank spaces. You cannot specify the reserved name default.
You must specify this parameter to define a queue. The default queue automatically created by LSF is named default.
RCVJOBS_FROM=cluster_name ... | allclusters
MultiCluster only. Defines a MultiCluster receive-jobs queue.
Specify cluster names, separated by a space. The administrator of each remote cluster determines which queues in that cluster forward jobs to the local cluster.
Use the keyword allclusters to specify any remote cluster.
RCVJOBS_FROM=cluster2 cluster4 cluster6
This queue accepts remote jobs from clusters 2, 4, and 6.
REMOTE_MAX_PREEXEC_RETRY=integer
MultiCluster job forwarding model only. Applies to the execution cluster. Defines the maximum number of times to attempt the pre-execution command of a job from the remote cluster.
0 - INFINIT_INT
INFINIT_INT is defined in lsf.h.
5
REQUEUE_EXIT_VALUES=[exit_code ...] [EXCLUDE(exit_code ...)]
Enables automatic job requeue and sets the LSB_EXIT_REQUEUE environment variable. Use spaces to separate multiple exit codes. Application-level exit values override queue-level values. Job-level exit values (bsub -Q) override application-level and queue-level values.
"[all] [~number ...] | [number ...]"
The reserved keyword all specifies all exit codes. Exit codes are typically between 0 and 255. Use a tilde (~) to exclude specified exit codes from the list.
Jobs are requeued to the head of the queue. The output from the failed run is not saved, and the user is not notified by LSF.
Define an exit code as EXCLUDE(exit_code) to enable exclusive job requeue, ensuring the job does not rerun on the same host. Exclusive job requeue does not work for parallel jobs.
For MultiCluster jobs forwarded to a remote execution cluster, the exit values specified in the submission cluster with the EXCLUDE keyword are treated as if they were non-exclusive.
You can also requeue a job if the job is terminated by a signal.
If a job is killed by a signal, the exit value is 128+signal_value. The sum of 128 and the signal value can be used as the exit code in the parameter REQUEUE_EXIT_VALUES.
For example, if you want a job to rerun if it is killed with signal 9 (SIGKILL), the exit value is 128+9=137. You can configure the following requeue exit value to allow a job to be requeued if it was killed by signal 9:
REQUEUE_EXIT_VALUES=137
In Windows, if a job is killed by a signal, the exit value is signal_value. The signal value can be used as the exit code in the parameter REQUEUE_EXIT_VALUES.
For example, if you want to rerun a job after it was killed with signal 7, the exit value would be 7. You can configure the following requeue exit value to allow a job to requeue after it was killed by signal 7:
REQUEUE_EXIT_VALUES=7
You can configure the following requeue exit values to allow a job to requeue on both Linux and Windows after it was killed:
REQUEUE_EXIT_VALUES=137 7
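The 128+signal_value convention can be verified from an ordinary shell, independent of LSF: a child process killed by SIGKILL (signal 9) produces exit status 137.

```shell
# Kill a child shell with SIGKILL; the parent sees exit status 128+9=137,
# the value used in REQUEUE_EXIT_VALUES=137 above.
sh -c 'kill -KILL $$'
echo "exit status: $?"   # prints: exit status: 137
```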
If mbatchd is restarted, it does not remember the previous hosts from which the job exited with an exclusive requeue exit code. In this situation, it is possible for a job to be dispatched to hosts on which the job has previously exited with an exclusive exit code.
You should configure REQUEUE_EXIT_VALUES for interruptible backfill queues (INTERRUPTIBLE_BACKFILL=seconds).
REQUEUE_EXIT_VALUES=30 EXCLUDE(20)
means that jobs with exit code 30 are requeued, jobs with exit code 20 are requeued exclusively, and jobs with any other exit code are not requeued.
Not defined. Jobs are not requeued.
RERUNNABLE=yes | no
If yes, enables automatic job rerun (restart).
Rerun is disabled when RERUNNABLE is set to no. The yes and no arguments are not case sensitive.
For MultiCluster jobs, the setting in the submission queue is used, and the setting in the execution queue is ignored.
Members of a chunk job can be rerunnable. If the execution host becomes unavailable, rerunnable chunk job members are removed from the job chunk and dispatched to a different execution host.
no
RESOURCE_RESERVE=MAX_RESERVE_TIME[integer]
Enables processor reservation and memory reservation for pending jobs for the queue. Specifies the number of dispatch turns (MAX_RESERVE_TIME) over which a job can reserve job slots and memory.
Overrides the SLOT_RESERVE parameter. If both RESOURCE_RESERVE and SLOT_RESERVE are defined in the same queue, an error is displayed when the cluster is reconfigured, and SLOT_RESERVE is ignored. Job slot reservation for parallel jobs is enabled by RESOURCE_RESERVE if the LSF scheduler plugin module names for both resource reservation and parallel batch jobs (schmod_parallel and schmod_reserve) are configured in the lsb.modules file: The schmod_parallel name must come before schmod_reserve in lsb.modules.
If a job has not accumulated enough memory or job slots to start by the time MAX_RESERVE_TIME expires, it releases all its reserved job slots or memory so that other pending jobs can run. After the reservation time expires, the job cannot reserve memory or slots for one scheduling session, so other jobs have a chance to be dispatched. After one scheduling session, the job can reserve available memory and job slots again for another period specified by MAX_RESERVE_TIME.
If BACKFILL is configured in a queue, and a run limit is specified with -W on bsub or with RUNLIMIT in the queue, backfill jobs can use the accumulated memory reserved by the other jobs in the queue, as long as the backfill job can finish before the predicted start time of the jobs with the reservation.
Unlike slot reservation, which only applies to parallel jobs, memory reservation and backfill on memory apply to sequential and parallel jobs.
RESOURCE_RESERVE=MAX_RESERVE_TIME[5]
This example specifies that jobs have up to 5 dispatch turns to reserve sufficient job slots or memory (equal to 5 minutes, by default).
Not defined. No job slots or memory is reserved.
RES_REQ=res_req
Resource requirements used to determine eligible hosts. Specify a resource requirement string as usual. The resource requirement string lets you specify conditions in a more flexible manner than load thresholds. Resource requirement strings can be simple (applying to the entire job), compound (applying to the specified number of slots), or can contain alternative resources (alternatives between two or more simple and/or compound requirements). For alternative resources, if no host can be found that satisfies the first resource requirement, the next resource requirement is tried, and so on until the requirement is satisfied.
Compound and alternative resource requirements follow the same set of rules for determining how resource requirements are going to be merged between job, application, and queue level. For more detail on merge rules, see the Administering IBM Platform LSF.
When a compound or alternative resource requirement is set for a queue, it will be ignored unless it is the only resource requirement specified (no resource requirements are set at the job-level or application-level).
When a simple resource requirement is set for a queue and a compound resource requirement is set at the job-level or application-level, the queue-level requirements merge as they do for simple resource requirements. However, any job-based resources defined in the queue only apply to the first term of the merged compound resource requirements.
When LSF_STRICT_RESREQ=Y is configured in lsf.conf, resource requirement strings in select sections must conform to a more strict syntax. The strict resource requirement syntax only applies to the select section. It does not apply to the other resource requirement sections (order, rusage, same, span, cu or affinity). When LSF_STRICT_RESREQ=Y in lsf.conf, LSF rejects resource requirement strings where an rusage section contains a non-consumable resource.
For simple resource requirements, the select sections from all levels must be satisfied and the same sections from all levels are combined. cu, order, and span sections at the job-level overwrite those at the application-level which overwrite those at the queue-level. Multiple rusage definitions are merged, with the job-level rusage taking precedence over the application-level, and application-level taking precedence over the queue-level.
The simple resource requirement rusage section can specify additional requests. To do this, use the OR (||) operator to separate additional rusage strings. Multiple -R options cannot be used with multi-phase rusage resource requirements.
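For example (a sketch; lic_a and lic_b are hypothetical shared license resources), a queue can accept either of two licenses together with memory:
RES_REQ=rusage[mem=100:lic_a=1 || mem=100:lic_b=1]
LSF reserves the first rusage string it can satisfy; if lic_a is unavailable, it tries the alternative with lic_b.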
For simple resource requirements the job-level affinity section overrides the application-level, and the application-level affinity section overrides the queue-level.
Compound and alternative resource requirements do not support use of the || operator within rusage sections or the cu section.
The RES_REQ consumable resource requirements must satisfy any limits set by the parameter RESRSV_LIMIT in lsb.queues, or the RES_REQ will be ignored.
When both the RES_REQ and RESRSV_LIMIT are set in lsb.queues for a consumable resource, the queue-level RES_REQ no longer acts as a hard limit for the merged RES_REQ rusage values from the job and application levels. In this case only the limits set by RESRSV_LIMIT must be satisfied, and the queue-level RES_REQ acts as a default value.
RES_REQ=rusage[mem=200:lic=1] ...
bsub -R'rusage[mem=100]' ...
the resulting requirement for the job is
rusage[mem=100:lic=1]
where mem=100 specified by the job overrides mem=200 specified by the queue. However, lic=1 from the queue is kept, since the job does not specify it.
RES_REQ = rusage[bwidth =2:threshold=5] ...
bsub -R "rusage[bwidth =1:threshold=6]" ...
the resulting requirement for the job is
rusage[bwidth =1:threshold=6]
RES_REQ=rusage[mem=200:duration=20:decay=1] ...
bsub -R'rusage[mem=100]' ...
the resulting requirement for the job is
rusage[mem=100:duration=20:decay=1]
Queue-level duration and decay are merged with the job-level specification, and mem=100 for the job overrides mem=200 specified by the queue. However, duration=20 and decay=1 from the queue are kept, since the job does not specify them.
RES_REQ=rusage[mem=200:duration=20:decay=1] ...
bsub -R'rusage[mem=(300 200 100):duration=(10 10 10)]' ...
the resulting requirement for the job is
rusage[mem=(300 200 100):duration=(10 10 10)]
RES_REQ=rusage[mem=(350 200):duration=(20):decay=(1)] ...
bsub -q q_name -R'rusage[mem=100:swap=150]' ...
the resulting requirement for the job is
rusage[mem=100:swap=150]
The job-level rusage string overrides the queue-level multi-phase rusage string.
The order section defined at the job level overwrites any resource requirements specified at the application level or queue level. The order section defined at the application level overwrites any resource requirements specified at the queue level. The default order string is r15s:pg.
If RES_REQ is defined at the queue level and there are no load thresholds defined, the pending reasons for each individual load index are not displayed by bjobs.
The span section defined at the queue level is ignored if the span section is also defined at the job level or in an application profile.
select[type==local] order[r15s:pg]. If this parameter is defined and a host model or Boolean resource is specified, the default type is any.
RESRSV_LIMIT=[res1={min1,} max1] [res2={min2,} max2]...
Where res is a consumable resource name, min is an optional minimum value, and max is the maximum allowed value. Both max and min must be floating-point numbers between 0 and 2147483647, and min cannot be greater than max.
Sets a range of allowed values for RES_REQ resources.
Queue-level RES_REQ rusage values (set in lsb.queues) must be in the range set by RESRSV_LIMIT, or the queue-level RES_REQ is ignored. Merged RES_REQ rusage values from the job and application levels must be in the range of RESRSV_LIMIT, or the job is rejected.
Changes made to the rusage values of running jobs using bmod -R cannot exceed the maximum values of RESRSV_LIMIT, but can be lower than the minimum values.
When both the RES_REQ and RESRSV_LIMIT are set in lsb.queues for a consumable resource, the queue-level RES_REQ no longer acts as a hard limit for the merged RES_REQ rusage values from the job and application levels. In this case only the limits set by RESRSV_LIMIT must be satisfied, and the queue-level RES_REQ acts as a default value.
For MultiCluster, jobs must satisfy the RESRSV_LIMIT range set for the send-jobs queue in the submission cluster. After the job is forwarded the resource requirements are also checked against the RESRSV_LIMIT range set for the receive-jobs queue in the execution cluster.
Only consumable resource limits can be set in RESRSV_LIMIT. Other resources will be ignored.
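For example (a sketch using the built-in mem and swp consumable resources, in MB), the following allows memory requests between 16 and 2048 and swap requests up to 4096:
RESRSV_LIMIT=[mem=16,2048] [swp=4096]
Because no minimum is specified for swp, its minimum defaults to 0.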
Not defined.
If max is defined and optional min is not, the default for min is 0.
RESUME_COND=res_req
Use the select section of the resource requirement string to specify load thresholds. All other sections are ignored.
LSF automatically resumes a suspended (SSUSP) job in this queue if the load on the host satisfies the specified conditions.
If RESUME_COND is not defined, the loadSched thresholds are used to control resuming of jobs. If RESUME_COND is defined, the loadSched thresholds are ignored when resuming jobs.
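For example (a sketch using the built-in it and ut load indices), the following resumes a suspended job only when the execution host has been idle for more than 10 minutes and CPU utilization is below 20 percent:
RESUME_COND=select[it > 10 && ut < 0.2]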
Not defined. The loadSched thresholds are used to control resuming of jobs.
RUN_JOB_FACTOR=number
Used only with fairshare scheduling. Job slots weighting factor.
In the calculation of a user’s dynamic share priority, this factor determines the relative importance of the number of job slots reserved and in use by a user.
If undefined, the cluster-wide value from the lsb.params parameter of the same name is used.
Not defined.
RUN_TIME_DECAY=Y | y | N | n
Used only with fairshare scheduling. Enables decay for run time at the same rate as the decay set by HIST_HOURS for cumulative CPU time and historical run time.
In the calculation of a user’s dynamic share priority, this factor determines whether run time is decayed.
If undefined, the cluster-wide value from the lsb.params parameter of the same name is used.
Running badmin reconfig or restarting mbatchd during a job's run time results in the decayed run time being recalculated.
When a suspended job using run time decay is resumed, the decay time is based on the elapsed time.
Not defined
RUN_TIME_FACTOR=number
Used only with fairshare scheduling. Run time weighting factor.
In the calculation of a user’s dynamic share priority, this factor determines the relative importance of the total run time of a user’s running jobs.
If undefined, the cluster-wide value from the lsb.params parameter of the same name is used.
Not defined.
RUN_WINDOW=time_window ...
Time periods during which jobs in the queue are allowed to run.
When the window closes, LSF suspends jobs running in the queue and stops dispatching jobs from the queue. When the window reopens, LSF resumes the suspended jobs and begins dispatching additional jobs.
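For example (a sketch of the [day:]hour[:minute] time window syntax, where day 0 is Sunday), the following opens the queue overnight on weekdays and all weekend, from Friday 6 PM until Monday 8:30 AM:
RUN_WINDOW = 20:00-8:30 5:18:00-1:8:30
Multiple windows are separated by spaces.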
Not defined. Queue is always active.
RUNLIMIT=[default_limit] maximum_limit
where default_limit and maximum_limit are:
[hour:]minute[/host_name | /host_model]
The maximum run limit and optionally the default run limit. The name of a host or host model specifies the runtime normalization host to use.
By default, jobs that are in the RUN state for longer than the specified maximum run limit are killed by LSF. You can optionally provide your own termination job action to override this default.
Jobs submitted with a job-level run limit (bsub -W) that is less than the maximum run limit are killed when their job-level run limit is reached. Jobs submitted with a run limit greater than the maximum run limit are rejected by the queue.
If you want to provide an estimated run time for scheduling purposes without killing jobs that exceed the estimate, define the RUNTIME parameter in an application profile instead of a run limit (see lsb.applications for details).
If you specify only one limit, it is the maximum, or hard, run limit. If you specify two limits, the first one is the default, or soft, run limit, and the second one is the maximum run limit. The run limit is in the form [hour:]minute, where the number of minutes can be greater than 59; for example, three and a half hours can be specified either as 3:30 or as 210.
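For example, the following sets a default (soft) run limit of 60 minutes and a maximum (hard) run limit of 240 minutes:
RUNLIMIT = 60 240
Jobs submitted without a job-level run limit are killed after 60 normalized minutes; jobs can request up to 240 minutes with bsub -W.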
The run limit you specify is the normalized run time. This is done so that the job does approximately the same amount of processing, even if it is sent to a host with a faster or slower CPU. Whenever a normalized run time is given, the actual time on the execution host is the specified time multiplied by the CPU factor of the normalization host, then divided by the CPU factor of the execution host.
If ABS_RUNLIMIT=Y is defined in lsb.params, the runtime limit is not normalized by the host CPU factor. Absolute wall-clock run time is used for all jobs submitted to a queue with a run limit configured.
Optionally, you can supply a host name or a host model name defined in LSF. You must insert ‘/’ between the run limit and the host name or model name. (See lsinfo(1) to get host model information.)
If no host or host model is given, LSF uses the default runtime normalization host defined at the queue level (DEFAULT_HOST_SPEC in lsb.queues) if it has been configured; otherwise, LSF uses the default CPU time normalization host defined at the cluster level (DEFAULT_HOST_SPEC in lsb.params) if it has been configured; otherwise, the host with the largest CPU factor (the fastest host in the cluster).
For MultiCluster jobs, if no other CPU time normalization host is defined and information about the submission host is not available, LSF uses the host with the largest CPU factor (the fastest host in the cluster).
Jobs submitted to a chunk job queue are not chunked if RUNLIMIT is greater than 30 minutes.
RUNLIMIT is required for queues configured with INTERRUPTIBLE_BACKFILL.
Unlimited
SLA_GUARANTEES_IGNORE=Y | y | N | n
Applies to SLA guarantees only.
SLA_GUARANTEES_IGNORE=Y allows jobs in the queue access to all guaranteed resources. As a result, some guarantees might not be honored. If a queue does not have this parameter set, jobs in the queue cannot trigger preemption of an SLA job. If an SLA job is suspended (for example, by bstop), jobs in queues without this parameter set can still use the slots released by the suspended job.
Setting SLA_GUARANTEES_IGNORE=Y defeats the purpose of guaranteeing resources; use it sparingly, and only for low-traffic queues.
Not defined (N). The queue must honor resource guarantees when dispatching jobs.
SLOT_POOL=pool_name
Name of the pool of job slots the queue belongs to for queue-based fairshare. A queue can only belong to one pool. All queues in the pool must share the same set of hosts.
Specify any ASCII string up to 60 characters long. You can use letters, digits, underscores (_) or dashes (-). You cannot use blank spaces.
Not defined.
SLOT_RESERVE=MAX_RESERVE_TIME[integer]
Enables processor reservation for the queue and specifies the reservation time. Specify the keyword MAX_RESERVE_TIME and, in square brackets, the number of MBD_SLEEP_TIME cycles over which a job can reserve job slots. MBD_SLEEP_TIME is defined in lsb.params; the default value is 60 seconds.
If a job has not accumulated enough job slots to start before the reservation expires, it releases all its reserved job slots so that other jobs can run. Then, the job cannot reserve slots for one scheduling session, so other jobs have a chance to be dispatched. After one scheduling session, the job can reserve job slots again for another period specified by SLOT_RESERVE.
SLOT_RESERVE is overridden by the RESOURCE_RESERVE parameter.
If both RESOURCE_RESERVE and SLOT_RESERVE are defined in the same queue, job slot reservation and memory reservation are enabled and an error is displayed when the cluster is reconfigured. SLOT_RESERVE is ignored.
Job slot reservation for parallel jobs is enabled by RESOURCE_RESERVE if the LSF scheduler plugin module names for both resource reservation and parallel batch jobs (schmod_parallel and schmod_reserve) are configured in the lsb.modules file: The schmod_parallel name must come before schmod_reserve in lsb.modules.
If BACKFILL is configured in a queue, and a run limit is specified at the job level (bsub -W), application level (RUNLIMIT in lsb.applications), or queue level (RUNLIMIT in lsb.queues), or if an estimated run time is specified at the application level (RUNTIME in lsb.applications), backfill parallel jobs can use job slots reserved by the other jobs, as long as the backfill job can finish before the predicted start time of the jobs with the reservation.
Unlike memory reservation, which applies both to sequential and parallel jobs, slot reservation applies only to parallel jobs.
SLOT_RESERVE=MAX_RESERVE_TIME[5]
This example specifies that parallel jobs have up to 5 cycles of MBD_SLEEP_TIME (5 minutes, by default) to reserve sufficient job slots to start.
Not defined. No job slots are reserved.
SLOT_SHARE=integer
Share of job slots for queue-based fairshare. Represents the percentage of running jobs (job slots) in use from the queue. SLOT_SHARE must be greater than zero (0) and less than or equal to 100.
The sum of SLOT_SHARE for all queues in the pool does not need to be 100%. It can be more or less, depending on your needs.
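For example (a sketch; poolA, short, and normal are hypothetical names), two queues can share one slot pool as follows:
Begin Queue
NAME       = short
SLOT_POOL  = poolA
SLOT_SHARE = 70
End Queue
Begin Queue
NAME       = normal
SLOT_POOL  = poolA
SLOT_SHARE = 30
End Queue
Jobs from short are entitled to 70% of the running job slots in poolA, and jobs from normal to 30%.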
Not defined
SNDJOBS_TO=[queue@]cluster_name[+preference] ...
Defines a MultiCluster send-jobs queue.
Specify remote queue names, in the form queue_name@cluster_name[+preference], separated by a space.
This parameter is ignored if lsb.queues HOSTS specifies remote (borrowed) resources.
Queue preference is defined at the queue level in SNDJOBS_TO (lsb.queues) of the submission cluster for each corresponding execution cluster queue receiving forwarded jobs.
SNDJOBS_TO=queue2@cluster2+1 queue3@cluster2+2
STACKLIMIT=integer
The per-process (hard) stack segment size limit (in KB) for all of the processes belonging to a job from this queue (see getrlimit(2)).
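For example, the following limits each process of a job from this queue to an 8 MB (8192 KB) stack:
STACKLIMIT=8192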
Unlimited
STOP_COND=res_req
Use the select section of the resource requirement string to specify load thresholds. All other sections are ignored.
If STOP_COND is specified in the queue and there are no load thresholds, the suspending reasons for each individual load index are not displayed by bjobs.
STOP_COND= select[((!cs && it < 5) || (cs && mem < 15 && swp < 50))]
In this example, assume “cs” is a Boolean resource indicating that the host is a computer server. The stop condition for jobs running on computer servers is based on the availability of memory and swap space. The stop condition for jobs running on other kinds of hosts is based on the idle time.
SUCCESS_EXIT_VALUES=[exit_code ...]
Use this parameter to specify exit values used by LSF to determine whether the job completed successfully. Application-level success exit values defined with SUCCESS_EXIT_VALUES in lsb.applications override the configuration defined in lsb.queues. Job-level success exit values specified with the LSB_SUCCESS_EXIT_VALUES environment variable override the configuration in lsb.queues and lsb.applications.
Use SUCCESS_EXIT_VALUES in queues where jobs successfully exit with non-zero values, so that LSF does not interpret those non-zero exit codes as job failure.
If the same exit code is defined in SUCCESS_EXIT_VALUES and REQUEUE_EXIT_VALUES, any job with this exit code is requeued instead of being marked as DONE because sbatchd processes requeue exit values before success exit values.
In MultiCluster job forwarding mode, LSF uses the SUCCESS_EXIT_VALUES from the remote cluster.
In a MultiCluster resource leasing environment, LSF uses the SUCCESS_EXIT_VALUES from the consumer cluster.
exit_code should be a value between 0 and 255. Use spaces to separate multiple exit codes.
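For example (the exit codes shown are arbitrary), the following treats jobs that exit with code 230, 222, or 12 as successfully completed (DONE) rather than failed:
SUCCESS_EXIT_VALUES=230 222 12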
Any changes you make to SUCCESS_EXIT_VALUES do not affect running jobs. Only pending jobs use the new SUCCESS_EXIT_VALUES definitions, even if you run badmin reconfig or mbatchd restart to apply your changes.
Not defined.
SWAPLIMIT=integer
The total virtual memory limit (in KB) for a job from this queue.
This limit applies to the whole job, no matter how many processes the job may contain.
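For example, the following limits the total virtual memory of a job from this queue to 1 GB (1048576 KB):
SWAPLIMIT=1048576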
The action taken when a job exceeds its SWAPLIMIT or PROCESSLIMIT is to send SIGQUIT, SIGINT, SIGTERM, and SIGKILL in sequence. For CPULIMIT, SIGXCPU is sent before SIGINT, SIGTERM, and SIGKILL.
Unlimited
TERMINATE_WHEN=[LOAD] [PREEMPT] [WINDOW]
If the TERMINATE_WHEN job control action is applied to a chunk job, sbatchd kills the chunk job element that is running and puts the rest of the waiting elements into pending state to be rescheduled later.
Begin Queue
NAME = night
RUN_WINDOW = 20:00-08:00
TERMINATE_WHEN = WINDOW
JOB_CONTROLS = TERMINATE[kill -KILL $LS_JOBPGIDS; mail -s "job $LSB_JOBID
killed by queue run window" $USER < /dev/null]
End Queue
THREADLIMIT=[default_limit] maximum_limit
Limits the number of concurrent threads that can be part of a job. Exceeding the limit causes the job to terminate. The system sends the following signals in sequence to all processes belonging to the job: SIGINT, SIGTERM, and SIGKILL.
By default, if a default thread limit is specified, jobs submitted to the queue without a job-level thread limit are killed when the default thread limit is reached.
If you specify only one limit, it is the maximum, or hard, thread limit. If you specify two limits, the first one is the default, or soft, thread limit, and the second one is the maximum thread limit.
Both the default and the maximum limits must be positive integers. The default limit must be less than the maximum limit. The default limit is ignored if it is greater than the maximum limit.
THREADLIMIT=6
No default thread limit is specified. The value 6 is the default and maximum thread limit.
THREADLIMIT=6 8
The first value (6) is the default thread limit. The second value (8) is the maximum thread limit.
Unlimited
UJOB_LIMIT=integer
Per-user job slot limit for the queue. Maximum number of job slots that each user can use in this queue.
UJOB_LIMIT must be within or greater than the range set by PROCLIMIT or bsub -n (if either is used), or jobs are rejected.
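For example, the following prevents any single user from occupying more than 10 job slots in this queue at one time:
UJOB_LIMIT=10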
Unlimited
USE_PAM_CREDS=y | n
If USE_PAM_CREDS=y, applies PAM limits to a queue when its job is dispatched to a Linux host using PAM. PAM limits are system resource limits defined in limits.conf.
When USE_PAM_CREDS is enabled, PAM limits override other limits. For example, the PAM limit is used even if the queue-level soft limit is less than the PAM limit. However, the PAM limit still cannot exceed the queue's hard limit.
If the execution host does not have PAM configured and this parameter is enabled, the job fails.
For parallel jobs, USE_PAM_CREDS takes effect only on the first execution host.
Overrides MEMLIMIT_TYPE=Process.
Overridden (for CPU limit only) by LSB_JOB_CPULIMIT=y.
Overridden (for memory limits only) by LSB_JOB_MEMLIMIT=y.
n
USE_PRIORITY_IN_POOL= y | Y | n | N
Queue-based fairshare only. After job scheduling occurs for each queue, this parameter enables LSF to dispatch jobs to any remaining slots in the pool in first-come first-served order across queues.
N
USERS=all [~user_name ...] [~user_group ...] | [user_name ...] [user_group [~user_group ...] ...]
A space-separated list of user names or user groups that can submit jobs to the queue. LSF cluster administrators are automatically included in the list of users. LSF cluster administrators can submit jobs to this queue, or switch (bswitch) any user’s jobs into this queue.
If user groups are specified, each user in the group can submit jobs to this queue. If FAIRSHARE is also defined in this queue, only users defined by both parameters can submit jobs, so LSF administrators cannot use the queue if they are not included in the share assignments.
User names must be valid login names. To specify a Windows user account, include the domain name in uppercase letters (DOMAIN_NAME\user_name).
User group names can be LSF user groups or UNIX and Windows user groups. To specify a Windows user group, include the domain name in uppercase letters (DOMAIN_NAME\user_group).
Use the keyword all to specify all users or user groups in a cluster.
Use the not operator (~) to exclude users from the all specification or from user groups. This is useful if you have a large number of users but only want to exclude a few users or groups from the queue definition.
The not operator (~) can only be used with the all keyword or to exclude users from user groups.
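For example (userA and ugroup1 are hypothetical names), the following lets every user in the cluster except userA and the members of ugroup1 submit jobs to the queue:
USERS=all ~userA ~ugroup1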
all (all users can submit jobs to the queue)
Variable configuration is used to automatically change LSF configuration based on time windows. You define automatic configuration changes in lsb.queues by using if-else constructs and time expressions. After you change the files, reconfigure the cluster with the badmin reconfig command.
The expressions are evaluated by LSF every 10 minutes based on mbatchd start time. When an expression evaluates true, LSF dynamically changes the configuration based on the associated configuration statements. Reconfiguration is done in real time without restarting mbatchd, providing continuous system availability.
Begin Queue
...
#if time(8:30-18:30)
INTERACTIVE = ONLY # interactive only during day shift
#endif
...
End Queue