bkill

sends signals to kill, suspend, or resume unfinished jobs

Synopsis

bkill [-l] [-app application_profile_name] [-g job_group_name] [-sla service_class_name] [-J job_name] [-m host_name | -m host_group] [-q queue_name] [-r | -s signal_value | signal_name] [-u user_name | -u user_group | -u all] [job_ID ... | 0 | "job_ID[index]" ...]
bkill [-l] [-b] [-app application_profile_name] [-g job_group_name] [-sla service_class_name] [-J job_name] [-m host_name | -m host_group] [-q queue_name] [-u user_name | -u user_group | -u all] [job_ID ... | 0 | "job_ID[index]" ...]
bkill [-h | -V]

Description

By default, sends a set of signals to kill the specified jobs. On UNIX, SIGINT and SIGTERM are sent to give the job a chance to clean up before termination, then SIGKILL is sent to kill the job. The time interval between sending each signal is defined by the JOB_TERMINATE_INTERVAL parameter in lsb.params(5).
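The default escalation can be pictured as a fixed schedule. The following Python sketch is illustrative only: the function name and the tuple layout are assumptions, and the interval is passed in as an argument where LSF would read JOB_TERMINATE_INTERVAL from lsb.params.

```python
def termination_schedule(interval_seconds):
    """Return (offset_in_seconds, signal_name) pairs for a default bkill."""
    signals = ["SIGINT", "SIGTERM", "SIGKILL"]
    # Each signal follows the previous one after one interval.
    return [(i * interval_seconds, name) for i, name in enumerate(signals)]


for offset, name in termination_schedule(10):
    print(f"t+{offset}s: {name}")
```

With an interval of 10 seconds, the sketch places SIGINT at 0, SIGTERM at 10, and SIGKILL at 20 seconds.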

By default, kills the last job submitted by the user running the command. You must specify a job ID or -app, -g, -J, -m, -u, or -q. If you specify -app, -g, -J, -m, -u, or -q without a job ID, bkill kills the last job submitted by the user running the command. Specify job ID 0 (zero) to kill multiple jobs.

On Windows, job control messages replace the SIGINT and SIGTERM signals (but only customized applications can process them), and the TerminateProcess() system call is used to kill the job.

bkill sends the signals INT, TERM and KILL in sequence. The exit code returned when a dispatched job is killed with bkill depends on which signal killed the job.
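Many UNIX shells report a process terminated by signal N with exit status 128 + N. This page does not state the exact mapping that LSF uses, so the sketch below simply assumes that convention; the helper names are hypothetical.

```python
import signal


def exit_status_for_signal(sig):
    """Exit status under the assumed 128 + N convention."""
    return 128 + int(sig)


def signal_from_exit_status(status):
    """Return the signal implied by an exit status, or None if there is none."""
    if status <= 128:
        return None
    try:
        return signal.Signals(status - 128)
    except ValueError:
        # The number does not correspond to a signal on this platform.
        return None


print(exit_status_for_signal(signal.SIGTERM))
print(signal_from_exit_status(130))
```

Under this assumption, a job that exits on SIGINT and one that survives until SIGTERM report different exit codes, which is one way to tell how far the escalation went.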

If PRIVILEGED_USER_FORCE_BKILL=y in lsb.params, only root and LSF administrators can run bkill -r. The -r option is ignored for other users.

Users can only operate on their own jobs. Only root and LSF administrators can operate on jobs submitted by other users.

If a signal request fails to reach the job execution host, LSF tries the operation later when the host becomes reachable. LSF retries the most recent signal request.

If a job is running in a queue with CHUNK_JOB_SIZE set, bkill has the following results depending on job state:

PEND

Job is removed from chunk (NJOBS -1, PEND -1)

RUN

All jobs in the chunk are suspended (NRUN -1, NSUSP +1)

USUSP

Job finishes, next job in the chunk starts if one exists (NJOBS -1, PEND -1, SUSP -1, RUN +1)

WAIT

Job finishes (NJOBS -1, PEND -1)
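The counter changes listed above can be written as a small lookup table. This Python sketch is only a model of the list: the dictionary layout and the function name are assumptions, and the counter names are copied as they appear above.

```python
# Counter deltas per job state, as listed for queues with CHUNK_JOB_SIZE set.
CHUNK_BKILL_DELTAS = {
    "PEND": {"NJOBS": -1, "PEND": -1},
    "RUN": {"NRUN": -1, "NSUSP": +1},
    "USUSP": {"NJOBS": -1, "PEND": -1, "SUSP": -1, "RUN": +1},
    "WAIT": {"NJOBS": -1, "PEND": -1},
}


def apply_bkill(counters, job_state):
    """Return a copy of counters after bkill on a chunk job in job_state."""
    updated = dict(counters)
    for name, delta in CHUNK_BKILL_DELTAS[job_state].items():
        updated[name] = updated.get(name, 0) + delta
    return updated


print(apply_bkill({"NJOBS": 4, "PEND": 3}, "WAIT"))
```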

If the job cannot be killed, use bkill -r to remove the job from the LSF system without waiting for the job to terminate, and free the resources of the job.

Options

0

Kills all the jobs that satisfy other options (-app, -g, -m, -q, -u, and -J).

-b

Kills large numbers of jobs as soon as possible. Local pending jobs are killed immediately and cleaned up as soon as possible, ignoring the time interval specified by CLEAN_PERIOD in lsb.params. Jobs killed in this manner are not logged to lsb.acct.

Other jobs, such as running jobs, are killed as soon as possible and cleaned up normally.

If the -b option is used with the 0 subcommand, bkill kills all applicable jobs and silently skips the jobs that cannot be killed.

bkill -b 0
Operation is in progress

The -b option is ignored if used with the -r or -s options.

-l

Displays the signal names supported by bkill. This is a subset of signals supported by /bin/kill and is platform-dependent.

-r

Removes a job from the LSF system without waiting for the job to terminate in the operating system.

If PRIVILEGED_USER_FORCE_BKILL=y in lsb.params, only root and LSF administrators can run bkill -r. The -r option is ignored for other users.

Sends the same series of signals as bkill without -r, except that the job is removed from the system immediately, the job is marked as EXIT, and the job resources that LSF monitors are released as soon as LSF receives the first signal.

Use bkill -r only on jobs that cannot be killed in the operating system, or on jobs that cannot be otherwise removed using bkill.

The -r option cannot be used with the -s option.

-app application_profile_name

Operates only on jobs associated with the specified application profile. You must specify an existing application profile. If job_ID or 0 is not specified, only the most recently submitted qualifying job is operated on.

-g job_group_name

Operates only on jobs in the job group specified by job_group_name.

Use -g with -sla to kill jobs in job groups attached to a service class.

bkill does not kill jobs in lower-level job groups in the path. For example, suppose jobs are attached to the job groups /risk_group and /risk_group/consolidate:

bsub -g /risk_group myjob
Job <115> is submitted to default queue <normal>.
bsub -g /risk_group/consolidate myjob2
Job <116> is submitted to default queue <normal>.

The following bkill command kills only the jobs in /risk_group, not those in the subgroup /risk_group/consolidate:

bkill -g /risk_group 0
Job <115> is being terminated

To kill jobs in the subgroup, specify its full path:

bkill -g /risk_group/consolidate 0
Job <116> is being terminated

-J job_name

Operates only on jobs with the specified job name. The -J option is ignored if a job ID other than 0 is specified in the job_ID option.

The job name can be up to 4094 characters long. Job names are not unique.

The wildcard character (*) can be used anywhere within a job name, but cannot appear within array indices. For example, job* returns jobA and jobarray[1], and *AAA*[1] returns the first element in all job arrays with names containing AAA. However, job1[*] does not return anything because the wildcard is within the array index.
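The wildcard rules can be modeled with a short matcher. This Python sketch is an approximation written from the examples above, not LSF's matching code; the function names are hypothetical, and it assumes that a pattern without an index may match across a whole job name, including any array index.

```python
import re

# Splits "name[index]" into its name and index parts.
_INDEXED = re.compile(r"^([^\[\]]*)(?:\[([^\[\]]*)\])?$")


def _split(text):
    match = _INDEXED.match(text)
    return (match.group(1), match.group(2)) if match else (text, None)


def _glob_to_regex(glob):
    # "*" matches any run of characters; everything else is literal.
    return ".*".join(re.escape(part) for part in glob.split("*"))


def job_name_matches(pattern, job_name):
    """Return True if job_name is selected by a -J style pattern."""
    pattern_base, pattern_index = _split(pattern)
    if pattern_index is None:
        # No index in the pattern: match against the full job name.
        return re.fullmatch(_glob_to_regex(pattern), job_name) is not None
    if "*" in pattern_index:
        # Wildcards inside array indices never match anything.
        return False
    name_base, name_index = _split(job_name)
    base_ok = re.fullmatch(_glob_to_regex(pattern_base), name_base) is not None
    return base_ok and pattern_index == name_index


for pattern, name in [("job*", "jobarray[1]"), ("job1[*]", "job1[1]")]:
    print(pattern, name, job_name_matches(pattern, name))
```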

-m host_name | -m host_group

Operates only on jobs dispatched to the specified host or host group.

If job_ID is not specified, only the most recently submitted qualifying job is operated on. The -m option is ignored if a job ID other than 0 is specified in the job_ID option. See bhosts(1) and bmgroup(1) for more information about hosts and host groups.

-q queue_name

Operates only on jobs in the specified queue.

If job_ID is not specified, only the most recently submitted qualifying job is operated on.

The -q option is ignored if a job ID other than 0 is specified in the job_ID option.

See bqueues(1) for more information about queues.

-s signal_value | signal_name

Sends the specified signal to the specified jobs. You can specify either a signal name, stripped of the SIG prefix (such as KILL), or a signal number (such as 9).

Eligible UNIX signal names are listed by bkill -l.

The -s option cannot be used with the -r option.
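A script that wraps bkill might normalize the signal argument and reject the -r and -s combination before calling the command. The Python sketch below only builds an argument list and does not run anything; the function names and parameters are hypothetical.

```python
def normalize_signal(sig):
    """Return a signal number unchanged, or a name without its SIG prefix."""
    text = str(sig).strip()
    if text.isdigit():
        return text
    upper = text.upper()
    return upper[3:] if upper.startswith("SIG") else upper


def build_bkill_argv(job_ids, sig=None, force_remove=False):
    """Build an argument list for bkill; -r and -s are mutually exclusive."""
    if sig is not None and force_remove:
        raise ValueError("the -s option cannot be used with the -r option")
    argv = ["bkill"]
    if force_remove:
        argv.append("-r")
    if sig is not None:
        argv += ["-s", normalize_signal(sig)]
    argv += [str(job_id) for job_id in job_ids]
    return argv


print(build_bkill_argv([1045], sig="SIGKILL"))
```

The resulting list could be handed to subprocess.run on a host where bkill is installed.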

Use bkill -s to suspend and resume jobs by using the appropriate signal instead of using bstop or bresume. Sending the SIGCONT signal is the same as using bresume.

Sending the SIGSTOP signal to sequential jobs or the SIGTSTP signal to parallel jobs is the same as using bstop.

You cannot suspend a job that is already suspended, or resume a job that is not suspended. Using SIGSTOP or SIGTSTP on a job that is in the USUSP state has no effect and using SIGCONT on a job that is not in either the PSUSP or the USUSP state has no effect. See bjobs(1) for more information about job states.
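The two no-effect cases stated above can be checked before sending a signal. This Python sketch covers only those two cases and makes no claim about any other state and signal combination; the function name is hypothetical.

```python
SUSPEND_SIGNALS = {"STOP", "TSTP"}
RESUME_SIGNALS = {"CONT"}


def is_documented_noop(job_state, signal_name):
    """Return True for the state and signal pairs described as having no effect."""
    if signal_name in SUSPEND_SIGNALS:
        # Suspending a job that is already in USUSP has no effect.
        return job_state == "USUSP"
    if signal_name in RESUME_SIGNALS:
        # Resuming only affects jobs in PSUSP or USUSP.
        return job_state not in {"PSUSP", "USUSP"}
    return False


print(is_documented_noop("USUSP", "STOP"))
print(is_documented_noop("RUN", "CONT"))
```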

Limited Windows signals are supported:

  • bkill -s 7 or bkill -s KILL to terminate a job

  • bkill -s 16 or bkill -s STOP to suspend a job

  • bkill -s 15 to resume a job

-sla service_class_name

Operates on jobs belonging to the specified service class.

If job_ID is not specified, only the most recently submitted qualifying job is operated on.

Use -sla with -g to kill jobs in job groups attached to a service class.

The -sla option is ignored if a job ID other than 0 is specified in the job_ID option.

Use bsla to display the configuration properties of service classes configured in lsb.serviceclasses, the default SLA configured with ENABLE_DEFAULT_EGO_SLA in lsb.params, and dynamic information about the state of each service class.

-u user_name | -u user_group | -u all

Operates only on jobs submitted by the specified user or user group, or by all users if the reserved user name all is specified. To specify a Windows user account, include the domain name in uppercase letters and use a single backslash (DOMAIN_NAME\user_name) in a Windows command line or a double backslash (DOMAIN_NAME\\user_name) in a UNIX command line.
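The backslash rule is easy to get wrong when generating command lines. The Python sketch below formats the account name as it would be typed at each kind of prompt; the function name and its parameters are hypothetical.

```python
def windows_user_arg(domain, user, unix_command_line=True):
    """Format a Windows account for the -u option as typed on a command line."""
    # A UNIX command line needs a doubled backslash, a Windows one a single.
    separator = "\\\\" if unix_command_line else "\\"
    return f"{domain.upper()}{separator}{user}"


print(windows_user_arg("corp", "alice"))
print(windows_user_arg("corp", "alice", unix_command_line=False))
```

The returned text is what a person would type. A program that passes arguments directly, without a shell in between, would presumably not need the doubled form.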

If job_ID is not specified, only the most recently submitted qualifying job is operated on. The -u option is ignored if a job ID other than 0 is specified in the job_ID option.

job_ID ... | 0 | "job_ID[index]" ...

Operates only on jobs that are specified by job_ID or "job_ID[index]", where "job_ID[index]" specifies selected job array elements (see bjobs(1)). For job arrays, quotation marks must enclose the job ID and index, and index must be enclosed in square brackets.
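Square brackets are special characters in most UNIX shells, which is why the array element must be quoted. The Python sketch below builds the argument and a quoted command line; the helper names are hypothetical, and shlex.quote produces single quotes, which a POSIX shell treats the same way as the double quotes shown above for this argument.

```python
import shlex


def array_element_arg(job_id, index):
    """Return the job_ID[index] form for one job array element."""
    return f"{job_id}[{index}]"


def bkill_shell_command(job_id, index):
    """Return a shell command line with the array element safely quoted."""
    return "bkill " + shlex.quote(array_element_arg(job_id, index))


print(bkill_shell_command(123, 4))
```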

Kill an entire job array by specifying the job array ID instead of the job ID.

Jobs submitted by any user can be specified here without using the -u option. If you use the reserved job ID 0, all the jobs that satisfy other options (that is, -m, -q, -u and -J) are operated on; all other job IDs are ignored.

The options -u, -q, -m and -J have no effect if a job ID other than 0 is specified. Job IDs are returned at job submission time (see bsub(1)) and may be obtained with the bjobs command (see bjobs(1)).
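The interaction between job IDs and the filter options can be summarized as three cases. The Python sketch below is a simplified model with hypothetical names: jobs are plain dictionaries listed oldest first, filters are exact matches, and job ownership and permissions are not modeled.

```python
def _matches(job, filters):
    return all(job.get(key) == value for key, value in filters.items())


def select_jobs(jobs, job_ids=(), **filters):
    """Return the jobs that a bkill invocation would operate on."""
    if 0 in job_ids:
        # Reserved job ID 0: every qualifying job; other job IDs are ignored.
        return [job for job in jobs if _matches(job, filters)]
    if job_ids:
        # Explicit job IDs: the filter options have no effect.
        return [job for job in jobs if job["id"] in job_ids]
    # No job ID: only the most recently submitted qualifying job.
    qualifying = [job for job in jobs if _matches(job, filters)]
    return qualifying[-1:]


jobs = [
    {"id": 101, "queue": "night"},
    {"id": 102, "queue": "short"},
    {"id": 103, "queue": "night"},
]
print([job["id"] for job in select_jobs(jobs, queue="night")])
print([job["id"] for job in select_jobs(jobs, job_ids=[0], queue="night")])
print([job["id"] for job in select_jobs(jobs, job_ids=[102], queue="night")])
```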

Any jobs or job arrays that are killed are logged in lsb.acct.

-h

Prints command usage to stderr and exits.

-V

Prints LSF release version to stderr and exits.

Examples

bkill -s 17 -q night

Sends signal 17 to the last job that was submitted by the invoker to queue night.

bkill -q short -u all 0

Kills all the jobs that are in the queue short.

bkill -r 1045

Forces the removal of unkillable job 1045.

bkill -sla Tofino 0

Kills all jobs belonging to the service class named Tofino.

bkill -g /risk_group 0

Kills all jobs in the job group /risk_group.

bkill -app fluent

Kills the most recently submitted job associated with the application profile fluent for the current user.

bkill -app fluent 0

Kills all jobs associated with the application profile fluent for the current user.

See also

bsub(1), bjobs(1), bqueues(1), bhosts(1), bresume(1), bapp(1), bsla(1), bstop(1), bgadd(1), bgdel(1), bjgroup(1), lsb.params(5), lsb.serviceclasses(5), mbatchd(8), kill(1), signal(2)