of Generic RESource (GRES) on each compute node.
If the GRES information in the slurm.conf file does not fully describe those
resources, then a gres.conf file should be included on each compute node.
The file location can be modified at system build time using the
DEFAULT_SLURM_CONF parameter or at execution time by setting the SLURM_CONF
environment variable. The file will always be located in the
same directory as the \fBslurm.conf\fP file.

.LP
If the GRES information in the slurm.conf file fully describes those resources
(i.e. no "Cores", "File" or "Links" specification is required for that GRES
type or that information is automatically detected), that information may be
omitted from the gres.conf file and only the configuration information in the
slurm.conf file will be used.
The gres.conf file may be omitted completely if the configuration information
in the slurm.conf file fully describes all GRES.

.LP
Parameter names are case insensitive.
Any text following a "#" in the configuration file is treated
as a comment through the end of that line.
Changes to the configuration file take effect upon restart of
Slurm daemons, daemon receipt of the SIGHUP signal, or execution
of the command "scontrol reconfigure" unless otherwise noted.

.LP
\fBNOTE:\fP Slurm support for gres/mps requires the use of the select/cons_tres
plugin. For more information on how to configure MPS, see
\fIhttps://slurm.schedmd.com/gres.html#MPS_Management\fR.

.LP
For more information on GRES scheduling in general, see
\fIhttps://slurm.schedmd.com/gres.html\fR.

.LP
The overall configuration parameters available include:

.TP
\fBAutoDetect\fR
The hardware detection mechanisms to enable for automatic GRES configuration.
This should be on a line by itself. Current options are:
.RS
.TP
\fBnvml\fR
Used to automatically detect NVIDIA GPUs.
.TP
\fBrsmi\fR
Used to automatically detect AMD GPUs.
.RE

.TP
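\fBCount\fR
Number of resources of this type available on this node.

.TP
\fBCores\fR
Optionally specify the core index numbers of the specific cores which can use
this resource.
Slurm's scheduling algorithms used to co\-allocate GRES devices with CPUs
operate at a socket or NUMA level.
Therefore it is not possible to preferentially assign GRES with different
specific CPUs on the same NUMA or socket and this option should be used to
identify all cores on some socket.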


Multiple cores may be specified using a comma delimited list or a range may be
specified using a "\-" separator (e.g. "0,1,2,3" or "0\-3").
If a job specifies \fB\-\-gres\-flags=enforce\-binding\fR, then only the
identified cores can be allocated with each generic resource. This will tend to
improve performance of jobs, but delay the allocation of resources to them.
If \fBCores\fR is specified and a job is \fInot\fR submitted with the
\fB\-\-gres\-flags=enforce\-binding\fR option, the identified cores will be
preferred for scheduling with each generic resource.

If \fB\-\-gres\-flags=disable\-binding\fR is specified, then any core can be
used with the resources, which also increases the speed of Slurm's
scheduling algorithm but can degrade the application performance.
The \fB\-\-gres\-flags=disable\-binding\fR option is currently required to use
more CPUs than are bound to a GRES (e.g. if a GPU is bound to the CPUs on one
socket, but resources on more than one socket are required to run the job).
If any core can be effectively used with the resources, then do not specify the
\fBCores\fR option for improved speed in the Slurm scheduling logic.
A restart of the slurmctld is needed for changes to the Cores option to take
effect.

\fBNOTE:\fR If your cores contain multiple threads, only the first thread
(processing unit) of each core needs to be listed.
Also note that since Slurm must be able to perform resource management on
heterogeneous clusters having various processing unit numbering schemes,
a logical processing unit index must be specified instead of the physical
processing unit index.
That processing unit logical index might not correspond to your physical index
number.
Processing unit 0 will be the first socket, first core and (if configured) first
thread.
If hyperthreading is enabled, processing unit 1 will always be the first socket,
first core and second thread.
If hyperthreading is not enabled, processing unit 1 will always be the first
socket and second core.
This numbering coincides with the processing unit logical number (PU L#) seen
in "lstopo \-l" command output.
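
As a minimal sketch of the numbering described above, assume a hypothetical
node with two sockets of four cores each, hyperthreading disabled, and one GPU
attached to each socket (the device files and topology are illustrative
assumptions, not defaults).
Logical processing units 0\-3 would then be the cores on the first socket and
4\-7 those on the second:
.br
# Hypothetical layout: confirm the indexes with "lstopo \-l" before use
.br
Name=gpu File=/dev/nvidia0 Cores=0\-3
.br
Name=gpu File=/dev/nvidia1 Cores=4\-7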

.TP
\fBFile\fR
Fully qualified pathname of the device files associated with a resource.
The name can include a numeric range suffix to be interpreted by Slurm
(e.g. \fIFile=/dev/nvidia[0\-3]\fR).


This field is generally required if enforcement of generic resource
allocations is to be supported (i.e. prevents users from making
use of resources allocated to a different user).

\fBNOTE:\fR If you specify the \fBFile\fR parameter for a resource on some node,
the option must be specified on all nodes and Slurm will track the assignment
of each specific resource on each node. Otherwise Slurm will only track a
count of allocated resources rather than the state of each individual device
file.

\fBNOTE:\fR Drain a node before changing the count of records with \fBFile\fR
parameters (e.g. if you want to add or remove GPUs from a node's configuration).
Failure to do so will result in any job using those GRES being aborted.

.TP
\fBFlags\fR
Optional flags that can be specified to change configured behavior of the GRES.

Allowed values at present are:
.RS
.TP 20
\fBCountOnly\fR
Do not attempt to load a plugin, as this GRES will only be used to track counts
of GRES used. This avoids attempting to load a non\-existent plugin, which can
affect filesystems with high latency metadata operations for non\-existent files.
.RE
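
A minimal sketch, assuming a hypothetical count\-only resource named
"bandwidth" that has no plugin and no device files (the name and count are
illustrative, and the name would also need to appear in \fBGresTypes\fR in
\fIslurm.conf\fR):
.br
# Hypothetical GRES tracked purely by count
.br
Name=bandwidth Count=20 Flags=CountOnly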

.TP
\fBLinks\fR
A comma\-delimited list of numbers identifying the number of connections
between this device and other devices to allow coscheduling of
better connected devices.
This is an ordered list in which the number of connections this specific
device has to device number 0 would be in the first position, the number of
connections it has to device number 1 in the second position, etc.
A \-1 indicates the device itself and a 0 indicates no connection.
If specified, then this line can only contain a single GRES device (i.e. can
only contain a single file via \fBFile\fR).


This is an optional value and is usually automatically determined if
\fBAutoDetect\fR is enabled.
A typical use case would be to identify GPUs having NVLink connectivity.
Note that for GPUs, the minor number assigned by the OS and used in the device
file (i.e. the X in \fI/dev/nvidiaX\fR) is not necessarily the same as the
device number/index. The device number is created by sorting the GPUs by PCI bus
ID and then numbering them starting from the smallest bus ID.
See \fIhttps://slurm.schedmd.com/gres.html#GPU_Management\fR
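
As an illustration of the ordered list, consider a hypothetical node with four
GPUs in which devices 0 and 1 share two links, devices 2 and 3 share two links,
and the two pairs have no direct connection to each other.
This sketch also assumes that the device numbers happen to match the minor
numbers in the device file names, which is not guaranteed:
.br
# Hypothetical topology; each line describes a single device
.br
Name=gpu File=/dev/nvidia0 Links=\-1,2,0,0
.br
Name=gpu File=/dev/nvidia1 Links=2,\-1,0,0
.br
Name=gpu File=/dev/nvidia2 Links=0,0,\-1,2
.br
Name=gpu File=/dev/nvidia3 Links=0,0,2,\-1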

.TP
\fBName\fR
Name of the generic resource. Any desired name may be used.
The name must match a value in \fBGresTypes\fR in \fIslurm.conf\fR.
Each generic resource has an optional plugin which can provide
resource\-specific functionality.
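
A minimal sketch of the required match, assuming a hypothetical node named
tux0 with two GPUs.
The corresponding \fIslurm.conf\fR settings are shown as comments for
reference only, with other node parameters omitted:
.br
# slurm.conf (assumed): GresTypes=gpu
.br
# slurm.conf (assumed): NodeName=tux0 Gres=gpu:2
.br
Name=gpu File=/dev/nvidia[0\-1]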

.TP
\fBNodeName\fR
An optional NodeName specification can be used to permit one gres.conf file to
be used for all compute nodes in a cluster by specifying the node(s) that each
line should apply to.
The NodeName specification can use a Slurm hostlist specification as shown in
the example below.

.TP
\fBType\fR
An optional arbitrary string identifying the type of device.
For example, this might be used to identify a specific model of GPU, which users
can then specify in a job request.
If \fBType\fR is specified, then \fBCount\fR is limited in size (currently 1024).

.SH "EXAMPLES"
.LP
.br
##################################################################
.br
# Slurm's Generic Resource (GRES) configuration file
.br
# Define GPU devices with MPS support
.br
##################################################################
.br
AutoDetect=nvml
.br
Name=gpu Type=gtx560 File=/dev/nvidia0 COREs=0,1
.br
Name=gpu Type=tesla  File=/dev/nvidia1 COREs=2,3
.br
Name=mps Count=100 File=/dev/nvidia0 COREs=0,1
.br
Name=mps Count=100  File=/dev/nvidia1 COREs=2,3

.LP
.br
##################################################################
.br
# Slurm's Generic Resource (GRES) configuration file
.br
# Overwrite system defaults and explicitly configure three GPUs
.br
##################################################################
.br
Name=gpu Type=tesla File=/dev/nvidia[0\-1] COREs=0,1
.br
# Name=gpu Type=tesla  File=/dev/nvidia[2\-3] COREs=2,3
.br
.br
## Explicitly specify devices on nodes tux0\-tux15
.br
# NodeName=tux[0\-15]  Name=gpu File=/dev/nvidia[0\-3]
.br
# NOTE: tux3 nvidia1 device is out of service
.br
NodeName=tux[0\-2]  Name=gpu File=/dev/nvidia[0\-3]
.br
NodeName=tux3  Name=gpu File=/dev/nvidia[0,2\-3]
.br
NodeName=tux[4\-15]  Name=gpu File=/dev/nvidia[0\-3]
.br

.LP
.br
##################################################################
.br
# Slurm's Generic Resource (GRES) configuration file
.br
# Use NVML to gather GPU configuration information
.br
# Information about all other GRES gathered from slurm.conf
.br
##################################################################
.br
AutoDetect=nvml

.SH "COPYING"
Copyright (C) 2010 The Regents of the University of California.
Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
.br
Copyright (C) 2010\-2019 SchedMD LLC.
.LP
This file is part of Slurm, a resource management program.
For details, see <https://slurm.schedmd.com/>.
.LP
Slurm is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your option)
any later version.
.LP
Slurm is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more
details.

.SH "SEE ALSO"
.LP
\fBslurm.conf\fR(5)
