.SH "DESCRIPTION"
\fBsacctmgr\fR is used to view or modify Slurm account information.
The account information is maintained within a database with the interface
being provided by \fBslurmdbd\fR (Slurm Database daemon).
This database can serve as a central storehouse of user and
computer information for multiple computers at a single site.
Slurm account information is recorded based upon four parameters
that form what is referred to as an \fIassociation\fR.
These parameters are \fIuser\fR, \fIcluster\fR, \fIpartition\fR, and
\fIaccount\fR. \fIuser\fR is the login name.
\fIcluster\fR is the name of a Slurm managed cluster as specified by
the \fIClusterName\fR parameter in the \fIslurm.conf\fR configuration file.
\fIpartition\fR is the name of a Slurm partition on that cluster.
\fIaccount\fR is the bank account for a job.
The intended mode of operation is to initiate the \fBsacctmgr\fR command,
add, delete, modify, and/or list \fIassociation\fR records, then
commit the changes and exit.
.TP "7"
\f3Note: \fP\c
The contents of Slurm's database are maintained in lower case. This may
result in some \f3sacctmgr\fP output differing from that of other Slurm
commands.
.SH "OPTIONS"
.TP
\fB\-h\fR, \fB\-\-help\fR
Print a help message describing the usage of \fBsacctmgr\fR.
This is equivalent to the \fBhelp\fR command.
.TP
\fB\-i\fR, \fB\-\-immediate\fR
Commit changes immediately without asking for confirmation.
.TP
\fB\-n\fR, \fB\-\-noheader\fR
No header will be added to the beginning of the output.
.TP
\fB\-p\fR, \fB\-\-parsable\fR
Output will be '|' delimited with a '|' at the end.
.TP
\fB\-P\fR, \fB\-\-parsable2\fR
Output will be '|' delimited without a '|' at the end.
.TP
\fB\-Q\fR, \fB\-\-quiet\fR
Print no messages other than error messages.
This is equivalent to the \fBquiet\fR command.
.TP
\fB\-v\fR, \fB\-\-verbose\fR
Enable detailed logging.
This is equivalent to the \fBverbose\fR command.
.TP
\fB\-V\fR, \fB\-\-version\fR
Display version number.
This is equivalent to the \fBversion\fR command.
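.LP
The '|' delimited output produced by \fB\-P\fR is convenient for scripts.
The following is a minimal sketch, not part of sacctmgr itself: the sample
records are hypothetical stand\-ins for real output, and the helper name
parse_assoc is an assumption made for illustration.

```shell
# Hypothetical sample standing in for the output of
#   sacctmgr -n -P list associations format=Account,Cluster,User,Fairshare
# (fields separated by '|', no trailing '|', no header because of -n).
sample='physics|tux|adam|10
chemistry|tux|brian|1'

# Split each record on '|' and print the user, account and fairshare fields.
parse_assoc() {
    awk -F'|' '{ printf "%s account=%s fairshare=%s\n", $3, $1, $4 }'
}

result=$(printf '%s\n' "$sample" | parse_assoc)
printf '%s\n' "$result"
```

With \fB\-p\fR instead of \fB\-P\fR each record ends in an extra '|', so a
splitter like this one would see one additional empty field per line.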
.SH "COMMANDS"
.TP
\fBadd\fR <\fIENTITY\fR> <\fISPECS\fR>
Add an entity.
Identical to the \fBcreate\fR command.
.TP
\fBassociations\fR
Use with show or list to display associations with the entity.
.TP
\fBclear\fR stats
Clear the server statistics.
.TP
\fBcreate\fR <\fIENTITY\fR> <\fISPECS\fR>
Add an entity.
Identical to the \fBadd\fR command.
.TP
\fBdelete\fR <\fIENTITY\fR> where <\fISPECS\fR>
Delete the specified entities.
.TP
\fBdump\fR <\fIENTITY\fR> <\fIFile=FILENAME\fR>
Dump cluster data to the specified file.
If the filename is not specified, clustername.cfg is used by default.
.TP
\fBhelp\fP
Display a description of sacctmgr options and commands.
.TP
\fBlist\fR <\fIENTITY\fR> [<\fISPECS\fR>]
Display information about the specified entity.
By default, all entries are displayed; you can narrow results by
specifying SPECS in your query.
Identical to the \fBshow\fR command.
.TP
\fBload\fR <\fIFILENAME\fR>
Load cluster data from the specified file. This is a configuration file
generated by running the sacctmgr dump command. This command does
not load archive data; see the sacctmgr archive load option instead.
.TP
\fBshow\fR <\fIENTITY\fR> [<\fISPECS\fR>]
Display information about the specified entity.
By default, all entries are displayed; you can narrow results by
specifying SPECS in your query.
Identical to the \fBlist\fR command.
.TP
\fBshutdown\fR
Shut down the server.
.TP
\fBversion\fP
Display the version number of sacctmgr.
.SH "INTERACTIVE COMMANDS"
\fBNOTE:\fP
All commands listed below can be used in the interactive mode, but \fINOT\fP
on the initial command line.
.TP
\fBexit\fP
Terminate sacctmgr interactive mode.
Identical to the \fBquit\fR command.
.TP
\fBquiet\fP
Print no messages other than error messages.
.TP
\fBquit\fP
Terminate the execution of sacctmgr interactive mode.
Identical to the \fBexit\fR command.
.TP
\fBverbose\fP
Enable detailed logging.
This includes time\-stamps on data structures, record counts, etc.
This is an independent command with no options meant for use in interactive mode.
.TP
\fB!!\fP
Repeat the last command.
.SH "ENTITIES"
.TP
\fIaccount\fP
A bank account, typically specified at job submit time using the
\fI\-\-account=\fR option.
These may be arranged in a hierarchical fashion, for example
accounts \fIchemistry\fR and \fIphysics\fR may be children of
the account \fIscience\fR.
.TP
\fIcluster\fP
The \fIClusterName\fR parameter in the \fIslurm.conf\fR configuration
file, used to differentiate accounts on different machines.
.TP
\fIconfiguration\fP
Used only with the \fIlist\fR or \fIshow\fR command to report current
system configuration.
.TP
\fIcoordinator\fR
A specially privileged user, usually an account manager, who can add
users or sub accounts to the account they are coordinator over. This
should be a trusted person since they can change limits on account and
user associations inside their realm.
.TP
\fIevent\fR
Events like downed or draining nodes on clusters.
.TP
\fIfederation\fP
A group of clusters that work together to schedule jobs.
.TP
\fIjob\fR
Used to modify specific fields of a job: Derived Exit Code, the Comment
String, or wckey.
.TP
\fIqos\fR
Quality of Service.
.TP
\fIResource\fP
Software resources for the system. These are software licenses shared
among clusters.
.TP
\fIRunawayJobs\fR
Used only with the \fIlist\fR or \fIshow\fR command to report current
jobs that have been orphaned on the local cluster and are now runaway.
If there are jobs in this state it will also give you an option to "fix" them.
NOTE: You must have an \fBAdminLevel\fR of at least \fBOperator\fR to
perform this.
.TP
\fIstats\fR
Used with \fBlist\fR or \fBshow\fR command to view server statistics.
Accepts optional argument of \fBave_time\fR or \fBtotal_time\fR to sort on those
fields. By default, sorts on increasing RPC count field.
.LP
\fBNOTE:\fR The group limits (GrpJobs, GrpTRES, etc.) are tested when a job is
being considered for being allocated resources.
If starting a job would cause any of its group limits to be exceeded,
that job will not be considered for scheduling even if that job might preempt
other jobs which would release sufficient group resources for the pending
job to be initiated.
.TP
\fIDefaultQOS\fP=<default qos>
The default QOS this association and its children should have.
This is overridden if set directly on a user.
To clear a previously set value use the modify command with a new value of \-1.
.TP
\fIFairshare\fP=<fairshare number | parent>
Number used in conjunction with other accounts to determine job priority.
Can also be the string \fIparent\fR; when used on a user this means that
the parent association is used for fairshare. If Fairshare=parent is set
on an account, that account's children will be effectively reparented for
fairshare calculations to the first parent of their parent that is not
Fairshare=parent. Limits remain the same; only the fairshare value is
affected.
To clear a previously set value use the modify command with a new
value of \-1.
.TP
\fIGraceTime\fP=<preemption grace time in seconds>
Specifies, in units of seconds, the preemption grace time
to be extended to a job which has been
selected for preemption.
The default value is zero; no preemption grace time is allowed on
this QOS.
NOTE: This value is only meaningful for QOS PreemptMode=CANCEL.
.TP
\fIGrpTRESMins\fP=<TRES=max TRES minutes,...>
The total number of TRES minutes that can possibly be used by past,
present and future jobs running from this association and its children.
To clear a previously set value use the modify command with a new
value of \-1 for each TRES id.
NOTE: This limit is not enforced if set on the root
association of a cluster. So even though it may appear in sacctmgr
output, it will not be enforced.
ALSO NOTE: This limit only applies when using the Priority Multifactor
plugin. The time is decayed using the value of PriorityDecayHalfLife
or PriorityUsageResetPeriod as set in the slurm.conf.
When this limit is reached all associated jobs running will be killed and all
future jobs submitted with associations in the group will be delayed until
they are able to run inside the limit.
.TP
\fIGrpTRESRunMins\fP=<TRES=max TRES run minutes,...>
Used to limit the combined total number of TRES minutes used by all
jobs running with this association and its children. This takes into
consideration the time limit of running jobs and consumes it; if the limit
is reached no new jobs are started until other jobs finish to allow
time to free up.
.TP
\fIGrpTRES\fP=<TRES=max TRES,...>
Maximum number of TRES running jobs are able to be allocated in aggregate for
this association and all associations which are children of this association.
NOTE: This limit only applies fully when using the Select Consumable
Resource plugin.
.TP
\fIGrpJobs\fP=<max jobs>
Maximum number of running jobs in aggregate for
this association and all associations which are children of this association.
To clear a previously set value use the modify command with a new value of \-1.
.TP
\fIGrpJobsAccrue\fP=<max jobs>
Maximum number of pending jobs in aggregate able to accrue age priority for this
association and all associations which are children of this association.
To clear a previously set value use the modify command with a new value of \-1.
.TP
\fIGrpSubmitJobs\fP=<max jobs>
Maximum number of jobs which can be in a pending or running state at any time
in aggregate for this association and all associations which are children of
this association.
To clear a previously set value use the modify command with a new value of \-1.
.TP
\fIGrpWall\fP=<max wall>
Maximum wall clock time running jobs are able to be allocated in aggregate for
this association and all associations which are children of this association.
To clear a previously set value use the modify command with a new value of \-1.
NOTE: This limit is not enforced if set on the root
association of a cluster. So even though it may appear in sacctmgr
output, it will not be enforced.
ALSO NOTE: This limit only applies when using the Priority Multifactor
plugin. The time is decayed using the value of PriorityDecayHalfLife
or PriorityUsageResetPeriod as set in the slurm.conf. When this limit
is reached all associated jobs running will be killed and all future jobs
submitted with associations in the group will be delayed until they are able
to run inside the limit.
.TP
\fIMaxTRESMins\fP=<max TRES minutes>
Maximum number of TRES minutes each job is able to use in this association.
This is overridden if set directly on a user.
Default is the cluster's limit.
To clear a previously set value use the modify command with a new
value of \-1 for each TRES id.
.TP
\fIMaxTRES\fP=<max TRES>
Maximum number of TRES each job is able to use in this association.
This is overridden if set directly on a user.
Default is the cluster's limit.
To clear a previously set value use the modify command with a new
value of \-1 for each TRES id.
.TP
\fIMaxJobsAccrue\fP=<max jobs>
Maximum number of pending jobs able to accrue age priority at any given time for
the given association.
This is overridden if set directly on a user.
Default is the cluster's limit.
To clear a previously set value use the modify command with a new value of \-1.
.TP
\fIMaxSubmitJobs\fP=<max jobs>
Maximum number of jobs which this association can have in a
pending or running state at any time.
Default is the cluster's limit.
To clear a previously set value use the modify command with a new value of \-1.
.TP
\fIMaxWall\fP=<max wall>
Maximum wall clock time each job is able to use in this association.
This is overridden if set directly on a user.
Default is the cluster's limit.
.SH "FLAT FILE DUMP AND LOAD"
To create a file of associations you can run
.br
> sacctmgr dump tux file=tux.cfg
.br
(file=tux.cfg is optional)
.LP
To load a previously created file you can run
.br
> sacctmgr load file=tux.cfg
.LP
Other options for load are:
.br
clean \- delete what was already there and start from scratch with this
information.
.br
Cluster= \- specify a different name for the cluster than that which
is in the file.
.LP
A quick explanation of how the file works.
.LP
Since the associations in the system follow a hierarchy, so does the
file. Anything that is a parent needs to be defined before any
children. The only exception is the understood 'root' account. This
is always a default for any cluster and does not need to be defined.
.LP
To edit/create a file start with a cluster line for the new cluster:
.br
\fBCluster\ \-\ cluster_name:MaxNodesPerJob=15\fP
.LP
Anything included on this line will be the defaults for all
associations on this cluster.
These options are as follows...
.TP
\fIGrpTRES=\fP
Maximum number of TRES running jobs are able to be
allocated in aggregate for this association and all associations which
are children of this association.
.TP
\fIGrpJobs=\fP
Maximum number of running jobs in aggregate for this
association and all associations which are children of this association.
.TP
\fIGrpJobsAccrue=\fP
Maximum number of pending jobs in aggregate able to accrue age priority for this
association and all associations which are children of this association.
.TP
\fIGrpNodes=\fP
Maximum number of nodes running jobs are able to be
allocated in aggregate for this association and all associations which
are children of this association.
.TP
\fIGrpSubmitJobs=\fP
Maximum number of jobs which can be in a pending or
running state at any time in aggregate for this association and all
associations which are children of this association.
.TP
\fIGrpWall=\fP
Maximum wall clock time running jobs are able to be
allocated in aggregate for this association and all associations which
are children of this association.
.TP
\fIFairShare=\fP
Number used in conjunction with other associations to determine job priority.
.TP
\fIMaxJobs=\fP
Maximum number of jobs the children of this association can run.
.TP
\fIMaxNodesPerJob=\fP
Maximum number of nodes per job the children of this association can run.
.TP
\fIMaxWallDurationPerJob=\fP
Maximum time (not related to job size) that jobs of this association's
children can run.
.TP
\fIQOS=\fP
Comma separated list of Quality of Service names (defined in sacctmgr).
.LP
Followed by the accounts you want, in this fashion:
.na
\fBParent\ \-\ root\fP (Defined by default)
.br
\fBAccount\ \-\ cs\fP:MaxNodesPerJob=5:MaxJobs=4:FairShare=399:MaxWallDurationPerJob=40:Description='Computer Science':Organization='LC'
.br
\fBParent\ \-\ cs\fP
.br
\fBAccount\ \-\ test\fP:MaxNodesPerJob=1:MaxJobs=1:FairShare=1:MaxWallDurationPerJob=1:Description='Test Account':Organization='Test'
.ad
.LP
Account options are:
.TP
\fIGrpTRESMins=\fP
Maximum number of TRES minutes running jobs are able to
be allocated in aggregate for this association and all associations
which are children of this association.
.TP
\fIGrpTRESRunMins=\fP
Used to limit the combined total number of TRES minutes used by all
jobs running with this association and its children. This takes into
consideration the time limit of running jobs and consumes it; if the limit
is reached no new jobs are started until other jobs finish to allow
time to free up.
.TP
\fIGrpTRES=\fP
Maximum number of TRES running jobs are able to be
allocated in aggregate for this association and all associations which
are children of this association.
.TP
\fIGrpJobs=\fP
Maximum number of running jobs in aggregate for this
association and all associations which are children of this association.
.TP
\fIGrpJobsAccrue=\fP
Maximum number of pending jobs in aggregate able to accrue age priority for this
association and all associations which are children of this association.
.TP
\fIGrpNodes=\fP
Maximum number of nodes running jobs are able to be
allocated in aggregate for this association and all associations which
are children of this association.
.TP
\fIGrpSubmitJobs=\fP
Maximum number of jobs which can be in a pending or
running state at any time in aggregate for this association and all
associations which are children of this association.
.TP
\fIGrpWall=\fP
Maximum wall clock time running jobs are able to be
allocated in aggregate for this association and all associations which
are children of this association.
.TP
\fIFairShare=\fP
Number used in conjunction with other associations to determine job priority.
.TP
\fIMaxJobs=\fP
Maximum number of jobs the children of this association can run.
.TP
\fIMaxNodesPerJob=\fP
Maximum number of nodes per job the children of this association can run.
.TP
\fIMaxWallDurationPerJob=\fP
Maximum time (not related to job size) that jobs of this account's
children can run.
.TP
\fIOrganization=\fP
Name of organization that owns this account.
.LP
User options are:
.TP
\fIAdminLevel=\fP
Type of admin this user is (Administrator, Operator)
.br
\fBMust be defined on the first occurrence of the user.\fP
.TP
\fICoordinator=\fP
Comma separated list of accounts this user is coordinator over
.br
\fBMust be defined on the first occurrence of the user.\fP
.TP
\fIDefaultAccount=\fP
System wide default account name
.br
\fBMust be defined on the first occurrence of the user.\fP
.TP
\fIFairShare=\fP
Number used in conjunction with other associations to determine job priority.
.TP
\fIMaxJobs=\fP
Maximum number of jobs this user can run.
.TP
\fIMaxNodesPerJob=\fP
Maximum number of nodes per job this user can run.
.TP
\fIMaxWallDurationPerJob=\fP
Maximum time (not related to job size) this user's jobs can run.
.TP
\fIQOS(=,+=,\-=)\fP
Comma separated list of Quality of Service names (defined in sacctmgr).
.SH "ARCHIVE FUNCTIONALITY"
sacctmgr has the capability to archive to a flat file and/or load that
data later if needed. The archiving is usually done by the slurmdbd, and
it is highly recommended you only do it through sacctmgr if you
completely understand what you are doing. See "man slurmdbd" for
slurmdbd options. Loading data into the database can be done from these
files to either view old data or regenerate rolled up data.
.SS archive dump
Dump accounting data to file. Depending on options and slurmdbd
configuration, data may remain in the database or be purged.
This operation cannot be rolled back once executed. If one of the following
options is not specified when sacctmgr is called, the value configured in
slurmdbd.conf is used.
.TP
\fIDirectory=\fP
Directory to store the archive data.
.TP
\fIEvents\fP
Archive Events. If not specified and PurgeEventAfter is set
all event data removed will be lost permanently.
.TP
\fIPurgeJobAfter=\fP
Purge job records older than time stated in months. If you want to purge
on a shorter time period you can include hours, or days behind the numeric
value to get those more frequent purges. (e.g. a value of '12hours' would purge
everything older than 12 hours.)
.TP
\fIPurgeStepAfter=\fP
Purge step records older than time stated in months. If you want to purge
on a shorter time period you can include hours, or days behind the numeric
value to get those more frequent purges. (e.g. a value of '12hours' would purge
everything older than 12 hours.)
.TP
\fIPurgeSuspendAfter=\fP
Purge job suspend records older than time stated in months. If you want to purge
on a shorter time period you can include hours, or days behind the numeric
value to get those more frequent purges. (e.g. a value of '12hours' would purge
everything older than 12 hours.)
.TP
\fIScript=\fP
Run this script instead of the generic form of archive to flat files.
.TP
\fISteps\fP
Archive Steps. If not specified and PurgeStepAfter is set
all step data removed will be lost permanently.
.TP
\fISuspend\fP
Archive Suspend Data. If not specified and PurgeSuspendAfter is set
all suspend data removed will be lost permanently.
.SS archive load
Load previously archived data into the database. The archive file will not
be loaded if the records already exist in the database; therefore, trying to
load an archive file more than once will result in an error. When this data is
again archived and purged from the database, if the old archive file is still in
the directory ArchiveDir, a new archive file will be created (see ArchiveDir in
the slurmdbd.conf man page), so the old file will not be overwritten and these
files will have duplicate records.
.TP
\fIFile=\fP
File to load into database.
.TP
\fIInsert=\fP
SQL to insert directly into the database. This should be used very
cautiously since this writes your SQL directly into the database.
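.LP
The archive dump options described earlier in this section can be combined
on one command line. The sketch below only assembles and prints such a
command rather than running it, since an archive dump cannot be rolled back.
The helper name build_archive_dump, the directory path and the purge ages
are hypothetical values chosen for illustration.

```shell
# Assemble (but do not execute) an archive dump command line.
# Arguments: archive directory, step purge age, suspend purge age.
build_archive_dump() {
    printf 'sacctmgr archive dump Directory=%s Steps Suspend PurgeStepAfter=%s PurgeSuspendAfter=%s\n' "$1" "$2" "$3"
}

# Hypothetical values: archive and purge step records older than 12 hours
# and suspend records older than 2 months (a bare number means months).
cmd=$(build_archive_dump /tmp/slurm_archive 12hours 2)
printf '%s\n' "$cmd"
```

Listing Steps and Suspend alongside the matching purge options means the
purged records are archived first instead of being lost permanently.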
.SH "PERFORMANCE"
.PP
Executing \fBsacctmgr\fR sends a remote procedure call to \fBslurmdbd\fR. If
enough calls from \fBsacctmgr\fR or other Slurm client commands that send remote
procedure calls to the \fBslurmdbd\fR daemon come in at once, it can result in a
degradation of performance of the \fBslurmdbd\fR daemon, possibly resulting in a
denial of service.
.SH "EXAMPLES"
\fBNOTE:\fR There is an order to set up accounting associations.
You must define clusters before you add accounts and you must add accounts
before you can add users.
.eo
.br
-> sacctmgr create cluster tux
.br
-> sacctmgr create account name=science fairshare=50
.br
-> sacctmgr create account name=chemistry parent=science fairshare=30
.br
-> sacctmgr create account name=physics parent=science fairshare=20
.br
-> sacctmgr create user name=adam cluster=tux account=physics fairshare=10
.br
-> sacctmgr delete user name=adam cluster=tux account=physics
.br
-> sacctmgr delete account name=physics cluster=tux
.br
-> sacctmgr modify user where name=adam cluster=tux account=physics set maxjobs=2 maxwall=30:00
.br
-> sacctmgr add user brian account=chemistry
.br
-> sacctmgr list associations cluster=tux format=Account,Cluster,User,Fairshare tree withd
.br
-> sacctmgr list transactions StartTime=11/03\-10:30:00 format=Timestamp,Action,Actor
.br
-> sacctmgr dump cluster=tux file=tux_data_file
.br
-> sacctmgr load tux_data_file
.br
.br
A user's account cannot be changed directly. A new association needs to be
created for the user with the new account. Then the association with the old
account can be deleted.
.br
.br
When modifying an object, the placement of the key word 'set' and the
optional key word 'where' is critical. The examples below illustrate this.
As a rule of thumb, anything you put in front of 'set' will be used as
a quantifier. If you want to put a quantifier after the key word 'set',
you should use the key word 'where'.
.br
.br
wrong-> sacctmgr modify user name=adam set fairshare=10 cluster=tux
.br
.br
This will produce an error as the above line reads modify user adam
set fairshare=10 and cluster=tux.
.br
.br
right-> sacctmgr modify user name=adam cluster=tux set fairshare=10
.br
right-> sacctmgr modify user name=adam set fairshare=10 where cluster=tux
.br
.br
When changing qos for something, only use the '=' operator when you want
to explicitly set the qos to something. In most cases you will want
to use the '+=' or '\-=' operator to either add to or remove from the
existing qos already in place.
.br
.br
If a user already has a qos of normal,standby, whether inherited from a
parent or explicitly set, use qos+=expedite to add expedite to the list
in this fashion:
.br
.br
-> sacctmgr modify user name=adam set qos+=expedite
.br
.br
If you want to add the qos expedite only for a certain account and/or
cluster, specify them on the sacctmgr line:
.br
.br
-> sacctmgr modify user name=adam acct=this cluster=tux set qos+=expedite
.br
.br
The following example shows how to add QOS to user accounts.
List all available QOSs in the cluster.
.br
.br
->sacctmgr show qos format=name
.br
Name
.br
---------
.br
normal
.br
expedite
.br
.br
List all the associations in the cluster.
.br
.br
->sacctmgr show assoc format=cluster,account,qos
.br
Cluster  Account    QOS
.br
-------- ---------- -----
.br
zebra    root       normal
.br
zebra    root       normal
.br
.br
Add the QOS expedite to account G1 and display the result.
Using the operator +=, the QOS is added to the QOS already in place
for this account.
.br
.br
->sacctmgr modify account name=g1 set qos+=expedite
.br
.br
->sacctmgr show assoc format=cluster,account,qos
.br
Cluster  Account    QOS
.br
-------- ---------- -------
.br
zebra    root       normal
.br
zebra    root       normal
.br
zebra    g          normal
.br
zebra    g1         expedite,normal
.br
.br
Now set the QOS expedite as the only QOS for the account G and display
the result. Using the operator = means that expedite is the only QOS
usable by account G.
.br
.br
->sacctmgr modify account name=G set qos=expedite
.br
.br
->sacctmgr show assoc format=cluster,account,user,qos
.br
Cluster  Account    QOS
.br
-------- ---------- -----
.br
zebra    root       normal
.br
zebra    root       normal
.br
zebra    g          expedite
.br
zebra    g1         expedite,normal
.br
.br
If a new account is added under the account G it will inherit the
QOS expedite and it will not have access to QOS normal.
.br
.br
->sacctmgr add account banana parent=G
.br
zebra    banana     expedite
.br
zebra    g1         expedite,normal
.br
.br
An example of listing trackable resources:
.br
.br
->sacctmgr show tres
.br
Type       Name              ID
.br
---------- ----------------- --------
.br
cpu                          1
.br
mem                          2
.br
energy                       3
.br
node                         4
.br
billing                      5
.br
gres       gpu:tesla         1001
.br
license    vcs               1002
.br
bb         cray              1003
.br
.ec
.SH "COPYING"
Copyright (C) 2008\-2010 Lawrence Livermore National Security.
Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
.br
Copyright (C) 2010\-2016 SchedMD LLC.
.LP
This file is part of Slurm, a resource management program.
For details, see <https://slurm.schedmd.com/>.
.LP
Slurm is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your option)
any later version.
.LP
Slurm is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
details.