sstat: Quick Insights into Slurm Jobs


A pragmatic guide to using sstat to view information about running Slurm jobs. Start with simple commands, then mix in parsable output and custom formats as you grow more confident.

1) Quick start: what sstat does

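  • sstat displays status information for the running steps of one or more Slurm jobs.
  • It can show identifying fields like JobID and Nodelist, or resource metrics like CPU time and memory usage (for example AveCPU and MaxRSS). It does not report job state; that comes from squeue or sacct.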
  • Useful in scripts when you want to automate monitoring of a set of jobs.

Typical basic usage:

sstat 12345
sstat -j 12345,12346

2) Core flags (the essentials)

  • -j, --jobs: specify one or more jobs or job steps (comma-separated), for example 12345 or 12345.batch.
  • -p, --parsable: separate fields with a pipe (|) and end each line with one, which is convenient for parsing in scripts. -P, --parsable2 does the same without the trailing pipe.
  • -o, --format: select a custom list of fields to display.
  • -e, --helpformat: show the list of available fields you can request.

Examples:

# Basic status for a single job
sstat -j 67890

# Parsable output for multiple jobs
sstat -p -j 11111,22222

# Custom fields: JobID, AveCPU, AveVMSize
sstat -p -j 33333 -o JobID,AveCPU,AveVMSize
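Pipe-delimited output is easiest to consume when a script looks columns up by header name rather than by position. The following is a minimal sketch, not an official Slurm utility: pick_column is a hypothetical helper name, and it assumes the first line of its input is the header row that sstat prints by default.

```shell
# Hypothetical helper: print the column named $1 from pipe-delimited text on stdin.
# Assumes the first input line is the header row.
pick_column() {
    awk -F'|' -v want="$1" '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == want) col = i; next }
        col     { print $col }
    '
}

# Intended use on a cluster (commented out so the file can be sourced anywhere):
#   sstat -P -j 33333 -o JobID,AveCPU,AveVMSize | pick_column AveVMSize
```

Because the lookup is by name, the helper keeps working whether the input came from -p (trailing pipe) or -P (no trailing pipe), and it prints nothing if the requested column is absent.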

3) Working with --format and --helpformat

  • The --format option lets you tailor the output to the metrics you care about.
  • Common fields include JobID, AveCPU, AveRSS, MaxRSS, AveVMSize, MaxVMSize, NTasks, and Nodelist. Fields such as State, Elapsed, Timelimit, and NCPUS belong to sacct, not sstat.
  • If you’re unsure what to request, use --helpformat to list all available fields.

# See all available fields
sstat -e

# A typical advanced format
sstat -p -j 44444 -o JobID,AveCPU,AveVMSize,MaxRSS,NTasks

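Note: Available fields depend on the Slurm version and on how accounting data gathering is configured at your site. A field that has no data for a step generally appears empty between its delimiters in parsable mode; a field name that sstat does not recognize is rejected rather than silently skipped.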
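Since the field list varies between installations, a script can check for a field before requesting it. This is a sketch under assumptions: has_field is a hypothetical helper, and it assumes the field listing arrives on stdin as whitespace-separated names, which is how sstat -e output is typically laid out.

```shell
# Hypothetical helper: succeed if field name $1 appears in a
# whitespace-separated field listing read from stdin.
has_field() {
    # One name per line, then an exact whole-line, case-insensitive match.
    tr -s ' \t' '\n\n' | grep -qixF "$1"
}

# Intended use on a cluster (commented out so the file can be sourced anywhere):
#   if sstat -e | has_field MaxRSS; then
#       sstat -P -j 44444 -o JobID,MaxRSS
#   fi
```

The whole-line match matters: a plain substring search for MaxRSS would also hit MaxRSSNode and MaxRSSTask.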

4) Common pitfalls and how to avoid them

  • Pitfall: Running sstat on a machine without the Slurm client tools or a valid Slurm configuration will fail. Ensure you’re on a login node or a node with Slurm client tooling and proper credentials.
    • Fix: Source the correct environment or load the Slurm client package before using sstat.
  • Pitfall: Parsing the default output. By default sstat prints fixed-width, human-oriented columns, and long values can be truncated to fit.
    • Fix: Use -p/--parsable (or -P/--parsable2) together with -o to constrain fields and make parsing predictable.
  • Pitfall: Misunderstanding time formats. Fields like AveCPU or MinCPU are time values that require interpretation (e.g., D-HH:MM:SS).
    • Fix: --helpformat only lists field names; check the sstat man page for how each field is populated and displayed.
  • Pitfall: Working with a long list of jobs. Parsable mode helps, but extremely large outputs can still be unwieldy.
    • Fix: Narrow the list passed to --jobs first, or script to process per-job data in chunks.
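For the time-format pitfall, a small converter makes CPU-time values comparable as plain numbers. This is a sketch, assuming values shaped like D-HH:MM:SS, HH:MM:SS, or MM:SS with an optional fractional part; to_seconds is a hypothetical helper name, and fractions are truncated rather than rounded.

```shell
# Hypothetical helper: convert [D-]HH:MM:SS or MM:SS (optional .fff) to whole seconds.
to_seconds() {
    printf '%s\n' "$1" | awk '
    {
        days = 0; t = $0
        # Split off a leading day count, if present.
        if (index(t, "-") > 0) { split(t, d, "-"); days = d[1]; t = d[2] }
        # Fold the remaining colon-separated parts base-60, left to right.
        n = split(t, p, ":")
        secs = 0
        for (i = 1; i <= n; i++) secs = secs * 60 + p[i]
        printf "%d\n", days * 86400 + secs
    }'
}
```

For example, to_seconds 1-02:03:04 prints 93784, and to_seconds 00:01:30 prints 90.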

5) Common workflow snippets

Quick check: status for a few recent jobs

sstat -j 10001,10002,10003

Parsable with a focused set of fields

sstat -p -j 10001 -o JobID,AveCPU,AveVMSize

See what fields are available

sstat -e

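Combine with other Slurm tools (example: report only on your jobs that squeue lists as running)

This is a crude example; adapt to your environment. sstat has no State field, so the state filter is done by squeue.

sstat -a -P -j "$(squeue -h -u "$USER" -t RUNNING -o %A | paste -sd, -)" -o JobID,AveCPU,MaxRSS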

6) When to use sstat vs. other Slurm tools

  • sstat is ideal when you want a quick human-friendly or script-friendly view of resource usage for running jobs.
  • For finished jobs, or for fields such as State, Elapsed, and Timelimit, the Slurm accounting tool sacct is more appropriate.
  • For live dashboards, combine sstat with a monitoring script and a lightweight terminal-based visualization.
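A monitoring script of the kind mentioned above can be as small as a timestamped polling loop. The following is a rough sketch, not a recommended production monitor: poll_job and the SSTAT_CMD override are hypothetical names introduced here (SSTAT_CMD exists only so the loop can be tried without a cluster), and the field list is just an example.

```shell
# Hypothetical polling helper.
#   $1 = job or job step, $2 = number of samples (default 3),
#   $3 = seconds between samples (default 30).
# SSTAT_CMD is an illustrative override hook, not a Slurm setting.
poll_job() {
    pj_job=$1
    pj_count=${2:-3}
    pj_interval=${3:-30}
    pj_i=0
    while [ "$pj_i" -lt "$pj_count" ]; do
        # -n drops the header; each output line is prefixed with the sample time.
        ${SSTAT_CMD:-sstat} -n -P -j "$pj_job" -o JobID,AveCPU,MaxRSS |
            awk -v ts="$(date +%H:%M:%S)" '{ print ts "|" $0 }'
        pj_i=$((pj_i + 1))
        if [ "$pj_i" -lt "$pj_count" ]; then
            sleep "$pj_interval"
        fi
    done
}

# Intended use on a cluster (commented out so the file can be sourced anywhere):
#   poll_job 10001.batch 10 60 >> job_10001_usage.log
```

Keep the interval generous: every sample is a query against the Slurm daemons, so a tight loop over many jobs adds load for everyone on the cluster.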

7) Quick reference

  • Basic status: sstat -j <job_id>
  • Parsable multi-job: sstat -p -j <job_id1>,<job_id2>
  • Custom fields: sstat -p -j <job_id> -o JobID,AveCPU,AveVMSize
  • Help (fields): sstat -e

8) Further reading

Enjoy quicker insight into your jobs with focused, repeatable sstat commands.
