slurm-tracker

Tracks Slurm jobs and uploads them as usage data to the Parallel Works ACTIVATE platform.

Overview

The ACTIVATE platform gives users access to cloud and on-prem compute resources. It includes a budget system for tracking arbitrary usage types such as CORE_HOUR, MEMORY, STORAGE, and LICENSE_HOURS.

This program is a custom integration that collects Slurm job statistics (currently core hours, with room to add other metrics) and posts them as usage events to the ACTIVATE platform's budget system.

How It Works

Query Slurm - The program runs sacct to fetch jobs from the past N minutes (configurable via --lookback)
Track Job State - Uses a SQLite database to track which jobs have been reported and how much time has already been reported for running jobs
Calculate Core Hours - For each job, calculates core hours based on allocated CPUs × elapsed time
Map to Allocations - Maps Slurm accounts to ACTIVATE allocations and partitions to SKU codes using a config file
Post Usage Events - Creates usage events in the ACTIVATE platform via API
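
The first and third steps can be sketched as follows. This is a minimal illustration, not the program's actual code: the job struct, parseLine, coreHours, and the sacct field selection in the comment are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// job holds the sacct fields this sketch needs. The field selection is an
// assumption, matching a command such as:
//
//	sacct --allusers --allocations --parsable2 --noheader \
//	      --starttime=now-5minutes \
//	      --format=JobID,User,Account,Partition,AllocCPUS,ElapsedRaw,State
type job struct {
	ID, User, Account, Partition, State string
	AllocCPUs                           int
	ElapsedSeconds                      int64
}

// parseLine converts one pipe-delimited sacct line into a job.
func parseLine(line string) (job, error) {
	f := strings.Split(strings.TrimSpace(line), "|")
	if len(f) != 7 {
		return job{}, fmt.Errorf("expected 7 fields, got %d", len(f))
	}
	cpus, err := strconv.Atoi(f[4])
	if err != nil {
		return job{}, fmt.Errorf("AllocCPUS: %w", err)
	}
	elapsed, err := strconv.ParseInt(f[5], 10, 64)
	if err != nil {
		return job{}, fmt.Errorf("ElapsedRaw: %w", err)
	}
	return job{ID: f[0], User: f[1], Account: f[2], Partition: f[3],
		State: f[6], AllocCPUs: cpus, ElapsedSeconds: elapsed}, nil
}

// coreHours is allocated CPUs × elapsed time, expressed in hours.
func coreHours(j job) float64 {
	return float64(j.AllocCPUs) * float64(j.ElapsedSeconds) / 3600
}

func main() {
	// 4 CPUs held for 5400 s (90 minutes).
	j, err := parseLine("12345|alice|research-team|gpu|4|5400|RUNNING")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("job %s: %.2f core hours\n", j.ID, coreHours(j))
}
```

Working in whole seconds keeps the arithmetic simple; the conversion to hours happens once, at the end.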

Incremental Reporting

The program supports incremental reporting for long-running jobs:

Running jobs are reported in chunks (elapsed time since last report)
Completed jobs report any remaining unreported time
The SQLite state file tracks what has already been reported to avoid duplicates
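
A rough sketch of the chunking logic, assuming the state stores the number of elapsed seconds already reported per job. The jobState type and nextChunk function are hypothetical, and an in-memory map stands in for the SQLite state file.

```go
package main

import "fmt"

// jobState is a hypothetical record of what has been reported for one job.
// The real program keeps this in SQLite; a struct stands in for it here.
type jobState struct {
	ReportedSeconds int64 // elapsed seconds already posted as usage
	Done            bool  // set once the job's final chunk has been posted
}

// nextChunk returns the seconds not yet reported and the updated state.
func nextChunk(prev jobState, elapsedSeconds int64, finished bool) (int64, jobState) {
	if prev.Done {
		return 0, prev // fully reported; never post again
	}
	delta := elapsedSeconds - prev.ReportedSeconds
	if delta < 0 {
		delta = 0 // guard against a shrinking elapsed value
	}
	next := jobState{ReportedSeconds: prev.ReportedSeconds + delta, Done: finished}
	return delta, next
}

func main() {
	states := map[string]jobState{} // keyed by Slurm job ID

	// Three successive runs observe job 12345: running, running, completed.
	observations := []struct {
		elapsed  int64
		finished bool
	}{{300, false}, {600, false}, {720, true}}

	for _, o := range observations {
		chunk, next := nextChunk(states["12345"], o.elapsed, o.finished)
		states["12345"] = next
		fmt.Printf("report %d s (total reported %d s)\n", chunk, next.ReportedSeconds)
	}
}
```

Because each run reports only the difference from the stored total, rerunning the program over an overlapping lookback window does not double-count a job.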

Prerequisites

API Key: You must have a valid ACTIVATE API key to post usage events. See Getting an API Key for instructions.
Slurm: The sacct command must be available and accessible.
Go: Go 1.21+ for building from source.

Installation

Install directly using go install:

go install github.com/parallelworks/slurm-tracker/cmd/slurm-tracker@latest

This downloads and installs the slurm-tracker binary into $GOBIN if it is set, otherwise into $GOPATH/bin ($HOME/go/bin by default).

Alternatively, clone the repository and build from source:

git clone https://github.com/parallelworks/slurm-tracker.git
cd slurm-tracker
go build -o slurm-tracker ./cmd/slurm-tracker

Getting an API Key

To use this program, you need an API key from the ACTIVATE platform.

For detailed instructions on generating an API key, see the ACTIVATE API Key Documentation.

Configuration

Config File (config.json)

Create a config.json file (see config.sample.json for reference):

{
  "defaultSku": "standard-compute",
  "defaultAllocation": "default-allocation-oid",
  "partition": [
    {
      "name": "compute",
      "sku": "standard-compute"
    },
    {
      "name": "gpu",
      "sku": "gpu-compute"
    }
  ],
  "account": [
    {
      "name": "research-team",
      "allocation": "allocation-oid-12345"
    },
    {
      "name": "engineering",
      "allocation": "allocation-oid-67890"
    }
  ]
}

Field	Description
`defaultSku`	Fallback SKU code when no partition mapping matches
`defaultAllocation`	Fallback allocation when no account mapping matches
`partition`	Maps Slurm partition names to SKU codes
`account`	Maps Slurm account names to ACTIVATE allocation OIDs
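
The fallback behavior described in the table can be sketched like this. The JSON keys come from the sample config above; the Go type names, the skuFor and allocationFor helpers, and the use of exact string matching are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type partitionMapping struct {
	Name string `json:"name"`
	SKU  string `json:"sku"`
}

type accountMapping struct {
	Name       string `json:"name"`
	Allocation string `json:"allocation"`
}

// config mirrors the keys of config.json.
type config struct {
	DefaultSKU        string             `json:"defaultSku"`
	DefaultAllocation string             `json:"defaultAllocation"`
	Partitions        []partitionMapping `json:"partition"`
	Accounts          []accountMapping   `json:"account"`
}

// skuFor returns the SKU mapped to a partition, or the default SKU.
func (c config) skuFor(partition string) string {
	for _, p := range c.Partitions {
		if p.Name == partition {
			return p.SKU
		}
	}
	return c.DefaultSKU
}

// allocationFor returns the allocation mapped to an account, or the default.
func (c config) allocationFor(account string) string {
	for _, a := range c.Accounts {
		if a.Name == account {
			return a.Allocation
		}
	}
	return c.DefaultAllocation
}

func parseConfig(data []byte) (config, error) {
	var c config
	err := json.Unmarshal(data, &c)
	return c, err
}

const sampleConfig = `{
  "defaultSku": "standard-compute",
  "defaultAllocation": "default-allocation-oid",
  "partition": [{"name": "gpu", "sku": "gpu-compute"}],
  "account": [{"name": "engineering", "allocation": "allocation-oid-67890"}]
}`

func main() {
	cfg, err := parseConfig([]byte(sampleConfig))
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println(cfg.skuFor("gpu"), cfg.skuFor("debug"))
	fmt.Println(cfg.allocationFor("engineering"), cfg.allocationFor("unknown"))
}
```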

Environment Variables

Variable	Required	Description
`PW_API_KEY`	Yes	API key for authenticating with the ACTIVATE platform. See Getting an API Key.
`PW_PLATFORM_HOST`	No	Override the platform API endpoint

Usage

Note: Ensure you have set the PW_API_KEY environment variable before running. See Getting an API Key.

# Set your API key
export PW_API_KEY="your-api-key-here"

# Basic usage
./slurm-tracker --org my-organization

# With custom lookback window (30 minutes)
./slurm-tracker --org my-organization --lookback 30

# Dry run mode (calculates but doesn't post)
./slurm-tracker --org my-organization --dry-run

# Custom config file location
./slurm-tracker --org my-organization --config /path/to/config.json

# Custom state file location
./slurm-tracker --org my-organization --state-file /var/lib/slurm-tracker/states.db

Command Line Flags

Flag	Default	Description
`--org`	(required)	Organization name in ACTIVATE
`--lookback`	`5`	Minutes to look back for jobs
`--dry-run`	`false`	Calculate and log without posting to API
`--api-key`	`$PW_API_KEY`	API key for authentication
`--api-server`	`$PW_PLATFORM_HOST`	Platform API endpoint
`--state-file`	`./slurm_job_states.db`	SQLite database for tracking job states
`--config`	`config.json`	Path to the configuration file

Running as a Cron Job

To continuously track Slurm usage, run the program periodically via cron.

Important: The cron job needs access to the API key. Use one of the following:

Set PW_API_KEY in the crontab environment
Pass it via the --api-key flag
Source it from a secure file

# Set the API key in crontab environment
PW_API_KEY=your-api-key-here

# Run every 5 minutes
*/5 * * * * /path/to/slurm-tracker --org my-organization --config /etc/slurm-tracker/config.json --state-file /var/lib/slurm-tracker/states.db

Alternatively, use a wrapper script that loads the API key from a secure location:

#!/bin/bash
set -euo pipefail
# Load the API key from a file readable only by the service user
PW_API_KEY="$(cat /etc/slurm-tracker/.api-key)"
export PW_API_KEY
/path/to/slurm-tracker --org my-organization --config /etc/slurm-tracker/config.json --state-file /var/lib/slurm-tracker/states.db

Usage Event Data

Each usage event posted to ACTIVATE includes:

Field	Description
`quantity`	Core hours for the reporting period
`startedAt`	Start time of the reporting period
`endedAt`	End time of the reporting period
`customSKUCode`	SKU code (mapped from Slurm partition)
`metadata.jobId`	Slurm job ID
`metadata.user`	User who submitted the job
`metadata.partition`	Slurm partition
`metadata.cluster`	Slurm cluster name
`metadata.account`	Slurm account
`metadata.qos`	Quality of Service
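
For orientation, here is one way such an event could be modeled and serialized. The field names follow the table above; the Go types, the RFC 3339 timestamp encoding, and the example values are assumptions, and the actual wire format accepted by the ACTIVATE API may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// eventMetadata carries the Slurm details attached to each event.
type eventMetadata struct {
	JobID     string `json:"jobId"`
	User      string `json:"user"`
	Partition string `json:"partition"`
	Cluster   string `json:"cluster"`
	Account   string `json:"account"`
	QOS       string `json:"qos"`
}

// usageEvent is a hypothetical model of one posted usage event.
type usageEvent struct {
	Quantity      float64       `json:"quantity"`
	StartedAt     time.Time     `json:"startedAt"`
	EndedAt       time.Time     `json:"endedAt"`
	CustomSKUCode string        `json:"customSKUCode"`
	Metadata      eventMetadata `json:"metadata"`
}

// exampleEvent describes 6 CPUs used for a 5-minute reporting period,
// which is 0.5 core hours. All values are made up.
func exampleEvent() usageEvent {
	start := time.Date(2024, time.January, 1, 10, 0, 0, 0, time.UTC)
	return usageEvent{
		Quantity:      0.5,
		StartedAt:     start,
		EndedAt:       start.Add(5 * time.Minute),
		CustomSKUCode: "gpu-compute",
		Metadata: eventMetadata{
			JobID: "12345", User: "alice", Partition: "gpu",
			Cluster: "cluster1", Account: "research-team", QOS: "normal",
		},
	}
}

// encodeEvent serializes an event to compact JSON.
func encodeEvent(e usageEvent) (string, error) {
	b, err := json.Marshal(e)
	return string(b), err
}

func main() {
	s, err := encodeEvent(exampleEvent())
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Println(s)
}
```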

Extending to Other Metrics

While currently focused on CORE_HOUR, the program can be extended to track other metrics:

Memory Hours: Track memory × time usage
GPU Hours: Track GPU allocation for GPU partitions
Storage: Track job working directory storage usage
License Hours: Track software license usage based on job requirements

To extend, modify the calculateCoreHoursForElapsed function and add appropriate SKU mappings.
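
One possible shape for such an extension is a list of metrics, each pairing a usage type with its own calculation. Everything here is hypothetical: the job fields, the metric type, and the choice of GB-hours as the unit for MEMORY are assumptions, and the signature of the existing calculateCoreHoursForElapsed function is not shown in this README.

```go
package main

import "fmt"

// job carries the per-job inputs a metric might need (hypothetical fields).
type job struct {
	AllocCPUs  int
	AllocMemMB int64
}

// metric pairs a usage type with the function that computes its quantity.
type metric struct {
	UsageType string
	Quantity  func(j job, elapsedSeconds int64) float64
}

// coreHours is allocated CPUs × elapsed hours.
func coreHours(j job, elapsedSeconds int64) float64 {
	return float64(j.AllocCPUs) * float64(elapsedSeconds) / 3600
}

// memoryGBHours is allocated memory in GB × elapsed hours.
func memoryGBHours(j job, elapsedSeconds int64) float64 {
	return float64(j.AllocMemMB) / 1024 * float64(elapsedSeconds) / 3600
}

func main() {
	metrics := []metric{
		{"CORE_HOUR", coreHours},
		{"MEMORY", memoryGBHours},
	}
	// 4 CPUs and 8 GB held for 30 minutes.
	j := job{AllocCPUs: 4, AllocMemMB: 8192}
	for _, m := range metrics {
		fmt.Printf("%s: %.2f\n", m.UsageType, m.Quantity(j, 1800))
	}
}
```

Each new metric would also need its own SKU mapping in the config file so that its events land on the right budget line.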
