cpu vs largemem

For jobs that require a lot of memory and only a small number of cores, Vega provides a largemem partition.

The partitions are specified with the following data:

| Physical partition | Slurm partition | Nodes | Threads per node (Slurm CPUs**) | Memory per node | Threads/memory | Slurm TRESBillingWeights parameter |
|---|---|---|---|---|---|---|
| CPU | cpu, longcpu* | 768 | 256 | 256 GB | 1 CPU / 1 GB | CPU=1,Mem=1G |
| CPU LM | largemem | 192 | 256 | 1024 GB | 1 CPU / 4 GB | CPU=1,Mem=0.25G |

* The cpu and longcpu Slurm partitions also include nodes from the physical CPU LM partition.

** Hyperthreading is enabled, so 1 Slurm CPU = 1 thread. For billing, 1 physical CPU core has 2 threads (Slurm CPU-hours are divided by 2 to get core-hours).
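The configured weights can be inspected on the login nodes with standard Slurm commands; a minimal sketch, assuming TRESBillingWeights is set on the partitions as listed in the table above:

scontrol show partition cpu | grep -o 'TRESBillingWeights=[^ ]*'        # expected to show CPU=1,Mem=1G
scontrol show partition largemem | grep -o 'TRESBillingWeights=[^ ]*'   # expected to show CPU=1,Mem=0.25G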

For a job with the following requirements:

Number of threads: 12
Amount of memory: 32GB
Time: 1h

An SBATCH example for the cpu partition is:

#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --partition=cpu
#SBATCH --cpus-per-task=12
#SBATCH --mem=32GB
#SBATCH --time=01:00:00
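The script can be saved to a file and submitted with sbatch; the file name my_job.sh below is just an illustrative placeholder:

sbatch my_job.sh    # submit the job script to the cpu partition
squeue -u $USER     # check the state of your submitted jobs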

Because the weighted memory (32 GB × 1) is larger than the 12 requested Slurm CPUs, memory dominates the billing on the cpu partition: 32 divided by 2 (threads per core) gives 16 core-hours for the one-hour job.

The SBATCH example for the largemem partition is identical except for the partition parameter, #SBATCH --partition=largemem:

#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --partition=largemem
#SBATCH --cpus-per-task=12
#SBATCH --mem=32GB
#SBATCH --time=01:00:00
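Before submitting, the state of the largemem nodes can be checked with a standard Slurm query, for example:

sinfo -p largemem    # node availability and state in the largemem partition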

On the largemem partition the weighted memory (32 GB × 0.25 = 8) is smaller than the 12 requested Slurm CPUs, so the CPU request dominates the billing: 12 divided by 2 (threads per core) gives 6 core-hours for the one-hour job.

Submitting such jobs to largemem therefore obtains the requested resources sooner, and in terms of billing such jobs cost less. Users are advised to use the largemem partition for jobs with a similarly high memory-to-core ratio.
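The arithmetic from the two examples can be summarised in a short helper script. This is a minimal sketch for estimating costs, not an official Vega tool; it assumes, as the examples above show, that the billed amount is the larger of the requested Slurm CPUs and the weighted memory, divided by 2 to convert threads to core-hours.

#!/bin/bash
# Core-hour estimate for the 12-thread / 32 GB / 1 h example job (sketch only).
cpus=12       # --cpus-per-task
mem_gb=32     # --mem in GB
hours=1       # --time

core_hours() {   # $1 = memory billing weight per GB (1 for cpu, 0.25 for largemem)
    awk -v c="$cpus" -v m="$mem_gb" -v w="$1" -v h="$hours" \
        'BEGIN { wm = m * w; b = (c > wm) ? c : wm; print b * h / 2 }'
}

echo "cpu partition:      $(core_hours 1) core-hours"      # 32 / 2 = 16
echo "largemem partition: $(core_hours 0.25) core-hours"   # 12 / 2 = 6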

More information about billing is available on the Billing page.