VASP Benchmark Results
We have run a number of VASP use cases on HECToR and investigated parallel scaling and the effect of VASP runtime parameters on performance. This page summarises the results to help users select an optimal configuration for their own VASP calculations.
The full set of benchmarking results is also available.
Runtime parameters
Our benchmarks look at the variation of VASP performance as a function of the number of MPI tasks and of the following VASP runtime parameters, which can be set by the user in the INCAR file:
- NPAR - Changes the balance between parallelisation over bands and over plane waves. For exact-exchange calculations this parameter is fixed at the number of MPI tasks.
- NSIM - Changes the number of bands treated simultaneously (NSIM = 1 uses matrix-vector operations; larger values replace them with matrix-matrix operations).
- LPLANE - Changes the parallel decomposition of the 3D FFT. .TRUE. leads to a slab decomposition where two dimensions of the FFT are on the same MPI task (reduces collective communications but can introduce load imbalance or limit parallelism). .FALSE. leads to a pencil decomposition (more collective communications but more potential for parallelism and load-balancing).
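As an illustration, the settings from the best 512-task TiO2 run in the results below correspond to an INCAR fragment like this (treat it as a starting point rather than a universally optimal choice):

```
NPAR   = 32       ! balance between band and plane-wave parallelisation
NSIM   = 1        ! bands treated simultaneously
LPLANE = .FALSE.  ! pencil decomposition of the 3D FFT
```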
We also investigated the effect of underpopulating HECToR compute nodes by using one core per Bulldozer module. This increases the memory and interconnect bandwidth available to each MPI task and also gives each task exclusive access to the module's shared double-width floating-point unit.
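On HECToR's Cray XE6 nodes this kind of underpopulation is requested through the aprun placement flags. A sketch of a job-script line for a half-populated 32-node run (assuming 32 cores per node arranged as four 8-core dies, and an executable named vasp):

```shell
# One MPI task per Bulldozer module (16 of 32 cores per node used)
# -N 16 : MPI tasks per node (half-populated)
# -S 4  : MPI tasks per 8-core die
# -d 2  : stride of 2 cores between tasks
aprun -n 512 -N 16 -S 4 -d 2 ./vasp
```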
Benchmarks
We used the following benchmarks:
- TiO2 Supercell - 750 atoms, pure DFT, Γ-point, 6 electronic minimisation steps.
- LiZnO - 64 atoms, pure DFT, Γ-point, single-point energy.
- LiZnO - 64 atoms, exact-exchange, Γ-point, single-point energy.
Results

In the tables below, Scaling is the speedup in run time relative to the first row of each table.
TiO2 Supercell
| Nodes | MPI Tasks | MPI Tasks per Node | MPI Tasks per Die | Stride | NPAR | LPLANE | NSIM | Time / s | Scaling |
|---|---|---|---|---|---|---|---|---|---|
| 4 | 128 | 32 | 8 | 1 | 8 | .TRUE. | 8 | 1827.8 | 1.00 |
| 8 | 256 | 32 | 8 | 1 | 16 | .TRUE. | 1 | 1050.6 | 1.74 |
| 16 | 512 | 32 | 8 | 1 | 32 | .TRUE. | 8 | 662.4 | 2.76 |
| 32 | 512 | 16 | 4 | 2 | 32 | .FALSE. | 1 | 465.5 | 3.93 |
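The Scaling column is simply the baseline run time divided by each run time. A quick check for the TiO2 runs (assuming scaling is defined relative to the 4-node baseline):

```python
# Speedup of each TiO2 run relative to the 4-node baseline (1827.8 s).
times = [1827.8, 1050.6, 662.4, 465.5]  # Time / s column
scaling = [round(times[0] / t, 2) for t in times]
print(scaling)  # [1.0, 1.74, 2.76, 3.93]
```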
LiZnO Exact Exchange
| Nodes | MPI Tasks | MPI Tasks per Node | MPI Tasks per Die | Stride | NPAR | LPLANE | NSIM | Time / s | Scaling |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 32 | 32 | 8 | 1 | 32 | .TRUE. | 8 | 1357.1 | 1.00 |
| 2 | 64 | 32 | 8 | 1 | 64 | .TRUE. | 1 | 1010.8 | 1.34 |
| 4 | 128 | 32 | 8 | 1 | 128 | .TRUE. | 1 | 967.8 | 1.40 |
| 8 | 128 | 16 | 4 | 2 | 256 | .TRUE. | 1 | 612.3 | 2.22 |
Fe FCC Supercell
| Nodes | MPI Tasks | MPI Tasks per Node | MPI Tasks per Die | Stride | NPAR | LPLANE | NSIM | Time / s | Scaling |
|---|---|---|---|---|---|---|---|---|---|
| 4 | 128 | 32 | 8 | 1 | 16 | .TRUE. | 16 | 1308.8 | 1.00 |
| 8 | 128 | 16 | 4 | 2 | 16 | .TRUE. | 16 | 857.1 | 1.53 |
| 16 | 256 | 16 | 4 | 2 | 16 | .TRUE. | 8 | 622.9 | 2.10 |