Performance

We tested PWmat on a workstation with two GTX980 GPUs. For comparison, we also tested "pwscf", the self-consistent calculation module of Quantum ESPRESSO, on the computing cluster "era" at the Supercomputing Center of the Chinese Academy of Sciences. The CPU in "era" is the Intel Xeon E5-2680 V2 (Ivy Bridge), one of the fastest CPU cores at the time of testing. The results are listed below:

Relaxation calculation of 148 ZnO slab with an attached dye molecule
Quantity | pwscf | PWmat
FFT grid (n1, n2, n3) | (270, 72, 72) | (243, 64, 60)
Energy cutoff (Ry) | 60 | 50
Number of electrons | 1040 | 1040
Time spent (s) | 17580 (SCF not converged after 100 SCF steps) | 20605 (76 atomic relaxation steps, 271 s per step)
Hardware configuration | 80 CPU cores | 2 CPU cores with 2 GPU cards
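
The two runs in this first test do not use identical settings: the energy cutoff is 60 Ry for pwscf and 50 Ry for PWmat, and the FFT grids differ. As a rough illustration of how different the grids are, the sketch below counts the grid points of each. It uses only the grid dimensions from the table; the variable names are illustrative.

```python
from math import prod

# FFT grid dimensions (n1, n2, n3) copied from the table above.
grids = {
    "pwscf": (270, 72, 72),
    "PWmat": (243, 64, 60),
}

# Total number of real-space grid points for each code.
points = {code: prod(dims) for code, dims in grids.items()}

# How many times larger the pwscf grid is than the PWmat grid.
ratio = points["pwscf"] / points["PWmat"]

for code, n in points.items():
    print(f"{code}: {n} grid points")
print(f"pwscf / PWmat grid-point ratio: {ratio:.2f}")
```

The grid-point count is only one ingredient of the cost, so this ratio should not be read as a predicted timing ratio.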

Self-consistent calculation of 256 GaAs
Quantity | pwscf | PWmat
FFT grid (n1, n2, n3) | (100, 100, 50) | (98, 98, 49)
Energy cutoff (Ry) | 25 | 25
Number of electrons | 1024 | 1024
Time spent (s) | 240 | 63
Hardware configuration | 60 CPU cores | 2 CPU cores with 2 GPU cards

Relaxation of 256 GaAs with large initial distortions
Quantity | pwscf | PWmat
FFT grid (n1, n2, n3) | (100, 100, 50) | (98, 98, 49)
Energy cutoff (Ry) | 25 | 25
Number of electrons | 1024 | 1024
Time spent (s) | SCF not converged after 100 SCF iteration steps | 882 (39 atomic relaxation steps, 23 s per step)
Hardware configuration | 60 CPU cores | 2 CPU cores with 2 GPU cards

Molecular dynamics simulations of 256 GaAs
Quantity | pwscf | PWmat
FFT grid (n1, n2, n3) | (100, 100, 50) | (98, 98, 49)
Energy cutoff (Ry) | 25 | 25
Number of electrons | 1024 | 1024
Time spent (s) | 5440 (100 MD steps, 54 s per MD step) | 2000 (100 MD steps, 20 s per MD step)
Hardware configuration | 60 CPU cores | 2 CPU cores with 2 GPU cards

Self-consistent calculation of 512 GaAs
Quantity | pwscf | PWmat
FFT grid (n1, n2, n3) | (100, 100, 100) | (98, 98, 98)
Energy cutoff (Ry) | 25 | 25
Number of electrons | 2048 | 2048
Time spent (s) | 1308 | 239
Hardware configuration | 100 CPU cores | 2 CPU cores with 2 GPU cards

Relaxation of 64 ZnS with amorphous structure
Quantity | pwscf | PWmat
FFT grid (n1, n2, n3) | (108, 108, 108) | (108, 108, 108)
Energy cutoff (Ry) | 60 | 60
Number of electrons | 576 | 576
Time spent (s) | 21182 (41 relaxation steps, 515 s per step) | 16575 (100 relaxation steps, 165 s per step)
Hardware configuration | 60 CPU cores | 2 CPU cores with 2 GPU cards
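
The two ZnS relaxation runs took different numbers of steps (41 for pwscf, 100 for PWmat), so the total times are not directly comparable and the per-step time is the more meaningful figure. The sketch below derives both ratios from the totals and step counts in the table. The helper name `seconds_per_step` is illustrative, and the per-step values computed this way may differ by a second or two from the figures quoted in the table.

```python
# (total wall-clock seconds, number of relaxation steps), copied from the table above.
runs = {
    "pwscf": (21182, 41),
    "PWmat": (16575, 100),
}

def seconds_per_step(total_seconds, steps):
    """Average wall-clock time of one relaxation step."""
    return total_seconds / steps

per_step = {code: seconds_per_step(*run) for code, run in runs.items()}

# Ratio of total run times (misleading here: the step counts differ).
total_ratio = runs["pwscf"][0] / runs["PWmat"][0]

# Ratio of per-step times (the like-for-like comparison).
per_step_ratio = per_step["pwscf"] / per_step["PWmat"]

print(f"total-time ratio (pwscf / PWmat): {total_ratio:.2f}")
print(f"per-step ratio (pwscf / PWmat):   {per_step_ratio:.2f}")
```

Both ratios compare wall-clock time on different hardware (60 CPU cores versus 2 CPU cores with 2 GPU cards); they are not per-core figures.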

Molecular dynamics calculation of 64 ZnS
Quantity | pwscf | PWmat
FFT grid (n1, n2, n3) | (108, 108, 108) | (108, 108, 108)
Energy cutoff (Ry) | 60 | 60
Number of electrons | 576 | 576
Time spent (s) | 3065 (100 MD steps, 31 s per step) | 1757 (100 MD steps, 18 s per step)
Hardware configuration | 80 CPU cores | 2 CPU cores with 2 GPU cards
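
As a compact way to read the results, the sketch below computes the wall-clock ratio (pwscf time divided by PWmat time) for the tests in which both codes completed the same amount of work: the two SCF calculations and the two 100-step MD runs. The relaxation tests are left out because pwscf either did not converge or ran a different number of steps. The timings are copied from the tables in this section; the dictionary layout and names are illustrative.

```python
# (pwscf seconds, PWmat seconds), copied from the tables in this section.
timings = {
    "SCF, 256 GaAs": (240, 63),
    "MD, 256 GaAs (100 steps)": (5440, 2000),
    "SCF, 512 GaAs": (1308, 239),
    "MD, 64 ZnS (100 steps)": (3065, 1757),
}

# Wall-clock ratio: values above 1 mean the PWmat run finished sooner.
speedup = {
    name: t_pwscf / t_pwmat
    for name, (t_pwscf, t_pwmat) in timings.items()
}

for name, ratio in speedup.items():
    print(f"{name}: pwscf / PWmat = {ratio:.1f}")
```

These ratios compare whole runs on different hardware (60 to 100 CPU cores for pwscf, 2 CPU cores with 2 GPU cards for PWmat), so they describe time to solution rather than per-core or per-device efficiency.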