It is clearly desirable to implement an optimisation scheme that allows the approximate bands to be optimised without an explicit S-orthonormalisation.
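For reference, the explicit S-orthonormalisation that such a scheme aims to avoid can be sketched as a Gram-Schmidt sweep under the generalised inner product <x|S|y>. The sketch below is illustrative only. It assumes a small, real, symmetric, positive-definite S stored as nested lists, whereas real plane-wave coefficients are complex. The names s_inner and s_orthonormalise are hypothetical and are not Castep routines.

```python
# Illustrative sketch only: Gram-Schmidt under the inner product <x|S|y>.
# Assumes S is real, symmetric and positive definite (an assumption made
# for brevity; this is not how Castep stores or applies S).

def s_inner(x, y, S):
    """Return x^T S y for real vectors x, y and a real matrix S."""
    n = len(x)
    return sum(x[i] * S[i][j] * y[j] for i in range(n) for j in range(n))

def s_orthonormalise(bands, S):
    """Return vectors spanning the same space with <q_i|S|q_j> = delta_ij."""
    result = []
    for band in bands:
        v = list(band)
        # Remove the S-overlap with every previously accepted vector.
        for q in result:
            overlap = s_inner(q, v, S)
            v = [vi - overlap * qi for vi, qi in zip(v, q)]
        norm = s_inner(v, v, S) ** 0.5
        if norm < 1e-12:
            raise ValueError("bands are linearly dependent")
        result.append([vi / norm for vi in v])
    return result

# Toy data (hypothetical): a 2x2 overlap matrix and two trial bands.
S = [[2.0, 0.5], [0.5, 1.0]]
bands = [[1.0, 0.0], [1.0, 1.0]]
ortho = s_orthonormalise(bands, S)
```

Each band has to be compared against every band before it, so the work grows quadratically with the number of bands. That scaling is the cost an orthonormalisation-free optimiser would avoid.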
To ensure that only the direct effects of the changes to the optimiser were observed, we ran Castep with a fixed density. Typical convergence for a simple magnesium oxide test case using the usual Castep algorithm is:
------------------------------------------------------------------------ <-- SCF
SCF loop      Energy           Fermi          Energy gain         Timer  <-- SCF
                               energy         per atom            (sec)  <-- SCF
------------------------------------------------------------------------ <-- SCF
Initial  -4.95078326E+003  5.20975146E+001                          2.99 <-- SCF
      1  -5.59753549E+003  7.89244217E+000   8.08440297E+001        3.90 <-- SCF
      2  -5.66226988E+003  7.15740116E+000   8.09179761E+000        4.68 <-- SCF
      3  -5.66301246E+003  7.16625993E+000   9.28225593E-002        5.76 <-- SCF
      4  -5.66306881E+003  7.16423308E+000   7.04427727E-003        7.01 <-- SCF
      5  -5.66306893E+003  7.16423173E+000   1.49140438E-005        8.41 <-- SCF
      6  -5.66306893E+003  7.16423137E+000   4.02077714E-007        9.87 <-- SCF
      7  -5.66306893E+003  7.16423137E+000   5.27220802E-008       11.01 <-- SCF
      8  -5.66306893E+003  7.16423137E+000   1.76063159E-009       11.94 <-- SCF
      9  -5.66306893E+003  7.16423137E+000   3.90757352E-010       12.61 <-- SCF
     10  -5.66306893E+003  7.16423137E+000   1.33410476E-011       13.10 <-- SCF
     11  -5.66306893E+003  7.16423137E+000   5.99380402E-012       13.53 <-- SCF
------------------------------------------------------------------------ <-- SCF
Switching to the RMM-DIIS optimiser gave:
------------------------------------------------------------------------ <-- SCF
SCF loop      Energy           Fermi          Energy gain         Timer  <-- SCF
                               energy         per atom            (sec)  <-- SCF
------------------------------------------------------------------------ <-- SCF
Initial  -4.95078326E+003  5.20975146E+001                          2.85 <-- SCF
      1  -5.59753549E+003  7.89244217E+000   8.08440297E+001        3.69 <-- SCF
      2  -5.66226988E+003  7.15740116E+000   8.09179761E+000        4.41 <-- SCF
      3  -5.66301246E+003  7.16625993E+000   9.28225593E-002        5.39 <-- SCF
      4  -5.66306881E+003  7.16423308E+000   7.04427727E-003        6.55 <-- SCF
      5  -5.66306892E+003  7.16423445E+000   1.34939453E-005        7.66 <-- SCF
      6  -5.66306891E+003  7.16423645E+000  -1.18066010E-006        8.76 <-- SCF
      7  -5.66306893E+003  7.16423715E+000   2.90858466E-006       10.06 <-- SCF
      8  -5.66306893E+003  7.16424036E+000   5.70679748E-008       11.11 <-- SCF
      9  -5.66306893E+003  7.16424784E+000  -9.96659399E-008       12.21 <-- SCF
     10  -5.66306890E+003  7.16426887E+000  -3.48059116E-006       12.98 <-- SCF
     11  -5.66306893E+003  7.16499317E+000   3.20356934E-006       13.78 <-- SCF
     12  -5.66306893E+003  7.16435180E+000  -1.03932948E-007       14.46 <-- SCF
     13  -5.66306893E+003  7.16439686E+000  -1.62527990E-007       15.22 <-- SCF
     14  -5.66306892E+003  7.16467568E+000  -2.95401151E-007       15.88 <-- SCF
     15  -5.66306891E+003  7.16448445E+000  -1.60189845E-006       16.58 <-- SCF
     16  -5.66306891E+003  7.16566473E+000   5.03725090E-008       17.30 <-- SCF
     17  -5.66305809E+003  7.16892722E+000  -1.35318489E-003       17.94 <-- SCF
     18  -5.66289950E+003  7.17878051E+000  -1.98232364E-002       18.59 <-- SCF
     19  -5.66295014E+003  7.20703280E+000   6.33023123E-003       19.23 <-- SCF
     20  -5.65353849E+003  7.27226034E+000  -1.17645706E+000       19.82 <-- SCF
------------------------------------------------------------------------ <-- SCF
Even with this small test case there was a slight improvement in the SCF cycle time, but numerical instabilities eventually caused the solution to diverge. Our modified algorithm proved slightly more stable for this test case, but it was slower and also showed signs of diverging:
------------------------------------------------------------------------ <-- SCF
SCF loop      Energy           Fermi          Energy gain         Timer  <-- SCF
                               energy         per atom            (sec)  <-- SCF
------------------------------------------------------------------------ <-- SCF
Initial  -4.95078326E+003  5.20975146E+001                          3.25 <-- SCF
      1  -5.59753549E+003  7.89244217E+000   8.08440297E+001        5.65 <-- SCF
      2  -5.66226988E+003  7.15740116E+000   8.09179761E+000        6.44 <-- SCF
      3  -5.66301246E+003  7.16625993E+000   9.28225593E-002        7.51 <-- SCF
      4  -5.66306881E+003  7.16423308E+000   7.04427727E-003        8.79 <-- SCF
      5  -5.66306892E+003  7.16423445E+000   1.34668025E-005       10.08 <-- SCF
      6  -5.66306891E+003  7.16423645E+000  -1.24884522E-006       11.33 <-- SCF
      7  -5.66306825E+003  7.16423517E+000  -8.20412820E-005       12.79 <-- SCF
      8  -5.66306852E+003  7.16424032E+000   3.31933376E-005       13.98 <-- SCF
      9  -5.66306886E+003  7.16424765E+000   4.29159705E-005       15.26 <-- SCF
     10  -5.66306888E+003  7.16426863E+000   2.20859478E-006       16.17 <-- SCF
     11  -5.66306892E+003  7.16496512E+000   5.82997827E-006       17.23 <-- SCF
     12  -5.66306892E+003  7.16434686E+000  -2.59482603E-007       17.99 <-- SCF
     13  -5.66306892E+003  7.16439357E+000  -5.01087043E-007       18.80 <-- SCF
     14  -5.66306891E+003  7.16466770E+000  -1.12797108E-006       19.56 <-- SCF
     15  -5.66306890E+003  7.16447879E+000  -1.59632306E-006       20.42 <-- SCF
     16  -5.66306886E+003  7.16561534E+000  -4.04882177E-006       21.16 <-- SCF
     17  -5.66306881E+003  7.16881083E+000  -6.32000384E-006       21.89 <-- SCF
     18  -5.66306867E+003  7.17899059E+000  -1.85164167E-005       22.64 <-- SCF
     19  -5.66306845E+003  7.20738379E+000  -2.73264238E-005       23.33 <-- SCF
     20  -5.66306769E+003  7.29973182E+000  -9.40002692E-005       24.05 <-- SCF
------------------------------------------------------------------------ <-- SCF
These results were fairly typical of the performance of these optimisers: it was relatively straightforward to get them close to the ground state, but difficult to reach the accuracy we require. Imposing orthonormality on the updates enabled both methods to converge quickly and robustly, indicating that this poor performance was not a bug but inherent in the algorithms. We also investigated restricted orthonormalisation, whereby only certain directions are projected out; although this improved matters, neither algorithm converged reliably.
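As a rough illustration of projecting selected directions out of an update, the sketch below removes from a trial update its S-overlap with a chosen set of vectors. It is not the scheme used in this work, and the text above does not specify which directions were projected. It assumes a small, real, symmetric S and directions that are already mutually S-orthonormal. The names s_apply and project_out are hypothetical.

```python
# Illustrative sketch only: remove chosen directions from an update so that
# the result is S-orthogonal to them. Assumes real arithmetic and that the
# vectors in `directions` are mutually S-orthonormal (both are assumptions).

def s_apply(S, x):
    """Return the matrix-vector product S x."""
    return [sum(S[i][j] * x[j] for j in range(len(x))) for i in range(len(S))]

def project_out(update, directions, S):
    """Return update - sum_j x_j <x_j|S|update> over the given directions."""
    s_update = s_apply(S, update)
    projected = list(update)
    for x in directions:
        overlap = sum(xi * sui for xi, sui in zip(x, s_update))
        projected = [pi - overlap * xi for pi, xi in zip(projected, x)]
    return projected

# Toy data (hypothetical): one S-normalised band and a trial update.
S = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 1.0]]
band = [0.5 ** 0.5, 0.0, 0.0]          # <band|S|band> = 1
update = [0.3, -0.2, 0.1]
restricted = project_out(update, [band], S)
```

Passing every band in `directions` gives a full projection of the update. Passing only a subset leaves the remaining overlaps untouched.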