In the following table, timings obtained with the no frozen Opt constraints and the generic Opt constraints are presented. The generic version is faster at 16, 32 and 128 processes, while the no frozen version is faster at 64, 256 and 512 processes. However, the differences are small and may simply be due to noise in the timings.
Table:
Timing comparison of different runs with Bak on XT4, XE6 and optimised constraints_shake_vv on XE6
| Processes | Bak XT4 | Bak XE6 | Opt constraints, no frozen | Opt constraints, generic |
|---|---|---|---|---|
| 16 | 199.154 | 218.722 | 213.314 | 210.336 |
| 32 | 106.790 | 98.113 | 94.112 | 92.688 |
| 64 | 63.129 | 57.494 | 51.421 | 53.652 |
| 128 | 42.036 | 39.360 | 35.527 | 34.627 |
| 256 | 27.471 | 29.492 | 22.997 | 24.787 |
| 512 | 22.137 | 25.951 | 20.190 | 21.397 |
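The comparison between the two optimised versions can be checked with a short script. This is only a sketch: it assumes that the last two timing columns of the table above are the no frozen and generic versions respectively, and the names `relative_difference`, `differences` and `generic_faster` are illustrative, not part of the original code.

```python
# Timings copied from the timing table (same units as the table).
# Assumption: the last two columns are "no frozen" and "generic".
PROCESSES = [16, 32, 64, 128, 256, 512]
OPT_NO_FROZEN = [213.314, 94.112, 51.421, 35.527, 22.997, 20.190]
OPT_GENERIC = [210.336, 92.688, 53.652, 34.627, 24.787, 21.397]


def relative_difference(generic, no_frozen):
    """Difference of the generic timing relative to the no frozen timing, in percent."""
    return 100.0 * (generic - no_frozen) / no_frozen


# One relative difference per process count; negative means generic is faster.
differences = [
    relative_difference(g, n) for g, n in zip(OPT_GENERIC, OPT_NO_FROZEN)
]

# Process counts at which the generic version has the smaller timing.
generic_faster = [p for p, d in zip(PROCESSES, differences) if d < 0.0]

for p, d in zip(PROCESSES, differences):
    print(f"{p:4d} processes: generic vs no frozen {d:+.2f} %")
print("generic faster at:", generic_faster)
```

A negative value means the generic version is faster at that process count. The differences stay within a few percent in either direction, which is consistent with the remark that they may be timing noise.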
Table:
Variation rate of different runs of no frozen atoms optimised constraints_shake_vv on XE6 with Bak on XT4 and XE6
| Processes | No frozen Opt constraints comp. to Bak XT4 (%) | No frozen Opt constraints comp. to Bak XE6 (%) |
|---|---|---|
| 16 | 7.11 | -2.47 |
| 32 | -11.87 | -4.08 |
| 64 | -18.55 | -10.56 |
| 128 | -15.48 | -9.74 |
| 256 | -16.29 | -22.02 |
| 512 | -8.8 | -22.20 |
| Average | -10.65 | -11.85 |
In the tables above and below, the percentage improvements are again presented for the no frozen and generic cases respectively; a negative value means a shorter run time. Compared to the vanilla code on the XE6, an improvement is observed in every case; compared to the XT4, only the 16-process runs are slower. For the no frozen case the best improvement is about 22%, obtained for the 256 and 512 core runs. The results for the generic version are slightly worse: compared to the vanilla code on the XE6, an average improvement of 10.26% is observed, with a peak of 17.55% at 512 cores.
Table:
Variation rate of different runs of generic optimised constraints_shake_vv on XE6 with Bak on XT4 and XE6
| Processes | Generic Opt constraints comp. to Bak XT4 (%) | Generic Opt constraints comp. to Bak XE6 (%) |
|---|---|---|
| 16 | 5.61 | -3.83 |
| 32 | -13.21 | -5.53 |
| 64 | -15.01 | -6.68 |
| 128 | -17.63 | -12.02 |
| 256 | -9.77 | -15.95 |
| 512 | -3.34 | -17.55 |
| Average | -8.89 | -10.26 |
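The variation rates in the tables appear to be the relative change in run time with respect to the reference run, expressed in percent. The report does not state this formula, so it is an assumption inferred from the tabulated numbers. The sketch below applies it to the no frozen timings against Bak on the XE6; the function name `variation_rate` is illustrative.

```python
# Timings copied from the timing table (same units as the table).
PROCESSES = [16, 32, 64, 128, 256, 512]
BAK_XE6 = [218.722, 98.113, 57.494, 39.360, 29.492, 25.951]
OPT_NO_FROZEN = [213.314, 94.112, 51.421, 35.527, 22.997, 20.190]


def variation_rate(new_time, reference_time):
    """Relative change of new_time with respect to reference_time, in percent.

    Assumed definition (not stated in the report): negative values mean
    the new run is faster than the reference run.
    """
    return 100.0 * (new_time - reference_time) / reference_time


# One variation rate per process count.
rates = [variation_rate(n, r) for n, r in zip(OPT_NO_FROZEN, BAK_XE6)]

for p, rate in zip(PROCESSES, rates):
    print(f"{p:4d} {rate:8.2f}")

# Arithmetic mean over the process counts, as in the "Average" row.
average = sum(rates) / len(rates)
print(f"Average {average:.2f}")
```

The same function can be applied to the generic timings, or with the Bak XT4 column as the reference, to obtain the other columns of the variation tables.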
Valène Pellissier 2011-08-24