Optimal processor count
The results presented in Section 6 suggest that all
future work on NEMO should be carried out using code compiled with the PGI
compiler suite as it gives the lowest runtimes.
The NOCS researchers ideally want to be able to run an entire model year, i.e.
365 model days, in a 12 hour run on HECToR as this enables them to make
optimal use of the machine/queues and also allows them to keep up with the
post-processing and data transfer of the results as the run progresses. They
can currently achieve 300 model days in a 12 hour run
using 221 processors. In this section we investigate whether an optimal
processor count which satisfies the desire to complete a model year in a 12
hour time slot can be found. To do this NEMO is executed over a range of
processors and the number of model days which can be computed in 12 hours,
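$N_{\mathrm{days}}$, is obtained from:-

$$N_{\mathrm{days}} = \frac{t_{12\mathrm{h}}}{t_{60}} \qquad (1)$$

where $t_{12\mathrm{h}} = 43200$ is the number of seconds in 12 hours and $t_{60}$ is
the time taken to complete a 60 step (i.e. 1 day) run of NEMO. This means
we ideally need $t_{60} \le 43200/365 \approx 118.36$ seconds.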
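As a quick check of the arithmetic behind equation (1), the following minimal Python sketch computes the number of model days achievable in 12 hours from a measured 60 step runtime. The function and variable names are illustrative only and do not come from NEMO; the 145.078 second timing is the 221 processor entry from table 7.

```python
SECONDS_IN_12_HOURS = 12 * 60 * 60  # 43200 seconds


def model_days_in_12_hours(t60_seconds):
    """Model days computable in 12 hours, given the runtime in seconds
    of one 60 step (i.e. 1 model day) run. Implements equation (1)."""
    return SECONDS_IN_12_HOURS / t60_seconds


# Largest 60 step runtime that still allows 365 model days in 12 hours.
t60_threshold = SECONDS_IN_12_HOURS / 365

print(round(t60_threshold, 2))                    # 118.36
print(round(model_days_in_12_hours(145.078), 1))  # 297.8
```

The second figure is consistent with the roughly 300 model days currently achieved on 221 processors.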
The processor counts investigated range from 159 to 430. All runs were
performed with the land-only cells removed. The results are summarised
in table 7. Figure 6 shows the results in graphical form, with the
365 day threshold marked by the dashed line.
Table 7:
Runtime for 60 time steps for various processor configurations
ranging from 159 to 430.
jpni | jpnj | No. of procs | Time for 60 steps (seconds)
13   | 14   | 159          | 177.583
14   | 14   | 174          | 163.633
14   | 15   | 187          | 172.191
15   | 15   | 196          | 157.858
15   | 16   | 209          | 153.450
16   | 16   | 221          | 145.078
16   | 17   | 232          | 137.507
17   | 17   | 244          | 127.705
17   | 18   | 260          | 135.688
18   | 18   | 274          | 127.103
18   | 19   | 286          | 122.639
19   | 19   | 304          | 125.880
19   | 20   | 321          | 118.081
20   | 20   | 335          | 117.830
20   | 21   | 349          | 107.464
21   | 21   | 364          | 113.491
21   | 22   | 379(380)     | 114.175
22   | 22   | 398(396)     | 107.051
22   | 23   | 413          | 123.939
23   | 23   | 430(429)     | 110.871
Figure 6:
Investigation of optimal processor count for NEMO subject to
completing a model year within a 12 hour compute run. The dashed line
shows the cut-off point.
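The comparison against the 365 day threshold can also be reproduced directly from the table 7 timings. The sketch below is illustrative only: the names are hypothetical, the data are copied from table 7, and each configuration is represented by a single timing, so run-to-run variability is not taken into account.

```python
SECONDS_IN_12_HOURS = 12 * 60 * 60  # 43200 seconds
TARGET_MODEL_DAYS = 365

# (jpni, jpnj, number of processors, seconds for 60 steps), from table 7.
TABLE_7 = [
    (13, 14, 159, 177.583), (14, 14, 174, 163.633), (14, 15, 187, 172.191),
    (15, 15, 196, 157.858), (15, 16, 209, 153.450), (16, 16, 221, 145.078),
    (16, 17, 232, 137.507), (17, 17, 244, 127.705), (17, 18, 260, 135.688),
    (18, 18, 274, 127.103), (18, 19, 286, 122.639), (19, 19, 304, 125.880),
    (19, 20, 321, 118.081), (20, 20, 335, 117.830), (20, 21, 349, 107.464),
    (21, 21, 364, 113.491), (21, 22, 379, 114.175), (22, 22, 398, 107.051),
    (22, 23, 413, 123.939), (23, 23, 430, 110.871),
]


def model_days_in_12_hours(t60_seconds):
    """Model days computable in 12 hours for a given 60 step runtime."""
    return SECONDS_IN_12_HOURS / t60_seconds


# Processor counts whose timing allows a full model year in 12 hours.
procs_meeting_target = [
    procs
    for _jpni, _jpnj, procs, t60 in TABLE_7
    if model_days_in_12_hours(t60) >= TARGET_MODEL_DAYS
]

print(procs_meeting_target)       # [321, 335, 349, 364, 379, 398, 430]
print(min(procs_meeting_target))  # 321
```

On these timings the smallest processor count that reaches the threshold is 321, while the 413 processor configuration falls short despite using more processors.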
In performing this investigation some problems were discovered relating
to the computation of land-only cells by the
nocspmap_r25 code. Several processor configurations yielded an
incorrect number of land cells, and hence an incorrect processor count.
These are highlighted in table 7, where the incorrectly computed value
is given in parentheses after the correct processor count. If the wrong
value is specified the code fails with an error of the form:-
===>>> : E R R O R
===========
Eliminate land processors algorithm
jpni = 21 jpnj = 22
jpnij = 380 < jpni x jpnj
***********, mpp_init2 finds jpnij= 379
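The error above appears to arise when the specified jpnij differs from the number of subdomains containing ocean that mpp_init2 finds. The following Python sketch illustrates the idea of counting such subdomains for a given jpni x jpnj decomposition. It is a simplified assumption-laden illustration, not the algorithm used by nocspmap_r25 or mpp_init2: the function name is hypothetical, the mask is a toy example, blocks are split evenly, and halo regions are ignored.

```python
def count_ocean_subdomains(mask, jpni, jpnj):
    """Count subdomains holding at least one ocean point.

    mask is a list of rows, with 1 for an ocean point and 0 for land.
    The grid is cut into jpni blocks along each row and jpnj blocks
    along each column (hypothetical even split, no halos).
    """
    nrows, ncols = len(mask), len(mask[0])
    block_h = -(-nrows // jpnj)  # ceiling division
    block_w = -(-ncols // jpni)
    count = 0
    for bj in range(jpnj):
        rows = mask[bj * block_h:(bj + 1) * block_h]
        for bi in range(jpni):
            cols = slice(bi * block_w, (bi + 1) * block_w)
            if any(any(row[cols]) for row in rows):
                count += 1
    return count


# Toy 4 x 4 mask: two of the four 2 x 2 blocks contain ocean.
toy_mask = [
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]

jpnij_specified = 3  # deliberately wrong value for illustration
jpnij_found = count_ocean_subdomains(toy_mask, jpni=2, jpnj=2)

if jpnij_specified != jpnij_found:
    print("mismatch: specified", jpnij_specified, "found", jpnij_found)
```

In this toy case the check reports a mismatch, analogous to the 380 versus 379 discrepancy in the error message.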