Use of the MPI proxy imposes a performance penalty on any MPI communication to and from the express process. The figure shows execution times for a range of tests run with the proxy and with the standard Cray MPI layer. In the first case a Cray MPI executable was run entirely on the backend nodes. The same tests were then compiled against the XPMT replacement library, so that what was rank 0 now runs on the login node and communicates via the xpnode proxy process (which becomes rank 0 in the MPI job); the remaining ranks are standard Cray MPI executables. This matches exactly the changes made to AVS/Express. The tests exercise common communication calls (MPI_Bcast, MPI_Gather, MPI_Send, MPI_Recv, MPI_Isend, MPI_Irecv and MPI_Waitall), all of which are used in AVS/Express. Each test sends or receives an array of MPI_INTs whose size increases through 64, 128, 256, 512, 1024, 2×1024 and 8×1024 elements (the Gather test uses array sizes up to 512). The graphs show the total time over all array sizes for each test. In some cases the MPI proxy slows communication by approximately a factor of three. Within AVS/Express most of the messages sent from the express process are less than 1 KB in size. The largest message sent back to the express process from the compute or render nodes is usually the one containing the final composited rendered image, which for a 512×512 window is 1 MB (assuming 4 bytes per pixel). Given that AVS/Express is an interactive application rather than a numerical simulation, we consider this performance penalty acceptable.
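As a quick sanity check on the message sizes quoted above, a short sketch can compute the per-message payload for each test array size and for the composited image. The array-size progression and the 4-byte MPI_INT element size are taken from the text; the doubling steps 2×1024 and 8×1024 are assumptions read from the stated progression.

```python
# Payload sizes for the benchmark tests described in the text.
# Assumption: MPI_INT is 4 bytes, matching the "4 bytes per pixel" figure.
ELEMENT_BYTES = 4

# Array sizes (in elements) used by the tests, per the text.
array_sizes = [64, 128, 256, 512, 1024, 2 * 1024, 8 * 1024]

# Bytes on the wire for each test message.
payload_bytes = [n * ELEMENT_BYTES for n in array_sizes]
print(payload_bytes)  # largest test message is 8 * 1024 * 4 = 32768 bytes

# Final composited image returned to the express process:
# a 512x512 window at 4 bytes per pixel.
image_bytes = 512 * 512 * 4
print(image_bytes)  # 1048576 bytes, i.e. 1 MB as stated in the text
```

This confirms that even the largest benchmark message (32 KB) is far smaller than the 1 MB composited image, so the quoted slowdown factor is measured on messages well within the sizes AVS/Express actually sends.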