The HECToR Service is now closed and has been superseded by ARCHER.

SPRINTing with HECToR

This Distributed Computational Science and Engineering (dCSE) project was to develop SPRINT, an add-on package for R, the language and environment for statistical computing and graphics. The Simple Parallel R INTerface (SPRINT) offers both a library of parallel functions and an interface for adding parallel functions to R.

The key aims of this project were:

  • Port SPRINT to HECToR and implement the parallel Pearson’s correlation function and the parallel permutation testing function.
  • Optimise the existing parallel implementation of Pearson’s correlation function (pcor) by replacing its single-process I/O with parallel I/O techniques.
  • Implement the permutation testing function (mt.maxT) in parallel (i.e. pmaxT).

The individual achievements of the project are summarised below:

  • An installation guide on how to compile SPRINT on HECToR was written and is available here.
  • The parallel correlation function (pcor) now scales up to 512 processes. Originally, all results were gathered on and written by the master process. The results now stay distributed among all processes and are written into the output file with MPI-I/O, which takes advantage of the underlying high-performance Lustre filesystem.
  • The permutation testing function (mt.maxT) was parallelised to give pmaxT. The parallelism is introduced by dividing the permutation count equally among the available processes. Each process gathers its part of the observations, and at the end all partial observations are reduced on the master process, which uses them to compute the p-values.
  • Based on the benchmarks performed on the HECToR XT4 system, both functions now scale close to optimally for process counts up to 512. Statisticians can use the parallel versions of these functions to process large data sets and get results within reasonable run times.
  • The work performed under this dCSE project was presented at the HPDC 2010 Workshop (ACM International Symposium) and at useR! 2010.
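
As a rough illustration of the pcor output scheme described above, the following Python sketch simulates several processes writing their own blocks of a result matrix into one shared file at computed byte offsets. It is not SPRINT code and does not use MPI. The loop over ranks stands in for the processes, ordinary seek-and-write calls stand in for MPI-I/O, and the row-block layout, the native double format and all function names are assumptions made for this example.

```python
import os
import struct
import tempfile

DOUBLE = struct.calcsize("d")  # bytes per stored value (native double)


def row_range(n_rows, workers, rank):
    """Rows [start, stop) owned by `rank` when rows are split as evenly as possible."""
    base, extra = divmod(n_rows, workers)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def write_block(path, rows, start_row, n_cols):
    """Write one rank's rows at the byte offset of its first row."""
    with open(path, "r+b") as f:
        f.seek(start_row * n_cols * DOUBLE)
        for row in rows:
            f.write(struct.pack(f"{n_cols}d", *row))


def write_distributed(path, matrix, workers):
    """Simulate `workers` processes each writing their own block of `matrix`."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    with open(path, "wb") as f:
        f.truncate(n_rows * n_cols * DOUBLE)  # pre-size the shared file
    # Reverse order on purpose: blocks do not overlap, so write order is irrelevant.
    for rank in reversed(range(workers)):
        start, stop = row_range(n_rows, workers, rank)
        write_block(path, matrix[start:stop], start, n_cols)


def read_matrix(path, n_rows, n_cols):
    """Read the whole file back as a list of rows."""
    with open(path, "rb") as f:
        flat = struct.unpack(f"{n_rows * n_cols}d", f.read())
    return [list(flat[i * n_cols:(i + 1) * n_cols]) for i in range(n_rows)]


def demo():
    """Write a 5x3 matrix with two simulated ranks and check the round trip."""
    matrix = [[float(r * 10 + c) for c in range(3)] for r in range(5)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "result.bin")
        write_distributed(path, matrix, workers=2)
        return read_matrix(path, 5, 3) == matrix
```

Because every block has its own non-overlapping offset, no process needs the results held by any other, which is what removes the gather on the master process. In a real MPI program the per-rank write would presumably be an offset-based MPI-I/O call rather than a plain file write.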
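
The way pmaxT divides its work can be sketched in the same spirit. The Python example below splits a permutation count among simulated processes, lets each one count how often a permuted statistic reaches the observed one, and sums the partial counts before computing p-values. It is a simplified sketch, not the mt.maxT algorithm: it computes plain unadjusted permutation p-values on an absolute difference of group means, and the per-permutation seeding, the statistic and all function names are assumptions chosen for illustration.

```python
import random


def split_counts(total, workers):
    """Divide `total` permutations as evenly as possible among `workers`."""
    base, extra = divmod(total, workers)
    return [base + (1 if rank < extra else 0) for rank in range(workers)]


def mean_diff(values, labels):
    """Absolute difference between the means of the two label groups."""
    g1 = [v for v, lab in zip(values, labels) if lab == 1]
    g0 = [v for v, lab in zip(values, labels) if lab == 0]
    return abs(sum(g1) / len(g1) - sum(g0) / len(g0))


def partial_counts(data, labels, observed, first, count, seed):
    """Work of one process: permutations first .. first + count - 1."""
    counts = [0] * len(data)
    for b in range(first, first + count):
        # Seeding by permutation index makes permutation b identical
        # no matter which process ends up generating it.
        rng = random.Random(seed + b)
        perm = labels[:]
        rng.shuffle(perm)
        for i, row in enumerate(data):
            if mean_diff(row, perm) >= observed[i]:
                counts[i] += 1
    return counts


def permutation_pvalues(data, labels, total_perms, workers, seed=0):
    """Unadjusted permutation p-value per row, computed in `workers` shares."""
    observed = [mean_diff(row, labels) for row in data]
    totals = [0] * len(data)
    first = 0
    for count in split_counts(total_perms, workers):  # one iteration per process
        part = partial_counts(data, labels, observed, first, count, seed)
        totals = [t + p for t, p in zip(totals, part)]  # reduction on the master
        first += count
    return [t / total_perms for t in totals]
```

A call such as `permutation_pvalues(data, labels, 1000, workers=4)` takes one row of `data` per variable and a 0/1 entry in `labels` per sample. With this seeding choice the result does not depend on the number of workers, which makes a parallel run easy to check against a serial one.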

Please see PDF or HTML for a report which summarises this project.