Grid computing using Scala

There have been some nice additions to the Scala collections with the parallel versions of standard data structures. These work fine for shared memory architectures, however, means for distributed memory parallelization seem to be very scarce and fragmented.
During a course at my university, I worked on the development of a general framework for distributed parallelization using Scala and MPJ-Express. The idea was to use functional concepts on sequences to mimic/generalize features already found in MPI. Since MPJ-Express is more or less a wrapper around MPI, there were plenty of interesting ideas to try out.

Here is an example of a parallel matrix-multiplication algorithm using the framework:

The algorithm simply creates some matrices, A and B, filling them with random blocks. No data is being initialized until the actual multiplication, due to the use of proxy-matrices. This ensures that only the needed data is expanded on the respective nodes. In the multiplication, jblas was used to ensure maximum peak-rate.

Clearly, the biggest achievements in the framework are the ability to send objects transparently through messages and the ability to use anonymous functions for network operations like reduceMPJ.

The parallelize call simply initializes MPI through MPJ, and allows for distributed operations like mapMPJ and reduceMPJ to be used. Behind the scenes, there is an implicit conversion from Array[T] to SeqMPJ[T], which is a wrapper class from this framework.

Another huge advantage from the framework is the fact that everything is type inferred. This means that, since the computation works as a kind of single program multiple data scheme, the receiving side of any send or collective communication operation knows the type to receive. This means that the programmer is free to think more about solving the problem than dealing with concurrency.

Another interesting example is finding the maximum number in in a segmented list:

Notice that the mapMPJ operation works in parallel for each element in the sequence. Notice that the max-function used in reduceMPJ is user defined, and might as well have been an anonymous function.

There are lots of interesting goals to look at in the future, especially working with non-uniform workloads.


Leave a Reply

Your email address will not be published. Required fields are marked *