Parallel Computing
Parallel Computing
Parallel computing is the use of multiple computing resources at the same time to solve a problem. These problems range widely, from small examples to highly complex cosmological simulations and weather prediction codes. Parallel computing adds complexity and new challenges to programming: the programmer must consider a wide variety of new concepts and change their mindset from sequential to parallel. Having said that, the world we live in is predominantly parallel, so it is natural to model problems in this way.
The Problem
Current parallel languages are either conceptually simple or efficient, but not both. Until now these aims have been contradictory. If parallel computing is to grow (as we predict it will, given current advances in CPU and GPU technology) then this issue must be addressed. The problem is that we are using current, sequential ways of thinking to try to solve this programmability problem. Instead we need to think "outside the box" and come up with a completely new solution.
Current Solutions
There are numerous parallel language solutions currently in existence; we will consider just a few:
Message Passing Interface
The MPI standard is extremely popular within this domain. Although bindings exist for many languages, it is most commonly used with C. The result is code that is low level, highly complex and difficult to maintain, but efficient. Because the programmer must control all aspects of parallelism, they can often get caught up in low-level details which are uninteresting but important. Additionally, the programmer is completely responsible for ensuring that all communications complete correctly, or else they run the risk of deadlock, livelock and similar errors.
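The explicit style described above can be illustrated without MPI itself. The following is a minimal sketch that simulates message passing with Python threads and queues; it is not MPI and uses no MPI calls. The function name `ring_exchange`, the per-rank mailboxes and the timeout are all assumptions made for illustration. Each simulated rank must name its destination and post a receive of its own, which is the bookkeeping the programmer carries in this model.

```python
import queue
import threading


def ring_exchange(values, timeout=5.0):
    """Simulate ranks in a ring: each sends its value to the next rank
    and receives one value from the previous rank.

    Hypothetical illustration of the message-passing model, not MPI.
    """
    n = len(values)
    inboxes = [queue.Queue() for _ in range(n)]  # one mailbox per simulated rank
    received = [None] * n

    def rank_main(rank):
        right = (rank + 1) % n
        # Explicit send: the programmer names the destination rank.
        inboxes[right].put((rank, values[rank]))
        # Explicit receive: blocks until a message arrives. If no rank
        # ever posted the matching send, this rank would wait here
        # (in this sketch, until the timeout expires).
        source, payload = inboxes[rank].get(timeout=timeout)
        received[rank] = (source, payload)

    threads = [threading.Thread(target=rank_main, args=(r,)) for r in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

Calling `ring_exchange([10, 20, 30])` returns, for each rank, the pair (source rank, value) it received. The queues here are buffered, so a send never blocks. With unbuffered, blocking sends, the same "everyone sends first" ordering could deadlock, which is the kind of error the programmer has to rule out by hand.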
Bulk Synchronous Parallel
The BSP standard was once touted as the solution to parallel computing. Implementations of this standard are most commonly used in conjunction with C. The program is split into supersteps, and each superstep is split into three stages: computation, communication and global synchronisation via barriers. However, this synchronisation is very expensive, and as such the performance of BSP is generally much poorer than that of MPI. In addition, although the communication model adopted by BSP is simpler, the programmer must still address low-level issues (such as pointers) imposed by the underlying language used.
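The superstep structure described above can be sketched with Python threads and a barrier. This is an illustration of the computation, communication and synchronisation stages only, not a BSP library; the function name `bsp_total` and the shared `partials` buffer are assumptions made for the example.

```python
import threading


def bsp_total(chunks):
    """Sum all chunks in two supersteps, one simulated process per chunk.

    Hypothetical sketch of the BSP superstep pattern using threads.
    """
    p = len(chunks)
    barrier = threading.Barrier(p)  # global synchronisation point
    partials = [0] * p              # communication buffer, one slot per process
    totals = [None] * p

    def process(pid):
        # Superstep 1, computation: work on local data only.
        local = sum(chunks[pid])
        # Superstep 1, communication: publish the local result.
        partials[pid] = local
        # Superstep 1, synchronisation: nobody proceeds until all
        # processes have finished communicating.
        barrier.wait()
        # Superstep 2, computation: every partial result is now visible.
        totals[pid] = sum(partials)

    threads = [threading.Thread(target=process, args=(i,)) for i in range(p)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals
```

For example, `bsp_total([[1, 2], [3, 4], [5]])` gives every process the same total. The barrier makes the program easy to reason about, but every process waits for the slowest one at each superstep, which is the cost referred to above.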
High Performance Fortran
In HPF the programmer just specifies the general distribution of data, with the compiler taking care of all other aspects of parallelism (such as computation distribution and communication). Although HPF is a simple, abstract language, efficiency suffers because so much emphasis is placed upon the compiler to deduce parallelism. The programmer, who is often in a far better position to indicate parallel aspects, lacks control and is limited. One useful feature of HPF is that all parallel aspects are expressed via comments, such that the HPF program is also acceptable to a normal Fortran compiler.
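To make "distribution of data" concrete, the sketch below computes a block distribution, the kind of mapping an HPF compiler derives from a directive comment such as `!HPF$ DISTRIBUTE A(BLOCK)`. It is written in Python rather than Fortran and is only an illustration: the function names are hypothetical, indices are 0-based (Fortran arrays are 1-based by default), and the block size is assumed to be the ceiling of n/p.

```python
def block_size(n, p):
    """Size of each block when n elements are spread over p processors."""
    return -(-n // p)  # ceiling division


def block_owner(i, n, p):
    """Processor (0-based) that owns element i (0-based)."""
    return i // block_size(n, p)


def block_range(rank, n, p):
    """Indices owned by processor `rank`; may be empty for trailing ranks."""
    b = block_size(n, p)
    start = min(rank * b, n)
    stop = min(start + b, n)
    return range(start, stop)
```

With 10 elements on 4 processors the block size is 3, so the last processor owns only one element. In HPF the programmer writes just the directive; ownership calculations like these, and the communication they imply, are left to the compiler.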
Co-Array Fortran
This language is more explicit than HPF. The programmer, via co-arrays, distributes computation and data but must rely on the compiler to determine communication (which is often one sided). Because of this one-sided communication, messages are often short, which results in the overhead of sending many different messages. Having said this, things are improving for CAF: the upcoming Fortran standard is said to include co-arrays, which will see the CAF concepts integrated into standard Fortran.
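The message-count overhead mentioned above can be shown with a toy model. The sketch below is not Co-Array Fortran: `CoArray`, `put`, `copy_elementwise` and `copy_bulk` are hypothetical names, images are numbered from 0 (CAF numbers them from 1), and a "message" is simply a counter that is incremented on every one-sided write.

```python
class CoArray:
    """Toy stand-in for a co-array: one local list per image, plus a
    counter of one-sided messages. Hypothetical, for illustration only."""

    def __init__(self, num_images, length):
        self.data = [[0] * length for _ in range(num_images)]
        self.messages = 0

    def put(self, image, start, values):
        """One-sided write into `image`'s copy; the target takes no action."""
        self.data[image][start:start + len(values)] = values
        self.messages += 1


def copy_elementwise(co, image, values):
    """Write one element per message, as a naive translation might."""
    for i, v in enumerate(values):
        co.put(image, i, [v])
    return co.messages


def copy_bulk(co, image, values):
    """Write the whole section in a single message."""
    co.put(image, 0, values)
    return co.messages
```

Copying four values element by element costs four messages in this model, while the bulk copy costs one, and both leave the same data on the target image. Whether a real compiler aggregates such writes depends on the implementation, so this should be read as a cost model under stated assumptions rather than a description of any particular compiler.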