Difference between pages "Dartboard PI" and "Parallel Computing"

From Mesham
== Overview ==
  
[[Image:dartboard.jpg|thumb|260px|right|Dartboard method to find PI]]
The dartboard method is a simple algorithm for approximating the value of PI. It must be noted that there are much better methods out there for computing PI; however, the dartboard method is embarrassingly parallel and as such quite simple to parallelise. The basic premise is that you throw n darts randomly at a round dartboard mounted on a square backing. Because each dart is thrown randomly, the ratio of darts hitting the board to darts landing anywhere on the square equals the ratio between the two areas, which is PI / 4. Of course, the more darts you simulate throwing at the board, the better the approximation of PI. In our example each process performs this simulated throwing a number of times, and then each process's approximation of PI is combined and averaged by one of the processes to obtain the result.

Very roughly, this means that with d darts thrown over r rounds on n processes, the time taken in parallel is the time it takes to simulate throwing d * r darts, whereas a sequential algorithm would need to simulate throwing d * r * n darts. (We have excluded communication costs from the parallel case to simplify the concept.) Quite obviously, changing the number of processes, the number of rounds and the number of darts thrown in each round will directly change the accuracy of the result.
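The ratio argument above can be sketched in a few lines. This is an illustrative sequential version in Python, not the Mesham source; `estimate_pi` and its seeded generator are this sketch's own choices, made so the run is reproducible.

```python
import random

def estimate_pi(darts, seed=42):
    """Estimate PI by throwing `darts` random darts at a 2x2 square
    and counting hits on the inscribed unit-radius board."""
    rng = random.Random(seed)        # seeded so the estimate is reproducible
    hits = 0
    for _ in range(darts):
        x = rng.uniform(-1.0, 1.0)   # dart lands anywhere on the square
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y < 1.0:      # inside the circular board
            hits += 1
    # hits / darts approximates circle area / square area = PI / 4
    return 4.0 * hits / darts
```

With 100,000 darts the estimate typically lands within a few hundredths of PI; quadrupling the dart count roughly halves the expected error.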
 
  
== Source Code ==
  
var m:=10; // number of processes
var pi:array[Double,m,1] :: allocated[row[] :: horizontal[m] :: single[evendist[]]];
var result:array[Double,m] :: allocated[single[on[0]]];
var mypi:Double;
mypi:=0;
var p;
par p from 0 to m - 1
{
  var darts:=1000; // number of darts to simulate throwing each round
  var rounds:=100; // number of rounds of darts to throw
  var i:=0;
  for i from 0 to rounds
  {
      mypi:= mypi + (4 * (throwdarts[darts] % darts));
  };
  ((pi#p)#0):=(mypi % rounds);
};
result:=pi;
proc 0
{
  var avepi:Double;
  avepi:=0;
  var j:=0;
  for j from 0 to m - 1
  {
      var y:=(result#j);
      avepi:=avepi + y;
  };
  avepi:=avepi % m;
  print["PI = ",avepi,"\n"];
};

function Int throwdarts[var darts]
{
  darts: Int :: allocated[multiple[]];
  var score:=0;
  var n:=0;
  for n from 0 to darts
  {
      var r:=randomnumber[0,1]; // random number between 0 and 1
      var xcoord:=(2 * r) - 1;
      r:=randomnumber[0,1]; // random number between 0 and 1
      var ycoord:=(2 * r) - 1;
      if ((sqr[xcoord] + sqr[ycoord]) < 1)
      {
        score:=score + 1; // hit the dartboard!
      };
  };
  return score;
};
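The structure of the program above can be mirrored in Python for readers unfamiliar with Mesham. This is a sketch only: threads stand in for Mesham's processes (Python threads give the same structure but not true parallel speedup here), and `worker`, `throwdarts` and the per-rank seeding are this sketch's own conventions, not part of the Mesham source.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def throwdarts(darts, rng):
    """Mirror of the Mesham throwdarts function: count hits on the board."""
    score = 0
    for _ in range(darts):
        xcoord = 2 * rng.random() - 1
        ycoord = 2 * rng.random() - 1
        if xcoord * xcoord + ycoord * ycoord < 1:
            score += 1                 # hit the dartboard!
    return score

def worker(rank, darts=1000, rounds=100):
    """One simulated process: accumulate a local PI estimate over `rounds`."""
    rng = random.Random(rank)          # per-"process" random stream
    mypi = 0.0
    for _ in range(rounds):
        mypi += 4 * throwdarts(darts, rng) / darts
    return mypi / rounds               # this process's estimate of PI

m = 10  # number of simulated processes
with ThreadPoolExecutor(max_workers=m) as pool:
    estimates = list(pool.map(worker, range(m)))

avepi = sum(estimates) / m             # process 0's averaging step
print("PI =", avepi)
```

The final averaging step plays the role of `proc 0` in the Mesham version: one party gathers every local estimate and combines them into a single result.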
 
  
== Notes ==
  
An interesting aside is that we have used a function in this example, yet there is no main function. The throwdarts function simulates throwing the darts for each round. As already noted in the language documentation, the main function is optional; without it the compiler sets the program entry point to the start of the source code.
== Download ==
 
  
The source code for the dartboard method to compute PI is located [http://www.mesham.com/downloads/pi.mesh here]

Revision as of 13:02, 11 January 2010

== Parallel Computing ==

Parallel computing is the use of multiple computing resources to solve a single problem. Such problems range very widely, from small examples to highly complex cosmological simulations and weather prediction codes. Utilising parallel computing adds extra complexity and challenges to programming: the programmer must consider a wide variety of new concepts and shift their mindset from sequential to parallel. Having said that, the world we live in is predominantly parallel, so it is natural to model problems in this way.

== The Problem ==

Current parallel languages are either conceptually simple or efficient, but not both. Until now these aims have been contradictory. If parallel computing is to grow (as we predict, given current advances in CPU and GPU technology) then this issue must be addressed. The problem is that we are applying existing, sequential ways of thinking to the programmability problem; instead we need to think outside the box and come up with a completely new solution.

== Current Solutions ==

Numerous parallel language solutions currently exist; we will consider just a few:

=== Message Passing Interface ===

The MPI standard is extremely popular within this domain. Although bindings exist for many languages, it is most commonly used with C. The result is low-level, highly complex, difficult-to-maintain but efficient code. Because the programmer must control all aspects of parallelism, they can get caught up in low-level details which are uninteresting but important. Additionally, the programmer is completely responsible for ensuring that all communications complete correctly, or else they run the risk of deadlock, livelock and so on.

=== Bulk Synchronous Parallel ===

The BSP standard was once touted as the solution to parallel computing. Implementations of this standard are most commonly used in conjunction with C. The program is split into supersteps, and each superstep is split into three stages: computation, communication and global synchronisation via barriers. However, this synchronisation is very expensive, so the performance of BSP is generally much poorer than that of MPI. In addition, although the communication model adopted by BSP is simpler, the programmer must still address low-level issues (such as pointers) imposed by the underlying language used.

=== High Performance Fortran ===

In HPF the programmer specifies only the general distribution of data, with the compiler taking care of all other aspects of parallelism (such as computation distribution and communication.) Although HPF is a simple, abstract language, efficiency suffers because so much emphasis is placed on the compiler to deduce the parallelism. The programmer, who is often in a far better position to indicate parallel aspects, lacks control and is limited. One useful feature of HPF is that all parallel aspects are expressed via comments, so an HPF program is also acceptable to a normal Fortran compiler.

=== Co-Array Fortran ===

This language is more explicit than HPF. The programmer distributes computation and data via co-arrays, but must rely on the compiler to determine communication (which is often one sided.) Because of this one-sided communication, messages are often short, which incurs the overhead of sending many separate messages. Having said this, things are improving for CAF: the upcoming Fortran standard is said to include co-arrays, which will see the integration of CAF concepts into standard Fortran.