Difference between revisions of "Tutorial - Dynamic Parallelism"
Line 1: | Line 1: | ||
<metadesc>Tutorial describing dynamic parallelism in Mesham</metadesc> | <metadesc>Tutorial describing dynamic parallelism in Mesham</metadesc> | ||
− | '''Tutorial number eight''' - [[Tutorial_-_Arrays|prev]] | + | '''Tutorial number eight''' - [[Tutorial_-_Arrays|prev]] :: [[Tutorial_-_Advanced Types|next]] |
== Introduction == | == Introduction == |
Revision as of 15:00, 1 February 2013
Tutorial number eight - prev :: next
Introduction
If you are following these tutorials in order then you could be forgiven for assuming that Mesham requires the programmer to explicitly set the number of processes in their code. This is entirely untrue and, whilst structuring your code around this assumption can lead to cleaner code, Mesham supports a dynamic number of processes which is decided upon at runtime. This tutorial will look at how you can use dynamic parallelism and write your code in this manner.
In its simplest form
#include <parallel> #include <io> #include <string> function void main() { print(itostring(pid())+"\n"); };
Compile the above code and run it with one process, now run it with ten, now with any number you want. See how, even though the code explicitly requires one process, by running with more will just execute that code on all the other processes? There are a number of rules associated with writing parallel codes in this fashion; firstly the number of processes can exceed the required number but it can not be smaller so if our code requires ten processes then we can run it with twenty, one hundred or even one thousand however we can not run it with nine. Secondly the code and data applicable to these extra processes is all variables allocated multiple and all code which is written SPMD style (i.e. outside of par, group, proc and parallel composition.)
A more complex example
So let's have a look at something a bit more complex that involves the default shared memory communication
#include <parallel> #include <io> #include <string> function void main() { var numberProc:=processes(); var s:array[Int, numberProc]::allocated[single[on[0]]]; s[pid()]:=pid(); sync; proc 0 { var i; for i from 0 to processes() - 1 { print(itostring(i)+" = "+itostring(s[i])+"\n"); }; }; };
Compile and run this example with any number of processes and look at how the code will handle us changing this number. There are a couple of general points to make about this code; firstly notice that we are still using the proc parallel construct of Mesham for process selection (which is absolutely fine to do.) We could have instead done something like if (pid()==0) which is entirely up to the programmer.
Next, modify variable s to be on process 2 (and change the proc statement to run on this process also. If you recompile and run this code then it will work fine as long as the number of process is greater than the required number (which is 3.) If you were to try and run the code with 2 processes for example then it will give you an error; the only exception to this is that the usual rule applies that if you run it with one process then Mesham will automatically spawn the required number and run over these. However, this illustration raises and important point - how can we (easily) tell how many processes to use? Happily there are two ways, either compile the code using the -summary flag or run the resulting Mesham executable with the -p flag, which will report how many processes that executable expects to be run over.
Dynamic type arguments
Often, when wanting to write parallel code in this manner, you also want to use flexible message passing constructs. Happily all of the message passing override types such as channel, reduce, broadcast support the provision of arguments which are only known at runtime. Let's have a look at an example to motivate this.
#include <parallel> #include <io> #include <string> function void main() { var a:=pid(); var b:=a+1; var c:=a-1; var c1:Int::allocated[multiple]::channel[a,b]; var c2:Int::allocated[multiple]::channel[c,a]; var t:=0; if (pid() > 0) t:=c2; if (pid() < processes() - 1) c1:=t+a; t:=t+a; if (pid() + 1 == processes()) print(itostring(t)+"\n"); };
The above code is a prefix sums type algorithm, where each process will send to the next one (whose id is one greater than it) its current id plus all of the ids of processes before it. The process with the largest id then displays the total number result which obviously depends on the number of processes used to run the code. One point to note about this is that we can (currently) only use variables and values as arguments to types, for example if you used the function call pid() directly in the channel type then it would give a syntax error. This is a limitation of the Mesham parser and will be addressed in a future release.