Tutorial - Arrays

Tutorial number seven

Introduction

An array is a collection of elements in one or more dimensions and is a key data structure used in numerous codes. In this tutorial we shall look at how to create, use and communicate arrays.

Simple arrays

function void main() {
   var a:array[Int,10];
};

The above code declares variable a to be an array of ten Ints, indexed 0 to 9 inclusive. In the absence of further information a set of default types is applied: heap, onesided, row, allocated and multiple. Arrays allocated to the heap are subject to garbage collection, which removes them when they are no longer used.

#include <io>
#include <string>

function void main() {
   var a:array[Int,10];
   var i;
   for i from 0 to 9 {
      a[i]:=i;
   };
   for i from 0 to 9 {
      print(itostring(a[i]));
   };
};

The code snippet demonstrates writing to and reading from elements of an array. If you compile and run this code you will see it display the values 0 to 9 on standard output. We can access an element of an array (for reading or writing) via the [x] syntax, where x is either an Int constant or variable.
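
To make the two forms of index concrete, the following minimal sketch writes to an element using an Int constant and then reads the same element back using an Int variable. It is built from constructs used elsewhere in this tutorial; declaring i as var i:Int without an explicit allocation type is an assumption, and the value 42 is purely illustrative.

```mesham
#include <io>
#include <string>

function void main() {
   var a:array[Int,10];
   var i:Int;
   a[3]:=42;
   i:=3;
   print(itostring(a[i])+"\n");
};
```

Since both accesses refer to index 3, the program should display the value that was written through the constant index.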

Arrays and functions

#include <io>
#include <string>

function void main() {
   var a:array[Int,10];
   fill(a);
   display(a);
};

function void fill(var a:array[Int,10]) {
   var i;
   for i from 0 to 9 {
      a[i]:=i;
   };
};

function void display(var a:array[Int]) {
   var i;
   for i from 0 to 9 {
      print(itostring(a[i]));
   };
};

This code demonstrates passing arrays into functions, and there are a couple of noteworthy points to make here. First, because an array is allocated to the heap by default, it is passed by reference, as discussed in the functions tutorial. Hence modifications made in the fill function do affect the original data allocated in the main function, which is what we want here. Secondly, notice that the type we provide to the display function does not have any explicit size associated with the array. It is not always possible to know the size of an array that is being passed into a function, so Mesham allows the type of a function argument to be specified without a size, but with two restrictions: first, it must be a one dimensional array and secondly, no compile time bounds checking can take place.
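
As a sketch of why the unsized argument type is useful, the following program passes two arrays of different declared sizes to the same display function. It reuses only syntax shown in this tutorial; the variable b and the values stored in it are illustrative. Because no compile time bounds checking takes place for the unsized argument, it is up to the programmer to keep the loop in display within the bounds of the smallest array passed in (here, ten elements).

```mesham
#include <io>
#include <string>

function void main() {
   var a:array[Int,10];
   var b:array[Int,20];
   var i;
   for i from 0 to 9 {
      a[i]:=i;
   };
   for i from 0 to 19 {
      b[i]:=i*2;
   };
   display(a);
   display(b);
};

function void display(var a:array[Int]) {
   var i;
   for i from 0 to 9 {
      print(itostring(a[i])+"\n");
   };
};
```

The second call only displays the first ten elements of b, since the loop bounds in display are fixed.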

Multi dimensional arrays

Arrays can be any number of dimensions just by adding extra bounds into the type declaration:

function void main() {
   var a:array[Int,16,8];
   a[0][1]:=23;
};

This code illustrates declaring variable a to be an array of two dimensions; the first of size 16 and the second 8. By default all allocation of arrays is row major although this can be overridden. Line three illustrates writing into an element of a two dimensional array.

Communication of arrays

Arrays can be communicated entirely, per dimension or by individual elements.

#include <io>
#include <string>

function void main() {
   var a:array[Int,16,8]::allocated[single[on[1]]];
   proc 0 {
      a[0][1]:=28;
   };
   sync;
   proc 1 {
      print(itostring(a[0][1])+"\n");
   };
};

In this example process 0 writes to the (remote) memory of process 1, which contains the array. Synchronisation then occurs, after which process 1 displays the value to standard output.

Communicating multiple dimensions

#include <io>
#include <string>

function void main() {
   var a:array[Int,16,8]::allocated[single[on[1]]];
   proc 0 {
      var i;
      for i from 0 to 7 {
         a[2][i]:=i;
      };
   };
   sync;
   proc 1 {
      var i;
      for i from 0 to 7 {
         print(itostring(a[2][i])+"\n");
      };
   };
};

Compile and run this code and look at the output. Is it just a list of the value 8, rather than what you expected? In this example the values copied across may be any number between 0 and 8, because each assignment a[2][i]:=i; sets the remote value of a at that index to the value held in i. However, the communication is not guaranteed to complete until the synchronisation, and by that point the value of i is 8 (the loop iterates up to and including 7, after which i is incremented, found to be too large and the loop ceases). This is something to be aware of: the value of a variable being remotely written matters until after the corresponding synchronisation.

There are a number of ways in which we could change this code to make it do what we want. The easiest is to use a temporary variable allocated on the heap, which will be garbage collected after the synchronisation. To do this, replace the proc 0 block with:

proc 0 {
   var i;
   for i from 0 to 7 {
      var m:Int::heap;
      m:=i;
      a[2][i]:=m;
   };
};

This is an example of writing into the remote memory of a process and modifying multiple indexes of an array (in any dimension).

Communicating entire arrays

#include <io>
#include <string>

function void main() {
   var a:array[Int,20]::allocated[single[on[1]]];
   var b:array[Int,20]::allocated[single[on[2]]];
   proc 1 {
      var i;
      for i from 0 to 19 {
         a[i]:=1;
      };
   };
   b:=a;
   sync;
   proc 2 {
      var i;
      for i from 0 to 19 {
         print(itostring(b[i])+"\n");
      };
   };
};

This code example demonstrates populating an array held on one process, assigning it in its entirety to an array on another process (line 13), synchronising and then the other process reading out all elements of that target array which has just been remotely written to.

Row and column major

By default arrays are row major allocated using the row type. This can be overridden to column major via the col type.

function void main() {
   var a:array[Int,16,8]::allocated[col::multiple];
};

This allocates array a as a 16 by 8 Int array, allocated to all processes using column major memory allocation.
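
A minimal sketch, using the same declaration, of reading and writing such an array. The point it is meant to illustrate is that the [x][y] indexing syntax stays the same for a column major array; only the underlying memory layout differs. The indexes and the value 5 are illustrative.

```mesham
#include <io>
#include <string>

function void main() {
   var a:array[Int,16,8]::allocated[col::multiple];
   a[3][2]:=5;
   print(itostring(a[3][2])+"\n");
};
```

Because the write and the read both go through the same col type, the value read back should be the one that was written.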

For something more interesting let's have a look at the following code:

#include <io>
#include <string>

function void main() {
   var a:array[Int,16,8];
   var i;
   var j;
   for i from 0 to 15 {
      for j from 0 to 7 {
         a[i][j]:=(i*10) + j;
      };
   };
   print(itostring(a::col[][14][7]));
};

By default variable a is row major allocated, and we fill the array in this fashion. However, in the print statement we access the indexes of this array in a column major fashion. Try changing col to row, or removing it altogether, to see the difference in value. Behind the scenes the types perform the appropriate memory look up based upon their meaning and the indexes provided. Mixing memory allocation in this manner can be very useful for array transposition, amongst other things.

Exercise: Experiment with the col and row types, and also see what effect placing them in the type chain of a, as in the previous example, has.
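
As a possible starting point for the exercise, the following sketch prints the same pair of indexes three ways: with no extra type, with row and with col. It only combines forms already used above (the a::row[] form follows from swapping col for row as suggested). Since row is the default, the first two lines of output would be expected to agree, while the third may differ.

```mesham
#include <io>
#include <string>

function void main() {
   var a:array[Int,16,8];
   var i;
   var j;
   for i from 0 to 15 {
      for j from 0 to 7 {
         a[i][j]:=(i*10) + j;
      };
   };
   print(itostring(a[14][7])+"\n");
   print(itostring(a::row[][14][7])+"\n");
   print(itostring(a::col[][14][7])+"\n");
};
```

Comparing the three values side by side makes it easier to see which look up each type performs on the same underlying data.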