Image processing

From Mesham
Revision as of 17:56, 11 January 2010 by Polas (talk | contribs) (Performance)
Jump to navigationJump to search

Overview

This example is one of the more complex examples we have written in the language. It allows the user to perform some parallel image processing on a black and white image. The image processing supported is applying a low or high pass filter on the image. However, to do this the image needs to be transformed into the frequency domain - and then requires transformation back into the time domain. At the core of the example is the FFT kernel, this is a basic Cooley Turkey FFT algorithm and there are more efficient ones out there. Having said that, the type information provided by the programmer allows the compiler to perform a large amount of optimisation during the translation process. By playing around you can change the filters and also invoke the high pass filter rather than the lowpass which the code does at the moment.

Image processing using filters in Mesham

Performance

Performance of the Fast Fourier Transform (FFT) has been evaluated on a super computer cluster. Two different experiments were performed, one with an image size of 128MB and the other with an image size of 2GB. Evaluations were performed against the Fastest Fourier Transformation in the West (FFTW) and a book example for 128MB. As can be seen with an uneven distribution of data (10 and 20 processors) FFTW experiences severe slowdowns whereas the Mesham version does not (the compiler will optimise the code in this case to avoid any slow down.)

Fast Fourier Transformation with 2GB of data
Fast Fourier Transformation with 128MB of data

Source Code

var complex : record["r",Float,"i",Float];
var n:=256; // image size
var m:=4; // number of processors
function void main[]
{
  var a:array[complex,n,n] :: allocated[row[] :: single[on[0]]];
  var s:array[complex,n,n] :: allocated[row[] :: horizontal[m] :: single[evendist[]]];
  var s2:array[complex,n,n] :: allocated[col[] :: horizontal[m] :: single[evendist[]]];
  var s3:array[complex,n,n] :: allocated[row[] :: horizontal[m] :: single[evendist[]] :: share[s2]];
  proc 0
  {
     var orig:="clown.ppm";
     loadfile[orig,a];
     moveorigin[a];
  };
  s:=a;
  var sin:array[complex,n % 2] :: allocated[row[]::multiple[]];
  computesin[sin];
  var p;
  par p from 0 to m - 1
  {
     var i;
     for i from (s#p).low to (s#p).high FFT[((s#p)#i),sin];
  };
  s2:=s;
  par p from 0 to m - 1
  {
     var i;
     for i from (s3#p).low to (s3#p).high FFT[((s3#p)#i),sin];
  };
  a:=s3;
  proc 0
  {
     filter[a];
     invert[a];
  };
  s:=a;
  par p from 0 to m - 1
  {
     var i;
     for i from (s#p).low to (s#p).high FFT[((s#p)#i),sin];
  };
  s2:=s;
  par p from 0 to m - 1
  {
     var i;
     for i from (s3#p).low to (s3#p).high FFT[((s3#p)#i),sin];
  };
  a:=s3;
  proc 0
  {
     moveorigin[a];
     descale[a];
     var res:="result.ppm";
     writefile[res,a];
  };
};

function void computesin[var sinusoid]
{
  sinusoid:array[complex, n % 2] :: allocated[row[] :: multiple[]];
  var j;
  for j from 0 to (n % 2) - 1
  {
     var topass:Float :: allocated[multiple[]];
     topass:=((2 * pi[] * j) % n);
     (sinusoid#j).i:=negsin[topass];
     (sinusoid#j).r:=cos[topass];
  };
};

function void FFT[var data, var sinusoid]
{
  data : array[complex,n] :: allocated[row[] :: multiple[]];
  sinusoid:array[complex, n % 2] :: allocated[row[] :: multiple[]];
  var i2:=log[n];

  bitreverse[data,n]; // data decomposition

  var increvec;
  for increvec from 2 to n // loops to log n stages
  {
     i2:=i2 - 1;
     var i0;
     for i0 from 0 to ((increvec % 2) - 1) // for each frequency spectra in stage
     {
        // below computes the sinusoid for this spectra
        var i1;
        for i1 from 0 to n - 1 // do butterfly for each point in the spectra
        (
           var f0:=((data#(i0 + i1 + (increvec % 2))).r * (sinusoid#(i0 << i2)).r)
                    - ((data#(i0 + i1 + (increvec % 2))).i * (sinusoid#(i0 << i2)).i);
           var f1:=((data#(i0 + i1 + (increvec % 2))).r * (sinusoid#(i0 << i2)).i)
                   + ((data#(i0 + i1 + (increvec % 2))).i * (sinusoid#(i0 << i2)).r);

           (data#(i0 + i1 + (increvec % 2))).r:=(data#(i0 + i1)).r - f0;
           (data#(i0 + i1 + (increvec % 2))).i:=(data#(i0 + i1)).i - f1;
           (data#(i0 + i1)).r := (data#(i0 + i1)).r + f0;
           (data#(i0 + i1)).i := (data#(i0 + i1)).i + f1;

           i1:=(i1 + increvec) - 1;
        };
     };
     increvec:=(increvec * 2) - 1;
  };
};

function void writefile[var thename:String, var data]
{
  data : array[complex,n,n] :: allocated[row[] :: multiple[]];
  var fil:=openfile[thename,"w"];
  writetofile[fil,"P6\\n# CREATOR: LOGS Program\\n"];
  writetofile[fil,n];
  writetofile[fil," "];
  writetofile[fil,n];
  writetofile[fil,"\\n255\\n"];
  var i;
  for i from 0 to n - 1
  {
     var j;
     for j from 0 to n - 1
     {
        var f:=((data#i)#j).r;
        writechartofile[fil,f];
        writechartofile[fil,f];
        writechartofile[fil,f];
     };
  };
  closefile[fil];
};

function void loadfile[var name,var data]
{
  name : String :: allocated[multiple[]];
  data : array[complex,n,n] :: allocated[row[] :: multiple[]];
  var fil:=openfile[name,"r"];
  readline[fil];
  readline[fil];
  readline[fil];
  readline[fil];
  var i;
  for i from 0 to n - 1
  {
     var j;
     for j from 0 to n - 1
     {
        var red:=readchar[fil];
        var green:=readchar[fil];
        var blue:=readchar[fil];
        ((data#i)#j).r:=toInt[red];
        ((data#i)#j).i:=toInt[red];
     };
  };
  closefile[fil];
};

function Int lowpass[var i, var j]
{
  i:Int :: allocated[multiple[]];
  j:Int :: allocated[multiple[]];
  var val:=sqr[i] + sqr[j];
  if (sqrt[val] < 225) return 1;
  return 0;
};

function Int highpass[var i, var j]
{
  i:Int :: allocated[multiple[]];
  j:Int :: allocated[multiple[]];
  var val:=sqr[i] + sqr[j];
  if (sqrt[val] > 190) return 1;
  return 0;
};

function void filter[var data]
{
  data : array[complex,n,n] :: allocated[row[] :: multiple[]];
  var i;
  for i from 0 to n - 1
  {
     var j;
     for j from 0 to n - 1
     {
        ((data#i)#j).r:=((data#i)#j).r * lowpass[i,j];
        ((data#i)#j).i:=((data#i)#j).i * lowpass[i,j];
     };
  };
};

function void moveorigin[var data]
{
  data : array[complex,n,n] :: allocated[row[] :: multiple[]];
  var i;
  for i from 0 to n - 1
  {
     var j;
     for j from 0 to n - 1
     {
        ((data#i)#j).r:=((data#i)#j).r * pow[-1,(i + j)];
        ((data#i)#j).i:=((data#i)#j).i * pow[-1,(i + j)];
     };
  };
};

function void descale[var data]
{
  data : array[complex,n,n] :: allocated[row[] :: multiple[]];
  var i;
  for i from 0 to n - 1
  {
     var j;
     for j from 0 to n - 1
     {
        var xnumy:=((data#i)#j).r;
        xnumy:=xnumy % (n * n);
        ((data#i)#j).r:=xnumy;

        xnumy:=((data#i)#j).i;
        xnumy:=neg[xnumy % (n * n)];
        ((data#i)#j).i:=xnumy;
     };
  };
};

function void invert[var data]
{
  data : array[complex,n,n] :: allocated[row[] :: multiple[]];
  var i;
  for i from 0 to n - 1
  {
     var j;
     for j from 0 to n - 1
     {
        var t:=((data#i)#j).i;
        ((data#i)#j).i:=neg[t];
     };
  };
};

Notes

The algorithm is relatively simple, one drawback is that, even though the transpositions are distributed, after the first 2d FFT all the data must be sent back to process 0 for filtering, and then the data is redistributed. It would improve runtime if we could filter the data without having to collect it all on a central one - this would be an interesting improvement to make to the algorithm.

Note: This example will produce an image in the Portable PixMap format (PPM), viewers of these on Unix based systems are easy to come by (i.e. eye of gnome) but on Windows are slightly more difficult. Windows users might want to rewrite some of the last bit on process 0 so that a BitMaP is created.

Download

You can download the entire Image processing package here