The Compiler

Revision as of 17:31, 19 January 2013 by Polas (talk | contribs) (Compilation in more detail)

Overview

The core translator produces ANSI standard C99 code which uses the Message Passing Interface (MPI, version 2) for communication. The target machine therefore needs an implementation of MPI; OpenMPI, MPICH or a vendor-specific MPI will all work with the generated code. Our runtime library (known as Idaho) must also be linked in. The runtime library performs three roles. Firstly, it is architecture specific (versions exist for different flavours of Linux): it contains any non-portable code which is needed and is optimised for specific platforms. Secondly, it contains functions which are called often and which would otherwise increase the size of the generated C code. Lastly, placing certain functionality in this library means that behaviour can be tuned or modified for a specific platform at the library level, rather than by recompiling all existing Mesham codes. The standard runtime library requires the Boehm-Demers-Weiser conservative garbage collector (libgc).

[Figure: Meshamworkflow.png]

The resulting executable can be treated like any normal executable, and can be run in two ways. Firstly, for simplicity, the user can start the program with just one process and it will automatically spawn the number of processes required. Secondly, the executable can be started with the exact number of processes needed, which may be arranged via a process file or queue submission program. As long as your MPI implementation supports multi-core machines (and the majority do), the code will execute properly on a multi-core machine, often with the processes wrapping around the cores (for instance, 2 processes on 2 cores gives 1 process on each, and 6 processes on 2 cores gives 3 processes on each).
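The wrapping arithmetic can be sketched in a few lines of bash. This is only an illustration that assumes an even, round-robin placement; the helper name procs_per_core is hypothetical, and the real placement is decided by your MPI implementation and the operating system scheduler.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count how many processes land on each core when
# `procs` processes wrap round-robin over `cores` cores.
# ASSUMPTION: even round-robin placement; real MPI launchers may differ.
procs_per_core() {
  local procs=$1 cores=$2 core count
  for ((core = 0; core < cores; core++)); do
    # Core `core` receives processes core, core+cores, core+2*cores, ...
    count=$(( (procs - core + cores - 1) / cores ))
    echo "core ${core}: ${count} process(es)"
  done
}

procs_per_core 6 2
```

Called with 6 processes and 2 cores, this reports 3 processes on each core, matching the example in the text.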

Whilst earlier versions of the MPICH daemon allowed the user to simply run their executable and the daemon would pick it up, Hydra, the latest MPICH process manager, requires you to run it via the mpiexec command. We suggest mpiexec -np 1 ./name, where name is the name of your executable; the code will then spawn the necessary number of processes.
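The two ways of launching can be captured in a small bash helper. This is a minimal sketch: the function name mesham_launch_cmd is hypothetical, and it only prints the mpiexec command line rather than running it, so it can be tried on a machine without MPI installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the mpiexec command line for an executable.
# With no process count given it starts a single process and leaves the
# program to spawn the rest, as suggested above for Hydra.
mesham_launch_cmd() {
  local exe=$1 nprocs=${2:-1}
  printf 'mpiexec -np %s %s\n' "$nprocs" "$exe"
}

mesham_launch_cmd ./name      # prints: mpiexec -np 1 ./name
mesham_launch_cmd ./name 4    # prints: mpiexec -np 4 ./name
```

The second form corresponds to starting the exact number of processes up front, for example from a queue submission script.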

Compilation in more detail

The compiler is organised into a number of phases. Firstly, your Mesham code goes through a preprocessor, which expands directives (such as include) into Mesham code. It is at this stage that the standard function libraries are made available to the code, if the programmer has included them. The code is then fed into the core compiler, which contains the keywords and general rules of the language but does not contain any types. The types exist in a separate library, and the core compiler calls into the appropriate types via an API to obtain their behaviour.

[Figure: Oubliettelandscape.png]

The Oubliette core produces non-human-readable ANSI C99 code as an intermediate representation (IR), which is then fed into an applicable C compiler. This stage is also performed by the compiler, although it is possible to dump out this C code and compile it manually if desired.

Command line options

  • -o [name] Select output filename
  • -I [dir] Include the directory in the preprocessor path
  • -c Output C code only to a file
  • -cc Output C code only to stdout
  • -e Display C compiler errors and warnings also
  • -s Silent operation (no warnings)
  • -summary Produce a summary of compilation
  • -pp Output preprocessed result to a file
  • -f [args] Forward arguments to the C compiler
  • -static Statically link against the runtime library
  • -shared Dynamically link against the runtime library (default)
  • -env Display compiler environment variable information
  • -h Display compiler help message
  • -v Display compiler version information
  • -vt Display compiler and type version information
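As a hedged illustration of how these options might be combined, the sketch below prints a few candidate command lines. Only the option names come from the list above. The executable name mc, the source file hello.mesh, the include directory and the placement of the source file as the last argument are all assumptions made for the example, so check them against your installation. Commands are printed rather than executed unless DRY_RUN is set to an empty value.

```shell
#!/usr/bin/env bash
# ASSUMPTIONS (not stated on this page): the compiler executable is "mc",
# sources end in ".mesh", and the source file is given as the last argument.
MC=${MC-mc}
DRY_RUN=${DRY_RUN-1}   # run with DRY_RUN= (empty) to really execute

# Print a command, and execute it only when DRY_RUN is empty.
run() {
  echo "+ $*"
  if [ -z "$DRY_RUN" ]; then
    "$@"
  fi
}

run "$MC" -o hello hello.mesh             # select the output filename
run "$MC" -I /opt/mesham/lib hello.mesh   # extra preprocessor directory
run "$MC" -pp hello.mesh                  # preprocessed result to a file
run "$MC" -c hello.mesh                   # generated C code to a file
run "$MC" -e -summary hello.mesh          # C compiler messages and a summary
run "$MC" -f "-O2" hello.mesh             # forward -O2 to the C compiler
run "$MC" -env                            # show environment information
```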

Environment variables

The Mesham compiler relies on certain environment variables being set in order to select options such as the C compiler and the location of dependencies. You need not set all of them; a subset is fine if that is appropriate to your system.

  • MESHAM_SYS_INCLUDE The location of the Mesham function include files; multiple entries are separated by ;
  • MESHAM_INCLUDE The optional location of any additional include files; multiple entries are separated by ;
  • MESHAM_C_COMPILER The C compiler to use; mpicc is a common choice
  • MESHAM_C_INCLUDE The location of header files for the C compiler to include, specifically mesham.h; multiple entries are separated by ;
  • MESHAM_C_LIBRARY The location of libraries for the C compiler to link against, specifically the runtime library; multiple entries are separated by ;
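A list-valued variable such as MESHAM_C_INCLUDE has to be split on ; before its entries can be handed to a C compiler. The sketch below shows one way this could be done in bash, turning each directory into a -I or -L flag. How the Mesham compiler itself consumes these variables is not described here, so the helper name paths_to_flags and the example directories are purely illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a ';'-separated directory list into repeated
# compiler flags, e.g. flag "-I" and list "/a;/b" give "-I/a -I/b".
# Empty entries (as in "/a;;/b") are skipped.
paths_to_flags() {
  local flag=$1 list=$2 dir
  local -a dirs=() flags=()
  IFS=';' read -r -a dirs <<< "$list"
  for dir in "${dirs[@]}"; do
    if [ -n "$dir" ]; then
      flags+=("${flag}${dir}")
    fi
  done
  printf '%s\n' "${flags[*]}"
}

# Example directories are placeholders, not real Mesham install paths.
paths_to_flags -I "/opt/mesham/include;/usr/local/include"
paths_to_flags -L "/opt/mesham/lib"
```

Note that the values must be quoted when they are assigned or passed around, because an unquoted ; would otherwise end the shell command.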

It is common to set these variables in the .bashrc script, which is usually found in your home directory. For example, adding the lines

export MESHAM_SYS_INCLUDE=/usr/include/mesham
export MESHAM_C_INCLUDE=$HOME/mesham/idaho
export MESHAM_C_LIBRARY=$HOME/mesham/idaho
export MESHAM_C_COMPILER=mpicc

sets these four variables; change the values as required for your system.
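After editing the script and opening a new shell, it can be useful to confirm what is actually set. The compiler's own -env option displays this information; the bash sketch below is an independent check that does not need the compiler, and the function name check_mesham_env is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which Mesham variables are set in this shell.
# An unset variable is not treated as an error, since a subset may be
# enough on a given system.
check_mesham_env() {
  local name
  for name in MESHAM_SYS_INCLUDE MESHAM_INCLUDE MESHAM_C_COMPILER \
              MESHAM_C_INCLUDE MESHAM_C_LIBRARY; do
    if [ -n "${!name:-}" ]; then
      echo "${name}=${!name}"
    else
      echo "${name} is not set"
    fi
  done
}

check_mesham_env
```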