Nanos++ provides services to support task parallelism using synchronizations based on data dependencies. Data parallelism is also supported by means of services mapped on top of its task support. Tasks are implemented as user-level threads when possible (currently x86, x86-64, ia64, ppc32, and ppc64 are supported).
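To illustrate what synchronization based on data dependencies means, here is a minimal sketch in standard C++. It is not the Nanos++ API: `DepRuntime`, `submit`, `wait_all`, and `run_example` are hypothetical names invented for this example, and the sketch uses one `std::async` thread per task rather than user-level threads. Each task declares the addresses it reads and writes, and the toy runtime makes it wait for earlier tasks that conflict on those addresses.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <map>
#include <vector>

// Hypothetical miniature runtime (NOT the Nanos++ API): tasks are ordered
// only by the data they declare as inputs and outputs.
class DepRuntime {
public:
    using Addr = const void*;

    // Submit a task that reads the addresses in `in` and writes those in `out`.
    void submit(std::vector<Addr> in, std::vector<Addr> out,
                std::function<void()> body) {
        std::vector<std::shared_future<void>> preds;
        for (Addr a : in) {
            auto it = last_writer_.find(a);
            if (it != last_writer_.end()) preds.push_back(it->second);  // read after write
        }
        for (Addr a : out) {
            auto it = last_writer_.find(a);
            if (it != last_writer_.end()) preds.push_back(it->second);  // write after write
            for (auto& r : readers_[a]) preds.push_back(r);             // write after read
        }
        // The task runs on its own thread, but only after all predecessors finish.
        std::shared_future<void> done =
            std::async(std::launch::async, [preds, body] {
                for (auto& p : preds) p.wait();
                body();
            }).share();
        for (Addr a : in) readers_[a].push_back(done);
        for (Addr a : out) {
            last_writer_[a] = done;
            readers_[a].clear();
        }
        all_.push_back(done);
    }

    // Block until every submitted task has completed.
    void wait_all() {
        for (auto& f : all_) f.wait();
        all_.clear();
    }

private:
    std::map<Addr, std::shared_future<void>> last_writer_;
    std::map<Addr, std::vector<std::shared_future<void>>> readers_;
    std::vector<std::shared_future<void>> all_;
};

// Four tasks: T1 and T2 are independent; T3 needs both; T4 updates T3's result.
int run_example() {
    int a = 0, b = 0, c = 0;
    DepRuntime rt;
    rt.submit({}, {&a}, [&] { a = 2; });             // T1: writes a
    rt.submit({}, {&b}, [&] { b = 3; });             // T2: writes b
    rt.submit({&a, &b}, {&c}, [&] { c = a * b; });   // T3: reads a and b, writes c
    rt.submit({&c}, {&c}, [&] { c += 1; });          // T4: reads and writes c
    rt.wait_all();
    assert(a == 2 && b == 3);
    return c;
}
```

In this sketch T1 and T2 may run concurrently because they touch different data, while T3 and T4 are serialized behind them without any explicit barrier in the user code.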
Nanos++ also provides support for maintaining coherence across different address spaces (such as with GPUs or cluster nodes). It provides software directory and cache modules to this end.
The main purpose of Nanos++ is to support research on parallel programming environments. As such, it is designed to be extensible by means of plugins. Currently, runtime plugins can be added (and selected for each execution) for:
- Task scheduling policy
- Thread barrier
- Device support
- Instrumentation formats
- Dependences approach
- Throttling policies
Several such plugins are already available in the library distribution, including different scheduling and throttling policies and, in particular, device support for CUDA tasks and for execution in a cluster environment. The cluster support is not yet public, but if you are interested you can contact us.
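The idea of selecting a plugin for each execution can be sketched as a registry that maps a name to a factory. The code below is only an illustration under assumed names: `SchedulePolicy`, `FifoPolicy`, `LifoPolicy`, `PolicyRegistry`, and `make_registry` are hypothetical and do not reflect the actual Nanos++ plugin interface or its real policy names.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical interface: a scheduling policy picks the next ready task id.
struct SchedulePolicy {
    virtual ~SchedulePolicy() = default;
    virtual int next(std::deque<int>& ready) = 0;
};

// Oldest ready task first.
struct FifoPolicy : SchedulePolicy {
    int next(std::deque<int>& ready) override {
        int t = ready.front();
        ready.pop_front();
        return t;
    }
};

// Most recently created ready task first.
struct LifoPolicy : SchedulePolicy {
    int next(std::deque<int>& ready) override {
        int t = ready.back();
        ready.pop_back();
        return t;
    }
};

// Maps a policy name (for example, read from a run-time option) to a factory.
class PolicyRegistry {
public:
    using Factory = std::function<std::unique_ptr<SchedulePolicy>()>;

    void add(const std::string& name, Factory f) {
        factories_[name] = std::move(f);
    }

    std::unique_ptr<SchedulePolicy> create(const std::string& name) const {
        auto it = factories_.find(name);
        if (it == factories_.end())
            throw std::invalid_argument("unknown policy: " + name);
        return it->second();
    }

private:
    std::map<std::string, Factory> factories_;
};

// Register the two example policies under assumed names.
PolicyRegistry make_registry() {
    PolicyRegistry r;
    r.add("fifo", [] { return std::unique_ptr<SchedulePolicy>(new FifoPolicy); });
    r.add("lifo", [] { return std::unique_ptr<SchedulePolicy>(new LifoPolicy); });
    return r;
}
```

With a registry like this, the rest of the runtime only depends on the abstract interface, so a new policy can be added without touching the code that uses it.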
Nanos++ comes with support for instrumentation with the Extrae library, which produces traces for performance analysis with the Paraver visualization tool. Using the instrumentation plugin, you can also create the dependence graph of your application. Such tools provide valuable information for understanding your application's characteristics.
Programmers are not expected to write applications that call Nanos++ directly; the preferred way is to use it through one of the supported programming models. To do that, you will probably also need to install our Mercurium compiler.