HPC frameworks provide the user with both the capability to properly exploit the computational resources of an ever-evolving HPC hardware ecosystem and the robustness required to manage the complexity of modern distributed systems.
Summary
Nowadays, simulations in many scientific and engineering fields require both high computing performance and large amounts of memory. To take advantage of the power of supercomputers, one key approach is to parallelise the computation by adapting the process to the features of the problem being solved. One of the most commonly adopted strategies is domain decomposition.
Domain decomposition, the strategy followed by the framework implementations mentioned below, fits cases where the data describe quantities in an n-dimensional space, as in Finite Difference Methods. It allows the problem to be split into largely independent subproblems that nevertheless retain some coupling at their boundaries.
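As a purely illustrative sketch (not code taken from BSIT or WARIS), the following C fragment shows one common way such a decomposition can be expressed for a 1-D grid: each subdomain owns a contiguous block of points and keeps ghost (halo) cells that carry the residual coupling with its neighbours. The names `Subdomain` and `decompose_1d` are hypothetical.

```c
#include <stdio.h>

/* Illustrative sketch: split a 1-D finite-difference grid of n_global
 * points among n_parts subdomains. Each subdomain owns a contiguous
 * block and keeps one ghost (halo) cell per neighbour, which is where
 * the coupling between subproblems lives. */
typedef struct {
    int start;   /* first global index owned by this subdomain   */
    int count;   /* number of points owned                        */
    int halo_lo; /* ghost cells needed from the left neighbour    */
    int halo_hi; /* ghost cells needed from the right neighbour   */
} Subdomain;

static Subdomain decompose_1d(int n_global, int n_parts, int part)
{
    Subdomain s;
    int base = n_global / n_parts;
    int rest = n_global % n_parts;   /* spread the remainder over the first parts */

    s.count   = base + (part < rest ? 1 : 0);
    s.start   = part * base + (part < rest ? part : rest);
    s.halo_lo = (part > 0) ? 1 : 0;             /* no halo at the physical boundary */
    s.halo_hi = (part < n_parts - 1) ? 1 : 0;
    return s;
}

int main(void)
{
    /* Example: 1000 grid points split over 4 subdomains. */
    for (int p = 0; p < 4; ++p) {
        Subdomain s = decompose_1d(1000, 4, p);
        printf("part %d: [%d, %d) halo(%d, %d)\n",
               p, s.start, s.start + s.count, s.halo_lo, s.halo_hi);
    }
    return 0;
}
```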
The main objective of HPC frameworks is to implement such strategies efficiently on current HPC architectures. This entails combining several approaches and addressing the complexities that arise in software design, such as:
- Using protocols such as MPI to build a communication system that handles the coupling needs of distributed-memory systems (a minimal hybrid sketch follows this list)
- Decomposing the processing across computing nodes that host accelerators, such as GPUs
- Making the most of shared-memory systems by building a multithreading system in which some threads are dedicated to tasks such as asynchronous I/O or the parallelisation of computing blocks, while interfaces such as OpenMP remain aware of specific architectural characteristics (like NUMA)
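The sketch below is a hypothetical example, not the BSIT or WARIS implementation, of how two of these ingredients can be combined in a single stencil step: MPI exchanges the ghost cells that couple neighbouring subdomains, and OpenMP parallelises the local update across the threads of a node. The grid size `N_LOCAL` and the stencil weights are arbitrary, and GPU offloading, dedicated I/O threads and NUMA pinning are deliberately left out.

```c
#include <mpi.h>
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define N_LOCAL 1024   /* points owned per rank (illustrative size) */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Local array with one ghost cell on each side (indices 0 and N_LOCAL+1). */
    double *u     = calloc(N_LOCAL + 2, sizeof *u);
    double *u_new = calloc(N_LOCAL + 2, sizeof *u_new);
    for (int i = 1; i <= N_LOCAL; ++i)
        u[i] = rank;                       /* arbitrary initial data */

    int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
    int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

    /* Distributed-memory coupling: fill ghost cells with the neighbours' boundary values. */
    MPI_Sendrecv(&u[1],           1, MPI_DOUBLE, left,  0,
                 &u[N_LOCAL + 1], 1, MPI_DOUBLE, right, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Sendrecv(&u[N_LOCAL],     1, MPI_DOUBLE, right, 1,
                 &u[0],           1, MPI_DOUBLE, left,  1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    /* Shared-memory part: the local stencil update is parallelised with OpenMP. */
    #pragma omp parallel for
    for (int i = 1; i <= N_LOCAL; ++i)
        u_new[i] = 0.25 * u[i - 1] + 0.5 * u[i] + 0.25 * u[i + 1];

    if (rank == 0)
        printf("step done on %d ranks, %d threads each\n",
               size, omp_get_max_threads());

    free(u);
    free(u_new);
    MPI_Finalize();
    return 0;
}
```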
Throughout, we aim to maintain a certain degree of discipline in order to ease the development and portability of applications.
Currently, two implementations of this kind of framework are in use: BSIT, a geophysical imaging system, and WARIS, which focuses on atmospheric transport modelling.
Objectives
- Develop HPC frameworks that supply structure and functionality to sets of HPC applications
- Provide reliable parallel-computing foundations for the development of HPC applications
- Encourage and promote discipline within development, emphasising modularity and reusability, in order to guarantee quality throughout the software development life cycle.