BSC offers the HPC community a set of tools that helps developers program and execute their applications efficiently on distributed computational infrastructures.
This release includes extensions for I/O-intensive applications: a new type of I/O task that can be overlapped with computational tasks, and a storage bandwidth constraint that restricts the number of tasks performing intensive I/O at the same time.
This new distribution also comes with the PyCOMPSs-player, a tool to deploy and control a container-based PyCOMPSs/COMPSs environment ready to develop and test PyCOMPSs/COMPSs applications.
The Workflows and Distributed Computing team at the Barcelona Supercomputing Center (BSC) is proud to announce a new release, version 2.7 (codename Hyacinth), of the programming environment COMPSs.
This version of COMPSs builds on the team’s work over recent years to provide a set of tools that helps developers program and execute their applications efficiently on distributed computational infrastructures such as clusters, clouds and container-managed platforms. COMPSs is a task-based programming model known for notably improving the performance of large-scale applications by automatically parallelizing their execution.
COMPSs has been available to users of the MareNostrum supercomputer and the Spanish Supercomputing Network for several years, and it has been adopted in several research projects such as EUBra-BIGSEA, MUG, EGI, ASCETIC, TANGO, NEXTGenIO, and mF2C. In these projects, COMPSs has been applied to implement use cases provided by different communities across diverse disciplines such as biomedicine, engineering, biodiversity, chemistry, astrophysics and earth sciences. Currently it is also being extended and adopted in applications in the projects CLASS, ELASTIC, ExaQUte, LANDSUPPORT, the BioExcel CoE, and the EXPERTISE ETN, as well as in a research contract with FUJITSU.
The new release comes with specific support for I/O-intensive applications, with features that enable I/O awareness. The new @IO annotation identifies tasks that essentially perform an I/O operation. Such tasks require very little computation and can therefore be overlapped with compute-intensive tasks. I/O tasks can be of any type supported by COMPSs: regular tasks, tasks invoking external binaries, and MPI tasks, which make it possible to have parallel I/O tasks that use MPI. Another mechanism that can be combined with the I/O task annotation is the storage bandwidth constraint, which indicates the amount of bandwidth a task requires and is used to limit the number of I/O tasks that run in parallel so as not to exceed the maximum bandwidth of the storage infrastructure. Both mechanisms are used by the COMPSs runtime to better schedule and execute I/O-intensive applications.
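To illustrate the idea behind the storage bandwidth constraint, the following is a minimal sketch in plain Python. It is not COMPSs runtime code and does not use the COMPSs API: the names `IOTask`, `bandwidth_mbps` and `admit_io_tasks`, the units, and the greedy first-come policy are all assumptions made for illustration. It only shows how a per-task bandwidth figure can cap the number of I/O tasks that run at the same time.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IOTask:
    """Hypothetical description of an I/O task (not a COMPSs type)."""
    name: str
    bandwidth_mbps: int  # assumed per-task bandwidth requirement


def admit_io_tasks(pending: List[IOTask], storage_bandwidth_mbps: int) -> List[IOTask]:
    """Admit tasks in submission order while the summed bandwidth fits the budget."""
    admitted: List[IOTask] = []
    used = 0
    for task in pending:
        if used + task.bandwidth_mbps <= storage_bandwidth_mbps:
            admitted.append(task)
            used += task.bandwidth_mbps
    return admitted


pending = [
    IOTask("checkpoint_0", 200),
    IOTask("checkpoint_1", 200),
    IOTask("checkpoint_2", 200),
]
running = admit_io_tasks(pending, storage_bandwidth_mbps=450)
print([task.name for task in running])  # ['checkpoint_0', 'checkpoint_1']
```

With a 450 MB/s budget and 200 MB/s per task, only two of the three I/O tasks are admitted together; the third would wait until bandwidth is released. A real scheduler would also interleave these tasks with compute tasks, which this sketch leaves out.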
Another appealing addition to this release is the PyCOMPSs-player, a tool for using PyCOMPSs interactively on local machines through Docker images. It enables quick installation for training, demos, application development and testing, or on systems where a full installation can be cumbersome. It can be easily installed with pip and comes with a small set of simple commands to initialize the environment (which downloads the Docker image the first time), start a Jupyter notebook server, start the COMPSs monitor, and so on.
COMPSs 2.7 also supports a new type of data dependency based on directories instead of individual file names. This makes it possible to define data dependencies in applications that use directories to organize multiple files used by a task.
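As a rough illustration of what a directory-level dependency means, the sketch below tracks which task last wrote each directory and makes later readers of that directory depend on it. This is plain Python written for this example, not the COMPSs implementation or its API; the `Task` class, its `reads` and `writes` fields, and the `directory_dependencies` helper are hypothetical, and only read-after-write dependencies are modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Task:
    """Hypothetical task that declares the directories it reads and writes."""
    name: str
    reads: Set[str] = field(default_factory=set)
    writes: Set[str] = field(default_factory=set)


def directory_dependencies(tasks: List[Task]) -> Dict[str, List[str]]:
    """Map each task to the earlier tasks that last wrote a directory it reads."""
    last_writer: Dict[str, str] = {}
    deps: Dict[str, List[str]] = {}
    for task in tasks:
        deps[task.name] = sorted(
            {last_writer[d] for d in task.reads if d in last_writer}
        )
        for d in task.writes:
            last_writer[d] = task.name
    return deps


tasks = [
    Task("simulate", writes={"results/"}),
    Task("plot", reads={"results/"}),
    Task("archive", reads={"results/"}, writes={"backup/"}),
]
print(directory_dependencies(tasks))
# {'simulate': [], 'plot': ['simulate'], 'archive': ['simulate']}
```

The dependency is expressed on `results/` as a whole, so neither `plot` nor `archive` needs to know the names of the individual files that `simulate` produces.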
Other extensions are: a new "weight" property for parameters that enables locality policies to take into account the size of the parameters; support for MPI+OpenMP hybrid tasks; support for Python type hinting; improvements and optimizations in DDS-2, which supports a Spark-like syntax.
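The effect of a parameter weight on a locality policy can be sketched as follows. This is an illustrative assumption about how such a policy might score workers, written in plain Python; the function `pick_worker`, the worker names and the weight values are hypothetical and are not taken from the COMPSs scheduler.

```python
from typing import Dict, List, Tuple


def pick_worker(
    params: List[Tuple[str, float]],
    location: Dict[str, str],
    workers: List[str],
) -> str:
    """Choose the worker that already holds the largest total parameter weight."""
    score = {worker: 0.0 for worker in workers}
    for name, weight in params:
        holder = location.get(name)
        if holder in score:
            score[holder] += weight
    # On a tie, max() keeps the first worker in the list.
    return max(workers, key=lambda worker: score[worker])


# One large parameter on worker_a, two small ones on worker_b.
params = [("matrix", 10.0), ("config", 1.0), ("labels", 1.0)]
location = {"matrix": "worker_a", "config": "worker_b", "labels": "worker_b"}
print(pick_worker(params, location, ["worker_a", "worker_b"]))  # worker_a
```

A policy that only counted parameters would prefer `worker_b`, which holds two of the three; weighting by size favours `worker_a`, where the largest parameter already resides.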
COMPSs 2.7 comes with other minor new features, extensions and bug fixes.
COMPSs had around 1000 downloads last year and is used by around 20 groups in real applications. COMPSs has recently attracted interest from areas such as engineering, image recognition, genomics and seismology, where specific courses and dissemination actions have been performed.
The packages and the complete list of features are available on the Downloads page. A virtual appliance is also available for testing the functionality of COMPSs through a step-by-step tutorial that guides the user through developing and executing a set of example applications.
Additionally, a user guide and papers published in relevant conferences and journals are available.
For more information on COMPSs please visit our webpage: http://www.bsc.es/compss
The Workflows and Distributed Computing team at the Barcelona Supercomputing Center aims to offer tools and mechanisms that enable the sharing, selection, and aggregation of a wide variety of geographically distributed computational resources in a transparent way. The team's research builds on the group's prior expertise and extends it towards the aspects of distributed computing that can benefit from it. The team at BSC has a strong focus on programming models, resource management, and scheduling in distributed computing environments.