BSC releases new version of PyCOMPSs/COMPSs with new features towards energy monitoring and better interactivity

14 November 2023

This COMPSs release comes with support for the Energy Aware Runtime (EAR) to obtain energy profiles in Python-based applications.

The interactive execution of applications has been enhanced with a new Jupyter kernel and a JupyterLab extension to manage PyCOMPSs in the Jupyter ecosystem.

Another addition is support for dynamic user-defined constraints on tasks.

The new dislib release includes new algorithms for Random Forest regressor and for the truncated SVD.

The Workflows and Distributed Computing team at the Barcelona Supercomputing Center (BSC-CNS) has launched a new release, version 3.3 (codename Orchid), of the COMPSs programming environment. This version builds on the team's work over recent years to provide a set of tools that help developers program and execute their applications efficiently on distributed computational infrastructures such as clusters, clouds and container-managed platforms. COMPSs is a task-based programming model known for notably improving the performance of large-scale applications by automatically parallelizing their execution.

COMPSs has been available for several years to users of the MareNostrum supercomputer and the Spanish Supercomputing Network, and it has been adopted in several recent research projects such as mF2C, CLASS, ExaQUte, ELASTIC, the BioExcel CoE, LANDSUPPORT, the EXPERTISE ETN, the Edge Twins HPC FET Innovation Launchpad project, and in sample use cases of the ChEESE CoE. In these projects, COMPSs has been applied to implement use cases provided by different communities across diverse disciplines such as biomedicine, engineering, biodiversity, chemistry, astrophysics, finance, telecommunications, manufacturing and earth sciences. Currently, it is also being extended and adopted in applications in the European projects AI-SPRINT, PerMedCoE, CAELESTIS, DT-GEO, ICOS and the CEEC CoE, as well as in the Spanish-funded project HP2C-DT. Of special note is the eFlows4HPC project, coordinated by the group, which aims to develop a workflow software stack in which the PyCOMPSs/COMPSs environment is one of the main components.

While Jupyter notebooks have been supported in PyCOMPSs since version 2.4, the current version comes with extended support and a JupyterLab extension. The extension allows users to switch the COMPSs runtime on and off and helps to generate the Python decorators automatically, making the development of PyCOMPSs applications easier. The extension also provides a graphical component that enables interactive monitoring of applications, showing the application task graph and the application progress.

COMPSs 3.3 also comes with integration with the Energy Aware Runtime (EAR) to obtain energy profiles in Python-based applications. EAR is a management framework that optimizes the energy and efficiency of a cluster of interconnected nodes. To improve the energy efficiency of the cluster, EAR provides energy control, accounting, monitoring and optimization, both for the applications running on the cluster and for the cluster as a whole. The integration with COMPSs now supports energy management for COMPSs workflows, which was not supported before.

COMPSs tasks can be annotated with a constraint that indicates a hardware or software requirement, for example that the task needs to run with a given number of cores or a given amount of memory. In previous versions, however, this constraint was fixed for all tasks of a given type throughout the execution of an application. A new feature in version 3.3 allows these constraints to change dynamically based on the values of global variables.
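The idea of a constraint that follows a global variable can be illustrated with a small, self-contained sketch. This is not the PyCOMPSs API: the `constraint` decorator below is a hypothetical stand-in written in plain Python, and the names `computing_units`, `memory_size`, `cores_per_task` and `simulate` are chosen for illustration only. It simply shows a requirement being resolved each time a task is invoked rather than once when the task is defined.

```python
import functools


def constraint(**requirements):
    """Hypothetical stand-in for a task constraint decorator (NOT the PyCOMPSs API).

    String values are treated as names of global variables and are looked up
    every time the task is called, so the requirement can change at run time.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            resolved = {}
            for key, value in requirements.items():
                if isinstance(value, str):
                    # Look the name up in the task's global namespace at call time;
                    # fall back to the raw string if no such global exists.
                    resolved[key] = func.__globals__.get(value, value)
                else:
                    resolved[key] = value
            # A real runtime would use these values for scheduling;
            # here they are only recorded for inspection.
            wrapper.last_requirements = resolved
            return func(*args, **kwargs)

        wrapper.last_requirements = None
        return wrapper
    return decorator


cores_per_task = 2  # global variable consulted at each task invocation


@constraint(computing_units="cores_per_task", memory_size=4)
def simulate(step):
    return step * step


first = simulate(3)
requirements_before = dict(simulate.last_requirements)

cores_per_task = 8  # later invocations of the same task type see the new value
second = simulate(4)
requirements_after = dict(simulate.last_requirements)
```

In this sketch the two invocations of the same task type carry different core requirements, which is the behaviour that a fixed, definition-time constraint cannot express.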

COMPSs 3.3 comes with other minor new features, extensions and bug fixes.

COMPSs had around 1000 downloads last year and is used by around 20 groups in real applications. COMPSs has recently attracted interest from areas such as engineering, image recognition, genomics and seismology, for which specific courses and dissemination actions have been organised.

The packages and the complete list of features are available on the Downloads page. A Docker image is also available to test the functionalities of COMPSs through a step-by-step tutorial that guides the user through developing and executing a set of example applications.

Additionally, a user guide and papers published in relevant conferences and journals are available.

For more information on COMPSs please visit our webpage: http://www.bsc.es/compss

dislib new version

The group has also launched a new release of dislib, version 0.9.0. The Distributed Computing Library (dislib) provides distributed algorithms ready to use as a library. So far, dislib focuses on machine learning algorithms, with an interface inspired by scikit-learn. The main objective of dislib is to facilitate the execution of big data analytics algorithms on distributed platforms, such as clusters, clouds and supercomputers. Dislib has been implemented on top of the PyCOMPSs programming model, the Python binding of COMPSs.

Dislib is based on a distributed data structure, ds-array, that enables the parallel and distributed execution of the machine learning methods. The dislib library code is implemented as a PyCOMPSs application, where the different methods are annotated as PyCOMPSs tasks. At execution time, PyCOMPSs takes care of all the parallelization and data distribution aspects. As a result, the final dislib user code is unaware of the parallelization and distribution aspects and is written as simple Python scripts, with an interface very similar to that of scikit-learn. Dislib includes methods for clustering, classification, regression, decomposition, model selection and data management. A research contract with FUJITSU partially funded the dislib library, which was used to evaluate the A64FX processor. Currently, the dislib developments are funded by the H2020 AI-SPRINT project and by the EuroHPC eFlows4HPC project.
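The general principle of a block-partitioned array, where each block is processed by an independent task and the partial results are merged, can be sketched in plain Python. This is only an illustration under simplifying assumptions: it uses the standard library's thread pool instead of PyCOMPSs, nested lists instead of ds-array, and the function names (`split_into_row_blocks`, `partial_column_sums`, `column_means`) are hypothetical and do not belong to dislib.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_row_blocks(rows, block_size):
    """Partition a list of rows into consecutive blocks of at most block_size rows."""
    return [rows[i:i + block_size] for i in range(0, len(rows), block_size)]


def partial_column_sums(block):
    """Per-block task: return the column sums and the number of rows of one block."""
    n_cols = len(block[0])
    sums = [sum(row[c] for row in block) for c in range(n_cols)]
    return sums, len(block)


def column_means(blocks):
    """Run one task per block in parallel, then merge the partial results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(partial_column_sums, blocks))
    total_rows = sum(count for _, count in partials)
    n_cols = len(partials[0][0])
    totals = [sum(sums[c] for sums, _ in partials) for c in range(n_cols)]
    return [total / total_rows for total in totals]


# The caller writes ordinary sequential-looking code; the blocking and the
# parallel execution stay hidden behind the helper functions.
data = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
blocks = split_into_row_blocks(data, block_size=2)
means = column_means(blocks)
```

The point of the sketch is the division of labour: the calling code never mentions threads or blocks beyond the initial partitioning, in the same spirit as user code that is unaware of how the library distributes its work.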

Since its recent creation, dislib has been applied in use cases in astrophysics (DBSCAN, with data from the GAIA mission) and in molecular dynamics workflows (Daura and PCA, BioExcel CoE). In the eFlows4HPC project, it is being applied in multiple use cases: in urgent computing for natural hazards (Random Forest regressors), in digital twins for manufacturing (SVD) and in distributed training of neural networks. In the AI-SPRINT project, a personalized healthcare use case on atrial fibrillation detection is implemented using the Random Forest algorithm.

Release 0.9.0 includes two new versions of the Random Forest regressor that follow a data-parallel approach: one based on the PyCOMPSs task failure management mechanism, and a second one using the PyCOMPSs nesting paradigm, in which parallel tasks can generate other tasks within them. It also includes two new SVD algorithms, RandSVD and LancSVD. Both implement the truncated SVD: RandSVD does so by means of a randomised algorithm, whereas LancSVD is based on the Lanczos algorithm. The release also includes an extended version of the TeraSort algorithm that enables sorting by columns. In addition, other smaller operators and extensions for working with ds-arrays have been included.

Dislib 0.9.0 comes with other extensions and with a new user guide. The code is open source and available for download.