GekkoFS

Big Data BSC Group: Computer Sciences Software

GekkoFS is a file system that aggregates the local I/O capacity and performance of each compute node in an HPC cluster to produce a high-performance storage space that can be accessed in a distributed manner.

 
Software Author: 

BSC and University of Mainz

License: 

MIT License 

MIT License (Latest Version)

Midterm ADMIRE version of GekkoFS

Release Notes

Changelog

All notable changes to the GekkoFS project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[Unreleased]

New

  • Additional tests to increase code coverage (!141).
  • GKFS_ENABLE_UNUSED_FUNCTIONS added to disable unused code and thereby increase code coverage (!141).
  • Updated Parallax version to the new API (the parallax option needs kv_format.parallax in the path, and the database must be on a device that supports O_DIRECT) (!158).
  • Support for increasing file size via truncate() added (!159).
  • Added PowerPC support (!151).
  • GKFS_RENAME_SUPPORT added to support renaming files. This specifically targets the use case for opened files using an existing file descriptor (!133).
  • Added flock and fcntl lock functions to the interception library. GekkoFS does not support these locks, so the calls return the corresponding error code (!133).
  • Added ARM64 support (!160).
  • Added support for CMake presets to simplify build configurations (!163).
  • Several improvements to CMake scripts (!143):
    • Dependency management is now handled more consistently: system dependencies are found using find_package(), whereas source-only dependencies are found using include_from_source(). This new function integrates a dependency provided its source code is available at GKFS_DEPENDENCIES_PATH. If it's not, it will try to automatically download it from its git repository using CMake's FetchContent().
    • More consistent use of targets (we are closer to 100% modern CMake).
    • Adds gkfs_feature_summary(), which prints a summary of all GekkoFS configuration options and their values. This should help users see precisely how a GekkoFS instance has been configured at build time.
  • Added (parallel) append support for consecutive writes with file descriptor opened with O_APPEND (!164).
  • Added support for Spack so that it can be used to install GekkoFS (!137).
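
The truncate() growth and O_APPEND entries above describe standard POSIX file semantics. The following is a minimal sketch of those semantics on an ordinary local file, not GekkoFS code: the helper name `grow_then_append` and the file name are hypothetical, and running it against a GekkoFS mount point is assumed rather than shown.

```python
import os
import tempfile


def grow_then_append(path: str) -> bytes:
    """Create a file, grow it with truncate(), then append via O_APPEND."""
    # Start with three bytes of data.
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o600)
    try:
        os.write(fd, b"abc")
    finally:
        os.close(fd)

    # Growing a file with truncate(): the new range reads back as zero bytes.
    os.truncate(path, 8)

    # With O_APPEND, each write lands at the current end of the file,
    # so consecutive writes follow one another.
    fd = os.open(path, os.O_WRONLY | os.O_APPEND)
    try:
        os.write(fd, b"X")
        os.write(fd, b"Y")
    finally:
        os.close(fd)

    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(grow_then_append(os.path.join(tmp, "demo.bin")))
```

The file ends up with the original data, a zero-filled gap up to the truncated size, and the appended bytes after it.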

Changed

  • Support parallelism for path resolution tests (!145).
  • Support parallelism for symlink tests (!147).
  • Update Parallax release (PARALLAX-exp) (!158).
  • Improved and simplified coverage generation procedures for developers with specific CMake targets (!163).

Removed

Fixed

  • Updated daemon log level for tests (!138).
  • Using unlink now fails if it is a directory unless the AT_REMOVEDIR flag is used (POSIX compliance) (!139).
  • Fixed fchdir generating a SIGSEGV in debug mode (caused by logging) (!141).
  • Support glibc-2.34 or newer with syscall_intercept.
  • GekkoFS documentation is now automatically generated and published (!95, !109, !125).
  • Added a guided distributor mode which allows defining a specific distribution of data on a per directory or file basis (!39).
  • For developers:
    • A convenience library has been added for unit testing (!94).
    • Code format is now enforced with the clang-format tool (!66). A new script is available in scripts/check_format.sh for ease of use.
    • GKFS_METADATA_MOD macro has been added allowing the MetadataModule to be logged, among others (!98).
    • A convenience library has been added for path_util (!102).
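
The unlink fix listed above follows the POSIX rule that unlink() refuses directories, while removing a directory requires rmdir() or unlinkat() with AT_REMOVEDIR. The sketch below illustrates that rule on a local file system; `remove_entry` is a hypothetical helper, not part of GekkoFS.

```python
import os
import tempfile


def remove_entry(path: str) -> str:
    """Remove a file or an empty directory and report which call succeeded."""
    try:
        # unlink() must fail on a directory.
        os.unlink(path)
        return "unlink"
    except (IsADirectoryError, PermissionError):
        # Linux reports EISDIR for a directory; POSIX also allows EPERM.
        # rmdir() is the equivalent of unlinkat(..., AT_REMOVEDIR).
        os.rmdir(path)
        return "rmdir"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        directory = os.path.join(tmp, "d")
        os.mkdir(directory)
        print(remove_entry(directory))
```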

Changed

  • GekkoFS license has been changed to GNU General Public License version 3 (!88)
  • Create, stat, and remove operations have been refactored and improved, reducing the number of required RPCs per operation (!60).
  • Syscall_intercept now supports glibc version 2.3 or newer (!72).
  • All arithmetic operations based on block sizes, and therefore chunk computations, are now constexpr (!75).
  • The CI pipeline has been significantly optimized (!103).
  • The GekkoFS dependency download and compile scripts have been heavily refactored and improved (!111).
  • GekkoFS now supports the latest dependency versions (!112).
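
To make the block-size arithmetic mentioned above concrete, here is a minimal sketch of the kind of chunk computation a chunked file system performs. It is an illustration only: the chunk size and all function names are hypothetical and do not reflect the actual GekkoFS implementation, which is written in C++ and evaluates such expressions as constexpr.

```python
CHUNK_SIZE = 512 * 1024  # hypothetical chunk size in bytes


def chunk_index(offset: int) -> int:
    """Index of the chunk that contains the given byte offset."""
    return offset // CHUNK_SIZE


def chunk_offset(offset: int) -> int:
    """Position of the byte offset inside its chunk."""
    return offset % CHUNK_SIZE


def chunks_spanned(offset: int, length: int) -> int:
    """Number of chunks touched by an I/O request of `length` bytes."""
    if length <= 0:
        return 0
    first = chunk_index(offset)
    last = chunk_index(offset + length - 1)
    return last - first + 1
```

A two-byte request that starts on the last byte of a chunk touches two chunks, which is the case these helpers exist to catch.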

Removed

  • Boost is no longer used for the client and daemon (!90, !122, !123). Note that tests still require Boost_preprocessor.
  • Unneeded sources in CMake have been removed (!101).

Fixed

  • Building tests no longer proceeds if virtualenv creation fails (!68).
  • An error where unit tests could not be found has been fixed (!79).
  • The daemon can now be restarted without losing its namespace (!85).
  • An issue has been resolved that required AGIOS even if it wasn't being used (!104).
  • Several issues that caused docker images to fail have been resolved (!105, !106, !107, !114).
  • A CMake issue in path_util that caused the compilation to fail was fixed (!115).
  • Fixed an issue where ls failed because newer kernels use fstatat() with AT_EMPTY_PATH (!116).
  • Fixed an issue where LOG_OUTPUT_TRUNC did not work as expected (!118).

[0.8.0] - 2020-09-15

New

  • Both client library and daemon have been extended to support the ofi+verbs protocol.
  • A new Python testing harness has been implemented to support integration tests. The end goal is to increase the robustness of the code in the mid- to long-term.
  • The RPC protocol and the usage of shared memory for intra-node communication no longer need to be activated at compile time. New arguments -P|--rpc-protocol and --auto-sm have been added to the daemon to this effect. These configuration options are propagated to clients when they initialize and contact daemons.
  • Native support for the Omni-Path network protocol by choosing the ofi+psm2 RPC protocol. Note that this requires libfabric's version to be greater than 1.8 as well as psm2 to be installed in the system. Clients must set FI_PSM2_DISCONNECT=1 to be able to reconnect after the client has been shut down once. Known limitations: Client reconnect doesn't always work. Apparently, if clients reconnect too fast, the servers won't accept the connections. Also, more than 16 clients per node are currently not supported.
  • A new execution mode called GekkoFWD that allows GekkoFS to run as a user-level I/O forwarding infrastructure for applications. In this mode, I/O operations from an application are intercepted and forwarded to a single GekkoFS daemon that is chosen according to a pre-defined distribution. In the daemons, the requests are scheduled using the AGIOS scheduling library before they are dispatched to the shared backend parallel file system.
  • The fsync() system call is now fully supported.

Improved

  • Argobots tasks in the daemon are now wrapped in a dedicated class, effectively removing the dependency. This lays the groundwork for future non-Argobots I/O implementations.
  • The readdir() implementation has been refactored and improved.
  • Improved how the installation scripts manage dependencies.

Fixed

  • The server sometimes crashed due to uncaught system errors in the storage backend. This has now been fixed.
  • Fixed a bug that broke ls on some architectures.
  • Fixed a bug that leaked internal errors from the interception library to client applications via errno propagation.

[0.7.0] - 2020-02-05

Added

  • Added support for eventfd() and eventfd2() system calls.

Changed

  • Replaced Margo with Mercury in the client library in order to increase application compatibility: the Argobots ULTs used by Margo to send and process RPCs clashed at times with applications using pthreads.
  • Renamed environment variables to better distinguish which variables affect the client library (LIBGKFS_*) and which affect the daemon (GKFS_DAEMON_*).
  • Replaced spdlog in the client with a bespoke logging infrastructure: spdlog's internal threads and exception management often had issues with the system call interception infrastructure. The current logging infrastructure is designed around the syscall interception mechanism, and is therefore more stable.
  • Due to the new logging infrastructure, there have been significant changes to the environment variables controlling logging output. The desired log module is now set with LIBGKFS_LOG, while the desired output channel is controlled with LIBGKFS_LOG_OUTPUT. Additional options such as LIBGKFS_LOG_OUTPUT_TRUNC, LOG_SYSCALL_FILTER and LOG_DEBUG_VERBOSITY can be used to further control messages. Run the client with LIBGKFS_LOG=help for more details.
  • Improved dependency management in CMake.
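
As an illustration of the logging variables described above, the sketch below assembles the environment for a client process. It is written under assumptions: the helper name and the library path passed to LD_PRELOAD are hypothetical, and the only LIBGKFS_LOG value used is `help`, which this changelog mentions. No value for LIBGKFS_LOG_OUTPUT is assumed; it is set only when the caller supplies one.

```python
import os
from typing import Dict, Optional


def client_environment(
    preload_library: str,
    log_modules: str = "help",
    log_output: Optional[str] = None,
    base_env: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of the environment with the client logging variables set."""
    env = dict(os.environ if base_env is None else base_env)
    # Hypothetical path to the client interception library.
    env["LD_PRELOAD"] = preload_library
    # Desired log module(s); "help" asks the client to describe the options.
    env["LIBGKFS_LOG"] = log_modules
    if log_output is not None:
        # Desired output channel; valid values are not listed in this changelog.
        env["LIBGKFS_LOG_OUTPUT"] = log_output
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run()` when launching an application with the client library preloaded.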

Fixed

  • Relocate internal file descriptors to a private range to avoid interfering with client application file descriptors.
  • Handle internal file descriptors created by fcntl().
  • Handle internal file descriptors passed to processes using CMSG_DATA in recvmsg().
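
One common way to move a descriptor out of the range an application is likely to use is fcntl() with F_DUPFD, which duplicates onto the lowest free descriptor at or above a given number. The sketch below shows that technique only. Whether GekkoFS relocates its internal descriptors this way is an assumption, and `PRIVATE_FD_BASE` and `relocate_fd` are hypothetical names.

```python
import fcntl
import os

PRIVATE_FD_BASE = 100  # hypothetical lower bound of the private range


def relocate_fd(fd: int, base: int = PRIVATE_FD_BASE) -> int:
    """Duplicate fd onto the lowest free descriptor >= base, then close fd."""
    new_fd = fcntl.fcntl(fd, fcntl.F_DUPFD, base)
    os.close(fd)
    return new_fd


if __name__ == "__main__":
    read_end, write_end = os.pipe()
    write_end = relocate_fd(write_end)
    os.write(write_end, b"ok")
    print(write_end >= PRIVATE_FD_BASE, os.read(read_end, 2))
    os.close(read_end)
    os.close(write_end)
```

The base must stay below the process's open-file limit, otherwise the fcntl() call fails.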

[0.6.2] - 2019-10-07

Added

  • Paths inside kernel pseudo filesystems (/sys, /proc) are forwarded directly to the kernel and internal path resolution is skipped. Be aware that paths like /sys/../tmp/gkfs_mountpoint/asd will also be forwarded to the kernel.
  • Added new CMake flag CREATE_CHECK_PARENTS to control whether the existence of the parent node needs to be checked during the creation of a child node.
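
The caveat about /sys/../tmp/gkfs_mountpoint/asd follows from deciding on the raw path before any resolution takes place. The sketch below models that behaviour with a plain prefix check. It is an assumption about how such a check could look, not the GekkoFS code, and the names are hypothetical.

```python
import os.path

KERNEL_PSEUDO_FS = ("/sys", "/proc")


def forwarded_to_kernel(path: str) -> bool:
    """True if the unresolved path starts inside a kernel pseudo filesystem."""
    return any(
        path == root or path.startswith(root + "/") for root in KERNEL_PSEUDO_FS
    )


if __name__ == "__main__":
    tricky = "/sys/../tmp/gkfs_mountpoint/asd"
    # The raw path matches the /sys prefix although it resolves elsewhere.
    print(forwarded_to_kernel(tricky), os.path.normpath(tricky))
```

Because the path is never normalized, a path that merely starts with /sys or /proc bypasses internal resolution even when it points into the mount point.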

Changed

  • Daemon logs for RPC handlers have been polished
  • Updated Margo, Mercury and Libfabric dependencies

Fixed

  • mk_node RPC wasn't propagating errors correctly from daemons
  • README has been improved and received some minor fixes
  • Fixed wrong path in log call for mk_symlink function

[0.6.1] - 2019-09-17

Added

  • Added new CMake flag LOG_SYSCALLS to enable/disable syscall logging.
  • Intercept the 64 bit version of getdents.
  • Added debian-based docker image.

Changed

  • Disable syscalls logging by default
  • Update Mercury, RocksDB and Libfabric dependencies

Fixed

  • Fix read at the end of file.
  • Don't create log file when using --version/--help cli flags.
  • On some systems LD_PRELOAD used on /bin/bash binary was not working.
  • Missing definition of loff_t on new version of GCC.

[0.6.0] - 2019-07-26

Added

  • Add compile time option to disable shared memory communication -DUSE_SHM:BOOL=OFF

Changed

  • Daemons no longer store information about the other daemons.
  • Improved error handling on daemon initialization
  • Decreased RPC timeout 3min -> 3sec
  • Update 3rd party dependencies

Removed

  • The PID file is no longer used; only the new hosts file is used for out-of-band communication
  • Dropped CCI plugin support
  • Dropped hostname-suffix cli option
  • Dropped port cli option (use --listen instead)
  • Hosts information no longer needs to be passed to daemons, so the --hosts cli option has been removed

Fixed

  • Errors on get_dirents RPC are now reported back to clients
  • Write errors happening on daemons are now reported back to clients
  • A number overflow in lseek prevented seeking in huge files
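
The lseek overflow above concerns offsets that no longer fit the integer type holding them. The changelog does not say which width was involved, so the 32-bit narrowing below is an assumption chosen for illustration. The sketch seeks to a 3 GiB offset on a local file and shows what the same value becomes when squeezed into a signed 32-bit integer.

```python
import ctypes
import os
import tempfile

HUGE_OFFSET = 3 * 2**30  # 3 GiB, beyond the signed 32-bit range


def seek_to(path: str, offset: int) -> int:
    """Seek to an absolute offset and return the resulting file position."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.lseek(fd, offset, os.SEEK_SET)
    finally:
        os.close(fd)


def as_int32(value: int) -> int:
    """Reinterpret a value as a signed 32-bit integer (assumed overflow width)."""
    return ctypes.c_int32(value).value


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        print(seek_to(tmp.name, HUGE_OFFSET), as_int32(HUGE_OFFSET))
```

Seeking past the end of the file allocates nothing, so the check is cheap to run against any file system.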

[0.5.0] - 2019-04-29

Changed

[0.4.0] - 2019-04-18

First GekkoFS public release

This version provides a client library that uses glibc I/O function interception.

[0.3.1] - 2018-03-04

Changed

  • Read-write process improved. @Marc Vef
  • Improved Filemap. @Marc Vef