File Systems

caution

It is your responsibility as a user of our facilities to back up all your critical data. We only guarantee a daily backup of user data under /gpfs/home. Any other backup will only be performed exceptionally, at the request of the interested user.

Each user has several areas of disk space for storing files. These areas may have size or time limits, so please read this section carefully to learn the usage policy of each filesystem. There are 3 different types of storage available in this cluster:

  • GPFS filesystems: GPFS is a distributed networked filesystem shared between most of BSC's clusters and should mainly be operated from the Data Transfer Machine. GPFS is only available from the login node and the general purpose compute nodes; for security reasons, it is not reachable from nodes with FPGAs.
  • Node's NFS filesystem: This filesystem is shared between all the nodes of the cluster. Any files needed for operation on FPGA nodes must first be placed here.
  • FPGA's NFS filesystem: This filesystem is shared between all FPGAs.

GPFS Filesystem

The IBM General Parallel File System (GPFS) is a high-performance shared-disk file system providing fast, reliable data access from all nodes of the cluster to a global filesystem. GPFS allows parallel applications simultaneous access to a set of files (even a single file) from any node that has the GPFS file system mounted while providing a high level of control over all file system operations. In addition, GPFS can read or write large blocks of data in a single I/O operation, thereby minimizing overhead.

An incremental backup will be performed daily only for /gpfs/home.

These are the GPFS filesystems available in the machine from all nodes:

  • /apps: This filesystem holds the applications and libraries that have already been installed on the machine. Take a look at its directories to see the applications available for general use. For this cluster, take into account that only applications meant to be used on the general purpose compute nodes will be installed there (this excludes FPGA nodes).

  • /gpfs/home: This filesystem contains the home directories of all the users, and when you log in you start in your home directory by default. Every user will have their own home directory to store their own developed sources and personal data. A default quota will be enforced on all users to limit the amount of data stored there.

  • /gpfs/projects: In addition to the home directory, there is a directory in /gpfs/projects for each group of users. For instance, the group bsc01 will have a /gpfs/projects/bsc01 directory ready to use. This space is intended to store data that needs to be shared between the users of the same group or project. A quota per group will be enforced depending on the space assigned by the Access Committee. It is the project manager's responsibility to determine and coordinate the best use of this space, and how it is distributed or shared among its users.

  • /gpfs/scratch: Each user will have a directory under /gpfs/scratch. It is intended to store temporary files of your jobs during their execution, assuming you are using the general purpose compute nodes (see the sketch after this list). A quota per group will be enforced depending on the space assigned.
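
As an illustration, a batch job on the general purpose compute nodes could stage its working files under your scratch directory. The following is a minimal sketch that assumes a Slurm-style batch system and a per-user layout of /gpfs/scratch/$USER; the directory layout, application and file names are placeholders:

#!/bin/bash
#SBATCH --job-name=scratch-example
#SBATCH --output=scratch-example_%j.out

# Work under a job-specific subdirectory of your scratch space (illustrative path)
WORKDIR=/gpfs/scratch/$USER/$SLURM_JOB_ID
mkdir -p $WORKDIR
cp input.dat $WORKDIR/
cd $WORKDIR
./my_solver input.dat                # placeholder application
cp results.dat $SLURM_SUBMIT_DIR/    # copy results back when done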

Node's NFS filesystem

This NFS filesystem is intended to share files between all the nodes of the cluster without relying on GPFS. Note that it is the only way to make files available to the compute nodes that have FPGAs installed in them (see the example after this list). It is divided into the following directories:

  • /nfs/apps: This filesystem holds the applications and libraries to be used on FPGA nodes. Take a look at its directories to see the applications available for them. For this cluster, take into account that only applications meant to be used on FPGA nodes will be installed there.

  • /nfs/home: This filesystem provides the personal space for each user on the NFS filesystem. Every user will have their own home directory to store their own developed sources and personal data. A default quota will be enforced on all users to limit the amount of data stored there.

  • /nfs/scratch: This is a filesystem for temporary data shared between nodes.
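
For example, to make a data file available to the FPGA compute nodes, you could copy it from your GPFS home to your NFS home first. This is a sketch that assumes your NFS home directory is /nfs/home/$USER; adapt the path to your account's actual layout:

% cp $HOME/input.dat /nfs/home/$USER/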

FPGA's NFS filesystem

This filesystem is what is "seen" by all the FPGAs of the cluster. Its contents are not managed by the operations team.

Local Hard Drive

Every node has a local solid state drive (SSD) that can be used as local scratch space to store temporary files during the execution of your jobs. This space is mounted on the /scratch/tmp/$JOBID directory and pointed to by the $TMPDIR environment variable. The amount of space within the /scratch filesystem can be checked in each machine's System Overview. Data stored on these local drives at the compute nodes will not be available from the login nodes.

You should use the directory referred to by $TMPDIR to save your temporary files during job executions. This directory will automatically be cleaned after the job finishes.
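
A minimal sketch of this pattern inside a job script, assuming a Slurm-style batch system (for the $SLURM_SUBMIT_DIR variable) and placeholder application and file names:

# Copy input to the node-local SSD, compute there, and save results before the job ends
cp input.dat $TMPDIR/
cd $TMPDIR
./my_app input.dat > output.dat     # placeholder application
cp output.dat $SLURM_SUBMIT_DIR/    # $TMPDIR is cleaned automatically afterwards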

Root Filesystem

The root file system, where the operating system is stored, does not reside on the node; it is an NFS filesystem mounted from one of the servers.

As this is a remote filesystem, only operating system data should reside on it. The use of /tmp for temporary user data is NOT permitted. The local hard drive can be used for this purpose instead, as described in Local Hard Drive.

Quotas

A quota is the amount of storage available to a user or a group's users. You can picture it as a small disk readily available to you. A default value is applied to all users and groups and cannot be exceeded.

You can inspect your quota at any time using the following command from inside each filesystem:

% bsc_quota

The command provides a human-readable summary of your quota. Check BSC Commands for more information.

If you need more disk space in this filesystem or in any other of the GPFS filesystems, the person responsible for your project has to make a request for the extra space needed, specifying the amount requested and the reasons why it is needed. For more information or requests you can Contact Us.

Please take into account that the quota system is currently implemented for GPFS only; in the future we may implement a quota system for the node's NFS filesystem too.