MareNostrum 5
File systems
Each user has access to multiple disk space areas for file storage, each with potential size or time limitations. It is essential to carefully review this section to understand the usage policies of each file system.
There are three distinct types of storage accessible within a node:
Filesystem | Description |
---|---|
GPFS | GPFS is a distributed networked filesystem accessible from all nodes |
Local Solid State Drive (SSD) | Each node has an internal "hard drive" |
root | This is the filesystem housing the operating system. |
GPFS filesystems
The IBM General Parallel File System (GPFS) is a high-performance shared-disk file system providing fast, reliable data access from all nodes of the cluster to a global filesystem. GPFS allows parallel applications simultaneous access to a set of files (even a single file) from any node that has the GPFS file system mounted while providing a high level of control over all file system operations. In addition, GPFS can read or write large blocks of data in a single I/O operation, thereby minimizing overhead.
These are the GPFS filesystems accessible on the machine from all nodes:
/apps
: This file system hosts applications and libraries that are pre-installed on the machine as well as new builds. Browse the directories to discover applications available for general use.
/gpfs/home
: This file system contains the home directories for all users, and upon login, you will automatically start in your home directory. Each user has their own home directory to store personally developed sources and personal data. A default quota restricts the data stored in individual home directories.
Caution:
- Running jobs directly from this filesystem is strongly discouraged.
- Executing your jobs in your group's /gpfs/projects or /gpfs/scratch directories is recommended.
/gpfs/projects
: In addition, each user group has a dedicated directory within /gpfs/projects/<GROUP>. For example, the group "bsc01" will have a /gpfs/projects/bsc01 directory available for use. This space is designed for storing data that must be shared among users within the same group or project. A group-based quota will be enforced, determined by the approved space allocation. The project manager is responsible for deciding how this space is utilized, distributed, and shared among the users within their project.
/gpfs/scratch
: Each user will also be allocated a directory per group within /gpfs/scratch/<GROUP>. Its purpose is to store temporary files your jobs generate during their execution. A group-based quota will be enforced based on the allocated space.
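The recommendation to run jobs from /gpfs/projects or /gpfs/scratch can be turned into a quick check at the top of a job script. The following is a minimal sketch, not an official BSC tool: the function name `is_recommended_job_dir` is hypothetical, and it only compares the path prefixes named in this section without resolving symbolic links or relative paths.

```shell
#!/bin/bash
# Hypothetical helper: succeed only for paths under the filesystems
# this section recommends for running jobs.
is_recommended_job_dir() {
    case "$1" in
        /gpfs/projects/*|/gpfs/scratch/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: warn when the current directory is not a recommended location.
if ! is_recommended_job_dir "$PWD"; then
    echo "Warning: $PWD is not under /gpfs/projects or /gpfs/scratch" >&2
fi
```

The check is deliberately simple: it prints a warning instead of aborting, so it can be dropped into an existing script without changing its behavior.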
We keep incremental backups of /gpfs/home, /gpfs/apps and /gpfs/projects; their frequency depends on the amount of data. That said, it is your responsibility as a user of our facilities to back up all your critical data.
Filesystem | Time to complete copy |
---|---|
/gpfs/home | ~1 day |
/gpfs/apps | ~1 day |
/gpfs/projects | 3~4 days |
/gpfs/tapes
: Tapes is a mid- to long-term storage filesystem that provides 400 PB of total space. You can access tapes from the Data Transfer Machine under /gpfs/tapes/hpc/your_group. More information can be found here.
Info:
- There is no backup of this filesystem. The user is responsible for adequately managing the data stored in it.
Local Solid State Drive (SSD)
Each node is equipped with a local solid-state drive (NVMe) designated as a temporary storage space to store files during the execution of your jobs.
This space is accessible through the /scratch/tmp/$JOBID directory and is indicated by the $TMPDIR environment variable. The available space within the /scratch filesystem on each machine partition can be verified in the System Overview.
- It's important to note that data stored in these local drives on the compute nodes won't be accessible from the login nodes.
- This temporary directory will be automatically removed after the job finishes.
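Because the local directory is removed when the job ends, results have to be copied back to GPFS before the job script exits. The sketch below illustrates one possible staging pattern under stated assumptions: `stage_and_run` is a hypothetical function name, the input and results paths are placeholders supplied by the caller, and `sort` stands in for a real application.

```shell
#!/bin/bash
# Sketch: stage a file through the node-local SSD and save the result to GPFS.
# stage_and_run is a hypothetical name; `sort` stands in for a real application.
stage_and_run() {
    local input="$1"        # input file, e.g. under /gpfs/projects/<GROUP>
    local results_dir="$2"  # destination directory on GPFS

    # Inside a job, $TMPDIR points at the job's local SSD directory.
    if [ -z "${TMPDIR:-}" ]; then
        echo "TMPDIR is not set; run this inside a job" >&2
        return 1
    fi

    # Copy the input to the local drive.
    cp -- "$input" "$TMPDIR/" || return 1
    # Do the I/O-heavy work on the local drive.
    sort -- "$TMPDIR/$(basename -- "$input")" > "$TMPDIR/output.txt" || return 1
    # Copy results back before the job ends: the directory is removed afterwards.
    mkdir -p -- "$results_dir" && cp -- "$TMPDIR/output.txt" "$results_dir/"
}
```

In a job script this might be called as `stage_and_run "$INPUT" "$RESULTS_DIR"`, where both variables are illustrative names that point to locations under your group's GPFS directories.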
root filesystem
The root filesystem, which houses the operating system, does not reside on the node; instead, it is an NFS filesystem mounted from one of the servers.
Because it is a remote filesystem, only operating system data should reside in it.
Using /tmp for temporary user data is expressly prohibited. Use the local SSD storage for this purpose, as explained in the Local Solid State Drive (SSD) section.
Quotas
A quota is the storage capacity allocated to an individual user or a group of users. Think of it as a dedicated disk space assigned to you. A default value is set for all users and groups, and this limit cannot be exceeded.
You can check your quota at any time using the following command within each filesystem:
% bsc_quota
The command produces a readable output for the quota. Refer to BSC Commands for additional information.
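When a quota is close to its limit, it can help to see which subdirectories hold the most data. The sketch below uses the standard `du`, `find`, and `sort` utilities rather than `bsc_quota`, whose output format is not described here. The function name `largest_subdirs` is hypothetical, and walking very large directory trees may take a while.

```shell
#!/bin/bash
# Hypothetical helper: list the largest immediate subdirectories of a directory.
# Usage: largest_subdirs DIR [COUNT]
largest_subdirs() {
    local dir="${1:-.}"     # directory to inspect (default: current directory)
    local count="${2:-10}"  # number of entries to show (default: 10)
    # du -sk prints one "size-in-KiB<TAB>path" line per subdirectory;
    # sort -rn puts the largest first.
    find "$dir" -mindepth 1 -maxdepth 1 -type d -exec du -sk {} + \
        | sort -rn | head -n "$count"
}
```

For example, `largest_subdirs /gpfs/projects/bsc01 5` would list the five largest subdirectories of the example group directory used earlier in this section.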
If you require additional disk space in any filesystem, the person responsible for the project must submit a request for the extra space. The request should include the requested space amount and the reasons for its necessity. For more information or to submit requests, please contact us.