5 posts tagged with "performance"

· 16 min read
Albert Azcárate Sánchez
Pablo Rodenas Barquero
David Vicente Dorca

A critical step in any HPC program is reading data from the file system and writing temporary or final results back to it. Knowing how and when to read or write a file is essential because I/O is one of the most time-consuming operations. Efficiency matters not only in the I/O (input-output) operations themselves but also in how the data is accessed and how it is stored. Another critical decision is how many processors will execute these I/O operations and how they will communicate. The tests described here were executed on gpfs/projects on MareNostrum 4, using a pure MPI communication scheme and HDF5 1.10.1.

With this in mind, we performed several tests with the h5bench benchmark to measure the performance impact of different memory access patterns and of temporary or final file storage patterns, using a set of data structures.

· 71 min read
Cristian Morales Pérez

On MareNostrum 4, several combinations of compilers and MPI implementations are available to users. In this report we analyse the performance of the different combinations using the OSU benchmarks. OSU makes it possible to measure the performance of different MPI functions, such as point-to-point or collective operations, with different message sizes.

Combinations analysed

We named the combinations used with the following pattern:

  • [compiler-module]-[mpi-module]

Where the [compiler-module] is the compiler module loaded and the [mpi-module] is the MPI implementation module loaded.

The different combinations that we have used are the following:

  • intel2017.1-impi2017.1
  • intel2017.4-impi2017.4
  • intel2017.6-impi2017.6
  • intel2017.7-impi2017.7
  • intel2018.0-impi2018.0
  • intel2018.1-impi2018.1
  • intel2018.2-impi2018.2
  • intel2018.3-impi2018.3
  • intel2018.4-impi2018.4
  • gcc7.2.0-openmpi1.10.4-apps
  • gcc7.1.0-openmpi1.10.7
  • gcc7.2.0-openmpi2.0.2
  • gcc7.2.0-openmpi2.1.3
  • gcc7.2.0-openmpi3.0.0
  • gcc7.2.0-openmpi3.1.1
  • gcc8.1.0-openmpi4.0.0
  • gcc8.1.0-openmpi4.0.1
  • gcc8.1.0-openmpi4.0.2
  • gcc9.2.0-openmpi4.1.0
  • gcc11.2.0_binutils-openmpi4.1.2
  • intel2017.4_mvapich2.3b
  • gcc7.2.0_mvapich2.3b
  • intel2017.4_mvapich2.3rc1
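
When post-processing results, it can help to split a combination label back into its compiler and MPI parts. The snippet below is a minimal sketch, not part of the original study: the function name `split_combination` and the list of MPI family prefixes are assumptions drawn only from the labels listed above. It also tolerates labels that use an underscore instead of a hyphen as separator.

```python
import re

# Assumed MPI family prefixes, taken from the labels in the list above.
_MPI_FAMILY = re.compile(r"openmpi|impi|mvapich")


def split_combination(label):
    """Split a label such as 'gcc7.2.0-openmpi3.1.1' into (compiler, mpi)."""
    match = _MPI_FAMILY.search(label)
    if match is None:
        raise ValueError(f"no known MPI family in {label!r}")
    # Everything before the MPI family is the compiler module;
    # drop the trailing separator, which may be '-' or '_'.
    compiler = label[:match.start()].rstrip("-_")
    mpi = label[match.start():]
    return compiler, mpi


if __name__ == "__main__":
    for label in ("intel2018.4-impi2018.4",
                  "gcc7.2.0-openmpi1.10.4-apps",
                  "intel2017.4_mvapich2.3b"):
        print(label, "->", split_combination(label))
```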

Benchmarks

The OSU Micro Benchmarks are a common suite of benchmarks for measuring and evaluating the performance of MPI operations, used to compare different MPI implementations and the underlying network interconnect. We have used this suite to measure different kinds of MPI operations: point-to-point, collectives and non-blocking collectives.

Point-to-Point Benchmarks

  • osu_bibw - Bidirectional Bandwidth Test
  • osu_bw - Bandwidth Test
  • osu_mbw_mr - Multiple Bandwidth / Message Rate Test
  • osu_latency - Latency Test
  • osu_latency_mp - Multi-process Latency Test
  • osu_latency_mt - Multi-threaded Latency Test
  • osu_multi_lat - Multi-pair Latency Test
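
As a rough guide to what the latency and bandwidth tests report, the sketch below derives both metrics from raw timings. It assumes the usual conventions for such tests: latency is measured with a ping-pong exchange, so each iteration is a round trip, and bandwidth is measured by sending a window of messages per iteration. The function and parameter names are illustrative, and the formulas are a simplified approximation rather than the suite's own code.

```python
def one_way_latency_us(elapsed_s, iterations):
    """One-way latency in microseconds from a ping-pong loop.

    Each iteration is a full round trip (send plus reply),
    so the elapsed time is divided by 2 * iterations.
    """
    return elapsed_s * 1e6 / (2 * iterations)


def bandwidth_mb_s(message_bytes, window_size, iterations, elapsed_s):
    """Bandwidth in MB/s (1 MB = 1e6 bytes) from a windowed send loop.

    Each iteration sends `window_size` messages of `message_bytes` bytes.
    """
    total_mb = message_bytes * window_size * iterations / 1e6
    return total_mb / elapsed_s


if __name__ == "__main__":
    # Hypothetical timings, chosen only to exercise the formulas.
    print(one_way_latency_us(elapsed_s=0.002, iterations=1000))
    print(bandwidth_mb_s(message_bytes=4_000_000, window_size=64,
                         iterations=100, elapsed_s=2.0))
```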

Collective MPI Benchmarks

  • osu_allgather - MPI_Allgather Latency Test
  • osu_allreduce - MPI_Allreduce Latency Test
  • osu_alltoall - MPI_Alltoall Latency Test
  • osu_barrier - MPI_Barrier Latency Test
  • osu_bcast - MPI_Bcast Latency Test
  • osu_gather - MPI_Gather Latency Test
  • osu_reduce - MPI_Reduce Latency Test
  • osu_reduce_scatter - MPI_Reduce_scatter Latency Test
  • osu_scatter - MPI_Scatter Latency Test

Non-Blocking Collective MPI Benchmarks

  • osu_iallgather - MPI_Iallgather Latency Test
  • osu_iallreduce - MPI_Iallreduce Latency Test
  • osu_ialltoall - MPI_Ialltoall Latency Test
  • osu_ibarrier - MPI_Ibarrier Latency Test
  • osu_ibcast - MPI_Ibcast Latency Test
  • osu_igather - MPI_Igather Latency Test
  • osu_ireduce - MPI_Ireduce Latency Test
  • osu_iscatter - MPI_Iscatter Latency Test

Analysis

Point-to-Point

From the results below, we observe that some modules perform badly on every P2P benchmark at every message size. These modules are intel2017.1-impi2017.1 and the OpenMPI 4.X versions compiled with GCC.

For the Bandwidth benchmarks with small messages, we observe better performance with OpenMPI <= 3. In general, for the rest of the sizes, IntelMPI, MVAPICH and OpenMPI <= 3 have similar results in terms of bandwidth.

For the P2P Latency benchmarks, we observe similar performance using IntelMPI, OpenMPI and MVAPICH, except for the IntelMPI 2017.1 and the OpenMPI 4.0.X.

Collectives

From the results below, we observe a similar behaviour using the OpenMPI 4.0.X versions, where the performance is deficient compared with the other combinations. In the following paragraphs, when we talk about OpenMPI, we do not include these 4.0.X versions with low performance. Also, we do not include IntelMPI 2017.1 when we speak about IntelMPI in general.

Looking at each benchmark, for AllGather the best option is MVAPICH, followed by OpenMPI (other than 4.0.X) and then Intel MPI, in this order. We observe the same for the Reduce operation.

On the AllReduce tests, we observe that MVAPICH is still the best, but in this case, IntelMPI has lower latency than OpenMPI. We observe the same on the Scatter function.

On the AllToAll, we observe similar performance for MVAPICH, OpenMPI and IntelMPI, with some variability. In this case, the OpenMPI 4.0.X runs did not finish for the larger message sizes.

On the BroadCast function, MVAPICH and OpenMPI are the best options except for the largest message size, where MVAPICH is better than OpenMPI. In this case, Intel MPI performs similarly to OpenMPI 4.0.X.

On the Gather benchmark, the MVAPICH with small messages works a bit worse than IntelMPI and OpenMPI, but with the medium and large sizes, the performance is better using MVAPICH.

The Reduce_Scatter has similar behaviour to Alltoall where the three combinations perform similarly.

Non-blocking collective

For the Iallreduce, we observe that the OpenMPI 1.X versions have higher latency for larger sizes.

For the Ibroadcast, we observe similar performance with IntelMPI, OpenMPI and MVAPICH.

For the Igather and Iallreduce, we generally observe better performance with the IntelMPI and MVAPICH, and a bit worse with OpenMPI.

For the Ireduce, we observe better performance in general with the OpenMPI, and similar performance if we compare IntelMPI and MVAPICH.

On the Iscatter operation, we observe better performance for the small size of IntelMPI and MVAPICH compared with OpenMPI, but for the larger sizes, the OpenMPI combinations have lower latency.

Conclusions

To sum up, based on the results observed, we do not recommend using the IntelMPI 2017.1 module or the OpenMPI 4.0.X modules on MareNostrum 4 for any MPI application.

For an application with only P2P operations, we recommend using any MPI except those mentioned above. If you use small message sizes, you can use OpenMPI 3 or below, but the improvement will be meagre compared with the other MPI implementations.

In the case of collective and non-blocking collective operations, we observe very different behaviours depending on the MPI function used, the message size and the number of nodes. We recommend comparing the results in this report with the most time-consuming MPI functions of your application and choosing the best combination accordingly.
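
One way to apply this recommendation is to weight each combination's measured latency by how often your application calls each MPI function. The sketch below is an illustration under stated assumptions, not a method from this report: `pick_combination`, the combination names and all numbers are hypothetical placeholders, and in practice the latencies should be taken for the message size and node count that match your application.

```python
def pick_combination(call_counts, latencies_us):
    """Return (best_combination, scores) for an application profile.

    call_counts:  {mpi_function: number of calls in the application}
    latencies_us: {combination: {mpi_function: latency in microseconds}}
    The score is the estimated time spent in the profiled functions.
    """
    scores = {}
    for combo, per_function in latencies_us.items():
        # Skip combinations with missing results (e.g. runs that did not finish).
        if any(fn not in per_function for fn in call_counts):
            continue
        scores[combo] = sum(count * per_function[fn]
                            for fn, count in call_counts.items())
    if not scores:
        raise ValueError("no combination has results for every profiled function")
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    # Hypothetical placeholder values, NOT taken from the tables in this report.
    profile = {"allreduce": 1000, "bcast": 100}
    measured = {
        "combo_a": {"allreduce": 10.0, "bcast": 70.0},
        "combo_b": {"allreduce": 12.0, "bcast": 5.0},
        "combo_c": {"allreduce": 8.0},  # no bcast result, so it is skipped
    }
    best, scores = pick_combination(profile, measured)
    print(best, scores)
```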

Results

Point-to-Point

osu_bibw

Figures

8 Bytes

osu_bibw Benchmark 8 Bytes

8 KBytes

osu_bibw Benchmark 8 KBytes

4 MBytes

osu_bibw Benchmark 4 MBytes

Table
modules | 8 Bytes | 8 KBytes | 4 MBytes
intel2017.1-impi2017.13.711909.893715.05
intel2017.4-impi2017.418.214942.421567.62
intel2017.6-impi2017.618.065084.5122986.97
intel2017.7-impi2017.719.167205.0723666.45
intel2018.0-impi2018.023.637781.423748.99
intel2018.1-impi2018.122.155397.9723385.88
intel2018.2-impi2018.223.647815.9723746.25
intel2018.3-impi2018.322.35445.7621549.18
intel2018.4-impi2018.422.325287.1920421.06
gcc7.2.0-openmpi1.10.4-apps29.455988.7822177.88
gcc7.1.0-openmpi1.10.729.495564.5820538.01
gcc7.2.0-openmpi2.0.232.088824.1623730.47
gcc7.2.0-openmpi2.1.333.358944.8523682.02
gcc7.2.0-openmpi3.0.033.258754.3423723.55
gcc7.2.0-openmpi3.1.129.25684.2220765.95
gcc8.1.0-openmpi4.0.013.61052.421240.98
gcc8.1.0-openmpi4.0.112.751024.121238.8
gcc8.1.0-openmpi4.0.210.91071.11293.9
gcc9.2.0-openmpi4.1.04.91543.617690.1
intel2019.5-openmpi4.1.132.558994.6823680.23
gcc11.2.0_binutils-openmpi4.1.24.131682.291246.34
intel2017.4-mvapich22.3b26.858150.8723760.53
gcc7.2.0-mvapich22.3b27.698325.2223746.37
intel2017.4-mvapich22.3rc127.688211.2923750.63

osu_bw

Figures

8 Bytes

osu_bw Benchmark 8 Bytes

8 KBytes

osu_bw Benchmark 8 KBytes

4 MBytes

osu_bw Benchmark 4 MBytes

Table
modules | 8 Bytes | 8 KBytes | 4 MBytes
intel2017.1-impi2017.12.51628.683569.38
intel2017.4-impi2017.417.954253.8411997.3
intel2017.6-impi2017.618.657020.712007.77
intel2017.7-impi2017.717.456084.411903.59
intel2018.0-impi2018.021.774572.5311999.58
intel2018.1-impi2018.120.626499.4511931.66
intel2018.2-impi2018.222.897618.8312020.94
intel2018.3-impi2018.320.584596.5211959.29
intel2018.4-impi2018.420.74512.6711838.44
gcc7.2.0-openmpi1.10.4-apps28.385332.4711996.9
gcc7.1.0-openmpi1.10.728.247642.512016.66
gcc7.2.0-openmpi2.0.226.937627.3812024.1
gcc7.2.0-openmpi2.1.328.227677.5812008.81
gcc7.2.0-openmpi3.0.027.324954.1211996.57
gcc7.2.0-openmpi3.1.127.794943.6111988.23
gcc8.1.0-openmpi4.0.04.03687.55858.92
gcc8.1.0-openmpi4.0.14.95696.84865.27
gcc8.1.0-openmpi4.0.25.1703.71878.55
gcc9.2.0-openmpi4.1.02.82913.413704.76
intel2019.5-openmpi4.1.123.116203.911916.12
gcc11.2.0_binutils-openmpi4.1.21.97907.821241.72
intel2017.4-mvapich22.3b24.87508.2712020.99
gcc7.2.0-mvapich22.3b24.797536.8412021.59
intel2017.4-mvapich22.3rc124.67537.3512003.92

osu_latency

Figures

8 Bytes

osu_latency Benchmark 8 Bytes

8 KBytes

osu_latency Benchmark 8 KBytes

4 MBytes

osu_latency Benchmark 4 MBytes

Table
modules | 8 Bytes | 8 KBytes | 4 MBytes
intel2017.1-impi2017.13.3315.141202.34
intel2017.4-impi2017.41.083.6360.02
intel2017.6-impi2017.61.164.02363.29
intel2017.7-impi2017.71.083.58360.48
intel2018.0-impi2018.01.154.05363.79
intel2018.1-impi2018.11.164.06364.11
intel2018.2-impi2018.21.174.06363.9
intel2018.3-impi2018.31.154.07364.04
intel2018.4-impi2018.41.224.5366.82
gcc7.2.0-openmpi1.10.4-apps1.194.13362.48
gcc7.1.0-openmpi1.10.71.13.58359.94
gcc7.2.0-openmpi2.0.21.534.38363.03
gcc7.2.0-openmpi2.1.31.184.09363.0
gcc7.2.0-openmpi3.0.01.174.1363.53
gcc7.2.0-openmpi3.1.11.184.11363.65
gcc8.1.0-openmpi4.0.048.0688.036602.06
gcc8.1.0-openmpi4.0.151.1685.866921.45
gcc8.1.0-openmpi4.0.257.3689.336684.69
gcc9.2.0-openmpi4.1.04.312.22630.09
intel2019.5-openmpi4.1.11.13.61364.0
gcc11.2.0_binutils-openmpi4.1.217.9639.593695.63
intel2017.4-mvapich22.3b1.113.65361.07
gcc7.2.0-mvapich22.3b1.123.64361.38
intel2017.4-mvapich22.3rc11.113.62360.97

osu_latency_mp

Figures

8 Bytes

osu_latency_mp Benchmark 8 Bytes

8 KBytes

osu_latency_mp Benchmark 8 KBytes

4 MBytes

osu_latency_mp Benchmark 4 MBytes

Table
modules | 8 Bytes | 8 KBytes | 4 MBytes
intel2017.1-impi2017.1
intel2017.4-impi2017.41.113.65359.32
intel2017.6-impi2017.61.244.07362.35
intel2017.7-impi2017.71.113.66359.34
intel2018.0-impi2018.01.264.57365.28
intel2018.1-impi2018.11.133.61360.35
intel2018.2-impi2018.21.214.1363.11
intel2018.3-impi2018.31.254.54364.49
intel2018.4-impi2018.41.23.99364.08
gcc7.2.0-openmpi1.10.4-apps1.314.13363.42
gcc7.1.0-openmpi1.10.71.354.58363.84
gcc7.2.0-openmpi2.0.20.04.44362.4
gcc7.2.0-openmpi2.1.31.414.5363.41
gcc7.2.0-openmpi3.0.01.414.46365.69
gcc7.2.0-openmpi3.1.11.394.55363.92
gcc8.1.0-openmpi4.0.065.0385.846652.59
gcc8.1.0-openmpi4.0.165.6886.066922.33
gcc8.1.0-openmpi4.0.256.6382.686898.9
gcc9.2.0-openmpi4.1.04.1719.49631.5
intel2019.5-openmpi4.1.11.213.59359.58
gcc11.2.0_binutils-openmpi4.1.240.5848.983708.16
intel2017.4-mvapich22.3b1.13.6361.16
gcc7.2.0-mvapich22.3b1.133.6361.11
intel2017.4-mvapich22.3rc11.13.63360.37

osu_latency_mt

Figures

8 Bytes

osu_latency_mt Benchmark 8 Bytes

8 KBytes

osu_latency_mt Benchmark 8 KBytes

4 MBytes

osu_latency_mt Benchmark 4 MBytes

Table
modules | 8 Bytes | 8 KBytes | 4 MBytes
intel2017.1-impi2017.114.5521.61251.13
intel2017.4-impi2017.41.764.32388.73
intel2017.6-impi2017.61.714.83383.89
intel2017.7-impi2017.71.666.55388.3
intel2018.0-impi2018.01.73.9391.74
intel2018.1-impi2018.11.824.32390.68
intel2018.2-impi2018.21.724.04385.84
intel2018.3-impi2018.31.744.91386.56
intel2018.4-impi2018.41.734.0386.29
gcc7.2.0-openmpi1.10.4-apps
gcc7.1.0-openmpi1.10.7
gcc7.2.0-openmpi2.0.2
gcc7.2.0-openmpi2.1.3
gcc7.2.0-openmpi3.0.02.495.27362.88
gcc7.2.0-openmpi3.1.12.76.14367.3
gcc8.1.0-openmpi4.0.02.65.62362.05
gcc8.1.0-openmpi4.0.12.55.15363.03
gcc8.1.0-openmpi4.0.23.256.4366.42
gcc9.2.0-openmpi4.1.07.6814.37591.15
intel2019.5-openmpi4.1.13.216.11364.64
gcc11.2.0_binutils-openmpi4.1.219.140.73649.79
intel2017.4-mvapich22.3b1.383.99361.44
gcc7.2.0-mvapich22.3b1.364.11361.18
intel2017.4-mvapich22.3rc11.353.94361.0

osu_mbw_mr

Figures

8 Bytes

osu_mbw_mr Benchmark 8 Bytes

8 KBytes

osu_mbw_mr Benchmark 8 KBytes

4 MBytes

osu_mbw_mr Benchmark 4 MBytes

Table
modules | 8 Bytes | 8 KBytes | 4 MBytes
intel2017.1-impi2017.12.021282.43710.77
intel2017.4-impi2017.417.014335.7811940.06
intel2017.6-impi2017.617.717025.6312014.46
intel2017.7-impi2017.717.46085.9111971.55
intel2018.0-impi2018.02645205.526443.8411938.42
intel2018.1-impi2018.120.714650.2811937.44
intel2018.2-impi2018.22595083.684316.3811936.2
intel2018.3-impi2018.321.587543.311999.62
intel2018.4-impi2018.421.254480.6712019.08
gcc7.2.0-openmpi1.10.4-apps26.615726.1712020.68
gcc7.1.0-openmpi1.10.728.437757.3812025.0
gcc7.2.0-openmpi2.0.226.015421.9312044.3
gcc7.2.0-openmpi2.1.33578117.817703.8312016.06
gcc7.2.0-openmpi3.0.023.636617.9311836.23
gcc7.2.0-openmpi3.1.127.527591.412016.73
gcc8.1.0-openmpi4.0.04.9711.16863.89
gcc8.1.0-openmpi4.0.13.09712.03859.01
gcc8.1.0-openmpi4.0.24.12705.69874.22
gcc9.2.0-openmpi4.1.02.78911.344104.18
intel2019.5-openmpi4.1.127.77707.7112001.9
gcc11.2.0_binutils-openmpi4.1.21.92889.751241.21
intel2017.4-mvapich22.3b24.477543.5612016.72
gcc7.2.0-mvapich22.3b25.057491.912015.19
intel2017.4-mvapich22.3rc13119166.357573.9612014.1

osu_multi_lat

Figures

8 Bytes

osu_multi_lat Benchmark 8 Bytes

8 KBytes

osu_multi_lat Benchmark 8 KBytes

4 MBytes

osu_multi_lat Benchmark 4 MBytes

Table
modules | 8 Bytes | 8 KBytes | 4 MBytes
intel2017.1-impi2017.13.3515.491189.15
intel2017.4-impi2017.41.243.93363.15
intel2017.6-impi2017.61.194.01361.64
intel2017.7-impi2017.71.243.99361.88
intel2018.0-impi2018.01.274.49365.83
intel2018.1-impi2018.11.253.97363.05
intel2018.2-impi2018.21.234.02363.6
intel2018.3-impi2018.31.243.98362.48
intel2018.4-impi2018.41.253.88363.5
gcc7.2.0-openmpi1.10.4-apps1.253.99361.04
gcc7.1.0-openmpi1.10.71.34.0362.69
gcc7.2.0-openmpi2.0.21.764.92360.78
gcc7.2.0-openmpi2.1.31.243.92362.32
gcc7.2.0-openmpi3.0.01.33.98362.32
gcc7.2.0-openmpi3.1.11.414.39364.13
gcc8.1.0-openmpi4.0.058.5581.836830.24
gcc8.1.0-openmpi4.0.165.2788.666870.71
gcc8.1.0-openmpi4.0.264.9188.976851.0
gcc9.2.0-openmpi4.1.03.7218.41597.44
intel2019.5-openmpi4.1.11.23.59359.25
gcc11.2.0_binutils-openmpi4.1.238.7647.033714.72
intel2017.4-mvapich22.3b1.113.58360.36
gcc7.2.0-mvapich22.3b1.123.54360.28
intel2017.4-mvapich22.3rc11.113.54360.16

Collective MPI Results

osu_allgather

Figures

4 Nodes

8 Bytes

osu_allgather Benchmark 8 Bytes

8 KBytes

osu_allgather Benchmark 8 KBytes

1 MBytes

osu_allgather Benchmark 1 MBytes

16 Nodes

8 Bytes

osu_allgather Benchmark 8 Bytes

8 KBytes

osu_allgather Benchmark 8 KBytes

512 KBytes

osu_allgather Benchmark 512 KBytes

32 Nodes

8 Bytes

osu_allgather Benchmark 8 Bytes

8 KBytes

osu_allgather Benchmark 8 KBytes

256 KBytes

osu_allgather Benchmark 256 KBytes

Table

4 nodes

modules | 8 Bytes | 8 KBytes | 1 MByte
intel2017.1-impi2017.171.113099.81212652.13
intel2017.4-impi2017.419.992952.99186470.52
intel2017.6-impi2017.620.042954.34186623.27
intel2017.7-impi2017.719.772947.52186431.51
intel2018.0-impi2018.019.132943.8186786.91
intel2018.1-impi2018.119.572947.89187184.14
intel2018.2-impi2018.219.52966.16186776.81
intel2018.3-impi2018.319.662953.99186963.67
intel2018.4-impi2018.419.482936.32187460.2
gcc7.2.0-openmpi1.10.4-apps19.242018.88196463.29
gcc7.1.0-openmpi1.10.719.071992.05196211.55
gcc7.2.0-openmpi2.0.218.652045.77196048.42
gcc7.2.0-openmpi2.1.318.992008.4196243.24
gcc7.2.0-openmpi3.0.015.349083.77196480.48
gcc7.2.0-openmpi3.1.115.332020.51196581.33
gcc8.1.0-openmpi4.0.0674.515446.32392792.17
gcc8.1.0-openmpi4.0.1699.795417.35392926.27
gcc8.1.0-openmpi4.0.2677.525398.35394932.74
gcc9.2.0-openmpi4.1.0104.353212.67222661.27
intel2019.5-openmpi4.1.114.952019.14186856.47
gcc11.2.0_binutils-openmpi4.1.2440.184372.12222844.76
intel2017.4-mvapich22.3b19.072208.87121810.88
gcc7.2.0-mvapich22.3b19.172294.05122315.23
intel2017.4-mvapich22.3rc122.381281.45120674.5

16 nodes

modules | 8 Bytes | 8 KBytes | 512 KBytes
intel2017.1-impi2017.1386.8111975.89513439.85
intel2017.4-impi2017.466.8713554.02389517.29
intel2017.6-impi2017.663.9413541.36388646.47
intel2017.7-impi2017.770.8513714.44414375.33
intel2018.0-impi2018.064.8912904.08388209.35
intel2018.1-impi2018.166.6913114.59389246.31
intel2018.2-impi2018.262.8312361.41389991.21
intel2018.3-impi2018.367.1312870.5407106.97
intel2018.4-impi2018.466.3115085.3389473.33
gcc7.2.0-openmpi1.10.4-apps56.658831.55449012.6
gcc7.1.0-openmpi1.10.744.948173.38376948.84
gcc7.2.0-openmpi2.0.256.589078.97397288.31
gcc7.2.0-openmpi2.1.344.058222.5378769.94
gcc7.2.0-openmpi3.0.042.018413.18378332.55
gcc7.2.0-openmpi3.1.152.678203.6378832.23
gcc8.1.0-openmpi4.0.01004.9728718.73881674.52
gcc8.1.0-openmpi4.0.1975.3323398.16884407.76
gcc8.1.0-openmpi4.0.2964.223339.45874616.03
gcc9.2.0-openmpi4.1.0183.1613331.31510847.95
intel2019.5-openmpi4.1.199.078672.7376510.48
gcc11.2.0_binutils-openmpi4.1.2733.7820880.01620855.43
intel2017.4-mvapich22.3b45.666512.24236758.33
gcc7.2.0-mvapich22.3b35.236763.27237796.58
intel2017.4-mvapich22.3rc146.776931.28254291.58

32 nodes

modules | 8 Bytes | 8 KBytes | 256 KBytes
intel2017.1-impi2017.11471.4924094.62628267.41
intel2017.4-impi2017.4129.9427138.34434475.21
intel2017.6-impi2017.6127.2127020.78426855.58
intel2017.7-impi2017.7130.8127973.72453794.95
intel2018.0-impi2018.0127.1626789.84423492.48
intel2018.1-impi2018.1145.1527522.92473530.69
intel2018.2-impi2018.2132.027041.35428062.78
intel2018.3-impi2018.3131.0730295.6433598.01
intel2018.4-impi2018.4154.4828176.87428158.6
gcc7.2.0-openmpi1.10.4-apps107.3116808.41425384.02
gcc7.1.0-openmpi1.10.7109.4317058.51393795.66
gcc7.2.0-openmpi2.0.296.5916783.65390463.38
gcc7.2.0-openmpi2.1.389.7316877.16392138.99
gcc7.2.0-openmpi3.0.097.5417208.29393932.34
gcc7.2.0-openmpi3.1.192.2316884.55407622.02
gcc8.1.0-openmpi4.0.01286.6859088.36912173.39
gcc8.1.0-openmpi4.0.11230.0550146.51915301.86
gcc8.1.0-openmpi4.0.21288.0458673.48912157.07
gcc9.2.0-openmpi4.1.0225.9627182.59525007.74
intel2019.5-openmpi4.1.1100.4617421.36391150.23
gcc11.2.0_binutils-openmpi4.1.21081.1348758.43934094.02
intel2017.4-mvapich22.3b53.0713524.87282437.37
gcc7.2.0-mvapich22.3b139.4614220.57266299.7
intel2017.4-mvapich22.3rc162.7215913.49247577.04

osu_allreduce

Figures

4 Nodes

8 Bytes

osu_allreduce Benchmark 8 Bytes

8 KBytes

osu_allreduce Benchmark 8 KBytes

1 MBytes

osu_allreduce Benchmark 1 MBytes

16 Nodes

8 Bytes

osu_allreduce Benchmark 8 Bytes

8 KBytes

osu_allreduce Benchmark 8 KBytes

512 KBytes

osu_allreduce Benchmark 512 KBytes

32 Nodes

8 Bytes

osu_allreduce Benchmark 8 Bytes

8 KBytes

osu_allreduce Benchmark 8 KBytes

256 KBytes

osu_allreduce Benchmark 256 KBytes

Table

4 nodes

modules | 8 Bytes | 8 KBytes | 1 MByte
intel2017.1-impi2017.132.69143.923855.68
intel2017.4-impi2017.410.2951.913452.59
intel2017.6-impi2017.611.8551.373427.35
intel2017.7-impi2017.710.5751.853456.45
intel2018.0-impi2018.011.150.263428.27
intel2018.1-impi2018.19.9650.313421.68
intel2018.2-impi2018.29.9150.383397.02
intel2018.3-impi2018.310.8952.853425.68
intel2018.4-impi2018.49.9151.943392.86
gcc7.2.0-openmpi1.10.4-apps10.15141.244478.94
gcc7.1.0-openmpi1.10.79.89141.314511.4
gcc7.2.0-openmpi2.0.213.99141.564845.05
gcc7.2.0-openmpi2.1.39.8141.264467.29
gcc7.2.0-openmpi3.0.011.17203.224579.01
gcc7.2.0-openmpi3.1.19.67123.664585.9
gcc8.1.0-openmpi4.0.0398.68824.364197.36
gcc8.1.0-openmpi4.0.1390.99794.574236.78
gcc8.1.0-openmpi4.0.2394.11798.394197.0
gcc9.2.0-openmpi4.1.068.38159.324296.48
intel2019.5-openmpi4.1.112.55103.873824.08
gcc11.2.0_binutils-openmpi4.1.2307.58557.835870.86
intel2017.4-mvapich22.3b7.842.622216.91
gcc7.2.0-mvapich22.3b8.6946.172337.37
intel2017.4-mvapich22.3rc18.0544.371960.87

16 nodes

modules | 8 Bytes | 8 KBytes | 512 KBytes
intel2017.1-impi2017.1131.281828.146722.73
intel2017.4-impi2017.416.7894.831679.88
intel2017.6-impi2017.618.8691.661649.05
intel2017.7-impi2017.719.5994.051621.07
intel2018.0-impi2018.014.9468.371679.41
intel2018.1-impi2018.117.42102.511692.72
intel2018.2-impi2018.216.4194.441669.67
intel2018.3-impi2018.316.8891.461756.55
intel2018.4-impi2018.414.9670.891628.87
gcc7.2.0-openmpi1.10.4-apps14.69264.764619.76
gcc7.1.0-openmpi1.10.719.52259.65758.6
gcc7.2.0-openmpi2.0.224.14251.74835.73
gcc7.2.0-openmpi2.1.317.49252.214908.47
gcc7.2.0-openmpi3.0.014.17257.724772.98
gcc7.2.0-openmpi3.1.113.89256.884865.46
gcc8.1.0-openmpi4.0.0854.241634.647712.08
gcc8.1.0-openmpi4.0.1865.441663.327819.65
gcc8.1.0-openmpi4.0.2862.781678.357559.49
gcc9.2.0-openmpi4.1.0126.24359.042147.2
intel2019.5-openmpi4.1.123.5220.371821.62
gcc11.2.0_binutils-openmpi4.1.2524.321685.093695.11
intel2017.4-mvapich22.3b13.2568.01946.85
gcc7.2.0-mvapich22.3b17.51117.171554.67
intel2017.4-mvapich22.3rc114.3172.82831.74

32 nodes

modules | 8 Bytes | 8 KBytes | 256 KBytes
intel2017.1-impi2017.1251.123454.989964.34
intel2017.4-impi2017.420.8280.63923.34
intel2017.6-impi2017.629.29132.34864.67
intel2017.7-impi2017.797.1492.01869.49
intel2018.0-impi2018.037.29141.35905.19
intel2018.1-impi2018.119.9787.59895.73
intel2018.2-impi2018.236.04146.41929.78
intel2018.3-impi2018.336.85147.89874.82
intel2018.4-impi2018.423.9693.16874.0
gcc7.2.0-openmpi1.10.4-apps45.48312.19486.34
gcc7.1.0-openmpi1.10.733.69313.0811121.37
gcc7.2.0-openmpi2.0.268.61327.937572.19
gcc7.2.0-openmpi2.1.359.43314.346627.88
gcc7.2.0-openmpi3.0.031.07342.996774.17
gcc7.2.0-openmpi3.1.137.66288.8323395.5
gcc8.1.0-openmpi4.0.01079.062107.6710658.97
gcc8.1.0-openmpi4.0.13282.494301.1210760.58
gcc8.1.0-openmpi4.0.21448.492125.1411034.7
gcc9.2.0-openmpi4.1.0130.57415.751533.71
intel2019.5-openmpi4.1.134.56260.63950.49
gcc11.2.0_binutils-openmpi4.1.2239.222343.585751.7
intel2017.4-mvapich22.3b18.36139.26466.49
gcc7.2.0-mvapich22.3b18.0993.92649.33
intel2017.4-mvapich22.3rc118.14102.7512.34

osu_alltoall

Figures

4 Nodes

8 Bytes

osu_alltoall Benchmark 8 Bytes

8 KBytes

osu_alltoall Benchmark 8 KBytes

1 MBytes

osu_alltoall Benchmark 1 MBytes

16 Nodes

8 Bytes

osu_alltoall Benchmark 8 Bytes

8 KBytes

osu_alltoall Benchmark 8 KBytes

512 KBytes

osu_alltoall Benchmark 512 KBytes

32 Nodes

8 Bytes

osu_alltoall Benchmark 8 Bytes

8 KBytes

osu_alltoall Benchmark 8 KBytes

256 KBytes

osu_alltoall Benchmark 256 KBytes

Table

4 nodes

modules | 8 Bytes | 8 KBytes | 1 MByte
intel2017.1-impi2017.1130.4320447.991235683.97
intel2017.4-impi2017.450.666835.67651285.92
intel2017.6-impi2017.651.036817.3647828.65
intel2017.7-impi2017.751.016842.73646933.73
intel2018.0-impi2018.048.96836.4647094.03
intel2018.1-impi2018.149.236833.95648096.62
intel2018.2-impi2018.249.196819.15654708.26
intel2018.3-impi2018.349.66841.55645827.31
intel2018.4-impi2018.450.366820.86646924.78
gcc7.2.0-openmpi1.10.4-apps388.376786.56674439.6
gcc7.1.0-openmpi1.10.750.616780.71667712.38
gcc7.2.0-openmpi2.0.247.966517.13650394.99
gcc7.2.0-openmpi2.1.344.956569.78648686.7
gcc7.2.0-openmpi3.0.046.376558.45647492.11
gcc7.2.0-openmpi3.1.146.026530.3664655.05
gcc8.1.0-openmpi4.0.0630.9856830.636397851.19
gcc8.1.0-openmpi4.0.1633.7650562.456373096.14
gcc8.1.0-openmpi4.0.2641.4649959.886319668.72
gcc9.2.0-openmpi4.1.0143.8216486.27967188.91
intel2019.5-openmpi4.1.145.69437.92648573.74
gcc11.2.0_binutils-openmpi4.1.2382.72300519.858227377.51
intel2017.4-mvapich22.3b327.756763.08648045.49
gcc7.2.0-mvapich22.3b284.876747.18650128.77
intel2017.4-mvapich22.3rc149.846888.38656633.25

16 nodes

modules | 8 Bytes | 8 KBytes | 512 KBytes
intel2017.1-impi2017.1653.03374113.25
intel2017.4-impi2017.4232.5234738.481655233.25
intel2017.6-impi2017.6234.4142748.022362379.31
intel2017.7-impi2017.7229.645004.312697784.0
intel2018.0-impi2018.0224.4943428.842543344.08
intel2018.1-impi2018.1218.1634802.711655546.07
intel2018.2-impi2018.2223.6948220.182699089.57
intel2018.3-impi2018.3241.3753064.72684498.55
intel2018.4-impi2018.4221.3544552.022454875.47
gcc7.2.0-openmpi1.10.4-apps204.5148011.443034484.35
gcc7.1.0-openmpi1.10.7203.7148165.912799518.95
gcc7.2.0-openmpi2.0.2179.133829.781640613.34
gcc7.2.0-openmpi2.1.3178.7332731.351641025.53
gcc7.2.0-openmpi3.0.0176.8433152.011686647.93
gcc7.2.0-openmpi3.1.1186.1741292.122286877.93
gcc8.1.0-openmpi4.0.0925.96
gcc8.1.0-openmpi4.0.1977.33
gcc8.1.0-openmpi4.0.2936.31321165.21
gcc9.2.0-openmpi4.1.0338.6670373.084042077.27
intel2019.5-openmpi4.1.1182.5365400.492641005.49
gcc11.2.0_binutils-openmpi4.1.2868.16
intel2017.4-mvapich22.3b2519.9341909.752542762.6
gcc7.2.0-mvapich22.3b4806.9442713.342531095.62
intel2017.4-mvapich22.3rc1208.2341500.182348401.18

32 nodes

modules | 8 Bytes | 8 KBytes | 256 KBytes
intel2017.1-impi2017.11448.06
intel2017.4-impi2017.4478.72101034.312792691.69
intel2017.6-impi2017.6489.0985287.612560289.08
intel2017.7-impi2017.7492.65110925.443398058.77
intel2018.0-impi2018.0471.76117980.063462271.04
intel2018.1-impi2018.1491.06128337.883560891.64
intel2018.2-impi2018.2477.3898384.272628543.05
intel2018.3-impi2018.3489.24105457.643017694.17
intel2018.4-impi2018.4483.02131038.333192619.61
gcc7.2.0-openmpi1.10.4-apps503.76116329.673417918.21
gcc7.1.0-openmpi1.10.7478.65137153.363953680.44
gcc7.2.0-openmpi2.0.2406.64164335.083808265.28
gcc7.2.0-openmpi2.1.3442.13125595.662765473.19
gcc7.2.0-openmpi3.0.0407.45133660.863024198.43
gcc7.2.0-openmpi3.1.1411.01158393.933814015.39
gcc8.1.0-openmpi4.0.01695.45
gcc8.1.0-openmpi4.0.11686.8
gcc8.1.0-openmpi4.0.21706.16
gcc9.2.0-openmpi4.1.0619.64165742.274563587.43
intel2019.5-openmpi4.1.1397.26182719.162553085.93
gcc11.2.0_binutils-openmpi4.1.22093.76
intel2017.4-mvapich22.3b5927.59112538.563329286.36
gcc7.2.0-mvapich22.3b5282.1116717.393584657.02
intel2017.4-mvapich22.3rc1480.52103620.92869803.86

osu_bcast

Figures

4 Nodes

8 Bytes

osu_bcast Benchmark 8 Bytes

8 KBytes

osu_bcast Benchmark 8 KBytes

1 MBytes

osu_bcast Benchmark 1 MBytes

16 Nodes

8 Bytes

osu_bcast Benchmark 8 Bytes

8 KBytes

osu_bcast Benchmark 8 KBytes

512 KBytes

osu_bcast Benchmark 512 KBytes

32 Nodes

8 Bytes

osu_bcast Benchmark 8 Bytes

8 KBytes

osu_bcast Benchmark 8 KBytes

256 KBytes

osu_bcast Benchmark 256 KBytes

Table

4 nodes

modules | 8 Bytes | 8 KBytes | 1 MByte
intel2017.1-impi2017.185.02362.183263.28
intel2017.4-impi2017.475.0345.822744.16
intel2017.6-impi2017.675.05345.132766.6
intel2017.7-impi2017.774.76344.722746.23
intel2018.0-impi2018.071.38348.772741.93
intel2018.1-impi2018.173.23353.412733.17
intel2018.2-impi2018.275.75345.642723.39
intel2018.3-impi2018.372.591388.552736.12
intel2018.4-impi2018.474.02349.642772.6
gcc7.2.0-openmpi1.10.4-apps4.2234.813235.93
gcc7.1.0-openmpi1.10.74.0234.743239.55
gcc7.2.0-openmpi2.0.26.0137.173137.34
gcc7.2.0-openmpi2.1.34.0334.823334.6
gcc7.2.0-openmpi3.0.04.0435.123344.63
gcc7.2.0-openmpi3.1.14.0134.843227.65
gcc8.1.0-openmpi4.0.095.88487.742901.08
gcc8.1.0-openmpi4.0.191.56505.733219.02
gcc8.1.0-openmpi4.0.298.73506.13248.25
gcc9.2.0-openmpi4.1.030.4482.951875.37
intel2019.5-openmpi4.1.15.8261.681786.86
gcc11.2.0_binutils-openmpi4.1.2158.97236.874501.83
intel2017.4-mvapich22.3b1.939.31572.84
gcc7.2.0-mvapich22.3b2.139.9589.48
intel2017.4-mvapich22.3rc15.612.41612.05

16 nodes

modules | 8 Bytes | 8 KBytes | 512 KBytes
intel2017.1-impi2017.1102.31629.424721.54
intel2017.4-impi2017.4297.071456.064091.39
intel2017.6-impi2017.6293.991408.584085.12
intel2017.7-impi2017.7296.451417.684107.99
intel2018.0-impi2018.0297.631413.884094.05
intel2018.1-impi2018.1299.291437.434089.46
intel2018.2-impi2018.2291.031396.964087.25
intel2018.3-impi2018.3326.031428.414318.94
intel2018.4-impi2018.4324.141424.014167.5
gcc7.2.0-openmpi1.10.4-apps4.8757.674366.37
gcc7.1.0-openmpi1.10.74.8358.214410.58
gcc7.2.0-openmpi2.0.29.1663.514497.44
gcc7.2.0-openmpi2.1.36.3459.974235.84
gcc7.2.0-openmpi3.0.04.6757.924371.94
gcc7.2.0-openmpi3.1.14.7757.374347.11
gcc8.1.0-openmpi4.0.0192.46620.615311.73
gcc8.1.0-openmpi4.0.1191.33621.415332.69
gcc8.1.0-openmpi4.0.2296.12668.975355.06
gcc9.2.0-openmpi4.1.032.3274.932394.83
intel2019.5-openmpi4.1.15.3626.541125.53
gcc11.2.0_binutils-openmpi4.1.2134.99185.945312.98
intel2017.4-mvapich22.3b4.5115.5452.29
gcc7.2.0-mvapich22.3b3.9615.59457.94
intel2017.4-mvapich22.3rc17.2617.61417.83

32 nodes

modules | 8 Bytes | 8 KBytes | 256 KBytes
intel2017.1-impi2017.1130.183173.976498.84
intel2017.4-impi2017.4608.932865.145670.36
intel2017.6-impi2017.6614.182855.785711.35
intel2017.7-impi2017.7665.982873.495691.03
intel2018.0-impi2018.0606.952965.035560.79
intel2018.1-impi2018.1634.522941.035657.03
intel2018.2-impi2018.2603.692961.545555.04
intel2018.3-impi2018.3592.882922.855563.54
intel2018.4-impi2018.4610.192915.075642.15
gcc7.2.0-openmpi1.10.4-apps12.3170.721503.89
gcc7.1.0-openmpi1.10.76.9268.011437.71
gcc7.2.0-openmpi2.0.210.0172.541463.05
gcc7.2.0-openmpi2.1.37.3366.161441.69
gcc7.2.0-openmpi3.0.08.2967.251493.29
gcc7.2.0-openmpi3.1.18.782.271482.98
gcc8.1.0-openmpi4.0.0227.72695.2910184.9
gcc8.1.0-openmpi4.0.1228.4696.3410221.62
gcc8.1.0-openmpi4.0.2226.55931.7810195.16
gcc9.2.0-openmpi4.1.012.1365.931206.07
intel2019.5-openmpi4.1.12.723.4611.29
gcc11.2.0_binutils-openmpi4.1.228.1999.023553.48
intel2017.4-mvapich22.3b5.3518.85318.2
gcc7.2.0-mvapich22.3b5.2618.98316.33
intel2017.4-mvapich22.3rc17.9120.6281.32

osu_gather

Figures

4 Nodes

8 Bytes

osu_gather Benchmark 8 Bytes

8 KBytes

osu_gather Benchmark 8 KBytes

1 MBytes

osu_gather Benchmark 1 MBytes

16 Nodes

8 Bytes

osu_gather Benchmark 8 Bytes

8 KBytes

osu_gather Benchmark 8 KBytes

512 KBytes

osu_gather Benchmark 512 KBytes

32 Nodes

8 Bytes

osu_gather Benchmark 8 Bytes

8 KBytes

osu_gather Benchmark 8 KBytes

256 KBytes

osu_gather Benchmark 256 KBytes

Table

4 nodes

modules | 8 Bytes | 8 KBytes | 1 MByte
intel2017.1-impi2017.11.9616.6322142.87
intel2017.4-impi2017.41.85188.1615128.98
intel2017.6-impi2017.61.81188.3214860.27
intel2017.7-impi2017.71.8187.6814863.02
intel2018.0-impi2018.01.77182.1515192.86
intel2018.1-impi2018.11.84180.9315223.88
intel2018.2-impi2018.21.87179.7915012.18
intel2018.3-impi2018.31.81180.6215074.23
intel2018.4-impi2018.41.86180.1614898.94
gcc7.2.0-openmpi1.10.4-apps1.82324.5415764.48
gcc7.1.0-openmpi1.10.71.79329.1615650.71
gcc7.2.0-openmpi2.0.22.73359.415878.08
gcc7.2.0-openmpi2.1.31.79341.3515905.79
gcc7.2.0-openmpi3.0.053.42337.4615979.74
gcc7.2.0-openmpi3.1.11.94330.7415877.03
gcc8.1.0-openmpi4.0.026.462246.6159390.95
gcc8.1.0-openmpi4.0.127.212193.8859513.81
gcc8.1.0-openmpi4.0.227.892551.4959407.75
gcc9.2.0-openmpi4.1.09.1920.4615544.1
intel2019.5-openmpi4.1.11.7725.393836.7
gcc11.2.0_binutils-openmpi4.1.241.0672.715562.59
intel2017.4-mvapich22.3b10.9121.473700.55
gcc7.2.0-mvapich22.3b11.0321.633895.46
intel2017.4-mvapich22.3rc110.68109.7620070.28

16 nodes

modules | 8 Bytes | 8 KBytes | 512 KBytes
intel2017.1-impi2017.12.0887.75298819.43
intel2017.4-impi2017.41.92875.8820577.65
intel2017.6-impi2017.61.93872.6720795.55
intel2017.7-impi2017.71.93891.3822101.55
intel2018.0-impi2018.02.0871.0220702.63
intel2018.1-impi2018.11.94872.0820898.03
intel2018.2-impi2018.22.01882.9320691.47
intel2018.3-impi2018.32.02878.4120548.34
intel2018.4-impi2018.41.98883.422025.08
gcc7.2.0-openmpi1.10.4-apps2.181482.2727560.88
gcc7.1.0-openmpi1.10.72.061644.8527289.46
gcc7.2.0-openmpi2.0.22.81675.4925424.05
gcc7.2.0-openmpi2.1.32.041248.5225847.65
gcc7.2.0-openmpi3.0.02.161444.6627202.16
gcc7.2.0-openmpi3.1.12.071384.3225859.17
gcc8.1.0-openmpi4.0.046.325000.38151238.75
gcc8.1.0-openmpi4.0.140.331456.51151447.91
gcc8.1.0-openmpi4.0.237.6927505.5151335.24
gcc9.2.0-openmpi4.1.011.4135.743029.58
intel2019.5-openmpi4.1.12.427.722978.09
gcc11.2.0_binutils-openmpi4.1.271.09120.333689.95
intel2017.4-mvapich22.3b3.3941.952353.34
gcc7.2.0-mvapich22.3b3.8343.092482.83
intel2017.4-mvapich22.3rc13.34117.871749.3

32 nodes

modules | 8 bytes | 8 KBytes | 256 KByte
intel2017.1-impi2017.12.0180.83150867.56
intel2017.4-impi2017.42.181904.4619056.69
intel2017.6-impi2017.62.122008.6119266.41
intel2017.7-impi2017.72.091971.1519625.75
intel2018.0-impi2018.02.041927.3419498.88
intel2018.1-impi2018.12.21881.7819137.57
intel2018.2-impi2018.22.082125.0719339.7
intel2018.3-impi2018.32.091930.9519187.82
intel2018.4-impi2018.42.132080.8719898.92
gcc7.2.0-openmpi1.10.4-apps2.993251.3634065.98
gcc7.1.0-openmpi1.10.72.73594.7334956.37
gcc7.2.0-openmpi2.0.23.644148.6136309.81
gcc7.2.0-openmpi2.1.310.693662.0236651.8
gcc7.2.0-openmpi3.0.02.63496.6334586.05
gcc7.2.0-openmpi3.1.15.284253.8538148.41
gcc8.1.0-openmpi4.0.051.4993981.27161737.09
gcc8.1.0-openmpi4.0.138.2492800.89162169.02
gcc8.1.0-openmpi4.0.262.0394489.72164897.85
gcc9.2.0-openmpi4.1.02.1443.951822.9
intel2019.5-openmpi4.1.12.1426.831224.44
gcc11.2.0_binutils-openmpi4.1.22.9363.722110.73
intel2017.4-mvapich22.3b1.753.91104.0
gcc7.2.0-mvapich22.3b1.8857.461165.56
intel2017.4-mvapich22.3rc12.19122.971102.28

osu_reduce

Figures

4 Nodes

8 Bytes

osu_reduce Benchmark 8 Bytes

8 KBytes

osu_reduce Benchmark 8 KBytes

1 MBytes

osu_reduce Benchmark 1 MBytes

16 Nodes

8 Bytes

osu_reduce Benchmark 8 Bytes

8 KBytes

osu_reduce Benchmark 8 KBytes

512 KBytes

osu_reduce Benchmark 512 KBytes

32 Nodes

8 Bytes

osu_reduce Benchmark 8 Bytes

8 KBytes

osu_reduce Benchmark 8 KBytes

256 KBytes

osu_reduce Benchmark 256 KBytes

Table

4 nodes

modules | 8 bytes | 8 KBytes | 1 MByte
intel2017.1-impi2017.12.0958.632960.73
intel2017.4-impi2017.41.9224.392768.31
intel2017.6-impi2017.61.8523.942874.59
intel2017.7-impi2017.71.8824.342813.5
intel2018.0-impi2018.01.8523.442734.39
intel2018.1-impi2018.11.9123.352781.43
intel2018.2-impi2018.21.8623.362862.96
intel2018.3-impi2018.31.9123.652876.59
intel2018.4-impi2018.41.923.512847.51
gcc7.2.0-openmpi1.10.4-apps1.7919.235075.14
gcc7.1.0-openmpi1.10.71.719.055050.15
gcc7.2.0-openmpi2.0.22.2921.254542.04
gcc7.2.0-openmpi2.1.31.7519.114537.27
gcc7.2.0-openmpi3.0.01.819.054657.91
gcc7.2.0-openmpi3.1.12.1719.134444.12
gcc8.1.0-openmpi4.0.025.9538.315589.22
gcc8.1.0-openmpi4.0.129.539.165667.32
gcc8.1.0-openmpi4.0.226.1338.025600.55
gcc9.2.0-openmpi4.1.06.9213.0210815.21
intel2019.5-openmpi4.1.11.9116.11588.95
gcc11.2.0_binutils-openmpi4.1.239.264.311482.73
intel2017.4-mvapich22.3b1.0416.99744.46
gcc7.2.0-mvapich22.3b1.0710.95909.25
intel2017.4-mvapich22.3rc10.7617.44853.38

16 nodes

modules | 8 bytes | 8 KBytes | 512 KByte
intel2017.1-impi2017.12.9156.121642.25
intel2017.4-impi2017.41.9739.341337.19
intel2017.6-impi2017.61.935.881310.75
intel2017.7-impi2017.71.933.231309.72
intel2018.0-impi2018.01.8932.761285.05
intel2018.1-impi2018.11.9131.791366.02
intel2018.2-impi2018.22.0744.141302.94
intel2018.3-impi2018.31.9640.441310.94
intel2018.4-impi2018.41.9435.351311.76
gcc7.2.0-openmpi1.10.4-apps1.9520.33999.97
gcc7.1.0-openmpi1.10.72.5619.27990.41
gcc7.2.0-openmpi2.0.22.4122.011000.29
gcc7.2.0-openmpi2.1.31.9519.82873.29
gcc7.2.0-openmpi3.0.01.9420.01935.46
gcc7.2.0-openmpi3.1.12.2519.22855.86
gcc8.1.0-openmpi4.0.041.5147.421417.6
gcc8.1.0-openmpi4.0.174.1948.231468.79
gcc8.1.0-openmpi4.0.242.1551.451410.59
gcc9.2.0-openmpi4.1.010.0914.12469.72
intel2019.5-openmpi4.1.12.6415.1543.02
gcc11.2.0_binutils-openmpi4.1.278.8184.42619.67
intel2017.4-mvapich22.3b1.0816.75368.05
gcc7.2.0-mvapich22.3b1.110.98460.89
intel2017.4-mvapich22.3rc10.8317.491178.56

32 nodes

modules | 8 bytes | 8 KBytes | 256 KByte
intel2017.1-impi2017.12.82179.95775.99
intel2017.4-impi2017.42.0246.5657.76
intel2017.6-impi2017.61.9437.99757.34
intel2017.7-impi2017.71.9142.47642.5
intel2018.0-impi2018.01.933.45684.88
intel2018.1-impi2018.12.0846.28672.62
intel2018.2-impi2018.21.9236.62638.13
intel2018.3-impi2018.32.3182.34828.75
intel2018.4-impi2018.42.0670.25645.57
gcc7.2.0-openmpi1.10.4-apps2.5320.92448.16
gcc7.1.0-openmpi1.10.72.7522.2447.59
gcc7.2.0-openmpi2.0.23.7223.01542.11
gcc7.2.0-openmpi2.1.32.4219.34491.57
gcc7.2.0-openmpi3.0.03.0521.09540.15
gcc7.2.0-openmpi3.1.12.1920.01436.82
gcc8.1.0-openmpi4.0.097.96111.53644.95
gcc8.1.0-openmpi4.0.140.2854.69646.26
gcc8.1.0-openmpi4.0.241.9561.1645.71
gcc9.2.0-openmpi4.1.02.1513.08122.91
intel2019.5-openmpi4.1.11.8915.77213.55
gcc11.2.0_binutils-openmpi4.1.22.9610.96229.69
intel2017.4-mvapich22.3b1.0117.46191.84
gcc7.2.0-mvapich22.3b0.8510.81232.32
intel2017.4-mvapich22.3rc10.7717.39549.3

osu_reduce_scatter

Figures

4 Nodes

8 Bytes

osu_reduce_scatter Benchmark 8 Bytes

8 KBytes

osu_reduce_scatter Benchmark 8 KBytes

1 MBytes

osu_reduce_scatter Benchmark 1 MBytes

16 Nodes

8 Bytes

osu_reduce_scatter Benchmark 8 Bytes

8 KBytes

osu_reduce_scatter Benchmark 8 KBytes

512 KBytes

osu_reduce_scatter Benchmark 512 KBytes

32 Nodes

8 Bytes

osu_reduce_scatter Benchmark 8 Bytes

8 KBytes

osu_reduce_scatter Benchmark 8 KBytes

256 KBytes

osu_reduce_scatter Benchmark 256 KBytes

Table

4 nodes

modules | 8 bytes | 8 KBytes | 1 MByte
intel2017.1-impi2017.13.4298.03938.23
intel2017.4-impi2017.43.2763.353208.94
intel2017.6-impi2017.63.1263.973123.55
intel2017.7-impi2017.73.1364.013208.91
intel2018.0-impi2018.03.263.253236.52
intel2018.1-impi2018.13.1263.733226.35
intel2018.2-impi2018.23.1464.93146.54
intel2018.3-impi2018.33.3263.023215.32
intel2018.4-impi2018.43.1463.193229.49
gcc7.2.0-openmpi1.10.4-apps2.59110.84965.86
gcc7.1.0-openmpi1.10.72.1856.913351.59
gcc7.2.0-openmpi2.0.23.1459.622374.48
gcc7.2.0-openmpi2.1.32.3159.242357.09
gcc7.2.0-openmpi3.0.02.1357.442418.51
gcc7.2.0-openmpi3.1.12.2758.182432.97
gcc8.1.0-openmpi4.0.052.28379.892306.6
gcc8.1.0-openmpi4.0.155.0487.222357.21
gcc8.1.0-openmpi4.0.252.1372.322329.16
gcc9.2.0-openmpi4.1.06.4694.423358.69
intel2019.5-openmpi4.1.12.3236.653464.31
gcc11.2.0_binutils-openmpi4.1.227.39382.384080.42
intel2017.4-mvapich22.3b2.466.434614.09
gcc7.2.0-mvapich22.3b2.6369.074885.29
intel2017.4-mvapich22.3rc11.6567.467227.98

16 nodes

modules | 8 bytes | 8 KBytes | 512 KByte
intel2017.1-impi2017.128.59510.25090.49
intel2017.4-impi2017.48.4594.342152.55
intel2017.6-impi2017.66.2692.262042.47
intel2017.7-impi2017.76.0783.912010.81
intel2018.0-impi2018.06.7982.432014.9
intel2018.1-impi2018.16.23100.452111.91
intel2018.2-impi2018.26.1493.882038.59
intel2018.3-impi2018.37.28100.072437.82
intel2018.4-impi2018.46.1783.262107.5
gcc7.2.0-openmpi1.10.4-apps3.978.014813.55
gcc7.1.0-openmpi1.10.73.6178.464622.38
gcc7.2.0-openmpi2.0.24.8278.914397.43
gcc7.2.0-openmpi2.1.35.4487.374597.46
gcc7.2.0-openmpi3.0.03.5379.035392.78
gcc7.2.0-openmpi3.1.15.4781.574912.9
gcc8.1.0-openmpi4.0.053.98896.3682845.74
gcc8.1.0-openmpi4.0.152.45766.9582899.38
gcc8.1.0-openmpi4.0.252.27763.3762356.7
gcc9.2.0-openmpi4.1.010.68225.31433.47
intel2019.5-openmpi4.1.14.0275.261566.6
gcc11.2.0_binutils-openmpi4.1.268.03684.152202.93
intel2017.4-mvapich22.3b5.5796.7313753.15
gcc7.2.0-mvapich22.3b5.5289.075356.0
intel2017.4-mvapich22.3rc13.3790.534050.13

32 nodes

modules | 8 bytes | 8 KBytes | 256 KByte
intel2017.1-impi2017.133.96579.277454.0
intel2017.4-impi2017.413.08117.812929.85
intel2017.6-impi2017.69.99120.02298.84
intel2017.7-impi2017.79.66105.922202.36
intel2018.0-impi2018.010.56125.312339.26
intel2018.1-impi2018.110.18141.052367.05
intel2018.2-impi2018.212.68118.512374.71
intel2018.3-impi2018.311.22107.182820.12
intel2018.4-impi2018.410.5143.862441.74
gcc7.2.0-openmpi1.10.4-apps6.9998.872836.8
gcc7.1.0-openmpi1.10.76.22128.752657.16
gcc7.2.0-openmpi2.0.29.2697.932356.25
gcc7.2.0-openmpi2.1.37.16109.92368.55
gcc7.2.0-openmpi3.0.05.4895.62616.07
gcc7.2.0-openmpi3.1.17.34102.752655.87
gcc8.1.0-openmpi4.0.058.86980.4870369.09
gcc8.1.0-openmpi4.0.176.41015.95133414.5
gcc8.1.0-openmpi4.0.260.331031.22117931.2
gcc9.2.0-openmpi4.1.04.55190.2810.47
intel2019.5-openmpi4.1.14.3248.58761.64
gcc11.2.0_binutils-openmpi4.1.27.16828.051459.34
intel2017.4-mvapich22.3b6.84162.112592.25
gcc7.2.0-mvapich22.3b9.61120.012147.77
intel2017.4-mvapich22.3rc15.57126.012760.69

osu_scatter

Figures

4 Nodes

8 Bytes

osu_scatter Benchmark 8 Bytes

8 KBytes

osu_scatter Benchmark 8 KBytes

1 MBytes

osu_scatter Benchmark 1 MBytes

16 Nodes

8 Bytes

osu_scatter Benchmark 8 Bytes

8 KBytes

osu_scatter Benchmark 8 KBytes

512 KBytes

osu_scatter Benchmark 512 KBytes

32 Nodes

8 Bytes

osu_scatter Benchmark 8 Bytes

8 KBytes

osu_scatter Benchmark 8 KBytes

256 KBytes

osu_scatter Benchmark 256 KBytes

Table

4 nodes

modules | 8 bytes | 8 KBytes | 1 MByte
intel2017.1-impi2017.19.77494.2935063.05
intel2017.4-impi2017.46.39129.46148.01
intel2017.6-impi2017.66.46130.576241.49
intel2017.7-impi2017.76.41128.336170.24
intel2018.0-impi2018.06.04129.526144.6
intel2018.1-impi2018.16.1128.976197.45
intel2018.2-impi2018.26.14128.197867.75
intel2018.3-impi2018.36.17128.576244.69
intel2018.4-impi2018.46.13128.236149.57
gcc7.2.0-openmpi1.10.4-apps20.13314.2512200.19
gcc7.1.0-openmpi1.10.77.82201.3912426.76
gcc7.2.0-openmpi2.0.29.03206.2312304.63
gcc7.2.0-openmpi2.1.37.67203.3212292.07
gcc7.2.0-openmpi3.0.07.89203.2612142.17
gcc7.2.0-openmpi3.1.17.73202.4712254.71
gcc8.1.0-openmpi4.0.0147.45861.9578787.45
gcc8.1.0-openmpi4.0.1147.15868.3482048.01
gcc8.1.0-openmpi4.0.2148.5872.6878425.01
gcc9.2.0-openmpi4.1.021.97497.026600.7
intel2019.5-openmpi4.1.14.73163.079041.41
gcc11.2.0_binutils-openmpi4.1.2115.53716.3489586.97
intel2017.4-mvapich22.3b4.63168.096140.56
gcc7.2.0-mvapich22.3b4.85165.186172.63
intel2017.4-mvapich22.3rc18.08174.716250.64

16 nodes

modules | 8 bytes | 8 KBytes | 512 KByte
intel2017.1-impi2017.161.43602.13217110.9
intel2017.4-impi2017.414.9464.917444.28
intel2017.6-impi2017.615.75468.7117794.76
intel2017.7-impi2017.715.22469.8417955.2
intel2018.0-impi2018.015.01467.8417855.1
intel2018.1-impi2018.114.84466.4517613.55
intel2018.2-impi2018.212.01466.3517034.35
intel2018.3-impi2018.315.11467.1217438.21
intel2018.4-impi2018.415.78470.5717544.99
gcc7.2.0-openmpi1.10.4-apps11.57666.9923052.77
gcc7.1.0-openmpi1.10.710.87668.7325091.8
gcc7.2.0-openmpi2.0.214.87695.7624773.74
gcc7.2.0-openmpi2.1.313.44700.8426492.76
gcc7.2.0-openmpi3.0.012.2681.9526925.91
gcc7.2.0-openmpi3.1.111.39696.0423193.56
gcc8.1.0-openmpi4.0.0194.343265.55212722.6
gcc8.1.0-openmpi4.0.1210.593288.22216925.57
gcc8.1.0-openmpi4.0.2209.583276.12214352.7
gcc9.2.0-openmpi4.1.036.924282.72271491.07
intel2019.5-openmpi4.1.15.99484.1617420.46
gcc11.2.0_binutils-openmpi4.1.2131.893917.87277926.46
intel2017.4-mvapich22.3b6.72436.3317819.36
gcc7.2.0-mvapich22.3b7.3421.4416865.35
intel2017.4-mvapich22.3rc19.81518.917480.0

32 nodes

modules | 8 bytes | 8 KBytes | 256 KByte
intel2017.1-impi2017.182.317186.8203329.6
intel2017.4-impi2017.420.84919.7717851.44
intel2017.6-impi2017.617.45919.4317813.98
intel2017.7-impi2017.725.14917.1717723.02
intel2018.0-impi2018.015.05917.322784.47
intel2018.1-impi2018.122.07917.4118099.03
intel2018.2-impi2018.215.03915.6616990.55
intel2018.3-impi2018.318.24915.9417327.08
intel2018.4-impi2018.425.56925.2718121.08
gcc7.2.0-openmpi1.10.4-apps15.191289.1135215.13
gcc7.1.0-openmpi1.10.718.651321.8233373.68
gcc7.2.0-openmpi2.0.219.231340.3134844.84
gcc7.2.0-openmpi2.1.318.721345.6366562.13
gcc7.2.0-openmpi3.0.016.231316.134347.06
gcc7.2.0-openmpi3.1.118.041349.9934270.5
gcc8.1.0-openmpi4.0.0571.386178.96247739.17
gcc8.1.0-openmpi4.0.1200.815864.14242965.59
gcc8.1.0-openmpi4.0.2207.846433.16246344.02
gcc9.2.0-openmpi4.1.021.6910413.28300342.75
intel2019.5-openmpi4.1.15.68943.6918429.23
gcc11.2.0_binutils-openmpi4.1.245.056825.97311272.33
intel2017.4-mvapich22.3b11.44729.7717580.5
gcc7.2.0-mvapich22.3b10.2720.2516973.76
intel2017.4-mvapich22.3rc1

Non-Blocking Collectives

osu_iallgather

Figures

4 Nodes

8 Bytes

osu_iallgather Benchmark 8 Bytes

8 KBytes

osu_iallgather Benchmark 8 KBytes

1 MBytes

osu_iallgather Benchmark 1 MBytes

16 Nodes

8 Bytes

osu_iallgather Benchmark 8 Bytes

8 KBytes

osu_iallgather Benchmark 8 KBytes

512 KBytes

osu_iallgather Benchmark 512 KBytes

32 Nodes

8 Bytes

osu_iallgather Benchmark 8 Bytes

8 KBytes

osu_iallgather Benchmark 8 KBytes

256 KBytes

osu_iallgather Benchmark 256 KBytes

Table

4 nodes

modules | 8 bytes | 8 KBytes | 1 MByte
intel2017.1-impi2017.172.56616.99447406.27
intel2017.4-impi2017.40.06089.63401841.1
intel2017.6-impi2017.645.236201.27390845.88
intel2017.7-impi2017.724.166099.09404967.51
intel2018.0-impi2018.045.836110.89380587.04
intel2018.1-impi2018.146.65975.01381222.3
intel2018.2-impi2018.246.295979.25381211.88
intel2018.3-impi2018.30.05973.22386581.6
intel2018.4-impi2018.418.415954.35388681.01
gcc7.2.0-openmpi1.10.4-apps601.3918268.441046859.86
gcc7.1.0-openmpi1.10.7663.117297.541048838.64
gcc7.2.0-openmpi2.0.20.521342.151152702.66
gcc7.2.0-openmpi2.1.3376.6317506.561087871.02
gcc7.2.0-openmpi3.0.0392.7416818.711092399.07
gcc7.2.0-openmpi3.1.1378.8420074.291111082.84
gcc8.1.0-openmpi4.0.012976.02
gcc8.1.0-openmpi4.0.113177.18
gcc8.1.0-openmpi4.0.221045.73
gcc9.2.0-openmpi4.1.03983.7427459.392773977.54
intel2019.5-openmpi4.1.1203.9816019.61330715.71
gcc11.2.0_binutils-openmpi4.1.238280.65
intel2017.4-mvapich22.3b60.83009.19215209.11
gcc7.2.0-mvapich22.3b56.562976.39207167.93
intel2017.4-mvapich22.3rc1202.572981.98233832.89

16 nodes

modules | 8 bytes | 8 KBytes | 512 KByte
intel2017.1-impi2017.1727.025520.961062173.23
intel2017.4-impi2017.426.825164.95805028.62
intel2017.6-impi2017.626.0425256.02820006.33
intel2017.7-impi2017.7139.0925280.59811391.94
intel2018.0-impi2018.025.1824658.6786703.02
intel2018.1-impi2018.1145.8724742.65799437.06
intel2018.2-impi2018.245.7624831.47787022.09
intel2018.3-impi2018.3164.5327461.83800691.68
intel2018.4-impi2018.4157.7724824.27811404.85
gcc7.2.0-openmpi1.10.4-apps5608.86
gcc7.1.0-openmpi1.10.76961.05
gcc7.2.0-openmpi2.0.22582.51
gcc7.2.0-openmpi2.1.32381.17
gcc7.2.0-openmpi3.0.01366.88
gcc7.2.0-openmpi3.1.12301.08
gcc8.1.0-openmpi4.0.0
gcc8.1.0-openmpi4.0.1
gcc8.1.0-openmpi4.0.2
gcc9.2.0-openmpi4.1.012402.01172143.67
intel2019.5-openmpi4.1.12411.29
gcc11.2.0_binutils-openmpi4.1.2
intel2017.4-mvapich22.3b46.7313894.37400247.47
gcc7.2.0-mvapich22.3b40.5113436.82375556.33
intel2017.4-mvapich22.3rc11109.7512166.15385108.62

32 nodes

modules | 8 bytes | 8 KBytes | 256 KByte
intel2017.1-impi2017.11160.7650764.211335155.61
intel2017.4-impi2017.454.6851561.54837929.32
intel2017.6-impi2017.60.054003.72818599.51
intel2017.7-impi2017.750.9454264.92820229.02
intel2018.0-impi2018.00.050115.44812661.31
intel2018.1-impi2018.167.8950252.17799661.26
intel2018.2-impi2018.2304.0452913.01819011.17
intel2018.3-impi2018.358.5253377.99952117.09
intel2018.4-impi2018.4318.2852535.66799577.67
gcc7.2.0-openmpi1.10.4-apps16520.07
gcc7.1.0-openmpi1.10.723624.33
gcc7.2.0-openmpi2.0.29104.55
gcc7.2.0-openmpi2.1.37399.91
gcc7.2.0-openmpi3.0.07458.8
gcc7.2.0-openmpi3.1.17064.13
gcc8.1.0-openmpi4.0.0
gcc8.1.0-openmpi4.0.1
gcc8.1.0-openmpi4.0.2
gcc9.2.0-openmpi4.1.06.0
intel2019.5-openmpi4.1.14538.82
gcc11.2.0_binutils-openmpi4.1.2
intel2017.4-mvapich22.3b250.0537229.09492424.92
gcc7.2.0-mvapich22.3b43.5127412.93403120.27
intel2017.4-mvapich22.3rc12122.2430942.21369348.05

osu_iallreduce

Figures

4 Nodes

8 Bytes

osu_iallreduce Benchmark 8 Bytes

8 KBytes

osu_iallreduce Benchmark 8 KBytes

1 MBytes

osu_iallreduce Benchmark 1 MBytes

16 Nodes

8 Bytes

osu_iallreduce Benchmark 8 Bytes

8 KBytes

osu_iallreduce Benchmark 8 KBytes

512 KBytes

osu_iallreduce Benchmark 512 KBytes

32 Nodes

8 Bytes

osu_iallreduce Benchmark 8 Bytes

8 KBytes

osu_iallreduce Benchmark 8 KBytes

256 KBytes

osu_iallreduce Benchmark 256 KBytes

Table

4 nodes

modules | 8 bytes | 8 KBytes | 1 MByte
intel2017.1-impi2017.196.93306.118389.14
intel2017.4-impi2017.40.0105.567283.56
intel2017.6-impi2017.625.81107.287234.98
intel2017.7-impi2017.712.52108.017186.1
intel2018.0-impi2018.00.0108.587206.79
intel2018.1-impi2018.111.22141.67089.89
intel2018.2-impi2018.225.94107.357083.97
intel2018.3-impi2018.312.72106.465247.81
intel2018.4-impi2018.412.8105.255211.65
gcc7.2.0-openmpi1.10.4-apps15.72299.932387.21
gcc7.1.0-openmpi1.10.729.81299.3411317.3
gcc7.2.0-openmpi2.0.263.77347.119714.75
gcc7.2.0-openmpi2.1.330.84296.288737.68
gcc7.2.0-openmpi3.0.031.16296.159358.36
gcc7.2.0-openmpi3.1.10.0297.259183.55
gcc8.1.0-openmpi4.0.0805.842467.188607.9
gcc8.1.0-openmpi4.0.1350.742527.588637.66
gcc8.1.0-openmpi4.0.2813.282531.678655.14
gcc9.2.0-openmpi4.1.0143.77384.798365.76
intel2019.5-openmpi4.1.133.17274.097322.22
gcc11.2.0_binutils-openmpi4.1.24799.821655.8111109.35
intel2017.4-mvapich22.3b30.86124.454321.7
gcc7.2.0-mvapich22.3b27.9122.94405.62
intel2017.4-mvapich22.3rc129.83270.6524665.53

16 nodes

modules | 8 bytes | 8 KBytes | 512 KByte
intel2017.1-impi2017.1444.86697.156834.27
intel2017.4-impi2017.438.97155.463562.82
intel2017.6-impi2017.622.4147.273462.03
intel2017.7-impi2017.742.95147.893442.54
intel2018.0-impi2018.037.72157.093453.39
intel2018.1-impi2018.136.24152.363398.73
intel2018.2-impi2018.220.25151.223378.06
intel2018.3-impi2018.30.0175.832488.34
intel2018.4-impi2018.441.53192.942568.39
gcc7.2.0-openmpi1.10.4-apps58.95476.7959456.24
gcc7.1.0-openmpi1.10.752.57472.0859162.61
gcc7.2.0-openmpi2.0.254.11573.318158.55
gcc7.2.0-openmpi2.1.355.09454.0115005.35
gcc7.2.0-openmpi3.0.049.81461.2511255.52
gcc7.2.0-openmpi3.1.166.06467.4314496.31
gcc8.1.0-openmpi4.0.0515.245730.8216351.66
gcc8.1.0-openmpi4.0.1501.785674.4818344.28
gcc8.1.0-openmpi4.0.21118.735693.916631.55
gcc9.2.0-openmpi4.1.0287.11710.154477.58
intel2019.5-openmpi4.1.178.34418.613533.6
gcc11.2.0_binutils-openmpi4.1.21230.683162.146905.31
intel2017.4-mvapich22.3b35.74172.561768.59
gcc7.2.0-mvapich22.3b71.55272.272514.5
intel2017.4-mvapich22.3rc1122.8436.8411231.19

32 nodes

modules | 8 bytes | 8 KBytes | 256 KByte
intel2017.1-impi2017.1264.021789.7916963.81
intel2017.4-impi2017.4120.27642.282174.55
intel2017.6-impi2017.6120.38261.312059.15
intel2017.7-impi2017.748.52270.491827.7
intel2018.0-impi2018.078.7218.481910.62
intel2018.1-impi2018.138.56256.571980.76
intel2018.2-impi2018.247.94184.272026.49
intel2018.3-impi2018.355.18239.011405.31
intel2018.4-impi2018.472.2263.41456.36
gcc7.2.0-openmpi1.10.4-apps93.956486.56342003.98
gcc7.1.0-openmpi1.10.785.7572.46197435.36
gcc7.2.0-openmpi2.0.2126.46652.0434436.2
gcc7.2.0-openmpi2.1.3100.7570.527261.33
gcc7.2.0-openmpi3.0.00.0623.6426463.09
gcc7.2.0-openmpi3.1.1115.76587.133557.17
gcc8.1.0-openmpi4.0.01547.6811782.6328657.7
gcc8.1.0-openmpi4.0.11442.87187.0123920.47
gcc8.1.0-openmpi4.0.21629.9912759.6926158.15
gcc9.2.0-openmpi4.1.0345.06788.833376.68
intel2019.5-openmpi4.1.1143.59580.642316.46
gcc11.2.0_binutils-openmpi4.1.2898.53711.85189.48
intel2017.4-mvapich22.3b61.82254.541309.38
gcc7.2.0-mvapich22.3b56.18240.671307.15
intel2017.4-mvapich22.3rc185.21307.6116020.7

osu_ialltoall

Figures

4 Nodes

8 Bytes

osu_ialltoall Benchmark 8 Bytes

8 KBytes

osu_ialltoall Benchmark 8 KBytes

1 MBytes

osu_ialltoall Benchmark 1 MBytes

16 Nodes

8 Bytes

osu_ialltoall Benchmark 8 Bytes

8 KBytes

osu_ialltoall Benchmark 8 KBytes

512 KBytes

osu_ialltoall Benchmark 512 KBytes

32 Nodes

8 Bytes

osu_ialltoall Benchmark 8 Bytes

8 KBytes

osu_ialltoall Benchmark 8 KBytes

256 KBytes

osu_ialltoall Benchmark 256 KBytes

Table

4 nodes

modules | 8 bytes | 8 KBytes | 1 MByte
intel2017.1-impi2017.1287.2145631.822660989.62
intel2017.4-impi2017.4129.8814577.831353090.75
intel2017.6-impi2017.6116.0814493.761344869.13
intel2017.7-impi2017.70.014506.951350741.77
intel2018.0-impi2018.0114.7414169.811310500.99
intel2018.1-impi2018.1116.8514180.521309638.54
intel2018.2-impi2018.253.3214187.541315370.55
intel2018.3-impi2018.3130.514279.41312135.88
intel2018.4-impi2018.450.1114201.61311033.57
gcc7.2.0-openmpi1.10.4-apps601.9118398.09966028.48
gcc7.1.0-openmpi1.10.7642.418218.141049184.97
gcc7.2.0-openmpi2.0.2248.3718488.791114060.93
gcc7.2.0-openmpi2.1.31.3917290.291082812.16
gcc7.2.0-openmpi3.0.0396.9718215.731081852.59
gcc7.2.0-openmpi3.1.1192.7318106.261075919.71
gcc8.1.0-openmpi4.0.020702.23
gcc8.1.0-openmpi4.0.113657.04
gcc8.1.0-openmpi4.0.257.94
gcc9.2.0-openmpi4.1.02685.0529496.262849601.19
intel2019.5-openmpi4.1.1
gcc11.2.0_binutils-openmpi4.1.224674.31
intel2017.4-mvapich22.3b138.0516652.281488879.88
gcc7.2.0-mvapich22.3b113.8216052.671417142.78
intel2017.4-mvapich22.3rc1395.2416233.841427236.73

16 nodes

modules | 8 bytes | 8 KBytes | 512 KByte
intel2017.1-impi2017.11459.18
intel2017.4-impi2017.4181.5574230.983424146.14
intel2017.6-impi2017.6405.9677098.973445716.23
intel2017.7-impi2017.70.076856.823442044.39
intel2018.0-impi2018.0167.4472013.753331513.84
intel2018.1-impi2018.1174.2269779.153331679.49
intel2018.2-impi2018.2174.7371693.473324345.54
intel2018.3-impi2018.3180.9688315.785078969.08
intel2018.4-impi2018.4180.3692558.424807165.45
gcc7.2.0-openmpi1.10.4-apps4.12
gcc7.1.0-openmpi1.10.73525.05
gcc7.2.0-openmpi2.0.21119.0
gcc7.2.0-openmpi2.1.32259.81
gcc7.2.0-openmpi3.0.00.0
gcc7.2.0-openmpi3.1.17871.36
gcc8.1.0-openmpi4.0.0
gcc8.1.0-openmpi4.0.1
gcc8.1.0-openmpi4.0.2
gcc9.2.0-openmpi4.1.025015.39203666.35
intel2019.5-openmpi4.1.11209.47
gcc11.2.0_binutils-openmpi4.1.2
intel2017.4-mvapich22.3b434.2796621.913518552.12
gcc7.2.0-mvapich22.3b189.8108821.675347956.39
intel2017.4-mvapich22.3rc1266.83202724.52

32 nodes

modules | 8 bytes | 8 KBytes | 256 KByte
intel2017.1-impi2017.13689.24
intel2017.4-impi2017.41535.78274672.35
intel2017.6-impi2017.6449.5251371.96
intel2017.7-impi2017.7482.5205635.72
intel2018.0-impi2018.00.0251310.89
intel2018.1-impi2018.1428.18213421.645952284.75
intel2018.2-impi2018.21092.69282470.93
intel2018.3-impi2018.30.0231851.54
intel2018.4-impi2018.41132.66211315.865763847.83
gcc7.2.0-openmpi1.10.4-apps34256.69
gcc7.1.0-openmpi1.10.777243.4
gcc7.2.0-openmpi2.0.27892.11
gcc7.2.0-openmpi2.1.37695.45
gcc7.2.0-openmpi3.0.020117.81
gcc7.2.0-openmpi3.1.138090.42
gcc8.1.0-openmpi4.0.0
gcc8.1.0-openmpi4.0.1
gcc8.1.0-openmpi4.0.2
gcc9.2.0-openmpi4.1.059279.86
intel2019.5-openmpi4.1.18218.06
gcc11.2.0_binutils-openmpi4.1.2
intel2017.4-mvapich22.3b837.3257885.45
gcc7.2.0-mvapich22.3b1289.2252011.03
intel2017.4-mvapich22.3rc11929.94484457.97

osu_ibcast

Figures

4 Nodes

8 Bytes

osu_ibcast Benchmark 8 Bytes

8 KBytes

osu_ibcast Benchmark 8 KBytes

1 MBytes

osu_ibcast Benchmark 1 MBytes

16 Nodes

8 Bytes

osu_ibcast Benchmark 8 Bytes

8 KBytes

osu_ibcast Benchmark 8 KBytes

512 KBytes

osu_ibcast Benchmark 512 KBytes

32 Nodes

8 Bytes

osu_ibcast Benchmark 8 Bytes

8 KBytes

osu_ibcast Benchmark 8 KBytes

256 KBytes

osu_ibcast Benchmark 256 KBytes

Table

4 nodes

modules | 8 bytes | 8 KBytes | 1 MByte
intel2017.1-impi2017.116.257.566416.9
intel2017.4-impi2017.411.0827.45759.36
intel2017.6-impi2017.68.0528.245792.46
intel2017.7-impi2017.711.727.875845.31
intel2018.0-impi2018.07.9728.225564.0
intel2018.1-impi2018.115.7828.145675.05
intel2018.2-impi2018.23.5428.165670.04
intel2018.3-impi2018.34.5827.695617.22
intel2018.4-impi2018.414.9828.035670.52
gcc7.2.0-openmpi1.10.4-apps5.82142.653349.17
gcc7.1.0-openmpi1.10.710.87141.353321.18
gcc7.2.0-openmpi2.0.221.25169.4910200.93
gcc7.2.0-openmpi2.1.327.71143.79343.43
gcc7.2.0-openmpi3.0.010.76144.699558.41
gcc7.2.0-openmpi3.1.15.74142.369615.29
gcc8.1.0-openmpi4.0.0156.39803.53148922.38
gcc8.1.0-openmpi4.0.1147.67768.27147363.4
gcc8.1.0-openmpi4.0.2149.51760.06147631.8
gcc9.2.0-openmpi4.1.030.21113.349150.05
intel2019.5-openmpi4.1.16.42138.0212870.35
gcc11.2.0_binutils-openmpi4.1.2
intel2017.4-mvapich22.3b
gcc7.2.0-mvapich22.3b6.6235.053480.97
intel2017.4-mvapich22.3rc18.1938.521976.7

16 nodes

modules | 8 bytes | 8 KBytes | 512 KByte
intel2017.1-impi2017.166.91123.479699.15
intel2017.4-impi2017.410.534.376535.14
intel2017.6-impi2017.65.0533.985937.77
intel2017.7-impi2017.78.2434.595724.92
intel2018.0-impi2018.010.1834.435943.77
intel2018.1-impi2018.19.8833.995622.13
intel2018.2-impi2018.210.335.276465.05
intel2018.3-impi2018.318.3733.676210.65
intel2018.4-impi2018.46.5234.516670.25
gcc7.2.0-openmpi1.10.4-apps16.11267.786565.51
gcc7.1.0-openmpi1.10.711.32266.186498.54
gcc7.2.0-openmpi2.0.227.92307.279315.69
gcc7.2.0-openmpi2.1.39.05269.237755.78
gcc7.2.0-openmpi3.0.08.24263.478260.9
gcc7.2.0-openmpi3.1.113.02262.598536.95
gcc8.1.0-openmpi4.0.0206.351497.08160375.24
gcc8.1.0-openmpi4.0.1280.551484.91158292.61
gcc8.1.0-openmpi4.0.2209.131488.7158184.63
gcc9.2.0-openmpi4.1.050.18188.8810677.1
intel2019.5-openmpi4.1.115.81267.712154.87
gcc11.2.0_binutils-openmpi4.1.2276.861225.55166316.87
intel2017.4-mvapich22.3b9.4345.117129.13
gcc7.2.0-mvapich22.3b19.9847.488442.55
intel2017.4-mvapich22.3rc116.0848.161257.72

32 nodes

modules | 8 bytes | 8 KBytes | 256 KByte
intel2017.1-impi2017.1109.34154.511360.17
intel2017.4-impi2017.416.2642.3513799.99
intel2017.6-impi2017.68.3743.4712679.03
intel2017.7-impi2017.79.3457.3512609.04
intel2018.0-impi2018.014.8541.5713298.46
intel2018.1-impi2018.119.1842.6612825.56
intel2018.2-impi2018.27.0539.5711233.9
intel2018.3-impi2018.344.3142.5511860.41
intel2018.4-impi2018.47.5839.7411475.38
gcc7.2.0-openmpi1.10.4-apps16.03322.276867.65
gcc7.1.0-openmpi1.10.717.76328.296827.35
gcc7.2.0-openmpi2.0.234.29375.665634.6
gcc7.2.0-openmpi2.1.311.96324.884848.14
gcc7.2.0-openmpi3.0.023.73321.35038.65
gcc7.2.0-openmpi3.1.116.84322.435317.09
gcc8.1.0-openmpi4.0.0322.682040.1192811.45
gcc8.1.0-openmpi4.0.1241.952001.7595063.99
gcc8.1.0-openmpi4.0.264.992029.7791896.77
gcc9.2.0-openmpi4.1.030.85213.355485.23
intel2019.5-openmpi4.1.112.78313.378001.63
gcc11.2.0_binutils-openmpi4.1.232.671626.52109727.83
intel2017.4-mvapich22.3b10.0850.8115438.61
gcc7.2.0-mvapich22.3b75.42116.0712842.07
intel2017.4-mvapich22.3rc112.2555.31733.37

osu_igather

Figures

4 Nodes

8 Bytes

osu_igather Benchmark 8 Bytes

8 KBytes

osu_igather Benchmark 8 KBytes

1 MBytes

osu_igather Benchmark 1 MBytes

16 Nodes

8 Bytes

osu_igather Benchmark 8 Bytes

8 KBytes

osu_igather Benchmark 8 KBytes

512 KBytes

osu_igather Benchmark 512 KBytes

32 Nodes

8 Bytes

osu_igather Benchmark 8 Bytes

8 KBytes

osu_igather Benchmark 8 KBytes

256 KBytes

osu_igather Benchmark 256 KBytes

Table

4 nodes

modules | 8 bytes | 8 KBytes | 1 MByte
intel2017.1-impi2017.15.465.8839722.78
intel2017.4-impi2017.43.2787.3214649.0
intel2017.6-impi2017.621.4686.2414593.67
intel2017.7-impi2017.72.5187.8415225.72
intel2018.0-impi2018.03.4784.613974.7
intel2018.1-impi2018.15.4685.2814035.63
intel2018.2-impi2018.23.5584.1714119.21
intel2018.3-impi2018.325.0885.5214106.59
intel2018.4-impi2018.45.4484.0814090.19
gcc7.2.0-openmpi1.10.4-apps23.29526.6135139.19
gcc7.1.0-openmpi1.10.723.09532.0535140.35
gcc7.2.0-openmpi2.0.231.17537.4938894.43
gcc7.2.0-openmpi2.1.30.75500.3234327.27
gcc7.2.0-openmpi3.0.012.53501.1534370.88
gcc7.2.0-openmpi3.1.11.54501.3634575.6
gcc8.1.0-openmpi4.0.026.732.44246902.99
gcc8.1.0-openmpi4.0.115.1131.84253126.79
gcc8.1.0-openmpi4.0.210.7627.21242312.76
gcc9.2.0-openmpi4.1.010.57124.8628300.18
intel2019.5-openmpi4.1.111.9491.3434991.64
gcc11.2.0_binutils-openmpi4.1.26.7463.65146522.96
intel2017.4-mvapich22.3b5.08139.718631.3
gcc7.2.0-mvapich22.3b5.06137.0218354.15
intel2017.4-mvapich22.3rc15.0698.9917547.99

16 nodes

modules | 8 bytes | 8 KBytes | 512 KByte
intel2017.1-impi2017.145.86221.6522648.53
intel2017.4-impi2017.42.52179.8911811.87
intel2017.6-impi2017.65.42167.3111735.06
intel2017.7-impi2017.72.53165.4811781.74
intel2018.0-impi2018.05.63161.1911244.44
intel2018.1-impi2018.15.51163.0911294.15
intel2018.2-impi2018.22.52166.2311527.83
intel2018.3-impi2018.35.58200.7411418.28
intel2018.4-impi2018.45.69165.0511523.86
gcc7.2.0-openmpi1.10.4-apps9.522789.156563.13
gcc7.1.0-openmpi1.10.70.02943.1663305.6
gcc7.2.0-openmpi2.0.214.542208.1265597.17
gcc7.2.0-openmpi2.1.30.01973.060637.13
gcc7.2.0-openmpi3.0.09.851918.755540.22
gcc7.2.0-openmpi3.1.18.61941.956647.76
gcc8.1.0-openmpi4.0.096.04
gcc8.1.0-openmpi4.0.1642.89
gcc8.1.0-openmpi4.0.2502.85
gcc9.2.0-openmpi4.1.05.685.68189745.95
intel2019.5-openmpi4.1.14.672034.3762388.2
gcc11.2.0_binutils-openmpi4.1.275.36
intel2017.4-mvapich22.3b5.77263.6214191.07
gcc7.2.0-mvapich22.3b5.68262.2313995.87
intel2017.4-mvapich22.3rc13.24185.5513494.57

32 nodes

modules | 8 bytes | 8 KBytes | 256 KByte
intel2017.1-impi2017.143.41278.3513562.6
intel2017.4-impi2017.45.49209.196924.56
intel2017.6-impi2017.65.52202.837477.93
intel2017.7-impi2017.73.38214.876864.86
intel2018.0-impi2018.03.57199.046563.45
intel2018.1-impi2018.12.42229.546666.08
intel2018.2-impi2018.23.63205.177194.32
intel2018.3-impi2018.319.38197.86541.75
intel2018.4-impi2018.42.69198.326543.81
gcc7.2.0-openmpi1.10.4-apps8.576955.2266057.53
gcc7.1.0-openmpi1.10.75.178562.1476562.52
gcc7.2.0-openmpi2.0.25.315217.0864545.2
gcc7.2.0-openmpi2.1.36.224380.3159474.81
gcc7.2.0-openmpi3.0.02.494427.6958582.61
gcc7.2.0-openmpi3.1.12.594291.3558723.14
gcc8.1.0-openmpi4.0.0
gcc8.1.0-openmpi4.0.10.0
gcc8.1.0-openmpi4.0.2
gcc9.2.0-openmpi4.1.04.722360.76634542.7
intel2019.5-openmpi4.1.13.354060.3456194.1
gcc11.2.0_binutils-openmpi4.1.252.85
intel2017.4-mvapich22.3b5.41310.847762.29
gcc7.2.0-mvapich22.3b5.53318.087692.41
intel2017.4-mvapich22.3rc12.28216.377497.85

osu_ireduce

Figures

4 Nodes

8 Bytes

osu_ireduce Benchmark 8 Bytes

8 KBytes

osu_ireduce Benchmark 8 KBytes

1 MBytes

osu_ireduce Benchmark 1 MBytes

16 Nodes

8 Bytes

osu_ireduce Benchmark 8 Bytes

8 KBytes

osu_ireduce Benchmark 8 KBytes

512 KBytes

osu_ireduce Benchmark 512 KBytes

32 Nodes

8 Bytes

osu_ireduce Benchmark 8 Bytes

8 KBytes

osu_ireduce Benchmark 8 KBytes

256 KBytes

osu_ireduce Benchmark 256 KBytes

Table

4 nodes

modules | 8 bytes | 8 KBytes | 1 MByte
intel2017.1-impi2017.12.81162.896035.27
intel2017.4-impi2017.45.2847.886406.36
intel2017.6-impi2017.65.2847.946417.37
intel2017.7-impi2017.72.5648.16398.35
intel2018.0-impi2018.05.5149.86260.03
intel2018.1-impi2018.15.4748.516147.05
intel2018.2-impi2018.25.4847.916237.71
intel2018.3-impi2018.35.4749.346117.89
intel2018.4-impi2018.45.5347.616045.87
gcc7.2.0-openmpi1.10.4-apps6.2426.13307.15
gcc7.1.0-openmpi1.10.741.6626.543496.08
gcc7.2.0-openmpi2.0.214.836.612328.51
gcc7.2.0-openmpi2.1.35.9126.572399.99
gcc7.2.0-openmpi3.0.06.7727.12070.54
gcc7.2.0-openmpi3.1.15.9227.112423.19
gcc8.1.0-openmpi4.0.041.7855.842774.36
gcc8.1.0-openmpi4.0.133.3461.063172.69
gcc8.1.0-openmpi4.0.242.4957.342730.47
gcc9.2.0-openmpi4.1.011.27223.814853.36
intel2019.5-openmpi4.1.16.3187.784655.72
gcc11.2.0_binutils-openmpi4.1.260.35691.116998.02
intel2017.4-mvapich22.3b4.8348.754607.12
gcc7.2.0-mvapich22.3b2.955.485137.23
intel2017.4-mvapich22.3rc12.7213.33795.24

16 nodes

modules | 8 bytes | 8 KBytes | 512 KByte
intel2017.1-impi2017.133.2448.983917.77
intel2017.4-impi2017.45.3260.552905.14
intel2017.6-impi2017.65.2760.32949.18
intel2017.7-impi2017.72.5860.472904.38
intel2018.0-impi2018.05.5660.32827.55
intel2018.1-impi2018.15.5161.622803.8
intel2018.2-impi2018.23.6183.632902.03
intel2018.3-impi2018.35.571.572962.27
intel2018.4-impi2018.45.4662.262760.66
gcc7.2.0-openmpi1.10.4-apps6.4127.031766.97
gcc7.1.0-openmpi1.10.76.2330.161768.89
gcc7.2.0-openmpi2.0.217.1639.351127.75
gcc7.2.0-openmpi2.1.32.9428.11928.01
gcc7.2.0-openmpi3.0.06.928.24852.36
gcc7.2.0-openmpi3.1.125.0228.0964.15
gcc8.1.0-openmpi4.0.062.5569.251231.05
gcc8.1.0-openmpi4.0.163.9969.11250.49
gcc8.1.0-openmpi4.0.269.8378.941250.1
gcc9.2.0-openmpi4.1.049.09342.922441.0
intel2019.5-openmpi4.1.16.77113.132051.29
gcc11.2.0_binutils-openmpi4.1.2117.911136.123917.33
intel2017.4-mvapich22.3b5.37271.882378.01
gcc7.2.0-mvapich22.3b5.5480.032668.67
intel2017.4-mvapich22.3rc14.9614.031684.12

32 nodes

modules | 8 bytes | 8 KBytes | 256 KByte
intel2017.1-impi2017.16.62532.341943.89
intel2017.4-impi2017.42.5793.891619.27
intel2017.6-impi2017.65.2996.211457.77
intel2017.7-impi2017.75.34118.111465.28
intel2018.0-impi2018.03.6162.711528.08
intel2018.1-impi2018.127.69103.841479.67
intel2018.2-impi2018.22.5682.141351.85
intel2018.3-impi2018.35.5189.01498.26
intel2018.4-impi2018.45.4481.111343.92
gcc7.2.0-openmpi1.10.4-apps8.0729.11153.98
gcc7.1.0-openmpi1.10.78.6527.85917.95
gcc7.2.0-openmpi2.0.212.3638.99580.82
gcc7.2.0-openmpi2.1.36.330.91490.23
gcc7.2.0-openmpi3.0.07.2828.4511.36
gcc7.2.0-openmpi3.1.14.0829.52502.41
gcc8.1.0-openmpi4.0.063.3577.46640.62
gcc8.1.0-openmpi4.0.164.6279.2734.16
gcc8.1.0-openmpi4.0.252.9599.15643.57
gcc9.2.0-openmpi4.1.06.13382.771424.09
intel2019.5-openmpi4.1.15.44203.971058.04
gcc11.2.0_binutils-openmpi4.1.231.311052.632325.41
intel2017.4-mvapich22.3b5.17193.381009.14
gcc7.2.0-mvapich22.3b3.19124.841384.02
intel2017.4-mvapich22.3rc15.0214.34765.94

osu_iscatter

Figures

4 Nodes

[Figure: osu_iscatter Benchmark 8 Bytes]

[Figure: osu_iscatter Benchmark 8 KBytes]

[Figure: osu_iscatter Benchmark 1 MBytes]

16 Nodes

[Figure: osu_iscatter Benchmark 8 Bytes]

[Figure: osu_iscatter Benchmark 8 KBytes]

[Figure: osu_iscatter Benchmark 512 KBytes]

32 Nodes

[Figure: osu_iscatter Benchmark 8 Bytes]

[Figure: osu_iscatter Benchmark 8 KBytes]

[Figure: osu_iscatter Benchmark 256 KBytes]

Table

4 nodes

modules | 8 bytes | 8 KBytes | 1 MByte
intel2017.1-impi2017.112.0114321.94370804.31
intel2017.4-impi2017.47.64343.26117977.01
intel2017.6-impi2017.60.0341.94117661.23
intel2017.7-impi2017.78.9352.81117459.6
intel2018.0-impi2018.016.92334.95112578.88
intel2018.1-impi2018.17.22350.48112944.16
intel2018.2-impi2018.28.36345.88114111.6
intel2018.3-impi2018.37.53330.57112456.4
intel2018.4-impi2018.416.87332.49113223.17
gcc7.2.0-openmpi1.10.4-apps76.91237.2317292.41
gcc7.1.0-openmpi1.10.788.14244.3217760.95
gcc7.2.0-openmpi2.0.271.45259.0418875.41
gcc7.2.0-openmpi2.1.354.92214.3716780.59
gcc7.2.0-openmpi3.0.052.37215.5516770.2
gcc7.2.0-openmpi3.1.156.56210.8616959.4
gcc8.1.0-openmpi4.0.0329.71929.43171177.47
gcc8.1.0-openmpi4.0.1348.76942.15171231.89
gcc8.1.0-openmpi4.0.2343.36945.69171594.91
gcc9.2.0-openmpi4.1.0141.88985.188377.55
intel2019.5-openmpi4.1.157.02521.5619868.9
gcc11.2.0_binutils-openmpi4.1.2491.42857.79164943.82
intel2017.4-mvapich22.3b16.58406.89117227.14
gcc7.2.0-mvapich22.3b23.82414.21114304.3
intel2017.4-mvapich22.3rc111.41368.18116866.65

16 nodes

modules | 8 bytes | 8 KBytes | 512 KBytes
intel2017.1-impi2017.157.9710565.83643866.77
intel2017.4-impi2017.413.931168.75208580.87
intel2017.6-impi2017.60.01165.79208617.1
intel2017.7-impi2017.728.861166.52209292.82
intel2018.0-impi2018.028.151142.04200456.82
intel2018.1-impi2018.112.261139.52209179.7
intel2018.2-impi2018.20.01682.25201796.01
intel2018.3-impi2018.313.711147.93202783.99
intel2018.4-impi2018.430.191195.94203639.67
gcc7.2.0-openmpi1.10.4-apps630.541053.9100367.23
gcc7.1.0-openmpi1.10.799.541219.73101679.69
gcc7.2.0-openmpi2.0.2277.03771.22106709.27
gcc7.2.0-openmpi2.1.3222.99646.47101414.43
gcc7.2.0-openmpi3.0.0217.71662.92101270.0
gcc7.2.0-openmpi3.1.1226.27645.62101372.87
gcc8.1.0-openmpi4.0.01194.743404.47590262.13
gcc8.1.0-openmpi4.0.11189.353452.11590923.96
gcc8.1.0-openmpi4.0.298.363388.88591494.86
gcc9.2.0-openmpi4.1.01925.779561.76525744.05
intel2019.5-openmpi4.1.1219.96622.76105633.68
gcc11.2.0_binutils-openmpi4.1.22403.214553.51493939.06
intel2017.4-mvapich22.3b13.781237.66208838.54
gcc7.2.0-mvapich22.3b46.191387.97202833.9
intel2017.4-mvapich22.3rc142.421316.79206054.64

32 nodes

modules | 8 bytes | 8 KBytes | 256 KBytes
intel2017.1-impi2017.148.4725121.49667274.22
intel2017.4-impi2017.418.452243.93198820.35
intel2017.6-impi2017.638.752490.94203927.15
intel2017.7-impi2017.746.82485.05205257.45
intel2018.0-impi2018.051.762394.6198347.83
intel2018.1-impi2018.149.272500.48196242.88
intel2018.2-impi2018.218.652351.39195060.95
intel2018.3-impi2018.317.82316.17194695.39
intel2018.4-impi2018.419.542393.69196492.15
gcc7.2.0-openmpi1.10.4-apps2150.432906.993471.66
gcc7.1.0-openmpi1.10.799.763481.9120458.88
gcc7.2.0-openmpi2.0.298.551471.13121308.38
gcc7.2.0-openmpi2.1.3471.361280.43111604.41
gcc7.2.0-openmpi3.0.0462.61235.54101935.01
gcc7.2.0-openmpi3.1.1456.911227.3596386.19
gcc8.1.0-openmpi4.0.099.026185.59614499.69
gcc8.1.0-openmpi4.0.12301.16161.03637260.08
gcc8.1.0-openmpi4.0.22292.596025.56614908.86
gcc9.2.0-openmpi4.1.00.022726.7578179.69
intel2019.5-openmpi4.1.199.531135.31109426.27
gcc11.2.0_binutils-openmpi4.1.23004.148567.34499934.07
intel2017.4-mvapich22.3b24.072590.07201900.34
gcc7.2.0-mvapich22.3b21.032499.64193679.02
intel2017.4-mvapich22.3rc126.92674.28202127.7

· 15 min read
Cristian Morales Pérez
David Vicente Dorca
Joan Vinyals Ylla-Català

While running synthetic and real application benchmarks, we detected that benchmark performance varies widely on MareNostrum. Colleagues also told us that they obtained more stable times with other applications, such as FALL3D, on clusters like JUWELS. We therefore decided to investigate the origin of this “system noise” and the ways to limit it, since it affects the stability of application performance and hurts scalability, mainly in applications with synchronizing MPI calls in their kernel.

Using a synthetic benchmark that we developed, we observed periodic events during the simulation that affect the time of a small number of iterations. We concluded that these events come from the system and preempt the simulation. To avoid these preemptions, we ran the same benchmarks leaving 1 or 2 cores per node empty. These empty cores are then available to run the periodic events, so the other cores avoid the preemption and the applications running on them can have a more stable time per iteration.
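As a rough illustration of how such periodic slowdowns can be spotted, the sketch below times a fixed amount of work per iteration and flags the iterations that take noticeably longer than the median. It is not the benchmark used in this study: the function names, the amount of work, and the 1.5x threshold are all assumptions chosen for the example.

```python
import statistics
import time


def time_iterations(n_iters=200, work=20_000):
    """Run the same CPU-bound work n_iters times; return seconds per iteration."""
    samples = []
    for _ in range(n_iters):
        start = time.perf_counter()
        acc = 0
        for i in range(work):  # identical work in every iteration
            acc += i * i
        samples.append(time.perf_counter() - start)
    return samples


def find_slow_iterations(samples, factor=1.5):
    """Return (index, seconds) for iterations slower than factor * median.

    The factor is an arbitrary illustrative threshold, not a value from the study.
    """
    median = statistics.median(samples)
    return [(i, t) for i, t in enumerate(samples) if t > factor * median]


if __name__ == "__main__":
    samples = time_iterations()
    slow = find_slow_iterations(samples)
    print(f"median: {statistics.median(samples) * 1e6:.1f} us")
    print(f"slow iterations: {len(slow)} of {len(samples)}")
    for index, seconds in slow:
        print(f"  iteration {index}: {seconds * 1e6:.1f} us")
```

Because the work per iteration is constant, iterations that stand out from the median point to interference from outside the program. Comparing the number of flagged iterations between a fully populated node and one with a core left empty is one way to see whether the spare core absorbs the interruptions.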

In addition, during the PRACE UEABS activity, we noticed that some applications perform better on SkyLake systems with HyperThreading enabled, such as the JUWELS cluster (beyond the performance improvement gained from the higher frequencies). This is because systems with HyperThreading handle preemption better and their context switches are faster.

Taking these points into account, we decided to test multiple applications under multiple configurations, with HyperThreading enabled and without the 2.1 GHz frequency limit, as other SkyLake clusters are configured.

· 11 min read
Cristian Morales Pérez
David Vicente Dorca

On MareNostrum 4, the frequency is limited to 2.1 GHz on the vendor's recommendation. In tests that removed this limit, we found that performance increases or decreases depending on the application. In some cases we also observed higher power consumption.

Therefore, we ran several tests measuring performance and power consumption on nodes with and without the frequency limit.