Mess Benchmark 2.0

A multiplatform benchmark designed to provide a holistic, detailed, and close-to-hardware view of memory system performance through bandwidth-latency curves.

This is an update to the original Mess Benchmark (now deprecated), focused on improved usability and portability.

Documentation

Project documentation is available in the GitHub wiki.

Motivation

Traditional memory benchmarks report isolated metrics such as peak bandwidth or idle latency, which often fail to capture how memory systems behave under realistic workloads. Mess (Memory Stress) addresses this limitation by characterizing memory performance through bandwidth-latency curves that cover the full range of memory traffic intensity, from unloaded to fully saturated.

This approach reveals critical insights:

Memory writes degrade performance significantly compared with reads.
Systems typically saturate at 70-90% of their theoretical peak bandwidth.
Latency rises from 85-130 ns when idle to 200-600 ns or more under saturation.

Mess provides a holistic, close-to-hardware view of memory system behavior, enabling researchers and engineers to understand real-world performance characteristics that standard benchmarks miss.
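To make the 70-90% saturation figure concrete, the sketch below works through the arithmetic for a hypothetical server with eight DDR5-4800 memory channels. The platform and the helper name `theoretical_peak_gbs` are illustrative assumptions, not values from the Mess paper; the only inputs are the transfer rate and the 8 bytes moved per transfer on a 64-bit channel.

```python
def theoretical_peak_gbs(mega_transfers_per_sec: float, channels: int,
                         bytes_per_transfer: int = 8) -> float:
    """Peak DRAM bandwidth in GB/s: MT/s x bytes per transfer x channels."""
    return mega_transfers_per_sec * 1e6 * bytes_per_transfer * channels / 1e9


# Hypothetical platform (assumption): 8 channels of DDR5-4800.
peak = theoretical_peak_gbs(4800, channels=8)

# Saturation range quoted above: 70-90% of the theoretical peak.
saturation_low = 0.70 * peak
saturation_high = 0.90 * peak

print(f"theoretical peak: {peak:.1f} GB/s")
print(f"expected saturation range: {saturation_low:.1f}-{saturation_high:.1f} GB/s")
```

For this assumed configuration the peak is 307.2 GB/s, so a measured curve that flattens somewhere between roughly 215 and 276 GB/s would be consistent with the range above.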

MICRO 2024 Best Paper Runner-Up: The Mess methodology was published at the 57th IEEE/ACM International Symposium on Microarchitecture.

For a detailed explanation of the benchmark methodology, see the Memory BSC Tools page.

Tools Included

Mess 2.0 provides an integrated workflow for memory system characterization, from benchmarking to application profiling:

Mess Benchmark: Characterizes your memory system by generating bandwidth-latency curves that reveal how it behaves under varying load.
Mess Profiler: Automates counter discovery and runs profiling tools (perf, likwid, etc.) with the correct configuration, ensuring application measurements align with benchmark data.
Plotter-Parser: Generates publication-quality plots as well as CSV and JSON files containing the parsed bandwidth-latency curves.
Traffic Generator: The low-level engine that generates precise memory traffic patterns at the assembly level. Can also be used independently for custom microbenchmarks.

Architecture Support

Support status follows the wiki (Architecture-Support):

Architecture	Status	SIMD	Notes
x86-64 CPUs	Supported	AVX2, AVX-512	Intel and AMD processors
ARM CPUs	Supported	NEON, SVE	Includes Neoverse, Graviton, Apple Silicon
Power CPUs	Supported	VSX	Power8 and newer
RISC-V CPUs	WIP	RVV 1.0	Assembly + latency + counter detection available; bandwidth measurement pending
GPUs	Pending	—	Under active development

Installation

See full instructions in the Installation wiki page.

Clone

git clone --recursive https://github.com/bsc-mem/Mess-2.0.git
cd Mess-2.0

Important: The --recursive flag is required to download submodules that Mess depends on.

If you forget --recursive, initialize submodules later:

git submodule update --init --recursive

Dependencies

Core requirements:

C++17 compiler: GCC 9+ (recommended), Clang 10+, Intel oneAPI (ICX), AOCC
numactl: NUMA memory binding (required)
taskset: Core pinning (preferred, part of util-linux)
perf: Recommended counter backend (linux-tools-common)
Python 3: Plotting utilities

On Linux, ensure perf access:

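# Show the current restriction level
cat /proc/sys/kernel/perf_event_paranoid
# Lower it to 0 so unprivileged users can open the counters (the setting is lost on reboot)
echo 0 | sudo tee /proc/sys/kernel/perf_event_paranoid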
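The same check can be scripted before a long run. This is a minimal sketch, not part of Mess: the function name `perf_access_ok` and the default threshold of 0 (taken from the recommendation above) are assumptions.

```python
from pathlib import Path

PARANOID_PATH = Path("/proc/sys/kernel/perf_event_paranoid")


def perf_access_ok(raw_value: str, max_level: int = 0) -> bool:
    """Return True if the kernel setting is at or below the level assumed to be sufficient."""
    return int(raw_value.strip()) <= max_level


# Only query the live system when the file exists (Linux only).
if PARANOID_PATH.exists():
    current = PARANOID_PATH.read_text()
    if perf_access_ok(current):
        print(f"perf_event_paranoid={current.strip()}: counter access should work")
    else:
        print(f"perf_event_paranoid={current.strip()}: lower it to 0 or run with elevated privileges")
```

Keeping the parsing in a pure function makes it easy to test without touching /proc.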

Huge pages are recommended for more accurate measurements:

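# Reserve 1024 huge pages of the default size (2 MiB each on most x86-64 systems, 2 GiB in total); lost on reboot
echo 1024 | sudo tee /proc/sys/vm/nr_hugepages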

Even without huge pages, Mess automatically compensates for page walk latency. See Huge-Memory-Pages for details.
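To confirm that the reservation took effect, the huge-page counters in /proc/meminfo can be inspected. The sketch below is an illustration rather than a Mess utility; `parse_hugepages` is a hypothetical helper, and the sample text is a made-up excerpt in the standard /proc/meminfo layout.

```python
from pathlib import Path


def parse_hugepages(meminfo_text: str) -> dict:
    """Extract the huge-page counters from /proc/meminfo-style text."""
    wanted = {"HugePages_Total", "HugePages_Free", "Hugepagesize"}
    info = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in wanted:
            # The first token after the colon is the number; Hugepagesize is reported in kB.
            info[key] = int(rest.split()[0])
    return info


# Made-up sample in the /proc/meminfo format, used only to demonstrate the parser.
SAMPLE = """MemTotal:       65536000 kB
HugePages_Total:    1024
HugePages_Free:     1024
Hugepagesize:       2048 kB
"""

sample_info = parse_hugepages(SAMPLE)
reserved_gib = sample_info["HugePages_Total"] * sample_info["Hugepagesize"] / (1024 * 1024)
print(f"sample: {sample_info['HugePages_Total']} pages reserved, {reserved_gib:.1f} GiB")

# On a Linux host, report the live values as well.
meminfo = Path("/proc/meminfo")
if meminfo.exists():
    print("live:", parse_hugepages(meminfo.read_text()))
```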

Build

make
make install

Binaries are generated in build/bin/:

mess — Core benchmark
mess-profiler — Memory bandwidth profiler
traffic_generator — Standalone traffic generation tool

Optional PATH setup:

export PATH="$PATH:$(pwd)/build/bin"

Verification

./build/bin/mess --version
./build/bin/mess --dry-run --verbose=2

Quick Start

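# Dry run with extra logging, the same check used under Verification
./build/bin/mess --dry-run --verbose=2

# Run the benchmark with default settings
./build/bin/mess

# Run the benchmark and save the measurement files
./build/bin/mess --profile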

Common options:

Option	Description	Example
`--ratio=N[,N...]`	Issued load ratio(s) in %	`--ratio=100,75,50`
`--pause=N[,N...]`	Pause bubble values	`--pause=0,10,100,1000`
`--profile`	Save measurement files	`--profile`
`--verbose=N`	Verbosity level `0-4`	`--verbose=3`
`--measurer=TYPE`	Counter backend (`auto/perf/likwid/pcm`)	`--measurer=perf`
`--bind=LIST`	NUMA memory-node binding	`--bind=0`
`--cores=LIST`	Explicit traffic-generator cores	`--cores=0-15`
`--total-cores=N`	Number of traffic-generator cores	`--total-cores=16`

For the complete option set, use ./build/bin/mess --help or see Understanding-CLI-Arguments.
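When runs are launched from scripts, it can help to assemble the command line programmatically. The helper below is a sketch, not part of Mess: `build_mess_command` is a hypothetical name, and it only formats the options listed in the table above (keyword `total_cores` becomes `--total-cores`). It does not validate values against what `mess` accepts.

```python
import shlex


def build_mess_command(binary: str = "./build/bin/mess", **options) -> list:
    """Turn keyword options into an argv list such as ['--ratio=100,75', '--profile']."""
    argv = [binary]
    for name, value in options.items():
        flag = "--" + name.replace("_", "-")
        if value is None or value is False:
            continue  # option not requested
        if value is True:
            argv.append(flag)  # boolean switch such as --profile
        elif isinstance(value, (list, tuple)):
            argv.append(flag + "=" + ",".join(str(v) for v in value))  # comma-separated list
        else:
            argv.append(f"{flag}={value}")
    return argv


command = build_mess_command(
    ratio=[100, 75, 50],
    pause=[0, 10, 100, 1000],
    profile=True,
    bind=0,
    total_cores=16,
)
print(shlex.join(command))
```

The resulting list can be passed directly to `subprocess.run` once the command looks right.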

Common Workflows

Single-Point Sanity Check

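# Loads only (100% issued load ratio), pause of 0, one repetition
./build/bin/mess --profile --ratio=100 --pause=0 --verbose=3 --repetitions=1
# No loads (0% issued load ratio), same settings
./build/bin/mess --profile --ratio=0 --pause=0 --verbose=3 --repetitions=1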

NUMA Comparison

./build/bin/mess --profile --bind=0 --folder=numa0
./build/bin/mess --profile --bind=1 --folder=numa1

Core-Scaling Sweep

for c in 2 4 8 16; do
  ./build/bin/mess --profile --total-cores=$c --folder=cores_$c
done
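The NUMA comparison and the core-scaling sweep can be combined into a single grid. The sketch below only generates and prints the commands; `sweep_commands` and the folder naming scheme are hypothetical, and using `--bind` together with `--total-cores` in one invocation is an assumption that should be checked against `./build/bin/mess --help`.

```python
import itertools
import shlex


def sweep_commands(nodes, core_counts, binary="./build/bin/mess"):
    """Build one mess command per (NUMA node, core count) pair."""
    commands = []
    for node, cores in itertools.product(nodes, core_counts):
        folder = f"numa{node}_cores_{cores}"  # hypothetical naming scheme
        commands.append([
            binary,
            "--profile",
            f"--bind={node}",
            f"--total-cores={cores}",
            f"--folder={folder}",
        ])
    return commands


commands = sweep_commands(nodes=[0, 1], core_counts=[2, 4, 8, 16])
for argv in commands:
    print(shlex.join(argv))
```

Printing first and executing later keeps a long sweep reviewable before any machine time is spent.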

Mess Profiler

mess-profiler reuses Mess counter discovery to profile applications with consistent output.

./build/bin/mess-profiler --dry-run

./build/bin/mess-profiler -s 100ms -o app_profile.csv ./my_app

Profiler docs: Mess-Profiler

Plotter-Parser

Visualization utilities in utils/:

plotter.py: generates memory-curve plots and processed CSV/JSON
app_plotter.py: overlays application profile points on curves
parse_runtimes.py: summarizes run times

cd utils
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Plotter docs: Plotter-Parser
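Once a curve has been exported, a few summary numbers are often enough for a first comparison between systems. The sketch below is an illustration under stated assumptions: it does not read the plotter's actual CSV or JSON schema, the points are synthetic values rather than measurements, and the "knee" definition (first point whose latency reaches twice the unloaded latency) is an arbitrary choice. Sorting by bandwidth is also a simplification for curves that bend back after saturation.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (bandwidth in GB/s, latency in ns)


def summarize_curve(points: List[Point], knee_factor: float = 2.0) -> dict:
    """Summarize one bandwidth-latency curve."""
    if not points:
        raise ValueError("curve has no points")
    ordered = sorted(points)  # ascending bandwidth
    unloaded_latency = ordered[0][1]  # latency at the lowest traffic level
    max_bandwidth = ordered[-1][0]    # highest bandwidth reached
    knee: Optional[float] = next(
        (bw for bw, lat in ordered if lat >= knee_factor * unloaded_latency), None
    )
    return {
        "unloaded_latency_ns": unloaded_latency,
        "max_bandwidth_gbs": max_bandwidth,
        "knee_bandwidth_gbs": knee,
    }


# Synthetic example curve (illustrative values only, not measured data).
curve = [(5.0, 90.0), (50.0, 95.0), (100.0, 110.0),
         (150.0, 150.0), (180.0, 220.0), (190.0, 400.0)]
summary = summarize_curve(curve)
print(summary)
```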

Learning Resources

Tutorials and Slides: mess.bsc.es/tutorials
Detailed Methodology: memory.bsc.es/tools/mess-benchmark

Troubleshooting

Permission errors on counters: set perf_event_paranoid to 0
Missing counters/backend mismatch: check ./build/bin/mess-profiler --dry-run
Unstable measurements: increase --repetitions and use --verbose=3
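One way to judge whether more repetitions are needed is to look at the spread across repeated measurements of the same point. This is a generic statistical sketch, not something Mess reports: the 2% threshold and the sample values are assumptions chosen for illustration.

```python
import statistics


def coefficient_of_variation(samples):
    """Sample standard deviation divided by the mean."""
    if len(samples) < 2:
        raise ValueError("need at least two repetitions")
    return statistics.stdev(samples) / statistics.mean(samples)


def is_stable(samples, threshold=0.02):
    """Treat a point as stable when the spread is within the (assumed) 2% threshold."""
    return coefficient_of_variation(samples) <= threshold


# Illustrative bandwidth readings in GB/s (made-up values, not measurements).
steady = [182.1, 181.7, 183.0, 182.4]
noisy = [150.0, 180.0, 200.0]

for label, samples in (("steady", steady), ("noisy", noisy)):
    cv = coefficient_of_variation(samples)
    print(f"{label}: CV={cv:.3f}, stable={is_stable(samples)}")
```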

More: FAQ and Iterative-Debugging

Found a bug? Open an issue on GitHub or email mess@bsc.es.

Contributors

Mess is developed by the Memory Systems Team at the Barcelona Supercomputing Center (BSC).

Victor Xirau Guardans: Main Mess 2.0 developer (victor.xirau@bsc.es)
Mariana Carmin: Mess 2.0 developer (mcarmin@bsc.es)
Pau Diaz: Mess 2.0 developer (pau.diazcuesta@bsc.es)
Pouya Esmaili Dokht: Mess Paper author (pouya.esmaili@bsc.es)

Team contact: mess@bsc.es

Citation

If you use Mess in research, please cite:

@inproceedings{esmaili2024mess,
  title     = {A Mess of Memory System Benchmarking, Simulation and Application Profiling},
  author    = {Esmaili-Dokht, Pouya and Sgherzi, Francesco and Girelli, Valeria Soldera
               and Boixaderas, Isaac and Carmin, Mariana and Monemi, Alireza
               and Armejach, Adria and Mercadal, Estanislao and Llort, German
               and Radojkovi{\'c}, Petar and Moreto, Miquel and Gim{\'e}nez, Judit
               and Martorell, Xavier and Ayguad{\'e}, Eduard and Labarta, Jesus
               and Confalonieri, Emanuele and Dubey, Rishabh and Adlard, Joshua},
  booktitle = {Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture (MICRO)},
  pages     = {136--152},
  year      = {2024},
  publisher = {IEEE}
}

References

Mess Benchmark — The original implementation of the Mess benchmark.
Mess Simulator — Analytical memory model using bandwidth-latency curves.
Mess-Paraver — Integration with Paraver for visualization.
Mess Paper — Esmaili-Dokht, P., Sgherzi, F., Girelli, V. S., Boixaderas, I., Carmin, M., Monemi, A., Armejach, A., Mercadal, E., Llort, G., Radojković, P., Moreto, M., Giménez, J., Martorell, X., Ayguadé, E., Labarta, J., Confalonieri, E., Dubey, R., & Adlard, J. (2024). A mess of memory system benchmarking, simulation and application profiling. In Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture (MICRO) (pp. 136-152). IEEE.

Mess Benchmark is released under the BSD 3-Clause License.
