A comprehensive C++17 library featuring five different implementations of thread-safe stacks. This project is designed to demonstrate various synchronization strategies, from simple mutex guarding to advanced fine-grained locking and memory-efficient node management.
- Diverse Implementations: 5 distinct stack architectures tailored for different contention levels.
- Zero-Guesswork Testing: 100% coverage with Google Test (GTest) suite.
- Pro Benchmarking: Built-in performance analyzer measuring throughput in MOps/s.
- Exception Safety: Advanced use of
std::shared_ptrto ensure data integrity during concurrent access.
| File | Description |
|---|---|
ModernThreadSafeStack.h |
Basic wrapper around std::stack with std::mutex. |
WaitingThreadSafeStack.h |
Adds condition_variable to eliminate busy-waiting. |
SharedPtrWaitingStack.h |
Optimized for heavy objects; reduces copying via shared_ptr. |
ModernLinkedStack.h |
Custom linked-list; performs allocations outside the lock. |
WaitLinkedStack.h |
The most advanced version; linked-list with blocking support. |
tests/ |
Google Test suite for functional verification. |
main.cpp |
Professional benchmark harness. |
Configuration: 8 threads (4 Producers / 4 Consumers) | CPU: 8C / 16T @ 1.9 GHz
Focus: synchronization overhead and contention.
| Implementation | Threads | Time (ns/op) | Throughput (MOps/s) |
|---|---|---|---|
| MutexStack (try_pop) | 8 | 327 | 24.46 |
| LinkedStack (try_pop) | 8 | 1500 | 5.33 |
| BlockingStack (wait) | 8 | 321 | 24.93 |
| SharedStack (wait) | 8 | 1674 | 4.77 |
| BlockingLinkedStack (wait) | 8 | 2232 | 3.58 |
Focus: memory pressure and lock duration.
| Implementation | Threads | Time (ns/op) | Throughput (MOps/s) |
|---|---|---|---|
| MutexStack (try_pop) | 8 | 1030 | 7.76 |
| LinkedStack (try_pop) | 8 | 1645 | 4.86 |
| BlockingStack (wait) | 8 | 938 | 8.53 |
| SharedStack (wait) | 8 | 2550 | 3.13 |
| BlockingLinkedStack (wait) | 8 | 3312 | 2.41 |
- The
std::dequeAdvantage: The Basic and Waiting Mutex stacks (built onstd::stack) significantly outperform the custom Linked-List versions. This is primarily due tostd::deque's block-allocation strategy. By allocating memory in chunks rather than per-node, it minimizes calls to the global allocator and maximizes CPU cache locality. - The Hidden Cost of Nodes: Contrary to theoretical expectations, the "optimization" of allocating nodes outside the critical section in the
Linked ListandSharedPtrversions actually hindered performance. The overhead of frequentnew/deleteoperations and the creation ofstd::shared_ptrcontrol blocks (especially visible in the 285ms result) outweighed the benefits of shorter lock-holding times. - Efficiency of Condition Variables: In the Heavy Payload category, the Waiting Mutex Stack [WAIT] (using
waitAndPop) achieved the best result (123.54ms). This demonstrates that for larger data, using astd::condition_variableis more efficient than aggressivetryPoppolling, as it reduces useless lock contention when the stack is empty. - Scaling Observations: While Linked-List structures are often praised for concurrency, your results show that for a 4P/4C load on an 8-core machine, the contention management of a standard mutex wrapper around a contiguous-ish container (
deque) is far superior to the overhead of a node-based architecture.
Recommendation: For most general-purpose applications, a Waiting Mutex Stack based on std::deque is the optimal choice. Only move to Linked-List or Lock-Free structures if you are dealing with extremely high contention or real-time requirements where individual latency spikes from std::deque reallocations are unacceptable.
- C++17 Compiler (GCC 9+, Clang 10+, MSVC 2019+)
- CMake 3.10+
- Google Test (will be fetched automatically if using CMake)
# Clone the repository
git clone https://github.com/NotACat1/ThreadSafeStack.git
cd ThreadSafeStack
# Build Benchmark in Release mode (Crucial for performance!)
g++ -O3 -std=c++17 -pthread main.cpp -o benchmark
./ThreadSafeStack- Reduce Critical Sections: We aim to keep the code inside
std::lock_guardas minimal as possible. - Smart Memory Management: Using
std::make_sharedbefore locking allows multiple threads to prepare data in parallel. - Modern C++: Utilizing RAII,
std::optional-like interfaces (viatryPop), and move semantics for maximum efficiency.