Skip to content

Symex resource monitoring: per-function step counts, memory, callgrind output, and live progress display#8903

Open
tautschnig wants to merge 4 commits intodiffblue:developfrom
tautschnig:feature/resource-monitoring
Open

Symex resource monitoring: per-function step counts, memory, callgrind output, and live progress display#8903
tautschnig wants to merge 4 commits intodiffblue:developfrom
tautschnig:feature/resource-monitoring

Conversation

@tautschnig
Copy link
Copy Markdown
Collaborator

Add resource monitoring capabilities that help users identify which parts of their program contribute most to verification cost, without resorting to delta debugging.

New features

Per-function symex step counts (at --verbosity 9):

Runtime Symex: 2.5s
Symex steps: 50000 total
  process_buffer: 35000 steps (70%)
  verify_hash: 8000 steps (16%)
  main: 4000 steps (8%)
Peak memory: 512 MB

--symex-callgrind <file> — writes per-source-location step counts in callgrind format, viewable with KCachegrind/QCachegrind to visually identify expensive code paths.

--show-symex-progress — live terminal display showing the full call stack and loop nesting, redrawn every 2 seconds:

Symex: 50000 steps, SSA 42000, 512 MB
__CPROVER__start
  main
    process_buffer [process_buffer.0 iter 15]
      verify_hash [verify_hash.0 iter 3]

Falls back to one-line periodic progress when output is not a terminal (CI, pipes). Always reports max call depth and loop nesting at the end.

Formula growth warnings — alerts when a single instruction adds >100 SSA steps, identifying the function and source location responsible.

Commits

  1. Report per-function symex step counts and peak memory usage — end-of-stage summary with top-10 functions by step count and peak memory after symex and solver stages.

  2. Add continuous progress reporting and callgrind output — periodic progress during symex and SSA conversion (every 2s), plus --symex-callgrind option for KCachegrind visualization.

  3. Warn when a single symex step causes large formula growth — per-instruction SSA equation size tracking with threshold-based warnings.

  4. Add --show-symex-progress for live call stack and loop display — interactive terminal display with call stack, loop iteration counts, and max-depth summary. Includes man page entries for all new options.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • n/a My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • n/a White-space or formatting changes outside the feature-related changed lines are in commits of their own.

tautschnig and others added 4 commits March 27, 2026 22:16
Add resource monitoring that helps users identify which functions
contribute most to verification cost:

- Count symex steps per function and report the top contributors
  with percentages after symbolic execution completes
- Report peak memory usage after symex and after solving

The output appears at the statistics verbosity level (--verbosity 9)
alongside the existing Runtime Symex/Solver timing. Example output:

  Runtime Symex: 0.003s
  Symex steps: 102 total
    factorial: 80 steps (78%)
    main: 12 steps (12%)
    __CPROVER_initialize: 6 steps (6%)
    __CPROVER__start: 4 steps (4%)
  Peak memory: 4 MB

This directly addresses the user need to map excessive CPU or memory
usage to particular program fragments without delta debugging.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
During long-running verifications, users need to see what CBMC is
doing without waiting for a stage to complete. Add periodic progress
messages (every 2 seconds) that report:

- During symex: step count, current function, SSA equation size,
  and peak memory
- During SSA conversion: progress through the equation steps

Example output during a long symex:
  Symex: 50000 steps, in process_buffer, SSA size 42000, peak memory 512 MB
  Symex: 100000 steps, in verify_hash, SSA size 89000, peak memory 1024 MB

This addresses Milestone 1 of the resource monitoring feature: the
user can continuously observe resource consumption and map it to the
verification stage and program fragment CBMC is currently processing.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Track the SSA equation size before and after each symex instruction.
When a single instruction adds more than 100 SSA steps, emit a
statistics-level warning identifying the function and source location.

This helps users identify specific instructions that cause formula
blowup, such as large array initializations, complex function calls,
or deeply nested pointer dereferences. Example output:

  Large SSA growth: +500 steps at process_buffer (buffer.c:42), total 12500

This addresses Task 4 of the resource monitoring feature: monitoring
formula size increases and reporting when a particular step results
in an increase above a threshold.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Add an interactive terminal display that shows the current call stack
and loop nesting during symbolic execution. Enabled with the
--show-symex-progress flag, it redraws the terminal every 2 seconds
to show:

- A header with total steps, SSA equation size, and memory usage
- The full call stack with indentation showing nesting depth
- The current function highlighted in orange
- Active loop unwindings with iteration counts shown in cyan

Example display (redrawn in place on a terminal):

  Symex: 50000 steps, SSA 42000, 512 MB
  __CPROVER__start
    main
      process_buffer [process_buffer.0 iter 15]
        verify_hash [verify_hash.0 iter 3]

When output is not a terminal (CI, pipes), the interactive display
is disabled and the existing one-line periodic progress is used.
At the end of symex, max call depth and loop nesting are reported
(testable in CI via --verbosity 9).

Also clears the interactive display before printing final statistics,
and adds man page entries for --show-symex-progress and
--symex-callgrind.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
@tautschnig tautschnig self-assigned this Mar 27, 2026
Copilot AI review requested due to automatic review settings March 27, 2026 23:31
@tautschnig tautschnig changed the title Feature/resource monitoringSymex resource monitoring: per-function step counts, memory, callgrind output, and live progress display Symex resource monitoring: per-function step counts, memory, callgrind output, and live progress display Mar 27, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds resource-monitoring and progress-reporting capabilities around symbolic execution (symex) to help users identify cost hotspots (per-function step counts, peak memory, optional callgrind export, and live progress display).

Changes:

  • Add peak_memory_bytes() utility and report peak memory after symex and solver stages.
  • Collect/report per-function symex step counts; optionally export per-source-location counts in callgrind format (--symex-callgrind).
  • Add --show-symex-progress and periodic progress reporting during SSA conversion.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
src/util/memory_info.h Declares peak_memory_bytes() for peak-memory reporting.
src/util/memory_info.cpp Implements peak_memory_bytes() with platform-specific branches.
src/goto-symex/symex_target_equation.cpp Adds periodic SSA-conversion progress logging and formatting cleanups.
src/goto-symex/symex_main.cpp Collects step counts / source-location counts, implements live symex progress display and formula-growth warnings.
src/goto-symex/goto_symex.h Adds storage and API for step counts, callgrind export, and interactive-display state.
src/goto-checker/multi_path_symex_only_checker.cpp Prints end-of-symex stats (steps/top functions/max depth) and writes callgrind output if requested.
src/goto-checker/multi_path_symex_checker.cpp Writes callgrind output if requested in the main checking flow.
src/goto-checker/bmc_util.cpp Reports peak memory after the decision-procedure stage.
src/cbmc/cbmc_parse_options.h Registers --symex-callgrind and --show-symex-progress CLI options.
src/cbmc/cbmc_parse_options.cpp Parses new CLI options and extends --help text.
regression/cbmc/resource_monitoring/test.desc Adds regression coverage for step-count + peak-memory statistics output.
regression/cbmc/resource_monitoring/progress.desc Adds regression coverage for --show-symex-progress summary stats.
regression/cbmc/resource_monitoring/callgrind.desc Adds regression coverage for --symex-callgrind status output.
regression/cbmc/resource_monitoring/main.c Test program used by the new resource-monitoring regressions.
doc/man/cbmc.1 Documents new command-line options.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +85 to +110
std::size_t peak_memory_bytes()
{
#ifdef __GLIBC__
struct mallinfo m = mallinfo();
return m.uordblks + m.hblkhd;
#elif defined(_WIN32)
PROCESS_MEMORY_COUNTERS pmc;
if(GetProcessMemoryInfo(GetCurrentProcess(), &pmc, sizeof(pmc)))
return pmc.PeakWorkingSetSize;
return 0;
#elif defined(__APPLE__)
struct task_basic_info t_info;
mach_msg_type_number_t t_info_count = TASK_BASIC_INFO_COUNT;
if(
task_info(
current_task(), TASK_BASIC_INFO, (task_info_t)&t_info, &t_info_count) ==
KERN_SUCCESS)
{
return t_info.virtual_size;
}
return 0;
#else
struct rusage r_usage;
if(getrusage(RUSAGE_SELF, &r_usage) == 0)
return static_cast<std::size_t>(r_usage.ru_maxrss) * 1024;
return 0;
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

peak_memory_bytes() is documented as returning peak resident set size (RSS) in bytes, but the platform implementations don’t match that: on __GLIBC__ it uses mallinfo() (heap allocator stats, not RSS/peak, and may overflow as mallinfo fields are int), and on macOS it returns task_basic_info::virtual_size (virtual size, not RSS/peak). Consider either (1) implementing true peak RSS on each platform (e.g. getrusage(RUSAGE_SELF).ru_maxrss with correct unit handling, or /proc/self/status VmHWM on Linux), or (2) renaming/re-documenting the function to reflect what it actually measures and using mallinfo2() where applicable.

Copilot uses AI. Check for mistakes.
Comment on lines +600 to +611
// Track per-function step counts
if(!state.source.function_id.empty())
{
++function_step_counts[state.source.function_id];

const auto &loc = state.source.pc->source_location();
if(!loc.get_file().empty())
{
source_keyt key{loc.get_file(), state.source.function_id, loc.get_line()};
++source_location_step_counts[key];
}
}
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per-step accounting updates (function_step_counts and source_location_step_counts) run unconditionally on every symex step. For large runs this adds significant CPU and memory overhead (std::map insert/find per step and potentially per source location), even when the user didn’t request verbosity-9 stats, callgrind output, or progress display. Consider gating collection behind an explicit flag set from options (e.g. enabled when --symex-callgrind, --show-symex-progress, or statistics verbosity is active), and avoid populating source_location_step_counts unless callgrind output is requested.

Copilot uses AI. Check for mistakes.
#include "goto_symex.h"
#include "path_storage.h"

#include <fstream>
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#include <fstream> appears unused in this file (no std::ifstream/ofstream/fstream usage). Please drop it to avoid unnecessary dependencies / IWYU noise.

Suggested change
#include <fstream>

Copilot uses AI. Check for mistakes.
symex.write_callgrind(out);
log.status() << "Symex callgrind data written to " << callgrind_file
<< messaget::eom;
}
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the --symex-callgrind file can’t be opened (e.g. bad path/permissions), the code silently does nothing and provides no feedback. Please emit a warning/error when std::ofstream fails so users know the profile wasn’t written.

Suggested change
}
}
else
{
log.warning() << "Failed to open symex callgrind output file '"
<< callgrind_file << "'" << messaget::eom;
}

Copilot uses AI. Check for mistakes.
symex.write_callgrind(out);
log.status() << "Symex callgrind data written to " << callgrind_file
<< messaget::eom;
}
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the --symex-callgrind file can’t be opened (e.g. bad path/permissions), the code silently does nothing and provides no feedback. Please emit a warning/error when std::ofstream fails so users know the profile wasn’t written.

Suggested change
}
}
else
{
log.error() << "Failed to open symex callgrind output file '"
<< callgrind_file << "'" << messaget::eom;
}

Copilot uses AI. Check for mistakes.
main.c
--unwind 7 --verbosity 9
^Symex steps: \d+ total$
^ factorial: \d+ steps \(\d+%\)$
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The expected per-function line assumes the function identifier prints as factorial, but CBMC function identifiers are typically qualified (e.g. c::factorial). This regex is likely too strict and may make the test fail; consider loosening it to accept the qualified identifier.

Suggested change
^ factorial: \d+ steps \(\d+%\)$
^ (?:\w+::)?factorial: \d+ steps \(\d+%\)$

Copilot uses AI. Check for mistakes.
Comment on lines +862 to +872
public:
/// When true, show live call stack and loop nesting on the terminal.
/// Enabled by --show-symex-progress.
bool interactive_display_enabled = false;

/// Maximum call stack depth and loop nesting observed during symex.
std::size_t max_call_depth_seen = 0;
std::size_t max_active_loops_seen = 0;

/// Number of lines drawn by the last interactive display, for redrawing.
std::size_t interactive_display_lines = 0;
Copy link

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change introduces new mutable public fields on goto_symext (e.g. interactive_display_enabled, max_call_depth_seen, etc.). Exposing these as public state makes it harder to maintain invariants and evolves the API surface. Consider keeping these members private and providing narrow setter/getter methods (or an explicit resource_monitoring_config struct) so callers can enable progress display without directly mutating internal counters.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants