This document describes the performance profiling instrumentation added to FloatWM. The profiling system provides comprehensive monitoring of window manager performance including frame timing, event processing latency, memory usage, and CPU utilization.
Frame timing:
- Tracks each event loop iteration as a "frame"
- Calculates FPS (frames per second) in real time
- Records min/max/average frame times
- Provides p50, p95, and p99 percentiles of the frame time distribution

Event latency:
- Measures time from event receipt to dispatch completion
- Tracks latency per event type (KeyPress, MotionNotify, etc.)
- Provides statistics on total events processed
- Identifies slow event handlers

Memory usage:
- Monitors RSS (Resident Set Size) memory usage
- Tracks VMS (Virtual Memory Size)
- Records peak memory usage
- Calculates memory usage trends

CPU utilization:
- Measures current CPU percentage
- Tracks average and peak CPU usage
- Monitors context switches (voluntary/involuntary)
- Tracks total CPU time consumed

Flamegraph generation:
- Generates hierarchical call traces
- Creates JSON output for visualization
- Tracks scope entry/exit times
- Shows call stack depth
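
The frame timing statistics in the feature list (min/max/average, FPS, and the p50/p95/p99 percentiles) can all be derived from a buffer of recorded frame durations. The following is a minimal sketch under stated assumptions, not FloatWM's actual code: the names FrameStats and summarize are hypothetical, the percentiles use the nearest-rank method, and average FPS is computed from the average frame time. The source does not say which methods the profiler uses.

```rust
use std::time::Duration;

/// Summary of a set of frame times, in microseconds (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct FrameStats {
    pub min_us: f64,
    pub max_us: f64,
    pub avg_us: f64,
    pub p50_us: f64,
    pub p95_us: f64,
    pub p99_us: f64,
    pub avg_fps: f64,
}

/// Nearest-rank percentile over a non-empty, ascending-sorted slice.
fn percentile(sorted_us: &[f64], pct: f64) -> f64 {
    let n = sorted_us.len();
    // 1-based rank: ceil(pct / 100 * n), clamped into [1, n].
    let rank = ((pct / 100.0) * n as f64).ceil() as usize;
    sorted_us[rank.clamp(1, n) - 1]
}

/// Summarizes recorded frame durations; returns None when nothing was recorded.
pub fn summarize(frames: &[Duration]) -> Option<FrameStats> {
    if frames.is_empty() {
        return None;
    }
    // Whole microseconds convert to f64 exactly for realistic frame times.
    let mut us: Vec<f64> = frames.iter().map(|d| d.as_micros() as f64).collect();
    us.sort_by(|a, b| a.total_cmp(b));

    let sum: f64 = us.iter().sum();
    let avg_us = sum / us.len() as f64;
    Some(FrameStats {
        min_us: us[0],
        max_us: us[us.len() - 1],
        avg_us,
        p50_us: percentile(&us, 50.0),
        p95_us: percentile(&us, 95.0),
        p99_us: percentile(&us, 99.0),
        // Assumption: average FPS is the reciprocal of the average frame time.
        avg_fps: 1_000_000.0 / avg_us,
    })
}
```

Sorting a copy on every summary is fine for a bounded sample buffer (compare max_samples in the configuration); a profiler that summarizes very often might prefer a streaming histogram instead.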
The profiling system consists of:
- profiling.rs - Core profiling module
  - PerformanceProfiler: Main profiler struct
  - ProfilingConfig: Configuration options
  - PerformanceMetrics: Collected metrics data structures
  - FlameGraphNode: Flame graph data structure
- event.rs - Event loop integration
  - Frame timing measurement in event loop
  - Event latency tracking in dispatch
  - Periodic memory/CPU sampling
  - Automatic export on shutdown
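
To illustrate how scope entry and exit can produce the nested flame graph data listed above, here is a rough sketch of a stack-based recorder. It is not the FloatWM implementation: ScopeNode and ScopeRecorder are hypothetical names, the field names simply mirror the flamegraph JSON shown later in this document, and timestamps are passed in explicitly (an assumption made so the example is deterministic; a real profiler would read a clock).

```rust
/// One node of the call tree (hypothetical; fields mirror the JSON output).
#[derive(Debug, Clone, PartialEq)]
pub struct ScopeNode {
    pub name: String,
    pub start_us: u64,
    pub duration_us: u64,
    pub depth: usize,
    pub children: Vec<ScopeNode>,
}

/// Builds a tree of scopes from paired enter/exit calls.
#[derive(Default)]
pub struct ScopeRecorder {
    /// Currently open scopes, innermost last.
    stack: Vec<ScopeNode>,
    /// Completed top-level scopes.
    roots: Vec<ScopeNode>,
}

impl ScopeRecorder {
    /// Opens a scope; its depth is the number of scopes already open.
    pub fn enter_scope(&mut self, name: &str, now_us: u64) {
        let depth = self.stack.len();
        self.stack.push(ScopeNode {
            name: name.to_string(),
            start_us: now_us,
            duration_us: 0,
            depth,
            children: Vec::new(),
        });
    }

    /// Closes the innermost open scope and attaches it to its parent,
    /// or to the root list when no parent is open. Unmatched exits are ignored.
    pub fn exit_scope(&mut self, now_us: u64) {
        if let Some(mut node) = self.stack.pop() {
            node.duration_us = now_us.saturating_sub(node.start_us);
            match self.stack.last_mut() {
                Some(parent) => parent.children.push(node),
                None => self.roots.push(node),
            }
        }
    }

    /// Completed top-level scopes, in the order they were closed.
    pub fn roots(&self) -> &[ScopeNode] {
        &self.roots
    }
}
```

Entering event_loop_iteration at 0 µs, entering event_KeyPress at 1000 µs, and exiting at 1150 µs and 16667 µs yields one root with a single child of duration 150 µs at depth 1.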
Configuration is defined by ProfilingConfig:

pub struct ProfilingConfig {
    /// Enable profiling
    pub enabled: bool,
    /// Sample interval for metrics collection (in milliseconds)
    pub sample_interval_ms: u64,
    /// Maximum number of samples to keep in memory
    pub max_samples: usize,
    /// Enable flamegraph generation
    pub enable_flamegraph: bool,
    /// Output directory for profiling data
    pub output_dir: String,
    /// Enable frame timing tracking
    pub track_frame_timing: bool,
    /// Enable event latency tracking
    pub track_event_latency: bool,
    /// Enable memory usage tracking
    pub track_memory_usage: bool,
    /// Enable CPU utilization tracking
    pub track_cpu_usage: bool,
}

Basic usage:

use floatwm::profiling::{PerformanceProfiler, ProfilingConfig};
// Create profiler with custom configuration
let config = ProfilingConfig {
    enabled: true,
    enable_flamegraph: true,
    output_dir: "/tmp/floatwm_profiling".to_string(),
    ..Default::default()
};
let profiler = PerformanceProfiler::new(config);
// Set profiler on event loop
event_loop.set_profiler(profiler);
// The profiler automatically collects metrics during event loop execution
// On shutdown, metrics are exported to JSON files

For custom profiling scopes:
// Enter a profiling scope
profiler.enter_scope("my_function");
// ... do work ...
// Exit the scope
profiler.exit_scope();

Record custom metrics:
use std::time::Duration;

// Record frame time
profiler.record_frame_time(Duration::from_millis(16));
// Record event latency
profiler.record_event_latency("CustomEvent", Duration::from_micros(100));
// Sample memory usage
profiler.sample_memory_usage()?;
// Sample CPU usage
profiler.sample_cpu_usage()?;

Export data:
// Export metrics to JSON
profiler.export_metrics("/tmp/metrics.json")?;
// Generate flamegraph
profiler.generate_flamegraph("/tmp/flamegraph.json")?;

Example of the exported metrics JSON:

{
  "frame_timing": {
    "frame_count": 1000,
    "avg_frame_time_us": 16666.67,
    "min_frame_time_us": 15000.0,
    "max_frame_time_us": 25000.0,
    "current_fps": 60.0,
    "avg_fps": 59.8,
    "percentiles": {
      "p50_us": 16500.0,
      "p95_us": 18000.0,
      "p99_us": 22000.0
    }
  },
  "event_latency": {
    "total_events": 5000,
    "avg_latency_us": 150.5,
    "min_latency_us": 50.0,
    "max_latency_us": 5000.0,
    "by_event_type": {
      "KeyPress": {
        "count": 100,
        "avg_us": 120.0,
        "min_us": 80.0,
        "max_us": 200.0
      }
    }
  },
  "memory_usage": {
    "rss_bytes": 52428800,
    "peak_rss_bytes": 67108864,
    "vms_bytes": 104857600,
    "page_faults": 0,
    "memory_trend": 1024.0
  },
  "cpu_usage": {
    "current_percent": 2.5,
    "avg_percent": 2.1,
    "peak_percent": 8.3,
    "total_cpu_time": 12.5,
    "voluntary_context_switches": 1000,
    "involuntary_context_switches": 10
  }
}

Example of the generated flamegraph JSON:

[
  {
    "name": "event_loop_iteration",
    "start_us": 0,
    "duration_us": 16667,
    "depth": 0,
    "children": [
      {
        "name": "event_KeyPress",
        "start_us": 1000,
        "duration_us": 150,
        "depth": 1,
        "children": []
      }
    ]
  }
]

The profiling system is designed to have minimal overhead:
- When disabled: No performance impact (profiling is turned on or off through the enabled option)
- When enabled:
  - Frame timing: ~1-2 microseconds per frame
  - Event latency: ~0.5 microseconds per event
  - Memory sampling: ~1 millisecond per sample (via /proc filesystem)
  - CPU sampling: ~1 millisecond per sample (via /proc filesystem)

Metrics are collected at the following rates:
- Frame timing: Every frame
- Event latency: Every event
- Memory/CPU: Every 100 frames (configurable)
- Summary printing: Every 1000 frames (configurable)
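
Memory sampling through the /proc filesystem, as mentioned above, usually amounts to reading a small text file and picking out a few fields. The sketch below is an assumption about how that could look on Linux, not the profiler's actual code: it reads /proc/self/status and extracts VmRSS, VmSize, and VmHWM (peak RSS), which the kernel reports in kB. The source does not say which /proc file FloatWM reads, and the names MemorySample, parse_proc_status, and sample_memory are hypothetical.

```rust
use std::fs;
use std::io;

/// Memory figures in bytes (hypothetical type for this sketch).
#[derive(Debug, Default, PartialEq)]
pub struct MemorySample {
    pub rss_bytes: u64,
    pub vms_bytes: u64,
    pub peak_rss_bytes: u64,
}

/// Parses the VmRSS, VmSize, and VmHWM lines from the text of a
/// /proc/<pid>/status file. Fields that are missing stay at zero.
pub fn parse_proc_status(status: &str) -> MemorySample {
    let mut sample = MemorySample::default();
    for line in status.lines() {
        // Lines look like "VmRSS:\t   51200 kB".
        let mut parts = line.split_whitespace();
        let key = match parts.next() {
            Some(k) => k,
            None => continue,
        };
        // Non-numeric values (for example the process name) are skipped.
        let kib: u64 = match parts.next().and_then(|v| v.parse().ok()) {
            Some(v) => v,
            None => continue,
        };
        match key {
            "VmRSS:" => sample.rss_bytes = kib * 1024,
            "VmSize:" => sample.vms_bytes = kib * 1024,
            "VmHWM:" => sample.peak_rss_bytes = kib * 1024,
            _ => {}
        }
    }
    sample
}

/// Reads and parses the status file of the current process (Linux only).
pub fn sample_memory() -> io::Result<MemorySample> {
    let status = fs::read_to_string("/proc/self/status")?;
    Ok(parse_proc_status(&status))
}
```

Keeping the parser separate from the file read makes it testable with a fixed string, and reading the file only every N frames keeps the roughly millisecond-scale cost off the per-frame path.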

Files added or modified:
- src/profiling.rs (NEW)
  - Core profiling implementation
  - ~1000 lines of code
- src/event.rs (MODIFIED)
  - Integrated profiler into event loop
  - Added frame timing and event latency tracking
  - Added profiler accessor methods
- src/lib.rs (MODIFIED)
  - Added profiling module exports

The profiling system includes unit tests for:
- Profiler creation and configuration
- Frame timing recording
- Event latency tracking
- Flamegraph scope management
- Metrics reset functionality
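
The crate's own tests are not reproduced here. As a rough sketch of the behavior that tests for event latency tracking and metrics reset would exercise, the example below uses a hypothetical, stand-alone LatencyTracker rather than the real PerformanceProfiler. Its per-type fields (count, min, max, average) follow the by_event_type section of the metrics JSON; everything else is an assumption.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Aggregated latency for one event type (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct LatencyStats {
    pub count: u64,
    pub total_us: u64,
    pub min_us: u64,
    pub max_us: u64,
}

impl LatencyStats {
    /// Mean latency in microseconds, or 0.0 when nothing was recorded.
    pub fn avg_us(&self) -> f64 {
        if self.count == 0 {
            0.0
        } else {
            self.total_us as f64 / self.count as f64
        }
    }
}

/// Tracks latency per event type, keyed by the event name.
#[derive(Default)]
pub struct LatencyTracker {
    by_event_type: HashMap<String, LatencyStats>,
}

impl LatencyTracker {
    /// Folds one measurement into the running statistics for its type.
    pub fn record(&mut self, event_type: &str, latency: Duration) {
        let us = latency.as_micros() as u64;
        let entry = self
            .by_event_type
            .entry(event_type.to_string())
            .or_insert(LatencyStats { count: 0, total_us: 0, min_us: u64::MAX, max_us: 0 });
        entry.count += 1;
        entry.total_us += us;
        entry.min_us = entry.min_us.min(us);
        entry.max_us = entry.max_us.max(us);
    }

    /// Statistics for one event type, if any were recorded.
    pub fn stats(&self, event_type: &str) -> Option<LatencyStats> {
        self.by_event_type.get(event_type).copied()
    }

    /// Total number of events recorded across all types.
    pub fn total_events(&self) -> u64 {
        self.by_event_type.values().map(|s| s.count).sum()
    }

    /// Discards all collected statistics.
    pub fn reset(&mut self) {
        self.by_event_type.clear();
    }
}
```

A test along these lines would record a few latencies, assert on the per-type count, minimum, maximum, and average, then call reset and check that the totals return to zero.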
Possible future improvements:
- Real-time Visualization
  - Built-in web server for live metrics
  - WebSocket streaming of metrics
  - Interactive flamegraph viewer
- Advanced Analysis
  - Bottleneck detection
  - Anomaly detection
  - Performance regression alerts
- Integration
  - Integration with external profilers (perf, flamegraph.rs)
  - Support for Chrome tracing format
  - Export to Prometheus metrics
- Configurable Triggers
  - Automatic profiling on high load
  - Trigger profiling from IPC commands
  - Time-based profiling windows

Future support for environment variable configuration:
- FLOATWM_PROFILING_ENABLED: Enable/disable profiling
- FLOATWM_PROFILING_OUTPUT_DIR: Output directory for profiling data
- FLOATWM_PROFILING_FLAMEGRAPH: Enable flamegraph generation
- FLOATWM_PROFILING_SAMPLE_INTERVAL: Sampling interval in milliseconds

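
Because this support is planned rather than implemented, the following is only one possible shape for it. The sketch reads the four variables into optional overrides that could later be applied on top of a default configuration. The EnvOverrides type, the accepted boolean spellings, and the lookup-closure design are all assumptions made for illustration; only the variable names come from the list above.

```rust
/// Values found in the environment; None means "not set or not parseable".
#[derive(Debug, Clone, PartialEq)]
pub struct EnvOverrides {
    pub enabled: Option<bool>,
    pub output_dir: Option<String>,
    pub enable_flamegraph: Option<bool>,
    pub sample_interval_ms: Option<u64>,
}

/// Assumed boolean spellings; anything else is treated as unset.
fn parse_bool(raw: &str) -> Option<bool> {
    match raw.trim().to_ascii_lowercase().as_str() {
        "1" | "true" | "yes" | "on" => Some(true),
        "0" | "false" | "no" | "off" => Some(false),
        _ => None,
    }
}

/// Reads overrides through a lookup function, so callers can pass the real
/// environment or a fixed table in tests.
pub fn read_overrides<F>(lookup: F) -> EnvOverrides
where
    F: Fn(&str) -> Option<String>,
{
    EnvOverrides {
        enabled: lookup("FLOATWM_PROFILING_ENABLED").and_then(|v| parse_bool(&v)),
        output_dir: lookup("FLOATWM_PROFILING_OUTPUT_DIR").filter(|v| !v.is_empty()),
        enable_flamegraph: lookup("FLOATWM_PROFILING_FLAMEGRAPH").and_then(|v| parse_bool(&v)),
        sample_interval_ms: lookup("FLOATWM_PROFILING_SAMPLE_INTERVAL")
            .and_then(|v| v.trim().parse().ok()),
    }
}

/// Convenience wrapper over the process environment.
pub fn read_overrides_from_env() -> EnvOverrides {
    read_overrides(|key| std::env::var(key).ok())
}
```

Returning Option values instead of a finished configuration leaves the precedence between environment variables and explicit ProfilingConfig fields to the caller, which is a decision the source does not specify.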
This profiling system is part of FloatWM and follows the same license (MIT OR Apache-2.0).