Implement Rust Symbol Demangling Support #19

@unclesp1d3r

Description

Summary

Implement Rust symbol demangling to decode mangled Rust symbols into a human-readable form. This enhancement will improve the analyzer's ability to process and display symbols from Rust binaries.

Background

Rust uses name mangling to encode additional information (such as type parameters, namespaces, and trait implementations) into symbol names. Mangled Rust symbols are difficult to read and understand without proper demangling. For example:

  • Mangled: _ZN3std2io5stdio6_print17h0123456789abcdefE
  • Demangled (trailing hash omitted): std::io::stdio::_print
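
To make the encoding in the example above concrete, here is a minimal, std-only sketch of how a legacy symbol breaks down into length-prefixed path components. The function name demangle_legacy_sketch is hypothetical and the decoder is deliberately incomplete (no `$LT$`-style escapes, no v0 symbols); it illustrates the layout and is not a substitute for the rustc-demangle crate proposed below.

```rust
/// Simplified decoder for legacy (`_ZN...E`) Rust symbols.
/// Returns `None` if the input does not follow the length-prefixed layout.
/// Illustrative only: `$LT$`-style escapes and v0 (`_R`) symbols are not handled.
pub fn demangle_legacy_sketch(mangled: &str) -> Option<String> {
    // Strip the `_ZN` prefix and the trailing `E` terminator.
    let mut rest = mangled.strip_prefix("_ZN")?.strip_suffix('E')?;
    let mut parts: Vec<&str> = Vec::new();

    while !rest.is_empty() {
        // Each path component is encoded as <decimal length><identifier>.
        let digits = rest.bytes().take_while(|b| b.is_ascii_digit()).count();
        let len: usize = rest.get(..digits)?.parse().ok()?;
        let end = digits.checked_add(len)?;
        let ident = rest.get(digits..end)?;
        if ident.is_empty() {
            return None;
        }
        parts.push(ident);
        rest = &rest[end..];
    }

    // Drop the trailing hash component (`h` + 16 hex digits), if present.
    if parts.last().map_or(false, |s| is_hash(s)) {
        parts.pop();
    }
    if parts.is_empty() {
        return None;
    }
    Some(parts.join("::"))
}

/// True for components shaped like the legacy hash suffix, e.g. `h0123456789abcdef`.
fn is_hash(s: &str) -> bool {
    s.len() == 17 && s.starts_with('h') && s[1..].bytes().all(|b| b.is_ascii_hexdigit())
}
```

Applied to the mangled symbol above, the components are `std`, `io`, `stdio`, `_print`, and the 17-character hash, which is dropped before joining with `::`.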

Supporting Rust symbol demangling will enable StringyMcStringFace to:

  • Display readable function and method names from Rust binaries
  • Improve string classification accuracy for Rust-compiled artifacts
  • Provide better analysis output for users working with Rust code

Proposed Solution

Implementation Steps

  1. Add Dependencies

    • Add rustc-demangle crate to Cargo.toml
    • Version: Use latest stable (^0.1)
  2. Create Symbol Processing Module

    • Create new file: src/classification/symbols.rs
    • Implement demangle_rust_symbol() function
    • Handle both legacy and v0 Rust mangling schemes
    • Return original string if demangling fails (graceful fallback)
  3. Integration Points

    • Integrate symbol demangling into the string classification pipeline
    • Apply demangling during binary analysis phase
    • Ensure demangling is optional/configurable
  4. Error Handling

    • Handle invalid symbol formats gracefully
    • Log demangling failures for debugging
    • Maintain performance with large symbol tables
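
One possible shape for steps 3 and 4 is sketched below: an optional pipeline stage that falls back to the original string, counts failures (where a real implementation would log), and caches results so repeated symbols are demangled only once. All names here (DemangleConfig, SymbolStage, process) are hypothetical and not part of the existing codebase; the demangler is passed in as a closure so the sketch stays independent of any particular crate.

```rust
use std::collections::HashMap;

/// Hypothetical configuration flag for the classification pipeline.
pub struct DemangleConfig {
    pub enabled: bool,
}

/// Hypothetical pipeline stage wrapping any demangler that returns
/// `Some(demangled)` on success and `None` on failure.
pub struct SymbolStage<F: Fn(&str) -> Option<String>> {
    config: DemangleConfig,
    demangler: F,
    cache: HashMap<String, String>,
    failures: usize,
}

impl<F: Fn(&str) -> Option<String>> SymbolStage<F> {
    pub fn new(config: DemangleConfig, demangler: F) -> Self {
        SymbolStage { config, demangler, cache: HashMap::new(), failures: 0 }
    }

    /// Returns the demangled form, or the input unchanged when demangling
    /// is disabled or fails.
    pub fn process(&mut self, symbol: &str) -> String {
        if !self.config.enabled {
            return symbol.to_string();
        }
        // Reuse earlier results; large symbol tables often repeat entries.
        if let Some(hit) = self.cache.get(symbol) {
            return hit.clone();
        }
        let result = match (self.demangler)(symbol) {
            Some(demangled) => demangled,
            None => {
                // A real implementation would emit a debug-level log here.
                self.failures += 1;
                symbol.to_string()
            }
        };
        self.cache.insert(symbol.to_string(), result.clone());
        result
    }

    /// Number of distinct symbols that could not be demangled.
    pub fn failures(&self) -> usize {
        self.failures
    }
}
```

Whether an unbounded cache is appropriate depends on how large the symbol tables are in practice; a size cap or no cache at all may be the better trade-off.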

Code Structure

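// src/classification/symbols.rs
use rustc_demangle::demangle;

/// Demangles a legacy or v0 Rust symbol into a readable path.
pub fn demangle_rust_symbol(mangled: &str) -> String {
    // demangle() passes unrecognized input through unchanged (graceful fallback).
    // The alternate flag `{:#}` omits the trailing hash (e.g. `::h0123456789abcdef`).
    format!("{:#}", demangle(mangled))
}

/// Heuristic prefix check for Rust mangling (`_ZN` = legacy, `_R` = v0).
/// `_ZN` is also used by Itanium C++ mangling, so a match does not
/// guarantee that the symbol came from Rust.
pub fn is_rust_symbol(symbol: &str) -> bool {
    symbol.starts_with("_ZN") || symbol.starts_with("_R")
}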
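
A slightly stricter variant of the prefix check might also report which scheme a symbol appears to use, which could be useful for classification tags. The sketch below is std-only and hypothetical (ManglingScheme and detect_scheme are not existing names). It assumes that macOS toolchains add an extra leading underscore, and it deliberately skips the underscore-less forms some Windows tooling produces, since a bare `R` or `ZN` prefix would match far too many ordinary strings in a strings-extraction tool.

```rust
/// Which Rust mangling scheme a symbol appears to use.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum ManglingScheme {
    Legacy,
    V0,
}

/// Heuristic scheme detection based on the prefix and the first character
/// after it. This does not validate the rest of the symbol.
pub fn detect_scheme(symbol: &str) -> Option<ManglingScheme> {
    // Legacy symbols: prefix followed by a decimal length, e.g. `_ZN3std...`.
    // Itanium C++ symbols can have the same shape, so this is only a hint.
    for prefix in ["__ZN", "_ZN"] {
        if let Some(rest) = symbol.strip_prefix(prefix) {
            if rest.starts_with(|c: char| c.is_ascii_digit()) {
                return Some(ManglingScheme::Legacy);
            }
        }
    }
    // v0 symbols: prefix followed by an uppercase tag character, e.g. `_RNv...`.
    for prefix in ["__R", "_R"] {
        if let Some(rest) = symbol.strip_prefix(prefix) {
            if rest.starts_with(|c: char| c.is_ascii_uppercase()) {
                return Some(ManglingScheme::V0);
            }
        }
    }
    None
}
```

A `Some` result here only means the symbol is worth handing to the demangler; the demangler's own success or failure remains the final word.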

Testing Strategy

Implement comprehensive unit tests covering:

  1. Valid Rust Symbols

    • Standard function names
    • Methods with type parameters
    • Trait implementations
    • Nested modules
  2. Edge Cases

    • Empty strings
    • Non-Rust mangled symbols (C++, D, etc.)
    • Partially mangled strings
    • Very long symbol names
  3. Performance Tests

    • Benchmark demangling with large symbol tables
    • Ensure acceptable performance overhead
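
For the performance item, a rough way to quantify "acceptable overhead" is to time one pass over a symbol table with and without demangling and compare the two. The helpers below are a std-only sketch with hypothetical names (time_pass, overhead_percent); a proper benchmark harness with warm-up and repeated runs would give more trustworthy numbers than a single timed pass.

```rust
use std::time::{Duration, Instant};

/// Times a single pass of `f` over every symbol in `symbols`.
pub fn time_pass<F: FnMut(&str)>(symbols: &[&str], mut f: F) -> Duration {
    let start = Instant::now();
    for &symbol in symbols {
        f(symbol);
    }
    start.elapsed()
}

/// Relative overhead of `with_demangling` over `baseline`, in percent.
/// Returns `None` when the baseline is zero, since the ratio is undefined.
pub fn overhead_percent(baseline: Duration, with_demangling: Duration) -> Option<f64> {
    let base = baseline.as_secs_f64();
    if base == 0.0 {
        return None;
    }
    Some((with_demangling.as_secs_f64() - base) / base * 100.0)
}
```

For example, a baseline pass of 100 ms and a demangling pass of 104 ms give an overhead of about 4%.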

Example Test Cases

#[test]
fn test_demangle_rust_symbol() {
    assert_eq!(
        demangle_rust_symbol("_ZN3std2io5stdio6_print17h0123456789abcdefE"),
        "std::io::stdio::_print"
    );
}

#[test]
fn test_invalid_symbol_returns_original() {
    let invalid = "not_a_mangled_symbol";
    assert_eq!(demangle_rust_symbol(invalid), invalid);
}

Acceptance Criteria

  • rustc-demangle dependency added to Cargo.toml
  • src/classification/symbols.rs module created with demangling functionality
  • Unit tests achieve >90% code coverage
  • All tests pass successfully
  • Documentation added for public APIs
  • Performance impact measured and acceptable (<5% overhead)

Requirements

Requirement ID: 4.1

Task ID

stringy-analyzer/symbol-processing

Related Work

  • Consider future support for other languages (C++, D, Swift)
  • Potential integration with existing classification rules
  • May need configuration option to enable/disable demangling
