Rust: More test cases for add, sub etc #21191
Pull request overview
This PR adds additional test cases for Rust taint dataflow analysis to cover more arithmetic and bitwise operators. The tests reveal issues with call resolution when explicit type annotations are missing. The PR also consolidates operator models to reduce duplication.
Changes:
- Added `more_ops()` test function covering negation, not, subtraction, multiplication, shift, and XOR operators
- Added `std_ops()` test function for explicit method calls to operator traits, with MISSING annotations documenting known resolution issues
- Added `wrapping` module to test operations on `Wrapping<i64>` types
- Consolidated operator models in core.model.yml to use `Argument[self,0]` instead of separate entries
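As a rough illustration of what the operator tests exercise, each operator listed above has an explicit trait-method form in `std::ops`. The sketch below is not taken from the PR's `main.rs`; the function name `operators_match_methods` is hypothetical, and explicit integer suffixes are used so that no literal is left to inference.

```rust
use std::ops::{BitXor, Mul, Neg, Not, Shl, Sub};

/// Hypothetical helper: returns true when every operator agrees with the
/// explicit trait-method call it corresponds to.
fn operators_match_methods(a: i64, b: i64) -> bool {
    (-a == a.neg())                      // negation   -> Neg::neg
        && (!a == a.not())               // bitwise not -> Not::not
        && ((a - b) == a.sub(b))         // subtraction -> Sub::sub
        && ((a * b) == a.mul(b))         // multiplication -> Mul::mul
        && ((a << 2u32) == a.shl(2u32))  // shift -> Shl::shl
        && ((a ^ b) == a.bitxor(b))      // XOR -> BitXor::bitxor
}

fn main() {
    assert!(operators_match_methods(12, 5));
}
```

Both forms reach the same trait method, which is presumably why the tests cover the operator and the method-call spelling separately: the analysis has to resolve each to the same target.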
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| rust/ql/test/library-tests/dataflow/taint/main.rs | Added test functions for various operators and their method forms, including cases that reveal call resolution issues |
| rust/ql/test/library-tests/dataflow/taint/inline-taint-flow.expected | Auto-generated test expectations updated to reflect new test cases |
| rust/ql/test/library-tests/dataflow/taint/TaintFlowStep.expected | Auto-generated test expectations updated to reflect new test cases |
| rust/ql/lib/codeql/rust/frameworks/stdlib/core.model.yml | Consolidated operator models to reduce duplication by using combined argument specifications |
    fn std_ops() {
        sink(source(1).add(2i64)); // $ hasTaintFlow=1
        sink(source(1).add(2)); // $ MISSING: hasTaintFlow=1 --- the missing results here are due to failing to resolve targets for `add` etc where there's no explicit type
This is a known issue: the literal `2` is currently assigned the type `i32` instead of `i64`.
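For comparison, here is a small sketch of how `rustc` itself types such a literal. It assumes `source` returns `i64`; the `source` below is a stand-in, since the real test helper is not shown in this thread, and `literal_width_bytes` is a hypothetical name. With an `i64` receiver, the only `Add` implementations that can accept an integer literal take an `i64`, so the unsuffixed literal is inferred as `i64`.

```rust
use std::mem::size_of_val;
use std::ops::Add;

/// Stand-in for the test suite's `source` (assumed to return `i64`).
fn source(id: i64) -> i64 {
    id
}

/// Hypothetical helper: reports the size in bytes of an unsuffixed literal
/// that is only used as the argument of `i64::add`.
fn literal_width_bytes() -> usize {
    let rhs = 2; // no suffix: the type is fixed by the call below
    let _sum = source(1).add(rhs);
    size_of_val(&rhs)
}

fn main() {
    // 8 bytes means the literal was inferred as i64, not defaulted to i32.
    assert_eq!(literal_width_bytes(), 8);
}
```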
    a.add_assign(Wrapping(crate::source(3)));
    a += source(4);
    a += std::num::Wrapping(crate::source(5));
    sink(a); // $ hasTaintFlow=2 hasTaintFlow=4 MISSING: hasTaintFlow=3 hasTaintFlow=5 --- we don't currently find any `Call`s for `Wrapping` above
I believe this is because we need special models like core::num::wrapping::Wrapping as core::ops::arith::AddAssign for `Wrapping`. Something like:

    - ["<_ as core::ops::arith::AddAssign>::add_assign", "Argument[self].Reference.Field[core::num::wrapping::Wrapping(0)]", "Argument[self].Reference.Field[core::num::wrapping::Wrapping(0)]", "taint", "manual"]
    - ["<_ as core::ops::arith::AddAssign>::add_assign", "Argument[0].Field[core::num::wrapping::Wrapping(0)]", "Argument[self].Reference.Field[core::num::wrapping::Wrapping(0)]", "taint", "manual"]
I'll try that, thanks...
I've expanded the test case and added models (and fixed some existing ones). Though I suspect for types like Wrapping, if we could identify them and model the value itself as tainted (rather than the content) it might be easier to model and result in more complete flow.
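To make the value-versus-content distinction concrete: `Wrapping<T>` is a tuple struct whose only field, `.0`, holds the number, so data passing through `add_assign` or `+=` ends up in that field of the receiver. The following is a minimal sketch of the Rust side only, not of the CodeQL models; the function name `accumulate` is hypothetical.

```rust
use std::num::Wrapping;
use std::ops::AddAssign;

/// Hypothetical helper: adds two values into a `Wrapping<i64>` accumulator,
/// once through the explicit trait method and once through the operator.
fn accumulate(start: i64, by_method: i64, by_operator: i64) -> i64 {
    let mut a = Wrapping(start);
    a.add_assign(Wrapping(by_method)); // explicit AddAssign::add_assign call
    a += Wrapping(by_operator);        // operator form of the same trait method
    a.0 // the numeric value lives in tuple field 0
}

fn main() {
    assert_eq!(accumulate(1, 2, 3), 6);
    // Arithmetic wraps around on overflow instead of panicking.
    assert_eq!(accumulate(i64::MAX, 1, 0), i64::MIN);
}
```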
I just pushed some model changes. I will re-run DCA (after checking CI) to confirm nothing has regressed.
    - ["<_ as core::ops::arith::Rem>::rem", "Argument[self]", "ReturnValue", "taint", "manual"]
    - ["<_ as core::ops::arith::Rem>::rem", "Argument[0]", "ReturnValue", "taint", "manual"]
    - ["<_ as core::ops::arith::Rem>::rem", "Argument[0].Reference", "ReturnValue", "taint", "manual"]
    - ["<_ as core::ops::arith::Add>::add", "Argument[self,0]", "ReturnValue", "taint", "manual"]
I believe the purpose of models like `- ["<_ as core::ops::arith::Add>::add", "Argument[0].Reference", "ReturnValue", "taint", "manual"]` was to account for implementations like https://doc.rust-lang.org/std/ops/trait.Add.html#impl-Add%3C%26f16%3E-for-f16.
Actually, since these are taint steps, they should not be needed given our implementation of defaultImplicitTaintRead, which means that if data is stored in a Reference, then it will be auto-read.
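For context, the by-reference implementations discussed in this thread are the standard library's operator impls whose right-hand side is a reference, such as `Add<&i64> for i64`. The sketch below uses `i64` rather than `f16` (which is not available on stable Rust), and the function names are hypothetical. In the by-reference calls the argument is a reference to the data, which is the case the `Argument[0].Reference` models were written for.

```rust
use std::ops::{Add, Rem};

/// By-value right-hand side: uses `impl Add<i64> for i64`.
fn add_by_value(a: i64, b: i64) -> i64 {
    a.add(b)
}

/// By-reference right-hand side: uses `impl Add<&i64> for i64`.
fn add_by_ref(a: i64, b: &i64) -> i64 {
    a.add(b)
}

/// By-reference right-hand side for `%`: uses `impl Rem<&i64> for i64`.
fn rem_by_ref(a: i64, b: &i64) -> i64 {
    a.rem(b)
}

fn main() {
    let rhs = 5i64;
    assert_eq!(add_by_value(17, rhs), 22);
    assert_eq!(add_by_ref(17, &rhs), 22);
    assert_eq!(rem_by_ref(17, &rhs), 2);
}
```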
Yep, that explanation tracks: these extra model cases mattered once but are no longer needed. I want to avoid copying unnecessary bloat into new models, and I figured I might as well clean up the old ones too.
I'm doing a DCA run to check for unexpected regressions.
DCA LGTM (no result changes).
Leftover work from last year. I thought we were missing some simple models I could easily add, but it turns out the test cases reveal deeper issues, perhaps in call resolution.
@hvitved FYI