Suppresses RLC non-final field overwrite warning for safe constructor field initialization#7050
Suppresses RLC non-final field overwrite warning for safe constructor field initialization#7050iamsanjaymalakar wants to merge 119 commits intotypetools:masterfrom
Conversation
… field initialization
kelloggm
left a comment
There was a problem hiding this comment.
Overall looks good, just a few minor comments. I'd like to take one more look before we merge, though.
|
@iamsanjaymalakar The typecheck job is not passing. |
kelloggm
left a comment
There was a problem hiding this comment.
LGTM, I don't need to review again. Please address these comments and then request a review from Mike.
|
@iamsanjaymalakar Martin asked you to assign me when you had addressed his comments. Have you addressed them? |
Yes, I have. You can take a look. |
mernst
left a comment
There was a problem hiding this comment.
Please clarify some documentation, and update the code if necessary. (I didn't read the code because there is no point in doing so until the documentation is clear.)
|
|
||
| public FirstAssignmentInConditional(boolean b) { | ||
| try { | ||
| if (b) { |
There was a problem hiding this comment.
I don't see how it would turn into a CFG dependency.
| try { | ||
| if (b) { | ||
| // ::error: (required.method.not.called) | ||
| s = new FileInputStream("test1.txt"); // false positive: first write in this branch |
There was a problem hiding this comment.
I don't see that handling branches requires control-flow reasoning or a control-flow graph. It can be done purely (and simply) in the AST. If you don't want to do it, that is one thing, but please provide the correct rationale.
I think the wording you are looking for is "lexically first"; "AST-only" does not convey anything to me.
|
@iamsanjaymalakar When you address a code review comment, please mark it as "resolved" so that we know its status and it does not clutter this code review webpage. |
…ork into 7049-dev
Code review changes
| } | ||
| } else if (stmt instanceof TryTree tryTree) { | ||
|
|
||
| // finally introduces ordering across try/catch that requires CFG reasoning |
There was a problem hiding this comment.
Why does it require CFG reasoning? Here is a simple suggestion:
If the finally block cannot assign to the field, then it need not cause the whole scan to return REASSIGNMENT.
Another way to say this is that you might as well scan the finally block, as it is run at the end.
In a regular statement block, whether the second statement is executed depends on whether the first throws an exception, but that does not prevent your algorithm from scanning it.
If no previous code (in the try body or in any catch) can assign the field, then an assignment in the finally block can be the first assignment. So move handling of the finally block to the end of the handling of TryTree, and scan it.
| } | ||
|
|
||
| // try-with-resources evaluates resource initializers before the try body. Modeling those | ||
| // effects would require extra handling, so reject. |
There was a problem hiding this comment.
Why does it require extra handling? I think it is enough to scan the variable declarations, one after another, just as the code scans variable declarations in any other block.
| } | ||
|
|
||
| // If any catch assigns the field, then initialization is path-dependent (try vs catch). | ||
| // Without control-flow reasoning, conservatively reject. |
There was a problem hiding this comment.
Why is control-flow reasoning (as opposed to tree path reasoning) required? If the body of the try definitely does not assign the variable, then each catch can be processed independently, and the assignment may be the first. If the body of the try may assign the variable, then there is already an answer before getting to the catch blocks. More generally, process the try block in execution order (which is what comments claim about the algorithm anyway).
Regarding the note about "initialization if path-dependent": that is also true for if statements and many other constructs, but that fact does not prevent analysis of those constructs. This file is only trying to ask: if statement S is executed, then is it the first assignment to the field?
|
|
||
| boolean targetInThen = containsTargetAssignment(thenStmt, targetAssignment); | ||
| boolean targetInElse = containsTargetAssignment(elseStmt, targetAssignment); | ||
|
|
||
| // If the target assignment is in one branch, only that branch is on the path to the | ||
| // target assignment. Return the result after scanning that branch. | ||
| if (targetInThen) { | ||
| return scanForFirstWrite(List.of(thenStmt), targetAssignment, targetField, cmAtf); | ||
| } | ||
| if (targetInElse) { | ||
| return scanForFirstWrite(List.of(elseStmt), targetAssignment, targetField, cmAtf); | ||
| } | ||
|
|
There was a problem hiding this comment.
I don't see the need for any of this code.
The below code processes the then and the else.
It can return the following result:
- If either the
thenor theelseis FIRST_ASSIGNMENT, that is the outcome of theifstatement. - If either the
thenor theelseis REASSIGNMENT, that is the outcome of theifstatement. - Otherwise, both are UNASSIGNED, so that is the outcome of the
ifstatement.
| } | ||
| return FirstWriteScanResult.REASSIGNMENT; | ||
| } else { | ||
| // Any other statement kind requires control-flow-aware reasoning that this helper does not |
There was a problem hiding this comment.
I don't agree with this claim about control-flow-aware reasoning. For example, an AssertTree consists of two expressions that can be processed one after the other.
EmptyStatementTree, LabeledStatementTree, and ReturnTree, are also easy to handle.
If you want to say that you are all other statement kinds because you don't want to put in the work, that is OK (and I agree that some of them are more trouble than they are worth or require control-flow reasoning), but please don't make blanket statements that are not true.
| * not handle field initializers, static or instance initializer blocks, or statement-level | ||
| * control-flow constructs. | ||
| * | ||
| * <p>If it is used on an unsupported fragment, the result is conservative only with respect to |
There was a problem hiding this comment.
Does "unsupported fragment" mean an unsupported Expression type or an unsupported expression context?
| // - current assignment → first write → FIRST_ASSIGNMENT | ||
| // - earlier assignment → not first → REASSIGNMENT | ||
| // TODO: Must scan the RHS, to catch a nested assignment to the same field within the RHS, | ||
| // e.g., this.f = (this.f = new Foo()). |
| @Override | ||
| public FirstWriteScanResult visitMethodInvocation(MethodInvocationTree node, Void p) { | ||
| // Treat any side-effecting call before the target assignment as possibly assigning the field. | ||
| // An explicit super(...) call is also allowed, |
There was a problem hiding this comment.
Explicitly discuss whether this is an unsoundness. (I think it is.)
| */ | ||
| private abstract static class BooleanShortCircuitScanner extends TreeScanner<Boolean, Void> { | ||
|
|
||
| /** Do not instantiate. */ |
There was a problem hiding this comment.
This is instantiated, so the comment is wrong. (Or if I am wrong about that, add a throw statement to the body.)
| @FindDistinct Tree targetAssignment, | ||
| @FindDistinct VariableElement targetField, | ||
| RLCCalledMethodsAnnotatedTypeFactory cmAtf) { | ||
| return new ExpressionFirstWriteScanner(targetAssignment, targetField, cmAtf).scan(root, null); |
There was a problem hiding this comment.
Every call to scanForFirstWrite(ExpressionTree, ...) creates a new ExpressionFirstWriteScanner. That seems wasteful. Furthermore, it highlights a difference between the way that statements are handled and how expressions are handled. Unify them. Here are two possibilities:
- eliminate
ExpressionFirstWriteScannerand make top-level private methodscanForFirstWritehandle allTrees (including both statement trees and expression trees) by calling itself recursively, or - eliminate top-level private method
scanForFirstWriteand expandExpressionFirstWriteScannerintoFirstWriteScannerthat handles allTreess (including both statement trees and expression trees).
Either of those approaches can work. Given your current code structure, you may find the first one simpler to implement.
Summary by CodeRabbit
New Features
Tests