Multiple nested loops taking very long to compile with CPU extensions #115465
Closed
Labels
A-LLVM: Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
C-bug: Category: This is a bug.
I-compiletime: Issue: Problems and improvements with respect to compile times.
P-medium: Medium priority
S-waiting-on-LLVM: Status: the compiler-dragon is eepy, can someone get it some tea?
T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
WG-llvm: Working group: LLVM backend code generation
regression-from-stable-to-stable: Performance or correctness regression from one stable version to another.
Code
I tried this code:
This slowdown is only visible on some target CPUs (not the default x86 target). For me this was -Ctarget-cpu=znver3, but it also happens with target-cpu=native on godbolt: https://godbolt.org/z/xqnbfdxKb
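To compare the default target against a specific target CPU, one option is to time the rustc invocation itself. The following is a minimal sketch, not part of the original report: the helper names `rustc_args` and `time_compile` are hypothetical, the `-O` optimization level is an assumption (the report does not state which level was used), and the source and output paths are placeholders supplied by the caller.

```rust
use std::io;
use std::process::Command;
use std::time::{Duration, Instant};

// Build the rustc arguments for one timing run.
// Assumption: `-O` is used; the report does not state the optimization level.
fn rustc_args(source: &str, output: &str, target_cpu: Option<&str>) -> Vec<String> {
    let mut args = vec![
        "-O".to_string(),
        "--emit=obj".to_string(), // stop after codegen, skip linking
        "-o".to_string(),
        output.to_string(),
    ];
    // Only pass `-C target-cpu=...` when a CPU is requested, so that
    // `None` measures the default target as a baseline.
    if let Some(cpu) = target_cpu {
        args.push("-C".to_string());
        args.push(format!("target-cpu={cpu}"));
    }
    args.push(source.to_string());
    args
}

// Run rustc once and return the wall-clock time of the whole invocation.
fn time_compile(source: &str, output: &str, target_cpu: Option<&str>) -> io::Result<Duration> {
    let start = Instant::now();
    let status = Command::new("rustc")
        .args(rustc_args(source, output, target_cpu))
        .status()?;
    let elapsed = start.elapsed();
    if !status.success() {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            format!("rustc exited with {status}"),
        ));
    }
    Ok(elapsed)
}
```

Calling `time_compile("repro.rs", "repro.o", None)` and then `time_compile("repro.rs", "repro.o", Some("znver3"))` on the same file would give a baseline and a target-specific measurement to compare; `repro.rs` here stands in for whatever file holds the reproduction.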
I expected to see this happen: It compiles in some reasonable amount of time
Instead, this happened: It takes over 7 minutes to compile on my Ryzen 5900X.
I'm thinking that the compiler is aggressively trying to unroll the loops and then inline the formatting code (the compile speeds up quite a bit when I remove the prints), but that is just speculation.
Version it worked on
It's hard to track down exactly where this regression happened, but it seems to work on at least 1.64 (takes ~20s to compile on godbolt); at 1.65 it starts timing out on cpu=znver3.
Version with regression
It seems that the regression happened somewhere between 1.65 and 1.72; however, I am using nightly.
rustc --version --verbose:
Timings