add llvm.assume bounds hints, full fast-math flags on all fp ops #492
Merged
Benchmark Results (Linux x86-64)
CLI Tool Benchmarks
Summary
- `@llvm.assume` after bounds checks: after a successful bounds check, emit `call void @llvm.assume(i1 %in_bounds)` to tell LLVM's optimizer explicitly that the index is valid. Zero runtime cost (`llvm.assume` generates no machine code). This may let SCEV eliminate redundant bounds checks in loops with monotonic access patterns.
- Full fast-math flags on all FP ops: add `nnan ninf` to the existing `nsz arcp contract reassoc afn` flags on `fadd`, `fsub`, `fmul`, `fdiv`, `frem`, `fneg`. This tells LLVM it can assume no NaN/Infinity values, enabling more aggressive loop transforms and FP optimizations.
- The `emitFAdd`/`emitFSub`/`emitFMul`/`emitFDiv`/`emitFRem`/`emitFNeg` builder functions now emit the same fast-math flags as the binary operator codegen path.

These are foundational optimizations that give LLVM's optimizer more information to work with. On their own they don't produce measurable speedups on current benchmarks, but they're prerequisites for future optimizations (loop bounds check elimination, auto-vectorization), and they ensure consistent optimization across all FP code paths.
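For reviewers who want to picture the emitted IR, here is a minimal sketch of the two changes as textual-IR helpers. It is not the code in this PR: the function names (`emitFloatBinOp`, `emitFloatNeg`, `emitBoundsCheck`), the `@bounds_fail` trap, the `double`/`i64` types, and the label naming are all assumptions made for illustration.

```typescript
// Hypothetical helpers; names, types, and IR layout are illustrative only.

// The seven flags named in the summary. LLVM's `fast` flag implies all of them.
const FAST_MATH_FLAGS = "nnan ninf nsz arcp contract reassoc afn";

type FloatBinOp = "fadd" | "fsub" | "fmul" | "fdiv" | "frem";

// Emit one floating-point binary instruction carrying the full flag set.
// Assumes `double` operands for simplicity.
function emitFloatBinOp(op: FloatBinOp, dest: string, lhs: string, rhs: string): string {
  return `${dest} = ${op} ${FAST_MATH_FLAGS} double ${lhs}, ${rhs}`;
}

// fneg is unary, but accepts the same fast-math flags.
function emitFloatNeg(dest: string, operand: string): string {
  return `${dest} = fneg ${FAST_MATH_FLAGS} double ${operand}`;
}

// Emit a bounds check and, on the success path, restate the condition
// to the optimizer with llvm.assume. `id` keeps value and label names unique.
function emitBoundsCheck(index: string, length: string, id: number): string[] {
  const cond = `%in_bounds${id}`;
  return [
    // Unsigned compare also rejects negative indices.
    `  ${cond} = icmp ult i64 ${index}, ${length}`,
    `  br i1 ${cond}, label %ok${id}, label %oob${id}`,
    `oob${id}:`,
    `  call void @bounds_fail() ; hypothetical runtime trap`,
    `  unreachable`,
    `ok${id}:`,
    // No machine code is generated for this call; it only carries information.
    `  call void @llvm.assume(i1 ${cond})`,
  ];
}
```

Calling `emitFloatBinOp("fadd", "%sum", "%a", "%b")` yields `%sum = fadd nnan ninf nsz arcp contract reassoc afn double %a, %b`. Keeping the flag string in one constant is one way to make sure the builder functions and the binary operator path cannot drift apart again.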
Test plan
- `npm run verify:quick` passes (tests + self-hosting Stage 1)