Open
Labels
api: Public API surface (np.*, NDArray methods, operators)
performance: Performance improvements or optimizations
Description
Problem
np.maximum, np.minimum, and np.clip with an @out parameter make two passes over the data:
1. np.copyto(@out, lhs) - copy input to output
2. ClipArrayMin(@out, min) - clip in-place
Root Cause
ClipArrayMin<T>(T* output, T* minArr, int size) operates in-place and has no separate src parameter, so the input has to be copied into the output buffer before it can be clipped.
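To make the cost concrete, here is a minimal C# sketch of the two-pass path described above. Only the ClipArrayMin signature comes from this issue. The kernel body, the TwoPassSketch class, and the TwoPassMaximum helper are assumptions for illustration, and Buffer.MemoryCopy stands in for np.copyto. The real kernels may be vectorized and may treat NaN differently. Compiling this requires AllowUnsafeBlocks.

```csharp
using System;

public static unsafe class TwoPassSketch
{
    // In-place kernel with the signature quoted in this issue.
    // The body is an assumed scalar loop; NaN semantics are not modelled.
    public static void ClipArrayMin<T>(T* output, T* minArr, int size)
        where T : unmanaged, IComparable<T>
    {
        for (int i = 0; i < size; i++)
        {
            // Raise output[i] to minArr[i] when it falls below the bound.
            if (output[i].CompareTo(minArr[i]) < 0)
                output[i] = minArr[i];
        }
    }

    // Hypothetical caller showing why an out buffer costs two passes.
    public static void TwoPassMaximum<T>(T* dest, T* src, T* minArr, int size)
        where T : unmanaged, IComparable<T>
    {
        long bytes = (long)size * sizeof(T);

        // Pass 1: copy the input into the output buffer.
        Buffer.MemoryCopy(src, dest, bytes, bytes);

        // Pass 2: read the output buffer again and clip it in-place.
        ClipArrayMin(dest, minArr, size);
    }
}
```

In this sketch every element of dest is written in pass 1 and then read again (and possibly rewritten) in pass 2, which is the extra memory traffic the issue is about.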
Fix
Add 3-operand kernel variants:
ClipArrayMinFromSource<T>(T* dest, T* src, T* minArr, int size)
// dest[i] = max(src[i], minArr[i]) - single pass
Same for ClipArrayMax and ClipArrayBounds.
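A minimal sketch of what the proposed 3-operand variant could look like, assuming a plain scalar loop. The signature is the one proposed in this issue. The body, the SinglePassSketch class, and the Main driver are assumptions for illustration, not the library's implementation, and NaN handling is not modelled. Compiling this requires AllowUnsafeBlocks.

```csharp
using System;

public static unsafe class SinglePassSketch
{
    // Proposed signature from this issue; the body is an assumed scalar loop.
    // dest[i] = max(src[i], minArr[i]) in a single pass.
    public static void ClipArrayMinFromSource<T>(T* dest, T* src, T* minArr, int size)
        where T : unmanaged, IComparable<T>
    {
        for (int i = 0; i < size; i++)
        {
            // Read both inputs before writing, so dest may alias src.
            T s = src[i];
            T m = minArr[i];
            dest[i] = s.CompareTo(m) < 0 ? m : s;
        }
    }

    public static void Main()
    {
        double[] src = { 1.0, 5.0, 3.0 };
        double[] min = { 2.0, 2.0, 4.0 };
        double[] dst = new double[3];

        fixed (double* pd = dst, ps = src, pm = min)
        {
            ClipArrayMinFromSource(pd, ps, pm, src.Length);
        }

        Console.WriteLine(string.Join(", ", dst)); // prints "2, 5, 4"
    }
}
```

Because each element is loaded from src and stored to dest in the same iteration, passing the same pointer for both still gives the in-place behaviour, so a variant like this could in principle serve both call paths.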
Impact
Affects np.maximum, np.minimum, np.fmax, np.clip when @out is provided.