Optimize splitPart scalar function to reduce allocation by justahuman1 · Pull Request #17660 · apache/pinot

justahuman1 · 2026-02-07T20:52:19Z

Optimized the splitPart function to use direct string traversal instead of allocating a full String[] array, significantly reducing memory pressure in high-throughput query scenarios.

Performance improvements:

Space complexity: O(n) → O(1)
Time complexity: O(n) for both implementations

Added JMH benchmarks to demonstrate improvements and regressions. The primary regression is in the backward index case so I added splitPartNegativeIdxSingleCharDelim (b4272e2) which increases complexity but makes the kernel much faster

Addresses #17362

JMH Results

Yeah, the common cases are more than 10x faster 🚀 . The original results showed a regression with negative_index, hence b4272e2. However, this code may be harder to manage so that is the trade-off. The below results are from all commits in the PR

Operation Type	New (ns/op)	Old (ns/op)	Speedup (Old/New)
small_index	777.8	38,722.2	≈ 49.8× faster
large_index	295,474.5	3,867,775.5	≈ 13.1× faster
negative_index	13.2	38,635.4	≈ 2,929× faster
large_negative_index	47,797.4	3,824,985.5	≈ 80× faster
adjacent_delimiters	779.9	3,864.4	≈ 5× faster

Benchmark                                       (_scenario)  Mode  Cnt           Score          Error  Units
BenchmarkSplitPart.testSplitPartNew             small_index  avgt   10         777.797 ±       22.025  ns/op
BenchmarkSplitPart.testSplitPartNew             large_index  avgt   10      295474.514 ±     2876.945  ns/op
BenchmarkSplitPart.testSplitPartNew          negative_index  avgt   10          13.198 ±        2.553  ns/op
BenchmarkSplitPart.testSplitPartNew    large_negative_index  avgt   10       47797.384 ±      465.119  ns/op
BenchmarkSplitPart.testSplitPartNew     adjacent_delimiters  avgt   10         779.944 ±       14.194  ns/op
BenchmarkSplitPart.testSplitPartNew             many_fields  avgt   10        1935.942 ±       59.342  ns/op
BenchmarkSplitPart.testSplitPartNew        multi_char_delim  avgt   10        4281.167 ±      102.780  ns/op
BenchmarkSplitPart.testSplitPartNew  large_multi_char_delim  avgt   10     1160607.923 ±    20648.116  ns/op
BenchmarkSplitPart.testSplitPartOld             small_index  avgt   10       38722.193 ±     1463.000  ns/op
BenchmarkSplitPart.testSplitPartOld             large_index  avgt   10     3867775.470 ±   107083.008  ns/op
BenchmarkSplitPart.testSplitPartOld          negative_index  avgt   10       38635.403 ±     1325.020  ns/op
BenchmarkSplitPart.testSplitPartOld    large_negative_index  avgt   10     3824985.465 ±    89641.937  ns/op
BenchmarkSplitPart.testSplitPartOld     adjacent_delimiters  avgt   10        3864.391 ±      101.146  ns/op
BenchmarkSplitPart.testSplitPartOld             many_fields  avgt   10     3814644.277 ±   132924.021  ns/op
BenchmarkSplitPart.testSplitPartOld        multi_char_delim  avgt   10      182955.787 ±     3254.768  ns/op
BenchmarkSplitPart.testSplitPartOld  large_multi_char_delim  avgt   10  1043425954.300 ± 29614093.332  ns/op

Allocation via `-prof gc`

     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartNew:gc.alloc.rate.norm":  48.000 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartNew:gc.alloc.rate.norm":  48.001 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartNew:gc.alloc.rate.norm":  48.002 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartNew:gc.alloc.rate.norm":  48.007 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartNew:gc.alloc.rate.norm":  56.036 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartNew:gc.alloc.rate.norm":  56.043 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartNew:gc.alloc.rate.norm":  6616.001 B/op

     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartOld:gc.alloc.rate.norm":  304.000 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartOld:gc.alloc.rate.norm":  3560.001 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartOld:gc.alloc.rate.norm":  6616.001 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartOld:gc.alloc.rate.norm":  760984.043 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartOld:gc.alloc.rate.norm":  760984.044 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartOld:gc.alloc.rate.norm":  760984.045 B/op
     "org.apache.pinot.perf.BenchmarkSplitPart.testSplitPartOld:gc.alloc.rate.norm":  760984.046 B/op

justahuman1 · 2026-02-07T20:52:46Z

   * TODO: Revisit if index should be one-based (both Presto and Postgres use one-based index, which starts with 1)
-   * @param input
-   * @param delimiter
+   * @param input the input String to be split into parts.


copied verbatim from the the overloaded splitPart function below to keep in sync.

justahuman1 · 2026-02-07T20:54:47Z

@@ -0,0 +1,160 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one


I can remove this if required. Added it for my own validation of the perf

codecov-commenter · 2026-02-13T01:54:41Z

Codecov Report

❌ Patch coverage is 93.84615% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.22%. Comparing base (7eec9a7) to head (84c507d).
⚠️ Report is 100 commits behind head on master.

Files with missing lines	Patch %	Lines
.../pinot/common/function/scalar/StringFunctions.java	93.84%	1 Missing and 3 partials ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##             master   #17660      +/-   ##
============================================
- Coverage     63.29%   63.22%   -0.08%     
- Complexity     1466     1525      +59     
============================================
  Files          3190     3194       +4     
  Lines        191904   193705    +1801     
  Branches      29390    29809     +419     
============================================
+ Hits         121474   122469     +995     
- Misses        60940    61622     +682     
- Partials       9490     9614     +124

Flag	Coverage Δ
custom-integration1	`100.00% <ø> (ø)`
integration	`100.00% <ø> (ø)`
integration1	`100.00% <ø> (ø)`
integration2	`0.00% <ø> (ø)`
java-11	`63.20% <93.84%> (-0.06%)`	⬇️
java-21	`63.19% <93.84%> (-0.07%)`	⬇️
temurin	`63.22% <93.84%> (-0.08%)`	⬇️
unittests	`63.22% <93.84%> (-0.08%)`	⬇️
unittests1	`55.52% <93.84%> (-0.08%)`	⬇️
unittests2	`34.17% <0.00%> (-0.10%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

justahuman1 · 2026-02-13T22:05:03Z

I will debug these tests and update the PR. I will also try to get the existing coverage to 100% so we can be confident before migrating

Jackie-Jiang · 2026-02-18T01:23:30Z

Try rebasing. The test failures might not be related

Updated the splitPart function to use direct string traversal instead of allocating a full String[] array, significantly reducing memory pressure in high-throughput query scenarios. Performance improvements: - Space complexity: O(n) → O(1) - Time complexity: O(n) for both implementations Added JMH benchmarks to demonstrate improvements and regressions. The primary regression is in the backward index case with a large index value. This is the uncommon case and maybe worth the tradeoff since the memory allocation in the common case is now significantly reduced.

justahuman1 · 2026-03-07T04:21:58Z

Done. I also added full test coverage which uncovered a behavior change with this new implementation, let me know your thoughts @Jackie-Jiang

Behavioral difference from old implementation

The new implementation differs from StringUtils.splitByWholeSeparator for empty input strings only. The old approach returns an empty array for "", treating it as having 0 fields.
The new implementation treats "" as having 1 field (the empty string itself), which is consistent with how non-empty strings without a matching delimiter are handled:

input	delimiter	index	old impl	new impl
`""`	`"."`	0	`"null"`	`""`
`""`	`"."`	-1	`"null"`	`""`
`""`	`"::"`	-1	`"null"`	`""`
`"a"`	`"."`	0	`"a"`	`"a"` (same)

I feel that the new implementation is more correct, rather than returning null. wdyt?

Copilot

Pull request overview

This PR optimizes the splitPart scalar function in pinot-common to avoid allocating a full String[] split result, reducing memory pressure for high-throughput query workloads. It also adds a JMH benchmark and expands unit test coverage for additional edge cases.

Changes:

Reimplemented StringFunctions.splitPart(input, delimiter, index) using direct string traversal (including a specialized negative-index fast path for single-char delimiters).
Added JMH benchmarks to compare the new vs old splitPart implementations.
Added many new splitPart unit test cases covering edge scenarios (empty delimiter, consecutive delimiters, leading/trailing delimiters, multi-char delimiters, empty input).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File	Description
`pinot-common/src/main/java/org/apache/pinot/common/function/scalar/StringFunctions.java`	Replaces `splitPart` implementation with allocation-reduced traversal logic and adds a negative-index optimized helper.
`pinot-common/src/test/java/org/apache/pinot/common/function/scalar/StringFunctionsTest.java`	Expands `splitPart` test matrix for more delimiter/index boundary cases.
`pinot-perf/src/main/java/org/apache/pinot/perf/BenchmarkSplitPart.java`	Introduces a JMH benchmark to measure old vs new `splitPart` performance across scenarios.

Copilot · 2026-03-08T01:43:43Z

+        _delimiter = ":";
+        _input = buildString(50, repeatStr(_delimiter, 10));


In the multi_char_delim scenario, _delimiter is set to ":" but _input is built using repeatStr(_delimiter, 10) as the delimiter between fields. This means the benchmark isn't actually exercising a multi-character delimiter for splitPart (it passes a single-char delimiter that happens to appear consecutively), so the results can be misleading. Consider setting _delimiter to the repeated string (or use something like "::") and building _input with that same _delimiter.

Suggested change

_delimiter = ":";

_input = buildString(50, repeatStr(_delimiter, 10));

_delimiter = repeatStr(":", 10);

_input = buildString(50, _delimiter);

this is semantically the same. its more readable this way so I will address it but the logic is the same

Copilot · 2026-03-08T01:43:44Z

+      case "large_index":
+        // Extract 80th element from 100-field string
+        _delimiter = ",";
+        _input = buildString(10000, _delimiter);
+        _index = 790;
+        break;
+      case "negative_index":
+        // Extract last element from 100-field string
+        _delimiter = ",";
+        _input = buildString(100, _delimiter);
+        _index = -1;
+        break;
+      case "large_negative_index":
+        // Extract last element from 100-field string
+        _delimiter = ",";
+        _input = buildString(10000, _delimiter);
+        _index = -7242;
+        break;
+      case "adjacent_delimiters":
+        // String with many consecutive delimiters
+        _input = "field0+++field1+++field2+++field3";
+        _delimiter = "+";
+        _index = 1;
+        break;
+      case "many_fields":
+        // Extract 5th element from 1000-field string
+        _delimiter = ",";
+        _input = buildString(10000, _delimiter);
+        _index = 4;


Several scenario comments don't match the actual parameters (e.g., large_index/many_fields mention 100/1000 fields but build 10,000; large_index says 80th but uses index 790). Align the comments (or the parameters) so the benchmark scenarios remain self-explanatory.

thanks for this. updated the comments.

Copilot · 2026-03-08T01:43:44Z

+      start += delimLen;
+    }
+    // optimization for negative index with single-char delimiter since common case
+    // multi-char delimiter with negative index can be handled in future since less common and more complex


-index will overflow when index == Integer.MIN_VALUE, causing incorrect behavior for that edge case (and potentially passing a negative value into the helper). Add an explicit guard for index == Integer.MIN_VALUE (e.g., return "null" early) or handle the absolute-value conversion without overflow.

Suggested change

// multi-char delimiter with negative index can be handled in future since less common and more complex

// multi-char delimiter with negative index can be handled in future since less common and more complex

if (index == Integer.MIN_VALUE) {

return "null";

}

huh TIL. Done in 710c916

…d multi_char_delim scenario

justahuman1 · 2026-03-09T22:48:43Z

Addressed all comment from copilot cc: @xiangfu0 @deemoliu

Jackie-Jiang · 2026-03-19T17:05:50Z

-    } else {
+    // compare with {@link BenchmarkSplitPart} for future changes
+    int delimLen = delimiter.length();
+    if (delimLen == 0) {


This seems like a behavior change. When delimiter is empty, the current behavior splits on whitespace.

Can you also check the behavior on PostgreSQL?

Thanks for the review! I checked the behavior for postgres;

When the delimiter is an empty string:

PostgreSQL: returns the entire input for field 1/-1, empty string for other fields

Pinot (old): splits on whitespace — this was a side effect of StringUtils.splitByWholeSeparator treating empty ("") the same as null

I've updated the implementation to preserve the existing whitespace-splitting behavior for the empty delimiter case by falling back to StringUtils.splitByWholeSeparator. The changes in this PR now only applies to non-empty delimiters, so there's no behavioral change.

If we want to align with PostgreSQL's empty-delimiter semantics, that can be a separate PR since it's a behavioral change independent of this performance optimization.

Sounds good. Could you please also add a test using random string generator and some simple delimiters to ensure the correctness of the new impl?

Done: 84c507d

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

Copilot · 2026-03-24T19:20:56Z

+
+    // skip leading delimiters
+    int start = 0;
+    int len = input.length();


For empty input (""), this implementation returns an empty string for index 0 and -1 (because start == len and substring(len) is ""). That differs from the existing splitPart(input, delimiter, limit, index) behavior (which returns "null" for the same cases with a large limit) and is likely an unintended semantic change for splitPart without limit. Consider adding an explicit empty-input guard (return "null") to keep the 3-arg and 4-arg splitPart variants consistent.

Suggested change

int len = input.length();

int len = input.length();

if (len == 0) {

// Align behavior with splitPart(input, delimiter, limit, index) for empty input

return "null";

}

Also addressed as part of Jackie's comment regarding behavior change. Thanks 👍

Copilot · 2026-03-24T19:20:56Z

+        {"", ".", 0, 100, "", "null"},
+        {"", ".", -1, 100, "", "null"},
+        {"", ".", -2, 100, "null", "null"},
+
+        // Empty input with multi-char delimiter and negative index
+        {"", "::", -1, 100, "", "null"},


These new test cases assert that StringFunctions.splitPart("", ".", 0/-1) returns "" (empty string). The 4-arg splitPart(input, delimiter, limit, index) currently returns "null" for the same inputs with limit=100, so this bakes in a behavioral divergence between the overloads. If the intent is to preserve existing semantics, update these expectations to match the overload (and the prior implementation) rather than introducing a silent behavior change for empty input.

Suggested change

{"", ".", 0, 100, "", "null"},

{"", ".", -1, 100, "", "null"},

{"", ".", -2, 100, "null", "null"},

// Empty input with multi-char delimiter and negative index

{"", "::", -1, 100, "", "null"},

{"", ".", 0, 100, "null", "null"},

{"", ".", -1, 100, "null", "null"},

{"", ".", -2, 100, "null", "null"},

// Empty input with multi-char delimiter and negative index

{"", "::", -1, 100, "null", "null"},

Good call! I was also doing this per #17660 (comment)

Copilot · 2026-03-24T19:20:56Z

-    } else if (index < 0 && index >= -splitString.length) {
-      return splitString[splitString.length + index];
-    } else {
+    // compare with {@link BenchmarkSplitPart} for future changes


The // compare with {@link BenchmarkSplitPart} note uses Javadoc {@link ...} syntax in a non-Javadoc comment and references a class in the perf module. Consider switching to plain text (or a URL) to avoid implying a resolvable Javadoc symbol and reduce cross-module coupling in production code comments.

Suggested change

// compare with {@link BenchmarkSplitPart} for future changes

// compare with BenchmarkSplitPart (perf module) for future changes

Jackie-Jiang · 2026-03-25T00:09:44Z

+      String input, char delimiter, int index, int len, int start) {
+    // input is empty or contains only delimiters
+    if (start == len) {
+      return index == 1 ? "" : "null";


Is this correct? Why is it returning empty string?

Good question—

Even in the old implementation, splitByWholeSeparator("+++++", "+") returns [""] — a single-element array with one empty string (i.e. the trailing content after all delimiters are consumed, which is an empty char). So index=-1 maps to parts[0] = "". This is also why I found the StringUtils behavior confusing..

Reproducible in jshell:

❯ jshell --class-path pinot-server/target/pinot-server-pkg/lib/commons-lang3-3.20.0.jar | Welcome to JShell -- Version 17.0.5 | For an introduction type: /help intro jshell> import org.apache.commons.lang3.StringUtils; jshell> StringUtils.splitByWholeSeparator("+++++", "+") $2 ==> String[1] { "" }

Also see intelliJ test coverage where it shows the randomized test (which uses the older StringUtils version for the expected value) hitting this branch as well:

We also do have test cases that hit both these branches:

{"+++++", "+", -1, 100, "", ""}, {"+++++", "+", -2, 100, "null", "null"},

Jackie-Jiang

Great job!

justahuman1 commented Feb 7, 2026

View reviewed changes

justahuman1 force-pushed the split-part-malloc-fix branch from 33336f1 to 5a2b4f1 Compare February 7, 2026 20:53

justahuman1 commented Feb 7, 2026

View reviewed changes

Comment thread pinot-common/src/test/java/org/apache/pinot/common/function/scalar/StringFunctionsTest.java

justahuman1 commented Feb 7, 2026

View reviewed changes

justahuman1 mentioned this pull request Feb 7, 2026

Improve splitPart function #17362

Closed

justahuman1 force-pushed the split-part-malloc-fix branch from 5a2b4f1 to 24644cf Compare February 7, 2026 21:38

justahuman1 changed the title ~~Update splitPart scalar function to reduce allocation~~ Optimize splitPart scalar function to reduce allocation Feb 10, 2026

Jackie-Jiang added enhancement Improvement to existing functionality performance Related to performance optimization labels Feb 13, 2026

justahuman1 added 3 commits March 6, 2026 20:15

add negative pass

66dddf9

Add comprehensive splitPart test coverage for edge cases

cb95a56

justahuman1 force-pushed the split-part-malloc-fix branch from b4272e2 to cb95a56 Compare March 7, 2026 04:16

xiangfu0 requested review from Copilot, deemoliu and xiangfu0 March 8, 2026 01:38

Copilot started reviewing on behalf of xiangfu0 March 8, 2026 01:39 View session

Copilot AI reviewed Mar 8, 2026

View reviewed changes

Address PR review: Integer.MIN_VALUE guard, fix benchmark comments an…

710c916

…d multi_char_delim scenario

Jackie-Jiang reviewed Mar 19, 2026

View reviewed changes

xiangfu0 added the index Related to indexing (general) label Mar 20, 2026

Jackie-Jiang removed the index Related to indexing (general) label Mar 24, 2026

xiangfu0 requested a review from Copilot March 24, 2026 19:15

Copilot started reviewing on behalf of xiangfu0 March 24, 2026 19:16 View session

Copilot AI reviewed Mar 24, 2026

View reviewed changes

address behavior change to prevent regression + copilot cmts

84c507d

Jackie-Jiang reviewed Mar 25, 2026

View reviewed changes

Jackie-Jiang approved these changes Mar 26, 2026

View reviewed changes

Jackie-Jiang merged commit 61009f6 into apache:master Mar 26, 2026
30 of 32 checks passed

Jackie-Jiang mentioned this pull request Mar 26, 2026

Optimize splitPart To Reduce Allocations #17428

Closed

		@@ -0,0 +1,160 @@
		/**
		* Licensed to the Apache Software Foundation (ASF) under one

		_delimiter = ":";
		_input = buildString(50, repeatStr(_delimiter, 10));

-    int len = input.length();
+    int len = input.length();
+    if (len == 0) {
+      // Align behavior with splitPart(input, delimiter, limit, index) for empty input
+      return "null";
+    }

	// compare with {@link BenchmarkSplitPart} for future changes
	// compare with BenchmarkSplitPart (perf module) for future changes

Conversation

justahuman1 commented Feb 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

JMH Results

Allocation via -prof gc

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

codecov-commenter commented Feb 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

justahuman1 commented Feb 13, 2026

Uh oh!

Jackie-Jiang commented Feb 18, 2026

Uh oh!

justahuman1 commented Mar 7, 2026

Behavioral difference from old implementation

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 8, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 8, 2026

Choose a reason for hiding this comment

Uh oh!

justahuman1 Mar 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 8, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

justahuman1 commented Mar 9, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

justahuman1 commented Feb 7, 2026 •

edited

Loading

Allocation via `-prof gc`

codecov-commenter commented Feb 13, 2026 •

edited

Loading

justahuman1 Mar 9, 2026 •

edited

Loading