Skip to content

[opt](build) optimize column code and move function into catalog module#61574

Open
morrySnow wants to merge 1 commit intoapache:masterfrom
morrySnow:expr-refactor
Open

[opt](build) optimize column code and move function into catalog module#61574
morrySnow wants to merge 1 commit intoapache:masterfrom
morrySnow:expr-refactor

Conversation

@morrySnow
Copy link
Contributor

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Copy link
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@morrySnow
Copy link
Contributor Author

run buildall

@morrySnow
Copy link
Contributor Author

/review

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Summary

This PR refactors FE code to move Function, ScalarFunction, AggregateFunction, FunctionName from fe-core to fe-catalog, extracts GsonUtilsBase from GsonUtils, moves utility classes (Id, IdGenerator, TreeNode, URI) to fe-common, introduces Function.BinaryType enum to decouple from Thrift, and removes dead code from Expr.java and TreeNode.java.

Critical Checkpoints

1. Goal and correctness: The stated goal is refactoring with no logic changes. However, a correctness bug was introduced in FoldConstantRuleOnBE.java (see inline comment). The AggregateFunction copy constructor improvement is a positive change.

2. Modification minimality: The PR is large (67 files, ~2200 lines changed) but mostly mechanical moves. The dead code removal and method relocation are reasonable.

3. Concurrency: No new concurrency concerns introduced.

4. Lifecycle / static initialization: GsonUtilsBase.GSON is a static final field in fe-common. GsonUtils in fe-core extends the builder and creates its own GSON. No static initialization order fiasco risk since GsonUtils.GSON_BUILDER references GsonUtilsBase.GSON_BUILDER which is in its own module dependency.

5. Configuration items: None added.

6. Incompatible changes: Function.BinaryType enum names match TFunctionBinaryType exactly, so valueOf(fn.getBinaryType().name()) conversion is safe. Serialization format uses @SerializedName on the binaryType field (bt), so GSON will serialize the new enum by name — this is compatible with old data as long as the enum names match, which they do.

7. Parallel code paths: The method relocations (e.g., extractSlots, getInputSlotRef, unwrapCast, getConjuncts) are correctly moved to their only call sites. The ScanNode.unwrapCast() is functionally equivalent to Expr.unwrapExpr(false) when implicitOnly=false.

8. Test coverage: The PR claims no tests needed due to refactoring. However, the FoldConstantRuleOnBE change is a logic bug, not just refactoring. No regression test covers the specific scenario of sibling constant sub-expressions.

9. Observability: No changes needed.

10. Transaction/persistence: Function.write() now uses GsonUtilsBase.GSON instead of GsonUtils.GSON. Since GsonUtilsBase.GSON lacks the runtime type adapters registered in GsonUtils.GSON, this could be an issue for Function subclasses that need polymorphic deserialization. However, Function serialization uses @SerializedName fields directly and doesn't appear to rely on RuntimeTypeAdapterFactory for its own hierarchy, so this is likely safe.

11. Performance: No performance issues introduced.

Issues Found

  • [CRITICAL] FoldConstantRuleOnBE.collectConst(): Replacing IdGenerator<ExprId> with Integer breaks constant folding for expressions with multiple constant siblings. Java Integer is immutable and passed by value — the id++ inside collectConst only modifies the local copy. Sibling constants get the same ID, causing map key collisions and silent data loss. This can produce wrong query results.

  • [MEDIUM] ExprToSqlVisitor.visitSlotRef(): expr.getSlotId() != null and expr.getSlotId() == null will throw NullPointerException when expr.desc is null, because SlotRef.getSlotId() calls Preconditions.checkNotNull(desc) internally. The old code used expr.desc != null which was a safe null check. Should use expr.getDesc() != null instead.

public static Expression foldByBE(ExpressionMatchingContext<Expression> context) {
IdGenerator<ExprId> idGenerator = ExprId.createGenerator();
Integer idGenerator = 0;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[CRITICAL BUG] Integer is immutable in Java and passed by value. Replacing IdGenerator<ExprId> (a mutable object with shared state) with Integer breaks the ID generation logic.

When collectConst is called recursively for sibling children in the else branch (line 223), each sibling receives the same id value because the id++ on line 205 only modifies the local copy — the increment is invisible to the caller and subsequent siblings.

Example: For f(g(1), h(2)) where both g(1) and h(2) are constant:

  1. collectConst(f(...), ..., id=0) → not fully constant, enters else
  2. collectConst(g(1), ..., id=0) → constant, id++ → local id=1, puts "1" → g(1)
  3. collectConst(h(2), ..., id=0) → same id=0 from parent! → id++ → local id=1, puts "1" → h(2) overwriting g(1)

Result: g(1) is silently lost from the map. The replace() phase may substitute g(1)'s position with h(2)'s folded result, producing wrong query results.

Fix: Use AtomicInteger (already imported in this file!) instead of Integer:

AtomicInteger idGenerator = new AtomicInteger(0);
// ...
String id = String.valueOf(idGenerator.incrementAndGet());

&& expr.desc != null) {
return expr.getLabel() + "[#" + expr.desc.getId().asInt() + "]";
&& expr.getSlotId() != null) {
return expr.getLabel() + "[#" + expr.getSlotId().asInt() + "]";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[BUG] expr.getSlotId() calls Preconditions.checkNotNull(desc) internally (see SlotRef.java:130). If desc is null, this will throw NullPointerException instead of evaluating to null. The old code expr.desc != null was a safe null check.

This line and the one at line 182 (expr.getSlotId() == null) should use expr.getDesc() != null / expr.getDesc() == null instead, to preserve the original null-safety semantics.

}
} else if (expr.desc == null) {
} else if (expr.getSlotId() == null) {
return "`" + expr.getCol() + "`";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[BUG] Same issue as above: expr.getSlotId() == null will throw NullPointerException when desc is null, because SlotRef.getSlotId() internally calls Preconditions.checkNotNull(desc). The old code used expr.desc == null which was safe.

Should be: expr.getDesc() == null

@doris-robot
Copy link

TPC-H: Total hot run time: 26886 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 2c890b8be6ac6b3ebec1ffc882122351f2e449e9, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17650	4450	4291	4291
q2	q3	10652	811	514	514
q4	4671	364	250	250
q5	7576	1189	1033	1033
q6	172	175	145	145
q7	775	832	665	665
q8	9525	1451	1313	1313
q9	5226	4798	4679	4679
q10	6308	1956	1650	1650
q11	443	242	230	230
q12	750	593	462	462
q13	18047	2927	2163	2163
q14	231	239	213	213
q15	q16	749	736	668	668
q17	744	808	491	491
q18	5879	5374	5324	5324
q19	1197	974	607	607
q20	547	489	370	370
q21	4584	1850	1523	1523
q22	500	393	295	295
Total cold run time: 96226 ms
Total hot run time: 26886 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4734	4650	4566	4566
q2	q3	3986	4449	3824	3824
q4	865	1214	785	785
q5	4044	4364	4450	4364
q6	207	192	149	149
q7	1761	1601	1559	1559
q8	2470	2705	2642	2642
q9	7555	7313	7341	7313
q10	3779	4061	3788	3788
q11	525	440	417	417
q12	493	600	437	437
q13	2845	3212	2285	2285
q14	280	309	293	293
q15	q16	731	808	749	749
q17	1229	1380	1394	1380
q18	7315	7034	6774	6774
q19	893	930	910	910
q20	2084	2174	2020	2020
q21	3958	3469	3358	3358
q22	464	430	376	376
Total cold run time: 50218 ms
Total hot run time: 47989 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 168077 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2c890b8be6ac6b3ebec1ffc882122351f2e449e9, data reload: false

query5	4322	629	506	506
query6	320	237	216	216
query7	4215	476	274	274
query8	357	263	228	228
query9	8718	2672	2716	2672
query10	478	441	333	333
query11	6961	5099	4918	4918
query12	178	128	128	128
query13	1272	470	357	357
query14	5744	3711	3455	3455
query14_1	2831	2801	2851	2801
query15	211	203	179	179
query16	987	473	472	472
query17	897	772	646	646
query18	2446	456	364	364
query19	218	217	197	197
query20	132	131	128	128
query21	220	140	111	111
query22	13322	13864	14413	13864
query23	16316	15738	15674	15674
query23_1	15871	15639	15828	15639
query24	7212	1586	1221	1221
query24_1	1218	1228	1235	1228
query25	538	458	404	404
query26	1241	254	143	143
query27	2776	478	291	291
query28	4481	1842	1824	1824
query29	864	569	473	473
query30	296	229	191	191
query31	1000	949	882	882
query32	86	68	71	68
query33	510	334	274	274
query34	882	870	519	519
query35	649	681	589	589
query36	1107	1111	1034	1034
query37	138	95	81	81
query38	2956	2971	2867	2867
query39	849	828	809	809
query39_1	794	836	795	795
query40	229	170	137	137
query41	73	69	57	57
query42	299	251	257	251
query43	258	259	230	230
query44	
query45	200	196	187	187
query46	908	1007	624	624
query47	2114	2143	2087	2087
query48	313	324	246	246
query49	629	480	384	384
query50	708	289	231	231
query51	4119	4045	4000	4000
query52	263	267	257	257
query53	286	336	283	283
query54	312	275	261	261
query55	90	86	84	84
query56	319	349	310	310
query57	1951	1722	1699	1699
query58	282	266	267	266
query59	2793	2962	2725	2725
query60	350	333	325	325
query61	147	151	157	151
query62	617	593	548	548
query63	315	286	278	278
query64	5162	1277	976	976
query65	
query66	1489	457	382	382
query67	24353	24282	24103	24103
query68	
query69	403	300	287	287
query70	977	989	853	853
query71	331	300	294	294
query72	2973	2833	2339	2339
query73	545	560	320	320
query74	9590	9562	9420	9420
query75	2882	2738	2458	2458
query76	2291	1035	699	699
query77	357	397	312	312
query78	10968	11144	10469	10469
query79	1101	771	572	572
query80	1368	617	530	530
query81	544	261	230	230
query82	1006	148	132	132
query83	348	259	241	241
query84	294	117	106	106
query85	927	482	442	442
query86	421	290	292	290
query87	3148	3105	3017	3017
query88	3537	2645	2641	2641
query89	416	359	342	342
query90	2018	175	167	167
query91	172	160	132	132
query92	78	74	74	74
query93	915	846	501	501
query94	629	323	312	312
query95	599	408	313	313
query96	656	523	223	223
query97	2500	2485	2444	2444
query98	243	221	231	221
query99	947	996	913	913
Total cold run time: 248854 ms
Total hot run time: 168077 ms

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants