
feat: add PythonPlugin with codeflash_core/codeflash_python packages#1887

Open
KRRT7 wants to merge 10 commits into main from codeflash_python

Conversation

@KRRT7
Collaborator

@KRRT7 KRRT7 commented Mar 24, 2026

Summary

  • Add codeflash_core (language-agnostic protocols, config, models) and codeflash_python (~115 files, Python language plugin) under src/
  • Unify FunctionToOptimize, FunctionParent, HelperFunction, and TestConfig as single sources of truth from codeflash_core
  • Wire PythonPlugin into the orchestrator for build_index, rank_functions, cleanup_run, and get_call_graph_index
  • Replace current_language_support().function_optimizer_class with direct PythonFunctionOptimizer import
  • Remove superseded ranking methods (rank_all_functions_globally, rank_by_dependency_count, display_global_ranking)
  • Remove FunctionInfo alias — all code now imports FunctionToOptimize directly

Commits

  1. feat: add codeflash_core and codeflash_python packages with unified types — bulk migration of core types and Python plugin
  2. refactor: wire PythonPlugin into orchestrator and remove dead ranking code — route orchestrator through plugin, delete dead methods
  3. refactor: remove FunctionInfo alias, use FunctionToOptimize directly — cleanup across 16 files

Test plan

  • uv run pytest tests/ passes
  • uv run prek clean on all changed files
  • Verify optimization pipeline still works end-to-end

@claude
Contributor

claude bot commented Mar 24, 2026

Claude finished @KRRT7's task in 24m 53s


PR Review Summary

  • Triage PR size — LARGE (258 files, +32971/-1174)
  • Run lint/typecheck — prek passed, no issues
  • Resolve stale threads — none found
  • Code review
  • Duplicate detection
  • Test coverage
  • Merge optimization PRs — none open

Prek Checks

uv run prek run --from-ref origin/main passed cleanly. No auto-fixes needed.


Code Review

🔴 Critical Bugs (runtime crashes)

1. Missing methods on PythonPlugin — will crash every optimization run

optimizer.py:291 calls self.plugin.setup_test_config(self.test_cfg) and optimizer.py:402 calls self.plugin.get_call_graph_index(), but neither method exists on PythonPlugin. Every call to Optimizer.run() that finds functions to optimize will immediately raise AttributeError.
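A gap like this can be caught before runtime with a small conformance check. The sketch below is illustrative only: `REQUIRED_PLUGIN_METHODS`, `missing_plugin_methods`, and `IncompletePlugin` are hypothetical names, and the method list is taken from the calls mentioned in this PR rather than from the actual plugin protocol.

```python
from __future__ import annotations

# Methods the orchestrator is described as calling in this PR (assumed list).
REQUIRED_PLUGIN_METHODS = (
    "build_index",
    "rank_functions",
    "cleanup_run",
    "get_call_graph_index",
    "setup_test_config",
)


def missing_plugin_methods(plugin: object, required=REQUIRED_PLUGIN_METHODS) -> list[str]:
    """Return the required method names that `plugin` does not provide."""
    return [name for name in required if not callable(getattr(plugin, name, None))]


class IncompletePlugin:
    """Hypothetical stand-in for a plugin that lacks two of the methods."""

    def build_index(self, *args, **kwargs):
        pass

    def rank_functions(self, *args, **kwargs):
        return []

    def cleanup_run(self):
        pass


missing = missing_plugin_methods(IncompletePlugin())
print(missing)
```

Running a check like this in a unit test would turn the AttributeError at `Optimizer.run()` time into a failing test at CI time.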


2. runtime field is None — regression in 7+ tests

codeflash/optimization/optimizer.py:205 now imports FunctionOptimizer from codeflash_python.function_optimizer instead of codeflash.languages.python.function_optimizer. The codeflash_python version of run_and_parse_tests uses codeflash_python.verification internally, which does not populate TestInvocation.runtime in the same way. The following tests all fail with TypeError: '>' not supported between instances of 'NoneType' and 'int':

  • test_bubble_sort_behavior_results
  • test_method_full_instrumentation
  • test_classmethod_full_instrumentation
  • test_staticmethod_full_instrumentation
  • test_class_method_test_instrumentation_only
  • test_class_method_full_instrumentation
  • test_sync_function_behavior_in_async_test_environment
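The TypeError above is what Python raises when `None` is compared with an integer. A minimal sketch of the failure and of an explicit guard follows; `Invocation` is a hypothetical stand-in for `TestInvocation`, and `assert_runtimes_populated` is an illustrative helper, not part of the codebase.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Invocation:
    """Hypothetical stand-in for TestInvocation."""

    test_name: str
    runtime: Optional[int] = None  # None when the verifier did not populate it


def reproduces_type_error(inv: Invocation) -> bool:
    """True if `inv.runtime > 0` raises TypeError, as in the failing tests."""
    try:
        inv.runtime > 0
    except TypeError:
        return True
    return False


def assert_runtimes_populated(invocations: list[Invocation]) -> None:
    """Fail with a descriptive error instead of an opaque comparison failure."""
    missing = [inv.test_name for inv in invocations if inv.runtime is None]
    if missing:
        raise ValueError(f"runtime not populated for: {', '.join(missing)}")
```

A guard of this kind does not fix the regression, but it would point directly at the unpopulated field.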

3. FunctionToOptimize used without import in tests

tests/test_languages/test_python_support.py:265 and tests/test_languages/test_language_parity.py use FunctionToOptimize at module/class scope, but the top-level import was replaced with ParentInfo only (FunctionInfo alias was removed, FunctionToOptimize was not added). This causes NameError: name 'FunctionToOptimize' is not defined in TestReplaceFunction and TestExtractCodeContextParity.
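Because a class body runs when the module is imported, a name used at class scope fails at collection time, before any test executes. The sketch below reproduces that with an inline module; the module text and `name_error_on_import` are hypothetical, and `object` merely stands in for the real import.

```python
from __future__ import annotations

BROKEN_TEST_MODULE = "\n".join(
    [
        "class TestReplaceFunction:",
        "    # Evaluated while the class body runs, i.e. at import time.",
        "    target_type = FunctionToOptimize",
    ]
)

# Same module with the name bound first (stands in for the missing import).
FIXED_TEST_MODULE = "FunctionToOptimize = object\n" + BROKEN_TEST_MODULE


def name_error_on_import(source: str) -> str | None:
    """Execute `source` as a module body and return the NameError text, if any."""
    try:
        exec(compile(source, "<test_module>", "exec"), {})
    except NameError as exc:
        return str(exc)
    return None


print(name_error_on_import(BROKEN_TEST_MODULE))
```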


4. Behavioral regression in test_pickle_patcher.py::test_run_and_parse_picklepatch

compare_test_results now returns True where it previously returned False for unpickleable objects. The test's assert not match fails. This may be related to the codeflash_python comparator logic differing from the one in codeflash/verification/comparator.py.
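The behavior the test expects can be sketched as follows: a value that could not be pickled is replaced by a marker, and any comparison involving a marker counts as a mismatch. This is an assumption-laden illustration, not the project's comparator; `Unpicklable`, `snapshot`, and `results_match` are hypothetical names.

```python
import pickle
import threading


class Unpicklable:
    """Hypothetical marker recorded in place of a value that cannot be pickled."""

    def __init__(self, type_name: str) -> None:
        self.type_name = type_name


def snapshot(value):
    """Round-trip `value` through pickle, or return a marker if that fails."""
    try:
        return pickle.loads(pickle.dumps(value))
    except Exception:  # pickling can raise several unrelated exception types
        return Unpicklable(type(value).__name__)


def results_match(original, candidate) -> bool:
    """Treat any unpicklable value as a mismatch, since equality is unknown."""
    if isinstance(original, Unpicklable) or isinstance(candidate, Unpicklable):
        return False
    return original == candidate


# A lock cannot be pickled, so two captured locks must not compare as equal.
print(results_match(snapshot(threading.Lock()), snapshot(threading.Lock())))
```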

🟡 Design Issues

5. codeflash_core.models imports from codeflash — violates layering

src/codeflash_core/models.py:65: qualified_name_with_modules_from_root does from codeflash.code_utils.code_utils import module_name_from_file_path. This makes the "language-agnostic core" depend on the codeflash package, creating a circular layering dependency. Either move module_name_from_file_path to codeflash_core, or move this method out of the core model.
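If the helper only maps a file path to a dotted module name, the first option could be as small as the sketch below. This is written under that assumption and is not a copy of the existing `module_name_from_file_path`; it uses only the standard library, so the core package would not need to import from codeflash.

```python
from pathlib import Path


def module_name_from_file_path(file_path: Path, project_root: Path) -> str:
    """Map a source file to a dotted module name relative to `project_root`.

    Pure path arithmetic: no filesystem access and no codeflash imports.
    Raises ValueError if `file_path` is not under `project_root`.
    """
    relative = file_path.relative_to(project_root)
    return ".".join(relative.with_suffix("").parts)


print(module_name_from_file_path(Path("/repo/src/pkg/mod.py"), Path("/repo/src")))
```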


6. call_graph_live_display runs but build_index is a no-op

optimizer.py:312 shows a progress bar for building a call graph index, then calls self.plugin.build_index(...), whose body is only pass, so it returns immediately without building anything. Users will see a call graph progress indicator that does nothing. Consider either wiring the real index or removing the progress bar.
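One way to avoid a progress display for a no-op, sketched under the assumption that plugins inherit a default `build_index` from a common base: show the display only when the plugin overrides that default. `BasePlugin`, `NoIndexPlugin`, `IndexingPlugin`, and `overrides_build_index` are hypothetical names.

```python
class BasePlugin:
    """Hypothetical base whose build_index is the default no-op."""

    def build_index(self, files):
        pass


class NoIndexPlugin(BasePlugin):
    """Inherits the no-op, like the plugin described in this review."""


class IndexingPlugin(BasePlugin):
    """Provides a real implementation."""

    def build_index(self, files):
        self.indexed = list(files)


def overrides_build_index(plugin: BasePlugin) -> bool:
    """True when the plugin's class supplies its own build_index."""
    return type(plugin).build_index is not BasePlugin.build_index


for plugin in (NoIndexPlugin(), IndexingPlugin()):
    if overrides_build_index(plugin):
        print(f"{type(plugin).__name__}: show progress, then build the index")
        plugin.build_index(["a.py", "b.py"])
    else:
        print(f"{type(plugin).__name__}: skip the progress display")
```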

7. call_graph_summary still exported from console.py but never called

After removing the call_graph_summary(resolver, ...) call in the optimizer, the function in codeflash/cli_cmds/console.py:386 is now dead code.


Duplicate Detection

HIGH confidence duplicates — code exists in both codeflash/ and src/codeflash_python/:

| File in codeflash/ | Duplicate in src/codeflash_python/ | Notes |
| --- | --- | --- |
| codeflash/benchmarking/instrument_codeflash_trace.py (123 lines) | src/codeflash_python/benchmarking/instrument_codeflash_trace.py (130 lines) | Near-identical; differs only in import path (codeflash.code_utils vs codeflash_python.code_utils) |
| codeflash/benchmarking/function_ranker.py (330 lines) | src/codeflash_python/benchmarking/function_ranker.py (239 lines) | Both define FunctionRanker with rank_functions and get_function_addressable_time |

optimizer.py now imports from codeflash_python.benchmarking, while the codeflash/benchmarking/ versions are kept alongside. The two copies will diverge over time; one should be canonical.
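Until one copy is made canonical, drift between the pairs could be measured with a line-level similarity score that ignores the package prefix. The sketch below uses `difflib.SequenceMatcher` from the standard library; `normalized` and `similarity` are hypothetical helpers, and the two inline sources are made-up examples rather than excerpts from the files above.

```python
import difflib


def normalized(source: str) -> str:
    """Rewrite the new package prefix to the old one so imports compare equal."""
    return source.replace("codeflash_python.", "codeflash.")


def similarity(a: str, b: str) -> float:
    """Line-level similarity in [0, 1] after normalizing import paths."""
    matcher = difflib.SequenceMatcher(
        None, normalized(a).splitlines(), normalized(b).splitlines()
    )
    return matcher.ratio()


old_copy = "from codeflash.code_utils import helper\n\ndef run():\n    return helper()\n"
new_copy = "from codeflash_python.code_utils import helper\n\ndef run():\n    return helper()\n"

print(similarity(old_copy, new_copy))
```

In practice the two strings would come from `Path(...).read_text()` on each file of a pair.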


Test Coverage

5 test files affected by this PR have test failures (7 Python, several Java). Java failures appear pre-existing (Java tracer infra issues unrelated to this PR). The Python failures listed above are regressions introduced by this PR.

New files codeflash/plugin.py, codeflash/plugin_ai_ops.py, codeflash/plugin_helpers.py, codeflash/plugin_results.py, codeflash/plugin_test_lifecycle.py, and codeflash/verification/test_runner.py (1,688 lines total) have no direct unit tests.


Base automatically changed from codeflash-core to main March 24, 2026 09:43
@KRRT7 KRRT7 force-pushed the codeflash_python branch from 163de13 to 686c3ae March 24, 2026 09:44
@KRRT7 KRRT7 changed the title from "Add PythonPlugin adapter wiring codeflash to codeflash_core" to "PythonPlugin adapter + unified core types" Mar 24, 2026
@codeflash-ai
Contributor

codeflash-ai bot commented Mar 24, 2026

⚡️ Codeflash found optimizations for this PR

📄 34% (0.34x) speedup for __getattr__ in codeflash/languages/__init__.py

⏱️ Runtime : 532 microseconds → 396 microseconds (best of 250 runs)
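The reported percentages in this thread appear consistent with original runtime divided by optimized runtime, minus one. That formula is inferred from the figures posted here, not taken from Codeflash documentation, and `speedup` is a hypothetical helper.

```python
def speedup(original: float, optimized: float) -> float:
    """Fractional speedup: 0.34 means the original took 34% longer."""
    return original / optimized - 1.0


# Runtime pairs quoted in this thread (units cancel out).
for original, optimized in [(532, 396), (599, 513), (8.13, 6.73)]:
    print(f"{original} -> {optimized}: {speedup(original, optimized):.0%}")
```

Small differences on the larger figures (for example 58.3 ms to 1.67 ms) would be expected, since the displayed runtimes are rounded.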

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch codeflash_python).


@codeflash-ai
Contributor

codeflash-ai bot commented Mar 24, 2026

⚡️ Codeflash found optimizations for this PR

📄 17% (0.17x) speedup for __getattr__ in codeflash/languages/base.py

⏱️ Runtime : 599 microseconds → 513 microseconds (best of 250 runs)

A new Optimization Review has been created.

🔗 Review here


@KRRT7 KRRT7 force-pushed the codeflash_python branch from 6a547d2 to 535adfe March 24, 2026 12:20
@KRRT7 KRRT7 changed the title from "PythonPlugin adapter + unified core types" to "feat: add PythonPlugin with codeflash_core/codeflash_python packages" Mar 24, 2026
@KRRT7 KRRT7 force-pushed the codeflash_python branch from 535adfe to 55bc577 March 24, 2026 12:43
KRRT7 added 6 commits March 24, 2026 07:47
The codeflash_python verifier's generate_tests expected a Path but
callers pass TestConfig. Match the old verifier's signature.
… mismatch

The old AiServiceClient creates OptimizedCandidate from codeflash.models
but the new OptimizationSet expects them from codeflash_python.models.
Delete src/codeflash_python/models/models.py and update all 53 files
to import from codeflash.models.models — single source of truth.
Add ruff per-file-ignores for pre-existing PTH110, PTH123, PD011, E721
in src/codeflash_python/. Fix TC003 in addopts.py.
@codeflash-ai
Contributor

codeflash-ai bot commented Mar 24, 2026

⚡️ Codeflash found optimizations for this PR

📄 3,389% (33.89x) speedup for PythonPlugin.normalize_code in codeflash/plugin.py

⏱️ Runtime : 58.3 milliseconds → 1.67 milliseconds (best of 17 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch codeflash_python).


@codeflash-ai
Contributor

codeflash-ai bot commented Mar 24, 2026

⚡️ Codeflash found optimizations for this PR

📄 2,465% (24.65x) speedup for PluginAiOpsMixin.repair_candidate in codeflash/plugin_ai_ops.py

⏱️ Runtime : 244 milliseconds → 9.52 milliseconds (best of 139 runs)

A new Optimization Review has been created.

🔗 Review here


@codeflash-ai
Contributor

codeflash-ai bot commented Mar 24, 2026

⚡️ Codeflash found optimizations for this PR

📄 251% (2.51x) speedup for replace_function_simple in codeflash/plugin_helpers.py

⏱️ Runtime : 80.6 milliseconds → 23.0 milliseconds (best of 109 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch codeflash_python).


The hot loop that processes invocation IDs now hoists three expensive operations outside the loop: `current_language_support()` (which imports and instantiates a registry lookup costing ~29 ms), `tests_root.resolve()` (filesystem stat calls adding ~1 ms), and constructing the Jest extensions tuple (repeated allocation overhead). Profiler data confirms `current_language_support()` consumed 99.8% of its 28.8 ms call time in a registry import, and moving it before the loop eliminates 17 redundant calls. Additionally, the optimized version skips `tabulate()` calls when row lists are empty, saving ~6-13 ms per empty table (three tables checked per invocation). These changes reduce the function's total time from 54.9 ms to 48.7 ms with no regressions.
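The hoisting described above follows a general pattern: a call whose result does not change between iterations is made once before the loop. The sketch below shows the idea with a counting stand-in; `make_counting_lookup`, `process_in_loop`, and `process_hoisted` are hypothetical, and the lookup only imitates an expensive call such as `current_language_support()`.

```python
def make_counting_lookup():
    """Return a fake 'expensive' lookup plus a counter of how often it ran."""
    calls = {"n": 0}

    def lookup():
        calls["n"] += 1
        return {"suffix": ".py"}

    return lookup, calls


def process_in_loop(invocation_ids, lookup):
    # Calls the lookup once per item, even though the result never changes.
    return [inv + lookup()["suffix"] for inv in invocation_ids]


def process_hoisted(invocation_ids, lookup):
    # Loop-invariant work is done once, before iterating.
    suffix = lookup()["suffix"]
    return [inv + suffix for inv in invocation_ids]


ids = [f"test_{i}" for i in range(17)]

slow_lookup, slow_calls = make_counting_lookup()
fast_lookup, fast_calls = make_counting_lookup()

assert process_in_loop(ids, slow_lookup) == process_hoisted(ids, fast_lookup)
print(slow_calls["n"], fast_calls["n"])
```

Both versions return the same list; only the number of lookup calls differs.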
@codeflash-ai
Contributor

codeflash-ai bot commented Mar 24, 2026

⚡️ Codeflash found optimizations for this PR

📄 21% (0.21x) speedup for existing_tests_source_for in codeflash/result/create_pr.py

⏱️ Runtime : 8.13 milliseconds → 6.73 milliseconds (best of 250 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch codeflash_python).


…2026-03-24T18.24.01

⚡️ Speed up function `existing_tests_source_for` by 21% in PR #1887 (`codeflash_python`)
@codeflash-ai
Contributor

codeflash-ai bot commented Mar 24, 2026

This PR is now faster! 🚀 @claude[bot] accepted my optimizations from:

@codeflash-ai
Contributor

codeflash-ai bot commented Mar 24, 2026

⚡️ Codeflash found optimizations for this PR

📄 125% (1.25x) speedup for pytest_cmd_tokens in codeflash/verification/test_runner.py

⏱️ Runtime : 4.67 milliseconds → 2.08 milliseconds (best of 126 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch codeflash_python).

