
⚡️ Speed up function existing_tests_source_for by 21% in PR #1887 (codeflash_python) #1892

Merged
claude[bot] merged 1 commit into codeflash_python from
codeflash/optimize-pr1887-2026-03-24T18.24.01
Mar 24, 2026

Conversation


@codeflash-ai codeflash-ai bot commented Mar 24, 2026

⚡️ This pull request contains optimizations for PR #1887

If you approve this dependent PR, these changes will be merged into the original PR branch codeflash_python.

This PR will be automatically closed if the original PR is merged.


📄 21% (0.21x) speedup for existing_tests_source_for in codeflash/result/create_pr.py

⏱️ Runtime: 8.13 milliseconds → 6.73 milliseconds (best of 250 runs)

📝 Explanation and details

The hot loop that processes invocation IDs now hoists three expensive operations outside the loop: `current_language_support()` (which imports and instantiates a registry lookup costing ~29 ms), `tests_root.resolve()` (filesystem stat calls adding ~1 ms), and constructing the Jest extensions tuple (repeated allocation overhead). Profiler data confirms `current_language_support()` consumed 99.8% of its 28.8 ms call time in a registry import, and moving it before the loop eliminates 17 redundant calls. Additionally, the optimized version skips `tabulate()` calls when row lists are empty, saving ~6-13 ms per empty table (three tables checked per invocation). These changes reduce the function's total time from 54.9 ms to 48.7 ms with no regressions.
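The hoist-and-short-circuit pattern described above can be sketched as follows. This is an illustrative sketch only: `expensive_lookup`, `format_rows`, and `process` are hypothetical stand-ins for `current_language_support()`, `tabulate()`, and the invocation-ID loop, not the actual `create_pr.py` code.

```python
# Illustrative hoisting sketch; all names are hypothetical stand-ins,
# not the real create_pr.py implementation.

def expensive_lookup():
    # Stand-in for current_language_support(), a costly registry lookup.
    return "python"

def format_rows(rows):
    # Stand-in for tabulate(); assume it is expensive even on empty input.
    return "\n".join(f"{lang}: {inv}" for lang, inv in rows)

def process(invocation_ids):
    # Hoisted out of the loop: computed once instead of once per iteration.
    language = expensive_lookup()
    jest_extensions = (".test.js", ".test.jsx", ".test.ts")

    rows = []
    for inv in invocation_ids:
        if inv.endswith(jest_extensions):
            continue  # skip Jest test files, mirroring the extension check
        rows.append((language, inv))

    # Short-circuit: skip the expensive formatter when there is nothing to show.
    return format_rows(rows) if rows else ""
```

The key point is that nothing inside the loop depends on loop state, so the hoisted values are safe to compute once, and the empty-rows check avoids paying the formatter's fixed cost for tables that would be discarded anyway.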

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 16 Passed
🌀 Generated Regression Tests 12 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 79.7%
⚙️ Existing Unit Tests (details)
Test File::Test Function Original ⏱️ Optimized ⏱️ Speedup
test_existing_tests_source_for.py::ExistingTestsSourceForTests.test_mixed_results_and_min_runtime 629μs 559μs 12.5%✅
test_existing_tests_source_for.py::ExistingTestsSourceForTests.test_no_runtime_data 215μs 85.5μs 152%✅
test_existing_tests_source_for.py::ExistingTestsSourceForTests.test_no_tests_for_function 12.8μs 12.8μs -0.148%⚠️
test_existing_tests_source_for.py::ExistingTestsSourceForTests.test_with_existing_test_speedup 419μs 323μs 29.7%✅
test_existing_tests_source_for.py::ExistingTestsSourceForTests.test_with_replay_and_concolic_tests_slowdown 629μs 555μs 13.3%✅
test_existing_tests_source_for.py::TestExistingTestsSourceFor.test_complex_module_path_conversion 628μs 547μs 14.8%✅
test_existing_tests_source_for.py::TestExistingTestsSourceFor.test_filters_out_generated_tests 675μs 583μs 15.8%✅
test_existing_tests_source_for.py::TestExistingTestsSourceFor.test_missing_optimized_runtime 443μs 322μs 37.4%✅
test_existing_tests_source_for.py::TestExistingTestsSourceFor.test_missing_original_runtime 442μs 325μs 36.2%✅
test_existing_tests_source_for.py::TestExistingTestsSourceFor.test_multiple_runtimes_uses_minimum 634μs 541μs 17.1%✅
test_existing_tests_source_for.py::TestExistingTestsSourceFor.test_multiple_tests_sorted_output 915μs 786μs 16.4%✅
test_existing_tests_source_for.py::TestExistingTestsSourceFor.test_no_test_files_returns_empty_string 12.3μs 12.1μs 1.41%✅
test_existing_tests_source_for.py::TestExistingTestsSourceFor.test_single_test_with_improvement 642μs 549μs 17.0%✅
test_existing_tests_source_for.py::TestExistingTestsSourceFor.test_single_test_with_regression 640μs 548μs 16.7%✅
test_existing_tests_source_for.py::TestExistingTestsSourceFor.test_test_without_class_name 631μs 535μs 18.0%✅
test_existing_tests_source_for.py::TestExistingTestsSourceFor.test_zero_runtime_values 438μs 325μs 35.0%✅
🌀 Generated Regression Tests (details)
from pathlib import Path

# imports
# Import the function under test and related config class
from codeflash.result.create_pr import existing_tests_source_for
from codeflash_core.config import TestConfig


def test_returns_empty_strings_when_no_tests_present():
    # If function_to_tests has no entry for the requested function, the function
    # should return three empty strings (no existing, replay or concolic tests).
    func_name = "some.module.func"
    function_to_tests = {}  # no tests registered at all
    test_cfg = TestConfig(tests_root=Path(), project_root=Path())
    # empty runtime dictionaries (not used because no tests are present)
    original_runtimes_all = {}
    optimized_runtimes_all = {}
    # Call the function
    existing, replay, concolic = existing_tests_source_for(
        func_name, function_to_tests, test_cfg, original_runtimes_all, optimized_runtimes_all, None
    )  # 11.8μs -> 11.9μs (0.253% slower)
    # Expect all three outputs to be empty strings
    assert existing == ""
    assert replay == ""
    assert concolic == ""


def test_returns_empty_when_key_missing_but_other_keys_present():
    # When the mapping exists but not for the requested function, it should still
    # return empty strings.
    func_name = "target.module.func"
    # mapping has other keys but not the target; this should be treated the same
    function_to_tests = {"unrelated.module.other": set()}
    test_cfg = TestConfig(tests_root=Path(), project_root=Path())
    existing, replay, concolic = existing_tests_source_for(
        func_name, function_to_tests, test_cfg, {}, {}, None
    )  # 12.5μs -> 12.8μs (2.36% slower)
    assert existing == ""
    assert replay == ""
    assert concolic == ""


def test_handles_empty_and_none_like_parameters_gracefully():
    # The function should handle empty dicts and None for optional registry.
    func_name = ""  # an empty function name (edge-case string)
    function_to_tests = {}  # no tests
    # Create a TestConfig whose tests_root differs from project_root to
    # exercise the branch where language resolution falls back to project_root.
    test_cfg = TestConfig(tests_root=Path("tests"), project_root=Path())
    # Provide non-empty runtime dicts (they should be ignored because no tests)
    # Use simple hashable keys; since the code returns early, these won't be introspected.
    original_runtimes_all = {"dummy_key": [100]}
    optimized_runtimes_all = {"dummy_key": [80]}
    existing, replay, concolic = existing_tests_source_for(
        func_name, function_to_tests, test_cfg, original_runtimes_all, optimized_runtimes_all, None
    )  # 10.9μs -> 11.2μs (2.50% slower)
    # Still expect empty outputs because there are no test files associated.
    assert existing == ""
    assert replay == ""
    assert concolic == ""


def test_special_characters_in_function_name_do_not_crash():
    # Function names with unusual characters should not cause exceptions when no tests.
    func_name = "weird::name/with spaces and Ünicode"
    function_to_tests = {}  # no tests registered
    test_cfg = TestConfig(tests_root=Path("tests"), project_root=Path())
    existing, replay, concolic = existing_tests_source_for(
        func_name, function_to_tests, test_cfg, {}, {}, None
    )  # 11.3μs -> 11.7μs (3.44% slower)
    assert existing == ""
    assert replay == ""
    assert concolic == ""


def test_repeated_calls_with_no_tests_are_deterministic_and_fast():
    # Test with diverse, realistic inputs reflecting production usage patterns.
    # Vary test configurations, function names, and runtime data to exercise
    # different code paths and ensure the function handles diverse scenarios
    # without side effects or state mutations.
    test_cases = [
        {
            "func_name": "module.submodule.function_one",
            "function_to_tests": {},
            "original_runtimes": {},
            "optimized_runtimes": {},
        },
        {
            "func_name": "package.module.function_two",
            "function_to_tests": {"other.module.func": set()},
            "original_runtimes": {"inv_1": [100, 102, 101]},
            "optimized_runtimes": {"inv_1": [90, 88, 89]},
        },
        {
            "func_name": "deeply.nested.package.module.function_three",
            "function_to_tests": {},
            "original_runtimes": {"inv_2": [500, 510]},
            "optimized_runtimes": {"inv_2": [450, 440]},
        },
        {
            "func_name": "short.func",
            "function_to_tests": {"unrelated.test.module": set()},
            "original_runtimes": {},
            "optimized_runtimes": {},
        },
        {
            "func_name": "a.b.c.d.e.f.g",
            "function_to_tests": {},
            "original_runtimes": {"inv_3": [1000]},
            "optimized_runtimes": {"inv_3": [950]},
        },
        {
            "func_name": "test_util.helper.calculate",
            "function_to_tests": {"tests.existing": set()},
            "original_runtimes": {},
            "optimized_runtimes": {},
        },
    ]

    test_cfg = TestConfig(tests_root=Path("tests"), project_root=Path())

    for test_case in test_cases:
        existing, replay, concolic = existing_tests_source_for(
            test_case["func_name"],
            test_case["function_to_tests"],
            test_cfg,
            test_case["original_runtimes"],
            test_case["optimized_runtimes"],
            None,
        )  # 46.3μs -> 46.9μs (1.48% slower)
        # All calls should return empty strings because no matching test files
        # with actual test data are provided
        assert existing == ""
        assert replay == ""
        assert concolic == ""
# --- second generated test file ---
from pathlib import Path

# imports
from codeflash.models.models import InvocationId

# Imports from the modules under test
from codeflash.result.create_pr import existing_tests_source_for
from codeflash_core.config import TestConfig


def test_existing_tests_source_for_empty_function_to_tests():
    """Test core logic with empty function_to_tests - should process no tests."""
    test_cfg = TestConfig(tests_root=Path("/tests"), project_root=Path("/project"))

    invocation_id = InvocationId(
        test_module_path="tests.module",
        test_class_name=None,
        test_function_name="test_func",
        function_getting_tested="target_func",
        iteration_id="1",
    )

    result = existing_tests_source_for(
        function_qualified_name_with_modules_from_root="module.function",
        function_to_tests={},
        test_cfg=test_cfg,
        original_runtimes_all={invocation_id: [1000]},
        optimized_runtimes_all={invocation_id: [800]},
    )  # 11.7μs -> 12.0μs (1.93% slower)

    assert result == ("", "", "")
    assert len(result) == 3


def test_existing_tests_source_for_with_matching_function_but_empty_test_set():
    """Test when function exists in dict but has no test files."""
    test_cfg = TestConfig(tests_root=Path("/tests"), project_root=Path("/project"))

    result = existing_tests_source_for(
        function_qualified_name_with_modules_from_root="module.function",
        function_to_tests={"module.function": set()},
        test_cfg=test_cfg,
        original_runtimes_all={},
        optimized_runtimes_all={},
    )  # 12.4μs -> 13.1μs (5.51% slower)

    assert result == ("", "", "")

To edit these changes, run `git checkout codeflash/optimize-pr1887-2026-03-24T18.24.01` and push.


@codeflash-ai codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) labels on Mar 24, 2026

claude bot commented Mar 24, 2026

Claude finished @codeflash-ai[bot]'s task in 2m 39s.


PR Review Summary

  • Triage PR scope — SMALL (1 file, 60+/53−, optimization PR from codeflash-ai[bot])
  • Run lint/typecheck — prek passed ✅, no stale threads
  • Resolve stale review threads — none found
  • Review the diff
  • Duplicate detection
  • Handle other open optimization PRs

Prek Checks

✅ Passed — no issues.

Code Review

The optimization is correct and the speedup claim is credible. Three changes:

  1. current_language_support() hoisted before loop (create_pr.py:94) — the language doesn't change mid-function, so calling it once is safe and avoids ~17 redundant registry lookups.

  2. tests_root.resolve() hoisted before loop (create_pr.py:96-99) — the try/except fallback is a nice safety net. Correct.

  3. Skip tabulate() for empty row lists (create_pr.py:265-291) — the original code called tabulate() then immediately discarded the result when rows were empty. The new code skips the call entirely. Functionally equivalent since output_existing/concolic/replay are all initialized to "".
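The resolve-with-fallback safety net noted in point 2 can be sketched like this. `resolve_tests_root` is a hypothetical helper that mirrors the try/except shape the review describes; the actual code in create_pr.py may differ.

```python
from pathlib import Path

def resolve_tests_root(tests_root: Path) -> Path:
    """Hypothetical helper mirroring the reviewed try/except fallback."""
    try:
        # resolve() touches the filesystem and can raise OSError in rare
        # cases (e.g. symlink loops on some platforms).
        return tests_root.resolve()
    except OSError:
        # Fall back to the unresolved path instead of failing the whole run.
        return tests_root
```

Because this is hoisted before the loop, the filesystem is hit at most once per call rather than once per invocation ID.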

No bugs, no security issues, no breaking API changes.

Duplicate Detection

No new functions introduced — only restructuring of existing logic. No duplicates.

Other Open Optimization PRs

All three PRs are mergeable once base branch CI is stabilized.



claude bot commented Mar 24, 2026

CI failures are pre-existing on the base branch (not caused by this PR): Java test failures (assert None is not None), JavaScript/Python support test failures (NameError: name 'FunctionToOptimize' is not defined), and pickle patcher failures. Leaving open for merge once base branch CI is fixed.


claude bot commented Mar 24, 2026

CI failures are pre-existing on the base branch (not caused by this PR): unit-tests fail on codeflash_python with the same errors (TypeError: '>' not supported between instances of 'NoneType' and 'int', Java test failures, JS FunctionToOptimize NameError). Leaving open for merge once base branch CI is fixed.

@claude claude bot merged commit 7cebdec into codeflash_python Mar 24, 2026
7 of 27 checks passed
@claude claude bot deleted the codeflash/optimize-pr1887-2026-03-24T18.24.01 branch March 24, 2026 22:08