Rectify strategy skeleton PR (#212) #214

Open
Sai-Krishna99 wants to merge 5 commits into Nixtla:main from
Sai-Krishna99:feat/rectify-strategy

Conversation

@Sai-Krishna99

Description

Skeleton PR for rectify strategy utilities as discussed in #212.

Code changes

  • Added utilsforecast/rectify.py with three functions:
    • compute_rectify_residuals - computes per-horizon residuals (actual - forecast) - implemented
    • align_rectify_features - aligns feature matrices with horizon-indexed residuals - stubbed
    • rectify - applies fitted correction models to base forecasts - stubbed
  • Added tests/test_rectify.py - 5 passing tests (pandas + polars), 8 skipped for stubbed functions
  • No changes to existing files, no new dependencies

Design decisions

  • Dedicated rectify.py module rather than including it within postprocessing.py
  • Uses narwhals for pandas/polars compatibility, consistent with the other modules
  • Model-agnostic: any object with .predict(X) works as a correction model
  • Supports two modes: per_horizon (separate model per h) and horizon_aware (single model with horizon as feature)
  • Flexible feature interface - accepts whatever feature matrix is passed, documenting that using the same feature set recovers classic Rectify
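
To illustrate the two proposed modes, here is a toy sketch. The model class and all values are invented for illustration; only the .predict(X) contract and the mode names come from this PR:

```python
import numpy as np

class MeanResidualModel:
    """Toy correction model exposing the .predict(X) interface."""
    def __init__(self, bias: float):
        self.bias = bias
    def predict(self, X):
        return np.full(len(X), self.bias)

base = np.array([10.0, 20.0, 30.0])  # base forecasts for h = 1..3
X = np.zeros((3, 1))                 # placeholder feature matrix

# per_horizon: one fitted model per horizon step
per_horizon = {h: MeanResidualModel(0.5 * h) for h in (1, 2, 3)}
corrected_ph = np.array(
    [base[h - 1] + per_horizon[h].predict(X[h - 1 : h])[0] for h in (1, 2, 3)]
)

# horizon_aware: a single model that receives horizon as an extra feature
X_aware = np.column_stack([X, np.array([1, 2, 3])])
single = MeanResidualModel(1.0)
corrected_ha = base + single.predict(X_aware)
```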

Review needed on:

Feedback on the API shape before filling in align_rectify_features and rectify. Specifically:

  • Do the function signatures and param conventions look right?
  • Does a dedicated rectify.py module make sense vs another location?
  • Any thoughts on the horizon representation (1-indexed integer column)?

References

Copilot AI review requested due to automatic review settings March 5, 2026 04:15
@CLAassistant

CLAassistant commented Mar 5, 2026

CLA assistant check
All committers have signed the CLA.


Copilot AI left a comment


Pull request overview

Adds initial scaffolding for a “rectify strategy” utility module in utilsforecast, including a working residual-computation helper and a test suite covering pandas/polars compatibility.

Changes:

  • Added utilsforecast/rectify.py with compute_rectify_residuals implemented and align_rectify_features/rectify stubbed.
  • Added tests/test_rectify.py with passing coverage for residual computation and skipped tests for the stubbed APIs.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
utilsforecast/rectify.py Introduces the rectify utilities module and implements residual computation; other APIs are currently placeholders.
tests/test_rectify.py Adds tests validating residual computation for pandas/polars and outlines (skipped) expectations for upcoming functionality.


Comment thread utilsforecast/rectify.py Outdated
Comment on lines +52 to +57
horizon_expr = (
nw.col(time_col).rank().over(id_col).cast(nw.Int32).alias("horizon")
)
result = merged.select(
[id_col, time_col, horizon_expr, *residual_exprs]
).sort(id_col, time_col)

Copilot AI Mar 5, 2026


horizon is computed via rank() without specifying a deterministic tie-breaking method. In pandas/polars the default rank method can yield non-integer values when there are duplicate time_col values within an id_col, and the subsequent cast(Int32) would silently truncate/duplicate horizons. Consider generating horizons using a deterministic per-group row index after sorting by time_col (or explicitly set an ordinal/dense rank method) so horizon is always 1..n with no ambiguity.
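
The deterministic alternative can be sketched in plain pandas (illustrative only; the PR itself builds the expression with narwhals):

```python
import pandas as pd

df = pd.DataFrame({
    "unique_id": ["a", "a", "a", "b", "b"],
    "ds": [3, 1, 2, 2, 1],  # deliberately unsorted input
})
# rank() on unsorted or tied time values can yield ambiguous horizons;
# an explicit sort followed by a per-group row index is always 1..n.
df = df.sort_values(["unique_id", "ds"]).reset_index(drop=True)
df["horizon"] = df.groupby("unique_id").cumcount() + 1
```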

Author


Implemented a deterministic approach using cum_count instead of the rank method.

Comment thread utilsforecast/rectify.py Outdated
Comment on lines +42 to +46
actuals = nw.from_native(df)
forecasts = nw.from_native(forecasts_df)
merged = actuals.select([id_col, time_col, target_col]).join(
forecasts.select([id_col, time_col, *models]),
on=[id_col, time_col],

Copilot AI Mar 5, 2026


compute_rectify_residuals doesn’t validate input schemas/dtypes (required columns, time dtype) the way other public helpers do (e.g., validate_format usage in utilsforecast/processing.py:673). Adding validate_format checks for df and forecasts_df (and verifying models columns exist) would give clearer error messages and keep behavior consistent across the library.

Author


Input validation will be added as part of the logic implementation.

Comment thread utilsforecast/rectify.py Outdated
def rectify(
df: IntoDataFrameT,
models: List[str],
correction_models: Dict[int, Dict[str, Any]],

Copilot AI Mar 5, 2026


The correction_models type annotation (Dict[int, Dict[str, Any]]) conflicts with the docstring, which says that in horizon_aware mode it should be {model_name: fitted_model}. Consider updating the annotation to a union that matches both modes (or splitting the API) so static type checkers and users aren’t misled.

Suggested change
correction_models: Dict[int, Dict[str, Any]],
correction_models: Union[Dict[int, Dict[str, Any]], Dict[str, Any]],

Author


The type annotation fix will be implemented along with the actual logic.

Comment thread utilsforecast/rectify.py Outdated
Returns:
dict or tuple: Training data for rectification models.
"""
raise NotImplementedError

Copilot AI Mar 5, 2026


align_rectify_features is exported in __all__ but currently raises NotImplementedError unconditionally. If this is meant to be a skeleton only, consider not exporting it (or making it private) until implemented, or raise a more informative error indicating it’s experimental/unavailable in this release.

Suggested change
raise NotImplementedError
raise NotImplementedError(
"align_rectify_features is not implemented in this version and is "
"currently experimental/unavailable."
)

Comment thread utilsforecast/rectify.py Outdated
Comment on lines +115 to +119
Returns:
pandas or polars DataFrame: Corrected forecasts with same schema
as input df.
"""
raise NotImplementedError

Copilot AI Mar 5, 2026


rectify is exported in __all__ but currently raises NotImplementedError unconditionally. Shipping a public API entry point in this state can break user code at runtime; consider implementing the function before release, or avoid exporting it / clearly mark it as experimental to prevent accidental use.

@Sai-Krishna99
Author

@janrth and @nasaul Please check this skeleton and let me know whether I'm okay to proceed with the logic implementation.

Contributor

@nasaul nasaul left a comment


PR Review: feat/rectify-strategy — Rectify Skeleton

Overall: Good foundation. The module structure, docstrings, and test contracts are well-designed and worth building on. The one implemented function (compute_rectify_residuals) has a correctness risk that should be fixed before the stubs are filled in, as everything else will build on top of it.


Issues to Fix

1. Horizon computation has a window-function ordering risk (rectify.py:50-57)

The horizon_expr using cum_count().over(id_col) is applied in the same select as the outer sort. In Polars, over does not guarantee it sees rows in the sorted order of the enclosing frame — the sort and the window function are evaluated independently. This means horizons could be assigned non-deterministically per series.

Fix: compute the horizon in a separate step after the sort, or use order_by inside the window expression (Polars ≥ 0.20 supports over(..., order_by=time_col)):

sorted_merged = merged.sort(id_col, time_col)
horizon_expr = nw.col(time_col).cum_count().over(id_col, order_by=time_col).cast(nw.Int32).alias("horizon")

2. No handling for multi-fold CV inputs (rectify.py:43-48)

The join on [id_col, time_col] produces duplicate or mismatched rows when passed cross-validation output from mlforecast/statsforecast/neuralforecast, since those stack all folds and a (unique_id, ds) pair appears once per fold.

Suggest adding an optional cutoff_col parameter. When provided, include it in the join key and compute horizon within each (unique_id, cutoff) group. Without this, the primary real-world use case (training correction models from CV output) is broken.
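
A pandas sketch of the problem and the proposed fix (toy values; the cutoff_col handling shown here is the suggestion, not the PR's actual code):

```python
import pandas as pd

cv = pd.DataFrame({  # stacked CV output: (unique_id, ds) repeats across folds
    "unique_id": ["a"] * 4,
    "cutoff": [0, 0, 1, 1],
    "ds": [1, 2, 2, 3],
    "model": [9.0, 9.5, 10.0, 10.5],
})
actuals = pd.DataFrame(
    {"unique_id": ["a"] * 3, "ds": [1, 2, 3], "y": [10.0, 11.0, 12.0]}
)

# joining on (unique_id, ds) alone would conflate the ds=2 rows from both
# folds; including cutoff in the grouping keys keeps folds separate
merged = cv.merge(actuals, on=["unique_id", "ds"])
merged = merged.sort_values(["unique_id", "cutoff", "ds"])
merged["horizon"] = merged.groupby(["unique_id", "cutoff"]).cumcount() + 1
merged["residual_model"] = merged["y"] - merged["model"]
```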

3. Row-order bug in test_residual_values (test_rectify.py:57-62)

expected = actuals_nw["y"].to_numpy() - forecasts_nw[model].to_numpy()
actual = result_nw.sort("unique_id", "ds")[model].to_numpy()

result_nw is sorted but actuals_nw and forecasts_nw are not, so expected can be in a different order than actual. The test passes today only because generate_series happens to return sorted data. Fix: sort all three frames before comparison.
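
The fix can be sketched as follows (toy frames; column names follow the test's conventions):

```python
import numpy as np
import pandas as pd

actuals = pd.DataFrame({"unique_id": ["b", "a"], "ds": [1, 1], "y": [2.0, 1.0]})
forecasts = pd.DataFrame({"unique_id": ["a", "b"], "ds": [1, 1], "model": [0.5, 1.5]})
result = pd.DataFrame({"unique_id": ["a", "b"], "ds": [1, 1], "model": [0.5, 0.5]})

# sort every frame on the same keys so positional comparison is valid
key = ["unique_id", "ds"]
a = actuals.sort_values(key)["y"].to_numpy()
f = forecasts.sort_values(key)["model"].to_numpy()
r = result.sort_values(key)["model"].to_numpy()
np.testing.assert_allclose(r, a - f)
```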


Design Suggestions

4. align_rectify_features return type is hard to consume

Union[Dict[int, Tuple[np.ndarray, Dict[str, np.ndarray]]], Tuple[np.ndarray, Dict[str, np.ndarray]]] means callers must introspect the return type based on the mode argument. Consider two separate functions (align_per_horizon / align_horizon_aware) or a typed result dataclass. This also makes the mode parameter in rectify cleaner.
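
One possible shape for a typed result (a hypothetical dataclass, not part of the PR):

```python
from dataclasses import dataclass
from typing import Dict

import numpy as np

@dataclass
class AlignedFeatures:
    """Typed result instead of a mode-dependent dict/tuple union."""
    X: np.ndarray                      # feature matrix for this slice
    residuals: Dict[str, np.ndarray]   # per-model residual targets

# per_horizon would then return Dict[int, AlignedFeatures],
# horizon_aware a single AlignedFeatures
per_horizon_result: Dict[int, AlignedFeatures] = {
    1: AlignedFeatures(X=np.zeros((4, 2)), residuals={"model": np.ones(4)}),
}
```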

5. correction_models type hint doesn't reflect horizon_aware mode

The docstring says {horizon: {model_name: fitted_model}} for per_horizon mode, but horizon_aware mode would need {model_name: fitted_model}. The current type hint Dict[int, Dict[str, Any]] is wrong for one of the two modes. This should be clarified before implementing.


Out of Scope

6. User-facing documentation

The "bring your own model" design (user trains correction models externally, passes fitted objects with a .predict(X) interface) is a meaningful API contract that warrants a dedicated documentation page covering the full workflow — from CV output to correction model training to inference. This is best tracked as a follow-up issue rather than blocking this PR.


Verdict: Fix items 1–3 before working on it, address 4–5 during stub implementation. The skeleton is otherwise ready to build on.

@Sai-Krishna99 Sai-Krishna99 force-pushed the feat/rectify-strategy branch from 3e49ca3 to 96dfed9 Compare March 18, 2026 00:51
@Sai-Krishna99
Author

Thanks for the detailed review @nasaul. I addressed items 1-3:

  • Horizon ordering: switched to over(id_col, order_by=time_col) so
    the window function handles ordering explicitly, no reliance on frame sort
  • CV fold support: added cutoff_col param - when provided, it's
    included in the join key and horizon partitioning
  • Test row-order: all frames sorted before comparison now

Items 4-5 (split align function, fix correction_models type) - I will
handle them during stub implementation as suggested.

@Sai-Krishna99
Author

Implementation update

Stubs are now fully implemented — align_rectify_features and rectify are complete.

What changed

  • align_rectify_features: groups residuals by horizon, slices feature matrix accordingly. Supports per_horizon (dict of per-h arrays) and horizon_aware (single matrix with horizon column appended)
  • rectify: applies fitted correction models to base forecasts, adding predicted residuals back per horizon
  • Input validation via validate_format on all public functions, consistent with the rest of the codebase
  • correction_models type annotation updated to reflect both per_horizon and horizon_aware signatures
  • All 16 tests pass (pandas + polars), including 3 validation error tests
  • Validated end-to-end with LinearRegression, Ridge, and DecisionTree on synthetic CV data
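
The end-to-end flow can be sketched with a numpy-only stand-in model (all names and data here are invented; sklearn's LinearRegression, used in the actual validation, exposes the same fit/predict interface):

```python
import numpy as np

class LstsqModel:
    """Minimal least-squares model exposing the .predict(X) contract."""
    def fit(self, X, y):
        A = np.column_stack([X, np.ones(len(X))])  # add intercept column
        self.coef_, *_ = np.linalg.lstsq(A, y, rcond=None)
        return self
    def predict(self, X):
        return np.column_stack([X, np.ones(len(X))]) @ self.coef_

rng = np.random.default_rng(0)
horizons = np.tile(np.arange(1, 6), 8)   # 8 CV windows, h = 1..5
X = rng.normal(size=(40, 2))
residuals = 0.3 * horizons + X[:, 0]     # synthetic horizon-dependent bias

# per_horizon: fit one correction model per horizon step
models = {h: LstsqModel().fit(X[horizons == h], residuals[horizons == h])
          for h in range(1, 6)}

# at inference, add each model's predicted residual back to the base forecast
base = np.full(5, 100.0)
X_new = rng.normal(size=(5, 2))
corrected = np.array([base[h - 1] + models[h].predict(X_new[h - 1:h])[0]
                      for h in range(1, 6)])
```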

Validation results

MAE comparison across 3 correction models on 10 series with 4 CV folds and horizon h=5:
[screenshot]

Per-horizon residual and feature alignment output showing correct shapes (40 samples per horizon):
[screenshot]

@janrth and @nasaul - Please review when you get a chance

@Sai-Krishna99 Sai-Krishna99 requested a review from nasaul March 19, 2026 02:56
@Sai-Krishna99
Author

Hi @nasaul and @janrth. Hope you had a chance to review the implementation.

@nasaul
Contributor

nasaul commented Apr 5, 2026

Review

The implementation is clean, narwhals-idiomatic, and the CV cutoff_col support was a thoughtful addition. The core logic is correct. A few things to address before merging:


Design Issues

1. mode parameter causes type ambiguity (biggest concern)

Both align_rectify_features and rectify have a mode parameter that changes the return type and the expected shape of correction_models:

  • align_rectify_features returns a dict in per_horizon mode and a tuple in horizon_aware mode
  • rectify expects {h: {model: fitted_model}} in one mode and {model: fitted_model} in the other

The user can pass correction_models of the wrong shape and get a confusing KeyError at runtime. Consider either:

  • Two separate functions (rectify_per_horizon / rectify_horizon_aware)
  • Or @overload with Literal["per_horizon", "horizon_aware"] to get proper static type checking
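
The @overload option might look like this (function and type-alias names are illustrative, not the PR's API):

```python
from typing import Any, Dict, Literal, Union, overload

CorrectionsPerHorizon = Dict[int, Dict[str, Any]]
CorrectionsHorizonAware = Dict[str, Any]

@overload
def check_corrections(mode: Literal["per_horizon"],
                      models: CorrectionsPerHorizon) -> str: ...
@overload
def check_corrections(mode: Literal["horizon_aware"],
                      models: CorrectionsHorizonAware) -> str: ...
def check_corrections(
    mode: str,
    models: Union[CorrectionsPerHorizon, CorrectionsHorizonAware],
) -> str:
    # runtime validation mirrors what the overloads promise statically
    if mode == "per_horizon" and not all(isinstance(k, int) for k in models):
        raise TypeError("per_horizon expects {horizon: {model_name: fitted_model}}")
    if mode == "horizon_aware" and not all(isinstance(k, str) for k in models):
        raise TypeError("horizon_aware expects {model_name: fitted_model}")
    return mode
```

With this, mypy flags a correction_models dict of the wrong shape at type-check time, and the runtime checks replace the confusing KeyError with a direct message.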

2. Row-aligned features is fragile

Passing features: np.ndarray that must be "row-aligned" with the dataframe has no way to validate alignment — a common source of silent bugs. At minimum, add a features.shape[0] == len(df) guard with a clear error message.
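
Such a guard could be as simple as (hypothetical helper name):

```python
import numpy as np
import pandas as pd

def check_row_alignment(df: pd.DataFrame, features: np.ndarray) -> None:
    """Fail fast when features aren't row-aligned with df."""
    if features.shape[0] != len(df):
        raise ValueError(
            f"features has {features.shape[0]} rows but df has {len(df)}; "
            "they must be row-aligned."
        )

df = pd.DataFrame({"unique_id": ["a", "a"], "ds": [1, 2]})
check_row_alignment(df, np.zeros((2, 3)))  # ok: row counts match
```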

3. Dead parameters in align_rectify_features

id_col and time_col are accepted but never used in the function body. Either use them for validation or remove them.

4. rectify doesn't support cutoff_col

compute_rectify_residuals has cutoff_col for CV, but rectify doesn't. Fine conceptually (no cutoffs at inference), but the asymmetric horizon numbering logic could cause subtle mismatches if series aren't consistently ordered. Worth a comment at minimum.


Test Gaps (confirmed by coverage — rectify.py at 85%)

  • cutoff_col path in compute_rectify_residuals — no test exercises this despite being a non-trivial feature
  • mode="horizon_aware" in rectify — covered in align_rectify_features tests but not in rectify
  • Invalid mode error path in align_rectify_features

Minor Issues

  • Mixed typing style: List[str] from typing (old style) mixed with str | None (Python 3.10+). The rest of the codebase uses typing, so use Optional[str] to be consistent.
  • Module not exported: rectify.py isn't referenced in __init__.py. Intentional for now?
  • CI coverage failure: pytest tests/test_rectify.py fails because the global coverage threshold is 80% and the rest of the package isn't exercised. The module itself hits 85%. Scope coverage with --cov=utilsforecast/rectify or this will block CI.

What's Working Well

  • cum_count().over(..., order_by=time_col) for deterministic horizon computation is the right fix
  • validate_format integration is correct and consistent with the rest of the codebase
  • Tests cover pandas + polars, multiple models, zero/constant corrections, and error paths
  • Error messages are clear

@Sai-Krishna99 Sai-Krishna99 force-pushed the feat/rectify-strategy branch from f94f263 to 0c054a6 Compare April 11, 2026 03:55
@Sai-Krishna99
Author

Thanks for the detailed review @nasaul. I addressed the latest points:

  • Mode typing / wrong shape handling: added Literal / @overload typing for per_horizon and horizon_aware, plus clearer runtime validations for incorrect correction_models shapes
  • Feature row-alignment: added features.shape[0] checks in both align_rectify_features and rectify
  • Dead params: id_col and time_col are now used in align_rectify_features validation
  • cutoff_col / inference: added a short note in rectify explaining that cutoff_col isn’t supported there since horizons are recomputed at inference time
  • Test gaps: added coverage for the cutoff_col path, rectify(..., mode="horizon_aware"), invalid mode, feature row mismatch, and invalid correction_models shapes
  • Typing style: switched str | None to Optional[str]

Also rebased the branch on latest upstream/main.

Validation:

  • uv run pre-commit run --files utilsforecast/rectify.py tests/test_rectify.py passes
  • uv run pytest tests/test_rectify.py --cov-reset --cov=utilsforecast.rectify passes with 25 tests and ~92% rectify.py coverage
  • Full suite passes locally except test_distributed_evaluate, which fails because JAVA_HOME isn’t set on my machine. Excluding that local Spark env test, 1679 tests pass.
