Rectify strategy skeleton PR (#212) #214
Conversation
Pull request overview
Adds initial scaffolding for a “rectify strategy” utility module in utilsforecast, including a working residual-computation helper and a test suite covering pandas/polars compatibility.
Changes:
- Added `utilsforecast/rectify.py` with `compute_rectify_residuals` implemented and `align_rectify_features`/`rectify` stubbed.
- Added `tests/test_rectify.py` with passing coverage for residual computation and skipped tests for the stubbed APIs.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `utilsforecast/rectify.py` | Introduces the rectify utilities module and implements residual computation; other APIs are currently placeholders. |
| `tests/test_rectify.py` | Adds tests validating residual computation for pandas/polars and outlines (skipped) expectations for upcoming functionality. |
```python
horizon_expr = (
    nw.col(time_col).rank().over(id_col).cast(nw.Int32).alias("horizon")
)
result = merged.select(
    [id_col, time_col, horizon_expr, *residual_exprs]
).sort(id_col, time_col)
```
horizon is computed via rank() without specifying a deterministic tie-breaking method. In pandas/polars the default rank method can yield non-integer values when there are duplicate time_col values within an id_col, and the subsequent cast(Int32) would silently truncate/duplicate horizons. Consider generating horizons using a deterministic per-group row index after sorting by time_col (or explicitly set an ordinal/dense rank method) so horizon is always 1..n with no ambiguity.
Implemented a deterministic horizon using cum_count instead of the rank method.
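The same deterministic idea can be shown in plain pandas (a sketch with assumed `unique_id`/`ds` column names, not the module's narwhals code): sort first in a separate step, then take a per-group row index, so the horizon is always 1..n even when a series contains duplicate timestamps that would make rank-based values ambiguous.

```python
import pandas as pd

merged = pd.DataFrame({
    "unique_id": ["a", "a", "a", "b", "b"],
    "ds": [3, 1, 1, 2, 1],  # note the duplicate ds=1 within series "a"
})

# Sort happens as its own step, before the per-group index is assigned
merged = merged.sort_values(["unique_id", "ds"], kind="stable").reset_index(drop=True)
# Row index within each series, 1-based: immune to ties in ds
merged["horizon"] = merged.groupby("unique_id").cumcount() + 1
print(merged["horizon"].tolist())  # [1, 2, 3, 1, 2]
```

A rank-based version would assign the duplicated `ds=1` rows the same (or fractional) rank, which the `Int32` cast would then silently truncate.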
```python
actuals = nw.from_native(df)
forecasts = nw.from_native(forecasts_df)
merged = actuals.select([id_col, time_col, target_col]).join(
    forecasts.select([id_col, time_col, *models]),
    on=[id_col, time_col],
```
compute_rectify_residuals doesn’t validate input schemas/dtypes (required columns, time dtype) the way other public helpers do (e.g., validate_format usage in utilsforecast/processing.py:673). Adding validate_format checks for df and forecasts_df (and verifying models columns exist) would give clearer error messages and keep behavior consistent across the library.
Input validation will be added as part of the logic implementation.
```python
def rectify(
    df: IntoDataFrameT,
    models: List[str],
    correction_models: Dict[int, Dict[str, Any]],
```
The correction_models type annotation (Dict[int, Dict[str, Any]]) conflicts with the docstring, which says that in horizon_aware mode it should be {model_name: fitted_model}. Consider updating the annotation to a union that matches both modes (or splitting the API) so static type checkers and users aren’t misled.
```diff
-    correction_models: Dict[int, Dict[str, Any]],
+    correction_models: Union[Dict[int, Dict[str, Any]], Dict[str, Any]],
```
The type annotation fix will be implemented along with the actual logic.
```python
    Returns:
        dict or tuple: Training data for rectification models.
    """
    raise NotImplementedError
```
align_rectify_features is exported in __all__ but currently raises NotImplementedError unconditionally. If this is meant to be a skeleton only, consider not exporting it (or making it private) until implemented, or raise a more informative error indicating it’s experimental/unavailable in this release.
```diff
-    raise NotImplementedError
+    raise NotImplementedError(
+        "align_rectify_features is not implemented in this version and is "
+        "currently experimental/unavailable."
+    )
```
```python
    Returns:
        pandas or polars DataFrame: Corrected forecasts with same schema
        as input df.
    """
    raise NotImplementedError
```
rectify is exported in __all__ but currently raises NotImplementedError unconditionally. Shipping a public API entry point in this state can break user code at runtime; consider implementing the function before release, or avoid exporting it / clearly mark it as experimental to prevent accidental use.
PR Review: feat/rectify-strategy — Rectify Skeleton
Overall: Good foundation. The module structure, docstrings, and test contracts are well-designed and worth building on. The one implemented function (compute_rectify_residuals) has a correctness risk that should be fixed before the stubs are filled in, as everything else will build on top of it.
Issues to Fix
1. Horizon computation has a window-function ordering risk (rectify.py:50-57)
The horizon_expr using cum_count().over(id_col) is applied in the same select as the outer sort. In Polars, over does not guarantee it sees rows in the sorted order of the enclosing frame — the sort and the window function are evaluated independently. This means horizons could be assigned non-deterministically per series.
Fix: compute the horizon in a separate step after the sort, or use order_by inside the window expression (Polars ≥ 0.20 supports over(..., order_by=time_col)):
```python
sorted_merged = merged.sort(id_col, time_col)
horizon_expr = (
    nw.col(time_col)
    .cum_count()
    .over(id_col, order_by=time_col)
    .cast(nw.Int32)
    .alias("horizon")
)
```
2. No handling for multi-fold CV inputs (rectify.py:43-48)
The join on [id_col, time_col] produces duplicate or mismatched rows when passed cross-validation output from mlforecast/statsforecast/neuralforecast, since those stack all folds and a (unique_id, ds) pair appears once per fold.
Suggest adding an optional cutoff_col parameter. When provided, include it in the join key and compute horizon within each (unique_id, cutoff) group. Without this, the primary real-world use case (training correction models from CV output) is broken.
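A small pandas sketch of why the cutoff matters (column names follow the Nixtla CV output convention; the frames here are toy data, not real CV output): with stacked folds, the same `(unique_id, ds)` pair exists once per fold, so joining on only those two keys is a many-to-many merge.

```python
import pandas as pd

cv_actuals = pd.DataFrame({
    "unique_id": ["a", "a"],
    "ds": [3, 3],            # same (unique_id, ds) appears in two folds
    "cutoff": [1, 2],
    "y": [10.0, 10.0],
})
cv_forecasts = pd.DataFrame({
    "unique_id": ["a", "a"],
    "ds": [3, 3],
    "cutoff": [1, 2],
    "model1": [9.0, 11.0],
})

# Without cutoff in the key: 2 x 2 = 4 rows, silently duplicated
bad = cv_actuals.merge(
    cv_forecasts[["unique_id", "ds", "model1"]], on=["unique_id", "ds"]
)
# With cutoff in the key: one row per (series, cutoff, step)
good = cv_actuals.merge(cv_forecasts, on=["unique_id", "ds", "cutoff"])
print(len(bad), len(good))  # 4 2
```

The duplicated rows in `bad` would also corrupt the horizon numbering, since both fold copies land in the same `(unique_id,)` window.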
3. Row-order bug in test_residual_values (test_rectify.py:57-62)
```python
expected = actuals_nw["y"].to_numpy() - forecasts_nw[model].to_numpy()
actual = result_nw.sort("unique_id", "ds")[model].to_numpy()
```
result_nw is sorted but actuals_nw and forecasts_nw are not, so expected can be in a different order than actual. The test passes today only because generate_series happens to return sorted data. Fix: sort all three frames before comparison.
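Sketch of that fix with pandas stand-ins (the frames here are hypothetical toy data, deliberately in different row orders): sort every frame on the same keys before comparing, so expected and actual are row-aligned regardless of input order.

```python
import numpy as np
import pandas as pd

model = "model1"
actuals = pd.DataFrame({"unique_id": ["b", "a"], "ds": [1, 1], "y": [5.0, 2.0]})
forecasts = pd.DataFrame({"unique_id": ["a", "b"], "ds": [1, 1], model: [1.5, 4.0]})
result = pd.DataFrame({"unique_id": ["b", "a"], "ds": [1, 1], model: [1.0, 0.5]})

keys = ["unique_id", "ds"]
# Sort all three frames on the same keys, not just the result
a = actuals.sort_values(keys).reset_index(drop=True)
f = forecasts.sort_values(keys).reset_index(drop=True)
r = result.sort_values(keys).reset_index(drop=True)

expected = a["y"].to_numpy() - f[model].to_numpy()
np.testing.assert_allclose(r[model].to_numpy(), expected)
```

Comparing the unsorted frames here would have paired series "b" actuals with series "a" forecasts and failed spuriously.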
Design Suggestions
4. align_rectify_features return type is hard to consume
Union[Dict[int, Tuple[np.ndarray, Dict[str, np.ndarray]]], Tuple[np.ndarray, Dict[str, np.ndarray]]] means callers must introspect the return type based on the mode argument. Consider two separate functions (align_per_horizon / align_horizon_aware) or a typed result dataclass. This also makes the mode parameter in rectify cleaner.
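One possible shape for the suggested typed result (the class and field names are hypothetical, not part of the current API): a small dataclass gives callers one stable return type instead of a mode-dependent union they must introspect.

```python
from dataclasses import dataclass, field
from typing import Dict

import numpy as np

@dataclass
class RectifyFeatures:
    """Aligned training data for correction models."""
    X: np.ndarray  # feature matrix (rows aligned with residuals)
    residuals: Dict[str, np.ndarray] = field(default_factory=dict)  # per base model

feats = RectifyFeatures(X=np.zeros((3, 2)), residuals={"model1": np.ones(3)})
```

Per-horizon alignment could then return `Dict[int, RectifyFeatures]` while horizon-aware returns a single `RectifyFeatures`, which keeps the two modes distinguishable without a `Tuple`-heavy union.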
5. correction_models type hint doesn't reflect horizon_aware mode
The docstring says {horizon: {model_name: fitted_model}} for per_horizon mode, but horizon_aware mode would need {model_name: fitted_model}. The current type hint Dict[int, Dict[str, Any]] is wrong for one of the two modes. This should be clarified before implementing.
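If the single-function API is kept, a union alias covering both modes could look like this sketch (the alias name is hypothetical):

```python
from typing import Any, Dict, Union

CorrectionModels = Union[
    Dict[int, Dict[str, Any]],  # per_horizon: {horizon: {model_name: fitted_model}}
    Dict[str, Any],             # horizon_aware: {model_name: fitted_model}
]

# Both mode-specific shapes now satisfy the same annotation
per_horizon: CorrectionModels = {1: {"model1": object()}}
horizon_aware: CorrectionModels = {"model1": object()}
```

The remaining drawback is that type checkers cannot tie the union branch to the `mode` argument, which is another point in favor of splitting the API.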
Out of Scope
6. User-facing documentation
The "bring your own model" design (user trains correction models externally, passes fitted objects with a .predict(X) interface) is a meaningful API contract that warrants a dedicated documentation page covering the full workflow — from CV output to correction model training to inference. This is best tracked as a follow-up issue rather than blocking this PR.
Verdict: fix items 1–3 before building further on this, and address 4–5 while implementing the stubs. The skeleton is otherwise ready to build on.
force-pushed from 3e49ca3 to 96dfed9
Thanks for the detailed review @nasaul. I addressed items 1-3:
Items 4-5 (split align function, fix correction_models type) - I will
Review

The implementation is clean, narwhals-idiomatic, and the CV …

Design Issues

1. Both …

The user can pass …

2. Row-aligned …

Passing …

3. Dead parameters in …

4. …

Test Gaps (confirmed by coverage — …
… added, methods stubbed for review. Tests included
force-pushed from f94f263 to 0c054a6
Thanks for the detailed review @nasaul. I addressed the latest points:
Also rebased the branch on latest …

Validation:


Description
Skeleton PR for rectify strategy utilities as discussed in #212.
Code changes
- `utilsforecast/rectify.py` with three functions:
  - `compute_rectify_residuals` - computes per-horizon residuals (actual - forecast) - implemented
  - `align_rectify_features` - aligns feature matrices with horizon-indexed residuals - stubbed
  - `rectify` - applies fitted correction models to base forecasts - stubbed
- `tests/test_rectify.py` - 5 passing tests (pandas + polars), 8 skipped for stubbed functions

Design decisions

- Separate `rectify.py` module rather than include within `postprocessing.py`
- Any object with a `.predict(X)` interface works as a correction model
- Two modes: `per_horizon` (separate model per h) and `horizon_aware` (single model with horizon as feature)
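The "any object with `.predict(X)`" contract can be illustrated with a toy corrector (this class is purely illustrative, not part of the PR; nothing here depends on sklearn):

```python
import numpy as np

class MeanResidualCorrector:
    """Toy correction model: predicts the mean training residual."""

    def fit(self, X, y):
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, X):
        # Duck-typed interface: anything with .predict(X) qualifies
        return np.full(len(X), self.mean_)

corrector = MeanResidualCorrector().fit(
    np.zeros((4, 1)), np.array([1.0, 2.0, 3.0, 2.0])
)
print(corrector.predict(np.zeros((2, 1))))  # [2. 2.]
```

Any sklearn regressor, LightGBM model, or hand-rolled class satisfying the same duck type would slot in identically.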
Feedback on the API shape before filling in `align_rectify_features` and `rectify`. Specifically: does a separate `rectify.py` module make sense vs another location?
utilsforecast: Rectify strategy #212