Skip to content

fix: correct WMA calculation to use nansum instead of nanmean#2109

Open
ayushozha wants to merge 1 commit intomicrosoft:mainfrom
ayushozha:fix/issue-1993-wma-calculation
Open

fix: correct WMA calculation to use nansum instead of nanmean#2109
ayushozha wants to merge 1 commit intomicrosoft:mainfrom
ayushozha:fix/issue-1993-wma-calculation

Conversation

@ayushozha
Copy link
Copy Markdown

Summary

Closes #1993

The weighted_mean function in WMA._load_internal (qlib/data/ops.py) normalizes weights to sum to 1 via w = w / w.sum(), then incorrectly uses np.nanmean(w * x) which divides by the element count again. The correct function is np.nansum(w * x) since the weights already sum to 1.

This is consistent with how EMA.exp_weighted_mean is implemented (which already uses np.nansum).

Before

def weighted_mean(x):
    w = np.arange(len(x)) + 1
    w = w / w.sum()
    return np.nanmean(w * x)  # BUG: divides by count again

After

def weighted_mean(x):
    w = np.arange(len(x)) + 1
    w = w / w.sum()
    return np.nansum(w * x)  # correct: weights already sum to 1

Test Plan

  • Change is consistent with EMA implementation pattern
  • Single-line fix, minimal risk

…oft#1993)

The weighted_mean function normalizes weights to sum to 1, so
np.nanmean (which divides by count) produces incorrect results.
Use np.nansum instead, consistent with the EMA implementation.
@JKDasondee
Copy link
Copy Markdown

I ran into the same WMA bug independently and wrote a regression test for it. Rather than filing a duplicate PR I'd like to offer the test here for you to fold into this branch (or for a maintainer to pick up if this one has stalled).

Verified locally: with this branch's one-line fix applied, the test passes against the mock TW 0050 close series in tests/ops/test_elem_operator.py. Without the fix, WMA($close, 5) returns e.g. 29.57 instead of the correct 147.87 for prices around 146 — far below min(close), which any weighted average with non-negative weights summing to one must respect.

def test_WMA(self):
    # Regression test for issue #1993: WMA.weighted_mean divided by the
    # window length twice (normalized weights, then np.nanmean) producing
    # values ~N times smaller than a correct weighted moving average.
    # The sister operator EMA.exp_weighted_mean correctly uses np.nansum.
    N = 5
    field = f"WMA($close, {N})"
    result = ExpressionD.expression(self.instrument, field, self.start_time, self.end_time, self.freq)
    result = result.to_numpy()

    close = self.mock_df["close"].reset_index(drop=True).to_numpy()

    def reference_weighted_mean(x):
        w = np.arange(len(x)) + 1
        w = w / w.sum()
        return np.nansum(w * x)

    golden = (
        pd.Series(close)
        .rolling(N, min_periods=1)
        .apply(reference_weighted_mean, raw=True)
        .to_numpy()
    )
    np.testing.assert_allclose(result, golden, rtol=1e-5)

    # A weighted moving average with non-negative weights summing to one
    # must stay inside the input series' range. The pre-fix implementation
    # produced values ~close/N, far below close.min().
    close_min = float(np.nanmin(close))
    close_max = float(np.nanmax(close))
    self.assertGreaterEqual(float(np.nanmin(result)), close_min - 1e-9)
    self.assertLessEqual(float(np.nanmax(result)), close_max + 1e-9)

Requires import pandas as pd at the top of the file. Place it inside TestElementOperator in tests/ops/test_elem_operator.py.

Validated with:

python -m pytest tests/ops/test_elem_operator.py::TestElementOperator::test_WMA -v

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

WMA计算错误

2 participants