## Summary
Spatial multi-criteria decision analysis (MCDA) is one of the most common GIS workflows, but there's no Python library for it that works with xarray and dask. ArcGIS has Weighted Overlay (proprietary, one combination method). QGIS has GUI plugins but no programmable API. scikit-criteria handles tabular discrete alternatives, not raster surfaces.
xarray-spatial already produces the criterion layers people need (slope, aspect, proximity, NDVI, flow accumulation, cost distance, curve number) but stops there. There's no way to combine them into a composite suitability or hazard surface. People end up writing ad hoc pixel math with hardcoded weights and no consistency checks.
This proposes an `mcda` module covering four stages: standardize criteria to a common scale, derive weights, combine layers, and check stability through sensitivity analysis.
## Background
MCDA on rasters: each pixel is a decision alternative, each raster layer is a criterion, the output is a single surface where every pixel gets a composite score. Common applications: site suitability, hazard mapping, conservation prioritization, infrastructure routing.
The workflow has four stages with interchangeable methods at each:

```
raw criterion layers → standardize → weight → combine → suitability surface
                                                  │
                                   constraints → mask exclusion zones
                                                  │
                                   sensitivity → stability check
```
## Proposed API
### Stage 1: Standardize
Convert each criterion raster from native units to a common 0-1 suitability scale using value functions. Similar idea to sklearn preprocessing, but the curve shapes need to match how spatial criteria actually behave: distance usually decays as a sigmoid, temperature suitability peaks at some ideal, and land cover classes need a lookup table.
```python
from xrspatial import mcda

# Linear (min-max with direction)
slope_std = mcda.standardize(slope, method="linear",
                             direction="lower_is_better",
                             bounds=(0, 45))

# Sigmoidal (S-curve for distance decay and threshold effects)
dist_std = mcda.standardize(dist_to_road, method="sigmoidal",
                            midpoint=5000, spread=-0.001)

# Gaussian (peaked around an ideal value)
ndvi_std = mcda.standardize(ndvi, method="gaussian",
                            mean=0.65, std=0.15)

# Triangular (suitable within a range)
temp_std = mcda.standardize(temperature, method="triangular",
                            low=15, peak=22, high=30)

# Piecewise linear (fully custom value function)
elev_std = mcda.standardize(elevation, method="piecewise",
                            breakpoints=[0, 100, 500, 1000, 2000],
                            values=[0.0, 0.3, 1.0, 0.6, 0.0])

# Categorical (lookup table, wraps reclassify to return 0-1)
soil_std = mcda.standardize(soil_type, method="categorical",
                            mapping={1: 0.9, 2: 0.7, 3: 0.4, 4: 0.1})
```

Methods and use cases:
| Method | Shape | When to use |
|---|---|---|
| `linear` | Straight line between bounds | Simple monotonic criteria (slope, elevation) |
| `sigmoidal` | S-curve | Distance decay, threshold transitions |
| `gaussian` | Bell curve | Criteria with an ideal value (temperature, pH) |
| `triangular` | Triangle | Suitable within a defined range |
| `piecewise` | Custom breakpoints | Expert-defined irregular value functions |
| `categorical` | Lookup table | Discrete classes (land cover, soil type, geology) |
The `direction` parameter (`"lower_is_better"` or `"higher_is_better"`) controls monotonic methods. For `gaussian` and `triangular`, the shape itself defines direction.
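The value functions are simple ufunc expressions over the input `DataArray`. A minimal sketch of how the sigmoidal and gaussian curves could be implemented (hypothetical internal helpers, not the proposed public API):

```python
import numpy as np
import xarray as xr

def _sigmoidal(agg: xr.DataArray, midpoint: float, spread: float) -> xr.DataArray:
    # Logistic curve centered on `midpoint`; a negative spread flips it,
    # which is what gives distance criteria their decay shape.
    return 1.0 / (1.0 + np.exp(-spread * (agg - midpoint)))

def _gaussian(agg: xr.DataArray, mean: float, std: float) -> xr.DataArray:
    # Peaks at 1.0 where the value equals `mean`, falls off symmetrically.
    return np.exp(-0.5 * ((agg - mean) / std) ** 2)
```

Because these are ufuncs, the same code covers numpy, cupy, and dask-backed inputs (see Backend considerations below).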
### Stage 2: Weight
Three approaches, from simple to structured.
#### Direct assignment
weights = {"slope": 0.30, "distance": 0.20, "ndvi": 0.25,
"soil": 0.15, "flood_risk": 0.10}AHP (Analytical Hierarchy Process)
Derive weights from pairwise expert judgments. The consistency ratio is the reason to use this over just assigning weights by gut feel: it catches when the expert says "slope matters more than distance" and "distance matters more than NDVI" but then also says "NDVI matters more than slope." CR above 0.10 means the judgments contradict each other and need revision.
```python
weights, consistency = mcda.ahp_weights(
    criteria=["slope", "distance", "ndvi", "soil", "flood_risk"],
    comparisons={
        ("slope", "distance"): 3,      # slope 3x more important than distance
        ("slope", "ndvi"): 2,
        ("slope", "soil"): 5,
        ("slope", "flood_risk"): 2,
        ("distance", "ndvi"): 1/2,
        ("distance", "soil"): 2,
        ("distance", "flood_risk"): 1/3,
        ("ndvi", "soil"): 3,
        ("ndvi", "flood_risk"): 1,
        ("soil", "flood_risk"): 1/3,
    }
)
# weights = {"slope": 0.38, "distance": 0.12, "ndvi": 0.26, "soil": 0.07, "flood_risk": 0.17}
# consistency.ratio = 0.04
# consistency.is_consistent = True (ratio < 0.10)
```

Uses the standard Saaty eigenvector method. Input: n(n-1)/2 pairwise comparisons on a 1-9 scale (or reciprocals). The function builds the full comparison matrix, computes the principal eigenvector for weights, and derives the consistency ratio from the principal eigenvalue.
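A sketch of that computation in plain numpy; the helper name and return shape are illustrative, but the eigenvector step and Saaty's published random-index table are the standard ingredients:

```python
import numpy as np

# Saaty's random consistency index, keyed by matrix size n (3 <= n <= 10;
# for n < 3 consistency is trivial and the guard is omitted in this sketch).
RI = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}

def ahp_weights_sketch(criteria, comparisons):
    n = len(criteria)
    idx = {c: i for i, c in enumerate(criteria)}
    A = np.ones((n, n))
    for (a, b), v in comparisons.items():   # fill both triangles
        A[idx[a], idx[b]] = v
        A[idx[b], idx[a]] = 1.0 / v
    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)             # principal eigenvalue λ_max
    weights = np.abs(eigvecs[:, k].real)    # principal eigenvector
    weights /= weights.sum()
    ci = (eigvals[k].real - n) / (n - 1)    # consistency index
    return dict(zip(criteria, weights)), ci / RI[n]  # (weights, CR)
```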
#### Rank-order methods
For when experts can rank criteria by importance but can't quantify pairwise comparisons:
```python
weights = mcda.rank_weights(
    ranking=["slope", "ndvi", "flood_risk", "distance", "soil"],
    method="roc",  # rank-order centroid
)
# Also: "rs" (rank sum), "rr" (reciprocal of ranks)
```
Bundle standardized criteria into an `xr.Dataset` and combine them into a single suitability surface. `Dataset` is the natural container since it holds multiple co-registered `DataArray`s, the same idea as columns in a DataFrame.
```python
criteria = xr.Dataset({
    "slope": slope_std,
    "distance": dist_std,
    "ndvi": ndvi_std,
    "soil": soil_std,
    "flood_risk": flood_std,
})
```

#### Weighted Linear Combination (WLC)
Most common method. Fully compensatory: a low score on one criterion can be offset by high scores on others.
```python
suitability = mcda.wlc(criteria, weights)
# Per-pixel: Σ(wᵢ × xᵢ)
```

#### Weighted Product Model (WPM)
Multiplicative. More sensitive to low scores since a near-zero on any criterion drags the whole product down.
```python
suitability = mcda.wpm(criteria, weights)
# Per-pixel: Π(xᵢ ^ wᵢ)
```
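Both combine steps reduce to one expression over the Dataset. A sketch, assuming the weights are already normalized to sum to 1 (hypothetical helpers):

```python
import xarray as xr

def wlc_sketch(criteria: xr.Dataset, weights: dict) -> xr.DataArray:
    # Weighted sum across layers: Σ(wᵢ × xᵢ) per pixel.
    return sum(w * criteria[name] for name, w in weights.items())

def wpm_sketch(criteria: xr.Dataset, weights: dict) -> xr.DataArray:
    # Weighted product across layers: Π(xᵢ ^ wᵢ) per pixel.
    out = xr.ones_like(criteria[next(iter(weights))])
    for name, w in weights.items():
        out = out * criteria[name] ** w
    return out
```

#### Ordered Weighted Averaging (OWA)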
Generalizes WLC by adding order weights applied by rank position, not criterion identity. This controls risk attitude:
- Order weights on the lowest score per pixel = conservative (AND-like: all criteria must be good)
- Order weights on the highest score per pixel = optimistic (OR-like: any one good criterion is enough)
- Uniform order weights = same as WLC
```python
# Risk-averse: emphasize worst-performing criteria per pixel
suitability = mcda.owa(
    criteria,
    criterion_weights=weights,
    order_weights=[0.05, 0.10, 0.20, 0.30, 0.35],
)

# Risk-tolerant: emphasize best-performing criteria per pixel
suitability = mcda.owa(
    criteria,
    criterion_weights=weights,
    order_weights=[0.35, 0.30, 0.20, 0.10, 0.05],
)
```
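Internally, OWA is the only combine method that needs a per-pixel sort along the layer axis. A numpy sketch of that core step, assuming the (already criterion-weighted) scores are stacked into a (layer, y, x) array; the bookkeeping around criterion weights is omitted:

```python
import numpy as np

def owa_core(stack: np.ndarray, order_weights) -> np.ndarray:
    # Sort each pixel's scores descending so order_weights[0] always
    # applies to that pixel's best criterion, then take the weighted sum.
    ranked = -np.sort(-stack, axis=0)
    w = np.asarray(order_weights).reshape(-1, 1, 1)
    return (w * ranked).sum(axis=0)
```

#### Fuzzy overlay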
Combine criteria using fuzzy set operators. No explicit weights.
```python
suitability = mcda.fuzzy_overlay(criteria, operator="gamma", gamma=0.9)
# Operators: "and" (min), "or" (max), "sum", "product", "gamma"
```

- `and` (fuzzy AND = min): most restrictive, pixel gets its worst score
- `or` (fuzzy OR = max): most permissive, pixel gets its best score
- `product`: multiplicative, more restrictive than AND (a product of values in 0-1 is at most the minimum)
- `sum`: additive, more permissive than OR
- `gamma`: tunable compromise between product and sum (gamma=0 is product, gamma=1 is sum)
#### Boolean overlay
Simplest case: binary suitability from hard thresholds.
```python
suitable = mcda.boolean_overlay(
    {"slope": slope < 15, "soil": soil_std > 0.5, "access": dist_to_road < 10000},
    operator="and",
)
```

#### Which combination method to use
| Method | Compensatory? | Weights? | Good for |
|---|---|---|---|
| WLC | Fully | Yes | General suitability, well-understood trade-offs |
| WPM | Partially | Yes | Cases where low scores on any criterion should penalize hard |
| OWA | Tunable | Yes (two sets) | When risk attitude matters, disagreement on trade-offs |
| Fuzzy | No | No | Criteria on different conceptual scales, quick exploration |
| Boolean | No | No | Hard constraints, screening |
### Stage 4: Constraints and sensitivity
#### Constraints
Mask out areas that are categorically unsuitable regardless of scores.
```python
suitability = mcda.constrain(suitability,
                             exclude=[water_mask, protected_areas],
                             fill=np.nan)
```
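Functionally this is a masked `where`. A sketch, assuming each exclusion mask is a boolean DataArray aligned to the suitability grid (hypothetical helper):

```python
import numpy as np
import xarray as xr

def constrain_sketch(suitability, exclude, fill=np.nan):
    # Union of exclusion masks (True = excluded), then mask the surface.
    excluded = exclude[0]
    for mask in exclude[1:]:
        excluded = excluded | mask
    return xr.where(excluded, fill, suitability)
```

#### Sensitivity analysis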
Most ad hoc implementations skip this step entirely. The question it answers: would the top-ranked locations change if the weights shifted by a few percent? If yes, the results depend more on weight choices than on spatial data, and you should probably collect better data or get clearer expert consensus before trusting the output.
```python
# One-at-a-time: perturb each weight ±delta, hold others proportional
sensitivity = mcda.sensitivity(
    criteria, weights,
    method="one_at_a_time",
    delta=0.05,
    combine_method="wlc",
)
# Returns Dataset: per-criterion DataArrays showing score variance under perturbation

# Monte Carlo: random weight vectors, measure output stability
sensitivity = mcda.sensitivity(
    criteria, weights,
    method="monte_carlo",
    n_samples=1000,
    combine_method="wlc",
)
# Returns DataArray: per-pixel coefficient of variation across random weight samples
```

Pixels with high sensitivity have unstable scores. If the top-10 sites keep shuffling when you nudge weights by 5%, that's a red flag.
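A sketch of the Monte Carlo variant, sampling weight vectors from a Dirichlet centered on the supplied weights (one reasonable sampling choice, not necessarily what the final implementation would use):

```python
import numpy as np

def monte_carlo_cv(criteria, weights, n_samples=1000, concentration=100.0, seed=0):
    # Draw weight vectors near the base weights, recompute WLC for each,
    # and return the per-pixel coefficient of variation across samples.
    # (Holds all samples in memory; a streaming mean/variance would scale better.)
    rng = np.random.default_rng(seed)
    names = list(weights)
    base = np.array([weights[n] for n in names])
    samples = rng.dirichlet(base * concentration, size=n_samples)
    scores = np.stack([
        sum(w * criteria[name] for name, w in zip(names, ws)).values
        for ws in samples
    ])
    return scores.std(axis=0) / scores.mean(axis=0)
```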
## Use cases
### Wildfire risk mapping
```python
slope = xrs.slope(dem)
fuel_load = xrs.ndvi(nir, red)
dist_roads = xrs.proximity(road_raster)
aspect_heat = xrs.aspect(dem)  # south-facing = drier

slope_std = mcda.standardize(slope, method="linear", direction="higher_is_better", bounds=(0, 45))
fuel_std = mcda.standardize(fuel_load, method="linear", direction="higher_is_better", bounds=(0, 1))
access_std = mcda.standardize(dist_roads, method="sigmoidal", midpoint=5000, spread=-0.001)

criteria = xr.Dataset({"slope": slope_std, "fuel": fuel_std, "access": access_std})
weights, _ = mcda.ahp_weights(
    criteria=["slope", "fuel", "access"],
    comparisons={("slope", "fuel"): 2, ("slope", "access"): 3, ("fuel", "access"): 2}
)
fire_risk = mcda.wlc(criteria, weights)
fire_risk = mcda.constrain(fire_risk, exclude=[water_mask])
```

### Flood vulnerability
This one ties directly into the existing vegetation-aware flood work (curve number, roughness, HAND). The idea is that those functions already produce the raw criterion layers; MCDA is how you'd combine them into a single vulnerability surface instead of eyeballing four separate maps.
```python
curve_num = xrs.curve_number(land_cover, soil_group)
flow_dir = xrs.flow_direction(dem, method="dinf")
flow_acc = xrs.flow_accumulation(flow_dir)
roughness = xrs.mannings_roughness(land_cover)
elev_above = xrs.hand(dem, flow_dir, channel_mask)

cn_std = mcda.standardize(curve_num, method="linear", direction="higher_is_better", bounds=(30, 100))
flow_std = mcda.standardize(flow_acc, method="linear", direction="higher_is_better",
                            bounds=(0, flow_acc.max().item()), transform=np.log1p)
rough_std = mcda.standardize(roughness, method="linear", direction="lower_is_better", bounds=(0.01, 0.2))
hand_std = mcda.standardize(elev_above, method="sigmoidal", midpoint=10, spread=-0.5)

criteria = xr.Dataset({"curve_number": cn_std, "flow": flow_std, "roughness": rough_std, "hand": hand_std})
flood_vuln = mcda.wlc(criteria, weights={"curve_number": 0.3, "flow": 0.3, "roughness": 0.15, "hand": 0.25})
```

### Site suitability (solar farm placement)
This example is more interesting because it mixes several standardization methods and shows where AHP shines: the analyst has opinions about which criteria matter more, and AHP forces those opinions to be internally consistent.
```python
slope_std = mcda.standardize(slope, method="linear", direction="lower_is_better", bounds=(0, 15))
aspect_std = mcda.standardize(aspect, method="gaussian", mean=180, std=45)  # south-facing ideal
road_std = mcda.standardize(dist_to_road, method="sigmoidal", midpoint=2000, spread=-0.002)
grid_std = mcda.standardize(dist_to_grid, method="sigmoidal", midpoint=5000, spread=-0.001)
landuse_std = mcda.standardize(land_cover, method="categorical",
                               mapping={"barren": 1.0, "grassland": 0.8, "cropland": 0.5,
                                        "forest": 0.1, "urban": 0.0, "water": 0.0})

criteria = xr.Dataset({"slope": slope_std, "aspect": aspect_std, "road": road_std,
                       "grid": grid_std, "landuse": landuse_std})
weights, consistency = mcda.ahp_weights(
    criteria=["slope", "aspect", "road", "grid", "landuse"],
    comparisons={
        ("slope", "aspect"): 2, ("slope", "road"): 3, ("slope", "grid"): 3, ("slope", "landuse"): 1/2,
        ("aspect", "road"): 2, ("aspect", "grid"): 2, ("aspect", "landuse"): 1/3,
        ("road", "grid"): 1, ("road", "landuse"): 1/3,
        ("grid", "landuse"): 1/3,
    }
)
solar_suitability = mcda.wlc(criteria, weights)
solar_suitability = mcda.constrain(solar_suitability, exclude=[water_mask, protected_areas, urban_mask])
```

## Module structure
```
xrspatial/
└── mcda/
    ├── __init__.py     # public API: standardize, ahp_weights, rank_weights,
    │                   #   wlc, wpm, owa, fuzzy_overlay, boolean_overlay,
    │                   #   constrain, sensitivity
    ├── standardize.py  # value functions: linear, sigmoidal, gaussian,
    │                   #   triangular, piecewise, categorical
    ├── weights.py      # ahp_weights (eigenvector + consistency ratio),
    │                   #   rank_weights (roc, rs, rr)
    ├── combine.py      # wlc, wpm, owa, fuzzy_overlay, boolean_overlay
    ├── constrain.py    # mask exclusion zones
    └── sensitivity.py  # one_at_a_time, monte_carlo perturbation
```
## Backend considerations
Every operation here is element-wise or a reduction across layers (min, max, sum, product). No neighborhood/kernel ops, so no `map_overlap`. A quick illustration of the dask case follows the list below.
- numpy: standard array math
- cupy: same operations on GPU, works without changes since cupy mirrors the numpy API
- dask+numpy: element-wise ops on dask arrays produce lazy graphs automatically
- dask+cupy: lazy graph with cupy chunks
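To illustrate the dask case: a weighted combine over chunked layers builds a lazy task graph and computes nothing until asked, with no dask-specific code in the combine step:

```python
import dask.array as da
import xarray as xr

a = xr.DataArray(da.random.random((8192, 8192), chunks=2048))
b = xr.DataArray(da.random.random((8192, 8192), chunks=2048))

suit = 0.6 * a + 0.4 * b   # lazy: just a graph of element-wise ops
result = suit.compute()    # work happens only here, chunk by chunk
```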
The one exception is `ahp_weights`, which operates on a small n × n comparison matrix (n = number of criteria, typically 3-10). That's always a numpy eigenvector computation regardless of backend since it doesn't touch raster data.

`ArrayTypeFunctionMapping` dispatch is probably overkill here. These are all ufuncs and broadcasting that numpy, cupy, and dask handle identically. Easier to just let the array type flow through and only add explicit dispatch if something actually breaks.
## Relationship to existing modules
- `classify.reclassify`: `mcda.standardize(..., method="categorical")` would wrap `reclassify` and normalize output to 0-1 (see the sketch after this list). The existing function stays as-is.
- `classify.quantile`: could be used internally for quantile-based scaling, but the interfaces differ. `classify.quantile` bins into k classes; `standardize` maps to continuous 0-1.
- Terrain and hydrology functions (slope, aspect, proximity, flow_accumulation, cost_distance, curve_number, HAND, etc.) produce the raw criterion layers that feed into MCDA. They don't need changes.
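For the `reclassify` wrapper mentioned above, the mapping dict translates directly into the `bins`/`new_values` pair that `classify.reclassify` already accepts. A sketch, assuming integer class codes that appear verbatim in the raster:

```python
from xrspatial.classify import reclassify

def categorical_sketch(agg, mapping):
    # Sort classes so each bin lines up with its target score; reclassify
    # does the per-pixel lookup and the output is already on the 0-1 scale.
    classes = sorted(mapping)
    return reclassify(agg,
                      bins=classes,
                      new_values=[mapping[c] for c in classes])
```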
## Implementation order
Suggested order, based on standalone utility:
1. `standardize` + `wlc` + `constrain` -- minimum viable MCDA. Covers the most common workflow.
2. `ahp_weights` -- the consistency ratio is the main thing people can't easily get from ad hoc scoring.
3. `wpm` + `fuzzy_overlay` + `boolean_overlay` -- alternative combination methods.
4. `owa` -- more advanced, controls risk attitude.
5. `sensitivity` -- useful for publication-grade work.
6. `rank_weights` -- nice to have, but direct weights and AHP cover most needs.
## No new dependencies
Everything here is implementable with numpy alone:
- Standardization: ufuncs (`np.clip`, `np.exp`, `np.where`, arithmetic)
- AHP: `np.linalg.eig` for the eigenvector, basic matrix math for the consistency ratio
- WLC/WPM: weighted sum/product across a stack
- OWA: `np.sort` along the layer axis + weighted sum
- Fuzzy: `np.minimum`, `np.maximum`, arithmetic
- Sensitivity: loop over weight perturbations, recompute the combination
No scipy or sklearn needed.
## References
- Malczewski, J. (2006). GIS-based multicriteria decision analysis: a survey of the literature. International Journal of Geographical Information Science, 20(7), 703-726.
- Saaty, T.L. (1980). The Analytic Hierarchy Process. McGraw-Hill.
- Yager, R.R. (1988). On ordered weighted averaging aggregation operators in multicriteria decision making. IEEE Transactions on Systems, Man, and Cybernetics, 18(1), 183-190.
- Eastman, J.R. (1999). Multi-criteria evaluation and GIS. In P.A. Longley et al. (Eds.), Geographical Information Systems (pp. 493-502).