Add MCDA module for spatial multi-criteria decision analysis #1030

@brendancol

Description

Summary

Spatial multi-criteria decision analysis (MCDA) is one of the most common GIS workflows, but there's no Python library for it that works with xarray and dask. ArcGIS has Weighted Overlay (proprietary, one combination method). QGIS has GUI plugins but no programmable API. scikit-criteria handles tabular discrete alternatives, not raster surfaces.

xarray-spatial already produces the criterion layers people need (slope, aspect, proximity, NDVI, flow accumulation, cost distance, curve number) but stops there. There's no way to combine them into a composite suitability or hazard surface. People end up writing ad hoc pixel math with hardcoded weights and no consistency checks.

This proposes an mcda module covering four stages: standardize criteria to a common scale, derive weights, combine layers, and check stability through sensitivity analysis.

Background

MCDA on rasters: each pixel is a decision alternative, each raster layer is a criterion, the output is a single surface where every pixel gets a composite score. Common applications: site suitability, hazard mapping, conservation prioritization, infrastructure routing.

The workflow has four stages with interchangeable methods at each:

raw criterion layers → standardize → weight → combine → suitability surface
                                                  │
                                            constraints → mask exclusion zones
                                                  │
                                            sensitivity → stability check

Proposed API

Stage 1: Standardize

Convert each criterion raster from native units to a common 0-1 suitability scale using value functions. Similar idea to sklearn preprocessing, but the curve shapes need to match how spatial criteria actually behave: distance usually decays as a sigmoid, temperature suitability peaks at some ideal, and land cover classes need a lookup table.

from xrspatial import mcda

# Linear (min-max with direction)
slope_std = mcda.standardize(slope, method="linear",
                              direction="lower_is_better",
                              bounds=(0, 45))

# Sigmoidal (S-curve for distance decay and threshold effects)
dist_std = mcda.standardize(dist_to_road, method="sigmoidal",
                             midpoint=5000, spread=-0.001)

# Gaussian (peaked around an ideal value)
ndvi_std = mcda.standardize(ndvi, method="gaussian",
                             mean=0.65, std=0.15)

# Triangular (suitable within a range)
temp_std = mcda.standardize(temperature, method="triangular",
                             low=15, peak=22, high=30)

# Piecewise linear (fully custom value function)
elev_std = mcda.standardize(elevation, method="piecewise",
                             breakpoints=[0, 100, 500, 1000, 2000],
                             values=[0.0, 0.3, 1.0, 0.6, 0.0])

# Categorical (lookup table, wraps reclassify to return 0-1)
soil_std = mcda.standardize(soil_type, method="categorical",
                             mapping={1: 0.9, 2: 0.7, 3: 0.4, 4: 0.1})

Methods and use cases:

Method       Shape                          When to use
linear       Straight line between bounds   Simple monotonic criteria (slope, elevation)
sigmoidal    S-curve                        Distance decay, threshold transitions
gaussian     Bell curve                     Criteria with an ideal value (temperature, pH)
triangular   Triangle                       Suitable within a defined range
piecewise    Custom breakpoints             Expert-defined irregular value functions
categorical  Lookup table                   Discrete classes (land cover, soil type, geology)

The direction parameter ("lower_is_better" or "higher_is_better") controls monotonic methods. For gaussian and triangular, the shape itself defines direction.
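All of these value functions are plain ufunc math. As a rough sketch of the curve shapes with numpy (function names here are illustrative helpers, not the proposed `mcda.standardize` API):

```python
import numpy as np

def sigmoidal(x, midpoint, spread):
    # Logistic curve; spread sets steepness and direction
    # (negative spread means suitability decays past the midpoint).
    return 1.0 / (1.0 + np.exp(-spread * (x - midpoint)))

def gaussian(x, mean, std):
    # Bell curve peaking at the ideal value; exactly 1.0 at x == mean.
    return np.exp(-((x - mean) ** 2) / (2.0 * std ** 2))

def piecewise(x, breakpoints, values):
    # Linear interpolation between expert-defined breakpoints.
    return np.interp(x, breakpoints, values)

dist = np.array([0.0, 5000.0, 20000.0])
near_road = sigmoidal(dist, midpoint=5000, spread=-0.001)  # near road ~1, far ~0
```

The midpoint/spread parameterization above is one common convention; the exact signature is open to discussion.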

Stage 2: Weight

Three approaches, from simple to structured.

Direct assignment

weights = {"slope": 0.30, "distance": 0.20, "ndvi": 0.25,
           "soil": 0.15, "flood_risk": 0.10}

AHP (Analytic Hierarchy Process)

Derive weights from pairwise expert judgments. The consistency ratio is the reason to use this over just assigning weights by gut feel: it catches when the expert says "slope matters more than distance" and "distance matters more than NDVI" but then also says "NDVI matters more than slope." CR above 0.10 means the judgments contradict each other and need revision.

weights, consistency = mcda.ahp_weights(
    criteria=["slope", "distance", "ndvi", "soil", "flood_risk"],
    comparisons={
        ("slope", "distance"):     3,     # slope 3x more important than distance
        ("slope", "ndvi"):         2,
        ("slope", "soil"):         5,
        ("slope", "flood_risk"):   2,
        ("distance", "ndvi"):      1/2,
        ("distance", "soil"):      2,
        ("distance", "flood_risk"): 1/3,
        ("ndvi", "soil"):          3,
        ("ndvi", "flood_risk"):    1,
        ("soil", "flood_risk"):    1/3,
    }
)
# weights = {"slope": 0.38, "distance": 0.12, "ndvi": 0.26, "soil": 0.07, "flood_risk": 0.17}
# consistency.ratio = 0.04
# consistency.is_consistent = True  (ratio < 0.10)

Uses the standard Saaty eigenvector method. Input: n(n-1)/2 pairwise comparisons on a 1-9 scale (or reciprocals). The function builds the full comparison matrix, computes the principal eigenvector for weights, and derives the consistency ratio from the principal eigenvalue.
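The whole computation fits in a few lines of numpy. A sketch of the eigenvector method and consistency ratio (helper name hypothetical; `RI` values are Saaty's published random-index table):

```python
import numpy as np

# Saaty's random consistency index, keyed by matrix size n.
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp(criteria, comparisons):
    n = len(criteria)
    idx = {c: i for i, c in enumerate(criteria)}
    A = np.ones((n, n))
    for (a, b), v in comparisons.items():
        A[idx[a], idx[b]] = v
        A[idx[b], idx[a]] = 1.0 / v          # reciprocal fills the lower triangle
    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)              # principal eigenvalue
    w = np.abs(eigvecs[:, k].real)
    w /= w.sum()                             # weights sum to 1
    ci = (eigvals[k].real - n) / (n - 1)     # consistency index
    cr = ci / RI[n]                          # consistency ratio (flag if > 0.10)
    return dict(zip(criteria, w)), cr

weights, cr = ahp(
    ["slope", "fuel", "access"],
    {("slope", "fuel"): 2, ("slope", "access"): 3, ("fuel", "access"): 2},
)
```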

Rank-order methods

For when experts can rank criteria by importance but can't quantify pairwise comparisons:

weights = mcda.rank_weights(
    ranking=["slope", "ndvi", "flood_risk", "distance", "soil"],
    method="roc",  # rank-order centroid
)
# Also: "rs" (rank sum), "rr" (reciprocal of ranks)
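Rank-order centroid weights have a closed form, so this stage is tiny. A sketch (helper name hypothetical; the formula is the standard ROC definition, w_i = (1/n) Σ_{k=i..n} 1/k for 1-based rank i):

```python
def roc_weights(ranking):
    # Rank-order centroid: the top-ranked criterion gets the largest weight,
    # and weights always sum to 1.
    n = len(ranking)
    w = [sum(1.0 / k for k in range(i, n + 1)) / n for i in range(1, n + 1)]
    return dict(zip(ranking, w))

w = roc_weights(["slope", "ndvi", "flood_risk", "distance", "soil"])
```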

Stage 3: Combine

Bundle standardized criteria into an xr.Dataset and combine into a single suitability surface. Dataset is the natural container since it holds multiple co-registered DataArrays, same idea as columns in a DataFrame.

criteria = xr.Dataset({
    "slope": slope_std,
    "distance": dist_std,
    "ndvi": ndvi_std,
    "soil": soil_std,
    "flood_risk": flood_std,
})

Weighted Linear Combination (WLC)

Most common method. Fully compensatory: a low score on one criterion can be offset by high scores on others.

suitability = mcda.wlc(criteria, weights)
# Per-pixel: Σ(wᵢ × xᵢ)

Weighted Product Model (WPM)

Multiplicative. More sensitive to low scores since a near-zero on any criterion drags the whole product down.

suitability = mcda.wpm(criteria, weights)
# Per-pixel: Π(xᵢ ^ wᵢ)
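Both combinations reduce to one-liners over a Dataset. A sketch with plain xarray on a toy grid (criterion names and weights are made up for illustration):

```python
import numpy as np
import xarray as xr

# Two tiny standardized criterion layers on a shared 2x2 grid.
criteria = xr.Dataset({
    "slope": xr.DataArray(np.array([[0.9, 0.2], [0.5, 1.0]]), dims=("y", "x")),
    "ndvi":  xr.DataArray(np.array([[0.4, 0.8], [0.5, 0.0]]), dims=("y", "x")),
})
weights = {"slope": 0.6, "ndvi": 0.4}

# WLC: per-pixel weighted sum.
wlc = sum(weights[name] * criteria[name] for name in weights)

# WPM: per-pixel weighted product; a near-zero layer drags the result down.
wpm_terms = [criteria[name] ** weights[name] for name in weights]
wpm = wpm_terms[0]
for t in wpm_terms[1:]:
    wpm = wpm * t
```

This shows the compensatory difference directly: the pixel where ndvi is 0.0 still scores 0.6 under WLC but drops to 0 under WPM.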

Ordered Weighted Averaging (OWA)

Generalizes WLC by adding order weights applied by rank position, not criterion identity. This controls risk attitude:

  • Order weights on the lowest score per pixel = conservative (AND-like: all criteria must be good)
  • Order weights on the highest score per pixel = optimistic (OR-like: any one good criterion is enough)
  • Uniform order weights = same as WLC

# Risk-averse: emphasize worst-performing criteria per pixel
suitability = mcda.owa(
    criteria,
    criterion_weights=weights,
    order_weights=[0.05, 0.10, 0.20, 0.30, 0.35],
)

# Risk-tolerant: emphasize best-performing criteria per pixel
suitability = mcda.owa(
    criteria,
    criterion_weights=weights,
    order_weights=[0.35, 0.30, 0.20, 0.10, 0.05],
)
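The order-weight mechanics reduce to a per-pixel sort along the layer axis. A minimal numpy sketch of the pure OWA operator (criterion weights omitted for brevity, since how they fold in varies by formulation):

```python
import numpy as np

def owa(stack, order_weights):
    # stack: (n_criteria, ny, nx). Sort scores descending per pixel, so
    # order_weights[0] applies to each pixel's best score and
    # order_weights[-1] to its worst.
    ranked = np.sort(stack, axis=0)[::-1]
    w = np.asarray(order_weights).reshape(-1, 1, 1)
    return (w * ranked).sum(axis=0)

stack = np.stack([
    np.full((2, 2), 0.9),
    np.array([[0.5, 0.1], [0.7, 0.3]]),
    np.full((2, 2), 0.2),
])
risk_averse   = owa(stack, [0.1, 0.2, 0.7])  # weight on each pixel's worst score
risk_tolerant = owa(stack, [0.7, 0.2, 0.1])  # weight on each pixel's best score
```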

Fuzzy overlay

Combine criteria using fuzzy set operators. No explicit weights.

suitability = mcda.fuzzy_overlay(criteria, operator="gamma", gamma=0.9)
# Operators: "and" (min), "or" (max), "sum", "product", "gamma"

  • and (fuzzy AND = min): most restrictive, pixel gets its worst score
  • or (fuzzy OR = max): most permissive, pixel gets its best score
  • product: multiplicative, even more restrictive than AND (a product of values in [0, 1] never exceeds their minimum)
  • sum: additive, even more permissive than OR (1 − Π(1 − xᵢ) is never below the maximum)
  • gamma: tunable compromise between product and sum (gamma=0 is product, gamma=1 is sum)
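The gamma operator is the standard blend of fuzzy algebraic sum and product. A numpy sketch following the usual definition (helper name hypothetical):

```python
import numpy as np

def fuzzy_gamma(stack, gamma):
    # stack: (n_criteria, ...) membership values in [0, 1].
    prod = np.prod(stack, axis=0)               # fuzzy algebraic product
    fsum = 1.0 - np.prod(1.0 - stack, axis=0)   # fuzzy algebraic sum
    # gamma=0 reproduces the product, gamma=1 the sum.
    return (fsum ** gamma) * (prod ** (1.0 - gamma))

stack = np.array([[0.8], [0.6], [0.9]])
low  = fuzzy_gamma(stack, 0.0)   # equals the product, below the minimum input
high = fuzzy_gamma(stack, 1.0)   # equals the sum, above the maximum input
```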

Boolean overlay

Simplest case: binary suitability from hard thresholds.

suitable = mcda.boolean_overlay(
    {"slope": slope < 15, "soil": soil_std > 0.5, "access": dist_to_road < 10000},
    operator="and",
)

Which combination method to use

Method   Compensatory?  Weights?        Good for
WLC      Fully          Yes             General suitability, well-understood trade-offs
WPM      Partially      Yes             Cases where low scores on any criterion should penalize hard
OWA      Tunable        Yes (two sets)  When risk attitude matters, disagreement on trade-offs
Fuzzy    No             No              Criteria on different conceptual scales, quick exploration
Boolean  No             No              Hard constraints, screening

Stage 4: Constraints and sensitivity

Constraints

Mask out areas that are categorically unsuitable regardless of scores.

suitability = mcda.constrain(suitability,
                              exclude=[water_mask, protected_areas],
                              fill=np.nan)
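Under the hood this is just xarray masking. A sketch of the intended behavior on a toy grid (variable names hypothetical):

```python
import numpy as np
import xarray as xr

suit      = xr.DataArray(np.array([[0.8, 0.6], [0.4, 0.9]]), dims=("y", "x"))
water     = xr.DataArray(np.array([[True, False], [False, False]]), dims=("y", "x"))
protected = xr.DataArray(np.array([[False, False], [True, False]]), dims=("y", "x"))

# Union of exclusion masks; excluded pixels become NaN regardless of score.
exclude = water | protected
constrained = suit.where(~exclude)
```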

Sensitivity analysis

Most ad hoc implementations skip this step entirely. The question it answers: would the top-ranked locations change if the weights shifted by a few percent? If yes, the results depend more on weight choices than on spatial data, and you should probably collect better data or get clearer expert consensus before trusting the output.

# One-at-a-time: perturb each weight ±delta, hold others proportional
sensitivity = mcda.sensitivity(
    criteria, weights,
    method="one_at_a_time",
    delta=0.05,
    combine_method="wlc",
)
# Returns Dataset: per-criterion DataArrays showing score variance under perturbation

# Monte Carlo: random weight vectors, measure output stability
sensitivity = mcda.sensitivity(
    criteria, weights,
    method="monte_carlo",
    n_samples=1000,
    combine_method="wlc",
)
# Returns DataArray: per-pixel coefficient of variation across random weight samples

Pixels with high sensitivity have unstable scores. If the top-10 sites keep shuffling when you nudge weights by 5%, that's a red flag.
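The Monte Carlo variant is a loop (or one einsum) over random weight vectors. A sketch using Dirichlet samples, which is one common way to draw weights that sum to 1; the proposal leaves the sampling scheme open:

```python
import numpy as np

rng = np.random.default_rng(42)

# Stack of standardized criterion layers: (n_criteria, ny, nx).
stack = rng.random((4, 3, 3))

# Weight vectors summing to 1, loosely centered on the base weights;
# the concentration factor (50 here, arbitrary) controls the spread.
base = np.array([0.3, 0.3, 0.15, 0.25])
samples = rng.dirichlet(base * 50, size=200)

scores = np.einsum("sc,cyx->syx", samples, stack)  # WLC per weight sample
cv = scores.std(axis=0) / scores.mean(axis=0)      # per-pixel coefficient of variation
# High-cv pixels have scores that depend heavily on the weight choice.
```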

Use cases

Wildfire risk mapping

slope       = xrs.slope(dem)
fuel_load   = xrs.ndvi(nir, red)
dist_roads  = xrs.proximity(road_raster)
aspect_heat = xrs.aspect(dem)  # south-facing = drier

slope_std   = mcda.standardize(slope, method="linear", direction="higher_is_better", bounds=(0, 45))
fuel_std    = mcda.standardize(fuel_load, method="linear", direction="higher_is_better", bounds=(0, 1))
access_std  = mcda.standardize(dist_roads, method="sigmoidal", midpoint=5000, spread=-0.001)

criteria = xr.Dataset({"slope": slope_std, "fuel": fuel_std, "access": access_std})
weights, _ = mcda.ahp_weights(
    criteria=["slope", "fuel", "access"],
    comparisons={("slope", "fuel"): 2, ("slope", "access"): 3, ("fuel", "access"): 2}
)
fire_risk = mcda.wlc(criteria, weights)
fire_risk = mcda.constrain(fire_risk, exclude=[water_mask])

Flood vulnerability

This one ties directly into the existing vegetation-aware flood work (curve number, roughness, HAND). The idea is that those functions already produce the raw criterion layers; MCDA is how you'd combine them into a single vulnerability surface instead of eyeballing four separate maps.

flow_dir     = xrs.flow_direction(dem, method="dinf")
curve_num    = xrs.curve_number(land_cover, soil_group)
flow_acc     = xrs.flow_accumulation(flow_dir)
roughness    = xrs.mannings_roughness(land_cover)
elev_above   = xrs.hand(dem, flow_dir, channel_mask)

cn_std       = mcda.standardize(curve_num, method="linear", direction="higher_is_better", bounds=(30, 100))
flow_std     = mcda.standardize(flow_acc, method="linear", direction="higher_is_better",
                                 bounds=(0, flow_acc.max().item()), transform=np.log1p)
rough_std    = mcda.standardize(roughness, method="linear", direction="lower_is_better", bounds=(0.01, 0.2))
hand_std     = mcda.standardize(elev_above, method="sigmoidal", midpoint=10, spread=-0.5)

criteria = xr.Dataset({"curve_number": cn_std, "flow": flow_std, "roughness": rough_std, "hand": hand_std})
flood_vuln = mcda.wlc(criteria, weights={"curve_number": 0.3, "flow": 0.3, "roughness": 0.15, "hand": 0.25})

Site suitability (solar farm placement)

More interesting example because it mixes several standardization methods and shows where AHP shines. The analyst has opinions about which criteria matter more, and AHP forces those opinions to be internally consistent.

slope_std    = mcda.standardize(slope, method="linear", direction="lower_is_better", bounds=(0, 15))
aspect_std   = mcda.standardize(aspect, method="gaussian", mean=180, std=45)  # south-facing ideal
road_std     = mcda.standardize(dist_to_road, method="sigmoidal", midpoint=2000, spread=-0.002)
grid_std     = mcda.standardize(dist_to_grid, method="sigmoidal", midpoint=5000, spread=-0.001)
landuse_std  = mcda.standardize(land_cover, method="categorical",
                                 mapping={"barren": 1.0, "grassland": 0.8, "cropland": 0.5,
                                          "forest": 0.1, "urban": 0.0, "water": 0.0})

criteria = xr.Dataset({"slope": slope_std, "aspect": aspect_std, "road": road_std,
                        "grid": grid_std, "landuse": landuse_std})

weights, consistency = mcda.ahp_weights(
    criteria=["slope", "aspect", "road", "grid", "landuse"],
    comparisons={
        ("slope", "aspect"): 2, ("slope", "road"): 3, ("slope", "grid"): 3, ("slope", "landuse"): 1/2,
        ("aspect", "road"): 2, ("aspect", "grid"): 2, ("aspect", "landuse"): 1/3,
        ("road", "grid"): 1, ("road", "landuse"): 1/3,
        ("grid", "landuse"): 1/3,
    }
)
solar_suitability = mcda.wlc(criteria, weights)
solar_suitability = mcda.constrain(solar_suitability, exclude=[water_mask, protected_areas, urban_mask])

Module structure

xrspatial/
└── mcda/
    ├── __init__.py           # public API: standardize, ahp_weights, rank_weights,
    │                         #   wlc, wpm, owa, fuzzy_overlay, boolean_overlay,
    │                         #   constrain, sensitivity
    ├── standardize.py        # value functions: linear, sigmoidal, gaussian,
    │                         #   triangular, piecewise, categorical
    ├── weights.py            # ahp_weights (eigenvector + consistency ratio),
    │                         #   rank_weights (roc, rs, rr)
    ├── combine.py            # wlc, wpm, owa, fuzzy_overlay, boolean_overlay
    ├── constrain.py          # mask exclusion zones
    └── sensitivity.py        # one_at_a_time, monte_carlo perturbation

Backend considerations

Every operation here is element-wise or a reduction across layers (min, max, sum, product). No neighborhood/kernel ops, so no map_overlap.

  • numpy: standard array math
  • cupy: same operations on GPU, works without changes since cupy mirrors the numpy API
  • dask+numpy: element-wise ops on dask arrays produce lazy graphs automatically
  • dask+cupy: lazy graph with cupy chunks

The one exception is ahp_weights, which operates on a small n x n comparison matrix (n = number of criteria, typically 3-10). That's always a numpy eigenvector computation regardless of backend since it doesn't touch raster data.

ArrayTypeFunctionMapping dispatch is probably overkill here. These are all ufuncs and broadcasting that numpy, cupy, and dask handle identically. Easier to just let the array type flow through and only add explicit dispatch if something actually breaks.

Relationship to existing modules

  • classify.reclassify: mcda.standardize(..., method="categorical") would wrap reclassify and normalize output to 0-1. The existing function stays as-is.
  • classify.quantile: could be used internally for quantile-based scaling, but the interfaces differ. classify.quantile bins into k classes; standardize maps to continuous 0-1.
  • Terrain and hydrology functions (slope, aspect, proximity, flow_accumulation, cost_distance, curve_number, HAND, etc.) produce the raw criterion layers that feed into MCDA. They don't need changes.

Implementation order

Suggested based on standalone utility:

  1. standardize + wlc + constrain -- minimum viable MCDA. Covers the most common workflow.
  2. ahp_weights -- the consistency ratio is the main thing people can't easily get from ad hoc scoring.
  3. wpm + fuzzy_overlay + boolean_overlay -- alternative combination methods.
  4. owa -- more advanced, controls risk attitude.
  5. sensitivity -- useful for publication-grade work.
  6. rank_weights -- nice to have, but direct weights and AHP cover most needs.

No new dependencies

Everything here is implementable with numpy alone:

  • Standardization: ufuncs (np.clip, np.exp, np.where, arithmetic)
  • AHP: np.linalg.eig for eigenvector, basic matrix math for consistency ratio
  • WLC/WPM: weighted sum/product across a stack
  • OWA: np.sort along the layer axis + weighted sum
  • Fuzzy: np.minimum, np.maximum, arithmetic
  • Sensitivity: loop over weight perturbations, recompute combination

No scipy or sklearn needed.

References

  • Malczewski, J. (2006). GIS-based multicriteria decision analysis: a survey of the literature. International Journal of Geographical Information Science, 20(7), 703-726.
  • Saaty, T.L. (1980). The Analytic Hierarchy Process. McGraw-Hill.
  • Yager, R.R. (1988). On ordered weighted averaging aggregation operators in multicriteria decision making. IEEE Transactions on Systems, Man, and Cybernetics, 18(1), 183-190.
  • Eastman, J.R. (1999). Multi-criteria evaluation and GIS. In P.A. Longley et al. (Eds.), Geographical Information Systems (pp. 493-502).

Labels: enhancement