## Summary
Spatial multi-criteria decision analysis (MCDA) is one of the most common GIS workflows, but there's no Python library for it that works with xarray and dask. ArcGIS has Weighted Overlay (proprietary, one combination method). QGIS has GUI plugins but no programmable API. scikit-criteria handles tabular discrete alternatives, not raster surfaces.
xarray-spatial already produces the criterion layers people need (slope, aspect, proximity, NDVI, flow accumulation, cost distance, curve number) but stops there. There's no way to combine them into a composite suitability or hazard surface. People end up writing ad hoc pixel math with hardcoded weights and no consistency checks.
This proposes an `mcda` module covering four stages: standardize criteria to a common scale, derive weights, combine layers, and check stability through sensitivity analysis.
## Background
MCDA on rasters: each pixel is a decision alternative, each raster layer is a criterion, the output is a single surface where every pixel gets a composite score. Common applications: site suitability, hazard mapping, conservation prioritization, infrastructure routing.
The workflow has four stages with interchangeable methods at each:

```
raw criterion layers → standardize → weight → combine → suitability surface
                                                  │
                                   constraints → mask exclusion zones
                                                  │
                                   sensitivity → stability check
```
## Proposed API
### Stage 1: Standardize
Convert each criterion raster from native units to a common 0-1 suitability scale using value functions. Similar idea to sklearn preprocessing, but the curve shapes need to match how spatial criteria actually behave: distance usually decays as a sigmoid, temperature suitability peaks at some ideal, and land cover classes need a lookup table.
```python
from xrspatial import mcda

# Linear (min-max with direction)
slope_std = mcda.standardize(slope, method="linear",
                             direction="lower_is_better",
                             bounds=(0, 45))

# Sigmoidal (S-curve for distance decay and threshold effects)
dist_std = mcda.standardize(dist_to_road, method="sigmoidal",
                            midpoint=5000, spread=-0.001)

# Gaussian (peaked around an ideal value)
ndvi_std = mcda.standardize(ndvi, method="gaussian",
                            mean=0.65, std=0.15)

# Triangular (suitable within a range)
temp_std = mcda.standardize(temperature, method="triangular",
                            low=15, peak=22, high=30)

# Piecewise linear (fully custom value function)
elev_std = mcda.standardize(elevation, method="piecewise",
                            breakpoints=[0, 100, 500, 1000, 2000],
                            values=[0.0, 0.3, 1.0, 0.6, 0.0])

# Categorical (lookup table, wraps reclassify to return 0-1)
soil_std = mcda.standardize(soil_type, method="categorical",
                            mapping={1: 0.9, 2: 0.7, 3: 0.4, 4: 0.1})
```

Methods and use cases:
| Method | Shape | When to use |
|---|---|---|
| `linear` | Straight line between bounds | Simple monotonic criteria (slope, elevation) |
| `sigmoidal` | S-curve | Distance decay, threshold transitions |
| `gaussian` | Bell curve | Criteria with an ideal value (temperature, pH) |
| `triangular` | Triangle | Suitable within a defined range |
| `piecewise` | Custom breakpoints | Expert-defined irregular value functions |
| `categorical` | Lookup table | Discrete classes (land cover, soil type, geology) |
The `direction` parameter (`"lower_is_better"` or `"higher_is_better"`) controls monotonic methods. For `gaussian` and `triangular`, the shape itself defines direction.
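The value functions are simple ufunc expressions over the input `DataArray`. A minimal sketch of how the sigmoidal and gaussian curves could be implemented (hypothetical internal helpers, not the proposed public API):

```python
import numpy as np
import xarray as xr

def _sigmoidal(agg: xr.DataArray, midpoint: float, spread: float) -> xr.DataArray:
    # Logistic curve centered on `midpoint`; a negative spread flips it,
    # which is what gives distance criteria their decay shape.
    return 1.0 / (1.0 + np.exp(-spread * (agg - midpoint)))

def _gaussian(agg: xr.DataArray, mean: float, std: float) -> xr.DataArray:
    # Peaks at 1.0 where the value equals `mean`, falls off symmetrically.
    return np.exp(-0.5 * ((agg - mean) / std) ** 2)
```

Because these are ufuncs, the same code covers numpy, cupy, and dask-backed inputs (see Backend considerations below).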
### Stage 2: Weight
Three approaches, from simple to structured.
#### Direct assignment
weights = {"slope": 0.30, "distance": 0.20, "ndvi": 0.25,
"soil": 0.15, "flood_risk": 0.10}AHP (Analytical Hierarchy Process)
Derive weights from pairwise expert judgments. The consistency ratio is the reason to use this over just assigning weights by gut feel: it catches when the expert says "slope matters more than distance" and "distance matters more than NDVI" but then also says "NDVI matters more than slope." CR above 0.10 means the judgments contradict each other and need revision.
```python
weights, consistency = mcda.ahp_weights(
    criteria=["slope", "distance", "ndvi", "soil", "flood_risk"],
    comparisons={
        ("slope", "distance"): 3,      # slope 3x more important than distance
        ("slope", "ndvi"): 2,
        ("slope", "soil"): 5,
        ("slope", "flood_risk"): 2,
        ("distance", "ndvi"): 1/2,
        ("distance", "soil"): 2,
        ("distance", "flood_risk"): 1/3,
        ("ndvi", "soil"): 3,
        ("ndvi", "flood_risk"): 1,
        ("soil", "flood_risk"): 1/3,
    }
)
# weights = {"slope": 0.38, "distance": 0.12, "ndvi": 0.26, "soil": 0.07, "flood_risk": 0.17}
# consistency.ratio = 0.04
# consistency.is_consistent = True (ratio < 0.10)
```

Uses the standard Saaty eigenvector method. Input: n(n-1)/2 pairwise comparisons on a 1-9 scale (or reciprocals). The function builds the full comparison matrix, computes the principal eigenvector for weights, and derives the consistency ratio from the principal eigenvalue.
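A sketch of that computation in plain numpy; the helper name and return shape are illustrative, but the eigenvector step and Saaty's published random-index table are the standard ingredients:

```python
import numpy as np

# Saaty's random consistency index, keyed by matrix size n (3 <= n <= 10;
# for n < 3 consistency is trivial and the guard is omitted in this sketch).
RI = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}

def ahp_weights_sketch(criteria, comparisons):
    n = len(criteria)
    idx = {c: i for i, c in enumerate(criteria)}
    A = np.ones((n, n))
    for (a, b), v in comparisons.items():   # fill both triangles
        A[idx[a], idx[b]] = v
        A[idx[b], idx[a]] = 1.0 / v
    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)             # principal eigenvalue λ_max
    weights = np.abs(eigvecs[:, k].real)    # principal eigenvector
    weights /= weights.sum()
    ci = (eigvals[k].real - n) / (n - 1)    # consistency index
    return dict(zip(criteria, weights)), ci / RI[n]  # (weights, CR)
```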
#### Rank-order methods
For when experts can rank criteria by importance but can't quantify pairwise comparisons:
```python
weights = mcda.rank_weights(
    ranking=["slope", "ndvi", "flood_risk", "distance", "soil"],
    method="roc",  # rank-order centroid
)
# Also: "rs" (rank sum), "rr" (reciprocal of ranks)
```
Bundle standardized criteria into an `xr.Dataset` and combine them into a single suitability surface. `Dataset` is the natural container since it holds multiple co-registered `DataArray`s, the same idea as columns in a DataFrame.
```python
criteria = xr.Dataset({
    "slope": slope_std,
    "distance": dist_std,
    "ndvi": ndvi_std,
    "soil": soil_std,
    "flood_risk": flood_std,
})
```

#### Weighted Linear Combination (WLC)
Most common method. Fully compensatory: a low score on one criterion can be offset by high scores on others.
```python
suitability = mcda.wlc(criteria, weights)
# Per-pixel: Σ(wᵢ × xᵢ)
```

#### Weighted Product Model (WPM)
Multiplicative. More sensitive to low scores since a near-zero on any criterion drags the whole product down.
```python
suitability = mcda.wpm(criteria, weights)
# Per-pixel: Π(xᵢ ^ wᵢ)
```
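Both combine steps reduce to one expression over the Dataset. A sketch, assuming the weights are already normalized to sum to 1 (hypothetical helpers):

```python
import xarray as xr

def wlc_sketch(criteria: xr.Dataset, weights: dict) -> xr.DataArray:
    # Weighted sum across layers: Σ(wᵢ × xᵢ) per pixel.
    return sum(w * criteria[name] for name, w in weights.items())

def wpm_sketch(criteria: xr.Dataset, weights: dict) -> xr.DataArray:
    # Weighted product across layers: Π(xᵢ ^ wᵢ) per pixel.
    out = xr.ones_like(criteria[next(iter(weights))])
    for name, w in weights.items():
        out = out * criteria[name] ** w
    return out
```

#### Ordered Weighted Averaging (OWA)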
Generalizes WLC by adding order weights applied by rank position, not criterion identity. This controls risk attitude:
- Order weights on the lowest score per pixel = conservative (AND-like: all criteria must be good)
- Order weights on the highest score per pixel = optimistic (OR-like: any one good criterion is enough)
- Uniform order weights = same as WLC
```python
# Risk-averse: emphasize worst-performing criteria per pixel
suitability = mcda.owa(
    criteria,
    criterion_weights=weights,
    order_weights=[0.05, 0.10, 0.20, 0.30, 0.35],
)

# Risk-tolerant: emphasize best-performing criteria per pixel
suitability = mcda.owa(
    criteria,
    criterion_weights=weights,
    order_weights=[0.35, 0.30, 0.20, 0.10, 0.05],
)
```
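Internally, OWA is the only combine method that needs a per-pixel sort along the layer axis. A numpy sketch of that core step, assuming the (already criterion-weighted) scores are stacked into a (layer, y, x) array; the bookkeeping around criterion weights is omitted:

```python
import numpy as np

def owa_core(stack: np.ndarray, order_weights) -> np.ndarray:
    # Sort each pixel's scores descending so order_weights[0] always
    # applies to that pixel's best criterion, then take the weighted sum.
    ranked = -np.sort(-stack, axis=0)
    w = np.asarray(order_weights).reshape(-1, 1, 1)
    return (w * ranked).sum(axis=0)
```

#### Fuzzy overlay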
Combine criteria using fuzzy set operators. No explicit weights.
```python
suitability = mcda.fuzzy_overlay(criteria, operator="gamma", gamma=0.9)
# Operators: "and" (min), "or" (max), "sum", "product", "gamma"
```

- `and` (fuzzy AND = min): most restrictive, pixel gets its worst score
- `or` (fuzzy OR = max): most permissive, pixel gets its best score
- `product`: multiplicative, more restrictive than AND (a product of values in 0-1 is at most the minimum)
- `sum`: additive, more permissive than OR
- `gamma`: tunable compromise between product and sum (gamma=0 is product, gamma=1 is sum)
#### Boolean overlay
Simplest case: binary suitability from hard thresholds.
```python
suitable = mcda.boolean_overlay(
    {"slope": slope < 15, "soil": soil_std > 0.5, "access": dist_to_road < 10000},
    operator="and",
)
```

#### Which combination method to use
| Method | Compensatory? | Weights? | Good for |
|---|---|---|---|
| WLC | Fully | Yes | General suitability, well-understood trade-offs |
| WPM | Partially | Yes | Cases where low scores on any criterion should penalize hard |
| OWA | Tunable | Yes (two sets) | When risk attitude matters, disagreement on trade-offs |
| Fuzzy | No | No | Criteria on different conceptual scales, quick exploration |
| Boolean | No | No | Hard constraints, screening |
### Stage 4: Constraints and sensitivity
#### Constraints
Mask out areas that are categorically unsuitable regardless of scores.
```python
suitability = mcda.constrain(suitability,
                             exclude=[water_mask, protected_areas],
                             fill=np.nan)
```
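Functionally this is a masked `where`. A sketch, assuming each exclusion mask is a boolean DataArray aligned to the suitability grid (hypothetical helper):

```python
import numpy as np
import xarray as xr

def constrain_sketch(suitability, exclude, fill=np.nan):
    # Union of exclusion masks (True = excluded), then mask the surface.
    excluded = exclude[0]
    for mask in exclude[1:]:
        excluded = excluded | mask
    return xr.where(excluded, fill, suitability)
```

#### Sensitivity analysis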
Most ad hoc implementations skip this step entirely. The question it answers: would the top-ranked locations change if the weights shifted by a few percent? If yes, the results depend more on weight choices than on spatial data, and you should probably collect better data or get clearer expert consensus before trusting the output.
```python
# One-at-a-time: perturb each weight ±delta, hold others proportional
sensitivity = mcda.sensitivity(
    criteria, weights,
    method="one_at_a_time",
    delta=0.05,
    combine_method="wlc",
)
# Returns Dataset: per-criterion DataArrays showing score variance under perturbation

# Monte Carlo: random weight vectors, measure output stability
sensitivity = mcda.sensitivity(
    criteria, weights,
    method="monte_carlo",
    n_samples=1000,
    combine_method="wlc",
)
# Returns DataArray: per-pixel coefficient of variation across random weight samples
```

Pixels with high sensitivity have unstable scores. If the top-10 sites keep shuffling when you nudge weights by 5%, that's a red flag.
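A sketch of the Monte Carlo variant, sampling weight vectors from a Dirichlet centered on the supplied weights (one reasonable sampling choice, not necessarily what the final implementation would use):

```python
import numpy as np

def monte_carlo_cv(criteria, weights, n_samples=1000, concentration=100.0, seed=0):
    # Draw weight vectors near the base weights, recompute WLC for each,
    # and return the per-pixel coefficient of variation across samples.
    # (Holds all samples in memory; a streaming mean/variance would scale better.)
    rng = np.random.default_rng(seed)
    names = list(weights)
    base = np.array([weights[n] for n in names])
    samples = rng.dirichlet(base * concentration, size=n_samples)
    scores = np.stack([
        sum(w * criteria[name] for name, w in zip(names, ws)).values
        for ws in samples
    ])
    return scores.std(axis=0) / scores.mean(axis=0)
```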
## Use cases
### Wildfire risk mapping
```python
slope = xrs.slope(dem)
fuel_load = xrs.ndvi(nir, red)
dist_roads = xrs.proximity(road_raster)
aspect_heat = xrs.aspect(dem)  # south-facing = drier

slope_std = mcda.standardize(slope, method="linear", direction="higher_is_better", bounds=(0, 45))
fuel_std = mcda.standardize(fuel_load, method="linear", direction="higher_is_better", bounds=(0, 1))
access_std = mcda.standardize(dist_roads, method="sigmoidal", midpoint=5000, spread=-0.001)

criteria = xr.Dataset({"slope": slope_std, "fuel": fuel_std, "access": access_std})
weights, _ = mcda.ahp_weights(
    criteria=["slope", "fuel", "access"],
    comparisons={("slope", "fuel"): 2, ("slope", "access"): 3, ("fuel", "access"): 2}
)
fire_risk = mcda.wlc(criteria, weights)
fire_risk = mcda.constrain(fire_risk, exclude=[water_mask])
```

### Flood vulnerability
This one ties directly into the existing vegetation-aware flood work (curve number, roughness, HAND). The idea is that those functions already produce the raw criterion layers; MCDA is how you'd combine them into a single vulnerability surface instead of eyeballing four separate maps.
```python
curve_num = xrs.curve_number(land_cover, soil_group)
flow_dir = xrs.flow_direction(dem, method="dinf")
flow_acc = xrs.flow_accumulation(flow_dir)
roughness = xrs.mannings_roughness(land_cover)
elev_above = xrs.hand(dem, flow_dir, channel_mask)

cn_std = mcda.standardize(curve_num, method="linear", direction="higher_is_better", bounds=(30, 100))
flow_std = mcda.standardize(flow_acc, method="linear", direction="higher_is_better",
                            bounds=(0, flow_acc.max().item()), transform=np.log1p)
rough_std = mcda.standardize(roughness, method="linear", direction="lower_is_better", bounds=(0.01, 0.2))
hand_std = mcda.standardize(elev_above, method="sigmoidal", midpoint=10, spread=-0.5)

criteria = xr.Dataset({"curve_number": cn_std, "flow": flow_std, "roughness": rough_std, "hand": hand_std})
flood_vuln = mcda.wlc(criteria, weights={"curve_number": 0.3, "flow": 0.3, "roughness": 0.15, "hand": 0.25})
```

### Site suitability (solar farm placement)
This example is more interesting because it mixes several standardization methods and shows where AHP shines: the analyst has opinions about which criteria matter more, and AHP forces those opinions to be internally consistent.
```python
slope_std = mcda.standardize(slope, method="linear", direction="lower_is_better", bounds=(0, 15))
aspect_std = mcda.standardize(aspect, method="gaussian", mean=180, std=45)  # south-facing ideal
road_std = mcda.standardize(dist_to_road, method="sigmoidal", midpoint=2000, spread=-0.002)
grid_std = mcda.standardize(dist_to_grid, method="sigmoidal", midpoint=5000, spread=-0.001)
landuse_std = mcda.standardize(land_cover, method="categorical",
                               mapping={"barren": 1.0, "grassland": 0.8, "cropland": 0.5,
                                        "forest": 0.1, "urban": 0.0, "water": 0.0})

criteria = xr.Dataset({"slope": slope_std, "aspect": aspect_std, "road": road_std,
                       "grid": grid_std, "landuse": landuse_std})
weights, consistency = mcda.ahp_weights(
    criteria=["slope", "aspect", "road", "grid", "landuse"],
    comparisons={
        ("slope", "aspect"): 2, ("slope", "road"): 3, ("slope", "grid"): 3, ("slope", "landuse"): 1/2,
        ("aspect", "road"): 2, ("aspect", "grid"): 2, ("aspect", "landuse"): 1/3,
        ("road", "grid"): 1, ("road", "landuse"): 1/3,
        ("grid", "landuse"): 1/3,
    }
)
solar_suitability = mcda.wlc(criteria, weights)
solar_suitability = mcda.constrain(solar_suitability, exclude=[water_mask, protected_areas, urban_mask])
```

## Module structure
```
xrspatial/
└── mcda/
    ├── __init__.py     # public API: standardize, ahp_weights, rank_weights,
    │                   #   wlc, wpm, owa, fuzzy_overlay, boolean_overlay,
    │                   #   constrain, sensitivity
    ├── standardize.py  # value functions: linear, sigmoidal, gaussian,
    │                   #   triangular, piecewise, categorical
    ├── weights.py      # ahp_weights (eigenvector + consistency ratio),
    │                   #   rank_weights (roc, rs, rr)
    ├── combine.py      # wlc, wpm, owa, fuzzy_overlay, boolean_overlay
    ├── constrain.py    # mask exclusion zones
    └── sensitivity.py  # one_at_a_time, monte_carlo perturbation
```
## Backend considerations
Every operation here is element-wise or a reduction across layers (min, max, sum, product). No neighborhood/kernel ops, so no `map_overlap`. A quick illustration of the dask case follows the list below.
- numpy: standard array math
- cupy: same operations on GPU, works without changes since cupy mirrors the numpy API
- dask+numpy: element-wise ops on dask arrays produce lazy graphs automatically
- dask+cupy: lazy graph with cupy chunks
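To illustrate the dask case: a weighted combine over chunked layers builds a lazy task graph and computes nothing until asked, with no dask-specific code in the combine step:

```python
import dask.array as da
import xarray as xr

a = xr.DataArray(da.random.random((8192, 8192), chunks=2048))
b = xr.DataArray(da.random.random((8192, 8192), chunks=2048))

suit = 0.6 * a + 0.4 * b   # lazy: just a graph of element-wise ops
result = suit.compute()    # work happens only here, chunk by chunk
```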
The one exception is `ahp_weights`, which operates on a small n × n comparison matrix (n = number of criteria, typically 3-10). That's always a numpy eigenvector computation regardless of backend since it doesn't touch raster data.

`ArrayTypeFunctionMapping` dispatch is probably overkill here. These are all ufuncs and broadcasting that numpy, cupy, and dask handle identically. Easier to just let the array type flow through and only add explicit dispatch if something actually breaks.
## Relationship to existing modules
- `classify.reclassify`: `mcda.standardize(..., method="categorical")` would wrap `reclassify` and normalize output to 0-1 (see the sketch after this list). The existing function stays as-is.
- `classify.quantile`: could be used internally for quantile-based scaling, but the interfaces differ. `classify.quantile` bins into k classes; `standardize` maps to continuous 0-1.
- Terrain and hydrology functions (slope, aspect, proximity, flow_accumulation, cost_distance, curve_number, HAND, etc.) produce the raw criterion layers that feed into MCDA. They don't need changes.
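For the `reclassify` wrapper mentioned above, the mapping dict translates directly into the `bins`/`new_values` pair that `classify.reclassify` already accepts. A sketch, assuming integer class codes that appear verbatim in the raster:

```python
from xrspatial.classify import reclassify

def categorical_sketch(agg, mapping):
    # Sort classes so each bin lines up with its target score; reclassify
    # does the per-pixel lookup and the output is already on the 0-1 scale.
    classes = sorted(mapping)
    return reclassify(agg,
                      bins=classes,
                      new_values=[mapping[c] for c in classes])
```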
## Implementation order
Suggested order, based on standalone utility:
1. `standardize` + `wlc` + `constrain` -- minimum viable MCDA. Covers the most common workflow.
2. `ahp_weights` -- the consistency ratio is the main thing people can't easily get from ad hoc scoring.
3. `wpm` + `fuzzy_overlay` + `boolean_overlay` -- alternative combination methods.
4. `owa` -- more advanced, controls risk attitude.
5. `sensitivity` -- useful for publication-grade work.
6. `rank_weights` -- nice to have, but direct weights and AHP cover most needs.
## No new dependencies
Everything here is implementable with numpy alone:
- Standardization: ufuncs (`np.clip`, `np.exp`, `np.where`, arithmetic)
- AHP: `np.linalg.eig` for the eigenvector, basic matrix math for the consistency ratio
- WLC/WPM: weighted sum/product across a stack
- OWA: `np.sort` along the layer axis + weighted sum
- Fuzzy: `np.minimum`, `np.maximum`, arithmetic
- Sensitivity: loop over weight perturbations, recompute the combination
No scipy or sklearn needed.
## References
- Malczewski, J. (2006). GIS-based multicriteria decision analysis: a survey of the literature. International Journal of Geographical Information Science, 20(7), 703-726.
- Saaty, T.L. (1980). The Analytic Hierarchy Process. McGraw-Hill.
- Yager, R.R. (1988). On ordered weighted averaging aggregation operators in multicriteria decision making. IEEE Transactions on Systems, Man, and Cybernetics, 18(1), 183-190.
- Eastman, J.R. (1999). Multi-criteria evaluation and GIS. In P.A. Longley et al. (Eds.), Geographical Information Systems (pp. 493-502).