From 94b05a28d61adafb2c22d6826e6773676427fa46 Mon Sep 17 00:00:00 2001 From: Mayk Thewessen Date: Tue, 17 Mar 2026 21:37:03 +0100 Subject: [PATCH 1/4] docs: add memory best-practices notebook for large-scale models --- examples/memory-best-practices.ipynb | 717 +++++++++++++++++++++++++++ 1 file changed, 717 insertions(+) create mode 100644 examples/memory-best-practices.ipynb diff --git a/examples/memory-best-practices.ipynb b/examples/memory-best-practices.ipynb new file mode 100644 index 00000000..4e5393ba --- /dev/null +++ b/examples/memory-best-practices.ipynb @@ -0,0 +1,717 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Memory-Efficient Linopy Usage: Best Practices\n", + "\n", + "This notebook documents memory-efficient patterns for large-scale linopy models,\n", + "based on production experience with a 200-bus, 1800-generator PyPSA power system\n", + "model solved over 8760 hourly snapshots.\n", + "\n", + "**Model scale (reference):**\n", + "- ~1.4M LP variables per weekly chunk\n", + "- ~1.4M LP constraints per weekly chunk \n", + "- ~52 weekly chunks per year\n", + "- Steady-state memory: ~9 GB; peak during assembly: ~15 GB\n", + "\n", + "## Contents\n", + "1. [Understanding linopy's memory layout](#1-understanding-linopys-memory-layout)\n", + "2. [Chunked time-horizon solving](#2-chunked-time-horizon-solving)\n", + "3. [LP blueprint reuse across chunks](#3-lp-blueprint-reuse-across-chunks)\n", + "4. [Efficient RHS and objective updates](#4-efficient-rhs-and-objective-updates)\n", + "5. [Memory profiling](#5-memory-profiling)\n", + "6. [Explicit cleanup between solves](#6-explicit-cleanup-between-solves)\n", + "7. [Sparse-friendly constraint formulation](#7-sparse-friendly-constraint-formulation)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": "## 1. 
Understanding linopy's memory layout\n\nBefore optimising, it helps to understand where memory actually lives in a linopy model.\n\n### Data structures\n\nA `linopy.Model` stores its data primarily in `xarray.Dataset` objects — one per\nvariable and one per constraint. For a model with $N$ variables and $M$ constraints\nover $T$ timesteps:\n\n| Structure | Shape | Approx. memory (float64) |\n|-----------|-------|---------------------------|\n| Variable labels (`.labels`) | $(T, N)$ | $8TN$ bytes |\n| Variable bounds (`.lower`, `.upper`) | $2 \times (T, N)$ | $16TN$ bytes |\n| Constraint labels + RHS + sign | $3 \times (T, M)$ | $24TM$ bytes |\n| Expression term data (`_term` dim) | $(T, M, k)$ where $k$ = avg. terms | variable |\n| Sparse matrix $A$ (CSC) | $\text{nnz} \times 3$ arrays | $24 \cdot \text{nnz}$ bytes |\n| Solution + dual vectors | $(N + M)$ | $8(N+M)$ bytes |\n\nFor a 1.4M variable / 1.4M constraint model with $T=168$ snapshots:\n- Variable storage: ~3.4 GB \n- Constraint storage: ~1.2 GB \n- Sparse $A$ matrix (~50M non-zeros): ~1.2 GB \n- **Peak during `matrices.A` assembly: ~3× the matrix size (~3.6 GB spike)**\n- **Peak during solution unpacking: ~3× variable count** (solver array + pandas indexed copy + xarray result)\n\n### The three memory spikes\n\nThere are three distinct memory peaks in a linopy solve cycle:\n\n**1. `constraints.flat` (triggered by `matrices.A`)**: Before building the sparse\nmatrix, linopy concatenates all constraints into a pandas DataFrame with ~7 columns\nand $(M \times T \times k)$ rows (where $k$ = average terms per constraint, typically\n2–4). At 120 bytes/row of pandas overhead, this temporarily consumes several GB\n*before* the CSC matrix itself exists.\n\n**2. COO→CSC conversion**: The flat DataFrame is then converted to COO format\n(row, col, data arrays) then to CSC. 
The peak is roughly 3× the final matrix size —\nboth the COO intermediate and the final CSC exist in memory simultaneously during\nconversion.\n\n**3. Solution unpacking**: After solve, the dense solver solution vector is unpacked\ninto xarray DataArrays per variable. Three copies exist briefly: solver numpy array\n→ pandas Series with int index → xarray DataArray. This adds ~3× N_vars × 8 bytes\nof temporary memory.\n\nAvoid triggering repeated matrix builds — they repeat spikes 1 and 2 each time:"
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import linopy\n",
+ "import numpy as np\n",
+ "\n",
+ "# BAD: triggers matrix build twice\n",
+ "A = model.matrices.A   # builds the CSC matrix (plain property — not cached)\n",
+ "b = model.matrices.b   # also a plain property, recomputed on every access\n",
+ "A2 = model.matrices.A  # rebuilds the whole matrix from scratch!\n",
+ "\n",
+ "# GOOD: access once, store locally\n",
+ "matrices = model.matrices\n",
+ "A = matrices.A\n",
+ "b = matrices.b\n",
+ "c = matrices.c"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Checking sparsity\n",
+ "\n",
+ "The sparse $A$ matrix only saves memory if the model is actually sparse. Check before\n",
+ "committing to large-scale solving:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "A = model.matrices.A\n",
+ "n_rows, n_cols = A.shape\n",
+ "density = A.nnz / (n_rows * n_cols)\n",
+ "print(f\"Matrix shape: {n_rows:,} × {n_cols:,}\")\n",
+ "print(f\"Non-zeros: {A.nnz:,}\")\n",
+ "print(f\"Density: {density:.4%}\")\n",
+ "print(f\"Memory (CSC): {A.data.nbytes / 1e9:.2f} GB\")\n",
+ "print(f\"Dense equivalent: {n_rows * n_cols * 8 / 1e9:.1f} GB\")\n",
+ "# Typical power systems: <0.01% density → >99% savings"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## 2. 
Chunked time-horizon solving\n",
+ "\n",
+ "For year-long horizons (8760 h), building a single linopy model is impractical —\n",
+ "it would require ~500 GB for the LP alone. The standard approach is **chunked solving**:\n",
+ "solve weekly (168 h) sub-problems sequentially, passing boundary conditions between\n",
+ "chunks (e.g., storage state-of-charge).\n",
+ "\n",
+ "### Basic chunking pattern"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import pypsa\n",
+ "import numpy as np\n",
+ "import gc\n",
+ "import platform\n",
+ "import ctypes\n",
+ "\n",
+ "CHUNK_HOURS = 168  # 1 week\n",
+ "\n",
+ "def release_memory():\n",
+ "    \"\"\"Return freed pages to the OS (important on Linux/macOS).\"\"\"\n",
+ "    gc.collect()\n",
+ "    if platform.system() == 'Darwin':\n",
+ "        libc = ctypes.CDLL('libSystem.B.dylib')\n",
+ "        libc.malloc_zone_pressure_relief(0, 0)\n",
+ "    elif platform.system() == 'Linux':\n",
+ "        libc = ctypes.CDLL('libc.so.6')\n",
+ "        libc.malloc_trim(0)\n",
+ "\n",
+ "def solve_chunked(network: pypsa.Network, chunk_hours: int = CHUNK_HOURS):\n",
+ "    snapshots = network.snapshots\n",
+ "    n_chunks = int(np.ceil(len(snapshots) / chunk_hours))\n",
+ "    results = []\n",
+ "\n",
+ "    for i in range(n_chunks):\n",
+ "        chunk_snaps = snapshots[i * chunk_hours : (i + 1) * chunk_hours]\n",
+ "        print(f\"Chunk {i+1}/{n_chunks}: {chunk_snaps[0]} – {chunk_snaps[-1]}\")\n",
+ "\n",
+ "        status, condition = network.optimize(\n",
+ "            solver_name='highs',\n",
+ "            snapshots=chunk_snaps,\n",
+ "            io_api='direct',  # avoids writing LP files to disk\n",
+ "        )\n",
+ "\n",
+ "        if status == 'ok':\n",
+ "            # .loc selects rows (snapshots); p[chunk_snaps] would try columns\n",
+ "            results.append(network.generators_t.p.loc[chunk_snaps].copy())\n",
+ "\n",
+ "        # Critical: free the linopy model before the next chunk\n",
+ "        if hasattr(network, 'model'):\n",
+ "            del network.model\n",
+ "        release_memory()\n",
+ "\n",
+ "    return results"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Why `del network.model` matters\n", + "\n", + "After `network.optimize()`, the linopy model (including the xarray Datasets, sparse\n", + "matrix, and solver model) is attached as `network.model`. It is **not** freed until\n", + "you delete it explicitly — Python's reference count keeps it alive.\n", + "\n", + "At 1.4M variables/constraints over 168 snapshots, `network.model` consumes ~9 GB.\n", + "Without `del`, each of 52 chunks accumulates: after 2 chunks you have 18 GB resident.\n", + "\n", + "The `release_memory()` call after `del` is equally important on Linux: Python's\n", + "allocator (`malloc`) does not return freed heap pages to the OS until\n", + "`malloc_trim(0)` is called. Without it, RSS stays high even though Python\n", + "considers memory free.\n", + "\n", + "## 3. LP blueprint reuse across chunks\n", + "\n", + "For models where the **LP structure** (variables, constraints, sparsity pattern) is\n", + "identical across chunks and only **right-hand-side** values and **objective\n", + "coefficients** change, you can skip the linopy rebuild entirely for chunks 2+.\n", + "\n", + "The pattern:\n", + "1. **Chunk 1**: Build the full linopy model, solve, extract the HiGHS object and\n", + " enough metadata to update it directly.\n", + "2. **Chunks 2+**: Call `changeRowsBounds()` / `changeColsCost()` on the cached\n", + " HiGHS object, then re-solve — no linopy involved.\n", + "\n", + "This saves ~50 s and ~9 GB of allocation per chunk." 
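The blueprint code below leaves `_identify_tv_rows` implementation-specific. One generic heuristic — a sketch, not linopy or PyPSA API; the function name and the two-sample approach are assumptions — is to build the RHS vector for a couple of trial chunks and keep the rows whose values differ:

```python
import numpy as np

def find_time_varying_rows(rhs_samples: list) -> np.ndarray:
    """Return indices of rows whose RHS differs across sampled chunks.

    rhs_samples : list of equal-length 1-D RHS vectors, one per trial chunk.
    Rows identical in every sample are treated as static.
    """
    stacked = np.stack(rhs_samples)             # (n_samples, n_rows)
    varying = ~np.all(stacked == stacked[0], axis=0)
    return np.flatnonzero(varying).astype(np.int32)

# Toy example: 6 rows; only rows 1 and 4 change between two sampled weeks
rhs_week1 = np.array([0.0, 10.0, 5.0, 1.0, 7.0, 2.0])
rhs_week2 = np.array([0.0, 12.0, 5.0, 1.0, 6.5, 2.0])
print(find_time_varying_rows([rhs_week1, rhs_week2]))  # [1 4]
```

Sampling only two chunks can under-detect rows that happen to coincide; sampling three or four spread across the year is a cheap safeguard.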
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from dataclasses import dataclass, field\n", + "from typing import Any\n", + "import numpy as np\n", + "\n", + "@dataclass\n", + "class LPBlueprint:\n", + " \"\"\"Cached LP structure from chunk 1, reused for chunks 2+.\"\"\"\n", + " highs: Any # highspy.Highs object\n", + " n_rows: int\n", + " n_cols: int\n", + " sense: np.ndarray # '<' / '>' / '=' per row\n", + " tv_row_indices: np.ndarray # rows that change between chunks\n", + " tv_col_indices: np.ndarray # objective columns that change\n", + " # maps variable/constraint names → HiGHS column/row index arrays\n", + " var_keys: dict = field(default_factory=dict)\n", + " con_keys: dict = field(default_factory=dict)\n", + "\n", + "\n", + "def extract_blueprint(model) -> LPBlueprint:\n", + " \"\"\"\n", + " Extract LP blueprint from a solved linopy model (chunk 1).\n", + " \n", + " Call immediately after network.optimize() while network.model is still alive.\n", + " \"\"\"\n", + " h = model.solver_model # highspy.Highs\n", + " n_rows = h.getNumRow()\n", + " n_cols = h.getNumCol()\n", + "\n", + " # Identify time-varying rows: those whose RHS changes each chunk.\n", + " # In PyPSA this is generator p_max/p_min bounds + power balance rows.\n", + " # Tag them during model build, or detect by name pattern.\n", + " tv_rows = _identify_tv_rows(model) # implementation-specific\n", + " tv_cols = _identify_tv_objective_cols(model) # implementation-specific\n", + "\n", + " # Extract constraint sense once (does not change between chunks)\n", + " sense = np.array([h.getRowSense(r) for r in range(n_rows)])\n", + "\n", + " bp = LPBlueprint(\n", + " highs=h,\n", + " n_rows=n_rows,\n", + " n_cols=n_cols,\n", + " sense=sense,\n", + " tv_row_indices=tv_rows,\n", + " tv_col_indices=tv_cols,\n", + " )\n", + "\n", + " # Cache variable → HiGHS column index mappings\n", + " for var_name, var in model.variables.items():\n", + " 
bp.var_keys[var_name] = var.labels.values.copy() # shape (T, N_comp)\n", + "\n", + " # Detach HiGHS from linopy so it survives del network.model\n", + " model.solver_model = None\n", + " return bp\n", + "\n", + "\n", + "def reuse_blueprint(bp: LPBlueprint, new_rhs: np.ndarray, new_obj: np.ndarray):\n", + " \"\"\"\n", + " Update and re-solve using cached HiGHS object. Skips linopy entirely.\n", + " \n", + " Parameters\n", + " ----------\n", + " bp : blueprint from extract_blueprint()\n", + " new_rhs : updated RHS values for time-varying rows (len = len(bp.tv_row_indices))\n", + " new_obj : updated objective coefficients for time-varying columns\n", + " \"\"\"\n", + " h = bp.highs\n", + "\n", + " # Update row bounds (only the ~15% that change)\n", + " lower = np.where(bp.sense[bp.tv_row_indices] != '<', new_rhs, -np.inf)\n", + " upper = np.where(bp.sense[bp.tv_row_indices] != '>', new_rhs, np.inf)\n", + " h.changeRowsBounds(\n", + " len(bp.tv_row_indices),\n", + " bp.tv_row_indices.astype(np.int32),\n", + " lower.astype(np.float64),\n", + " upper.astype(np.float64),\n", + " )\n", + "\n", + " # Update objective\n", + " h.changeColsCost(\n", + " len(bp.tv_col_indices),\n", + " bp.tv_col_indices.astype(np.int32),\n", + " new_obj.astype(np.float64),\n", + " )\n", + "\n", + " # IMPORTANT: clear previous solution so presolve runs on the new data\n", + " h.clearSolver()\n", + "\n", + " # Solve with same options as chunk 1\n", + " h.run()\n", + "\n", + " return h.getInfoValue('primal_solution_status')[1]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Why `clearSolver()` is essential\n", + "\n", + "`h.clearSolver()` resets HiGHS's internal solver state (basis, solution vectors,\n", + "presolve reductions) without touching the LP data (rows, columns, matrix).\n", + "\n", + "Without it, HiGHS may try to warm-start from a stale basis that corresponds to a\n", + "**different** RHS — leading to wrong solutions or spurious infeasibility.\n", + "\n", 
+ "After `clearSolver()`, presolve runs on the fresh data. For a typical power-systems\n", + "LP, presolve reduces 1.4M rows → 128K rows (~89% reduction), which is why chunks 2+\n", + "can be **faster** than chunk 1 even though they skip the linopy build overhead.\n", + "\n", + "### Warm-starting across chunks (experimental)\n", + "\n", + "An alternative to cold-start IPM on every chunk is:\n", + "- **Chunk 1**: solve with IPM + crossover to produce a simplex basis\n", + "- **Chunks 2+**: solve with dual simplex warm-started from the prior basis\n", + "\n", + "This is most beneficial when consecutive chunks are structurally similar (e.g.,\n", + "adjacent weeks with similar weather). HiGHS maintainer @jajhall confirmed that\n", + "presolve correctly maps the basis through LP reductions, so warm-start works even\n", + "with presolve enabled (see [HiGHS issue #2911](https://github.com/ERGO-Code/HiGHS/issues/2911)).\n", + "\n", + "```python\n", + "# Chunk 1: IPM + crossover\n", + "h.setOptionValue('solver', 'ipm')\n", + "h.setOptionValue('run_crossover', 'on') # produces simplex basis\n", + "h.run()\n", + "\n", + "# Chunks 2+: warm dual simplex (do NOT call clearSolver())\n", + "update_rhs_and_obj(h, new_rhs, new_obj) # update data only\n", + "h.setOptionValue('solver', 'simplex')\n", + "h.setOptionValue('simplex_strategy', 1) # dual simplex\n", + "h.run()\n", + "```\n", + "\n", + "Whether warm simplex is faster than cold IPM depends on how much the basis changes\n", + "between chunks. For highly variable renewables (adjacent weeks differ greatly),\n", + "cold IPM may be competitive.\n", + "\n", + "## 4. Efficient RHS and objective updates\n", + "\n", + "When computing new RHS values from time-series data (e.g., hourly wind profiles),\n", + "vectorised numpy operations are significantly faster and more memory-efficient than\n", + "pandas operations." 
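A toy illustration of the broadcast fill used by `compute_gen_upper_good` below — the numbers are made up, but the mechanics (a read-only `np.broadcast_to` view, masked assignment into the existing array, no second $(T \times N)$ allocation) are the same:

```python
import numpy as np

# Toy (T × N) availability profile: generator 1 has no time series (NaN)
p_max_pu = np.array([[0.8, np.nan, 0.3],
                     [0.5, np.nan, 0.0]])
p_max_static = np.array([1.0, 0.9, 1.0])   # static per-unit fallback
p_nom = np.array([100.0, 50.0, 200.0])     # nominal capacity

# broadcast_to yields a read-only view — no extra (T × N) copy of the statics
static_view = np.broadcast_to(p_max_static, p_max_pu.shape)
nan_mask = np.isnan(p_max_pu)
p_max_pu[nan_mask] = static_view[nan_mask]  # fill only the gaps, in place

rhs = p_max_pu * p_nom                      # (T × N) generator upper bounds
print(rhs)
```

Generator 1's column is filled from its static value (0.9), so its bounds become `0.9 × 50 = 45` in every snapshot, while the other columns keep their time-series values.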
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "import pandas as pd\n", + "\n", + "# --- Generator upper bound RHS ---\n", + "\n", + "# BAD: pandas path — allocates intermediate Series/DataFrame at every step\n", + "def compute_gen_upper_bad(network, snapshots, gen_names):\n", + " p_max_pu = network.generators_t.p_max_pu.reindex(\n", + " index=snapshots, columns=gen_names\n", + " ).fillna(\n", + " network.generators.p_max_pu[gen_names]\n", + " ) # fillna broadcasts a Series → allocates a full (T × N) DataFrame\n", + " return (p_max_pu * network.generators.p_nom[gen_names]).values\n", + "\n", + "\n", + "# GOOD: numpy path — broadcast static values without expanding\n", + "def compute_gen_upper_good(network, snapshots, gen_names):\n", + " p_nom = network.generators.p_nom[gen_names].values # (N,)\n", + " p_max_static = network.generators.p_max_pu[gen_names].values # (N,)\n", + "\n", + " if len(network.generators_t.p_max_pu.columns) > 0:\n", + " p_max_pu = network.generators_t.p_max_pu.reindex(\n", + " index=snapshots, columns=gen_names\n", + " ).values # convert to numpy immediately — (T × N) float64\n", + "\n", + " # Fill NaNs via broadcast — avoids allocating a second (T × N) array\n", + " nan_mask = np.isnan(p_max_pu)\n", + " if nan_mask.any():\n", + " # np.broadcast_to returns a read-only view, not a copy\n", + " static_broadcast = np.broadcast_to(p_max_static[np.newaxis, :],\n", + " p_max_pu.shape)\n", + " p_max_pu[nan_mask] = static_broadcast[nan_mask]\n", + " else:\n", + " # All generators are static — use a view, not a copy\n", + " p_max_pu = np.broadcast_to(p_max_static[np.newaxis, :],\n", + " (len(snapshots), len(gen_names)))\n", + "\n", + " return p_max_pu * p_nom[np.newaxis, :] # (T × N) — element-wise, in-place" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Load aggregation pattern\n", + "\n", + "Aggregating time-varying loads to 
buses requires a groupby over columns. Since `DataFrame.groupby(axis=1)` is deprecated,\n",
+ "the transpose trick is the recommended route — for a single-dtype frame, `.T` is a cheap view, not a full copy:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def load_by_bus(network, snapshots, bus_names):\n",
+ "    \"\"\"Aggregate per-load time series to per-bus totals.\"\"\"\n",
+ "    loads = network.loads\n",
+ "    load_ts = network.loads_t.p_set.reindex(snapshots)\n",
+ "\n",
+ "    # Group by bus using the transposition trick:\n",
+ "    # .T groups rows (loads) → .sum().T gives (snapshots × buses)\n",
+ "    grouped = (\n",
+ "        load_ts\n",
+ "        .T\n",
+ "        .groupby(loads.bus)\n",
+ "        .sum()\n",
+ "        .T\n",
+ "    )\n",
+ "\n",
+ "    # Add static loads (p_set in loads, not loads_t)\n",
+ "    static_loads = loads[~loads.index.isin(load_ts.columns)]\n",
+ "    if len(static_loads) > 0:\n",
+ "        static_by_bus = static_loads.groupby('bus')['p_set'].sum()\n",
+ "        grouped = grouped.add(static_by_bus, fill_value=0.0)\n",
+ "\n",
+ "    return grouped.reindex(columns=bus_names, fill_value=0.0)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Sparse RHS assignment\n",
+ "\n",
+ "When the constraint label array maps variable indices to HiGHS row/column indices,\n",
+ "some entries are `-1` (missing / not applicable). 
Always filter before writing:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def push_rhs_to_highs(h, b_full, keys, sense):\n", + " \"\"\"\n", + " Push a dense RHS array to HiGHS, skipping invalid entries.\n", + " \n", + " Parameters\n", + " ----------\n", + " h : highspy.Highs object\n", + " b_full : (T × N) numpy array of RHS values (may contain NaN for unused entries)\n", + " keys : (T × N) int32 array of HiGHS row indices (-1 = missing)\n", + " sense : (n_rows,) array of '<', '>', '=' per row\n", + " \"\"\"\n", + " flat_keys = keys.ravel()\n", + " flat_b = b_full.ravel()\n", + " valid = flat_keys >= 0\n", + "\n", + " idx = flat_keys[valid].astype(np.int32)\n", + " vals = flat_b[valid].astype(np.float64)\n", + "\n", + " row_sense = sense[idx]\n", + " lower = np.where(row_sense != '<', vals, -np.inf)\n", + " upper = np.where(row_sense != '>', vals, np.inf)\n", + "\n", + " h.changeRowsBounds(len(idx), idx, lower, upper)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 5. Memory profiling\n", + "\n", + "Use `tracemalloc` (stdlib) or `memory_profiler` to identify where memory peaks occur." 
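One caveat before reaching for `tracemalloc`: it only sees allocations made through Python's allocator, so memory allocated natively inside the solver (HiGHS C++ code) is invisible to it. The stdlib `resource` module (POSIX only, so not Windows) reports the whole-process peak RSS, which does include solver memory — with the platform quirk that `ru_maxrss` is in kilobytes on Linux but bytes on macOS:

```python
import platform
import resource

def peak_rss_gb() -> float:
    """Lifetime peak RSS of this process in GB, native solver memory included."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS but kilobytes on Linux
    divisor = 1e9 if platform.system() == "Darwin" else 1e6
    return peak / divisor

print(f"Peak RSS so far: {peak_rss_gb():.3f} GB")
```

Use both together: `tracemalloc` to attribute Python-side spikes to source lines, and peak RSS to catch growth that happens inside the solver.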
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import tracemalloc\n", + "import linopy\n", + "\n", + "def profile_model_build(network, snapshots):\n", + " \"\"\"Profile memory usage during linopy model build and solve.\"\"\"\n", + " tracemalloc.start()\n", + "\n", + " # Snapshot 1: before model build\n", + " snap0 = tracemalloc.take_snapshot()\n", + "\n", + " status, condition = network.optimize(\n", + " solver_name='highs',\n", + " snapshots=snapshots,\n", + " io_api='direct',\n", + " )\n", + "\n", + " # Snapshot 2: after model build + solve\n", + " snap1 = tracemalloc.take_snapshot()\n", + "\n", + " # Report top allocations\n", + " stats = snap1.compare_to(snap0, 'lineno')\n", + " print(\"Top 10 memory consumers:\")\n", + " for stat in stats[:10]:\n", + " print(f\" {stat}\")\n", + "\n", + " current, peak = tracemalloc.get_traced_memory()\n", + " print(f\"\\nCurrent: {current / 1e9:.2f} GB\")\n", + " print(f\"Peak: {peak / 1e9:.2f} GB\")\n", + "\n", + " tracemalloc.stop()\n", + " return network.model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Quick memory estimate without solving\n", + "def estimate_model_memory(n_vars: int, n_cons: int, n_snapshots: int, avg_terms: float = 3.0) -> dict:\n", + " \"\"\"\n", + " Estimate peak memory for a linopy model.\n", + " \n", + " Returns dict with per-component estimates in GB.\n", + " \"\"\"\n", + " N, M, T = n_vars, n_cons, n_snapshots\n", + " nnz = int(M * T * avg_terms) # approximate non-zeros in A\n", + "\n", + " return {\n", + " 'variable_labels': 3 * 8 * N * T / 1e9, # labels + lower + upper\n", + " 'constraint_labels': 3 * 8 * M * T / 1e9, # labels + rhs + sign\n", + " 'sparse_A_steady': 3 * 8 * nnz / 1e9, # COO: data + row + col\n", + " 'sparse_A_peak': 3 * 3 * 8 * nnz / 1e9, # 3× during assembly\n", + " 'solution_vectors': 8 * (N + M) * T / 1e9,\n", + " 'xarray_overhead_est': 
0.1, # rough estimate\n", + " }\n", + "\n", + "\n", + "# Example: 1.4M vars, 1.4M constraints, 168 snapshots\n", + "est = estimate_model_memory(1_400_000, 1_400_000, 168)\n", + "total = sum(est.values())\n", + "for k, v in est.items():\n", + " print(f\" {k:<30} {v:.2f} GB\")\n", + "print(f\" {'TOTAL':<30} {total:.2f} GB\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": "## 6. Explicit cleanup between solves\n\nLinopy caches several computed properties on the `matrices` accessor. These are\ninvalidated automatically when the model changes, but if you hold references or\nare debugging, they can prevent garbage collection.\n\n### Which properties are cached vs recomputed\n\nOn `model.matrices`, properties split into two groups:\n\n**`@cached_property` — stored in `__dict__`, freed by `clean_cached_properties()`:**\n- `flat_vars` — pandas DataFrame of all variables (large)\n- `flat_cons` — pandas DataFrame of all constraints (large, ~7 columns × M×T×k rows)\n- `sol` — dense solution value vector\n- `dual` — dense dual value vector\n\n**`@property` — recomputed on every access (no cache to clear):**\n- `A`, `b`, `c`, `sense`, `lb`, `ub`, `vlabels`, `clabels` — matrix/vector accessors\n\nThe `flat_cons` DataFrame is typically the biggest cached object (often 2–4 GB). It\nis created by `matrices.A` and kept in `__dict__` until explicitly cleared or the\nmodel is deleted. `clean_cached_properties()` removes all `@cached_property` entries\nfrom `__dict__` in one call.\n\n### Cache invalidation" + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# After solving chunk 1, before building chunk 2:\n", + "\n", + "# 1. Explicitly clear cached matrix properties\n", + "if hasattr(network, 'model') and hasattr(network.model, 'matrices'):\n", + " network.model.matrices.clean_cached_properties() # frees ~2-3 GB\n", + "\n", + "# 2. 
Detach the solver model so a separately cached HiGHS object stays usable\n",
+ "if hasattr(network, 'model') and hasattr(network.model, 'solver_model'):\n",
+ "    network.model.solver_model = None\n",
+ "\n",
+ "# 3. Delete the linopy model\n",
+ "if hasattr(network, 'model'):\n",
+ "    del network.model\n",
+ "\n",
+ "# 4. Force Python GC + return pages to OS\n",
+ "gc.collect()\n",
+ "# Linux:\n",
+ "# ctypes.CDLL('libc.so.6').malloc_trim(0)\n",
+ "# macOS:\n",
+ "# ctypes.CDLL('libSystem.B.dylib').malloc_zone_pressure_relief(0, 0)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Monitoring RSS during a run"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import psutil\n",
+ "import os\n",
+ "\n",
+ "def rss_gb() -> float:\n",
+ "    \"\"\"Return current process RSS in GB.\"\"\"\n",
+ "    return psutil.Process(os.getpid()).memory_info().rss / 1e9\n",
+ "\n",
+ "\n",
+ "def solve_chunked_monitored(network, chunk_hours=168):\n",
+ "    snapshots = network.snapshots\n",
+ "    n_chunks = int(np.ceil(len(snapshots) / chunk_hours))\n",
+ "\n",
+ "    for i in range(n_chunks):\n",
+ "        chunk_snaps = snapshots[i * chunk_hours : (i + 1) * chunk_hours]\n",
+ "        rss_before = rss_gb()\n",
+ "\n",
+ "        network.optimize(\n",
+ "            solver_name='highs',\n",
+ "            snapshots=chunk_snaps,\n",
+ "            io_api='direct',\n",
+ "        )\n",
+ "        rss_during = rss_gb()\n",
+ "\n",
+ "        if hasattr(network, 'model'):\n",
+ "            del network.model\n",
+ "        release_memory()\n",
+ "        rss_after = rss_gb()\n",
+ "\n",
+ "        print(\n",
+ "            f\"Chunk {i+1:2d}/{n_chunks} | \"\n",
+ "            f\"before={rss_before:.1f} GB \"\n",
+ "            f\"during={rss_during:.1f} GB \"\n",
+ "            f\"after={rss_after:.1f} GB \"\n",
+ "            f\"freed={rss_during - rss_after:.1f} GB\"\n",
+ "        )"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## 7. 
Sparse-friendly constraint formulation\n", + "\n", + "The sparsity pattern of the $A$ matrix depends heavily on how constraints are\n", + "formulated. A few linopy-specific tips:\n", + "\n", + "### Use `merge=False` when constraints are per-snapshot\n", + "\n", + "When constraints only involve variables from the **same** snapshot (e.g., nodal\n", + "power balance), pass `merge=False` to avoid linopy trying to align coordinates\n", + "across snapshot dimensions:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Nodal power balance: sum of generator output = load at each bus per snapshot\n", + "# Both sides are (snapshot × bus) aligned — no cross-snapshot terms\n", + "model.add_constraints(\n", + " gen_power.groupby('bus').sum() == load_by_bus,\n", + " name='power_balance',\n", + " # merge=False avoids unnecessary coordinate broadcasting\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Avoid dense broadcasting in constraint expressions\n", + "\n", + "Multiplying a variable by a full $(T \\times N)$ coefficient array is fine.\n", + "But creating an expression involving **all snapshots × all generators × all buses**\n", + "simultaneously will produce a dense term tensor that blows up memory:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# BAD: creates a (T × N_gen × N_bus) term array\n", + "# incidence_matrix is (N_gen × N_bus) → broadcasting creates full product\n", + "gen_at_bus = gen_vars * incidence_matrix # AVOID for large N_gen, N_bus\n", + "\n", + "# GOOD: group by bus assignment using pandas/xarray groupby\n", + "# This processes one bus at a time without materialising the full product\n", + "gen_power_by_bus = gen_vars.groupby('bus').sum() # (T × N_bus) — sparse aggregation" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Use `filter_missings=True` 
(default) in `to_file` / `to_matrix`\n", + "\n", + "The default `filter_missings=True` removes rows/columns with all-NaN labels before\n", + "building the sparse matrix. For models with many generators that are inactive for\n", + "part of the horizon (e.g., solar PV at night), this can halve the matrix size.\n", + "\n", + "## Summary: quick-reference checklist\n", + "\n", + "| Pattern | Savings | When to use |\n", + "|---------|---------|-------------|\n", + "| `del network.model` + `release_memory()` after each chunk | ~9 GB per chunk | Always |\n", + "| `io_api='direct'` | Avoids LP file I/O | Always for HiGHS |\n", + "| LP blueprint reuse (skip linopy for chunks 2+) | ~50 s + ~9 GB per chunk | When LP structure is identical across chunks |\n", + "| `clearSolver()` before re-solve | Correct presolve on new data | With blueprint reuse |\n", + "| Numpy broadcast fill (not `fillna`) | Avoids intermediate (T×N) copy | RHS/objective computation |\n", + "| Sparse RHS push (`valid = keys >= 0`) | Smaller HiGHS API call | Always with label arrays |\n", + "| `filter_missings=True` | Smaller $A$ matrix | When generators have gaps |\n", + "| `matrices.clean_cached_properties()` | ~2–3 GB after solve | Before `del network.model` |\n", + "| `tracemalloc` / `psutil.rss_gb()` | Identify peak locations | Profiling phase |" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (pixi main)", + "language": "python", + "name": "python3" + }, + "language_info": { + "name": "python", + "version": "3.14.0" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} \ No newline at end of file From 682959c741a452fec2dc3ab6ee76637b5f3afe77 Mon Sep 17 00:00:00 2001 From: Mayk Thewessen Date: Tue, 17 Mar 2026 21:37:07 +0100 Subject: [PATCH 2/4] docs: add nblink for memory best-practices notebook --- doc/memory-best-practices.nblink | 3 +++ 1 file changed, 3 insertions(+) create mode 100644 doc/memory-best-practices.nblink diff --git a/doc/memory-best-practices.nblink 
b/doc/memory-best-practices.nblink new file mode 100644 index 00000000..58e39c63 --- /dev/null +++ b/doc/memory-best-practices.nblink @@ -0,0 +1,3 @@ +{ + "path": "../examples/memory-best-practices.ipynb" +} From e563a9a18555be571d0873ececd4a98af74612d2 Mon Sep 17 00:00:00 2001 From: Mayk Thewessen Date: Tue, 17 Mar 2026 21:38:19 +0100 Subject: [PATCH 3/4] docs: add memory-best-practices to User Guide toctree --- doc/index.rst | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/doc/index.rst b/doc/index.rst index fd7f9ed8..41e490b4 100644 --- a/doc/index.rst +++ b/doc/index.rst @@ -24,7 +24,7 @@ Main features flexible data-handling features: - Define (arrays of) contnuous or binary variables with - **coordinates**, e.g. time, consumers, etc. + **coordinates**, e.g. time, consumers, etc. - Apply **arithmetic operations** on the variables like adding, subtracting, multiplying with all the **broadcasting** potentials of xarray @@ -124,6 +124,7 @@ This package is published under MIT license. gpu-acceleration migrating-from-pyomo gurobi-double-logging + memory-best-practices .. 
toctree:: From 3831618bdb0921ae5d712dadd8d688c3c5c72d71 Mon Sep 17 00:00:00 2001 From: "pre-commit-ci[bot]" <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Tue, 17 Mar 2026 20:39:21 +0000 Subject: [PATCH 4/4] [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --- examples/memory-best-practices.ipynb | 185 ++++++++++++++++----------- 1 file changed, 107 insertions(+), 78 deletions(-) diff --git a/examples/memory-best-practices.ipynb b/examples/memory-best-practices.ipynb index 4e5393ba..087fe817 100644 --- a/examples/memory-best-practices.ipynb +++ b/examples/memory-best-practices.ipynb @@ -2,6 +2,7 @@ "cells": [ { "cell_type": "markdown", + "id": "7fb27b941602401d91542211134fc71a", "metadata": {}, "source": [ "# Memory-Efficient Linopy Usage: Best Practices\n", @@ -28,22 +29,23 @@ }, { "cell_type": "markdown", + "id": "acae54e37e7d407bbb7b55eff062a284", "metadata": {}, "source": "## 1. Understanding linopy's memory layout\n\nBefore optimising, it helps to understand where memory actually lives in a linopy model.\n\n### Data structures\n\nA `linopy.Model` stores its data primarily in `xarray.Dataset` objects — one per\nvariable and one per constraint. For a model with $N$ variables and $M$ constraints\nover $T$ timesteps:\n\n| Structure | Shape | Approx. memory (float64) |\n|-----------|-------|---------------------------|\n| Variable labels (`.labels`) | $(T, N)$ | $8TN$ bytes |\n| Variable bounds (`.lower`, `.upper`) | $2 \\times (T, N)$ | $16TN$ bytes |\n| Constraint labels + RHS + sign | $3 \\times (T, M)$ | $24TM$ bytes |\n| Expression term data (`_term` dim) | $(T, M, k)$ where $k$ = avg. 
terms | variable |\n| Sparse matrix $A$ (CSC) | $\\text{nnz} \\times 3$ arrays | $24 \\cdot \\text{nnz}$ bytes |\n| Solution + dual vectors | $(T, N + M)$ | $8T(N+M)$ bytes |\n\nFor a 1.4M variable / 1.4M constraint model with $T=168$ snapshots:\n- Variable storage: ~3.4 GB \n- Constraint storage: ~1.2 GB \n- Sparse $A$ matrix (~50M non-zeros): ~1.2 GB \n- **Peak during `matrices.A` assembly: ~3× the matrix size (~3.6 GB spike)**\n- **Peak during solution unpacking: ~3× variable count** (solver array + pandas indexed copy + xarray result)\n\n### The three memory spikes\n\nThere are three distinct memory peaks in a linopy solve cycle:\n\n**1. `constraints.flat` (triggered by `matrices.A`)**: Before building the sparse\nmatrix, linopy concatenates all constraints into a pandas DataFrame with ~7 columns\nand $(M \\times T \\times k)$ rows (where $k$ = average terms per constraint, typically\n2–4). At 120 bytes/row of pandas overhead, this temporarily consumes several GB\n*before* the CSC matrix itself exists.\n\n**2. COO→CSC conversion**: The flat DataFrame is then converted to COO format\n(row, col, data arrays), then to CSC. The peak is roughly 3× the final matrix size —\nboth the COO intermediate and the final CSC exist in memory simultaneously during\nconversion.\n\n**3. Solution unpacking**: After solve, the dense solver solution vector is unpacked\ninto xarray DataArrays per variable. Three copies exist briefly: solver numpy array\n→ pandas Series with int index → xarray DataArray. 
This adds ~3× N_vars × 8 bytes\nof temporary memory.\n\nAvoid triggering repeated matrix builds — they repeat spikes 1 and 2 each time:" }, { "cell_type": "code", "execution_count": null, + "id": "9a63283cbaf04dbcab1f6479b197f3a8", "metadata": {}, "outputs": [], "source": [ - "import linopy\n", "import numpy as np\n", "\n", "# BAD: triggers matrix build twice\n", - "A = model.matrices.A # builds CSC matrix\n", - "b = model.matrices.b # already cached — OK\n", - "A2 = model.matrices.A # re-builds if cache was cleared!\n", + "A = model.matrices.A  # builds CSC matrix\n", + "b = model.matrices.b  # cheap — the flat data is already cached\n", + "A2 = model.matrices.A  # recomputed on every access — spikes repeat!\n", "\n", "# GOOD: access once, store locally\n", "matrices = model.matrices\n", @@ -54,6 +56,7 @@ }, { "cell_type": "markdown", + "id": "8dd0d8092fe74a7c96281538738b07e2", "metadata": {}, "source": [ "### Checking sparsity\n", @@ -65,6 +68,7 @@ { "cell_type": "code", "execution_count": null, + "id": "72eea5119410473aa328ad9291626812", "metadata": {}, "outputs": [], "source": [ @@ -81,6 +85,7 @@ }, { "cell_type": "markdown", + "id": "8edb47106e1a46a883d545849b8ab81b", "metadata": {}, "source": [ "## 2. 
Chunked time-horizon solving\n", @@ -96,27 +101,30 @@ { "cell_type": "code", "execution_count": null, + "id": "10185d26023b46108eb7d9f57d49d2b3", "metadata": {}, "outputs": [], "source": [ - "import pypsa\n", - "import numpy as np\n", + "import ctypes\n", "import gc\n", "import platform\n", - "import ctypes\n", + "\n", + "import pypsa\n", "\n", "CHUNK_HOURS = 168 # 1 week\n", "\n", + "\n", "def release_memory():\n", " \"\"\"Return freed pages to the OS (important on Linux/macOS).\"\"\"\n", " gc.collect()\n", - " if platform.system() == 'Darwin':\n", - " libc = ctypes.CDLL('libSystem.B.dylib')\n", + " if platform.system() == \"Darwin\":\n", + " libc = ctypes.CDLL(\"libSystem.B.dylib\")\n", " libc.malloc_zone_pressure_relief(0, 0)\n", - " elif platform.system() == 'Linux':\n", - " libc = ctypes.CDLL('libc.so.6')\n", + " elif platform.system() == \"Linux\":\n", + " libc = ctypes.CDLL(\"libc.so.6\")\n", " libc.malloc_trim(0)\n", "\n", + "\n", "def solve_chunked(network: pypsa.Network, chunk_hours: int = CHUNK_HOURS):\n", " snapshots = network.snapshots\n", " n_chunks = int(np.ceil(len(snapshots) / chunk_hours))\n", @@ -124,19 +132,19 @@ "\n", " for i in range(n_chunks):\n", " chunk_snaps = snapshots[i * chunk_hours : (i + 1) * chunk_hours]\n", - " print(f\"Chunk {i+1}/{n_chunks}: {chunk_snaps[0]} – {chunk_snaps[-1]}\")\n", + " print(f\"Chunk {i + 1}/{n_chunks}: {chunk_snaps[0]} – {chunk_snaps[-1]}\")\n", "\n", " status, condition = network.optimize(\n", - " solver_name='highs',\n", + " solver_name=\"highs\",\n", " snapshots=chunk_snaps,\n", - " io_api='direct', # avoids writing LP files to disk\n", + " io_api=\"direct\",  # avoids writing LP files to disk\n", " )\n", "\n", - " if status == 'ok':\n", + " if status == \"ok\":\n", " results.append(network.generators_t.p.loc[chunk_snaps].copy())\n", "\n", " # Critical: free the linopy model before the next chunk\n", - " if hasattr(network, 'model'):\n", + " if hasattr(network, \"model\"):\n", " del network.model\n", " 
release_memory()\n", "\n", @@ -145,6 +153,7 @@ }, { "cell_type": "markdown", + "id": "8763a12b2bbd4a93a75aff182afb95dc", "metadata": {}, "source": [ "### Why `del network.model` matters\n", @@ -179,22 +188,24 @@ { "cell_type": "code", "execution_count": null, + "id": "7623eae2785240b9bd12b16a66d81610", "metadata": {}, "outputs": [], "source": [ "from dataclasses import dataclass, field\n", "from typing import Any\n", - "import numpy as np\n", + "\n", "\n", "@dataclass\n", "class LPBlueprint:\n", " \"\"\"Cached LP structure from chunk 1, reused for chunks 2+.\"\"\"\n", - " highs: Any # highspy.Highs object\n", + "\n", + " highs: Any # highspy.Highs object\n", " n_rows: int\n", " n_cols: int\n", - " sense: np.ndarray # '<' / '>' / '=' per row\n", - " tv_row_indices: np.ndarray # rows that change between chunks\n", - " tv_col_indices: np.ndarray # objective columns that change\n", + " sense: np.ndarray # '<' / '>' / '=' per row\n", + " tv_row_indices: np.ndarray # rows that change between chunks\n", + " tv_col_indices: np.ndarray # objective columns that change\n", " # maps variable/constraint names → HiGHS column/row index arrays\n", " var_keys: dict = field(default_factory=dict)\n", " con_keys: dict = field(default_factory=dict)\n", @@ -203,18 +214,18 @@ "def extract_blueprint(model) -> LPBlueprint:\n", " \"\"\"\n", " Extract LP blueprint from a solved linopy model (chunk 1).\n", - " \n", + "\n", " Call immediately after network.optimize() while network.model is still alive.\n", " \"\"\"\n", - " h = model.solver_model # highspy.Highs\n", + " h = model.solver_model # highspy.Highs\n", " n_rows = h.getNumRow()\n", " n_cols = h.getNumCol()\n", "\n", " # Identify time-varying rows: those whose RHS changes each chunk.\n", " # In PyPSA this is generator p_max/p_min bounds + power balance rows.\n", " # Tag them during model build, or detect by name pattern.\n", - " tv_rows = _identify_tv_rows(model) # implementation-specific\n", - " tv_cols = 
_identify_tv_objective_cols(model) # implementation-specific\n", + " tv_rows = _identify_tv_rows(model) # implementation-specific\n", + " tv_cols = _identify_tv_objective_cols(model) # implementation-specific\n", "\n", " # Extract constraint sense once (does not change between chunks)\n", " sense = np.array([h.getRowSense(r) for r in range(n_rows)])\n", @@ -240,7 +251,7 @@ "def reuse_blueprint(bp: LPBlueprint, new_rhs: np.ndarray, new_obj: np.ndarray):\n", " \"\"\"\n", " Update and re-solve using cached HiGHS object. Skips linopy entirely.\n", - " \n", + "\n", " Parameters\n", " ----------\n", " bp : blueprint from extract_blueprint()\n", @@ -250,8 +261,8 @@ " h = bp.highs\n", "\n", " # Update row bounds (only the ~15% that change)\n", - " lower = np.where(bp.sense[bp.tv_row_indices] != '<', new_rhs, -np.inf)\n", - " upper = np.where(bp.sense[bp.tv_row_indices] != '>', new_rhs, np.inf)\n", + " lower = np.where(bp.sense[bp.tv_row_indices] != \"<\", new_rhs, -np.inf)\n", + " upper = np.where(bp.sense[bp.tv_row_indices] != \">\", new_rhs, np.inf)\n", " h.changeRowsBounds(\n", " len(bp.tv_row_indices),\n", " bp.tv_row_indices.astype(np.int32),\n", @@ -272,11 +283,12 @@ " # Solve with same options as chunk 1\n", " h.run()\n", "\n", - " return h.getInfoValue('primal_solution_status')[1]" + " return h.getInfoValue(\"primal_solution_status\")[1]" ] }, { "cell_type": "markdown", + "id": "7cdc8c89c7104fffa095e18ddfef8986", "metadata": {}, "source": [ "### Why `clearSolver()` is essential\n", @@ -329,14 +341,15 @@ { "cell_type": "code", "execution_count": null, + "id": "b118ea5561624da68c537baed56e602f", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", - "import pandas as pd\n", "\n", "# --- Generator upper bound RHS ---\n", "\n", + "\n", "# BAD: pandas path — allocates intermediate Series/DataFrame at every step\n", "def compute_gen_upper_bad(network, snapshots, gen_names):\n", " p_max_pu = network.generators_t.p_max_pu.reindex(\n", @@ -349,8 +362,8 @@ 
"\n", "# GOOD: numpy path — broadcast static values without expanding\n", "def compute_gen_upper_good(network, snapshots, gen_names):\n", - " p_nom = network.generators.p_nom[gen_names].values # (N,)\n", - " p_max_static = network.generators.p_max_pu[gen_names].values # (N,)\n", + " p_nom = network.generators.p_nom[gen_names].values # (N,)\n", + " p_max_static = network.generators.p_max_pu[gen_names].values # (N,)\n", "\n", " if len(network.generators_t.p_max_pu.columns) > 0:\n", " p_max_pu = network.generators_t.p_max_pu.reindex(\n", @@ -361,19 +374,22 @@ " nan_mask = np.isnan(p_max_pu)\n", " if nan_mask.any():\n", " # np.broadcast_to returns a read-only view, not a copy\n", - " static_broadcast = np.broadcast_to(p_max_static[np.newaxis, :],\n", - " p_max_pu.shape)\n", + " static_broadcast = np.broadcast_to(\n", + " p_max_static[np.newaxis, :], p_max_pu.shape\n", + " )\n", " p_max_pu[nan_mask] = static_broadcast[nan_mask]\n", " else:\n", " # All generators are static — use a view, not a copy\n", - " p_max_pu = np.broadcast_to(p_max_static[np.newaxis, :],\n", - " (len(snapshots), len(gen_names)))\n", + " p_max_pu = np.broadcast_to(\n", + " p_max_static[np.newaxis, :], (len(snapshots), len(gen_names))\n", + " )\n", "\n", - " return p_max_pu * p_nom[np.newaxis, :] # (T × N) — element-wise, in-place" + " return p_max_pu * p_nom[np.newaxis, :] # (T × N) — element-wise, in-place" ] }, { "cell_type": "markdown", + "id": "938c804e27f84196a10c8828c723f798", "metadata": {}, "source": [ "### Load aggregation pattern\n", @@ -385,6 +401,7 @@ { "cell_type": "code", "execution_count": null, + "id": "504fb2a444614c0babb325280ed9130a", "metadata": {}, "outputs": [], "source": [ @@ -395,18 +412,12 @@ "\n", " # Group by bus using the transposition trick:\n", " # .T groups rows (loads) → .sum().T gives (snapshots × buses)\n", - " grouped = (\n", - " load_ts\n", - " .T\n", - " .groupby(loads.bus)\n", - " .sum()\n", - " .T\n", - " )\n", + " grouped = 
load_ts.T.groupby(loads.bus).sum().T\n", "\n", " # Add static loads (p_set in loads, not loads_t)\n", " static_loads = loads[~loads.index.isin(load_ts.columns)]\n", " if len(static_loads) > 0:\n", - " static_by_bus = static_loads.groupby('bus')['p_set'].sum()\n", + " static_by_bus = static_loads.groupby(\"bus\")[\"p_set\"].sum()\n", " grouped = grouped.add(static_by_bus, fill_value=0.0)\n", "\n", " return grouped.reindex(columns=bus_names, fill_value=0.0)" ] }, { "cell_type": "markdown", + "id": "59bbdb311c014d738909a11f9e486628", "metadata": {}, "source": [ "### Sparse RHS assignment\n", @@ -425,13 +437,14 @@ { "cell_type": "code", "execution_count": null, + "id": "b43b363d81ae4b689946ece5c682cd59", "metadata": {}, "outputs": [], "source": [ "def push_rhs_to_highs(h, b_full, keys, sense):\n", " \"\"\"\n", " Push a dense RHS array to HiGHS, skipping invalid entries.\n", - " \n", + "\n", " Parameters\n", " ----------\n", " h : highspy.Highs object\n", @@ -440,21 +453,22 @@ " sense : (n_rows,) array of '<', '>', '=' per row\n", " \"\"\"\n", " flat_keys = keys.ravel()\n", - " flat_b = b_full.ravel()\n", - " valid = flat_keys >= 0\n", + " flat_b = b_full.ravel()\n", + " valid = flat_keys >= 0\n", "\n", - " idx = flat_keys[valid].astype(np.int32)\n", + " idx = flat_keys[valid].astype(np.int32)\n", " vals = flat_b[valid].astype(np.float64)\n", "\n", " row_sense = sense[idx]\n", - " lower = np.where(row_sense != '<', vals, -np.inf)\n", - " upper = np.where(row_sense != '>', vals, np.inf)\n", + " lower = np.where(row_sense != \"<\", vals, -np.inf)\n", + " upper = np.where(row_sense != \">\", vals, np.inf)\n", "\n", " h.changeRowsBounds(len(idx), idx, lower, upper)" ] }, { "cell_type": "markdown", + "id": "8a65eabff63a45729fe45fb5ade58bdc", "metadata": {}, "source": [ "## 5. 
Memory profiling\n", @@ -465,11 +479,12 @@ { "cell_type": "code", "execution_count": null, + "id": "c3933fab20d04ec698c2621248eb3be0", "metadata": {}, "outputs": [], "source": [ "import tracemalloc\n", - "import linopy\n", + "\n", "\n", "def profile_model_build(network, snapshots):\n", " \"\"\"Profile memory usage during linopy model build and solve.\"\"\"\n", @@ -479,16 +494,16 @@ " snap0 = tracemalloc.take_snapshot()\n", "\n", " status, condition = network.optimize(\n", - " solver_name='highs',\n", + " solver_name=\"highs\",\n", " snapshots=snapshots,\n", - " io_api='direct',\n", + " io_api=\"direct\",\n", " )\n", "\n", " # Snapshot 2: after model build + solve\n", " snap1 = tracemalloc.take_snapshot()\n", "\n", " # Report top allocations\n", - " stats = snap1.compare_to(snap0, 'lineno')\n", + " stats = snap1.compare_to(snap0, \"lineno\")\n", " print(\"Top 10 memory consumers:\")\n", " for stat in stats[:10]:\n", " print(f\" {stat}\")\n", @@ -504,26 +519,29 @@ { "cell_type": "code", "execution_count": null, + "id": "4dd4641cc4064e0191573fe9c69df29b", "metadata": {}, "outputs": [], "source": [ "# Quick memory estimate without solving\n", - "def estimate_model_memory(n_vars: int, n_cons: int, n_snapshots: int, avg_terms: float = 3.0) -> dict:\n", + "def estimate_model_memory(\n", + " n_vars: int, n_cons: int, n_snapshots: int, avg_terms: float = 3.0\n", + ") -> dict:\n", " \"\"\"\n", " Estimate peak memory for a linopy model.\n", - " \n", + "\n", " Returns dict with per-component estimates in GB.\n", " \"\"\"\n", " N, M, T = n_vars, n_cons, n_snapshots\n", " nnz = int(M * T * avg_terms) # approximate non-zeros in A\n", "\n", " return {\n", - " 'variable_labels': 3 * 8 * N * T / 1e9, # labels + lower + upper\n", - " 'constraint_labels': 3 * 8 * M * T / 1e9, # labels + rhs + sign\n", - " 'sparse_A_steady': 3 * 8 * nnz / 1e9, # COO: data + row + col\n", - " 'sparse_A_peak': 3 * 3 * 8 * nnz / 1e9, # 3× during assembly\n", - " 'solution_vectors': 8 * (N + M) * T / 
1e9,\n", - " 'xarray_overhead_est': 0.1, # rough estimate\n", + " \"variable_labels\": 3 * 8 * N * T / 1e9, # labels + lower + upper\n", + " \"constraint_labels\": 3 * 8 * M * T / 1e9, # labels + rhs + sign\n", + " \"sparse_A_steady\": 3 * 8 * nnz / 1e9, # COO: data + row + col\n", + " \"sparse_A_peak\": 3 * 3 * 8 * nnz / 1e9, # 3× during assembly\n", + " \"solution_vectors\": 8 * (N + M) * T / 1e9,\n", + " \"xarray_overhead_est\": 0.1, # rough estimate\n", " }\n", "\n", "\n", @@ -537,27 +555,29 @@ }, { "cell_type": "markdown", + "id": "8309879909854d7188b41380fd92a7c3", "metadata": {}, "source": "## 6. Explicit cleanup between solves\n\nLinopy caches several computed properties on the `matrices` accessor. These are\ninvalidated automatically when the model changes, but if you hold references or\nare debugging, they can prevent garbage collection.\n\n### Which properties are cached vs recomputed\n\nOn `model.matrices`, properties split into two groups:\n\n**`@cached_property` — stored in `__dict__`, freed by `clean_cached_properties()`:**\n- `flat_vars` — pandas DataFrame of all variables (large)\n- `flat_cons` — pandas DataFrame of all constraints (large, ~7 columns × M×T×k rows)\n- `sol` — dense solution value vector\n- `dual` — dense dual value vector\n\n**`@property` — recomputed on every access (no cache to clear):**\n- `A`, `b`, `c`, `sense`, `lb`, `ub`, `vlabels`, `clabels` — matrix/vector accessors\n\nThe `flat_cons` DataFrame is typically the biggest cached object (often 2–4 GB). It\nis created by `matrices.A` and kept in `__dict__` until explicitly cleared or the\nmodel is deleted. `clean_cached_properties()` removes all `@cached_property` entries\nfrom `__dict__` in one call.\n\n### Cache invalidation" }, { "cell_type": "code", "execution_count": null, + "id": "3ed186c9a28b402fb0bc4494df01f08d", "metadata": {}, "outputs": [], "source": [ "# After solving chunk 1, before building chunk 2:\n", "\n", "# 1. 
Explicitly clear cached matrix properties\n", - "if hasattr(network, 'model') and hasattr(network.model, 'matrices'):\n", + "if hasattr(network, \"model\") and hasattr(network.model, \"matrices\"):\n", " network.model.matrices.clean_cached_properties() # frees ~2-3 GB\n", "\n", "# 2. Detach the solver model before deleting (prevents double-free)\n", - "if hasattr(network, 'model') and hasattr(network.model, 'solver_model'):\n", + "if hasattr(network, \"model\") and hasattr(network.model, \"solver_model\"):\n", " network.model.solver_model = None\n", "\n", "# 3. Delete the linopy model\n", - "if hasattr(network, 'model'):\n", + "if hasattr(network, \"model\"):\n", " del network.model\n", "\n", "# 4. Force Python GC + return pages to OS\n", @@ -570,6 +590,7 @@ }, { "cell_type": "markdown", + "id": "cb1e1581032b452c9409d6c6813c49d1", "metadata": {}, "source": [ "### Monitoring RSS during a run" @@ -578,12 +599,15 @@ { "cell_type": "code", "execution_count": null, + "id": "379cbbc1e968416e875cc15c1202d7eb", "metadata": {}, "outputs": [], "source": [ - "import psutil\n", "import os\n", "\n", + "import psutil\n", + "\n", + "\n", "def rss_gb() -> float:\n", " \"\"\"Return current process RSS in GB.\"\"\"\n", " return psutil.Process(os.getpid()).memory_info().rss / 1e9\n", @@ -591,26 +615,26 @@ "\n", "def solve_chunked_monitored(network, chunk_hours=168):\n", " snapshots = network.snapshots\n", - " n_chunks = int(np.ceil(len(snapshots) / chunk_hours))\n", + " n_chunks = int(np.ceil(len(snapshots) / chunk_hours))\n", "\n", " for i in range(n_chunks):\n", " chunk_snaps = snapshots[i * chunk_hours : (i + 1) * chunk_hours]\n", - " rss_before = rss_gb()\n", + " rss_before = rss_gb()\n", "\n", " network.optimize(\n", - " solver_name='highs',\n", + " solver_name=\"highs\",\n", " snapshots=chunk_snaps,\n", - " io_api='direct',\n", + " io_api=\"direct\",\n", " )\n", " rss_during = rss_gb()\n", "\n", - " if hasattr(network, 'model'):\n", + " if hasattr(network, \"model\"):\n", " del 
network.model\n", " release_memory()\n", " rss_after = rss_gb()\n", "\n", " print(\n", - " f\"Chunk {i+1:2d}/{n_chunks} | \"\n", + " f\"Chunk {i + 1:2d}/{n_chunks} | \"\n", " f\"before={rss_before:.1f} GB \"\n", " f\"during={rss_during:.1f} GB \"\n", " f\"after={rss_after:.1f} GB \"\n", @@ -620,6 +644,7 @@ }, { "cell_type": "markdown", + "id": "277c27b1587741f2af2001be3712ef0d", "metadata": {}, "source": [ "## 7. Sparse-friendly constraint formulation\n", @@ -637,20 +662,22 @@ { "cell_type": "code", "execution_count": null, + "id": "db7b79bc585a40fcaf58bf750017e135", "metadata": {}, "outputs": [], "source": [ "# Nodal power balance: sum of generator output = load at each bus per snapshot\n", "# Both sides are (snapshot × bus) aligned — no cross-snapshot terms\n", "model.add_constraints(\n", - " gen_power.groupby('bus').sum() == load_by_bus,\n", - " name='power_balance',\n", + " gen_power.groupby(\"bus\").sum() == load_by_bus,\n", + " name=\"power_balance\",\n", " # merge=False avoids unnecessary coordinate broadcasting\n", ")" ] }, { "cell_type": "markdown", + "id": "916684f9a58a4a2aa5f864670399430d", "metadata": {}, "source": [ "### Avoid dense broadcasting in constraint expressions\n", @@ -663,20 +690,22 @@ { "cell_type": "code", "execution_count": null, + "id": "1671c31a24314836a5b85d7ef7fbf015", "metadata": {}, "outputs": [], "source": [ "# BAD: creates a (T × N_gen × N_bus) term array\n", "# incidence_matrix is (N_gen × N_bus) → broadcasting creates full product\n", - "gen_at_bus = gen_vars * incidence_matrix # AVOID for large N_gen, N_bus\n", + "gen_at_bus = gen_vars * incidence_matrix # AVOID for large N_gen, N_bus\n", "\n", "# GOOD: group by bus assignment using pandas/xarray groupby\n", "# This processes one bus at a time without materialising the full product\n", - "gen_power_by_bus = gen_vars.groupby('bus').sum() # (T × N_bus) — sparse aggregation" + "gen_power_by_bus = gen_vars.groupby(\"bus\").sum() # (T × N_bus) — sparse aggregation" ] }, { 
"cell_type": "markdown", + "id": "33b0902fd34d4ace834912fa1002cf8e", "metadata": {}, "source": [ "### Use `filter_missings=True` (default) in `to_file` / `to_matrix`\n", @@ -714,4 +743,4 @@ }, "nbformat": 4, "nbformat_minor": 5 -} \ No newline at end of file +}