add variable statistic for 'Final Energy [by Sector]|Industry' #16

Open
maxnutz wants to merge 6 commits into main from
12-new-variable-statistics-final-energy-by-sectorindustry

Conversation

@maxnutz
Owner

@maxnutz maxnutz commented Mar 25, 2026

  • adds new function for Final Energy -> Industry
  • includes statistics of component Load and carriers:
        "coal for industry",
        "industry electricity",
        "gas for industry",
        "H2 for industry",
        "solid biomass for industry",
        "industry methanol",
        "naphtha for industry",
        "low-temperature heat for industry",
  • adds mapping to mapping.default.yaml

  • includes testing class for function

  • pixi run pytest tests/ -v : all tests passed

  • pixi run workflow: workflow executed successfully

closes #12

Summary by Sourcery

Add industry-sector final energy statistics and align existing final energy helpers with single-network inputs.

New Features:

  • Introduce Final_Energy_by_Sector__Industry statistics function aggregating industry load carriers into country- and unit-indexed results.

Enhancements:

  • Update Final_Energy_by_Carrier__Electricity and Final_Energy_by_Sector__Transportation to operate on a single PyPSA Network and return country- and unit-indexed series instead of collection-wide dummy data.
  • Extend the default statistics mapping configuration to expose the new industry-sector final energy variable.

Tests:

  • Add a dedicated test suite for Final_Energy_by_Sector__Industry covering shape, indexing, numeric output, presence of Austria data, and handling of multiple networks.

@maxnutz maxnutz linked an issue Mar 25, 2026 that may be closed by this pull request
7 tasks
@sourcery-ai
Contributor

sourcery-ai bot commented Mar 25, 2026

Reviewer's Guide

Implements a new Final_Energy_by_Sector__Industry statistics function (including tests and configuration mapping) and aligns existing final energy functions for electricity and transportation to operate on a single pypsa.Network while returning a country/unit-indexed Series instead of a long-format DataFrame.

Sequence diagram for Final_Energy_by_Sector__Industry statistics extraction

sequenceDiagram
    participant Caller as "Caller"
    participant StatisticsFunctions as "statistics_functions.Final_Energy_by_Sector__Industry"
    participant Network as "pypsa.Network n"
    participant PypsaStatistics as "n.statistics"
    participant Pandas as "pandas (groupby/sum)"

    Caller->>StatisticsFunctions: Final_Energy_by_Sector__Industry(n)
    StatisticsFunctions->>Network: access statistics
    Network->>PypsaStatistics: energy_balance(carrier=[industry_carriers],\ncomponents="Load",\ngroupby=["carrier","unit","country"],\ndirection="withdrawal")
    PypsaStatistics-->>StatisticsFunctions: DataFrame indexed by carrier, unit, country
    StatisticsFunctions->>Pandas: groupby(["country","unit"]).sum()
    Pandas-->>StatisticsFunctions: Series with MultiIndex (country, unit)
    StatisticsFunctions-->>Caller: pd.Series result
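The aggregation step at the end of this diagram can be illustrated with plain pandas. The sketch below is only an illustration: the input is a hand-made Series carrying the (carrier, unit, country) index that the diagram attributes to `energy_balance`, the unit labels and all values are invented, and the helper name `aggregate_by_country_and_unit` is hypothetical. The diagram labels the intermediate result a DataFrame; using a Series here is an assumption.

```python
import pandas as pd

# Mock of an energy_balance result indexed by (carrier, unit, country).
# All values and unit labels are invented for illustration.
index = pd.MultiIndex.from_tuples(
    [
        ("coal for industry", "MWh_LHV", "AT"),
        ("gas for industry", "MWh_LHV", "AT"),
        ("industry electricity", "MWh_el", "AT"),
        ("gas for industry", "MWh_LHV", "DE"),
    ],
    names=["carrier", "unit", "country"],
)
balance = pd.Series([10.0, 20.0, 5.0, 40.0], index=index)


def aggregate_by_country_and_unit(balance: pd.Series) -> pd.Series:
    """Collapse the carrier level, keeping one value per (country, unit)."""
    return balance.groupby(["country", "unit"]).sum()


result = aggregate_by_country_and_unit(balance)
print(result)
```

Grouping by the index level names drops the `carrier` level, so carriers that share a unit within a country are summed, while different units stay on separate rows.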

Class diagram for updated statistics functions and pypsa integration

classDiagram
    class statistics_functions {
        +Final_Energy_by_Carrier__Electricity(n_pypsa_Network) pd_Series
        +Final_Energy_by_Sector__Transportation(n_pypsa_Network) pd_Series
        +Final_Energy_by_Sector__Industry(n_pypsa_Network) pd_Series
    }

    class pypsa_Network {
        +statistics pypsa_StatisticsAccessor
    }

    class pypsa_StatisticsAccessor {
        +energy_balance(carrier, components, groupby, direction) pd_DataFrame
    }

    class pd_Series {
    }

    class pd_DataFrame {
    }

    statistics_functions ..> pypsa_Network : uses
    pypsa_Network ..> pypsa_StatisticsAccessor : exposes
    statistics_functions ..> pypsa_StatisticsAccessor : calls energy_balance
    statistics_functions ..> pd_Series : returns
    pypsa_StatisticsAccessor ..> pd_DataFrame : returns

File-Level Changes

Change Details Files
Refine Final_Energy_by_Carrier__Electricity to operate on a single network and return a MultiIndex Series of withdrawals from the electricity system.
  • Update function docstring to refer to pypsa.Network instead of NetworkCollection and describe the new return type and behavior.
  • Use n.statistics.energy_balance with bus_carrier filters for AC and low voltage, grouping by unit and country and summing to get total withdrawals.
  • Remove dummy placeholder implementation and obsolete usage notes about NetworkCollection.
pypsa_validation_processing/statistics_functions.py
Refine Final_Energy_by_Sector__Transportation to compute transport-sector final energy from relevant load carriers and return a MultiIndex Series.
  • Update docstring to reflect pypsa.Network input, Series output with MultiIndex (country, unit), and clarify behavior and TODO for bidirectional EV usage.
  • Call n.statistics.energy_balance for transport-related carriers (EV, fuel cell, oil, kerosene, shipping methanol and oil) on Load components with withdrawal direction.
  • Aggregate by country and unit via groupby and sum to produce the final result Series.
pypsa_validation_processing/statistics_functions.py
Add new Final_Energy_by_Sector__Industry function to compute final energy consumption in the industry sector from specific load carriers.
  • Introduce Final_Energy_by_Sector__Industry that accepts a pypsa.Network and documents Series output indexed by country and unit.
  • Define an explicit list of industry-related carriers (coal, electricity, gas, H2, solid biomass, methanol, naphtha, low-temperature heat).
  • Use n.statistics.energy_balance filtered to these carriers on Load components with withdrawal direction, then group by country and unit and sum to get aggregated values.
pypsa_validation_processing/statistics_functions.py
Add dedicated tests to validate Final_Energy_by_Sector__Industry behavior using mock networks.
  • Create TestFinalEnergyBySectorIndustry test class with tests for return type, index structure (country and unit), non-emptiness, numeric dtype, and inclusion of Austria in the index.
  • Add a multi-network test that iterates through MockNetworkCollection and asserts non-empty numeric output for each network.
tests/test_statistics_functions.py
Register the new industry-sector final energy statistic in the default mapping configuration.
  • Extend mapping.default.yaml to map the high-level variable name `Final Energy [by Sector]|Industry` to the new `Final_Energy_by_Sector__Industry` function.
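How such a mapping entry might be turned into a function call can be sketched as follows. The loader that actually reads mapping.default.yaml is not shown in this PR, so `resolve` and the stand-in `statistics_functions` namespace are hypothetical; only the mapping keys and function names come from the PR.

```python
from types import SimpleNamespace

# Mapping entries as they appear in the PR, written as a dict so the
# sketch needs no YAML parser.
MAPPING = {
    "Final Energy [by Carrier]|Electricity": "Final_Energy_by_Carrier__Electricity",
    "Final Energy [by Sector]|Transportation": "Final_Energy_by_Sector__Transportation",
    "Final Energy [by Sector]|Industry": "Final_Energy_by_Sector__Industry",
}


def Final_Energy_by_Sector__Industry(n):
    """Placeholder standing in for the real statistics function."""
    return f"industry statistics for {n}"


# Stand-in for the statistics_functions module; only one function is defined here.
statistics_functions = SimpleNamespace(
    Final_Energy_by_Sector__Industry=Final_Energy_by_Sector__Industry
)


def resolve(variable, module):
    """Hypothetical lookup: variable name -> function name -> callable."""
    function_name = MAPPING[variable]
    func = getattr(module, function_name, None)
    if func is None:
        raise LookupError(f"no function {function_name!r} for {variable!r}")
    return func


func = resolve("Final Energy [by Sector]|Industry", statistics_functions)
print(func("n"))
```

Looking functions up by name keeps the configuration file as the single place where a variable is switched on, at the cost of deferring typos in function names to run time.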

Assessment against linked issues

Issue Objective Addressed Explanation
#12 Implement a new statistics function to extract 'Final Energy [by Sector]|Industry' from a PyPSA Network and compute the relevant industry final energy using appropriate carriers and components.
#12 Document the new 'Final Energy [by Sector]|Industry' statistics function with a comprehensive docstring describing parameters, returns, and calculation approach.
#12 Expose the new 'Final Energy [by Sector]|Industry' statistic via configuration and ensure it is tested (add mapping in mapping.default.yaml and corresponding tests under tests/).

Contributor

@sourcery-ai sourcery-ai bot left a comment

Hey - I've found 1 issue, and left some high level feedback:

  • The new Final_Energy_by_Sector__Industry docstring appears to be copy-pasted from the transportation function (mentions transportation twice); please update it to correctly describe the industry sector behavior.
  • The return type annotations for Final_Energy_by_Sector__Transportation and Final_Energy_by_Sector__Industry are -> pd.DataFrame, but the functions return a grouped pd.Series; aligning the type hints (and docstrings) with the actual return type would make the API clearer.
  • The list of industry carriers in Final_Energy_by_Sector__Industry is hard-coded inside the function; consider extracting it to a shared constant or config so it can be reused or adjusted without modifying the function logic.
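The last two points could be addressed along these lines. This is a sketch, not the PR's code: the constant name `INDUSTRY_CARRIERS` is hypothetical, the `energy_balance` call simply mirrors the one shown in the reviewer's guide, and a small stand-in object replaces `pypsa.Network` so the example runs without PyPSA installed.

```python
from types import SimpleNamespace

import pandas as pd

# Module-level constant, so the carrier list can be reused or adjusted
# without touching the function body. Carriers copied from the PR description.
INDUSTRY_CARRIERS = [
    "coal for industry",
    "industry electricity",
    "gas for industry",
    "H2 for industry",
    "solid biomass for industry",
    "industry methanol",
    "naphtha for industry",
    "low-temperature heat for industry",
]


def Final_Energy_by_Sector__Industry(n) -> pd.Series:
    """Sum industry Load withdrawals per (country, unit).

    `n` is expected to behave like a pypsa.Network; it is left unannotated
    here so the sketch does not depend on PyPSA.
    """
    return (
        n.statistics.energy_balance(
            carrier=INDUSTRY_CARRIERS,
            components="Load",
            groupby=["carrier", "unit", "country"],
            direction="withdrawal",
        )
        .groupby(["country", "unit"])
        .sum()
    )


# Stand-in for a network, for demonstration only (not PyPSA); values invented.
def _fake_energy_balance(carrier, components, groupby, direction):
    index = pd.MultiIndex.from_tuples(
        [
            ("gas for industry", "MWh_LHV", "AT"),
            ("coal for industry", "MWh_LHV", "AT"),
        ],
        names=groupby,
    )
    return pd.Series([3.0, 1.5], index=index)


fake_network = SimpleNamespace(
    statistics=SimpleNamespace(energy_balance=_fake_energy_balance)
)
result = Final_Energy_by_Sector__Industry(fake_network)
print(type(result).__name__, result.to_dict())
```

With the annotation set to `pd.Series`, the signature matches what the grouped sum returns.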

## Individual Comments

### Comment 1
<location path="pypsa_validation_processing/statistics_functions.py" line_range="22-24" />
<code_context>
 def Final_Energy_by_Carrier__Electricity(
     n: pypsa.Network,
 ) -> pd.DataFrame:
-    """Extract electricity final energy from a PyPSA NetworkCollection.
+    """Extract electricity final energy from a PyPSA Network.
</code_context>
<issue_to_address>
**issue:** The return type annotation conflicts with the documented/actual return type for the electricity final energy function.

The implementation and docstring show this returns a `pd.Series` with a `MultiIndex` on `country` and `unit`, but the signature says `-> pd.DataFrame`. Please either update the return annotation to `pd.Series` or adjust the function to actually return a `DataFrame` so the types are consistent.
</issue_to_address>

Collaborator

@pworschischek-aggmag pworschischek-aggmag left a comment

I like the architecture. I think having one function per variable is easy to understand -> hence great.

I left a long comment on a "gas for industry" gotcha I remember from writing sankey.py. Apart from that, I think it's the way to go!

@@ -66,24 +57,21 @@

def Final_Energy_by_Sector__Transportation(
Collaborator

What about EV charger losses here (not speaking of V2G, but EV batteries)?

Until now I always considered FED as "what's metered at the customer". Charger losses would become metered, I guess.

direction="withdrawal",
direction="withdrawal", # for positive values
)
.groupby(["country", "unit"])
Collaborator

If we want to go NUTS - pun intended - we'd need "location" in the groupby. Let's discuss this. I think Daniel has a module prepared that aggregates NUTS levels.
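A minimal pandas sketch of what keeping "location" in the grouping would preserve. The data is invented, the location labels are only NUTS-1-style placeholders, and whether the actual statistics call would accept "location" alongside "country" is not verified here.

```python
import pandas as pd

# Invented regional data indexed by (location, country, unit).
index = pd.MultiIndex.from_tuples(
    [
        ("AT1", "AT", "MWh_LHV"),
        ("AT2", "AT", "MWh_LHV"),
        ("AT3", "AT", "MWh_LHV"),
        ("DE1", "DE", "MWh_LHV"),
    ],
    names=["location", "country", "unit"],
)
regional = pd.Series([1.0, 2.0, 3.0, 4.0], index=index)

# Regional view: keep "location" as an index level.
by_location = regional.groupby(["location", "unit"]).sum()

# Country view: roll the same data up, dropping "location".
by_country = regional.groupby(["country", "unit"]).sum()

print(by_location)
print(by_country)
```

The country totals can always be recovered from the regional result, but not the other way round, which is the argument for grouping by location first and aggregating to NUTS levels or countries afterwards.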

carriers = [
"coal for industry",
"industry electricity",
"gas for industry",
Collaborator

I remember "gas for industry" being a special case:

# n.statistics.energy_balance(groupby=["carrier", "bus_carrier"], bus_carrier="gas for industry", aggregate_across_components=False)
component  carrier              bus_carrier     
Link       gas for industry     gas for industry    1.320132e+08
           gas for industry CC  gas for industry    4.345682e+07
Load       gas for industry     gas for industry   -1.754700e+08

The gas for industry CC Link has efficiencies < 1.0:

# n.links.filter(like="gas for industry CC", axis=0).filter(regex="bus|eff").iloc[0, :].T
bus0                        AL gas
bus1           AL gas for industry
efficiency                     0.9
bus4                              
efficiency4                    1.0
bus3                 AL co2 stored
efficiency3                 0.1881
bus2                co2 atmosphere
efficiency2                 0.0099
Name: AL gas for industry CC-2050, dtype: object

--> the captured CO2 is sequestered and not part of the Load withdrawal.

Following the mentioned logic that defines FED as "what's metered", and assuming that consumers need to pay for the additional gas for CCS, we'd need to sum the gas for industry and gas for industry CC Link withdrawals from the gas bus instead:

# n.statistics.withdrawal(groupby=["carrier", "bus_carrier"], bus_carrier="gas", carrier=["gas for industry", "gas for industry CC"], aggregate_across_components=False)
component  carrier              bus_carrier
Link       gas for industry     gas            1.320132e+08
           gas for industry CC  gas            4.828536e+07

sum = 180298535.78097
# n.statistics.energy_balance(groupby=["carrier", "bus_carrier"], bus_carrier="gas for industry", aggregate_across_components=False, comps="Load")
carrier           bus_carrier     
gas for industry  gas for industry   -175470000.0
180298535.78097 / 1e6 - 175470000.0 / 1e6 ≈ 4.83   # TWh

Bottom line is this:
Loads cannot always be used as FED directly if Links feeding their buses have efficiencies < 1.0.
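The arithmetic behind this can be checked with the rounded figures quoted above. The sketch assumes the model's energy unit is MWh (so dividing by 1e6 gives TWh) and that the plain gas for industry Link has an efficiency of 1.0, which the matching supply and withdrawal figures suggest; the variable names are illustrative.

```python
# Figures copied from the statistics output quoted in this comment
# (rounded; assumed to be in MWh).
link_withdrawal_from_gas_bus = {
    "gas for industry": 1.320132e08,
    "gas for industry CC": 4.828536e07,
}
load_withdrawal = 1.754700e08  # Load on the "gas for industry" bus
cc_efficiency = 0.9  # efficiency of the "gas for industry CC" Link

# Energy arriving at the "gas for industry" bus after Link losses.
delivered = (
    link_withdrawal_from_gas_bus["gas for industry"]
    + link_withdrawal_from_gas_bus["gas for industry CC"] * cc_efficiency
)

# Energy taken from the gas bus, i.e. the "metered" quantity.
metered = sum(link_withdrawal_from_gas_bus.values())

# Difference between the metered quantity and the Load, in TWh.
gap_twh = (metered - load_withdrawal) / 1e6

print(f"delivered: {delivered:.4e}, load: {load_withdrawal:.4e}")
print(f"metered minus load: {gap_twh:.2f} TWh")
```

The delivered energy matches the Load up to rounding, so the gap between the metered quantity and the Load is exactly the 10 % lost in the CC Link.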


Final Energy [by Carrier]|Electricity: Final_Energy_by_Carrier__Electricity
Final Energy [by Sector]|Transportation: Final_Energy_by_Sector__Transportation
Final Energy [by Sector]|Industry: Final_Energy_by_Sector__Industry
Collaborator

The Title case breaks with Python's convention to use snake_case for function names. Is there a reason? Just wondering.

@@ -1,3 +1,3 @@
"""Statistics functions for PyPSA validation processing.

Each function in this module corresponds to one IAMC variable and extracts
Collaborator

Very useful and should be placed prominently.

Development

Successfully merging this pull request may close these issues.

New Variable Statistics: Final Energy [by Sector]|Industry

2 participants