This repository was archived by the owner on Mar 13, 2026. It is now read-only.

umanlp/GroupAppeals

Appeal, Align, Divide? Stance Detection for Group-Directed Messages in German Parliamentary Debates

This repository contains the code and resources for the paper Appeal, Align, Divide? Stance Detection for Group-Directed Messages in German Parliamentary Debates.

  • Title: Appeal, Align, Divide? Stance Detection for Group-Directed Messages in German Parliamentary Debates
  • Authors: Ines Rehbein, Maris Leander Buttmann, Julian Schlenker and Simone Paolo Ponzetto
  • Institutions: University of Münster, University of Mannheim
  • Supplementary Material: The pre-built database and other materials can be found here.

Requirements

Hardware

  • To run the local LLM (gemma-3-27b-it), a high-performance GPU is required. The minimum requirement is an NVIDIA H100 NVL GPU with 94 GB of VRAM.

Configuration

  1. API Keys: Create a secrets.json file in the project's root directory. It must contain your API keys in the following format:
    {
        "gemini_api_key": "YOUR_GEMINI_API_KEY",
        "huggingface_api_key": "YOUR_HUGGINGFACE_API_KEY"
    }
  2. Project Path: The scripts use relative paths assuming the project folder (stance-detection-german-llm) is placed directly in your system's home directory (e.g., ~/stance-detection-german-llm).
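The keys in secrets.json can then be loaded at runtime. A minimal sketch, assuming the default project location described above (the helper name `load_secrets` is illustrative, not part of the repository):

```python
import json
from pathlib import Path

# Read API keys from secrets.json. The default root matches the expected
# project location (~/stance-detection-german-llm); pass another directory
# if your checkout lives elsewhere.
def load_secrets(root="~/stance-detection-german-llm"):
    path = Path(root).expanduser() / "secrets.json"
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```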

Getting Started

1. Database Setup

A database is required to run the classification scripts.

Recommended Method (Pre-built)

It is highly recommended to download the pre-built databases from here to save time.

Manual Method

Alternatively, you can build the database from scratch:

  1. Download the German parliamentary debates from here.
  2. Run the Jupyter notebook save_plenary_minutes.ipynb to parse the debates and build the initial database.
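Conceptually, the first step of the manual build amounts to parsing the minutes into a local database. A minimal SQLite sketch of that idea (the table and column names here are assumptions for illustration, not the notebook's actual schema):

```python
import sqlite3

# Illustrative sketch of the kind of table the notebook builds; the actual
# schema created by save_plenary_minutes.ipynb may differ.
def init_db(db_path="debates.db"):
    con = sqlite3.connect(db_path)
    con.execute(
        """CREATE TABLE IF NOT EXISTS speeches (
               id INTEGER PRIMARY KEY,
               speaker TEXT,
               party TEXT,
               date TEXT,
               text TEXT
           )"""
    )
    con.commit()
    return con
```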

2. Group Mention Classification

Note: This step is only necessary if you wish to re-classify the group mentions. The pre-built database "debates_with_group_mentions" already includes these classifications.

  1. Download the fine-tuned classifier (bert-base-german-cased-finetuned-MOPE-L3_Run_3_Epochs_29) from the official MOPE repository.
  2. Create a models/ folder in the project's root directory and place the downloaded classifier inside it.
  3. Run the extraction script. The --reset_db argument must be passed to clear existing data from the relevant tables.
    python extract-group-mention/extract_group_mention.py --reset_db
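For orientation, the downloaded classifier can also be queried directly through the Hugging Face pipeline API. This sketch assumes the model path from step 2; the example sentence and the GROUP label are purely illustrative, not the MOPE label set:

```python
# Sketch of querying the fine-tuned MOPE classifier directly; the model
# path matches step 2 above, while the labels it emits depend on how the
# model was fine-tuned.
MODEL_DIR = "models/bert-base-german-cased-finetuned-MOPE-L3_Run_3_Epochs_29"

def collect_mentions(entities):
    """Reduce a token-classification pipeline output to (span, label) pairs."""
    return [(e["word"], e["entity_group"]) for e in entities]

def classify_sentence(sentence, model_dir=MODEL_DIR):
    """Run the classifier on one sentence (requires the downloaded model)."""
    from transformers import pipeline  # heavyweight import, kept local
    clf = pipeline("token-classification", model=model_dir,
                   aggregation_strategy="simple")
    return collect_mentions(clf(sentence))
```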

Annotation Data

This section outlines the process for extracting, inserting, and evaluating annotation data.

Data Extraction

To extract a sample of data for annotation, run the following script:

python data-processing/extract_annotation_data.py

Data Insertion

To insert manually annotated data back into the database:

  1. Place the annotators' completed files into the /data/annotated_data/ folder.
  2. Run the processing script with the --reset_db argument. This will reset the corresponding tables before inserting the new data.
    python data-processing/process_annotated_data.py --reset_db
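Once several annotators' files have been inserted, their agreement can be sanity-checked, for example with Cohen's kappa. This is a generic, self-contained sketch, not one of the repository's scripts:

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa for two annotators' label sequences of equal length."""
    assert len(a) == len(b) and a
    n = len(a)
    # Observed agreement: fraction of items both annotators labeled the same.
    po = sum(x == y for x, y in zip(a, b)) / n
    # Expected agreement under chance, from each annotator's label distribution.
    ca, cb = Counter(a), Counter(b)
    pe = sum(ca[label] * cb[label] for label in set(a) | set(b)) / (n * n)
    if pe == 1.0:
        return 1.0
    return (po - pe) / (1 - pe)
```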

Running Stance Detection Models

  • Important: Before running a new inference, delete the previous predictions for that configuration from the database (e.g., DELETE FROM predictions WHERE [CONDITION]). Targeted deletion is recommended because it leaves results from other runs untouched.
  • To reset the entire predictions table instead, run:
    python inference/build_inference_table.py --reset_predictions
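A targeted deletion can also be scripted instead of issuing SQL by hand. In this sketch the model and prompt_type columns are assumptions about the predictions schema; check the actual table before using something like it:

```python
import sqlite3

# Hedged sketch: delete only the predictions for one run configuration
# instead of resetting the whole table. Column names are assumptions.
def delete_predictions(db_path, model, prompt_type):
    con = sqlite3.connect(db_path)
    cur = con.execute(
        "DELETE FROM predictions WHERE model = ? AND prompt_type = ?",
        (model, prompt_type),
    )
    con.commit()
    con.close()
    return cur.rowcount  # number of rows removed
```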

Gemini-2.5-Pro

Inference on the test set is designed to be run in parallel for different configurations.

  1. Run Inference: Call the script from the CLI for each prompt_type and technique combination.

    python inference/gemini_inference.py --api-key=YOUR_GEMINI_API_KEY --prompt-type=it-thinking_guideline_higher_standards --technique=zero-shot

    (Available prompt types can be found in inference/inference_helper.py)

  2. Insert Results: After the script generates a CSV output file, insert the results into the database using the following script:

    python inference/insert_gemini_predictions.py
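Conceptually, this insertion step reads the generated CSV and writes its rows into the predictions table. A minimal sketch of that idea; the sentence_id and prediction column names are illustrative assumptions, not the script's actual schema:

```python
import csv
import sqlite3

# Hedged sketch of loading an inference CSV into the database.
def insert_csv(db_path, csv_path):
    con = sqlite3.connect(db_path)
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = [(r["sentence_id"], r["prediction"]) for r in csv.DictReader(f)]
    con.executemany(
        "INSERT INTO predictions (sentence_id, prediction) VALUES (?, ?)", rows
    )
    con.commit()
    con.close()
    return len(rows)  # number of rows inserted
```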

Gemma-27b-it (Example Local Model)

To run inference using the local Gemma-27b-it model, execute the script:

python inference/gemma_27b_it_inference.py

Evaluate Results

  1. Run the following script to evaluate the LLM predictions:
    python evaluation/evaluation_script.py
  2. Inspect the results in the database via the "evaluation_results" table.
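For orientation, a macro-averaged F1 of the kind typically reported for stance detection can be computed as below. This is a self-contained sketch of the metric, not the repository's evaluation_script.py:

```python
from collections import defaultdict

def macro_f1(gold, pred):
    """Macro-averaged F1 over the labels that occur in the gold annotations."""
    counts = defaultdict(lambda: [0, 0, 0])  # label -> [tp, fp, fn]
    for g, p in zip(gold, pred):
        if g == p:
            counts[g][0] += 1
        else:
            counts[p][1] += 1  # false positive for the predicted label
            counts[g][2] += 1  # false negative for the gold label
    f1s = []
    for label in set(gold):
        tp, fp, fn = counts[label]
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)
```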

Customization & Known Issues

Adding New Prompt Types

New prompt types for the models can be added by modifying the inference/inference_helper.py script.

Known Issues

  • The inference_helper.py script is complex and could be improved. It is recommended to refactor it to dynamically parse prompts from a structured file (e.g., a JSON file) to better manage the different prompt_type and technique combinations.
