viper9503/CMPSC497Final

Scientific Approach Summarization with Flan-T5

Fine-tuning Flan-T5 to generate concise "approach" descriptions from scientific papers. Given a paper's Abstract + Introduction + Conclusion (AIC) text, the model produces a short, human-readable summary of the research approach.

CMPSC 497 Final Project — Manay Lodha


Motivation

Automatic generation of concise approach descriptions can greatly accelerate literature reviews and proposal writing. This project explores how well a medium-sized open-source LLM can learn this extreme summarization task from a domain-specific dataset.

Dataset

We use the SciTLDR dataset (5.4K TLDR summaries across 3.2K papers), specifically the AIC configuration. The expert-derived TLDR serves as the target summary.

Construction steps:

  1. Load and concatenate source sentences into a single prompt; concatenate target sentences into a single summary.
  2. Filter to pairs where the prompt has 20–300 words and the target has 10–100 words.
  3. Split into train / validation / test sets.
```python
from datasets import load_dataset

raw = load_dataset("allenai/scitldr", "AIC", split="train+validation+test")
```
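Step 2 above can be sketched as a simple word-count filter. This is a sketch only: the `prompt`/`summary` field names are assumptions about the intermediate format after step 1 (the raw SciTLDR records carry lists of source/target sentences).

```python
def within_bounds(example):
    # Step 2 filter: keep pairs whose prompt has 20-300 words and whose
    # target summary has 10-100 words (plain whitespace word counts).
    # "prompt" and "summary" are assumed field names, not the raw schema.
    n_prompt = len(example["prompt"].split())
    n_summary = len(example["summary"].split())
    return 20 <= n_prompt <= 300 and 10 <= n_summary <= 100
```

With `datasets`, this would typically be applied as `dataset.filter(within_bounds)` before splitting.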
| Split      | Examples |
|------------|----------|
| Train      | ~4,320   |
| Validation | ~540     |
| Test       | ~540     |
| Total      | ~5,400   |

All examples are saved as JSONL under data/.

Methodology

Tokenization

  • Tokenizer: AutoTokenizer.from_pretrained("google/flan-t5-base")
  • Max prompt length: 128 tokens
  • Max target length: 512 tokens

Training

  • Base model: google/flan-t5-base (250M parameters)
  • Framework: HuggingFace Transformers Trainer, single GPU
| Hyperparameter       | Value     |
|----------------------|-----------|
| Batch size           | 4         |
| Learning rate        | 5 × 10⁻⁵  |
| Epochs               | 3         |
| FP16                 | True      |
| Eval & save strategy | per-epoch |
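The hyperparameters above map onto HuggingFace training arguments roughly as follows. This is a presumed configuration, not the repository's actual `train.py`; key names follow the Transformers `Seq2SeqTrainingArguments` API.

```python
# Presumed training configuration mirroring the hyperparameter table;
# keys follow HuggingFace Seq2SeqTrainingArguments naming (the exact
# settings in the repo's train.py may differ).
train_config = {
    "per_device_train_batch_size": 4,
    "learning_rate": 5e-5,
    "num_train_epochs": 3,
    "fp16": True,
    "eval_strategy": "epoch",  # evaluate once per epoch
    "save_strategy": "epoch",  # checkpoint once per epoch
}
# Typically passed as: Seq2SeqTrainingArguments(output_dir=..., **train_config)
```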

Results

Quantitative

| Metric     | Score        |
|------------|--------------|
| ROUGE-1    | 0.1841       |
| ROUGE-2    | 0.0753       |
| ROUGE-L    | 0.1470       |
| ROUGE-Lsum | 0.1489       |
| Avg PPL    | 1,777,852.93 |
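For reference, ROUGE-1 is unigram-overlap F1 between a candidate and a reference summary. The scores above come from the `rouge-score` package (which also applies stemming); a bare-bones illustration of the metric:

```python
from collections import Counter

def rouge1_f1(candidate, reference):
    # Unigram-overlap F1: a minimal illustration of ROUGE-1, without the
    # stemming and tokenization applied by the rouge-score package.
    c = Counter(candidate.lower().split())
    r = Counter(reference.lower().split())
    overlap = sum((c & r).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(c.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)
```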

Qualitative Example

Prompt (truncated):

> We introduce a new procedural dynamic system that can generate a variety of shapes that often appear as curves…

Generated:

> We introduce a new procedural dynamic system that can generate a variety of shapes that often appear as curves… We introduce a new procedural dynamic system…

Reference:

> A new, very simple dynamic system is introduced that generates pretty patterns; properties are proved and possibilities are explored.

The model exhibits near-verbatim copying of the input prompt rather than producing concise, abstractive summaries.

Discussion

  • Copying behavior: The model frequently echoes the input, leading to poor abstractive summarization.
  • High perplexity: The fine-tuned model assigns very low probability to the reference summaries, suggesting it has learned to reproduce input text rather than model the target-summary distribution.
  • Possible causes:
    • Insufficient training epochs / dataset size for generalization.
    • Learning rate too high, causing early convergence to copying.
    • Lack of explicit instruction prompting (raw AIC text fed directly).
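For context on the perplexity figure: perplexity is the exponential of the mean per-token negative log-likelihood, so the reported Avg PPL of ~1.78 million corresponds to a mean NLL of roughly 14.4 nats per token (ln 1,777,853 ≈ 14.39). A minimal sketch of the computation:

```python
import math

def perplexity(token_nlls):
    # Perplexity = exp(mean per-token negative log-likelihood in nats).
    return math.exp(sum(token_nlls) / len(token_nlls))
```

For scale, a mean NLL of 3 nats gives perplexity e³ ≈ 20, so a perplexity in the millions means the references are nearly invisible to the model.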

Future Work

  • Prompt engineering — prepend explicit instructions (e.g., "Summarize the methods in one sentence:") to guide abstraction.
  • Longer training / larger model — experiment with flan-t5-large or more epochs at a lower learning rate.
  • Regularization — apply label smoothing or dropout to mitigate copying.
  • Data augmentation — incorporate additional summarization resources (e.g., abstract-to-TLDR pairs).
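The first bullet amounts to prepending an instruction to the AIC text before tokenization. A sketch, with an illustrative prefix (the exact wording is an assumption):

```python
def build_prompt(aic_text):
    # Hypothetical instruction prefix: flan-t5 checkpoints are
    # instruction-tuned, so an explicit directive can discourage
    # the verbatim copying observed in the results.
    return "Summarize the methods in one sentence: " + aic_text
```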

Quick Start

```bash
# Install dependencies
pip install transformers datasets rouge-score

# Prepare the dataset
python prepare_data.py

# Fine-tune
python train.py

# Evaluate
python evaluate.py
```

License

This project is for academic purposes (CMPSC 497).
