Fine-tuning Flan-T5 to generate concise "approach" descriptions from scientific papers. Given a paper's Abstract + Introduction + Conclusion (AIC) text, the model produces a short, human-readable summary of the research approach.
CMPSC 497 Final Project — Manay Lodha
Automatic generation of concise approach descriptions can greatly accelerate literature reviews and proposal writing. This project explores how well a medium-sized open-source LLM can learn this extreme summarization task from a domain-specific dataset.
We use the SciTLDR dataset (5.4K TLDR summaries across 3.2K papers), specifically the AIC configuration. The expert-derived TLDR serves as the target summary.
Construction steps:
- Load and concatenate `source` sentences into a single prompt; concatenate `target` sentences into a single summary.
- Filter to pairs where the prompt has 20–300 words and the target has 10–100 words.
- Split into train / validation / test sets.
```python
from datasets import load_dataset

raw = load_dataset("allenai/scitldr", "AIC", split="train+validation+test")
```

| Split | Examples |
|---|---|
| Train | ~4,320 |
| Validation | ~540 |
| Test | ~540 |
| Total | ~5,400 |
All examples are saved as JSONL under data/.
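A minimal sketch of what `prepare_data.py` might do under the steps above. The `prompt`/`summary` field names, the 80/10/10 re-split proportions, and the seed are assumptions, not values taken from the project.

```python
from datasets import load_dataset

# Pool all official SciTLDR splits before re-splitting, as described above.
raw = load_dataset("allenai/scitldr", "AIC", split="train+validation+test")

def build_pair(ex):
    # SciTLDR stores source and target as lists of sentences; join them.
    return {"prompt": " ".join(ex["source"]), "summary": " ".join(ex["target"])}

def keep(ex):
    # Keep 20-300 word prompts and 10-100 word targets, per the filter above.
    return (20 <= len(ex["prompt"].split()) <= 300
            and 10 <= len(ex["summary"].split()) <= 100)

pairs = raw.map(build_pair).filter(keep)

# Re-split roughly 80/10/10; exact proportions and seed are assumptions.
tmp = pairs.train_test_split(test_size=0.2, seed=42)
heldout = tmp["test"].train_test_split(test_size=0.5, seed=42)
tmp["train"].to_json("data/train.jsonl")
heldout["train"].to_json("data/validation.jsonl")
heldout["test"].to_json("data/test.jsonl")
```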
- Tokenizer: `AutoTokenizer.from_pretrained("google/flan-t5-base")`
- Max prompt length: 512 tokens
- Max target length: 128 tokens
- Base model: `google/flan-t5-base` (250M parameters)
- Framework: HuggingFace Transformers `Trainer`, single GPU
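A preprocessing sketch under these settings. The `data/*.jsonl` paths and the `prompt`/`summary` column names follow the preparation sketch above and are assumptions.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")

# Assumed file layout under data/, matching the preparation sketch above.
dataset = load_dataset(
    "json",
    data_files={"train": "data/train.jsonl",
                "validation": "data/validation.jsonl",
                "test": "data/test.jsonl"},
)

def preprocess(batch):
    # Truncate AIC prompts and TLDR targets to the max lengths listed above.
    model_inputs = tokenizer(batch["prompt"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True)
```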
| Hyperparameter | Value |
|---|---|
| Batch size | 4 |
| Learning rate | 5 × 10⁻⁵ |
| Epochs | 3 |
| FP16 | True |
| Eval & save strategy | per-epoch |
| Metric | Score |
|---|---|
| ROUGE-1 | 0.1841 |
| ROUGE-2 | 0.0753 |
| ROUGE-L | 0.1470 |
| ROUGE-Lsum | 0.1489 |
| Avg PPL | 1,777,852.93 |
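The average perplexity is presumably exp of the mean token-level cross-entropy the model assigns to the reference summaries. A sketch of how the ROUGE scores might be computed with the `rouge-score` package installed below; `predictions` and `references` are assumed to be parallel lists of generated and gold TLDRs.

```python
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(
    ["rouge1", "rouge2", "rougeL", "rougeLsum"], use_stemmer=True
)

# predictions/references: parallel lists of strings (assumed to exist).
scores = [scorer.score(ref, pred) for ref, pred in zip(references, predictions)]
for key in ("rouge1", "rouge2", "rougeL", "rougeLsum"):
    mean_f1 = sum(s[key].fmeasure for s in scores) / len(scores)
    print(f"{key}: {mean_f1:.4f}")
```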
Prompt (truncated):
We introduce a new procedural dynamic system that can generate a variety of shapes that often appear as curves…
Generated:
We introduce a new procedural dynamic system that can generate a variety of shapes that often appear as curves… We introduce a new procedural dynamic system…
Reference:
A new, very simple dynamic system is introduced that generates pretty patterns; properties are proved and possibilities are explored.
The model exhibits near-verbatim copying of the input prompt rather than producing concise, abstractive summaries.
- Copying behavior: The model frequently echoes the input, leading to poor abstractive summarization.
- High perplexity: The fine-tuned model assigns very low probability to reference summaries, suggesting over-reliance on the input distribution.
- Possible causes:
- Insufficient training epochs / dataset size for generalization.
- Learning rate too high, causing early convergence to copying.
- Lack of explicit instruction prompting (raw AIC text fed directly).
- Prompt engineering: prepend explicit instructions (e.g., "Summarize the methods in one sentence:") to guide abstraction; see the inference sketch after this list.
- Longer training / larger model: experiment with `flan-t5-large` or more epochs at a lower learning rate.
- Regularization: apply label smoothing or dropout to mitigate copying.
- Data augmentation: incorporate additional summarization resources (e.g., abstract-to-TLDR pairs).
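A sketch of instruction-prefixed inference that also constrains decoding to discourage the verbatim repetition shown above; the instruction string, checkpoint path, and generation settings are illustrative, not the project's.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("checkpoints")  # hypothetical fine-tuned dir

aic_text = "..."  # Abstract + Introduction + Conclusion of a paper
prompt = "Summarize the research approach in one sentence: " + aic_text

inputs = tokenizer(prompt, max_length=512, truncation=True, return_tensors="pt")
output = model.generate(
    **inputs,
    max_new_tokens=64,
    num_beams=4,
    no_repeat_ngram_size=3,  # blocks the repeated-span copying seen above
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```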
```bash
# Install dependencies
pip install transformers datasets rouge-score

# Prepare the dataset
python prepare_data.py

# Fine-tune
python train.py

# Evaluate
python evaluate.py
```

This project is for academic purposes (CMPSC 497).