
Paper2Any Logo

Paper2Any


English | δΈ­ζ–‡


✨ Focus on paper multimodal workflows: from paper PDFs/screenshots/text to one-click generation of model diagrams, technical roadmaps, experimental plots, and slide decks ✨

| πŸ“„ Universal File Support | 🎯 AI-Powered Generation | 🎨 Custom Styling | ⚑ Lightning Speed |


Quickstart Online Demo Docs Contributing WeChat

Paper2Any Web Interface



πŸ”₯ News

Tip

πŸ†• 2026-03-28 Β· Editable PPT Showcase Refresh
Added two new editable PPT showcase screenshots for the frontend-deck workflow:
a generated multi-slide gallery view and the canvas editing workspace with deck theme lock.

Tip

πŸ†• 2026-03-26 Β· Workflow Showcase Update
Added showcase coverage for Paper2Video, Paper2Poster, and Paper2Citation.
The README now includes a compressed video demo plus refreshed English/Chinese workflow previews.

Tip

πŸ†• 2026-02-02 Β· Paper2Rebuttal
Added rebuttal drafting support with structured response guidance and image-aware revision prompts.

Tip

πŸ†• 2026-01-28 Β· Drawio Update
Added Drawio support for visual diagram creation and showcase-ready outputs in the workflow.
Knowledge Base updates: multi-file PPT generation with document conversion and merging, optional image injection, and embedding-assisted retrieval.

Tip

πŸ†• 2026-01-25 Β· New Features
Added AI-assisted outline editing, three-layer model configuration system for flexible model selection, and user points management with daily quota allocation.
🌐 Online Demo: http://dcai-paper2any.nas.cpolar.cn/

Tip

πŸ†• 2026-01-20 Β· Bug Fixes
Fixed bugs in experimental plot generation (image/text) and resolved the missing historical files issue.
🌐 Online Demo: http://dcai-paper2any.nas.cpolar.cn/

Tip

πŸ†• 2026-01-14 Β· Feature Updates & Backend Architecture Upgrade

  1. Feature Updates: Added Image2PPT, optimized Paper2Figure interaction, and improved PDF2PPT effects.
  2. Standardized API: Refactored backend interfaces with RESTful /api/v1/ structure, removing obsolete endpoints for better maintainability.
  3. Dynamic Configuration: Supported dynamic model selection (e.g., GPT-4o, Qwen-VL) via API parameters, eliminating hardcoded model dependencies.
    🌐 Online Demo: http://dcai-paper2any.nas.cpolar.cn/
  • 2025-12-12 Β· Paper2Figure Web public beta is live
  • 2025-10-01 Β· Released the first version 0.1.0

✨ Core Features

From paper PDFs / images / text to editable scientific figures, slide decks, video scripts, academic posters, and other multimodal content in one click.

Paper2Any currently includes the following sub-capabilities:

  • πŸ“Š Paper2Figure - Editable Scientific Figures: Model architecture diagrams, technical roadmaps (PPT + SVG), and experimental plots with editable PPTX output.
  • 🧩 Paper2Diagram / Image2Drawio - Editable Diagrams: Generate draw.io diagrams from paper/text or images, with drawio/png/svg export and chat-based edits.
  • 🎬 Paper2PPT - Editable Slide Decks: Paper/text/topic to PPT, long-doc support, and built-in table/figure extraction.
  • πŸ“ Paper2Rebuttal: Draft structured rebuttals and revision responses with claims-to-evidence grounding.
  • πŸ–ΌοΈ PDF2PPT - Layout-Preserving Conversion: Accurate layout retention for PDF β†’ editable PPTX.
  • πŸ–ΌοΈ Image2PPT - Image to Slides: Convert images or screenshots into structured slides.
  • 🎨 PPTPolish - Smart Beautification: AI-based layout optimization and style transfer.
  • 🎬 Paper2Video: Generate video scripts and narration assets.
  • πŸ–ΌοΈ Paper2Poster - Academic Poster: Turn paper PDFs into poster-ready layouts with configurable sections, logos, and export assets.
  • πŸ”Ž Paper2Citation - Citation Explorer: Track citing authors, institutions, and notable downstream works from author names or DOI/paper URLs.
  • πŸ“ Paper2Technical: Produce technical reports and method summaries.
  • πŸ“š Knowledge Base (KB): Ingest/embedding, semantic search, and KB-driven PPT/podcast/mindmap generation.

πŸ“Έ Showcase

🧩 Drawio


✨ Upload a paper figure or screenshot as the starting point

✨ Keep the source structure visible before conversion

✨ Convert the image into an editable DrawIO canvas


✨ Generate a model or system diagram directly inside the DrawIO workbench

✨ Refine the generated architecture with chat editing and export-ready layout

πŸ“ Paper2Rebuttal: Rebuttal Drafting



✨ Rebuttal drafting and revision support

πŸ“Š Paper2Figure: Scientific Figure Generation



✨ Model Architecture Diagram Generation

✨ Model Architecture Diagram Generation




✨ Technical roadmap workbench: choose route type, input source, model config, and visual template

✨ Generated technical roadmap figure with structured dual-column layout




✨ Experimental Plot Generation (Multiple Styles)


🎬 Paper2PPT: Paper to Presentation


✨ End-to-end PPT generation demo

✨ Paper / text / topic to polished slide deck


✨ Edit slide text directly on canvas while keeping the deck theme locked

✨ Review the generated multi-page gallery before export


✨ AI-assisted outline refinement with targeted rewrite prompts

✨ Structured outline editing down to section and bullet detail




✨ Long document support for 40+ slides · Intelligent table extraction and insertion · Version history and iterative deck management

🎬 Paper2Video: PPT to Narrated Video



✨ PPT / PDF to narrated video with script confirmation, Aliyun TTS voices, and downloadable output

πŸ–ΌοΈ Paper2Poster: Paper to Poster



PNG poster result

PPT poster result

✨ Paper PDF to academic poster with configurable layout, editable poster output, and one-click export

πŸ”Ž Paper2Citation: Citation Explorer



✨ Search authors or papers to inspect citation candidates, institutions, and downstream citation context

🎨 PPT Smart Beautification



✨ AI-based Layout Optimization

✨ AI-based Layout Optimization & Style Transfer

πŸ–ΌοΈ PDF2PPT: Layout-Preserving Conversion



✨ Intelligent Cutout & Layout Preservation

✨ Image2PPT

πŸš€ Quick Start

Requirements

Python pip

.env Modes

Paper2Any now supports two configuration styles:

  • Simple mode: use *.env.simple.example. Recommended for most self-hosted users.
  • Advanced mode: use *.env.example. Use this only when you need workflow-specific model/provider overrides.

Quick choice:

cp fastapi_app/.env.simple.example fastapi_app/.env
cp frontend-workflow/.env.simple.example frontend-workflow/.env

If you need fine-grained workflow overrides instead:

cp fastapi_app/.env.example fastapi_app/.env
cp frontend-workflow/.env.example frontend-workflow/.env
🐳 Docker (Recommended) β€” Deployment & Updates
# 1. Clone
git clone https://github.com/OpenDCAI/Paper2Any.git
cd Paper2Any

# 2. Configure environment variables
cp fastapi_app/.env.simple.example fastapi_app/.env
cp frontend-workflow/.env.simple.example frontend-workflow/.env
cp deploy/docker.env.example deploy/docker.env

Required configuration:

fastapi_app/.env (backend):

# Internal API auth key. Must match frontend VITE_API_KEY.
BACKEND_API_KEY=your-backend-api-key

# Recommended: let backend own all workflow model choices
APP_BILLING_MODE=free
PAPER2ANY_CONFIG_MODE=simple

# Required: unified text entry
SIMPLE_TEXT_API_URL=https://your-text-gateway/v1
SIMPLE_TEXT_API_KEY=your_text_key

# Optional but recommended: unified image entry
SIMPLE_IMAGE_API_URL=https://your-image-gateway
SIMPLE_IMAGE_API_KEY=your_image_key

# Optional: DrawIO OCR / VLM service
SIMPLE_OCR_API_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
SIMPLE_OCR_API_KEY=your_dashscope_key

# Optional: MinerU official remote API
MINERU_API_BASE_URL=https://mineru.net/api/v4
MINERU_API_KEY=your_mineru_api_key

# Optional: SAM3 segmentation service for PDF2PPT / Image2PPT / Image2Drawio
# SAM3_SERVER_URLS=http://GPU_MACHINE_IP:8001
# SAM3_SERVER_URLS=http://GPU1:8021,http://GPU2:8022

# Optional: Supabase (skip for no auth β€” core features still work)
# SUPABASE_URL=https://your-project-id.supabase.co
# SUPABASE_ANON_KEY=your_supabase_anon_key

frontend-workflow/.env (frontend):

# Must match BACKEND_API_KEY in fastapi_app/.env
VITE_API_KEY=your-backend-api-key

# Usually keep VITE_API_BASE_URL empty in Docker, because nginx proxies /api and /outputs
VITE_API_BASE_URL=

# Frontend display defaults only
VITE_DEFAULT_LLM_API_URL=https://your-text-gateway/v1
VITE_DEFAULT_LLM_MODEL=gpt-4o

# Optional: Supabase (keep consistent with backend)
# VITE_SUPABASE_URL=https://your-project-id.supabase.co
# VITE_SUPABASE_ANON_KEY=your_supabase_anon_key

deploy/docker.env (compose overrides):

BACKEND_PORT=8000
FRONTEND_PORT=3000
DOCKER_APP_WORKERS=1

# Optional: enable local SAM3 container by running DOCKER_WITH_SAM3=1 bash deploy/docker-up.sh
SAM3_PORT=8021
SAM3_SERVER_URLS=
# 3. Build + run
bash deploy/docker-up.sh

Open http://localhost:3000 for the web interface; the backend API is served on port 8000 (both follow FRONTEND_PORT / BACKEND_PORT in deploy/docker.env).

GPU services note: Docker starts backend + frontend by default.

  • Paper2PPT, Paper2Figure, Knowledge Base, etc. only need LLM APIs and work out of the box.
  • PDF2PPT, Image2PPT, Image2Drawio require SAM3 segmentation.
  • You can either point backend .env to an external SAM3 service with SAM3_SERVER_URLS=..., or start the optional local SAM3 compose profile:
    DOCKER_WITH_SAM3=1 bash deploy/docker-up.sh

See the "Advanced Configuration: Local Model Service Load Balancing" section below for details.

Modify & update:

  • After changing code or .env, rebuild: bash deploy/docker-up.sh
  • Pull latest code and rebuild:
    • git pull
    • bash deploy/docker-up.sh

Common commands:

  • View logs: bash deploy/docker-logs.sh
  • Stop services: bash deploy/docker-down.sh
  • Build only: bash deploy/docker-build.sh

Notes:

  • The first build may take a while (system deps + Python deps).
  • Frontend env is baked at build time. If you change frontend-workflow/.env or deploy/docker.env, rebuild with bash deploy/docker-up.sh.
  • Outputs/models are mounted to the host (./outputs, ./models) for persistence.

🐧 Linux Installation

We recommend using Conda to create an isolated environment (Python 3.11).

1. Create Environment & Install Base Dependencies

# 0. Create and activate a conda environment
conda create -n paper2any python=3.11 -y
conda activate paper2any

# 1. Clone repository
git clone https://github.com/OpenDCAI/Paper2Any.git
cd Paper2Any

# 2. Install base dependencies
pip install -r requirements-base.txt

# 3. Install in editable (dev) mode
pip install -e .

2. Install Paper2Any-specific Dependencies (Required)

Paper2Any involves LaTeX rendering, vector graphics processing, and PPT/PDF conversion, all of which require extra dependencies.

The dependency boundary is now:

  • requirements-base.txt: shared cross-platform Python runtime
  • requirements-paper.txt: paper / PDF / figure extras
  • requirements-cu12.txt: NVIDIA CUDA 12 Linux GPU extras
  • requirements-system-ubuntu.txt: Ubuntu/Debian system packages, not Python packages
# 1. Paper / PDF / figure Python extras
pip install -r requirements-paper.txt

# 2. NVIDIA GPU runtime extras (Linux + CUDA 12 only)
pip install -r requirements-cu12.txt

# 3. LaTeX engine (tectonic) - recommended via conda
conda install -c conda-forge tectonic -y

# 4. Resolve doclayout_yolo dependency conflicts (Important)
pip install doclayout_yolo --no-deps

# 5. System dependencies (Ubuntu example; full list is mirrored in requirements-system-ubuntu.txt)
sudo apt-get update
sudo apt-get install -y ffmpeg inkscape libreoffice poppler-utils wkhtmltopdf

Important

ffmpeg, libreoffice/soffice, inkscape, poppler-utils, wkhtmltopdf, and tectonic are external system tools. They are not installed by pip, and deploy/start*.sh does not auto-install them.
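Since deploy/start*.sh does not auto-install these tools, a quick PATH check can save a failed run. A minimal sketch (the tool names mirror the apt command above; soffice is the LibreOffice binary and pdftoppm comes from poppler-utils):

```python
import shutil

REQUIRED_TOOLS = ["ffmpeg", "soffice", "inkscape",
                  "pdftoppm", "wkhtmltopdf", "tectonic"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the subset of external tools not found on PATH."""
    return [t for t in tools if shutil.which(t) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing system tools:", ", ".join(missing))
    else:
        print("All required system tools found.")
```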

3. Environment Variables

export DF_API_KEY=your_api_key_here
export DF_API_URL=xxx  # Optional: if you need a third-party API gateway
export MINERU_DEVICES="0,1,2,3" # Optional: MinerU task GPU resource pool

Tip

πŸ“š For detailed configuration guide, see Configuration Guide for step-by-step instructions on configuring models, environment variables, and starting services.

4. Configure Environment Files (Optional)

πŸ“ Click to expand: Detailed .env Configuration Guide

Paper2Any uses two .env files for configuration. Both are optional - you can run the application without them using default settings.

Step 1: Copy Example Files
# Copy backend environment file
cp fastapi_app/.env.example fastapi_app/.env

# Copy frontend environment file
cp frontend-workflow/.env.example frontend-workflow/.env
Step 2: Backend Configuration (fastapi_app/.env)

Supabase (Optional) - Only needed if you want user authentication and cloud storage:

SUPABASE_URL=https://your-project-id.supabase.co
SUPABASE_ANON_KEY=your_supabase_anon_key

Model Configuration - Customize which models to use for different workflows:

# Default LLM API URL
DEFAULT_LLM_API_URL=http://123.129.219.111:3000/v1/

# Workflow-level defaults
PAPER2PPT_DEFAULT_MODEL=gpt-5.1
PAPER2PPT_DEFAULT_IMAGE_MODEL=gemini-3-pro-image-preview
PDF2PPT_DEFAULT_MODEL=gpt-4o
# ... see .env.example for full list
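Conceptually, model selection falls back through three layers: an explicit model in the API request, then the workflow-level default (e.g. PAPER2PPT_DEFAULT_MODEL), then a global default. A hypothetical sketch of that precedence, not the actual backend code (DEFAULT_LLM_MODEL here is an illustrative name):

```python
import os


def resolve_model(workflow, request_model=None):
    """Resolve a model name: API request > workflow default > global default.

    Hypothetical sketch of the three-layer precedence; env var names
    follow the .env examples above (e.g. PAPER2PPT_DEFAULT_MODEL).
    """
    if request_model:  # layer 1: explicit API parameter
        return request_model
    workflow_default = os.environ.get(f"{workflow.upper()}_DEFAULT_MODEL")
    if workflow_default:  # layer 2: per-workflow .env default
        return workflow_default
    # layer 3: global fallback (illustrative variable name)
    return os.environ.get("DEFAULT_LLM_MODEL", "gpt-4o")
```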

Service Integration Configuration - External or local services used by image/PDF workflows:

# DrawIO OCR / VLM
PAPER2DRAWIO_OCR_API_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
PAPER2DRAWIO_OCR_API_KEY=your_dashscope_key

# MinerU official remote API; if MINERU_API_KEY is empty, backend falls back to local MINERU_PORT
MINERU_API_BASE_URL=https://mineru.net/api/v4
MINERU_API_KEY=your_mineru_api_key
MINERU_API_MODEL_VERSION=vlm

# SAM3 segmentation service for PDF2PPT / Image2PPT / Image2Drawio
# One endpoint:
SAM3_SERVER_URLS=http://127.0.0.1:8001
# Or multiple endpoints for load balancing:
# SAM3_SERVER_URLS=http://127.0.0.1:8021,http://127.0.0.1:8022
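When several SAM3 endpoints are listed, requests are balanced across them. The backend's exact dispatch policy is not shown here; a simple round-robin over the comma-separated value would look like this hypothetical sketch:

```python
import itertools


def sam3_endpoints(env_value):
    """Split a comma-separated SAM3_SERVER_URLS value into clean URLs."""
    return [u.strip() for u in env_value.split(",") if u.strip()]


def round_robin(urls):
    """Yield endpoints in rotation, one per request."""
    return itertools.cycle(urls)


# Example usage:
# picker = round_robin(sam3_endpoints("http://127.0.0.1:8021,http://127.0.0.1:8022"))
# endpoint = next(picker)  # pick the next endpoint for each request
```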
Step 3: Frontend Configuration (frontend-workflow/.env)

LLM Provider Configuration - Controls the API endpoint dropdown in the UI:

# Default API URL shown in the UI
VITE_DEFAULT_LLM_API_URL=https://api.apiyi.com/v1

# Available API URLs in the dropdown (comma-separated)
VITE_LLM_API_URLS=https://api.apiyi.com/v1,http://b.apiyi.com:16888/v1,http://123.129.219.111:3000/v1

What happens when you modify VITE_LLM_API_URLS:

  • The frontend will display a dropdown menu with all URLs you specify
  • Users can select different API endpoints without manually typing URLs
  • Useful for switching between OpenAI, local models, or custom API gateways

Supabase (Optional) - Uncomment these lines if you want user authentication:

VITE_SUPABASE_URL=https://your-project.supabase.co
VITE_SUPABASE_ANON_KEY=your-anon-key
Running Without Supabase

If you skip Supabase configuration:

  • βœ… All core features work normally
  • βœ… CLI scripts do not require Supabase
  • ❌ No user authentication
  • ❌ No cloud account features such as points, redeem, invite, and history
  • ❌ No cloud file storage

Note

Quick Start: You can skip the .env configuration entirely and use CLI scripts directly with --api-key parameter. See CLI Scripts section below.


Advanced Configuration: Local Model Service Load Balancing

If you are deploying in a high-concurrency local environment, you can use script/start_model_servers.sh to start a local model service cluster (MinerU / SAM / OCR).

Script location: script/start_model_servers.sh

Main configuration items:

  • MinerU (PDF Parsing)

    • MINERU_MODEL_PATH: Model path (default models/MinerU2.5-2509-1.2B)
    • MINERU_GPU_UTIL: GPU memory utilization (default 0.85)
    • Instance configuration: By default, one instance is started on each configured GPU, ports 8011-8013.
    • Load Balancer: Port 8010, automatically dispatches requests.
  • SAM3 (Segment Anything Model 3)

    • Instance configuration: By default, one instance per configured GPU, ports 8021-8022.
    • Model assets: default paths are ./models/sam3/sam3.pt and ./models/sam3/bpe_simple_vocab_16e6.txt.gz.
    • Load Balancer: Port 8020.
  • OCR (PaddleOCR)

    • Config: Runs on CPU, uses uvicorn's worker mechanism (4 workers by default).
    • Port: 8003.

Before using, please modify gpu_id and the number of instances in the script according to your actual GPU count and memory.

For local one-command development test on a single GPU (SAM3 + backend + frontend), run:

bash script/start_local_sam3_dev.sh
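Once the cluster is up, you can probe the default ports listed above (MinerU balancer 8010, SAM3 balancer 8020, OCR 8003) with a short TCP check; a minimal sketch, assuming the services are bound on localhost:

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Default load-balancer / service ports from the section above
    for name, port in [("MinerU LB", 8010), ("SAM3 LB", 8020), ("OCR", 8003)]:
        state = "up" if is_listening("127.0.0.1", port) else "down"
        print(f"{name} (port {port}): {state}")
```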

πŸͺŸ Windows Installation

Note

We currently recommend trying Paper2Any on Linux / WSL. If you need to deploy on native Windows, please follow the steps below.

1. Create Environment & Install Base Dependencies

# 0. Create and activate a conda environment
conda create -n paper2any python=3.12 -y
conda activate paper2any

# 1. Clone repository
git clone https://github.com/OpenDCAI/Paper2Any.git
cd Paper2Any

# 2. Install base dependencies
pip install -r requirements-win-base.txt

# 3. Install in editable (dev) mode
pip install -e .

2. Install Paper2Any-specific Dependencies (Recommended)

Paper2Any involves LaTeX rendering and vector graphics processing, which require extra dependencies:

# Python dependencies
pip install -r requirements-paper.txt

# NVIDIA GPU runtime extras (Linux only; skip on Windows)
# pip install -r requirements-cu12.txt

# tectonic: LaTeX engine (recommended via conda)
conda install -c conda-forge tectonic -y

🎨 Install Inkscape (SVG/Vector Graphics Processing | Recommended/Required)

  1. Download and install (Windows 64-bit MSI): Inkscape Download
  2. Add the Inkscape executable directory to the system environment variable Path (example): C:\Program Files\Inkscape\bin\

Tip

After configuring the Path, it is recommended to reopen the terminal (or restart VS Code / PowerShell) to ensure the environment variables take effect.

⚑ Install Windows Build of vLLM (Optional | For Local Inference Acceleration)

Release page: vllm-windows releases
Recommended version: 0.11.0

pip install vllm-0.11.0+cu124-cp312-cp312-win_amd64.whl

Important

Please make sure the .whl matches your current environment:

  • Python: cp312 (Python 3.12)
  • Platform: win_amd64
  • CUDA: cu124 (must match your local CUDA / driver)

Launch Application

Paper2Any - Paper Workflow Web Frontend (Recommended)

# Recommended one-click entrypoint on NVIDIA machines
bash deploy/start_nv.sh

Default local addresses: frontend http://localhost:3000, backend http://127.0.0.1:8000 (health check at /health).

Useful local deploy commands:

  • Start full stack (recommended): bash deploy/start_nv.sh
  • Start backend only after loading a deploy profile: set -a && source deploy/profiles/nv.env && set +a && bash deploy/start.sh
  • Stop backend: ./deploy/stop.sh
  • Restart backend: ./deploy/restart.sh

Notes:

  • deploy/start.sh reads deploy/app_config.sh, but it does not load deploy/profiles/*.env by itself.
  • deploy/start_nv.sh is the safe one-click entrypoint because it loads deploy/profiles/nv.env, prepares local models, starts model servers, then starts backend and frontend.
  • If you change APP_PORT, update the frontend proxy target in frontend-workflow/vite.config.ts as well.

Configure Frontend Proxy

Modify server.proxy in frontend-workflow/vite.config.ts:

export default defineConfig({
  plugins: [react()],
  server: {
    port: 3000,
    open: true,
    allowedHosts: true,
    proxy: {
      '/api': {
        target: 'http://127.0.0.1:8000',  // FastAPI backend address
        changeOrigin: true,
      },
      '/outputs': {
        target: 'http://127.0.0.1:8000',
        changeOrigin: true,
      },
    },
  },
})

Visit http://localhost:3000.

Windows: Load MinerU Pre-trained Model

# Start in PowerShell
vllm serve opendatalab/MinerU2.5-2509-1.2B `
  --host 127.0.0.1 `
  --port 8010 `
  --logits-processors mineru_vl_utils:MinerULogitsProcessor `
  --gpu-memory-utilization 0.6 `
  --trust-remote-code `
  --enforce-eager

Launch Application

🎨 Web Frontend (Recommended)

# Recommended one-click entrypoint on NVIDIA machines
bash deploy/start_nv.sh

Visit http://localhost:3000. Backend health is available at http://127.0.0.1:8000/health by default.


πŸ–₯️ CLI Scripts (Command-Line Interface)

Paper2Any provides standalone CLI scripts that accept command-line parameters for direct workflow execution without requiring the web frontend/backend.

Environment Variables

Configure API access via environment variables (optional):

export DF_API_URL=https://api.openai.com/v1  # LLM API URL
export DF_API_KEY=sk-xxx                      # API key
export DF_MODEL=gpt-4o                        # Default model

Available CLI Scripts

1. Paper2Figure CLI - Generate scientific figures (3 types)

# Generate model architecture diagram from PDF
python script/run_paper2figure_cli.py \
  --input paper.pdf \
  --graph-type model_arch \
  --api-key sk-xxx

# Generate technical roadmap from text
python script/run_paper2figure_cli.py \
  --input "Transformer architecture with attention mechanism" \
  --input-type TEXT \
  --graph-type tech_route

# Generate experimental data visualization
python script/run_paper2figure_cli.py \
  --input paper.pdf \
  --graph-type exp_data

Graph types: model_arch (model architecture), tech_route (technical roadmap), exp_data (experimental plots)

2. Paper2PPT CLI - Convert papers to PPT presentations

# Basic usage
python script/run_paper2ppt_cli.py \
  --input paper.pdf \
  --api-key sk-xxx \
  --page-count 15

# With custom style
python script/run_paper2ppt_cli.py \
  --input paper.pdf \
  --style "Academic style; English; Modern design" \
  --language en

3. PDF2PPT CLI - One-click PDF to editable PPT

# Basic conversion (no AI enhancement)
python script/run_pdf2ppt_cli.py --input slides.pdf

# With AI enhancement
python script/run_pdf2ppt_cli.py \
  --input slides.pdf \
  --use-ai-edit \
  --api-key sk-xxx

4. Image2PPT CLI - Convert images to editable PPT

# Basic conversion
python script/run_image2ppt_cli.py --input screenshot.png

# With AI enhancement
python script/run_image2ppt_cli.py \
  --input diagram.jpg \
  --use-ai-edit \
  --api-key sk-xxx

5. PPT2Polish CLI - Beautify existing PPT files

# Basic beautification
python script/run_ppt2polish_cli.py \
  --input old_presentation.pptx \
  --style "Academic style, clean and elegant" \
  --api-key sk-xxx

# With reference image for style consistency
python script/run_ppt2polish_cli.py \
  --input old_presentation.pptx \
  --style "Modern minimalist style" \
  --ref-img reference_style.png \
  --api-key sk-xxx

Note

System Requirements for PPT2Polish:

  • LibreOffice: sudo apt-get install libreoffice (Ubuntu/Debian)
  • pdf2image: pip install pdf2image
  • poppler-utils: sudo apt-get install poppler-utils

Common Options

All CLI scripts support these common options:

  • --api-url URL - LLM API URL (default: from DF_API_URL env var)
  • --api-key KEY - API key (default: from DF_API_KEY env var)
  • --model NAME - Text model name (default: varies by script)
  • --output-dir DIR - Custom output directory (default: outputs/cli/{script_name}/{timestamp})
  • --help - Show detailed help message

For complete parameter documentation, run any script with --help:

python script/run_paper2figure_cli.py --help

πŸ“‚ Project Structure

Paper2Any/
β”œβ”€β”€ dataflow_agent/          # Core codebase
β”‚   β”œβ”€β”€ agentroles/         # Agent definitions
β”‚   β”‚   └── paper2any_agents/ # Paper2Any-specific agents
β”‚   β”œβ”€β”€ workflow/           # Workflow definitions
β”‚   β”œβ”€β”€ promptstemplates/   # Prompt templates
β”‚   └── toolkits/           # Toolkits (drawing, PPT generation, etc.)
β”œβ”€β”€ fastapi_app/            # Backend API service
β”œβ”€β”€ frontend-workflow/      # Frontend web interface
β”œβ”€β”€ static/                 # Static assets
β”œβ”€β”€ script/                 # Script tools
└── tests/                  # Test cases

πŸ—ΊοΈ Roadmap

  β€’ πŸ“Š Paper2Figure (Editable Scientific Figures) - 85% complete
  β€’ 🧩 Paper2Diagram (Drawio Diagrams) - 80% complete
  β€’ 🎬 Paper2PPT (Editable Slide Decks) - 70% complete
  β€’ πŸ–ΌοΈ PDF2PPT (Layout-Preserving Conversion) - 90% complete
  β€’ πŸ–ΌοΈ Image2PPT (Image to Slides) - 85% complete
  β€’ 🎨 PPTPolish (Smart Beautification) - 60% complete, style work in progress
  β€’ πŸ“š Knowledge Base (KB Workflows) - 75% complete
  β€’ 🎬 Paper2Video (Video Script Generation) - 40% complete, in progress

🀝 Contributing

We welcome all forms of contribution!

Issues Discussions PR


πŸ“„ License

This project is licensed under Apache License 2.0.


If this project helps you, please give us a ⭐️ Star!



DataFlow-Agent WeChat Community
Scan to join the community WeChat group

Made with ❀️ by the OpenDCAI Team

About

Turn paper/text/topic into editable research figures, technical route diagrams, and presentation slides.
