🎬 Ultimate Image & Video Prompt Generator

Stop staring at blank text boxes.

13 guided categories. 7,000+ curated suggestions. Three AI models.
A prompt builder that turns guesswork into a creative process — on the web or in your terminal.

Neon UI · Particle effects · Flash Site Era energy · Zero blank-box anxiety

🚀 Try the Live Demo · Report Bug · Request Feature

Theatrical Intro

Neon loading screen with animated progress bar and particle effects

Model Selection

Choose between Nano Banana, DALL-E 3, and Kling video models

Guided Wizard

Step-by-step category wizard with curated suggestions per field

13 guided categories · 3 AI models · 7,000+ curated prompts · 429 tests / 2,200 assertions · 240 extracted patterns · 0 any types

Quick Start

# Web app (no API key needed — free tier included)
git clone https://github.com/DareDev256/Ultimate-Image-Video-Prompt-Generator.git
cd Ultimate-Image-Video-Prompt-Generator/web && npm install && npm run dev

# CLI tool (run from the directory you cloned into, not from inside web/)
cd Ultimate-Image-Video-Prompt-Generator && bun install && bun run index.ts

Open localhost:3000 and start generating — the free tier gives you 10 Nano Banana generations per day, no key required.

How It Works

  ╭──────────────╮        ╭──────────────────╮        ╭───────────────────╮
  │  ① CHOOSE    │        │  ② BUILD         │        │  ③ GENERATE       │
  │              │        │                  │        │                   │
  │  Nano Banana │  ───▸  │  13 categories   │  ───▸  │  One click →      │
  │  DALL-E 3    │        │  with curated    │        │  assembled prompt │
  │  Kling video │        │  suggestions     │        │  hits the API     │
  ╰──────────────╯        ╰──────────────────╯        ╰───────────────────╯

💡 Quick Mode — Describe your idea in one sentence and let AI expand it into a full 13-category prompt automatically. Skip the wizard entirely.

Why This Exists

Most AI image tools give you a blank text box and wish you luck.

This project replaces the blank box with guided prompt engineering — 13 categories (subject, camera, lighting, atmosphere…) with curated suggestions per field, assembled into the exact format your chosen model expects. The result: prompts that are 10× more detailed than freehand, produced in a fraction of the time.

What Makes This Different

Structured enough to guide you. Flexible enough to not constrain you.

Diversity-aware randomization — a sliding-window exclusion algorithm (not naive Math.random()) ensures the "randomize" button always surfaces fresh suggestions. Algorithm deep-dive →
Model-aware output — the same wizard produces structured JSON for Gemini or natural language for DALL-E/Kling, automatically adapting to what each model expects
Composable architecture — 13 pure section generators composed via flatMap. Adding a new prompt category is one function + one array entry, zero touch points elsewhere
Actually tested — 429 tests prove invariants like "NL and JSON generators stay in sync on the same input" and "the randomizer never deadlocks regardless of pool size"

Two Platforms, One Pipeline

🌐 Web App — Visual Wizard

A step-by-step guided experience with theatrical animations, particle effects, and a neon UI. Choose your model, walk through 9-11 categories, preview your assembled prompt, and generate — all in the browser.

⌨️ CLI Tool — Terminal Power

An interactive terminal interface for rapid prompt building with presets, templates, favorites, and image-to-prompt reverse engineering. Built on Bun with @clack/prompts.

bun run index.ts                          # Interactive wizard
bun run index.ts --analyze photo.png      # Reverse-engineer a prompt from an image
bun run index.ts --template "Subway Flash" # Start from a built-in template
bun run index.ts --preset fashion         # Use a category pack
bun run index.ts --favorites list         # Manage favorite suggestions

Features

🎯 Guided Prompt Building

13 deep categories with 3-7 fields each — Subject, Camera, Fashion, Environment, Lighting, Atmosphere, Composition, Color, Film, Technical, Vibes, and more
Curated suggestions per field (8-10 hyper-specific options like "six thick rope braids radiating outward from skull")
Quick Mode — describe your idea in plain English, AI expands to full structured prompt
Diversity-aware Randomize — sliding-window exclusion algorithm tracks recent picks per field so consecutive clicks always surface fresh suggestions (see algorithm detail)
Keyboard navigation with smart focus detection

🤖 Multi-Model Generation

Model	Type	Prompt Format	Free Tier	Notes
Nano Banana (Gemini)	Image	Structured JSON	✅ 10/day	Instant response
DALL-E 3 (OpenAI)	Image	Natural language	BYOK only	Returns AI-revised prompt alongside your image
Kling	Video (5s/10s)	Natural language	BYOK only	Async polling — generates in up to 5 min

Free Tier — Try Nano Banana without an API key (10 generations/day, server-side Gemini)
BYOK — Bring Your Own Keys for unlimited use (keys stored in localStorage only)
Kling video generation uses a two-phase async pattern: POST to create task → poll every 5s until complete. Supports 16:9, 9:16, and 1:1 aspect ratios
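
The two-phase pattern described above (create a task, then poll until it settles) can be sketched roughly as follows. This is not the project's route handler: the `TaskStatus` shape, `generateVideo`, and the injected `createTask` / `checkTask` callbacks are hypothetical names chosen for illustration, and the real Kling response format is not shown in this README. Only the 5-second interval and the 5-minute ceiling come from the text.

```typescript
// Hypothetical task shape; the real Kling response format is not documented here.
type TaskStatus =
  | { state: "pending" }
  | { state: "done"; videoUrl: string }
  | { state: "failed"; reason: string };

interface PollOptions {
  intervalMs: number; // the README states 5 s between polls
  timeoutMs: number; // the README states generation takes up to 5 min
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Phase 1: createTask() submits the prompt and resolves to a task id.
// Phase 2: checkTask(id) is called repeatedly until the task settles or time runs out.
async function generateVideo(
  createTask: () => Promise<string>,
  checkTask: (taskId: string) => Promise<TaskStatus>,
  { intervalMs, timeoutMs, sleep = defaultSleep }: PollOptions,
): Promise<string> {
  const taskId = await createTask();
  const maxAttempts = Math.ceil(timeoutMs / intervalMs);

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await checkTask(taskId);
    if (status.state === "done") return status.videoUrl;
    if (status.state === "failed") {
      throw new Error(`Task failed: ${status.reason}`);
    }
    await sleep(intervalMs);
  }
  throw new Error(`Task ${taskId} did not finish within ${timeoutMs} ms`);
}
```

Passing the two network calls in as functions keeps the polling loop independent of any particular HTTP client, which also makes the timeout path easy to test.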

💡 Inspiration Gallery & Showcase

1,180+ curated image prompts from the community + 5,600+ Nano Banana prompts
50+ video prompts for Veo3/Kling/Hailuo
Search and filter by tags (fashion, portrait, 3D, anime, etc.)
"Use as Template" to pre-fill the wizard from any community prompt
Pattern library — 240 extracted patterns across lighting, cameras, moods, color grades, and styles
Showcase — 30 editorially curated examples with full 13-category prompt breakdowns and image carousels

🎨 Generation Flow

Live preview of assembled prompt (JSON or natural language depending on model)
Animated generation progress with theatrical transitions
Love It / Tweak It / Remix result actions
Gallery to save and revisit creations
30 pre-generated Showcase examples + 113 AI-generated community examples

💰 API Key Pricing

Model	Provider	Cost per Image	Get a Key
Nano Banana	Google Gemini	~$0.03	ai.google.dev
DALL-E 3	OpenAI	~$0.04–0.12	platform.openai.com
Kling	Kling AI	Varies	klingai.com

Tech Stack

Layer	Web App	CLI
Framework	Next.js 16 (App Router)	Bun runtime
Language	TypeScript 5	TypeScript 5
Styling	Tailwind CSS v4	picocolors
Animations	Framer Motion 12	—
UI	React 19 + Lucide icons	@clack/prompts
State	React Context + localStorage	File-based (JsonStore)
APIs	Gemini, OpenAI, Kling	Gemini Vision (analyzer)

Getting Started

Prerequisites

Node.js 20.9+ (web; Next.js 16 does not support Node.js 18) or Bun (CLI + web)
API keys are optional — the free tier works out of the box:
- Google AI Studio — Gemini / Nano Banana
- OpenAI Platform — DALL-E 3
- Kling AI — Video

Web App

git clone https://github.com/DareDev256/Ultimate-Image-Video-Prompt-Generator.git
cd Ultimate-Image-Video-Prompt-Generator/web
npm install
npm run dev

Open http://localhost:3000. Configure API keys at /settings or just try the free tier.

CLI Tool

cd Ultimate-Image-Video-Prompt-Generator
bun install
bun run index.ts

Environment Variables

Variable	Required	Scope	Description
`GEMINI_API_KEY`	No	Server	Enables the free tier (10 generations/day) for users without their own keys

User-provided API keys (Gemini, OpenAI, Kling) are entered in the browser at /settings and stored in localStorage only — they never touch the server.

Deploy Your Own

# Build and deploy manually with the Vercel CLI
cd web && npm run build && npx vercel --prod

Add GEMINI_API_KEY as an environment variable in Vercel to enable the free tier for your users.

Architecture

Composable Section Pipeline

The prompt generator uses a functional pipeline where each prompt section is an independent pure function:

flowchart LR
    A[ImagePrompt] --> B["13 Section Functions<br/>(subject, hair, clothing,<br/>camera, environment...)"]
    B --> C[flatMap]
    C --> D{Output Format}
    D -->|Nano Banana| E[Structured JSON]
    D -->|DALL-E / Kling| F[Natural Language]

Sections can be composed, reordered, or extended without touching other sections. The natural language generator is just 18 lines — a flatMap over the section array.
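
To make the pipeline shape concrete, here is a minimal sketch with three sections instead of thirteen. The `ImagePrompt` fields, the `SectionGenerator` type, and the three section functions are simplified stand-ins invented for this example; the project's real type tree and generators are richer.

```typescript
// Simplified stand-in for the project's ImagePrompt type (illustration only).
interface ImagePrompt {
  subject?: { description?: string };
  camera?: { angle?: string; lens?: string };
  lighting?: { mood?: string };
}

// A section generator is a pure function: prompt in, zero or more phrases out.
type SectionGenerator = (prompt: ImagePrompt) => string[];

const subjectSection: SectionGenerator = (p) =>
  p.subject?.description ? [p.subject.description] : [];

const cameraSection: SectionGenerator = (p) =>
  [
    p.camera?.angle && `${p.camera.angle} angle`,
    p.camera?.lens && `shot on a ${p.camera.lens} lens`,
  ].filter((s): s is string => Boolean(s));

const lightingSection: SectionGenerator = (p) =>
  p.lighting?.mood ? [`${p.lighting.mood} lighting`] : [];

// Adding a category means writing one function and appending it to this array.
const sections: SectionGenerator[] = [subjectSection, cameraSection, lightingSection];

// Sections with nothing to say return [] and disappear under flatMap,
// so the assembler needs no per-section null checks.
function toNaturalLanguage(prompt: ImagePrompt): string {
  return sections.flatMap((section) => section(prompt)).join(", ");
}
```

Because every section returns an array, reordering the output is a matter of reordering `sections`, and no section needs to know about any other.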

Web App Page Flow

flowchart LR
    A["/ <br/>Theatrical Intro"] --> B["/create<br/>Model Selection"]
    B --> C["/create/[model]<br/>Wizard Steps"]
    C --> D["/preview<br/>Assembled Prompt"]
    C --> Q["/quick<br/>Freeform Mode"]
    D --> E["/generate<br/>API Call + Progress"]
    E --> F["/result<br/>Love It / Tweak It / Remix"]

State persists across all page transitions via React Context + localStorage sync, surviving Framer Motion route animations.

Diversity-Aware Randomization

Most "randomize" buttons use naive Math.random() — click three times, get the same suggestion twice. This project uses a sliding-window exclusion algorithm inspired by shuffle-play systems:

Click 1: pool=[A,B,C,D,E] recent=[]        → picks C → recent=[C]
Click 2: pool=[A,B,D,E]   recent=[C]       → picks A → recent=[C,A]
Click 3: pool=[B,D,E]     recent=[C,A]     → picks E → recent=[C,A,E]
Click 4: pool=[B,D]       recent=[C,A,E]   → picks D → recent=[C,A,E,D]
Click 5: pool=[B]         recent=[C,A,E,D] → picks B → recent=[A,E,D,B]  ← window slides
Click 6: pool=[C]         recent=[A,E,D,B] → picks C → recent=[E,D,B,C]  ← C is fresh again

Graceful fallback: when every option is in the recent window (small pools), exclusion is skipped and the full pool is used — the algorithm never deadlocks regardless of pool size vs window size.
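
One way the exclusion step and the fallback could be written is sketched below. The function names and argument order follow the usage examples in this README, but the bodies are an assumption rather than the project's source, as are the optional `random` parameter and the choice to throw on an empty pool.

```typescript
// Sketch only: signatures mirror the README's usage examples; bodies are assumed.
function diversePick<T>(
  options: readonly T[],
  recent: readonly T[],
  random: () => number = Math.random, // assumption: injectable for deterministic tests
): T {
  // Assumption: an empty pool is treated as a caller error.
  if (options.length === 0) throw new Error("diversePick: options must not be empty");

  // Exclude anything still inside the recent window.
  const fresh = options.filter((option) => !recent.includes(option));

  // Graceful fallback: if every option is recent, pick from the full pool.
  const pool = fresh.length > 0 ? fresh : options;
  return pool[Math.floor(random() * pool.length)];
}

function pushRecent<T>(recent: readonly T[], pick: T, maxSize: number): T[] {
  // Append the pick, then keep only the newest maxSize entries.
  // The input array is never mutated.
  return maxSize > 0 ? [...recent, pick].slice(-maxSize) : [];
}
```

With five options and a window of four, at least one option is always outside the window, so the fallback branch only runs for pools no larger than the window.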

API Layers

The API is designed as a progressive stack — each layer wraps the one below it, adding exactly one concern:

diversePick          pure pick, no side effects     ← tests, one-off sampling
    ↓
pickWithHistory      pick + state update in one call ← eliminates temporal coupling
    ↓
createPicker         stateful factory, per-key memory ← scripts, CLI, non-React
    ↓
useDiversePick       React hook (useRef state)        ← components

Usage examples

// ① Pure function — caller manages state
import { diversePick, pushRecent } from "@/lib/diverse-pick";

let recent: string[] = [];
const pick = diversePick(["A", "B", "C", "D", "E"], recent);  // never repeats recent
recent = pushRecent(recent, pick, 4);                          // slide window forward

// ② Combined pick + push — impossible to forget the state update
import { pickWithHistory } from "@/lib/diverse-pick";

let history: string[] = [];
const result = pickWithHistory(["A", "B", "C", "D", "E"], history, 4);
history = result.recent;  // result has the shape { value, recent }, e.g. { value: "C", recent: ["C"] }

// ③ Stateful factory — drop-in for non-React consumers
import { createPicker } from "@/lib/diverse-pick";

const pick = createPicker<string>(4);
pick("lighting.mood", ["golden hour", "overcast", "neon"]);   // per-key memory
pick("camera.angle",  ["low", "eye-level", "bird's eye"]);    // separate history

// ④ React hook — same guarantees, React lifecycle
const pick = useDiversePick<string>(4);
// identical call signature to createPicker — one migration path

Layer	File	Concern
Pure algorithm	`diverse-pick.ts`	`diversePick` · `pushRecent` · `parseFieldKey` — zero deps
Combined	`diverse-pick.ts`	`pickWithHistory` — pick + state update, no temporal coupling
Factory	`diverse-pick.ts`	`createPicker` — per-key state for scripts, tests, CLI
React binding	`useDiversePick.ts`	`useDiversePick` — `useRef` state, same call signature as `createPicker`
Composition	`WizardStep.tsx` / Quick Mode	`buildRandomPrompt` + `flattenPromptToText` — full 13-category assembly

Both the wizard and Quick Mode share the identical algorithm — Quick Mode composes buildRandomPrompt with the hook's picker for zero-duplication prompt assembly across all 13 categories in a single click. Non-React consumers use createPicker for the same guarantees without hooks.
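
The layering can be illustrated with a compact sketch of how a stateful factory might wrap the combined call. The function names match the layers listed above, but the bodies are assumptions and the helpers are re-declared in minimal form so the block stands alone; a positive `maxSize` is assumed.

```typescript
// Minimal stand-ins so the sketch is self-contained (not the project's source).
function diversePick<T>(options: readonly T[], recent: readonly T[]): T {
  const fresh = options.filter((option) => !recent.includes(option));
  const pool = fresh.length > 0 ? fresh : options;
  return pool[Math.floor(Math.random() * pool.length)];
}

// Combined layer: pick and window update happen in one call.
function pickWithHistory<T>(
  options: readonly T[],
  recent: readonly T[],
  maxSize: number, // assumed to be a positive integer
): { value: T; recent: T[] } {
  const value = diversePick(options, recent);
  return { value, recent: [...recent, value].slice(-maxSize) };
}

// Factory layer: one history per key, hidden inside a closure.
function createPicker<T>(maxSize: number) {
  const histories = new Map<string, T[]>();
  return (key: string, options: readonly T[]): T => {
    const result = pickWithHistory(options, histories.get(key) ?? [], maxSize);
    histories.set(key, result.recent);
    return result.value;
  };
}
```

A React binding with the same call signature could hold the same `Map` in a ref instead of a closure, which is presumably why the two layers are interchangeable for callers.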

Proven properties (71 tests)

Property	What the tests prove
Exclusion correctness	When alternatives exist, items in the recent window are never picked
Graceful degradation	When every option is recent (or recent is a superset), the algorithm falls back to the full pool — never deadlocks
Type generality	Works with strings, numbers, and frozen `readonly` arrays without mutation
Statistical diversity	Over 100 picks from 5 options, at least 3 distinct values appear (probabilistic smoke test)
Window sliding	`pushRecent` trims to `maxSize`, preserves immutability, handles edge windows (size=1, size=100)
Integration	Sliding window + picker composed together: no 3 consecutive identical picks over 12-round sequences
Key derivation	`buildRandomPrompt` derives output keys from field key prefixes (not category IDs), falling back to `category.id` only for empty categories
Round-trip fidelity	`buildRandomPrompt → flattenPromptToText` preserves all non-empty values through the pipeline, including unicode

Input Validation & Sanitization

All three API routes (/api/generate/nano-banana, openai, kling) share a centralized validation layer (web/src/lib/validation.ts):

Prompt length limit (10,000 chars) to prevent oversized payloads
Control character stripping (null bytes, C0/C1 range)
API key format validation (alphanumeric + limited special chars, max 256 chars)
Upstream error sanitization — third-party error details are never leaked to the client
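
As an illustration of the first three rules, a validation helper might look like the sketch below. The two length limits come from the list above; everything else is assumed, including the function names, the decision to keep tab and newline characters, and the exact set of special characters allowed in a key.

```typescript
const MAX_PROMPT_LENGTH = 10_000; // limit stated in this README
const MAX_KEY_LENGTH = 256; // limit stated in this README

// C0 controls (U+0000 to U+001F), DEL (U+007F), and C1 controls (U+0080 to U+009F).
// Tab, line feed, and carriage return are kept; that choice is an assumption.
const CONTROL_CHARS = /[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F-\u009F]/g;

type ValidationResult =
  | { ok: true; value: string }
  | { ok: false; error: string };

function validatePrompt(input: unknown): ValidationResult {
  if (typeof input !== "string") {
    return { ok: false, error: "Prompt must be a string" };
  }
  const cleaned = input.replace(CONTROL_CHARS, "").trim();
  if (cleaned.length === 0) {
    return { ok: false, error: "Prompt is empty" };
  }
  if (cleaned.length > MAX_PROMPT_LENGTH) {
    return { ok: false, error: "Prompt is too long" };
  }
  return { ok: true, value: cleaned };
}

// The allowed special characters (dot, underscore, hyphen) are a guess at
// what "limited special chars" means in the project.
function isValidApiKey(key: unknown): key is string {
  return (
    typeof key === "string" &&
    key.length <= MAX_KEY_LENGTH &&
    /^[A-Za-z0-9._-]+$/.test(key)
  );
}
```

Returning a tagged result instead of throwing lets each route map a failed validation to its own HTTP status without a try/catch.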

Inline Documentation

All types, section generators, and output formatters are documented with TSDoc — including @example blocks, {@link} cross-references, and field-level descriptions for every property in the ImagePrompt type tree.

📁 Project Structure

.
├── .todoignore                   # Excludes node_modules etc. from code-debt scanning
├── web/                          # Next.js web app
│   ├── src/
│   │   ├── app/
│   │   │   ├── page.tsx          # Theatrical intro screen
│   │   │   ├── create/           # Model selection → wizard → preview → generate → result
│   │   │   ├── gallery/          # Saved creations
│   │   │   ├── showcase/         # 30 pre-generated examples
│   │   │   ├── settings/         # API key configuration
│   │   │   └── api/generate/     # API routes (nano-banana, openai, kling)
│   │   ├── components/
│   │   │   ├── wizard/           # WizardStep, WizardProgress
│   │   │   ├── effects/          # Canvas particle system
│   │   │   └── inspiration/      # Gallery panel, search, filters, cards
│   │   ├── context/              # WizardContext (state + persistence), SoundContext
│   │   ├── hooks/                # useLocalStorage, useDiversePick, useFavorites, useFreeTier, usePatterns, useInspirationData
│   │   └── lib/                  # Categories, diverse-pick, validation, sounds
│   └── public/data/              # Prompt library, patterns, showcase metadata
│
├── src/                          # CLI tool (Bun)
│   ├── index.ts                  # Entry point with arg parsing
│   ├── analyzer/                 # Gemini Vision — reverse-engineer prompts from images
│   ├── cli/                      # Terminal UI (prompts, display, menus, args)
│   ├── core/                     # 13 categories (1,019 lines), preset packs, templates
│   ├── generators/               # Composable section pipeline
│   │   ├── sections.ts           # 13 pure section generators
│   │   ├── natural.ts            # Natural language assembly (flatMap pipeline)
│   │   └── json.ts               # JSON output with recursive cleanup
│   ├── lib/                      # Shared utilities
│   │   ├── json-store.ts         # Generic typed JSON file storage
│   │   └── nested.ts             # Dot-notation path traversal utilities
│   ├── storage/                  # Config, presets, favorites (via JsonStore)
│   └── types/                    # ImagePrompt interface (12 nested types)
│
├── scripts/                      # Data pipeline (fetch, extract, translate, generate)
└── docs/                         # Design documents and plans

Testing

bun test

429 tests across 17 test files:

Module	Tests	Coverage
Section generators	56	All 13 pure functions — edge cases, dedup, fallback precedence
Template→pipeline integration	36	Every template through NL+JSON generators, merge behavior, data integrity
Diversity-aware randomization	71	`diversePick` exclusion, full-pool fallback, superset recent, statistical diversity proof, probabilistic fairness, pigeonhole coverage, reference equality semantics, duplicate filtering, large pool (1000 opts), `pushRecent` sliding window + immutability + boundary cases, `pickWithHistory` combined pick+push (5 tests), `createPicker` per-key isolation + sequential non-repeat (4 tests), `parseFieldKey` parsing (3 tests), `buildRandomPrompt` key derivation + merge behavior, empty-suggestions contract, field iteration order, `flattenPromptToText` empty/unicode/whitespace-only/insertion-order, window eviction proof, per-field history isolation, build→flatten round-trip, full-cycle multi-category simulation
Cross-cutting invariants	24	NL/JSON consistency, cleanObject edge cases, pipeline purity, parseArgs boundaries
CLI argument parser	22	All 15 flags, shorthands, pack splitting, subcommands
Input validation & sanitization	22	Prompt length/type/control-char stripping, API key format/injection defense
Pack/template registry	18	Composition, dedup, always-core invariant, uniqueness
Category data integrity	16	Unique names/emojis, field keys, suggestion validity
Gemini analyzer	14	Error paths, MIME detection, markdown stripping
Nested path utilities	13	Dot-notation get/set, missing paths, intermediate creation
Display text wrapping	11	Word boundaries, unicode, edge cases
JsonStore persistence	11	File I/O, defaults, deep-clone isolation, roundtripping
Prompt building	—	Ordering, nesting cleanup, unicode, JSON/NL consistency
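
The nested path utilities in the table (dot-notation get/set, missing paths, intermediate creation) describe behavior that a short sketch can pin down. The names `getPath` and `setPath` are hypothetical; the project's actual exports in `nested.ts` are not shown in this README.

```typescript
type Nested = Record<string, unknown>;

const isRecord = (value: unknown): value is Nested =>
  typeof value === "object" && value !== null && !Array.isArray(value);

// Reads paths such as "camera.lens"; returns undefined when any segment is missing.
function getPath(root: Nested, path: string): unknown {
  let current: unknown = root;
  for (const key of path.split(".")) {
    if (!isRecord(current)) return undefined;
    current = current[key];
  }
  return current;
}

// Writes a value, creating intermediate objects as needed. Mutates `root`.
// A production version should also reject keys such as "__proto__".
function setPath(root: Nested, path: string, value: unknown): void {
  const keys = path.split(".");
  const last = keys.pop();
  if (last === undefined) return;

  let current = root;
  for (const key of keys) {
    const next = current[key];
    if (isRecord(next)) {
      current = next;
    } else {
      // Missing or non-object intermediate: replace it with a fresh object.
      const created: Nested = {};
      current[key] = created;
      current = created;
    }
  }
  current[last] = value;
}
```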

Engineering Highlights

The stuff under the neon paint.

Area	What	Why it matters
Zero `any` types	Entire codebase uses `unknown` at serialization boundaries with type narrowing	Catches bugs at compile time that `any` would silently pass through — especially in the JSON serializer where nested data arrives as `unknown`
Composable pipeline	13 section generators are pure functions composed via `flatMap`	Adding a new prompt section is one function + one array entry — no touch points in existing code
Diversity-aware randomization	Sliding-window exclusion algorithm (detail) shared between wizard, Quick Mode, and headless consumers	`diversePick` → `pickWithHistory` → `createPicker` layered API: pure function, combined pick+push, and stateful factory. 71 tests prove exclusion, graceful degradation, probabilistic fairness, pigeonhole coverage, per-key isolation, merge correctness, and round-trip fidelity (proven properties)
Centralized input validation	Shared `validation.ts` with prompt sanitization, key format checks, and length limits	One place to audit, one place to fix — not scattered across 3 API routes
Single-source model registry	`MODEL_NAMES`, `MODEL_COLORS`, `isValidModel` in `lib/models.ts` + `useCopyToClipboard` hook	Adding a model or changing brand colors is a 1-file change — replaces 4× duplicated metadata maps and 3× clipboard boilerplate
Temporal coupling elimination	`pickWithHistory` combines pick + state update in one call; `createPicker` wraps it in a stateful factory	Callers of these layers cannot forget the state update, because they never have to pair `diversePick` with a separate `pushRecent` call themselves
Data-driven preset parsing	Replaced 5-branch `else if` chain with a `PRESET_FLAGS` lookup map	Adding a new preset is a one-line map entry instead of a new branch
429 tests / 2,200 assertions	Every generator, every template, every CLI flag, cross-format consistency checks, web-side validation & diversity logic	Not just coverage — tests document invariants like "NL and JSON generators stay in sync on the same input"
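
The data-driven preset parsing row describes a general refactor that a small sketch can show. Only the name `PRESET_FLAGS` comes from the table; the flag strings, pack names, and the `parsePresets` helper below are hypothetical, since the CLI's real flags are not listed in this README.

```typescript
// Hypothetical flags and pack names, for illustration only.
const PRESET_FLAGS = new Map<string, string>([
  ["--fashion", "fashion"],
  ["--street", "street"],
  ["--portrait", "portrait"],
]);

// Returns the preset packs requested by the given argv tokens.
// Unknown tokens are ignored here; a real parser would handle them elsewhere.
function parsePresets(argv: readonly string[]): string[] {
  return argv.flatMap((arg) => {
    const preset = PRESET_FLAGS.get(arg);
    return preset === undefined ? [] : [preset];
  });
}
```

With the lookup in a map, supporting another preset is one new entry, and the parsing function itself never changes.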

Design Philosophy

Remember when websites had loading screens, particle effects, and sound design? This is that energy — with modern engineering underneath.

This project embraces the Flash Site Era aesthetic (2002–2006) — when websites were experiences, not just pages:

Theatrical Loading — Animated intro with progress bar and skip option
Glossy Everything — Buttons with gradients, shadows, and glow effects
Particle Systems — Canvas-based floating particles with GPU acceleration
Sound Design — 5 named sounds (click, whoosh, hover, success, processing) — off by default, opt-in via settings, preference persisted in localStorage
Over-the-top Transitions — Page slides, scale animations, staggered reveals
Neon Palette — Cyan #00d4ff, pink #ff00aa, green #00ff88, gold #ffd700
Typography — Orbitron (headings) + Exo 2 (body)

🧩 Challenges & Solutions

Challenge	Solution
State loss during page transitions	Framer Motion unmounts components on route change. Solved with React Context + localStorage sync to persist wizard state across animated transitions.
Keyboard nav vs. text input	Arrow keys conflicted with suggestion field typing. Implemented focus detection to disable shortcuts during input, re-enable on blur.
Canvas particle performance	Frame drops on lower-end devices. Reduced particle count, added `requestAnimationFrame` throttling and `will-change` GPU hints.
Multi-model prompt formats	Each AI model expects different formats. Built a unified generation interface with model-specific adapters (JSON for Nano Banana, natural language for DALL-E/Kling).
Randomize repeats same values	Naive `Math.random()` frequently repeats the same suggestion. Implemented a sliding-window exclusion algorithm (`diversePick`) that tracks recent picks per field and excludes them from the candidate pool — with graceful fallback when the pool is smaller than the window.
Type safety at serialization boundaries	`cleanObject` recursively processes prompt data of unknown shape. Replaced `any` with `unknown` + type narrowing to catch bugs at compile time instead of runtime.
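
The last row is easier to picture with code. Below is a sketch of a recursive cleanup that accepts `unknown` and narrows before touching anything. The name `cleanObject` comes from the table, but the body is an assumption, including the decision to trim strings and to treat `null`, empty strings, empty arrays, and empty objects as removable.

```typescript
function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns a cleaned copy, or undefined when nothing meaningful is left.
function cleanObject(value: unknown): unknown {
  if (value === null) return undefined;

  if (typeof value === "string") {
    const trimmed = value.trim();
    return trimmed === "" ? undefined : trimmed;
  }

  if (Array.isArray(value)) {
    const items = value
      .map((item) => cleanObject(item))
      .filter((item) => item !== undefined);
    return items.length > 0 ? items : undefined;
  }

  if (isPlainObject(value)) {
    const entries = Object.entries(value)
      .map(([key, child]) => [key, cleanObject(child)] as const)
      .filter(([, child]) => child !== undefined);
    return entries.length > 0 ? Object.fromEntries(entries) : undefined;
  }

  // Numbers and booleans pass through unchanged; undefined stays undefined.
  return value;
}
```

With `any`, a call such as `value.trim()` on a number would compile and fail at runtime; with `unknown`, the compiler refuses it until the `typeof` check has narrowed the type.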

Privacy & Security

API keys stored in localStorage only — never sent to any server
No server-side storage of prompts or images
Direct API calls from your browser (except free tier)
Server-side input validation: prompt length limits, control character stripping, API key format checks
Rate limiting on free tier (10 generations/day per client)
HTTP security headers on all routes: HSTS, X-Frame-Options (DENY), X-Content-Type-Options, Referrer-Policy, Permissions-Policy
Upstream API errors are sanitized — third-party error details are never leaked to the client
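
For the free-tier limit, a per-client daily counter is one straightforward approach. The sketch below is an assumption from top to bottom apart from the limit of 10: this README does not say how a client is identified or where counts are stored, and `DailyRateLimiter` is a hypothetical name.

```typescript
const DAILY_LIMIT = 10; // free-tier limit stated in this README

// In-memory counter keyed by client id and UTC date.
// Old keys are never evicted here; a real deployment would need cleanup
// or an external store with expiry.
class DailyRateLimiter {
  private readonly counts = new Map<string, number>();
  private readonly limit: number;

  constructor(limit: number = DAILY_LIMIT) {
    this.limit = limit;
  }

  // Returns true and records the request if the client is under today's limit.
  tryConsume(clientId: string, now: Date = new Date()): boolean {
    const day = now.toISOString().slice(0, 10); // UTC date, e.g. "2025-01-31"
    const key = `${clientId}:${day}`;
    const used = this.counts.get(key) ?? 0;
    if (used >= this.limit) return false;
    this.counts.set(key, used + 1);
    return true;
  }
}
```

An in-memory map does not survive restarts and is not shared between serverless instances, so this only shows the counting logic, not a deployable limiter.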

Data Pipeline

The scripts/ directory contains a full pipeline for refreshing and expanding the prompt library:

Script	What it does
`fetch-prompts.ts`	Fetches image + video prompts from upstream repos
`fetch-nano-banana-prompts.ts`	Fetches 5,600+ prompts from YouMind-OpenLab
`extract-patterns.ts`	Mines the corpus to build 240 patterns (lighting, cameras, moods, etc.)
`translate-titles.ts`	Extracts English titles from bilingual EN/ZH content
`batch-generate-inspiration.py`	Batch-generates preview images via Gemini 3 Pro
`update-prompts-with-generated.py`	Writes generated image paths back into prompt JSON

fetch → translate → extract patterns → generate images → update JSON

Community Prompts Attribution

The Inspiration Gallery includes curated prompts from:

@songguoxs: gpt4o-image-prompts (1,180+ image prompts) · awesome-video-prompts (50+ video prompts)
@YouMind-OpenLab: awesome-nano-banana-pro-prompts (5,600+ Nano Banana prompts)

Documentation

API Reference — Generation endpoints, CLI commands, validation rules, and error codes
Diversity-Aware Randomization — Algorithm deep-dive, 4-layer architecture, API reference (diversePick → pickWithHistory → createPicker), complexity analysis, usage examples, design decisions, and 15 proven invariants (71 tests)
Contributing Guide — Code style, component patterns, and PR process

Contributing

See CONTRIBUTING.md for guidelines. Bug reports, feature suggestions, and pull requests are welcome.

License

MIT

Built with caffeine and nostalgia for the early 2000s web.
429 tests. Zero any types. Maximum vibes.

DareDev256
