A project for managing and running UCD (Unicode Character Database) extraction pipelines with a built-in UI for pipeline execution and management.
Before you begin, ensure you have the following installed:
- Node.js >= 24.13 (Download Node.js)
- pnpm >= 10.29.3 (Installation guide)
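If you want to verify the minimum versions programmatically, for example in a setup script, a comparison along the following lines may help. This is only a sketch: `parseVersion` and `meetsMinimum` are hypothetical helpers, not part of this repository, and pre-release suffixes are ignored.

```typescript
// Hypothetical helpers for illustration; they are not provided by this project.

// Turns "v24.13.0" or "10.29.3" into [24, 13, 0] or [10, 29, 3].
function parseVersion(version: string): number[] {
  return version
    .replace(/^v/, "")
    .split(".")
    .map((part) => Number.parseInt(part, 10) || 0);
}

// Returns true when `actual` is greater than or equal to `minimum`.
// Missing components are treated as 0, so "24.13" equals "24.13.0".
function meetsMinimum(actual: string, minimum: string): boolean {
  const a = parseVersion(actual);
  const m = parseVersion(minimum);
  const length = Math.max(a.length, m.length);
  for (let i = 0; i < length; i++) {
    const left = a[i] ?? 0;
    const right = m[i] ?? 0;
    if (left !== right) return left > right;
  }
  return true;
}

const nodeOk = meetsMinimum("v24.13.0", "24.13");
const pnpmOk = meetsMinimum("10.28.0", "10.29.3");
```

In a real script you would pass the output of `node --version` and `pnpm --version` instead of the literal strings used here.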
- Clone the repository:

      git clone https://github.com/ucdjs/ucdjs-pipelines.git
      cd ucdjs-pipelines

- Install dependencies using pnpm:

      pnpm install

.
├── sources/                      # Reusable data sources
│   └── ucd-store.source.ts       # HTTP source for api.ucdjs.dev
├── routes/                       # File processing routes
│   ├── core/                     # Core UCD files
│   │   ├── blocks.route.ts
│   │   └── arabic-shaping.route.ts
│   ├── auxiliary/                # Auxiliary files (break properties)
│   ├── extracted/                # Derived/extracted files
│   └── emoji/                    # Emoji-related files
├── pipelines/                    # Pipeline definitions (*.ucd-pipeline.ts)
│   ├── core/
│   │   ├── blocks.ucd-pipeline.ts
│   │   └── arabic-shaping.ucd-pipeline.ts
│   └── emoji/
└── index.ts                      # Registry - exports all pipelines
View all available pipelines in your project:

    pnpm pipelines:list

Execute pipelines from the command line:

    pnpm pipelines:run

Execute pipelines using the interactive web-based UI:

    pnpm pipelines:run:ui

This opens a user-friendly interface where you can:
- Select and configure pipelines
- Monitor execution progress
- View results and logs
- Create a route in routes/<category>/<file>.route.ts:

      import { definePipelineRoute, byName } from "@ucdjs/pipelines-core";

      export const myRoute = definePipelineRoute({
        id: "my-route",
        filter: byName("MyFile.txt"),
        parser: async function* (ctx) {
          // Parse file content
        },
        resolver: async (ctx, rows) => {
          // Transform to output format
        },
      });

- Create a pipeline in pipelines/<category>/<file>.ucd-pipeline.ts:

      import { definePipeline, byExt } from "@ucdjs/pipelines-core";
      import { createUcdStoreSource } from "../../sources/ucd-store.source";
      import { myRoute } from "../../routes/<category>/my.route";

      export const myPipeline = definePipeline({
        id: "my-pipeline",
        name: "My Pipeline",
        description: "Extracts data from MyFile.txt",
        versions: ["16.0.0", "15.1.0"],
        inputs: [createUcdStoreSource()],
        routes: [myRoute],
        include: byExt(".txt"),
      });

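- Export from registry in index.ts:

      // Import first: a bare `export { ... } from` does not create a local
      // binding, so `myPipeline` could not be used in the array below.
      import { myPipeline } from "./pipelines/<category>/my.ucd-pipeline";

      export { myPipeline };

      // Add to pipelines array
      export const pipelines = [
        // ...existing pipelines
        myPipeline,
      ];
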
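The route's parser above is left as a placeholder. As a rough illustration of the kind of work it might do, the sketch below parses text in the Blocks.txt layout (`0000..007F; Basic Latin`) into rows. It is standalone and makes no claims about the `ctx` object or row types that `@ucdjs/pipelines-core` actually provides: `BlockRow` and `parseBlockLines` are hypothetical names, and the function takes the file content as a plain string.

```typescript
// Hypothetical row shape; the real pipeline types may differ.
interface BlockRow {
  start: number;
  end: number;
  name: string;
}

// Yields one row per range entry, e.g. "0000..007F; Basic Latin".
// Comments (everything after "#") and blank lines are skipped.
function* parseBlockLines(content: string): Generator<BlockRow> {
  for (const rawLine of content.split(/\r?\n/)) {
    const hashIndex = rawLine.indexOf("#");
    const line = (hashIndex === -1 ? rawLine : rawLine.slice(0, hashIndex)).trim();
    if (line === "") continue;

    const match = /^([0-9A-Fa-f]+)\.\.([0-9A-Fa-f]+)\s*;\s*(.+)$/.exec(line);
    if (match === null) continue; // Not a range entry; ignore it.

    const [, startHex = "", endHex = "", name = ""] = match;
    yield {
      start: Number.parseInt(startHex, 16),
      end: Number.parseInt(endHex, 16),
      name: name.trim(),
    };
  }
}

const sample = [
  "# Blocks sample",
  "0000..007F; Basic Latin",
  "",
  "0080..00FF; Latin-1 Supplement # trailing note",
].join("\n");

const blockRows = Array.from(parseBlockLines(sample));
```

Inside a route, an `async function*` parser could delegate to a generator like this with `yield*` once the file content has been read, assuming the context exposes that content in some form.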
Based on the UCD store structure:
- Core (ucd/): UnicodeData.txt, Blocks.txt, Scripts.txt, etc.
- Auxiliary (ucd/auxiliary/): Break properties, test files
- Extracted (ucd/extracted/): Derived properties
- Emoji (ucd/emoji/): Emoji data files
See https://api.ucdjs.dev/.well-known/ucd-store/16.0.0.json for the complete file list.
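To decide which category directory a new route or pipeline belongs in, the prefixes listed above can be checked from most to least specific. The following is a minimal sketch, assuming file paths are given relative to the store root (for example `ucd/auxiliary/...`). `UcdCategory` and `categorize` are hypothetical names and not part of the project's API.

```typescript
// Hypothetical helper for illustration; not part of this repository.
type UcdCategory = "core" | "auxiliary" | "extracted" | "emoji";

// More specific prefixes come first so that "ucd/" only matches
// files that did not fall into one of the subdirectories.
const CATEGORY_PREFIXES: ReadonlyArray<readonly [string, UcdCategory]> = [
  ["ucd/auxiliary/", "auxiliary"],
  ["ucd/extracted/", "extracted"],
  ["ucd/emoji/", "emoji"],
  ["ucd/", "core"],
];

// Returns the category for a store-relative path, or undefined
// when the path is outside the "ucd/" tree.
function categorize(path: string): UcdCategory | undefined {
  for (const [prefix, category] of CATEGORY_PREFIXES) {
    if (path.startsWith(prefix)) return category;
  }
  return undefined;
}

const blocksCategory = categorize("ucd/Blocks.txt");
const emojiCategory = categorize("ucd/emoji/emoji-data.txt");
```

Note that any path under `ucd/` that is not in one of the three named subdirectories is treated as core here, which is a simplification.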