README.md: 148 additions & 9 deletions
@@ -140,6 +140,9 @@ Create a `repomix.config.json` file in your project root for custom configuratio
 
 ```json
 {
+  "input": {
+    "max_file_size": 52428800
+  },
   "output": {
     "file_path": "repomix-output.md",
     "style": "markdown",
@@ -185,12 +188,16 @@ Create a `repomix.config.json` file in your project root for custom configuratio
     "url": "",
     "branch": ""
   },
-  "include": []
+  "include": [],
+  "token_count": {
+    "encoding": "o200k_base"
+  }
 }
 ```
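
As a rough sketch of how the two keys added in this hunk might be consumed, the Python below parses a fragment matching the diff and checks a file size against `input.max_file_size`. The helper names (`load_config`, `is_too_large`), the reading of the limit as a byte count, and the "missing key means no limit" rule are assumptions made for illustration, not Repomix's implementation. Under the byte-count assumption, 52428800 is 50 × 1024 × 1024.

```python
import json

# Fragment mirroring the keys added in this diff; values copied from the README example.
CONFIG_TEXT = """
{
  "input": {"max_file_size": 52428800},
  "include": [],
  "token_count": {"encoding": "o200k_base"}
}
"""


def load_config(text: str) -> dict:
    """Parse a repomix.config.json-style document (hypothetical helper)."""
    return json.loads(text)


def is_too_large(size_bytes: int, config: dict) -> bool:
    """Return True when a file exceeds input.max_file_size.

    Assumption: the limit is in bytes and a missing key means no limit.
    """
    limit = config.get("input", {}).get("max_file_size")
    return limit is not None and size_bytes > limit


config = load_config(CONFIG_TEXT)
limit_mib = config["input"]["max_file_size"] / (1024 * 1024)
print(f"max_file_size = {limit_mib:.0f} MiB")
print(is_too_large(60 * 1024 * 1024, config))  # a 60 MiB file is over the limit
print(config["token_count"]["encoding"])
```
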
 
 > [!NOTE]
-> *Note on `remove_comments`*: This feature is language-aware, correctly handling comment syntax for various languages like Python, JavaScript, C++, HTML, etc., rather than using a simple generic pattern.*
+> *Note on `remove_comments`*: This feature is language-aware, correctly handling comment syntax for various languages rather than using a simple generic pattern. Supported languages:
@@ -201,6 +208,25 @@ The `remote` section allows you to configure remote repository processing:
 
 When a remote URL is specified in the configuration, Repomix will process the remote repository instead of the local directory. This can be overridden by CLI parameters (`--remote-branch`).
 The instruction content will be included in the "Instruction" section of the output file.
+
+### 4.5 Token Count
+
+Repomix provides token counting to help you understand the size of your codebase in terms of AI model tokens.
+
+#### Choosing an Encoding
+
+Use `--token-count-encoding` to select the tokenizer encoding:
+
+```bash
+# Use GPT-4o encoding (default)
+repomix --token-count-encoding o200k_base
+
+# Use GPT-3.5/4 encoding
+repomix --token-count-encoding cl100k_base
+```
+
+#### Visualizing Token Distribution
+
+Use `--token-count-tree` to display a file tree with token counts for each file:
+
+```bash
+# Show all files with token counts
+repomix --token-count-tree
+
+# Show only files with 100 or more tokens
+repomix --token-count-tree 100
+```
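
To illustrate the threshold behaviour of `--token-count-tree 100` described above, here is a small Python sketch that filters a per-file token table and renders it as an indented tree. The token counts, file paths, and the `render_token_tree` helper are hypothetical, and Repomix's real output format may well differ.

```python
from collections import defaultdict


def render_token_tree(token_counts: dict[str, int], min_tokens: int = 0) -> list[str]:
    """Return indented tree lines for files with at least `min_tokens` tokens."""
    # Keep only files that meet the threshold (0 keeps everything).
    kept = {path: n for path, n in token_counts.items() if n >= min_tokens}

    # Group file names under their parent directory ("" means the project root).
    by_dir: dict[str, list[tuple[str, int]]] = defaultdict(list)
    for path, n in kept.items():
        directory, _, name = path.rpartition("/")
        by_dir[directory].append((name, n))

    lines = []
    for directory in sorted(by_dir):
        files = sorted(by_dir[directory])
        total = sum(n for _, n in files)
        lines.append(f"{directory or '.'}/ ({total} tokens)")
        for name, n in files:
            lines.append(f"  {name} ({n} tokens)")
    return lines


# Hypothetical token counts, for illustration only.
counts = {"src/index.ts": 420, "src/util.ts": 75, "README.md": 1300}
print("\n".join(render_token_tree(counts, min_tokens=100)))
```

With a threshold of 100, `src/util.ts` (75 tokens) is dropped from the listing, and the directory total reflects only the files that remain.
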
+
+### 4.6 Splitting Output for Large Codebases
+
+For large codebases that exceed AI model context limits, you can split the output into multiple files:
+
+```bash
+# Split into files of approximately 500KB each
+repomix --split-output 500kb
+
+# Split into files of approximately 2MB each
+repomix --split-output 2mb
+```
+
+Output files will be numbered sequentially (e.g., `repomix-output.1.md`, `repomix-output.2.md`, etc.). Files are split at directory boundaries to keep related files together.
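
One way to picture "split at directory boundaries" is a greedy packer that keeps each top-level directory in a single part. The sketch below is an assumption about the general approach, not Repomix's algorithm: `parse_size`, `split_by_directory`, the 1024-based units, and the file sizes are all hypothetical.

```python
import re
from collections import defaultdict

UNITS = {"kb": 1024, "mb": 1024 ** 2}  # assumption: binary units


def parse_size(text: str) -> int:
    """Convert strings such as '500kb' or '2mb' to a byte count."""
    match = re.fullmatch(r"(\d+)(kb|mb)", text.strip().lower())
    if match is None:
        raise ValueError(f"unsupported size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def split_by_directory(file_sizes: dict[str, int], max_bytes: int) -> list[list[str]]:
    """Greedily pack top-level directories into parts of roughly max_bytes."""
    # Group paths by their first path component; root-level files share ".".
    groups: dict[str, list[str]] = defaultdict(list)
    for path in sorted(file_sizes):
        top = path.split("/", 1)[0] if "/" in path else "."
        groups[top].append(path)

    parts: list[list[str]] = []
    current: list[str] = []
    current_size = 0
    for top in sorted(groups):
        group_size = sum(file_sizes[p] for p in groups[top])
        # Start a new part when adding this directory would overflow the current one.
        if current and current_size + group_size > max_bytes:
            parts.append(current)
            current, current_size = [], 0
        current.extend(groups[top])
        current_size += group_size
    if current:
        parts.append(current)
    return parts


# Hypothetical file sizes in bytes.
sizes = {"src/a.ts": 300_000, "src/b.ts": 150_000, "docs/guide.md": 200_000, "README.md": 20_000}
for i, part in enumerate(split_by_directory(sizes, parse_size("500kb")), start=1):
    print(f"repomix-output.{i}.md: {part}")
```

In this sketch a directory is never divided, so a single directory larger than the limit produces an oversized part, which is consistent with the "approximately" wording in the comments above.
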
+
+### 4.7 Agent Skills Generation
+
+Repomix can generate [Claude Agent Skills](https://docs.anthropic.com/en/docs/agents-and-tools/claude-code/skills) format output, which provides structured reference materials for AI coding agents.
+
+```bash
+# Generate skills with auto-detected name
+repomix --skill-generate
+
+# Generate skills with a custom name
+repomix --skill-generate my-project
+
+# Specify output directory directly
+repomix --skill-output ./my-skills-dir
+```
+
+This creates the following directory structure:
+
+```
+.claude/skills/<name>/
+├── SKILL.md                    # Entry point with usage guide
+└── references/
+    ├── summary.md              # Purpose, format, and statistics
+    ├── project-structure.md    # Directory tree with line counts
 Repomix provides advanced code compression capabilities to reduce output size while preserving essential information. This feature is particularly useful when working with large codebases or when you need to focus on specific aspects of your code.
 
-#### 4.4.1 Compression Modes
+#### 4.8.1 Compression Modes
 
 **Interface Mode** (`keep_interfaces: true`)
 - Preserves function and class signatures with their complete type annotations