fix (filesystem): Add UTF-8 encoding and Unicode normalization for umlauts and symlinks #3238
…lauts and symlinks
- Add UTF-8 encoding initialization for Windows STDIO
- Implement Unicode NFC normalization in path-utils
- Improve symlink handling with proper encoding support
- Add better error messages for encoding-related issues

Fixes issues with German umlauts (ä, ö, ü, ß) and symlinks
Confirming this fix also resolves a related issue with Japanese directory names on macOS.
Reproduction
Root cause (matches this PR)
macOS stores filenames in NFD, but the caller-supplied path (from config, CLI args, or MCP roots) is typically NFC. The current …
Workaround I'm using until this PR lands
As an interim fix on a production deployment I patched the published …
- const normalizedPath = path.resolve(...);
+ const normalizedPath = path.resolve(...).normalize("NFC");
- const normalizedDir = path.resolve(...);
+ const normalizedDir = path.resolve(...).normalize("NFC");
with an idempotent re-apply on every server start (because …
Why this PR is the right fix
The …
Would be great to see this merged. Happy to test the built package on a macOS + Japanese filenames setup if that helps reviewers.
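To illustrate the workaround above, here is a minimal sketch of NFC-normalizing a resolved path. The helper name `resolveNfc` and the sample paths are hypothetical and are not taken from the PR or the published package; the only calls used are `path.resolve` and `String.prototype.normalize`.

```typescript
import * as path from "node:path";

// Hypothetical helper (not the PR's actual code): resolve a path and return
// it in Unicode NFC form, so composed and decomposed spellings of the same
// name compare equal.
export function resolveNfc(p: string): string {
  return path.resolve(p).normalize("NFC");
}

// "ü" written two ways: one precomposed code point (NFC) versus
// "u" followed by a combining diaeresis (NFD).
const composed: string = "/tmp/test-b\u00FCcher";
const decomposed: string = "/tmp/test-bu\u0308cher";

console.log(composed === decomposed);                         // false
console.log(resolveNfc(composed) === resolveNfc(decomposed)); // true
```

The same comparison applies to Japanese names, where a voiced kana can be stored either as one code point or as a base kana plus a combining mark.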
Description
I found that working with directories/filesystems on macOS did not work when path names contained umlauts (ä, ö, ü, ß). The same was true when using symbolic links which involved such characters.
The root cause was threefold:
This PR fixes these issues by:
The fix benefits all languages with diacritical marks (German, French, Spanish, Nordic, Eastern European, etc.).
Server Details
- index.ts: UTF-8 encoding initialization
- path-utils.ts: Unicode NFC normalization
- lib.ts: Enhanced symlink resolution and error handling
- roots-utils.ts: Root URI parsing improvements
Motivation and Context
Problem
Users with non-ASCII characters in their file paths (common in German, French, Spanish, and many other languages) experienced complete filesystem access failures. Specifically:
- … ? due to STDIO encoding mismatch
- … (decomposed: ü = u + combining diaeresis) while the server compared against NFC (composed: ü = single character)
- … fs.realpath() returning inconsistent Unicode forms
Real-world Impact
This made the filesystem server completely unusable for:
Example Failure
Related to issue #2098 where Windows users reported umlauts converting to ?.
How Has This Been Tested?
Tested extensively with Claude Desktop on macOS with the following scenarios:
Test 1: Directory with umlauts
- /tmp/test-bücher/übung.txt
- list_directory correctly shows übung.txt
- read_file correctly reads content with umlauts
Test 2: Symlink with umlaut in name
- verknüpfung → /target
Test 3: Symlink to directory with umlauts
- link → /übungen
- list_directory through symlink works
- read_file and write_file work through symlink
Test 4: Create file with umlaut in name
- neue-übung.txt with content containing ÄÖÜ äöü ß

All 14 MCP filesystem tools (read, write, list, create, move, search, etc.) now work correctly with umlauts.
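A scripted check along the lines of Test 1 might look like the sketch below. It is an assumption-laden illustration, not the PR's test suite: the function name `umlautRoundTrip` and the temporary directory layout are hypothetical, and it exercises Node's `fs` module directly rather than the MCP tools.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical check, loosely mirroring Test 1: create a directory and a file
// whose names contain umlauts, then find the file by its NFC name.
export function umlautRoundTrip(): string {
  const base = fs.mkdtempSync(path.join(os.tmpdir(), "umlaut-"));
  try {
    const dir = path.join(base, "test-b\u00FCcher");
    fs.mkdirSync(dir);

    // Write the file name in decomposed (NFD) form on purpose.
    fs.writeFileSync(path.join(dir, "u\u0308bung.txt"), "ÄÖÜ äöü ß", "utf8");

    // Compare directory entries in NFC so the stored form does not matter.
    const wanted = "\u00FCbung.txt";
    const entry = fs
      .readdirSync(dir)
      .find((name) => name.normalize("NFC") === wanted);
    if (entry === undefined) {
      throw new Error("file not found after NFC normalization");
    }

    // Read through the name the filesystem actually reported.
    return fs.readFileSync(path.join(dir, entry), "utf8");
  } finally {
    // Clean up the temporary directory even if a step above failed.
    fs.rmSync(base, { recursive: true, force: true });
  }
}

console.log(umlautRoundTrip());
```

Looking the entry up by its normalized name, then reading it by the name the filesystem returned, keeps the check independent of whether the volume stores names composed or decomposed.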
Testing environment:
Breaking Changes
No breaking changes. This is a pure bug fix that:
Users can simply update to the new version without any migration steps.
Types of changes
Checklist
Additional context
Technical Implementation Details
Why NFC instead of NFD?
Why re-normalize after fs.realpath()?
- fs.realpath() resolves symlinks and returns the actual filesystem path
Error Handling Improvements:
- … EILSEQ (illegal byte sequence) and EINVAL (invalid argument) error codes
Performance Impact:
- String.normalize('NFC') adds ~0.01ms per path operation
Broader Language Support:
This fix benefits all languages using diacritical marks:
Files Modified
- src/filesystem/index.ts (9 lines added - UTF-8 init)
- src/filesystem/path-utils.ts (9 lines added - NFC normalization)
- src/filesystem/lib.ts (10 lines modified - enhanced error handling)
- src/filesystem/roots-utils.ts (8 lines modified - encoding fixes)

Total: ~36 lines changed across 4 files - a minimal, focused fix.
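The re-normalization after fs.realpath() and the EILSEQ / EINVAL handling described under Technical Implementation Details could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in lib.ts: the helper names `realpathNfc` and `describeEncodingError` and the error message wording are hypothetical.

```typescript
import * as fs from "node:fs/promises";

// Hypothetical mapping of encoding-related error codes to a clearer message.
// Any other error is passed through unchanged.
export function describeEncodingError(err: unknown, p: string): Error {
  const code =
    typeof err === "object" && err !== null && "code" in err
      ? (err as { code?: unknown }).code
      : undefined;
  if (code === "EILSEQ" || code === "EINVAL") {
    return new Error(
      `Path "${p}" could not be processed (${code}); check its character encoding`
    );
  }
  return err instanceof Error ? err : new Error(String(err));
}

// Hypothetical helper: resolve symlinks, then normalize the result to NFC,
// because the path returned by the filesystem may use a different Unicode
// form than the one the caller supplied.
export async function realpathNfc(p: string): Promise<string> {
  try {
    const resolved = await fs.realpath(p);
    return resolved.normalize("NFC");
  } catch (err) {
    throw describeEncodingError(err, p);
  }
}
```

A caller would then compare `await realpathNfc(requestedPath)` against allowed directories that were normalized the same way, so both sides of the comparison use one Unicode form.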