An in-process virtual filesystem with hard quota enforcement for Python.
MemoryFileSystem gives you a fully isolated filesystem-like workspace inside a Python process.
- Hard quota (
MFSQuotaExceededError) to reject oversized writes before OOM - Hierarchical directories and multi-file operations (
import_tree,copy_tree,move) - File-level RW locking + global structure lock for thread-safe operations
- Free-threaded Python compatible (
PYTHON_GIL=0) — stress-tested under 50-thread contention - Async wrapper (
AsyncMemoryFileSystem) powered byasyncio.to_thread - Zero runtime dependencies (standard library only)
This is useful when io.BytesIO is too primitive (single buffer), and OS-level RAM disks/tmpfs are impractical (permissions, container policy, Windows driver friction).
pip install D-MemFSRequirements: Python 3.11+
from dmemfs import MemoryFileSystem, MFSQuotaExceededError
mfs = MemoryFileSystem(max_quota=64 * 1024 * 1024)
mfs.mkdir("/data")
with mfs.open("/data/hello.bin", "wb") as f:
f.write(b"hello")
with mfs.open("/data/hello.bin", "rb") as f:
print(f.read()) # b"hello"
print(mfs.listdir("/data"))
print(mfs.is_file("/data/hello.bin")) # True
try:
with mfs.open("/huge.bin", "wb") as f:
f.write(bytes(512 * 1024 * 1024))
except MFSQuotaExceededError as e:
print(e)open(path, mode, *, preallocate=0, lock_timeout=None)mkdir,remove,rmtree,rename,move,copy,copy_treelistdir,exists,is_dir,is_file,walk,globstat,stats,get_sizeexport_as_bytesio,export_tree,iter_export_tree,import_tree
Constructor parameters:
max_quota(default256 MiB): byte quota for file datamax_nodes(defaultNone): optional cap on total node count (files + directories). RaisesMFSNodeLimitExceededErrorwhen exceeded.default_storage(default"auto"): storage backend for new files —"auto"/"sequential"/"random_access"promotion_hard_limit(defaultNone): byte threshold above which Sequential→RandomAccess auto-promotion is suppressed (Noneuses the built-in 512 MiB limit)chunk_overhead_override(defaultNone): override the per-chunk overhead estimate used for quota accounting
Note: The
BytesIOreturned byexport_as_bytesio()is outside quota management. Exporting large files may consume significant process memory beyond the configured quota limit.
Supported binary modes: rb, wb, ab, r+b, xb
read,write,seek,tell,truncate,flush,close- file-like capability checks:
readable,writable,seekable
flush() is intentionally a no-op (compatibility API for file-like integrations).
size, created_at, modified_at, generation, is_dir
- Supports both files and directories
- For directories:
size=0,generation=0,is_dir=True
D-MemFS natively operates in binary mode. For text I/O, use MFSTextHandle:
from dmemfs import MemoryFileSystem, MFSTextHandle
mfs = MemoryFileSystem()
mfs.mkdir("/data")
# Write text
with mfs.open("/data/hello.bin", "wb") as f:
th = MFSTextHandle(f, encoding="utf-8")
th.write("こんにちは世界\n")
th.write("Hello, World!\n")
# Read text line by line
with mfs.open("/data/hello.bin", "rb") as f:
th = MFSTextHandle(f, encoding="utf-8")
for line in th:
print(line, end="")MFSTextHandle is a thin, bufferless wrapper. It encodes on write() and decodes on read() / readline(). Unlike io.TextIOWrapper, it introduces no buffering issues when used with MemoryFileHandle.
Stage data through raw → processed → output directories:
from dmemfs import MemoryFileSystem
mfs = MemoryFileSystem(max_quota=16 * 1024 * 1024)
mfs.mkdir("/raw")
mfs.mkdir("/processed")
raw_data = b"id,name,value\n1,foo,100\n2,bar,200\n"
with mfs.open("/raw/data.csv", "wb") as f:
f.write(raw_data)
with mfs.open("/raw/data.csv", "rb") as f:
data = f.read()
with mfs.open("/processed/data.csv", "wb") as f:
f.write(data.upper())
mfs.rmtree("/raw") # cleanup stagingStore, list, and export multiple files as a tree:
from dmemfs import MemoryFileSystem
mfs = MemoryFileSystem()
mfs.import_tree({
"/archive/doc1.bin": b"Document 1",
"/archive/doc2.bin": b"Document 2",
"/archive/sub/doc3.bin": b"Document 3",
})
print(mfs.listdir("/archive")) # ['doc1.bin', 'doc2.bin', 'sub']
snapshot = mfs.export_tree(prefix="/archive") # dict of {path: bytes}Serialize an in-memory SQLite DB into MFS and restore it later:
import sqlite3
from dmemfs import MemoryFileSystem
mfs = MemoryFileSystem()
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, val TEXT)")
conn.execute("INSERT INTO t VALUES (1, 'hello')")
conn.commit()
with mfs.open("/snapshot.db", "wb") as f:
f.write(conn.serialize())
conn.close()
with mfs.open("/snapshot.db", "rb") as f:
raw = f.read()
restored = sqlite3.connect(":memory:")
restored.deserialize(raw)
rows = restored.execute("SELECT * FROM t").fetchall() # [(1, 'hello')]- Path/tree operations are guarded by
_global_lock. - File access is guarded by per-file
ReadWriteLock. lock_timeoutbehavior:None: block indefinitely0.0: try-lock (fail immediately withBlockingIOError)> 0: timeout in seconds, thenBlockingIOError
- Current
ReadWriteLockis non-fair: under sustained read load, writers can starve.
Operational guidance:
- Keep lock hold duration short
- Set an explicit
lock_timeoutin latency-sensitive code paths walk()andglob()provide weak consistency: each directory level is snapshotted under_global_lock, but the overall traversal is NOT atomic. Concurrent structural changes may produce inconsistent results.
from dmemfs import AsyncMemoryFileSystem
async def run() -> None:
mfs = AsyncMemoryFileSystem(max_quota=64 * 1024 * 1024)
await mfs.mkdir("/a")
async with await mfs.open("/a/f.bin", "wb") as f:
await f.write(b"data")
async with await mfs.open("/a/f.bin", "rb") as f:
print(await f.read())Minimal benchmark tooling is included:
- MFS vs
io.BytesIOvsPyFilesystem2 (MemoryFS)vstempfile - Cases: many-small-files and stream write/read
- Optional report output to
benchmarks/results/
Note: As of setuptools 82 (February 2026),
pyfilesystem2fails to import due to a known upstream issue (#597). Benchmark results including PyFilesystem2 were measured with setuptools ≤ 81 and are valid as historical comparison data.
Run:
uvx --with-requirements requirements.txt --with-editable . python benchmarks/compare_backends.py --save-md auto --save-json autoSee BENCHMARK.md for details.
Latest benchmark snapshot:
Test execution and dev flow are documented in TESTING.md.
Typical local run:
uv pip compile requirements.in -o requirements.txt
uvx --with-requirements requirements.txt --with-editable . pytest tests/ -v --timeout=30 --cov=dmemfs --cov-report=xml --cov-report=term-missingCI (.github/workflows/test.yml) runs tests with coverage XML generation.
API docs can be generated as Markdown (viewable on GitHub) using pydoc-markdown:
uvx --with pydoc-markdown --with-editable . pydoc-markdown '{
loaders: [{type: python, search_path: [.]}],
processors: [{type: filter, expression: "default()"}],
renderer: {type: markdown, filename: docs/api_md/index.md}
}'Or as HTML using pdoc (local browsing only):
uvx --with-requirements requirements.txt pdoc dmemfs -o docs/api- Core
open()is binary-only (rb,wb,ab,r+b,xb). Text I/O is available via theMFSTextHandlewrapper. - No symlink/hardlink support — intentionally omitted to eliminate path traversal loops and structural complexity (same rationale as
pathlib.PurePath). - No direct
pathlib.Path/os.PathLikeAPI — MFS paths are virtual and must not be confused with host filesystem paths. Acceptingos.PathLikewould allow third-party libraries or a plainopen()call to silently treat an MFS virtual path as a real OS path, potentially issuing unintended syscalls against the host filesystem. All paths must be plainstrwith POSIX-style absolute notation (e.g."/data/file.txt"). - No kernel filesystem integration (intentionally in-process only)
Auto-promotion behavior:
- By default (
default_storage="auto"), new files start asSequentialMemoryFileand auto-promote toRandomAccessMemoryFilewhen random writes are detected. - Promotion is one-way (no downgrade back to sequential).
- Use
default_storage="sequential"or"random_access"to fix the backend at construction; usepromotion_hard_limitto suppress auto-promotion above a byte threshold. - Storage promotion temporarily doubles memory usage for the promoted file. The quota system accounts for this, but process-level memory may spike briefly.
Security note: In-memory data may be written to physical disk via OS swap or core dumps. MFS does not provide memory-locking (e.g., mlock) or secure erasure. Do not rely on MFS alone for sensitive data isolation.
| Exception | Typical cause |
|---|---|
MFSQuotaExceededError |
write/import/copy would exceed quota |
MFSNodeLimitExceededError |
node count would exceed max_nodes (subclass of MFSQuotaExceededError) |
FileNotFoundError |
path missing |
FileExistsError |
creation target already exists |
IsADirectoryError |
file operation on directory |
NotADirectoryError |
directory operation on file |
BlockingIOError |
lock timeout or open-file conflict |
io.UnsupportedOperation |
mode mismatch / unsupported operation |
ValueError |
invalid mode/path/seek/truncate arguments |
D-MemFS ships a pytest plugin that provides an mfs fixture:
# conftest.py — register the plugin explicitly
pytest_plugins = ["dmemfs._pytest_plugin"]Note: The plugin is not auto-discovered. Users must declare it in
conftest.pyto opt in.
# test_example.py
def test_write_read(mfs):
mfs.mkdir("/tmp")
with mfs.open("/tmp/hello.txt", "wb") as f:
f.write(b"hello")
with mfs.open("/tmp/hello.txt", "rb") as f:
assert f.read() == b"hello"Design documents (Japanese):
- Architecture Spec v13 — API design, internal structure, CI matrix
- Detailed Design Spec — component-level design and rationale
- Test Design Spec — test case table and pseudocode
These documents are written in Japanese and serve as internal design references.
Key results from the included benchmark (300 small files × 4 KiB, 16 MiB stream, 2 GiB large stream):
| Case | MFS (ms) | BytesIO (ms) | tempfile (ms) |
|---|---|---|---|
| small_files_rw | 34 | 5 | 164 |
| stream_write_read | 64 | 51 | 17 |
| random_access_rw | 24 | 53 | 27 |
| large_stream_write_read | 1 438 | 7 594 | 1 931 |
| many_files_random_read | 777 | 163 | 4 745 |
MFS incurs a small overhead on tiny-file workloads but delivers significantly better performance on large streams and random-access patterns compared with BytesIO. See BENCHMARK.md and benchmark_current_result.md for full data.
Note:
tempfileresults above were measured with the system temp directory on a RAM disk. On a physical SSD/HDD,tempfileperformance will be substantially slower.
MIT License