Summary
LookupIndex holds a DuckDB in-memory connection but lacks the context manager protocol (__enter__/__exit__), and transform_spec() in engine.py creates a LookupIndex without ever calling close(). This means the DuckDB connection leaks after iteration completes.
Current behavior
    # engine.py, lines 44-45
    if transformer.lookup_index is None:
        transformer.lookup_index = LookupIndex()
    # ... iteration happens ...
    # LookupIndex.close() is never called
After transform_spec() finishes, transformer.lookup_index still has an open DuckDB connection. In long-running processes or repeated transforms, this leaks connections.
Proposed fix
1. Add context manager protocol to LookupIndex
    class LookupIndex:
        def __enter__(self):
            return self

        def __exit__(self, *exc_info):
            # Returns None, so an exception raised inside the with-body still propagates
            self.close()
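To make the intended behavior concrete, here is a minimal, self-contained sketch of the protocol in use. It is not the real class: sqlite3 stands in for DuckDB so the example runs with the standard library only, and the `_conn` attribute, the `closed` property, and the idempotent `close()` are assumptions made for illustration.

```python
import sqlite3


class LookupIndex:
    """Illustrative stand-in: sqlite3 replaces DuckDB, attribute names are hypothetical."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")

    def close(self):
        # Idempotent: calling close() a second time is a no-op.
        if self._conn is not None:
            self._conn.close()
            self._conn = None

    @property
    def closed(self):
        return self._conn is None

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        # Returns None, so exceptions from the with-body are not swallowed.
        self.close()


# Normal exit: the connection is released when the block ends.
with LookupIndex() as index:
    index._conn.execute("SELECT 1")
assert index.closed

# Exit via exception: cleanup still runs, and the error still reaches the caller.
try:
    with LookupIndex() as failing:
        raise ValueError("boom")
except ValueError:
    pass
assert failing.closed
```

Making `close()` idempotent is a design choice in this sketch, not something the issue requires; it keeps an explicit `close()` after a `with` block harmless.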
2. Clean up in transform_spec
Either:
- Use a try/finally that calls transformer.lookup_index.close() after all class_derivation blocks are processed
- Or make the caller responsible (but document this clearly)
The try/finally in engine.py already exists per-block for dropping joined tables — a top-level cleanup just needs to wrap the outer loop.
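The first option could look roughly like the sketch below. It assumes transform_spec() is a generator, which the phrase "after iteration completes" suggests but the issue does not confirm. `FakeLookupIndex`, `FakeTransformer`, the `blocks` parameter, and the `owns_index` flag are all hypothetical names introduced for illustration; the real engine.py code will differ.

```python
class FakeLookupIndex:
    """Hypothetical stand-in that only records whether close() was called."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class FakeTransformer:
    """Hypothetical stand-in for the transformer object."""

    def __init__(self, lookup_index=None):
        self.lookup_index = lookup_index


def transform_spec(transformer, blocks):
    # Only clean up an index this call created; a caller-supplied one is left alone.
    owns_index = transformer.lookup_index is None
    if owns_index:
        transformer.lookup_index = FakeLookupIndex()
    try:
        for block in blocks:
            try:
                yield block.upper()  # placeholder for the real per-block work
            finally:
                pass  # the existing per-block cleanup (dropping joined tables) stays here
    finally:
        if owns_index:
            transformer.lookup_index.close()
            # Reset so a later call creates a fresh index instead of reusing a closed one.
            transformer.lookup_index = None


transformer = FakeTransformer()
gen = transform_spec(transformer, ["a", "b"])
first = next(gen)
index = transformer.lookup_index
assert not index.closed  # still open while iteration is in progress
rest = list(gen)
assert index.closed and transformer.lookup_index is None
```

Because the cleanup sits in a generator's `finally`, it also runs when the caller stops early and the generator is closed or garbage-collected. Whether a caller-supplied index should be closed as well is a separate decision that this sketch deliberately leaves to the caller.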
Test coverage
PR #144 adds 3 tests that currently fail, demonstrating the gap:
test_context_manager_basic — LookupIndex() as context manager
test_context_manager_cleans_up_on_exception — cleanup on exception
test_transform_spec_closes_lookup_index — connection closed after iteration
Related
(… LookupIndex and engine.py were introduced)