LookupIndex: add context manager protocol and resource cleanup in transform_spec #143

@turbomam

Summary

LookupIndex holds a DuckDB in-memory connection but lacks the context manager protocol (__enter__/__exit__), and transform_spec() in engine.py creates a LookupIndex without ever calling close(). This means the DuckDB connection leaks after iteration completes.

Current behavior

# engine.py, lines 44-45
if transformer.lookup_index is None:
    transformer.lookup_index = LookupIndex()
# ... iteration happens ...
# LookupIndex.close() is never called

After transform_spec() finishes, transformer.lookup_index still has an open DuckDB connection. In long-running processes or repeated transforms, this leaks connections.

Proposed fix

1. Add context manager protocol to LookupIndex

class LookupIndex:
    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        # Always release the connection; returning None lets exceptions propagate.
        self.close()
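
For illustration, here is a minimal, self-contained sketch of how the protocol behaves. It is not the project's code: `LookupIndexSketch` and its `closed` property are hypothetical names, and the standard-library `sqlite3` in-memory connection stands in for the DuckDB connection so the example runs without extra dependencies. It also assumes `close()` should be safe to call more than once, which the issue does not state.

```python
import sqlite3


class LookupIndexSketch:
    """Illustrative stand-in, not the project's LookupIndex."""

    def __init__(self):
        # Assumption: sqlite3 stands in for the DuckDB in-memory connection.
        self._conn = sqlite3.connect(":memory:")

    def close(self):
        # Idempotent: a second call is a no-op.
        if self._conn is not None:
            self._conn.close()
            self._conn = None

    @property
    def closed(self):
        return self._conn is None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not suppress exceptions raised in the with-body


with LookupIndexSketch() as index:
    assert not index.closed  # connection is open inside the block
assert index.closed  # and released on exit
```

Because `__exit__` runs on both normal and exceptional exit, the connection is released even when the `with` body raises.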

2. Clean up in transform_spec

Either:

  • Use a try/finally that calls transformer.lookup_index.close() after all class_derivation blocks are processed
  • Or make the caller responsible (but document this clearly)

The try/finally in engine.py already exists per-block for dropping joined tables — a top-level cleanup just needs to wrap the outer loop.
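
A rough sketch of the top-level cleanup follows, assuming `transform_spec()` is a generator (the issue refers to iteration completing). Every name here (`FakeLookupIndex`, `FakeTransformer`, `transform_spec_sketch`, the `blocks` argument) is hypothetical, and the real loop body is replaced with a placeholder. The sketch also assumes that only an index created inside the function should be closed by it; the issue leaves that ownership question open.

```python
class FakeLookupIndex:
    """Hypothetical stand-in that only records whether close() was called."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class FakeTransformer:
    """Hypothetical stand-in for the transformer object."""

    def __init__(self, lookup_index=None):
        self.lookup_index = lookup_index


def transform_spec_sketch(transformer, blocks):
    # Track ownership: close only an index this function created.
    owns_index = transformer.lookup_index is None
    if owns_index:
        transformer.lookup_index = FakeLookupIndex()
    try:
        for block in blocks:
            # Placeholder for the per-block work (which has its own try/finally).
            yield block.upper()
    finally:
        # Runs on exhaustion, on an exception, or when the generator is closed.
        if owns_index:
            transformer.lookup_index.close()


transformer = FakeTransformer()
results = list(transform_spec_sketch(transformer, ["a", "b"]))
assert transformer.lookup_index.closed
```

One caveat with the generator form: if a caller stops iterating early and never calls `.close()` on the generator, the `finally` block only runs when the generator is garbage-collected, which is one argument for also offering the context manager to callers.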

Test coverage

PR #144 adds 3 tests that currently fail, demonstrating the gap:

  • test_context_manager_basic: LookupIndex() used as a context manager
  • test_context_manager_cleans_up_on_exception: cleanup on exception
  • test_transform_spec_closes_lookup_index: connection closed after iteration

Labels: bug, enhancement
