Skip to content

feat: add DecodeFieldError/ReadFileError wrappers for error context#6128

Open
koriyoshi2041 wants to merge 2 commits intolance-format:mainfrom
koriyoshi2041:fix/error-context-wrapping
Open

feat: add DecodeFieldError/ReadFileError wrappers for error context#6128
koriyoshi2041 wants to merge 2 commits intolance-format:mainfrom
koriyoshi2041:fix/error-context-wrapping

Conversation

@koriyoshi2041
Copy link

Summary

Adds two wrapper error types to provide better context when decoding errors occur, as described in #5602:

  • DecodeFieldError (lance-encoding/src/decoder.rs): wraps errors with field name and field id. Applied at field scheduler creation in both structural and legacy decode paths (Struct, List, FixedSizeList, Map).
  • ReadFileError (lance-file/src/reader.rs): wraps errors with file path. Applied at FileReader::read_tasks and read_stream_projected_blocking entry points. FileReader now stores the file path.

Both implement std::error::Error with proper source chaining and From<...> for lance_core::Error, producing error chains like:

Error: failed to read file 'data.lance'

Caused by:
    0: failed to decode field 'age' (id=10)
    1: number out of range

Design follows the approach from https://sabrinajewson.org/blog/errors as referenced in the issue.

Test plan

  • Unit tests for DecodeFieldError display, source chain, and conversion to lance_core::Error (3 tests)
  • Unit tests for ReadFileError display, source chain with nested DecodeFieldError, and conversion to lance_core::Error (3 tests)
  • cargo fmt --check passes
  • cargo clippy --all-targets -p lance-encoding -p lance-file passes
  • All existing tests pass

Closes #5602

…r context

Add wrapper error types that provide contextual information when decoding
errors occur:

- DecodeFieldError: wraps errors with field name and field id, applied at
  field scheduler creation points in both structural and legacy decode paths
- ReadFileError: wraps errors with file path, applied at FileReader's
  read_tasks and read_stream_projected_blocking entry points

This produces error chains like:
  failed to read file 'data.lance'
    Caused by:
      0: failed to decode field 'age' (id=10)
      1: number out of range

Closes lance-format#5602
@github-actions github-actions bot added the enhancement New feature or request label Mar 8, 2026
@github-actions
Copy link
Contributor

github-actions bot commented Mar 8, 2026

PR Review

Good approach following the error wrapping pattern from the linked blog post. The DecodeFieldError / ReadFileError types with proper source() chaining will meaningfully improve debuggability.

Issues

P1 — Incomplete wrapping in legacy decode path

The structural path wraps child field errors for Struct, List, FixedSizeList, and Map. But in the legacy path, only Struct children (line ~990) get wrapped. The legacy create_list_scheduler (line 634) calls create_legacy_field_scheduler for the list's child without wrapping:

let items_scheduler =
    self.create_legacy_field_scheduler(&list_field.children[0], column_infos, buffers)?;

This should also wrap with DecodeFieldError for consistency. Same for any other legacy compound types (FixedSizeList, etc.) that recurse into children.

P1 — io_source loses original error category

Both From<DecodeFieldError> and From<ReadFileError> convert via Error::io_source(). This means if the original error was e.g. Error::invalid_input(...), the wrapped error becomes an IO variant. This changes the error category, which could break callers that match on error variants for control flow. Consider preserving the original error's category or using a category-agnostic wrapping mechanism.

Nit

The #[cfg(test)] block had a comment // test coalesce indices to ranges removed — this looks like an accidental drive-by deletion (the comment described the test below it).

- Wrap child field errors with DecodeFieldError in legacy list decode
  path (create_list_scheduler), matching the structural path's coverage
- Use Error::wrapped() instead of Error::io_source() in both
  From<DecodeFieldError> and From<ReadFileError> to preserve the
  original error's category instead of converting everything to IO
- Restore accidentally deleted "test coalesce indices to ranges" comment
@koriyoshi2041
Copy link
Author

Thanks for the thorough review! All three issues have been addressed in the latest commit:

  1. P1 — Legacy decode path: Added DecodeFieldError wrapping in create_list_scheduler for child field errors, matching the structural path's behavior.

  2. P1 — Error category preservation: Changed From<DecodeFieldError> and From<ReadFileError> to use Self::wrapped() instead of Self::io_source(), so the original error category is preserved in the chain.

  3. Nit — Restored comment: Put back the // test coalesce indices to ranges comment above mod tests.

All 428 tests pass (362 lance-encoding + 66 lance-file).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add ReadFileError and DecodeColumnError wrappers to provide better context for decoding errors

1 participant