Fix Encodable for Dataclass and Scalar in #548#571
Fix Encodable for Dataclass and Scalar in #548#571eb8680 merged 45 commits intoeb-remove-responsefrom
Conversation
|
I found some failure tests. Checking more before re-opening |
|
The problem is the traces being replayed, which use the old |
Yes. Fixtures are part of it, but also these following cases now fail, which I think because of this comment #548 (comment) (we dropped the default @Template.define
def primes(first_digit: int) -> int:
"""Give a prime number with {first_digit} as the first digit. Do not use any tools."""
raise NotHandled
with handler(provider):
assert type(primes(6)) is intResult: APIConnectionError: litellm.APIConnectionError: APIConnectionError: OpenAIException - Unsupported response_format type - <class 'int'>And this: @dataclasses.dataclass
class KnockKnockJoke:
whos_there: str
punchline: str
@Template.define
def write_joke(theme: str) -> KnockKnockJoke:
"""Write a knock-knock joke on the theme of {theme}. Do not use any tools."""
raise NotHandled
@Template.define
def rate_joke(joke: KnockKnockJoke) -> bool:
"""Decide if {joke} is funny or not. Do not use any tools."""
raise NotHandled
def do_comedy():
joke = write_joke("lizards")
print("> You are onstage at a comedy club. You tell the following joke:")
print(
f"Knock knock.\nWho's there?\n{joke.whos_there}.\n{joke.whos_there} who?\n{joke.punchline}"
)
if rate_joke(joke):
print("> The crowd laughs politely.")
else:
print("> The crowd stares in stony silence.")
with handler(provider):
do_comedy()Result: APIConnectionError: litellm.APIConnectionError: APIConnectionError: OpenAIException - Unsupported response_format type - <class '__main__.KnockKnockJoke'> |
|
The solution is not to handle scalars or dataclasses or other types that can theoretically be coerced into Pydantic piecemeal, but to do the encoding generically in the base case of |
I see, I think we can make encoding more generic is adding |
effectful/handlers/llm/encoding.py
Outdated
|
|
||
|
|
||
| @functools.cache | ||
| def _wrapped_response_model(ty: Hashable) -> type[pydantic.BaseModel]: |
There was a problem hiding this comment.
name of this function could be made a little clearer - it creates a pydantic model where value is T, ..., from the call sites I guess it's only used inside _encodable_object? maybe it could be inlined? it feels a bit bespoke.
There was a problem hiding this comment.
Yeah, I tried inlining it, but functools.cache doesnt work very well with nested function. Maybe after all suggested changes below, this would be no longer needed.
effectful/handlers/llm/encoding.py
Outdated
| Encodable[T, U], | ||
| WrappedEncodable( | ||
| ty, | ||
| typing.cast(type[pydantic.BaseModel], model_cls), |
There was a problem hiding this comment.
is this cast here needed? _wrapped_response_model aloready returns a type[pydantic.BaseModel]
effectful/handlers/llm/encoding.py
Outdated
| encoded_args = _param_model(sig).model_validate( | ||
| { | ||
| k: Encodable.define(type(v), self.ctx).encode(v) | ||
| k: Encodable.define(sig.parameters[k].annotation, self.ctx).encode(v) |
There was a problem hiding this comment.
wouldn't nested_type(v) handle this? The runtime types should be more precise than any annotation on the signature.
effectful/handlers/llm/encoding.py
Outdated
| __config__={"extra": "forbid"}, | ||
| **{ | ||
| name: Encodable.define(param.annotation).enc | ||
| name: _embedded_type_from_enc(Encodable.define(param.annotation)) |
There was a problem hiding this comment.
this feels a little ad-hoc. Can't this _embedded_type_from_enc be folded into the respective smart constructors of Wrapped, Tuple and MutableSequence-Encodable?
There was a problem hiding this comment.
Thanks, I wanted to make the change centralized to here, but adding it to constructor is better architecturally.
effectful/handlers/llm/encoding.py
Outdated
| return typing.cast(typing.Any, self.adapter.validate_python(value)) | ||
|
|
||
| def decode(self, encoded_value: typing.Any) -> T: | ||
| if isinstance(encoded_value, pydantic.BaseModel): |
There was a problem hiding this comment.
shouldn't the invariant be that the input to encode should always be the enc type such that these kinds of isinstance checks can be avoided?
There was a problem hiding this comment.
Thanks! The invariant is broken after I started using embedded/wrapped type to fix scalar error.
I'm reverting that now and thinking if I can better fix it somehow.
There was a problem hiding this comment.
Yeah I remember the need for embedded_type now, we want to have formatting for template being different from pydandic base model (int 3 should be just 3 instead of {"value": 3}).
There was a problem hiding this comment.
hmmm, maybe the encodable for ints should be simpler? or that logic of unwrapping the value should be encapsulated inside the formatting section (i.e if the return value is a dict with a single field "value", then just inject the value directly?)
There was a problem hiding this comment.
Yeah, but I feel it's kind of adhoc/unsure how sound or complete we need this to be.
|
Temporary rebase to master to see if tests work. |
effectful/handlers/llm/encoding.py
Outdated
|
|
||
|
|
||
| @dataclass | ||
| class MappingEncodable[K, V](Encodable[Mapping[K, V], typing.Any]): |
There was a problem hiding this comment.
MappingEncodable doesn't seem to be doing anything different from BaseEncodable, why do we need it?
|
|
||
| @staticmethod | ||
| @functools.cache | ||
| def wrapped_model(ty: Hashable) -> type[_BoxEncoding[Any]]: |
There was a problem hiding this comment.
Can this just be inlined into _encodable_object? Where else do you need to call it?
effectful/handlers/llm/encoding.py
Outdated
|
|
||
| return typing.cast( | ||
| Encodable[Mapping[K, V], U], | ||
| MappingEncodable(ty, ty, ctx), |
There was a problem hiding this comment.
Why is ty being passed as both base and enc here? I would expect enc to be a BaseModel. Is this code path being exercised at all in the tests?
tests/test_handlers_llm_encoding.py
Outdated
| _TOOL_IDS = frozenset( | ||
| {"tool-add", "tool-greet", "tool-process", "tool-no-params", "tool-pydantic-param"} | ||
| ) | ||
| _DTC_IDS = frozenset( |
There was a problem hiding this comment.
This appears to be unused now
* Adding tests * Lint * More lint * Minor * Removing helper functions * Check deserialize invariant * Minor fix for typeddict * Better pydantic-based check * Lint * Test ToolSchema * Inline completion with tools * More inlining * reimport requires_openai * Removing casts * mark xfail some tests agian * Merging integration test into encoding
effectful/handlers/llm/encoding.py
Outdated
| ) -> Encodable[Mapping[K, V], U]: | ||
| ctx = {} if ctx is None else ctx | ||
|
|
||
| enc: type[Any] = ty |
There was a problem hiding this comment.
Not sure what happened here in response to my previous comment on this function. Originally it was wrong because ty is not a BaseModel:
return MappingEncodable(ty, enc=ty, ctx=ctx) # ty is not a BaseModelNow it's still wrong for the same reason, just with an extra layer of indirection:
totally_not_wrong = ty # still not a BaseModel
return MappingEncodable(ty, enc=totally_not_wrong, ctx=ctx)|
|
||
| @staticmethod | ||
| @functools.cache | ||
| def _typeddict_model(td: type[Any]) -> type[pydantic.BaseModel]: |
There was a problem hiding this comment.
It seems like this is the only part of MappingEncodable that is different from BaseEncodable. Maybe just have a separate TypedDictEncodable and get rid of MappingEncodable?
There was a problem hiding this comment.
TypedDictEncodable still doesn't seem any different from BaseEncodable, i.e. BaseEncodable(ty, enc, ctx) is indistinguishable from TypedDictEncodable(ty, enc, ctx) for the same enc. If the only difference is the encoding, you can just move this logic into _encodable_mapping below and get rid of TypedDictEncodable.
|
Previously list-image and other tool call was failing in indempotency due to: Demonstration of (2): from typing import TypedDict
from effectful.internals.unification import nested_type
class UserTD(TypedDict):
name: str
age: int
print("nested_type(UserTD(name='a', age=1)).value ->", nested_type(UserTD(name='a', age=1)).value)Result: |
The arbitrary decisions about which types are
I think this can be regarded as a bug (or at least missing feature) in |
|
|
||
|
|
||
| @dataclass | ||
| class _ComplexEncodable(Encodable[complex, _ComplexParts]): |
There was a problem hiding this comment.
Is complex not compatible with BaseEncodable and _BoxEncoding? When I tried making a BaseModel with a complex-valued field in the REPL and validating/serializing data, it seemed to work fine.
| identity_encoder = typing.cast( | ||
| Encodable[T, typing.Any], | ||
| BaseEncodable( | ||
| typing.cast(type[T], object), | ||
| typing.cast( | ||
| type[_BoxEncoding[T]], | ||
| BaseEncodable.wrapped_model(typing.cast(Hashable, object)), |
There was a problem hiding this comment.
All the casts here are quite confusing/misleading (why Hashable?), just use # type: ignore.
| self, encoded_value: pydantic.BaseModel | Mapping[str, Any] | ||
| ) -> Image.Image: | ||
| normalized = self.enc.model_validate(encoded_value) | ||
| image_url = typing.cast(str, getattr(normalized, "url")) |
There was a problem hiding this comment.
If this cast is saying image_url is not None, prefer an assertion about that (assert normalized.url is not None) - cast is just a lie to the type checker. In general you should only be using cast to coerce to types that are unavailable at runtime, like generics, TypeVars or TypedDict.
|
|
||
| @staticmethod | ||
| @functools.cache | ||
| def wrapped_model(ty: Hashable) -> type[_BoxEncoding[Any]]: |
|
|
||
| @staticmethod | ||
| @functools.cache | ||
| def _typeddict_model(td: type[Any]) -> type[pydantic.BaseModel]: |
* Move logic from completion to encoding * errors * attempt to appease mypy * nit * import * nit * fix * nit * simplify and add tests * update tests * tests * more consolidation * stash * refactor tests * Fix Encodable for Dataclass and Scalar in #548 (#571) * Adding scalar encoding, make tools use signature and update fixtures * Lint * Lint * More general wrappedencodable * Lint * Architectural fixes * Removing embedded_type * Lint * Minor * Minor fix * Lint * Add param_schema_type * Minor * New Encodable's architecture * Recursive dataclass encodable * Linting * Simplify encodable * Minor * Fix test * Lint * Trim * More trimming * Handling images * Lint * Further trimming * Updating test template formatting * Minor * Add MappingEncodable and cover more tests * Lint * More precise BaseEncodable * Lint * Removing adapter from base * Move wrapped model to a staticmethod * More concise handling of empty lists * Minor * Use nested_type as _wrapped_model is resolved * Remove DTC_IDS * Minor fix * Special typeddict handling * Complex Encodable * Removing casts * Add Encoding/Integration Parameterized Tests (#582) * Adding tests * Lint * More lint * Minor * Removing helper functions * Check deserialize invariant * Minor fix for typeddict * Better pydantic-based check * Lint * Test ToolSchema * Inline completion with tools * More inlining * reimport requires_openai * Removing casts * mark xfail some tests agian * Merging integration test into encoding * Enforce Tool and Image to be PydanticBase model * Update fixtures * Update minor fixture --------- Co-authored-by: Dat Nguyen (Marc) <15943389+datvo06@users.noreply.github.com>
This PR: