feat: HTTP-proxy LangGraph checkpointer #258
base: main
0c56de4 to
db0f5ab
Compare
    async def _post(self, path: str, body: dict[str, Any]) -> Any:
        """POST JSON to the backend and return parsed response."""
        resp = self._client._client.post(  # noqa: SLF001
what is the double _client call here? can we remove that?
stored as class variable!
    @override
    async def aget_tuple(self, config: RunnableConfig) -> CheckpointTuple | None:
        configurable = config["configurable"]  # type: ignore[reportTypedDictNotRequiredAccess]
do we want to make these [""] lookups safer by using get() so it doesn't panic in the code? not sure if this would be caught anywhere, but just thought I'd call it out
these are needed by langgraph constructs, we should be failing loudly if they aren't there
using get() would produce None, which would silence the error
i guess my point is that this will throw a KeyError that won't be returned nicely to the user, vs. us adding handlers for this ourselves to help the user fix it
not blocking, but maybe something to think about if this could be an easy mistake when creating a langgraph agent using our sdk
I see, that makes sense. Can wrap, sure!!
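One way to keep the loud failure while giving the user something actionable, as discussed in this thread, is to wrap the lookup and re-raise with a descriptive message. This is only a minimal sketch, not the code in this PR: `require_configurable` is a hypothetical helper, and treating `thread_id` as the required key is an assumption made for illustration.

```python
from typing import Any, Mapping


def require_configurable(
    config: Mapping[str, Any],
    required: tuple[str, ...] = ("thread_id",),  # assumed required key(s)
) -> Mapping[str, Any]:
    """Return config["configurable"], failing loudly with a helpful message."""
    try:
        configurable = config["configurable"]
    except KeyError:
        # Still an error, but one that tells the user how to fix the call site.
        raise ValueError(
            "config is missing 'configurable'; pass e.g. "
            "config={'configurable': {'thread_id': '...'}} when invoking the graph"
        ) from None
    missing = [key for key in required if key not in configurable]
    if missing:
        raise ValueError(
            f"config['configurable'] is missing required keys: {missing}"
        )
    return configurable
```

Unlike `get()`, this never lets a `None` flow further into the checkpointer; it only changes which exception the user sees and what the message says.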
db0f5ab to
8145407
Compare
danielmillerp
left a comment
comments
61d9316 to
4cb3d7f
Compare
Replace direct Postgres checkpointing with HTTP-proxied checkpoint operations through the agentex backend API. Agents no longer need DATABASE_URL or direct DB connections for LangGraph state persistence.

- Add HttpCheckpointSaver that proxies through AsyncAgentex client
- Add create_checkpointer() factory using the HTTP checkpointer
- Replace langgraph-checkpoint-postgres dep with langgraph-checkpoint
- Export checkpointer module from adk package

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
4cb3d7f to
359bb12
Compare
What this does
This PR replaces LangGraph's direct-Postgres checkpointer with an HTTP-proxy checkpointer that routes all checkpoint operations through the agentex backend API.
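To illustrate the proxy idea, here is a minimal sketch in which each checkpoint operation becomes a JSON POST instead of a SQL query. It is not the PR's `HttpCheckpointSaver`: `ProxyCheckpointer`, the injected `post` callable, and the request body fields are hypothetical, and a real implementation would subclass LangGraph's `BaseCheckpointSaver` and use the SDK's HTTP client. Only the endpoint paths are taken from this PR description.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Hypothetical transport: an async callable that POSTs JSON and returns parsed JSON.
PostFn = Callable[[str, dict[str, Any]], Awaitable[Any]]


class ProxyCheckpointer:
    """Toy checkpointer that forwards operations over HTTP instead of SQL."""

    def __init__(self, post: PostFn) -> None:
        self._post = post

    async def aget_tuple(self, thread_id: str) -> Any:
        # Body shape is an assumption; the real payload is defined by the backend.
        return await self._post("/checkpoints/get-tuple", {"thread_id": thread_id})

    async def aput(self, thread_id: str, checkpoint: dict[str, Any]) -> Any:
        return await self._post(
            "/checkpoints/put", {"thread_id": thread_id, "checkpoint": checkpoint}
        )


async def demo() -> list[str]:
    calls: list[str] = []
    store: dict[str, Any] = {}

    async def fake_post(path: str, body: dict[str, Any]) -> Any:
        # In-memory stand-in for the backend so the sketch runs without a network.
        calls.append(path)
        if path == "/checkpoints/put":
            store[body["thread_id"]] = body["checkpoint"]
            return {"ok": True}
        return store.get(body["thread_id"])

    saver = ProxyCheckpointer(fake_post)
    await saver.aput("t1", {"step": 1})
    assert await saver.aget_tuple("t1") == {"step": 1}
    return calls
```

Running `asyncio.run(demo())` exercises a put followed by a get. The agent side holds no database connection at all; only whatever sits behind `post` does.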
Why
LangGraph agents need to persist state (checkpoints) between messages. The built-in approach (AsyncPostgresSaver) has each agent pod open its own Postgres connection pool. This doesn't scale: more agent pods means more connections, and we'd hit limits as we grow LangGraph usage. This is the same problem we already solved for Temporal: agents shouldn't talk to the DB directly. Instead, they go through the backend API, which manages a shared connection pool.
How it works
We implemented LangGraph's BaseCheckpointSaver abstract class, the same interface that AsyncPostgresSaver implements, but instead of running SQL queries, each method makes an HTTP POST to the backend:

| BaseCheckpointSaver method | HTTP request |
| --- | --- |
| aget_tuple() | POST /checkpoints/get-tuple |
| aput() | POST /checkpoints/put |
| aput_writes() | POST /checkpoints/put-writes |
| alist() | POST /checkpoints/list |
| adelete_thread() | POST /checkpoints/delete-thread |

The HttpCheckpointSaver uses the existing AsyncAgentex httpx client (with EnvAuth for the agent API key), so authentication works the same way as every other SDK→backend call.
Serialization stays in the SDK: complex Python objects are serialized via LangGraph's serde, then base64-encoded for JSON transport. The backend just stores and retrieves the raw data.
What changed
- HttpCheckpointSaver class: the HTTP-proxy checkpointer implementation
- create_checkpointer() factory: returns an HttpCheckpointSaver wired up with the SDK client
- langgraph-checkpoint-postgres and psycopg[binary] dependencies (no longer needed)
- langgraph-checkpoint (base types only: BaseCheckpointSaver, CheckpointTuple, etc.)

Impact on agents
Zero code changes. The create_checkpointer() API is unchanged: agents just call it and get back a checkpointer. The only difference is they no longer need DATABASE_URL in their environment.
Companion PR
/checkpoints endpoints, ORM models, migration, repository with 19 integration tests
Test plan
🤖 Generated with Claude Code
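The serialization step described under "How it works" can be sketched as follows. Raw bytes are not valid JSON, so they are base64-encoded into an ASCII string before transport and decoded on the way back. This sketch uses `pickle` as a stand-in for LangGraph's serde so that it runs with the standard library alone; `encode_for_transport`, `decode_from_transport`, and the `type`/`blob` field names are hypothetical and do not describe the PR's actual wire format.

```python
import base64
import json
import pickle
from typing import Any


def encode_for_transport(obj: Any) -> dict[str, str]:
    """Serialize obj to bytes, then base64-encode so it fits in a JSON body."""
    raw = pickle.dumps(obj)  # stand-in for LangGraph's serde (assumption)
    return {"type": "pickle", "blob": base64.b64encode(raw).decode("ascii")}


def decode_from_transport(payload: dict[str, str]) -> Any:
    """Reverse of encode_for_transport: base64-decode, then deserialize."""
    raw = base64.b64decode(payload["blob"])
    return pickle.loads(raw)


# A set is not JSON-serializable on its own, so it needs the bytes + base64 path.
state = {"messages": [("user", "hi")], "seen": {1, 2, 3}}
wire = json.dumps(encode_for_transport(state))  # what would go in the POST body
restored = decode_from_transport(json.loads(wire))
```

In this arrangement the backend only ever handles the opaque `blob` string, which is consistent with the description that it just stores and retrieves raw data.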