Conversation
Why are these changes being introduced:
* The example lambda function and test files are no longer needed and are being removed to clean up the codebase.
Why are these changes being introduced:
* Adds a health check/ping endpoint to the tokenizer lambda, which can be used by AWS to determine if the lambda is healthy and ready to receive traffic.
* This is important for ensuring high availability and reliability of the service.
* This also allows us to keep the lambda warm, which can improve performance by reducing cold start latency.

Relevant ticket(s):
* https://mitlibraries.atlassian.net/browse/USE-431

How does this address that need:
* Adds a check for a "ping" key in the event; if present, the handler returns a simple response indicating the service is healthy.

Document any side effects to this change:
* We cannot use the "ping" key in the event for any other purpose, as it is now reserved for health checks.
These tests are not strictly necessary, but I was nervous that we never actually tested the tokenizer handler with a real tokenizer. These tests load a real tokenizer and verify that the handler can generate tokens for a simple query. They are marked as "integration" tests since they load real models from disk and are slower than typical unit tests. They are enabled by default, but can be skipped with `pytest -m "not integration"`. The total test time does not increase enough to justify skipping these tests by default at this time, but we can change the default in the future if needed.
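For reference, registering a pytest marker is a small config addition; a minimal sketch of the `pyproject.toml` entry, assuming the existing `[tool.pytest.ini_options]` table shown later in this PR:

```toml
[tool.pytest.ini_options]
log_level = "INFO"
markers = [
    "integration: loads real models from disk; deselect with -m \"not integration\"",
]
```

Registering the marker keeps `pytest --strict-markers` (if ever enabled) from rejecting `@pytest.mark.integration` as a typo.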
Pull request overview
Adds a lightweight “ping” health-check path to the tokenizer Lambda so AWS (and/or internal warmers) can verify readiness and keep the function warm, plus expands the test suite accordingly.
Changes:
- Add `ping` event handling to `tokenizer_handler.lambda_handler`.
- Add unit + integration tests for tokenizer behavior and handler behavior.
- Remove the unused example `my_function` Lambda and its test; register an `integration` pytest marker.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `lambdas/tokenizer_handler.py` | Adds ping/health-check short-circuit behavior in the Lambda handler. |
| `tests/test_tokenizer_handler.py` | Adds unit tests for ping behavior and an integration handler test. |
| `tests/test_query_tokenizer.py` | Adds integration tests that load the real tokenizer/IDF from disk. |
| `pyproject.toml` | Registers the `integration` pytest marker. |
| `lambdas/my_function.py` | Removes unused sample Lambda implementation. |
| `tests/test_my_function.py` | Removes tests for the deleted sample Lambda. |
```diff
@@ -28,6 +28,9 @@ exclude = ["tests/"]

 [tool.pytest.ini_options]
 log_level = "INFO"
```
The new integration tests are marked, but nothing here configures pytest to exclude them by default. Given the repo's default test command (`make test`) runs plain `pytest`, these slow disk/model-loading tests will run on every CI/local unit-test run unless callers remember `-m "not integration"`. Consider adding a default `addopts` that excludes integration tests (and documenting how to opt in), or otherwise ensuring CI/test tooling explicitly selects/filters markers.
```diff
 log_level = "INFO"
+addopts = "-m 'not integration'"
```
My local testing shows the tests didn't add significant time so I did not change the defaults.
```python
@pytest.mark.integration
def test_integration_tokenize_query_returns_nonempty_dict():
    """Real tokenizer and IDF: tokenize_query returns a non-empty dict."""
    qt = QueryTokenizer()
    result = qt.tokenize_query("machine learning")
    assert isinstance(result, dict)
    assert len(result) > 0


@pytest.mark.integration
def test_integration_tokenize_query_values_are_positive_floats():
    """Real tokenizer and IDF: all weights are positive floats."""
    qt = QueryTokenizer()
    result = qt.tokenize_query("machine learning")
    for token, weight in result.items():
```
These integration tests each instantiate QueryTokenizer() separately, which can be noticeably slow because it loads tokenizer assets from disk each time. Since they’re all validating behavior of the same real tokenizer, consider using a module-scoped fixture (or caching) so the tokenizer is loaded once per test session/module.
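The comment's caching suggestion could be sketched like this. The `QueryTokenizer` below is a lightweight stand-in so the sketch is self-contained; in the repo, the real class (whose slow part is loading assets from disk) would be imported instead:

```python
from functools import lru_cache


class QueryTokenizer:
    """Stand-in for the real tokenizer; construction is the expensive step."""

    loads = 0  # count constructions to demonstrate the caching

    def __init__(self):
        QueryTokenizer.loads += 1  # the real class loads tokenizer/IDF assets here

    def tokenize_query(self, query: str) -> dict:
        return {token: 1.0 for token in query.split()}


@lru_cache(maxsize=1)
def get_tokenizer() -> QueryTokenizer:
    """Build the tokenizer once and reuse it across all tests."""
    return QueryTokenizer()


# Repeated calls share one instance, so assets load from disk only once.
first = get_tokenizer()
second = get_tokenizer()
```

A module-scoped `@pytest.fixture(scope="module")` returning `QueryTokenizer()` would achieve the same thing inside the test suite.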
This currently does not seem to be an issue. Runs that skip the integration tests and runs that include them take roughly the same time, most of which is pytest startup.
ghukill left a comment
Approved!
We've talked out-of-band of this PR about a possibly more robust (and complicated!) ping handler that might handle ping requests the same across all lambdas. I think it's a discussion still worth having, particularly for lambdas that field HTTP request invocations, but this is clearly a very effective and lightweight way to support ping behavior right now.
Full support for going this route for now; even if we pivot to a decorated lambda that standardizes our ping behavior, it would only require pulling out a couple of lines here.
Also, big fan of the integration checks as well.
Looking good to me.
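The "decorated lambda" idea mentioned above could look something like this sketch. All names and the response shape are illustrative assumptions, not the repo's actual API:

```python
from functools import wraps


def with_ping(handler):
    """Hypothetical decorator that answers "ping" events uniformly across lambdas."""

    @wraps(handler)
    def wrapper(event, context):
        # Short-circuit health checks before the wrapped handler runs.
        if isinstance(event, dict) and event.get("ping"):
            return {"response": "pong"}
        return handler(event, context)

    return wrapper


@with_ping
def lambda_handler(event, context):
    # Normal invocation path for this particular lambda.
    return {"handled": True}
```

One caveat relevant to this PR: a decorator short-circuits before the handler body, so any per-invocation warm-up the handler does (like the tokenizer initialization below) would need to move to module import time for pings to still keep the function warm.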
```python
# We add this after query_tokenizer initialization to ensure the tokenizer is
# initialized during cold start, even for pings
```
Really appreciated this comment. I got concerned that the tokenizer was re-initialized for each invocation of the lambda, but then (re)remembered cached _get_tokenizer() pattern. So I'm digging the comment as it provides just enough friction to slow down and think about why that's placed there.
Yeah, I initially had the ping condition earlier in the handler because I wanted it to short-circuit all the heavy lifting, then realized that would defeat the "keep warm" aspect of our health checks if it didn't actually happen after the tokenizer initialization.
```python
# ---------------------------------------------------------------------------


@pytest.mark.integration
```
+1 to the integration mark, just in case it could be helpful.
Note
There are two not-strictly-necessary commits included in this PR. One removes the example lambda that was not used. The second adds some non-entirely-necessary-but-made-me-feel-better integration tests. Mocks are great, but I get nervous when everything is a mock.
Includes new or updated dependencies?
NO
Changes expectations for external applications?
NO
Code review