[ELI-619] - cache campaign configs unless we send a reserved consumer id#610
Conversation
This is just a PoC; I'll wait to add test coverage to see if people like it.
I checked the updated PR and it looks good. Nice solution. (The previous solution used headers to enable/disable the cache.) I think it just needs a bit of thought around how we control the hard-coded test consumer ids, to see if that is what we want. If yes, then it all looks great to me.
So yes, you are right, in the sense that we will have to make sure that the consumer mapping file uploaded by the regression tests uses consumer ids that match what we have in the reserved ids list. OR, another idea would be to scrap that list and simply do it based on a prefix like "test-{consumer_id}", and then just make sure regression tests use that prefix too.
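The prefix idea above could be sketched like this. A minimal, hypothetical sketch: the prefix value and the function name are illustrative assumptions, not code from this PR.

```python
# Hypothetical: bypass the cache based on a consumer-id prefix instead of
# maintaining a hard-coded reserved-ids list. The "test-" prefix is an
# assumption; regression tests would need to use the same prefix.
TEST_CONSUMER_PREFIX = "test-"


def should_bypass_cache(consumer_id: str) -> bool:
    """Return True when the consumer id marks a regression-test caller."""
    return consumer_id.startswith(TEST_CONSUMER_PREFIX)
```

This trades the explicit allow-list for a naming convention, so new test consumers don't require a code change.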
Worth using cachetools, like in src/eligibility_signposting_api/feature_toggle/feature_toggle.py.
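For context, the semantics `cachetools.TTLCache` provides can be sketched with a stdlib-only stand-in: entries expire once they are older than `ttl` seconds. This is an illustrative sketch, not the library's implementation (TTLCache also bounds size via `maxsize` and handles eviction ordering).

```python
import time


class TinyTTLCache:
    """Stdlib sketch of TTLCache's core behaviour: get() returns None
    once an entry is older than `ttl` seconds."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl
        self._store: dict[str, tuple[float, object]] = {}

    def __setitem__(self, key: str, value: object) -> None:
        # Record the insertion time alongside the value
        self._store[key] = (time.monotonic(), value)

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if time.monotonic() - stored_at > self.ttl:
            # Entry expired: drop it and behave as a miss
            del self._store[key]
            return None
        return value


cache = TinyTTLCache(ttl=0.05)
cache["all_campaigns"] = ["covid", "flu"]
assert cache.get("all_campaigns") == ["covid", "flu"]
time.sleep(0.1)
assert cache.get("all_campaigns") is None  # expired after the TTL
```

Using the real `TTLCache` means expiry and size bounds come for free, which is why it simplifies the repo code below.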
CONSUMER_MAPPING_FILE_NAME = "consumer_mapping_config.json"
RESERVED_TEST_CONSUMER_IDS = {"test-consumer-1", "test-consumer-2", "test-consumer-3"}
TTL = {
this could just be env vars
e.g.
```python
import os
from typing import Literal

URL_PREFIX = "patient-check"
RULE_STOP_DEFAULT = False
NHS_NUMBER_HEADER = "nhs-login-nhs-number"
CONSUMER_ID = "NHSE-Product-ID"
ALLOWED_CONDITIONS = Literal["COVID", "FLU", "MMR", "RSV"]
CONSUMER_MAPPING_FILE_NAME = "consumer_mapping_config.json"
RESERVED_TEST_CONSUMER_IDS = {"test-consumer-1", "test-consumer-2", "test-consumer-3"}

# TTL configurable via env var, defaulting to 30 minutes
CACHE_TTL_SECONDS = int(os.getenv("CONFIG_CACHE_TTL_SECONDS", "1800"))
```
@@ -1,4 +1,7 @@
import json
Using TTLCache would simplify this a bit e.g.
```python
import json
import logging
from collections.abc import Generator
from typing import Annotated, NewType

from aws_xray_sdk.core import xray_recorder
from botocore.client import BaseClient
from cachetools import TTLCache
from wireup import Inject, service

from eligibility_signposting_api.config.constants import CACHE_TTL_SECONDS, RESERVED_TEST_CONSUMER_IDS
from eligibility_signposting_api.model.campaign_config import CampaignConfig, Rules

BucketName = NewType("BucketName", str)

logger = logging.getLogger(__name__)

# Module-level cache shared across requests; a single entry holds all campaign configs
campaign_config_cache: TTLCache[str, list[CampaignConfig]] = TTLCache(maxsize=1, ttl=CACHE_TTL_SECONDS)


@service
class CampaignRepo:
    """Repository class for Campaign Rules, which we can use to calculate a person's eligibility for vaccination.

    These rules are stored as JSON files in AWS S3."""

    def __init__(
        self,
        s3_client: Annotated[BaseClient, Inject(qualifier="s3")],
        bucket_name: Annotated[BucketName, Inject(param="rules_bucket_name")],
    ) -> None:
        super().__init__()
        self.s3_client = s3_client
        self.bucket_name = bucket_name

    def get_campaign_configs(self, consumer_id: str) -> Generator[CampaignConfig, None, None]:
        # Reserved test consumers always bypass the cache and hit S3 directly
        bypass = consumer_id in RESERVED_TEST_CONSUMER_IDS
        cache_key = "all_campaigns"
        cached = None if bypass else campaign_config_cache.get(cache_key)
        with xray_recorder.in_subsegment("CampaignRepo.get_campaign_configs"):
            if cached is not None:
                logger.info("Using cached campaign configs")
                yield from cached
                return
            logger.info(
                "Refreshing campaign configs from S3 (consumer_id=%s, ttl_seconds=%s)",
                consumer_id,
                CACHE_TTL_SECONDS,
            )
            configs = self._load_campaign_configs_from_s3()
            if not bypass:
                campaign_config_cache[cache_key] = configs
            yield from configs

    def _load_campaign_configs_from_s3(self) -> list[CampaignConfig]:
        campaign_configs: list[CampaignConfig] = []
        with xray_recorder.in_subsegment("CampaignRepo.load_campaign_configs_from_s3"):
            with xray_recorder.in_subsegment("list_objects"):
                campaign_objects = self.s3_client.list_objects(Bucket=self.bucket_name)
            with xray_recorder.in_subsegment("get_objects"):
                for campaign_object in campaign_objects.get("Contents", []):
                    response = self.s3_client.get_object(
                        Bucket=self.bucket_name,
                        Key=f"{campaign_object['Key']}",
                    )
                    body = response["Body"].read()
                    campaign_configs.append(Rules.model_validate(json.loads(body)).campaign_config)
        return campaign_configs
```
import pytest
from eligibility_signposting_api.repos.campaign_repo import CampaignRepo, BucketName
we'd then import campaign_config_cache here
return {"Body": io.BytesIO(json.dumps(payload).encode("utf-8"))}
class TestCampaignRepo:
and can have a fixture to clear the cache e.g.
```python
@pytest.fixture(autouse=True)
def clear_cache(self):
    campaign_config_cache.clear()
```
self,
repo,
mock_s3_client,
monkeypatch,
...and we wouldn't need monkeypatch, as we could then call campaign_config_cache.clear() to force the refresh (later on).
Description
I've implemented an idea that looks like a nice, easy fit to claw back some performance.
Added some unit tests to show the cache being used and then expiring.
It's difficult to fully integration-test without deploying, but I've shown that it works at least (with a little fudging). This involves setting the env variable to "dev" in the Docker set-up for local runs, just for that commented-out int test.
Context
Stops fetching from S3 on every single request.
Type of changes
Checklist
Sensitive Information Declaration
To ensure the utmost confidentiality and to protect your and others' privacy, we kindly ask you NOT to include PII (Personally Identifiable Information) / PID (Personally Identifiable Data) or any other sensitive data in this PR (Pull Request) and the codebase changes. We will remove any PR that contains sensitive information. We really appreciate your cooperation in this matter.