Skip to content

docs(data-access): proposal to retire tier-client#1403

Open
ekremney wants to merge 4 commits intomainfrom
docs/retire-tier-client
Open

docs(data-access): proposal to retire tier-client#1403
ekremney wants to merge 4 commits intomainfrom
docs/retire-tier-client

Conversation

@ekremney
Copy link
Member

@ekremney ekremney commented Mar 4, 2026

Summary

Proposal to fully retire @adobe/spacecat-shared-tier-client by moving all 7 methods into EntitlementCollection and SiteEnrollmentCollection as PostgREST queries with resource embedding (JOINs).

Core argument: All of TierClient's complexity — chunked batch fetching, iterative lookups, client-side org-ownership validation — were DynamoDB workarounds that PostgREST JOINs eliminate.

The proposal includes:

  • Method-by-method analysis with PostgREST replacement code
  • Full caller inventory: 27 call sites across 4 repos
  • 3-phase migration plan

See packages/spacecat-shared-data-access/docs/retire-tier-client.md for the full writeup.

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Copy link

github-actions bot commented Mar 4, 2026

This PR will trigger no release when merged.

@ekremney ekremney requested review from ravverma and solaris007 March 4, 2026 15:11
@solaris007 solaris007 added the documentation Improvements or additions to documentation label Mar 4, 2026
Copy link
Member

@solaris007 solaris007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @ekremney,

Strong proposal - the thesis is correct and the migration plan is well-structured. A few findings from cross-referencing the proposal against the actual TierClient source and auth-service consumer code:

Important: Some method descriptions overstate the original complexity

The proposal frames getAllEnrollment as having "chunked batch fetches (CHUNK_SIZE = 50)" and getFirstEnrollment as "iterating enrollments one-by-one calling Site.findById() for each." The actual tier-client.js code tells a different story:

  • getAllEnrollment() (lines 226-278) uses Site.batchGetByKeys(siteKeys) - a single batch call, no chunking loop. The CHUNK_SIZE = 50 pattern doesn't exist in the current code (may have been in a DynamoDB-era version).
  • getFirstEnrollment() (lines 286-308) simply calls getAllEnrollment() and returns enrollments[0]. There's no iterative one-by-one lookup.

The proposal's PostgREST replacements are still better (a JOIN beats a batch-fetch-then-filter), but the stated complexity reduction is overstated. Worth correcting to maintain credibility of the proposal.

Important: The N+1 problem is in the callers, not TierClient itself

The proposal positions TierClient's methods as inherently N+1. But checkValidEntitlement() is only 1-2 queries per invocation - that's fine. The N+1 problem lives in orgs2.js:getAllLlmoEnabledOrgs() which calls checkValidEntitlement() in a Promise.all() loop over ~928 orgs. PR #1402's allByProductCodeWithOrganization() solves this specific hot path by replacing the loop with a single JOIN.

The proposal should distinguish between:

  1. Methods that are individually inefficient (e.g., getAllEnrollment doing fetch + batch + filter when a JOIN suffices)
  2. Methods that are fine individually but called in N+1 patterns by consumers (e.g., checkValidEntitlement)

This matters for migration prioritization - the consumer-side N+1s need caller changes too, not just new collection methods.

Important: PR #1402's return type creates a migration gap for auth-service

PR #1402's allByProductCodeWithOrganization() returns plain objects ({entitlement: {id, productCode, tier}, organization: {id, name, imsOrgId}}). But the auth-service's getAllLlmoEnabledOrgs currently calls entitlement.getId(), entitlement.getProductCode(), entitlement.getTier() - model instance methods. The migration from one to the other requires rewriting property access patterns at the call site. The proposal should call this out explicitly in the Phase 2 auth-service migration notes.

Minor: checkValidEntitlement replacement query may not be correct

The proposal shows:

.from('entitlements')
.select('*, site_enrollments(*)')
.eq('organization_id', orgId)
.eq('product_code', productCode)
.maybeSingle();

But the actual method only queries site enrollments when this.site is set (line 108). The proposed replacement would always embed site enrollments even when not needed. Consider two variants: findValidForOrganization(orgId, productCode) (no site check) and findValidForSite(orgId, productCode, siteId) (with site enrollment JOIN).

Minor: createEntitlement effort rated as "Medium" but has subtle business logic

The don't-downgrade-PAID guard and hardcoded quotas (llmo_trial_prompts: 200) are product-specific business rules baked into TierClient. Moving these into EntitlementCollection.createOrUpdateWithEnrollment() puts product-specific logic into the generic data layer. Consider whether these guards belong in the collection (closer to data) or in a thin service layer in the consumer repos.

Minor: Missing trailing newline in the doc file

The diff shows 264 additions but the file is 263 lines - missing the POSIX trailing newline.

Note: Call site inventory verified for auth-service

I confirmed 7 checkValidEntitlement() call sites across 6 files in auth-service (login.js, orgs.js, orgs2.js, s2s-login.js, promise.js, access-control-util.js). Note that access-control-util.js uses both createForSite (async factory) and createForOrg - the only auth-service file using the site path.

Overall: The direction is right - retiring TierClient is well-motivated and the PostgREST replacements are sound. The main feedback is accuracy: tighten up the method-by-method descriptions to match the actual code, and call out the plain-objects-vs-model-instances gap for auth-service migration. The proposal will be more persuasive when the stated complexity matches reality.

…roposal

- Distinguish between individually-inefficient methods and methods called
  in N+1 patterns by consumers
- Split checkValidEntitlement replacement into org-only and with-site variants
- Acknowledge product-specific business logic concern for createEntitlement
- Add migration note about plain objects vs model instances
- Reference PRs #1390 and #1391 for chunking context
- Add trailing newline

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ekremney
Copy link
Member Author

ekremney commented Mar 4, 2026

Thanks @solaris007, great review. You caught some real gaps. I've updated the proposal in 46151be to address your feedback. Here's point-by-point:

On getAllEnrollment and getFirstEnrollment complexity:

I want to push back gently here — the descriptions in the proposal are accurate:

Both are currently in the main branch. Happy to be corrected if you're looking at a different version.

On "N+1 is in callers, not TierClient" — agreed, updated:

Good distinction. checkValidEntitlement() is 1-2 queries per invocation — perfectly fine individually. The N+1 lives in orgs2.js calling it ~928 times. I've added this nuance to the thesis and method description. This matters for prioritization: the allByProductCodeWithOrganization bulk query (PR #1402) solves the hot path, while the per-invocation collection methods are about simplification, not performance.

On checkValidEntitlement replacement — split into two variants:

Good catch. Updated the proposal with findValidForOrganization(orgId, productCode) (org-only, most callers) and findValidForSite(orgId, productCode, siteId) (with site enrollment JOIN). Also added a caller table showing which variant each call site needs.

On createEntitlement business logic in data layer:

Valid concern. Added a note acknowledging the PAID guard and hardcoded quotas are product-specific. Our position: the PAID guard is a data integrity constraint (prevent invalid state transitions) and the quotas are creation defaults — both closer to data than to consumer workflows. If product-specific rules grow, they can be extracted later.

On plain objects vs model instances migration gap — added:

Added a Phase 2 migration note explicitly calling this out: cross-entity JOIN methods return plain objects (.id, .tier), while single-entity methods continue returning model instances (.getId(), .getTier()).

Trailing newline — fixed.

Copy link
Member

@solaris007 solaris007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @ekremney,

Good revisions - the key feedback from the first review is well-addressed. The thesis now properly distinguishes N+1 sources, the checkValidEntitlement split into two variants is cleaner, and the migration note about plain objects vs model instances is clear.

Two remaining accuracy issues:

CHUNK_SIZE = 50 doesn't exist in tier-client.js

The proposal states "chunked batch fetches (CHUNK_SIZE = 50, lines 264-275 of tier-client.js, added in PR #1390)". However, CHUNK_SIZE does not appear anywhere in the current tier-client code (or anywhere in spacecat-shared-data-access). The actual getAllEnrollment() (lines 226-278) calls Site.batchGetByKeys(siteKeys) as a single call with no chunking loop. The chunking may have been in PR #1390 but was since refactored or removed. Either update the line/PR references to match the current code, or note that the chunking was moved/removed.

getFirstEnrollment description is still inaccurate

The proposal says: "The org-only path iterates enrollments one-by-one, calling Site.findById() for each until it finds one belonging to the target org."

The actual code (lines 286-308):

const { entitlement, enrollments } = await this.getAllEnrollment();
const firstEnrollment = enrollments[0];
const site = await this.Site.findById(enrollmentSiteId);

It calls getAllEnrollment() (bulk fetch + filter), takes enrollments[0], and does one Site.findById(). No iteration. The inefficiency is in getAllEnrollment doing the heavy lifting, not in getFirstEnrollment iterating.

Everything else looks good. The business logic note for createEntitlement is well-reasoned, and the Phase 2 migration note about property access patterns is clear. LGTM once these two descriptions are corrected.

Copy link
Member

@solaris007 solaris007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @ekremney,

Correction on my previous review - the two accuracy issues I flagged (CHUNK_SIZE = 50 and getFirstEnrollment iteration) are in fact correct. PR #1390 updated tier-client.js with the chunking loop (line 264) and the one-by-one Site.findById() iteration in getFirstEnrollment (lines 332-340). My review was based on a stale checkout that predated those changes. Apologies for the noise.

With that cleared up, the proposal is accurate across the board:

  • Method descriptions match the current code
  • Call site inventory verified for auth-service (7 sites across 6 files)
  • N+1 distinction (individual inefficiency vs consumer patterns) is well-articulated
  • checkValidEntitlement split into org-only and with-site variants is clean
  • Business logic placement rationale for createEntitlement is sound
  • Phase 2 migration note about plain objects vs model instances is clear

LGTM.

allByProductCodeWithOrganization shipped in PR #1402, released as
@adobe/spacecat-shared-data-access@3.8.0.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants