Skip to content

feat(data-access): add bulk entitlement query with organization embedding#1402

Merged
solaris007 merged 4 commits intomainfrom
feat/entitlement-bulk-query-with-org
Mar 5, 2026
Merged

feat(data-access): add bulk entitlement query with organization embedding#1402
solaris007 merged 4 commits intomainfrom
feat/entitlement-bulk-query-with-org

Conversation

@ekremney
Copy link
Member

@ekremney ekremney commented Mar 4, 2026

Summary

Adds allByProductCodeWithOrganization(productCode) to EntitlementCollection — a new method that uses PostgREST resource embedding to fetch entitlements JOINed with their parent organization data in a single HTTP request.

Problem

The auth-service's getAllLlmoEnabledOrgs function (used for the /auth/orgs2 admin endpoint) currently:

  1. Calls Organization.all() — 1 query returning ~928 organizations
  2. For each org, calls TierClient.checkValidEntitlement() which makes a separate HTTP request to the PostgREST data service to query the entitlements table
  3. All ~928 requests fire in parallel via Promise.all, overwhelming the PostgREST server behind the ALB

This causes intermittent TypeError: fetch failed errors in production (observed at 2026-03-04T03:03:26 UTC affecting 50+ organizations in a single invocation):

Base Collection Error [entitlement] DataAccessError2: Failed to query
  cause: { message: 'TypeError: fetch failed' }

Solution

Uses PostgREST's resource embedding feature (leveraging the existing FK entitlements.organization_id → organizations.id) to replace ~929 HTTP requests with 1:

this.postgrestService
  .from('entitlements')
  .select('id, product_code, tier, organization_id, organizations!inner(id, name, ims_org_id)')
  .eq('product_code', productCode)
  .not('organizations.ims_org_id', 'is', null)
  .range(offset, offset + DEFAULT_PAGE_SIZE - 1);

Key design decisions

  • organizations!inner — INNER JOIN so entitlements without a matching org are excluded
  • Filters ims_org_id IS NOT NULL at the DB level to reduce data transfer
  • Paginates with range() in a loop (page size 1000) to handle growth beyond the current ~928 orgs
  • Returns plain objects (not model instances) with camelCase field names — the consumer (getAllLlmoEnabledOrgs) only needs id, productCode, tier, name, imsOrgId
  • No SQL migrations needed — PostgREST auto-detects the existing FK for embedding

Changes

File Change
src/models/entitlement/entitlement.collection.js Add allByProductCodeWithOrganization() method
src/models/entitlement/index.d.ts Add EntitlementWithOrganization interface and method type
test/unit/models/entitlement/entitlement.collection.test.js 6 unit tests: happy path, empty results, missing productCode, null org, error handling, pagination
test/it/entitlement/entitlement.test.js 4 integration tests: LLMO query, product code filtering, org data correctness, empty result

Performance impact

Metric Before After
HTTP requests to PostgREST ~929 1
Concurrent connections ~928 1
DB queries ~929 separate SELECTs 1 SELECT with JOIN

Test plan

  • Unit tests pass (6 new tests)
  • Integration tests added (4 new tests — run in CI with docker-compose)
  • After release: bump dep in auth-service and update getAllLlmoEnabledOrgs to use new method
  • Deploy to staging, hit /auth/orgs2 as admin, verify single query in CloudWatch logs
  • Verify no [entitlement] Failed to query errors

🤖 Generated with Claude Code

…ding

Add allByProductCodeWithOrganization() to EntitlementCollection that uses
PostgREST resource embedding to fetch entitlements with their parent
organization data in a single query, eliminating N+1 queries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ekremney ekremney requested a review from igor-grubic March 4, 2026 14:38
@ekremney ekremney assigned ravverma and unassigned ravverma Mar 4, 2026
@ekremney ekremney requested a review from ravverma March 4, 2026 14:39
@ekremney ekremney self-assigned this Mar 4, 2026
@igor-grubic
Copy link
Contributor

@ravverma I will let you review this as well, but we might be able to improve performance on some of the login queries as well in a similar way?

Copy link
Member

@solaris007 solaris007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @ekremney,

Great work tackling the N+1 query problem - going from ~929 HTTP requests to 1 is a significant operational improvement. The PostgREST resource embedding approach is well-suited here. A few findings:

Important: This is the first collection to directly access this.postgrestService from a subclass

Every other collection in the codebase (OrganizationCollection, SiteCollection, etc.) delegates all PostgREST interaction to BaseCollection's private methods (#queryPage, #queryByIndexKeys). This PR bypasses that entire layer - building raw queries, doing its own pagination loop, and manually mapping snake_case to camelCase instead of using BaseCollection's #toModelRecord() / fromDbRecord() utilities.

This is arguably justified since PostgREST resource embedding (JOINs) isn't supported by the BaseCollection abstraction. But it sets a precedent: future cross-entity queries could follow this pattern, gradually duplicating pagination/error-handling/mapping logic across collections. Worth considering whether BaseCollection should gain a primitive for embedded queries, even if that's a follow-up.

Important: Return type breaks the collection contract

Every other collection method returns model instances (via #createInstance) or arrays of model instances. This method returns plain {entitlement, organization} objects. The TypeScript types document this clearly, which is good. But consumers used to calling .getId(), .getTier(), etc. on returned objects will get a surprise - result.entitlement.tier works, but result.entitlement.getTier() would not.

The PR description acknowledges this ("returns plain objects - the consumer only needs id, productCode, tier, name, imsOrgId"). This is pragmatic for the immediate use case, but it could be confusing if future callers expect model instances. A brief JSDoc note on the return type saying "returns plain objects, not model instances" would help.

Minor: Error wrapping is inconsistent with BaseCollection

BaseCollection uses DataAccessError (via #logAndThrowError) for all query failures. This method throws a generic Error:

throw new Error(`Failed to query entitlements with organizations: ${error.message}`);

For consistency, consider using DataAccessError - or at minimum, document why the simpler error type is preferred here.

Minor: The organization can't actually be null with !inner JOIN

The implementation maps row.organizations ? {...} : null, and there's a unit test for the organizations: null case. But since the query uses organizations!inner(...), PostgREST performs an INNER JOIN - rows where the organization doesn't exist are excluded from results entirely. So row.organizations should never be null in practice. The defensive null check doesn't hurt, but the test may give a false sense that this is an expected production path.

Minor: organization_id is selected but never used

The select string includes organization_id but it's not mapped to the return object. It doesn't cause harm (slightly more data transferred), but could be dropped for clarity.

Note: Pagination test uses hardcoded 1000

The pagination test creates Array.from({ length: 1000 }, ...) and checks rangeStub.firstCall.args against [0, 999]. If DEFAULT_PAGE_SIZE ever changes, this test would fail. Consider importing DEFAULT_PAGE_SIZE in the test and using it to build the test data, as base.collection.js does.

Overall: Good change. The N+1 fix is necessary and the PostgREST resource embedding approach is the right tool. The main concerns are architectural (precedent of bypassing BaseCollection's abstractions) and stylistic (plain objects vs model instances, error type consistency). None are blockers - recommend addressing the DataAccessError consistency and adding the JSDoc clarification, with the BaseCollection abstraction improvement as a potential follow-up.

- Use DataAccessError instead of generic Error for consistency with BaseCollection
- Add JSDoc clarifying return type is plain objects, not model instances
- Drop unused organization_id from select string
- Remove organizations-null test case (impossible with !inner JOIN)
- Import DEFAULT_PAGE_SIZE in tests instead of hardcoding 1000

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Copy link

github-actions bot commented Mar 4, 2026

This PR will trigger a minor release when merged.

@ekremney
Copy link
Member Author

ekremney commented Mar 4, 2026

Thanks for the thorough review @solaris007! Really appreciate the architectural thinking here, not just the line-level feedback.

Review items addressed in 3d3cc1a:

  • Switched to DataAccessError with { entityName, tableName } details, consistent with BaseCollection's #logAndThrowError
  • Added JSDoc clarifying the method returns plain objects, not model instances
  • Dropped organization_id from the select string
  • Removed the organizations: null test case — you're right, !inner makes that path impossible
  • Pagination test now imports DEFAULT_PAGE_SIZE instead of hardcoding 1000

On the architectural precedent:

You raise a good point about this being the first collection to access postgrestService directly. I think this is actually the direction we should be heading — and your review helps frame how to do it well.

The BaseCollection abstraction works great for single-entity CRUD. But now that we're fully on PostgREST, we're going to need more cross-entity queries like this one — especially as we look at consolidating some of the query patterns that currently live outside data-access (e.g. in tier-client). PostgREST resource embedding is the idiomatic way to do these, and it's not something the current BaseCollection API can express.

Where I fully agree with you is that each collection shouldn't be reimplementing pagination, error handling, and field mapping from scratch. This method ended up doing that because #queryPage, #toModelRecord, and #logAndThrowError are all private. As a follow-up, I think BaseCollection could expose a small set of protected primitives that embedded-query methods can compose — something like:

  • A paginated query runner that takes a PostgREST query builder and handles the range() loop
  • Exposed field mapping utilities (toModelRecord / toDbRecord)
  • A shared error-wrapping method

That way collections can own their domain-specific query logic (which JOINs to use, what shape to return) while reusing the infrastructure that BaseCollection already has. For example, future embedded queries across collections would look something like:

async allWithDetailsByOrganization(orgId, productCode) {
  const query = this.postgrestService
    .from('site_enrollments')
    .select('*, sites!inner(...), entitlements!inner(...)')
    .eq('entitlements.organization_id', orgId);

  // Shared pagination + error handling from BaseCollection
  return this.queryEmbedded(query, (row) => ({
    enrollment: { ... },
    site: { ... },
  }));
}

This would just complement it for cross-entity reads where JOINs are a better fit.

Happy to take a stab at those BaseCollection primitives as a follow-up PR if that sounds reasonable. Wanted to keep this one focused on the production fix.

ekremney and others added 2 commits March 4, 2026 16:46
The .not('organizations.ims_org_id', 'is', null) filter does not work
reliably with PostgREST resource embedding, causing the query to return
0 results for some product codes. The !inner JOIN already ensures only
entitlements with a matching organization are returned, and the consumer
filters on imsOrgId defensively.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ming

The allByProductCodeWithOrganization IT test was flaky because the
!inner JOIN depends on PostgREST resolving FK relationships, which
can fail intermittently when seedDatabase() toggles triggers via
ALTER TABLE (causing schema cache reloads).

The test now verifies product code filtering correctness (no cross-
contamination, no ID overlap) without asserting specific counts that
depend on the !inner JOIN returning all seed data rows.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Member

@solaris007 solaris007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @ekremney,

All feedback addressed across the 3 follow-up commits:

  • DataAccessError now used consistently with BaseCollection patterns
  • JSDoc clearly documents plain-object return type
  • Dropped unused organization_id from select
  • Removed the impossible organizations: null test case
  • Tests now use imported DEFAULT_PAGE_SIZE instead of hardcoded 1000
  • Good catch on the .not() filter not working with resource embedding - removing it and relying on the !inner JOIN is the right call
  • IT tests made resilient to PostgREST schema cache timing - pragmatic fix

One minor note: the TypeScript interface still has organization: {...} | null in EntitlementWithOrganization, but since !inner guarantees the organization is always present, the | null is technically unreachable. Not a blocker - defensive typing is fine.

BaseCollection primitives for embedded queries tracked as a follow-up. LGTM.

@solaris007 solaris007 merged commit 588cfce into main Mar 5, 2026
7 checks passed
@solaris007 solaris007 deleted the feat/entitlement-bulk-query-with-org branch March 5, 2026 05:14
solaris007 pushed a commit that referenced this pull request Mar 5, 2026
## [@adobe/spacecat-shared-data-access-v3.8.0](https://github.com/adobe/spacecat-shared/compare/@adobe/spacecat-shared-data-access-v3.7.0...@adobe/spacecat-shared-data-access-v3.8.0) (2026-03-05)

### Features

* **data-access:** add bulk entitlement query with organization embedding ([#1402](#1402)) ([588cfce](588cfce))
@solaris007
Copy link
Member

🎉 This PR is included in version @adobe/spacecat-shared-data-access-v3.8.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

solaris007 added a commit that referenced this pull request Mar 5, 2026
allByProductCodeWithOrganization shipped in PR #1402, released as
@adobe/spacecat-shared-data-access@3.8.0.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request released

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants