Skip to content

Implement separate-bucket multitenancy support for OSS attachments#784

Draft
Schmarvinius wants to merge 8 commits intomainfrom
feat/separate-bucket-multitenancy
Draft

Implement separate-bucket multitenancy support for OSS attachments#784
Schmarvinius wants to merge 8 commits intomainfrom
feat/separate-bucket-multitenancy

Conversation

@Schmarvinius
Copy link
Copy Markdown
Collaborator

@Schmarvinius Schmarvinius commented Apr 9, 2026

Implement Separate-Bucket Multitenancy Support for OSS Attachments

New Features

✨ Adds full support for separate-bucket multitenancy mode for OSS attachments. In this mode, each tenant gets its own dedicated object store bucket provisioned via SAP BTP Service Manager, providing complete storage-level tenant isolation.

Changes

Core Production Code:

  • OSClientProvider.java (new): Strategy interface for resolving the correct OSClient per tenant — supports both shared (single client) and separate (per-tenant client) modes.
  • SharedOSClientProvider.java (new): Implementation that returns a single shared OSClient for all tenants (single-tenant and shared-bucket multitenancy).
  • SeparateOSClientProvider.java (new): Implementation that resolves per-tenant OSClient instances via Service Manager, backed by a TTL-based ConcurrentHashMap cache.
  • CachedOSClient.java (new): Record wrapping an OSClient with a creation timestamp for TTL expiry checks.
  • ObjectStoreLifecycleHandler.java (new): Core business logic for tenant subscribe/unsubscribe: provisions SM service instances/bindings on subscribe, cleans up objects and deletes SM resources on unsubscribe.
  • ObjectStoreSubscribeHandler.java / ObjectStoreUnsubscribeHandler.java (new): CAP event handlers that hook into DeploymentService subscribe/unsubscribe events and delegate to ObjectStoreLifecycleHandler.
  • TenantNotProvisionedException.java (new): Exception thrown when a request is made for a tenant with no provisioned object store.
  • sm/ServiceManagerClient.java (new): REST client for the SAP BTP Service Manager API — manages service instances and bindings, including async polling.
  • sm/ServiceManagerTokenProvider.java (new): Obtains and caches OAuth tokens for authenticating with Service Manager.
  • sm/ServiceManagerCredentials.java (new): Record holding SM credentials extracted from a service binding.
  • sm/ServiceManagerBinding.java (new): Adapts SM binding credentials to the ServiceBinding interface expected by OSClientFactory.
  • sm/ServiceManagerException.java (new): Runtime exception for Service Manager operation failures.
  • OSSAttachmentsServiceHandler.java: Refactored to use OSClientProvider instead of a direct OSClient, enabling per-tenant client resolution for all attachment operations.
  • Registration.java: Refactored into two registration paths: registerSeparateMode (wires up SM client, SeparateOSClientProvider, and lifecycle handlers) and registerSharedOrSingleTenantMode (existing behavior). Added createExecutor helper and getServiceManagerBinding lookup.

Integration Tests:

  • SeparateBucketTenantLifecycleTest.java (new): Spring Boot integration tests for subscribe/unsubscribe lifecycle flows (idempotency, data isolation, cross-tenant protection).
  • AbstractMtxOssStorageTest.java / AbstractSeparateBucketOssStorageTest.java (new): Abstract base classes for real cloud storage tests covering CRUD, isolation, cleanup, concurrency, and edge cases.
  • Aws/Azure/GcpMtxOssStorageTest.java and Aws/Azure/GcpSeparateBucketOssStorageTest.java (new): Cloud-provider-specific test subclasses (skipped automatically when credentials are not set).
  • RegistrationTest.java, OSSAttachmentsServiceHandlerTest.java, OSSAttachmentsServiceHandlerTestUtils.java: Updated to use OSClientProvider/SharedOSClientProvider and to cover new separate-mode registration scenarios.
  • pom.xml (mtx-local): Added cds-feature-attachments-oss as a test dependency; pinned maven-resources-plugin version.
  • 🔄 Regenerate and Update Summary

📬 Subscribe to the Hyperspace PR Bot DL to get the latest announcements and pilot features!

PR Bot Information

Version: 1.20.10 | 📖 Documentation | 🚨 Create Incident | 💬 Feedback

  • Summary Prompt: Default Prompt
  • Correlation ID: 3ba05ae0-340d-11f1-8c54-865b56145b1b
  • Output Template: Default Template
  • LLM: anthropic--claude-4.6-sonnet
  • Event Trigger: pull_request.opened
  • File Content Strategy: Full file content

When cds.attachments.objectStore.kind=separate, each tenant gets a
dedicated object store bucket provisioned via SAP BTP Service Manager.

- Introduce OSClientProvider strategy pattern to abstract per-tenant
  client resolution (SharedOSClientProvider / SeparateOSClientProvider)
- Add Service Manager REST client for provisioning instances and bindings
- Add tenant lifecycle handlers (subscribe/unsubscribe) for bucket
  provisioning and cleanup via DeploymentService events
- Refactor Registration to support shared, separate, and single-tenant
  modes with clean code paths
- AbstractSeparateBucketOssStorageTest: 20 tests covering two-bucket
  CRUD, tenant isolation, cleanup, concurrency, and edge cases
- AWS/Azure/GCP subclasses with per-tenant bucket env var support
- SeparateBucketTenantLifecycleTest: 5 OData-level tests for
  subscribe/create/unsubscribe lifecycle flows
@Schmarvinius Schmarvinius marked this pull request as draft April 9, 2026 12:11
Copy link
Copy Markdown
Contributor

@hyperspace-insights hyperspace-insights bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR introduces a well-structured separate-bucket multitenancy feature with good test coverage, but has several substantive issues that need addressing: an unbounded polling backoff that can massively overshoot the timeout, a blocking remote call inside ConcurrentHashMap.compute(), a potential NullPointerException in token fetching when only mTLS credentials are configured, a query-injection risk from unsanitized tenant IDs in SM label queries, a TOCTOU race during concurrent tenant subscription, and leaked thread pools from re-entrant executor creation in Registration.

PR Bot Information

Version: 1.20.10 | 📖 Documentation | 🚨 Create Incident | 💬 Feedback

  • Correlation ID: 3ba05ae0-340d-11f1-8c54-865b56145b1b
  • LLM: anthropic--claude-4.6-sonnet
  • Event Trigger: pull_request.opened
  • File Content Strategy: Full file content

Comment on lines +225 to +231
long elapsed = System.currentTimeMillis() - startTime;
if (elapsed > pollTimeout.toMillis()) {
throw new ServiceManagerException("Polling timed out after " + pollTimeout);
}

Duration sleepDuration = INITIAL_POLL_INTERVAL.multipliedBy(iteration);
try {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic Error: Polling sleep duration grows unboundedly, causing timeout to never be reached in practice

sleepDuration is INITIAL_POLL_INTERVAL.multipliedBy(iteration), so it grows as 5s, 10s, 15s, 20s... On each iteration, the sleep duration itself may exceed the remaining poll timeout. For a 5-minute timeout, after ~12 iterations the sleep alone exceeds 60 seconds, and Thread.sleep is called before checking whether the operation would time out. This means the timeout check at the top of the loop is evaluated only after waking up, potentially overshooting the deadline significantly.

Consider checking the timeout before sleeping, and capping the sleep to the remaining time or a fixed maximum interval.

Suggested change
long elapsed = System.currentTimeMillis() - startTime;
if (elapsed > pollTimeout.toMillis()) {
throw new ServiceManagerException("Polling timed out after " + pollTimeout);
}
Duration sleepDuration = INITIAL_POLL_INTERVAL.multipliedBy(iteration);
try {
long startTime = System.currentTimeMillis();
int iteration = 1;
Duration maxInterval = Duration.ofSeconds(30);
while (true) {
long elapsed = System.currentTimeMillis() - startTime;
long remainingMs = pollTimeout.toMillis() - elapsed;
if (remainingMs <= 0) {
throw new ServiceManagerException("Polling timed out after " + pollTimeout);
}
Duration sleepDuration = INITIAL_POLL_INTERVAL.multipliedBy(iteration);
if (sleepDuration.compareTo(maxInterval) > 0) {
sleepDuration = maxInterval;
}
long sleepMs = Math.min(sleepDuration.toMillis(), remainingMs);
try {
Thread.sleep(sleepMs);
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
throw new ServiceManagerException("Interrupted while polling", e);
}

Double-check suggestion before committing. Edit this comment for amendments.


Please provide feedback on the review comment by checking the appropriate box:

  • 🌟 Awesome comment, a human might have missed that.
  • ✅ Helpful comment
  • 🤷 Neutral
  • ❌ This comment is not helpful

Comment on lines +42 to +54
CachedOSClient entry =
cache.compute(
tenantId,
(id, existing) -> {
if (existing != null && !existing.isExpired(credentialTtl)) {
return existing;
}
logger.debug(
"Resolving OSClient for tenant {} ({})",
id,
existing == null ? "cache miss" : "expired");
return createClient(id);
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: cache.compute() is called with a mapping function that makes a remote SM API call, which can block for a significant time. ConcurrentHashMap.compute() holds an exclusive lock on the bucket during the entire function execution. If two threads concurrently request different tenants whose keys hash to the same bucket, one will block the other. More critically, if the SM call hangs, the entire bucket segment of the map is locked.

Consider using computeIfAbsent for the cache-miss path only, or implementing a double-checked pattern with get() first and putIfAbsent() after creating the client outside the lock.

Suggested change
CachedOSClient entry =
cache.compute(
tenantId,
(id, existing) -> {
if (existing != null && !existing.isExpired(credentialTtl)) {
return existing;
}
logger.debug(
"Resolving OSClient for tenant {} ({})",
id,
existing == null ? "cache miss" : "expired");
return createClient(id);
});
CachedOSClient existing = cache.get(tenantId);
if (existing != null && !existing.isExpired(credentialTtl)) {
return existing.client();
}
logger.debug(
"Resolving OSClient for tenant {} ({})",
tenantId,
existing == null ? "cache miss" : "expired");
CachedOSClient fresh = createClient(tenantId);
cache.put(tenantId, fresh);
return fresh.client();

Double-check suggestion before committing. Edit this comment for amendments.


Please provide feedback on the review comment by checking the appropriate box:

  • 🌟 Awesome comment, a human might have missed that.
  • ✅ Helpful comment
  • 🤷 Neutral
  • ❌ This comment is not helpful

return;
}

ExecutorService executor = createExecutor(env);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: createExecutor(env) is called twice when both shared and separate modes are active over the lifecycle of the application — but more importantly, every call to eventHandlers() (which can be called multiple times by the CDS runtime) registers a new shutdown hook and creates a new thread pool, leaking threads. The executor should be stored as a field or created only once.

Additionally, if separate mode is configured, createExecutor is called once for the OSS handler and the same executor instance is reused for the lifecycle handler — that part is fine — but if eventHandlers is somehow re-invoked, a second executor and shutdown hook will be created.


Please provide feedback on the review comment by checking the appropriate box:

  • 🌟 Awesome comment, a human might have missed that.
  • ✅ Helpful comment
  • 🤷 Neutral
  • ❌ This comment is not helpful

Comment on lines +57 to +62
"grant_type=client_credentials"
+ "&client_id="
+ URLEncoder.encode(credentials.clientId(), StandardCharsets.UTF_8)
+ "&client_secret="
+ URLEncoder.encode(credentials.clientSecret(), StandardCharsets.UTF_8);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: clientSecret is appended to the request body in plain text via string concatenation. If credentials.clientSecret() is null (e.g. when only mTLS credentials are configured), URLEncoder.encode(null, ...) throws a NullPointerException. There is no guard for this case, yet ServiceManagerCredentials.hasMtlsCredentials() exists precisely to indicate that clientSecret may be absent.

Should add a null check and either throw a descriptive error or handle the mTLS path when clientSecret is null.

Suggested change
"grant_type=client_credentials"
+ "&client_id="
+ URLEncoder.encode(credentials.clientId(), StandardCharsets.UTF_8)
+ "&client_secret="
+ URLEncoder.encode(credentials.clientSecret(), StandardCharsets.UTF_8);
if (credentials.clientSecret() == null && !credentials.hasMtlsCredentials()) {
throw new ServiceManagerException(
"No client secret or mTLS credentials configured for Service Manager token retrieval");
}
String body =
"grant_type=client_credentials"
+ "&client_id="
+ URLEncoder.encode(credentials.clientId(), StandardCharsets.UTF_8)
+ "&client_secret="
+ URLEncoder.encode(
credentials.clientSecret() != null ? credentials.clientSecret() : "",
StandardCharsets.UTF_8);

Double-check suggestion before committing. Edit this comment for amendments.


Please provide feedback on the review comment by checking the appropriate box:

  • 🌟 Awesome comment, a human might have missed that.
  • ✅ Helpful comment
  • 🤷 Neutral
  • ❌ This comment is not helpful

* @return the binding credentials, or empty if no binding exists
*/
public Optional<ServiceManagerBindingResult> getBinding(String tenantId) {
String labelQuery = "tenant_id eq '%s'".formatted(tenantId);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security: Tenant ID is interpolated directly into a query string without sanitization: "tenant_id eq '%s'".formatted(tenantId). If a tenant ID contains a single quote (') or other special characters recognized by the Service Manager query language, it could break the query or allow query injection. The same pattern applies to getInstanceByTenant.

Consider validating that tenantId does not contain quotes or special characters before using it in field/label queries, or use the SM API's label syntax in a way that properly escapes values.


Please provide feedback on the review comment by checking the appropriate box:

  • 🌟 Awesome comment, a human might have missed that.
  • ✅ Helpful comment
  • 🤷 Neutral
  • ❌ This comment is not helpful

Comment on lines +46 to +51
var existingBinding = smClient.getBinding(tenantId);
if (existingBinding.isPresent()) {
logger.info("Binding already exists for tenant {}, warming cache", tenantId);
warmCache(tenantId, existingBinding.get());
return;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic Error: The onTenantSubscribe idempotency check looks up an existing binding via smClient.getBinding(tenantId), but there is a TOCTOU (time-of-check/time-of-use) race: if two subscribe calls arrive concurrently for the same tenant, both could find no binding, both call createInstance, and both attempt createBinding. This results in duplicate SM instances for the same tenant.

Consider using a per-tenant lock or a distributed lock/flag (e.g., tagging the in-progress state) to prevent concurrent provisioning of the same tenant.


Please provide feedback on the review comment by checking the appropriate box:

  • 🌟 Awesome comment, a human might have missed that.
  • ✅ Helpful comment
  • 🤷 Neutral
  • ❌ This comment is not helpful

Comment on lines +53 to +54
String offeringId = smClient.getOfferingId();
String planId = smClient.getPlanId(offeringId);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic Error: onTenantSubscribe calls smClient.getOfferingId() and smClient.getPlanId(offeringId) on every new subscription. These values are immutable for a given SM instance and making two extra round trips per tenant subscription is both slow and fragile. If the SM is temporarily unavailable between createInstance and these calls, the operation fails unnecessarily.

Consider caching offeringId and planId at construction time or lazily on first use with a stored field.


Please provide feedback on the review comment by checking the appropriate box:

  • 🌟 Awesome comment, a human might have missed that.
  • ✅ Helpful comment
  • 🤷 Neutral
  • ❌ This comment is not helpful

Comment on lines +105 to +112
void readWithDifferentTenantPrefix_fails() throws Exception {
String contentId = uniqueId("isolation");
String t1Key = objectKey(TENANT_1, contentId);
String t2Key = objectKey(TENANT_2, contentId);

uploadContent(t1Key, "secret data");

assertThatThrownBy(() -> downloadContent(t2Key)).isInstanceOf(Exception.class);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic Error: readWithDifferentTenantPrefix_fails only uploads content for t1Key but never adds t2Key to createdObjectKeys. If the assertThatThrownBy assertion unexpectedly passes (the object is somehow readable), teardown will not attempt to clean up t2Key. This is benign for a missing key, but the pattern is inconsistent with how other tests track keys for cleanup — and more importantly, if a cloud provider happens to return an empty stream instead of throwing (which downloadContent handles by throwing RuntimeException), the exception type assertion would still pass but the test's semantic intent would be wrong.

More critically: the assertion isInstanceOf(Exception.class) is extremely broad and will pass for any exception including NullPointerException from a bug in downloadContent itself. Consider asserting a more specific exception type that indicates "not found" vs a general failure.


Please provide feedback on the review comment by checking the appropriate box:

  • 🌟 Awesome comment, a human might have missed that.
  • ✅ Helpful comment
  • 🤷 Neutral
  • ❌ This comment is not helpful

private ServiceManagerClient smClient;
private SeparateOSClientProvider clientProvider;
private ObjectStoreLifecycleHandler handler;
private static final ExecutorService executor = Executors.newCachedThreadPool();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The executor is created as a static Executors.newCachedThreadPool() shared across all test instances and never shut down. Over the test suite lifecycle this leaks threads. Since this is a test, the impact is limited, but it can cause test runners to hang on exit.

Consider making it an instance field with @AfterEach shutdown, or using Executors.newSingleThreadExecutor() and shutting it down after the test class.


Please provide feedback on the review comment by checking the appropriate box:

  • 🌟 Awesome comment, a human might have missed that.
  • ✅ Helpful comment
  • 🤷 Neutral
  • ❌ This comment is not helpful

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant