Open
Description
Problem
We've observed intermittent performance issues in entity-api including:
- 502 Bad Gateway (on PUT calls against entity-api)
- 504 Gateway Timeout (when ingest-ui triggers calls to entity-api)
- "Max retries exceeded" (from search-api during reindex when making HTTP calls to entity-api)
These errors suggest that, under load, requests are not completing within the timeouts configured along the request path.
Architecture
[Nginx -> uWSGI -> Python Flask App -> Neo4j Driver] (within the same container) -> Neo4j Server (separate container)
Test cases
We are creating Locust load tests in Python to cover the following scenarios:
- Baseline tests: 10 users making only GET requests (using the top 10 endpoints reported in the Usage Dashboard)
- Reindex datasets: reindex 30 datasets via search-api
- Entity creations/updates: 10 new entity creations (POST) and 10 updates (PUT), which trigger reindex via the search-api queue.
- Bulk registration: bulk donor/sample registrations of ~40 entities (simulate TSV rows)
The goal is to stress the system enough to reproduce 502/504 errors and connection pool exhaustion, revealing the real bottlenecks.
Execution plan
- Run the above stress tests and collect metrics
- Fine-tune deployment parameters
- Re-run tests iteratively until stable performance is achieved
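To make "stable performance" checkable between iterations, each run could be reduced to a few numbers and compared against fixed thresholds. The sketch below assumes the status codes and latencies of a run have already been collected (for example from Locust's CSV output). The names `RunResult`, `summarize`, and `is_stable`, and the threshold defaults, are illustrative assumptions rather than agreed acceptance criteria.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    """Raw observations from one load-test run."""
    status_codes: list[int]
    latencies_ms: list[float]


def summarize(run: RunResult) -> dict:
    """Reduce a run to its gateway error rate and 95th-percentile latency."""
    total = len(run.status_codes)
    if total == 0 or not run.latencies_ms:
        raise ValueError("run contains no requests")
    counts = Counter(run.status_codes)
    gateway_errors = counts[502] + counts[504]
    ordered = sorted(run.latencies_ms)
    # Nearest-rank percentile: rank = ceil(0.95 * n), done in integer arithmetic.
    rank = (95 * len(ordered) + 99) // 100
    return {
        "gateway_error_rate": gateway_errors / total,
        "p95_ms": ordered[rank - 1],
    }


def is_stable(summary: dict, max_error_rate: float = 0.0, max_p95_ms: float = 2000.0) -> bool:
    """Return True when the run stays within both thresholds (assumed values)."""
    return (
        summary["gateway_error_rate"] <= max_error_rate
        and summary["p95_ms"] <= max_p95_ms
    )
```

Recording the summary next to the deployment parameters used for each run makes it easier to see which change actually moved the numbers.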
Status
In Progress