Commit 2d7b82e

Optimized JVM memory usage and executor configuration because my Railway bill is beating me up!!! (Added a doc for this, see 2026-01-13-memory-optimization.md)
1 parent 99c6b4f commit 2d7b82e

8 files changed

Lines changed: 239 additions & 43 deletions

README.md

Lines changed: 7 additions & 0 deletions
@@ -1700,6 +1700,13 @@ Planned enhancements to be added incrementally:

These enhancements build directly on the existing metrics and APIs without requiring backend architectural changes.

### Potential Changes Pre-CloudQueue

#### Integrating Virtual Threads (Java 21)

SpringQueuePro currently uses platform threads for clarity and debuggability. A future enhancement will migrate worker execution to **Java 21 virtual threads**, which dramatically reduce per-thread memory overhead and allow higher concurrency with minimal resource cost.
- This change is intentionally deferred to keep the current architecture stable for demos and interviews.
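
For a sense of what that migration could look like, here is a minimal sketch that uses only the standard library and requires Java 21. The class name `VirtualThreadSketch` and the counting task are hypothetical and stand in for real task handlers; this is not how SpringQueuePro's `ExecutorConfig` is wired today.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class VirtualThreadSketch {
    /** Runs taskCount trivial tasks, one virtual thread per task, and returns how many completed. */
    static int runOnVirtualThreads(int taskCount) {
        AtomicInteger completed = new AtomicInteger();
        // ExecutorService is AutoCloseable since Java 19; close() waits for submitted tasks to finish.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < taskCount; i++) {
                executor.submit(() -> { completed.incrementAndGet(); });
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runOnVirtualThreads(1_000)); // prints 1000
    }
}
```

Each submitted task gets its own virtual thread, so there is no pool size to tune; any limit on concurrency would have to come from somewhere else, such as a semaphore or the queue feeding the executor.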

---

### CloudQueue — AWS-Native Evolution
Lines changed: 150 additions & 0 deletions
@@ -0,0 +1,150 @@
# SpringQueuePro — Memory Optimization (2026-01-13)

This project is a massive memory hog and has been inflating my monthly Railway bill by a ridiculous amount. This document covers a focused memory optimization pass performed on **2026-01-13**, aimed at reducing JVM and application-level memory usage when hosting SpringQueuePro on any memory-billed PaaS platform. (I currently host it on Railway's hobby plan, which costs ~5 CAD a month and covers that much resource usage. This project, and even the base SpringQueue version, has been pushing my bill close to ~20 CAD by month's end.)

## Memory Usage Breakdown

It isn't *that* surprising that this project (**JVM + Spring Boot + metrics + Redis + GraphQL**, hosted **24/7**) has been so memory-hungry. The memory usage breakdown should look something like this:

| Source                              | Memory Hog (Why)                                               |
| ----------------------------------- | -------------------------------------------------------------- |
| **JVM default heap sizing**         | JVM happily reserves hundreds of MB even at idle               |
| **Spring Boot auto-config**         | Loads auto-configured components even when they are not in use |
| **ExecutorServices + schedulers**   | Threads = stack memory + queues                                |
| **Micrometer + metrics registries** | Counters, timers, tags accumulate                              |
| **Redis + caches + in-memory maps** | Authoritative + cached state                                   |
| **GraphQL + GraphiQL**              | Extra schema + reflection + servlet overhead                   |

## Steps Taken to Optimize Memory Usage

### 1. Capping the JVM heap on Railway

Added the following environment variable on Railway:
```
JAVA_TOOL_OPTIONS=-Xms64m -Xmx256m -XX:+UseG1GC -XX:MaxMetaspaceSize=128m
```
This sets explicit limits on the JVM's heap and metaspace, which prevents over-reservation and reduces idle memory use.
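
One way to confirm that the flags actually took effect is to ask the running JVM for its limits. The following is a minimal sketch using only the standard library; the class name `JvmLimitsCheck` is hypothetical, and matching the pool name on "Metaspace" assumes a HotSpot JVM.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

class JvmLimitsCheck {
    /** Converts a byte count to whole MiB; negative values (meaning "no limit") map to -1. */
    static long toMiB(long bytes) {
        return bytes < 0 ? -1 : bytes / (1024 * 1024);
    }

    /** Maximum heap the JVM will attempt to use, which reflects -Xmx when it is set. */
    static long maxHeapMiB() {
        return toMiB(Runtime.getRuntime().maxMemory());
    }

    /** Metaspace limit in MiB, or -1 when no limit is set or no matching pool is found. */
    static long maxMetaspaceMiB() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getName().contains("Metaspace")) { // pool name assumed for HotSpot
                MemoryUsage usage = pool.getUsage();
                return usage == null ? -1 : toMiB(usage.getMax());
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MiB): " + maxHeapMiB());
        System.out.println("Max metaspace (MiB): " + maxMetaspaceMiB());
    }
}
```

Keep in mind that `-Xmx` and `-XX:MaxMetaspaceSize` only bound the heap and metaspace. Thread stacks, the code cache, and direct buffers are counted separately, so the container's total memory use will sit above the sum of these two limits.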

---

### 2. Disable GraphiQL in Production

Adding this to `application-prod.yml`:
```yaml
spring:
  graphql:
    graphiql:
      enabled: false
```
GraphiQL loads static assets, schema introspection, and additional servlet mappings that are unnecessary in production. Disabling it reduces baseline memory usage and class loading overhead.

---

### 3. Limiting Actuator & Micrometer Exposure
The original version of the configuration below exposed `health`, `metrics`, and `prometheus` under `include`:
```yaml
management:
  endpoints:
    web:
      exposure:
        include: health,prometheus
  metrics:
    enable:
      jvm: false
      process: false
      system: false
      executor: false
      hibernate: false
      logback: false
```
This disables always-on metric groups (JVM, process, system, executor, Hibernate, Logback) whose meters stay registered and consume memory for the life of the application, while preserving the `prometheus` endpoint needed for future Grafana integration.

---

### 4. Remove Legacy In-Memory Task Queue
I originally had these fields and methods marked as `@Deprecated`, but the annotation only produces compiler warnings. The fields still exist at runtime and keep holding whatever is put into them.

- Removed:

```java
private final ConcurrentHashMap<String, Task> jobs;
```
- Removed deprecated methods such as:

```java
@Deprecated
public List<Task> getJobs() {
    return new ArrayList<>(jobs.values());
}
```

The legacy in-memory task map retained domain objects, and `getJobs()` copied every entry into a new list on each call, which meant extra heap allocation under load. Removing it eliminates unnecessary object retention and reinforces PostgreSQL as the single source of truth.
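
To make the retention and copying concrete, here is a small self-contained sketch. `LegacyQueueSketch` and its `Task` record are hypothetical stand-ins for the project's real classes and only mirror the shape of the removed code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

class LegacyQueueSketch {
    // Hypothetical stand-in for the project's Task class.
    record Task(String id) {}

    // Deprecated or not, the map holds a strong reference to every Task put into it.
    @Deprecated
    private final ConcurrentHashMap<String, Task> jobs = new ConcurrentHashMap<>();

    void enqueue(Task task) {
        jobs.put(task.id(), task);
    }

    // Each call allocates a fresh list holding a reference to every retained task.
    @Deprecated
    List<Task> getJobs() {
        return new ArrayList<>(jobs.values());
    }

    public static void main(String[] args) {
        LegacyQueueSketch queue = new LegacyQueueSketch();
        for (int i = 0; i < 3; i++) {
            queue.enqueue(new Task("task-" + i));
        }
        List<Task> first = queue.getJobs();
        List<Task> second = queue.getJobs();
        System.out.println(first.size());    // prints 3
        System.out.println(first != second); // prints true: a new list per call
    }
}
```

Nothing in the sketch ever removes entries, so the map only grows until the process restarts, which is the behavior the removal is meant to rule out.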

---

### 5. Cap Worker and Scheduler Threads via Configuration
Set these values explicitly in `application-prod.yml`; they are bound by `QueueProperties.java`:
```yaml
queue:
  main-exec-worker-count: 5
  sched-exec-worker-count: 2
```
Each platform thread reserves a stack of roughly 1 MB by default. Explicitly capping worker and scheduler counts keeps the number of threads small and predictable while still allowing meaningful concurrency for demos and testing.
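
As a rough worked example of the arithmetic behind these numbers, the sketch below multiplies the configured thread counts by an assumed 1 MiB stack size. The class `ThreadStackBudget` is hypothetical and the figure is an upper-bound estimate, not a measurement.

```java
class ThreadStackBudget {
    /** Upper bound on reserved stack memory, in MiB, for the given thread counts. */
    static long reservedStackMiB(int workerThreads, int schedulerThreads, long stackSizeMiB) {
        return (long) (workerThreads + schedulerThreads) * stackSizeMiB;
    }

    public static void main(String[] args) {
        // 5 workers + 2 scheduler threads, assuming a 1 MiB stack per thread.
        System.out.println(reservedStackMiB(5, 2, 1) + " MiB"); // prints "7 MiB"
    }
}
```

The operating system typically commits stack pages only as they are touched, so real usage is usually below this bound. The point of the cap is that the bound stays fixed no matter how much work is queued.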

---

### 6. Removing Unnecessary Dependencies (pom.xml)

Removed leftover dependencies like this one:
```xml
<artifactId>spring-boot-starter-webflux</artifactId>
<scope>test</scope>
```
*For this example specifically*, WebFlux pulls in Reactor and Netty, increasing classpath scanning, memory usage, and startup overhead even when unused. Because this entry was declared with `test` scope, Maven already keeps it off the packaged runtime classpath, so the saving from this particular removal is mainly a leaner test classpath and build.

---

### 7. Replace Executor Factories with Explicit ThreadPoolExecutor

Making this change in `ExecutorConfig.java` (*the original code is what's commented out in the snippet*):

```java
@Bean("execService")
public ExecutorService taskExecutor() {
    /*return Executors.newFixedThreadPool(props.getMainExecWorkerCount(), r -> {
        Thread t = new Thread(r);
        t.setName("QS-Worker-" + t.getId());
        return t;
    });*/
    return new ThreadPoolExecutor(
        props.getMainExecWorkerCount(),
        props.getMainExecWorkerCount(),
        0L,
        TimeUnit.MILLISECONDS,
        new LinkedBlockingQueue<>(1000),
        r -> {
            Thread t = new Thread(r);
            t.setName("QS-Worker-" + t.getId());
            t.setDaemon(true);
            return t;
        }
    );
}
```

`Executors.newFixedThreadPool` backs the pool with an unbounded `LinkedBlockingQueue`, so pending tasks can pile up without limit. The explicit `ThreadPoolExecutor` caps the queue at 1000 entries. Once the queue is full, further submissions are rejected (the default handler throws `RejectedExecutionException`), which provides backpressure under load and reduces memory spikes during stress testing.
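
The behavior of a bounded queue is easiest to see with a tiny pool. The sketch below is a minimal, self-contained illustration and not the project's configuration: the class `BoundedPoolSketch`, the pool size of 1, and the queue capacity of 1 are all chosen for demonstration only.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedPoolSketch {
    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns true if a third task is rejected while the worker is busy and the queue slot is taken. */
    static boolean thirdTaskIsRejected() {
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(1)); // no handler given, so AbortPolicy applies
        try {
            pool.execute(() -> {        // task 1 occupies the only worker until released
                started.countDown();
                await(release);
            });
            await(started);             // wait until the worker is running task 1
            pool.execute(() -> {});     // task 2 fills the single queue slot
            try {
                pool.execute(() -> {}); // task 3: no free worker and no queue space
                return false;
            } catch (RejectedExecutionException e) {
                return true;
            }
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(thirdTaskIsRejected()); // prints true
    }
}
```

If dropping work is not acceptable, one option is to pass `new ThreadPoolExecutor.CallerRunsPolicy()` as the final constructor argument, which makes the submitting thread run the task itself and so slows producers down instead of throwing.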

---

### 8. Remove Legacy Metrics

Removed counters and gauges tied to deprecated code paths, such as the legacy in-memory queue metrics. For example:
```java
@Bean
public Gauge inMemoryQueueSizeGauge(MeterRegistry registry, QueueService queueService) {
    return Gauge.builder("springqpro_queue_memory_size", queueService, q -> q.getJobMapCount())
            .description("Number of tasks currently in legacy in-memory queue")
            .register(registry);
}
```

Legacy metrics retained references to unused services and data structures, increasing object retention and complicating the runtime memory graph.

springqpro-backend/pom.xml

Lines changed: 2 additions & 2 deletions
@@ -156,12 +156,12 @@
             <scope>test</scope>
         </dependency>
 
-        <dependency>
+        <!--<dependency>
             <groupId>org.springframework.boot</groupId>
             <artifactId>spring-boot-starter-webflux</artifactId>
             <version>3.5.7</version>
             <scope>test</scope>
-        </dependency>
+        </dependency>-->
 
         <dependency>
             <groupId>org.apache.commons</groupId>

springqpro-backend/src/main/java/com/springqprobackend/springqpro/config/ExecutorConfig.java

Lines changed: 19 additions & 8 deletions
@@ -3,9 +3,7 @@
 import org.springframework.context.annotation.Bean;
 import org.springframework.context.annotation.Configuration;
 
-import java.util.concurrent.ExecutorService;
-import java.util.concurrent.Executors;
-import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.*;
 
 /* ExecutorConfig.java
 --------------------------------------------------------------------------------------------------
@@ -52,11 +50,24 @@ public class ExecutorConfig {
 
     @Bean("execService")
     public ExecutorService taskExecutor() {
-        return Executors.newFixedThreadPool(props.getMainExecWorkerCount(), r -> {
-            Thread t = new Thread(r);
-            t.setName("QS-Worker-" + t.getId());
-            return t;
-        });
+        /*return Executors.newFixedThreadPool(props.getMainExecWorkerCount(), r -> {
+            Thread t = new Thread(r);
+            t.setName("QS-Worker-" + t.getId());
+            return t;
+        });*/
+        return new ThreadPoolExecutor(
+            props.getMainExecWorkerCount(),
+            props.getMainExecWorkerCount(),
+            0L,
+            TimeUnit.MILLISECONDS,
+            new LinkedBlockingQueue<>(1000),
+            r -> {
+                Thread t = new Thread(r);
+                t.setName("QS-Worker-" + t.getId());
+                t.setDaemon(true);
+                return t;
+            }
+        );
     }
 
     @Bean("schedExec")

springqpro-backend/src/main/java/com/springqprobackend/springqpro/config/ProcessingMetricsConfig.java

Lines changed: 10 additions & 5 deletions
@@ -8,6 +8,8 @@
 import org.springframework.context.annotation.Bean;
 import org.springframework.context.annotation.Configuration;
 
+import java.time.Duration;
+
 @Configuration
 public class ProcessingMetricsConfig {
     @Bean
@@ -44,8 +46,9 @@ public Counter tasksRetriedCounter(MeterRegistry registry) {
     public Timer processingTimer(MeterRegistry registry) {
         return Timer.builder("springqpro_task_processing_duration")
                 .description("Time spent executing task handlers")
-                .publishPercentiles(0.50, 0.90, 0.95, 0.99)
+                .publishPercentiles(0.50, 0.90, 0.95)
                 .publishPercentileHistogram()
+                .sla(Duration.ofMillis(100), Duration.ofMillis(500), Duration.ofSeconds(1))
                 .register(registry);
     }
     // The one below is for the number of tasks made by users sending GraphQL queries:
@@ -55,22 +58,24 @@ public Counter apiTaskCreateCounter(MeterRegistry registry) {
                 .description("Tasks created from GraphQL API")
                 .register(registry);
     } // TO-DO: GraphQL is my main API thing, but I'm keeping the REST stuff too -- maybe add another Counter for that specifically?
-    @Bean
+    /*@Bean
     public Counter queueEnqueueCounter(MeterRegistry registry) {
         return Counter.builder("springqpro_queue_enqueue_total")
                 .description("In-memory enqueue() calls (legacy path)")
                 .register(registry);
-    } // <-- DEBUG: I can't remember if this is even used anymore at this point in the program...
+    }*/ // <-- DEBUG: I can't remember if this is even used anymore at this point in the program...
+    // 2026-01-13-NOTE: ^ Removed because the jobs field is a memory hog even if it's deprecated and it's racking up my Railway bill.
     @Bean
     public Counter queueEnqueueByIdCounter(MeterRegistry registry) {
         return Counter.builder("springqpro_queue_enqueue_by_id_total")
                 .description("enqueueById() calls feeding into ProcessingService")
                 .register(registry);
     }
-    @Bean
+    /*@Bean
     public Gauge inMemoryQueueSizeGauge(MeterRegistry registry, QueueService queueService) {
         return Gauge.builder("springqpro_queue_memory_size", queueService, q -> q.getJobMapCount())
                 .description("Number of tasks currently in legacy in-memory queue")
                 .register(registry);
-    } // <-- DEBUG: I can't remember if this is even used anymore at this point in the program...
+    }*/ // <-- DEBUG: I can't remember if this is even used anymore at this point in the program...
+    // 2026-01-13-NOTE: ^ Removed because the jobs field is a memory hog even if it's deprecated and it's racking up my Railway bill.
 }

springqpro-backend/src/main/java/com/springqprobackend/springqpro/controller/rest/ProducerController.java

Lines changed: 8 additions & 8 deletions
@@ -89,27 +89,27 @@ public ResponseEntity<Map<String, String>> handleEnqueue(@Valid @RequestBody Enq
 
     // 2. The equivalent of GoQueue's "http.HandleFunc("/api/jobs", func(w http.ResponseWriter, r *http.Request) {...}" function:
     // From producer.go: "THIS IS FOR [GET /api/jobs] and [GET /api/jobs?status=queued]" <-- hence why we're using @RequestParam
-    @GetMapping("/jobs")
+    /*@GetMapping("/jobs")
     public ResponseEntity<List<Task>> handleListJobs(@RequestParam(required = false) String status) {
         //Task[] allJobs = queue.getJobs();
         //List<Task> filtered = Arrays.stream(allJobs).filter(t -> t != null && (status == null || t.getStatus().toString().equalsIgnoreCase(status))).collect(Collectors.toList());
         List<Task> allJobs = queue.getJobs();
         logger.info("The value of allJobs is {} and the value of queue.getJobs() is {}", allJobs, queue.getJobs());
         List<Task> filtered = allJobs.stream().filter(t -> t != null && (status == null || t.getStatus().toString().equalsIgnoreCase(status))).collect(Collectors.toList());
         return ResponseEntity.ok(filtered);
-    }
+    }*/
 
     // The handlers below will be for the individual methods in GoQueue's "http.HandleFunc("/api/jobs/", func(w http.ResponseWriter, r *http.Request) {...}" function:
     // 3. This is for [GET /api/jobs/:id]:
-    @GetMapping("/jobs/{id}")
+    /*@GetMapping("/jobs/{id}")
     public ResponseEntity<Task> handleGetJobById(@PathVariable String id) {
         Task t = queue.getJobById(id);
         if(t == null) return ResponseEntity.notFound().build();
         return ResponseEntity.ok(t);
-    }
+    }*/
 
     // 4. This is for [POST /api/jobs/:id/retry]:
-    @PostMapping("/jobs/{id}/retry")
+    /*@PostMapping("/jobs/{id}/retry")
     public ResponseEntity<?> handleRetryJobById(@PathVariable String id) {
         Task t = queue.getJobById(id);
         if(t == null) return ResponseEntity.notFound().build();
@@ -137,18 +137,18 @@ public ResponseEntity<?> handleDeleteJobById(@PathVariable String id) {
         if(!deleteRes) return ResponseEntity.notFound().build();
         return ResponseEntity.ok(Map.of("message", String.format("Job %s deleted!", id)));
     }
-    /* DEBUG:+NOTE:+TO-DO: ^ When I get to the stage where I start really expanding on the API endpoints (making this a deployable microservice),
+    DEBUG:+NOTE:+TO-DO: ^ When I get to the stage where I start really expanding on the API endpoints (making this a deployable microservice),
     I want to change the return value here slightly. In best practice, it's not supposed to be a 200 (OK) response, RESTful API
     design has it so that what I'd do here is return 204 (No Content) sign, which would imply "the resource was deleted successfully,
     there is no further content to return."
     DEBUG:+NOTE:+TO-DO: Re-scan over all the functions, honestly, and evaluate if my return codes are correct later. (Do some more reading into return codes, etc).
-    */
+
 
     // 6. This is for the [POST /api/clear]
     @PostMapping("/clear")
     public ResponseEntity<?> clearQueue() {
         queue.clear();
         return ResponseEntity.ok(Map.of("message", "All jobs in the queue cleared!"));
-    }
+    }*/
 
 }
