fix: fix unload in LLM and ModelHostObject to properly free LLM memory#954

Open
barhanc wants to merge 1 commit intomainfrom
@bh/issue948

Conversation

@barhanc
Contributor

@barhanc barhanc commented Mar 11, 2026

Description

Fixes two related memory management bugs in the LLM delete flow:

  • LLM.cpp. LLM::unload() destroyed the runner object but never called BaseModel::unload(), so the module memory was never released. LLM::unload() now calls BaseModel::unload() after resetting the runner.
  • LLMController.ts. When load() was called on an already-loaded LLMController, the previous native module instance was overwritten without unload() being called first, leaking its memory. The controller now unloads the existing native module before loading a new one.
  • ModelHostObject.h. The JSI external memory pressure is reset to 0 after model->unload(), so the JS GC is correctly informed that the native memory has been freed.
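The three fixes can be sketched together in a self-contained C++ analog. Note this is an illustrative sketch, not the actual library code: Module, Runner, and the externalMemory counter are hypothetical stand-ins for the real LLM.cpp / ModelHostObject.h types and the JSI memory-pressure mechanism.

```cpp
#include <cstddef>
#include <memory>

// Stand-in for the native module that owns the model weights.
struct Module {
  std::size_t bytes = 1000000; // pretend weight size
};

struct BaseModel {
  std::unique_ptr<Module> module_;
  virtual ~BaseModel() = default;
  virtual void unload() { module_.reset(nullptr); } // frees module memory
};

// Stand-in for the text-generation runner the LLM subclass owns.
struct Runner {};

struct LLM : public BaseModel {
  std::unique_ptr<Runner> runner_;
  void load() {
    runner_ = std::make_unique<Runner>();
    module_ = std::make_unique<Module>();
  }
  // Before the fix, this only reset the runner and left module_ alive.
  void unload() override {
    runner_.reset(nullptr);
    BaseModel::unload(); // the missing call: releases module_
  }
};

// Stand-in for the JSI host object; externalMemory models the memory
// pressure value reported to the JS garbage collector.
struct ModelHostObject {
  std::shared_ptr<LLM> model;
  std::size_t externalMemory = 0;
  void onLoad() { externalMemory = model->module_ ? model->module_->bytes : 0; }
  void onUnload() {
    model->unload();
    externalMemory = 0; // tell the JS GC the native memory is gone
  }
};
```

With this chain in place, unloading through the host object both releases the module and zeroes the pressure reported to the GC.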

Introduces a breaking change?

  • Yes
  • No

Type of change

  • Bug fix (change which fixes an issue)
  • New feature (change which adds functionality)
  • Documentation update (improves or adds clarity to existing documentation)
  • Other (chores, tests, code style improvements etc.)

Tested on

  • iOS
  • Android

Testing instructions

Use the provided screen (you can simply replace the current LLM app screen) to reproduce the bug and verify the fix. It lets you load and unload an LLM and observe memory behavior with vmmap / adb.

  • Run the LLM example app
  • Prepare memory monitors
    - iOS: find the app's pid with xcrun simctl spawn booted launchctl list | grep llm, then run watch -n 0.1 "vmmap <pid> | tail -12"
    - Android: watch -n 0.1 "adb shell dumpsys meminfo com.anonymous.llm"
  • Press Load - wait for ready
  • Note baseline native memory in profiler
  • Press Unload - verify native memory returns to baseline
  • Press Load again without restarting - verify no accumulation
  • Repeat load/unload 5+ times - confirm no upward drift in native memory and no crash
  • Press Load a few times without pressing Unload - verify no memory accumulation
Test screen (a drop-in replacement for the current LLM app screen):

import { useEffect, useRef, useState } from 'react';
import { StyleSheet, Text, TouchableOpacity, View } from 'react-native';
import { LLMModule, HAMMER2_1_1_5B_QUANTIZED } from 'react-native-executorch';

export default function LLMScreen() {
  const [status, setStatus] = useState('idle');
  const llmRef = useRef<LLMModule | null>(null);

  useEffect(() => {
    llmRef.current = new LLMModule();
    return () => {
      try {
        llmRef.current?.interrupt();
        llmRef.current?.delete();
      } catch {}
    };
  }, []);

  const handleLoad = async () => {
    setStatus('loading...');
    try {
      await llmRef.current!.load(HAMMER2_1_1_5B_QUANTIZED, (p) =>
        setStatus(`loading ${(p * 100).toFixed(0)}%`)
      );
      setStatus('ready');
    } catch (e: any) {
      setStatus(`load error: ${e?.message}`);
    }
  };

  const handleUnload = () => {
    try {
      llmRef.current?.interrupt();
      llmRef.current?.delete();
      setStatus('unloaded');
    } catch (e: any) {
      setStatus(`unload error: ${e?.message}`);
    }
  };

  const isLoading = status.startsWith('loading');
  const canLoad = !isLoading;
  const canUnload = status === 'ready';

  return (
    <View style={styles.container}>
      <Text style={styles.status}>{status}</Text>
      <TouchableOpacity
        style={[styles.button, !canLoad && styles.disabled]}
        onPress={handleLoad}
        disabled={!canLoad}
      >
        <Text style={styles.buttonText}>Load</Text>
      </TouchableOpacity>
      <TouchableOpacity
        style={[styles.button, styles.unload, !canUnload && styles.disabled]}
        onPress={handleUnload}
        disabled={!canUnload}
      >
        <Text style={styles.buttonText}>Unload</Text>
      </TouchableOpacity>
    </View>
  );
}

const styles = StyleSheet.create({
  container: { flex: 1, alignItems: 'center', justifyContent: 'center', gap: 12 },
  status: { fontSize: 16, marginBottom: 8 },
  button: { backgroundColor: '#2563eb', paddingHorizontal: 32, paddingVertical: 14, borderRadius: 8 },
  unload: { backgroundColor: '#dc2626' },
  buttonText: { color: '#fff', fontSize: 16, fontWeight: '600' },
  disabled: { opacity: 0.4 },
});

Related issues

#948

Checklist

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have updated the documentation accordingly
  • My changes generate no new warnings

@barhanc barhanc self-assigned this Mar 11, 2026
@barhanc barhanc added the bug fix PRs that are fixing bugs label Mar 11, 2026
@msluszniak msluszniak linked an issue Mar 11, 2026 that may be closed by this pull request
@msluszniak msluszniak self-requested a review March 11, 2026 08:51
@msluszniak
Member

Do you believe that similar problems might be in other modules?

@NorbertKlockiewicz
Contributor

We should hold this PR until #892 is merged and check whether the problem also occurs in that version.

@barhanc
Contributor Author

barhanc commented Mar 11, 2026

@msluszniak I don't think the problem of the LLM not freeing module memory occurs in other models, since they don't override the BaseModel::unload() method and therefore do call module_.reset(nullptr);. The JS GC setExternalMemoryPressure fix is an ad hoc one, though; I'm not an expert here and was guided mostly by this discussion. That one is also trickier: after the LLM::unload() fix the memory profiler shows the used memory being freed properly, but cycling load -> unload multiple times results in a sudden crash of the whole app, which suggests a problem with the GC. The last problem, calling load() multiple times causing memory accumulation, should largely be mitigated once we move to constructing models via a factory pattern.
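The factory-pattern mitigation mentioned above can be sketched as follows. This is a hypothetical sketch of the idea, not the planned implementation: the ModelFactory and Model types here are illustrative stand-ins, and the key point is simply that the factory releases the previous instance before constructing a new one.

```cpp
#include <memory>

// Illustrative stand-in for a loadable native model.
struct Model {
  bool loaded = false;
  void load() { loaded = true; }
  void unload() { loaded = false; }
};

// Hypothetical factory: unloading the previous instance before creating
// a new one is what prevents memory accumulation when load() is called
// repeatedly without an explicit unload in between.
class ModelFactory {
  std::shared_ptr<Model> current_;

 public:
  std::shared_ptr<Model> create() {
    if (current_) {
      current_->unload(); // free the previous native memory first
    }
    current_ = std::make_shared<Model>();
    current_->load();
    return current_;
  }
};
```

Calling create() twice in a row then leaves only the newest instance loaded, which mirrors the behavior the LLMController.ts fix enforces manually.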



Development

Successfully merging this pull request may close these issues.

Allocated memory is not cleanup after llm.delete
