Description
Hi, is there any guidance on how to free allocated memory? I'm using LLMModule and have already tried the interrupt and delete methods. Although the model reports as unloaded (I assume it is unloaded because hitting interrupt a second time throws an error saying the model is not loaded), memory usage never drops (checked with the Expo dev tools performance monitor). I then assumed the monitor might just be showing cached memory, so I tried loading the model one more time, and the app crashed. The bug is fairly reproducible with just the snippets from the docs plus a few simple buttons and state updates to simulate a chat.
device: iPhone 16 Pro
expo: v55
"react-native-executorch": "^0.7.1",
"react-native-worklets": "0.7.2",
Steps to reproduce
Use LLMModule directly instead of useLLM, so the .delete method is accessible.
Add the component below to generate responses, plus buttons that call llm.interrupt and llm.delete.
Replace GiftedChat.append in onSend with a plain state update: prev => [...prev, ...newMessages].
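To make the call order unambiguous, here is the sequence I trigger from the buttons, sketched with a hypothetical in-memory stand-in for LLMModule (the real module manages native memory; this mock only tracks a loaded flag, so it illustrates the sequence, not the crash itself):

```typescript
// Hypothetical stand-in for LLMModule, used only to illustrate the
// call order performed from the UI buttons.
class MockLLMModule {
  private loaded = false
  private generating = false

  async load(_model: string): Promise<void> {
    if (this.loaded) throw new Error('Model already loaded')
    this.loaded = true
  }

  interrupt(): void {
    if (!this.loaded) throw new Error('Model is not loaded')
    // First press stops generation; in my app a second press is what
    // appears to unload the model (further calls throw "not loaded").
    if (this.generating) {
      this.generating = false
    } else {
      this.loaded = false
    }
  }

  delete(): void {
    this.loaded = false
  }

  get isLoaded(): boolean {
    return this.loaded
  }
}

// The sequence that reproduces the crash on a real device:
async function reproduce(): Promise<boolean> {
  const llm = new MockLLMModule()
  await llm.load('HAMMER2_1_1_5B_QUANTIZED') // 1. load the model
  llm.interrupt()                            // 2. interrupt (appears to unload)
  llm.delete()                               // 3. delete
  // 4. memory never drops in the performance monitor...
  await llm.load('HAMMER2_1_1_5B_QUANTIZED') // 5. ...and re-loading crashes the app
  return llm.isLoaded
}
```

With the mock this round-trip succeeds; on device, step 5 is where the app crashes.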
```tsx
const [messages, setMessages] = useState<IMessage[]>([])
const llmId = useMemo(() => v4(), [])

const { data: llm } = useQuery({
  queryKey: ['llm', 'HAMMER2_1_1_5B_QUANTIZED'],
  queryFn: async () => {
    const llm = new LLMModule({
      tokenCallback: token => console.log(token),
      messageHistoryCallback: messages => console.log(messages),
    })
    llm.configure({
      chatConfig: {
        systemPrompt: `You are helpful assistant. Current time and date: ${new Date().toString()}`,
      },
      toolsConfig: {
        tools: [
          {
            name: 'get_weather',
            description: 'Get/check weather in given location.',
            parameters: {
              type: 'dict',
              properties: {
                location: {
                  type: 'string',
                  description: 'Location where user wants to check weather',
                },
              },
              required: ['location'],
            },
          },
        ],
        executeToolCallback: async call => {
          if (call.toolName === 'get_weather') {
            console.log('Checking weather!')
            // perform call to weather API
            // ...
            const mockResults = 'Weather is chinazes!'
            return mockResults
          }
          return null
        },
        displayToolCalls: true,
      },
    })
    await llm.load(HAMMER2_1_1_5B_QUANTIZED, progress => console.log(progress))
    return llm
  },
})

const handleGenerate = useCallback(
  async (chat: Message[]) => {
    // Chat completion - returns the generated response
    const response = await llm?.generate(chat)
    return response
  },
  [llm],
)

const onSend = useMutation({
  mutationFn: async (newMessages: IMessage[] = []) => {
    setMessages(prev => GiftedChat.append(prev, newMessages))
    const response = await handleGenerate(
      GiftedChat.append(messages, newMessages).map(el => ({
        role: el.user._id === llmId ? 'assistant' : 'user',
        content: JSON.stringify(el),
      })),
    )
    setMessages(prev =>
      GiftedChat.append(prev, [
        {
          _id: v4(),
          createdAt: new Date(),
          text: response ?? '',
          user: {
            _id: llmId,
          },
        },
      ]),
    )
  },
})

if (!llm) {
  return (
    <View className='flex-center flex-1'>
      <UiSpinner />
    </View>
  )
}
```
Snack or a link to a repository
No response
React Native Executorch version
^0.7.1
React Native version
0.83.2
Platforms
iOS
JavaScript runtime
Hermes
Workflow
Expo Dev Client
Architecture
Fabric (New Architecture)
Build type
Debug mode
Device
Real device
Device model
iPhone 16 Pro
AI model
HAMMER2_1_1_5B_QUANTIZED
Performance logs
No response
Acknowledgements
Yes