46 commits
0ac42b4
feat: initial implementation of multimodal runner with lfm vlm
NorbertKlockiewicz Feb 19, 2026
57d9618
feat: unified LLM runner for text-only and multimodal PTEs
NorbertKlockiewicz Mar 2, 2026
bf50ae2
feat: add conversational VLM demo with multimodal/text-only support a…
NorbertKlockiewicz Mar 2, 2026
1695f7e
fix: default UnifiedRunner temperature to 0.8 and topp to 0.9
NorbertKlockiewicz Mar 2, 2026
b660b0f
feat: add NativeMessage struct and JSI conversion for message history
NorbertKlockiewicz Mar 2, 2026
4331bde
feat: declare generateMultimodal on LLM and register JSI binding
NorbertKlockiewicz Mar 2, 2026
d6530e4
fix: remove redundant unordered_map and vector includes from LLM.h
NorbertKlockiewicz Mar 2, 2026
d261a45
feat: implement generateMultimodal with per-turn chat template and im…
NorbertKlockiewicz Mar 2, 2026
d91a64a
feat: add mediaPath to Message, remove sendMessageWithImage from LLMType
NorbertKlockiewicz Mar 2, 2026
49f5af6
feat: replace sendMessageWithImage with sendMessage(msg, mediaPath?) …
NorbertKlockiewicz Mar 2, 2026
d07ce65
fix: use updatedHistory for multimodal routing, remove redundant rese…
NorbertKlockiewicz Mar 2, 2026
b29f74c
fix: skip system messages in generateMultimodal, clear imageUri after…
NorbertKlockiewicz Mar 2, 2026
e1d0f08
feat: show image thumbnail in user message bubble when mediaPath is set
NorbertKlockiewicz Mar 2, 2026
11cab57
fix: use resizeMode contain so full image is always visible in messag…
NorbertKlockiewicz Mar 2, 2026
9ddd5d7
refactor: derive isMultimodal from load param, unify load branches, r…
NorbertKlockiewicz Mar 2, 2026
7d2ce9b
refactor: remove isMultimodal flag, inline generateMultimodal into se…
NorbertKlockiewicz Mar 2, 2026
87fa1f0
fix: make tokenizerConfigSource required throughout load pipeline
NorbertKlockiewicz Mar 2, 2026
b398952
fix: prepend system prompt to multimodal history before generateMulti…
NorbertKlockiewicz Mar 2, 2026
a0b80e3
refactor: unify generate — Jinja renders prompt+<image> tokens in JS,…
NorbertKlockiewicz Mar 2, 2026
13f631e
fix: collect imagePaths from messageHistoryWithPrompt, not full history
NorbertKlockiewicz Mar 2, 2026
76f9c7c
fix: typing
NorbertKlockiewicz Mar 2, 2026
ab8c088
feat: correctly calculate image tokens
NorbertKlockiewicz Mar 2, 2026
c211ba9
fix: add missing import
NorbertKlockiewicz Mar 2, 2026
0e29349
fix: fall back to max_seq_len when model doesn't export max_context_len
NorbertKlockiewicz Mar 2, 2026
520233f
fix: address code review — error on image/placeholder mismatch, remov…
NorbertKlockiewicz Mar 2, 2026
dfd1a81
feat: dynamic sendMessage type based on flag
NorbertKlockiewicz Mar 2, 2026
3d67b66
fix: model stopping generation in the middle of its answer
NorbertKlockiewicz Mar 3, 2026
2b26c5d
feat: add LLMCapability type and parameterize LLMTypeMultimodal
NorbertKlockiewicz Mar 3, 2026
8d1b4eb
feat: update sendMessage to accept typed media object
NorbertKlockiewicz Mar 3, 2026
f3edf5d
feat: add LFM2_VL_1_6B and LFM2_VL_1_6B_QUANTIZED model constants
NorbertKlockiewicz Mar 3, 2026
6eba3f7
feat: add IEncoder interface and VisionEncoder
NorbertKlockiewicz Mar 3, 2026
0819c20
fix: address vision_encoder quality review issues
NorbertKlockiewicz Mar 3, 2026
1de96bb
feat: add BaseLLMRunner with shared state and load()
NorbertKlockiewicz Mar 3, 2026
e08b391
feat: add TextRunner
NorbertKlockiewicz Mar 3, 2026
6703559
feat: add MultimodalRunner with plug-in encoder map
NorbertKlockiewicz Mar 3, 2026
a1edb3c
feat: wire capabilities through LLM.cpp, delete UnifiedRunner
NorbertKlockiewicz Mar 3, 2026
7076a9f
feat: forward capabilities from LLMController to native
NorbertKlockiewicz Mar 3, 2026
96525bc
feat: add logging, fix metadata application, fix module ownership and…
NorbertKlockiewicz Mar 5, 2026
b3ce27e
refactor: replace Image class with ImagePath + VisionEncoder embeddin…
NorbertKlockiewicz Mar 5, 2026
ce6856d
test: add TextRunnerTests and VLMTests suites, register in CMake and …
NorbertKlockiewicz Mar 5, 2026
4184bb3
refactor: unify multimodal/text paths in sendMessage, add getVisualTo…
NorbertKlockiewicz Mar 5, 2026
c88d97c
refactor: replace example namespace with rnexecutorch::llm::runner in…
NorbertKlockiewicz Mar 5, 2026
c7357d3
refactor: collapse BaseLLMRunner constructor, deduplicate eos_ids, re…
NorbertKlockiewicz Mar 5, 2026
69d454b
refactor: comments etc.
NorbertKlockiewicz Mar 5, 2026
6a3857b
fix: cap VLM generation tokens, propagate encoder load errors, pass i…
NorbertKlockiewicz Mar 5, 2026
551a306
revert: remove TextRunnerTests and VLMTests suites
NorbertKlockiewicz Mar 5, 2026
3 changes: 2 additions & 1 deletion apps/llm/app.json
@@ -55,7 +55,8 @@
},
"entitlements": {
"com.apple.developer.kernel.increased-memory-limit": true
- }
+ },
+ "appleTeamId": "B357MU264T"
},
"android": {
"adaptiveIcon": {
8 changes: 8 additions & 0 deletions apps/llm/app/_layout.tsx
@@ -89,6 +89,14 @@ export default function _layout() {
headerTitleStyle: { color: ColorPalette.primary },
}}
/>
<Drawer.Screen
name="multimodal_llm/index"
options={{
drawerLabel: 'Multimodal LLM (VLM)',
title: 'Multimodal LLM',
headerTitleStyle: { color: ColorPalette.primary },
}}
/>
<Drawer.Screen
name="index"
options={{
6 changes: 6 additions & 0 deletions apps/llm/app/index.tsx
@@ -35,6 +35,12 @@ export default function Home() {
>
<Text style={styles.buttonText}>Voice Chat</Text>
</TouchableOpacity>
<TouchableOpacity
style={styles.button}
onPress={() => router.navigate('multimodal_llm/')}
>
<Text style={styles.buttonText}>Multimodal LLM (VLM)</Text>
</TouchableOpacity>
</View>
</View>
);
310 changes: 310 additions & 0 deletions apps/llm/app/multimodal_llm/index.tsx
@@ -0,0 +1,310 @@
import { useContext, useEffect, useRef, useState } from 'react';
import {
Image,
Keyboard,
KeyboardAvoidingView,
Platform,
StyleSheet,
Text,
TextInput,
TouchableOpacity,
TouchableWithoutFeedback,
View,
} from 'react-native';
import { launchImageLibrary } from 'react-native-image-picker';
import { useIsFocused } from '@react-navigation/native';
import { useLLM, LFM2_VL_1_6B_QUANTIZED } from 'react-native-executorch';
import SendIcon from '../../assets/icons/send_icon.svg';
import PauseIcon from '../../assets/icons/pause_icon.svg';
import ColorPalette from '../../colors';
import Messages from '../../components/Messages';
import Spinner from '../../components/Spinner';
import { GeneratingContext } from '../../context';

export default function MultimodalLLMScreenWrapper() {
const isFocused = useIsFocused();
return isFocused ? <MultimodalLLMScreen /> : null;
}

function MultimodalLLMScreen() {
const [imageUri, setImageUri] = useState<string | null>(null);
const [userInput, setUserInput] = useState('');
const [isTextInputFocused, setIsTextInputFocused] = useState(false);
const textInputRef = useRef<TextInput>(null);
const { setGlobalGenerating } = useContext(GeneratingContext);

const vlm = useLLM({
model: LFM2_VL_1_6B_QUANTIZED,
});

useEffect(() => {
setGlobalGenerating(vlm.isGenerating);
}, [vlm.isGenerating, setGlobalGenerating]);

useEffect(() => {
if (vlm.error) console.error('MultimodalLLM error:', vlm.error);
}, [vlm.error]);

const pickImage = async () => {
const result = await launchImageLibrary({ mediaType: 'photo' });
if (result.assets && result.assets.length > 0) {
const uri = result.assets[0]?.uri;
if (uri) setImageUri(uri);
}
};

const sendMessage = async () => {
if (!userInput.trim() || vlm.isGenerating) return;
const text = userInput.trim();
setUserInput('');
textInputRef.current?.clear();
Keyboard.dismiss();
const currentImageUri = imageUri;
setImageUri(null);
try {
await vlm.sendMessage(
text,
currentImageUri ? { imagePath: currentImageUri } : undefined
);
} catch (e) {
console.error('Generation error:', e);
}
};

if (!vlm.isReady) {
return (
<Spinner
visible={!vlm.isReady}
textContent={
vlm.error
? `Error: ${vlm.error.message}`
: `Loading model ${(vlm.downloadProgress * 100).toFixed(0)}%`
}
/>
);
}

return (
<TouchableWithoutFeedback onPress={Keyboard.dismiss}>
<KeyboardAvoidingView
style={styles.container}
collapsable={false}
behavior={Platform.OS === 'ios' ? 'padding' : undefined}
keyboardVerticalOffset={Platform.OS === 'ios' ? 120 : 40}
>
<View style={styles.container}>
{vlm.messageHistory.length ? (
<View style={styles.chatContainer}>
<Messages
chatHistory={vlm.messageHistory}
llmResponse={vlm.response}
isGenerating={vlm.isGenerating}
deleteMessage={vlm.deleteMessage}
/>
</View>
) : (
<View style={styles.helloMessageContainer}>
<Text style={styles.helloText}>Hello! 👋</Text>
<Text style={styles.bottomHelloText}>
Pick an image and ask me anything about it.
</Text>
</View>
)}

{/* Image thumbnail strip */}
{imageUri && (
<TouchableOpacity
style={styles.imageThumbnailContainer}
onPress={pickImage}
>
<Image
source={{ uri: imageUri }}
style={styles.imageThumbnail}
resizeMode="cover"
/>
<Text style={styles.imageThumbnailHint}>Tap to change</Text>
</TouchableOpacity>
)}

<View style={styles.bottomContainer}>
{/* Image picker button */}
<TouchableOpacity
style={styles.imageButton}
onPress={pickImage}
disabled={vlm.isGenerating}
>
<Text style={styles.imageButtonText}>📷</Text>
</TouchableOpacity>

<TextInput
autoCorrect={false}
ref={textInputRef}
onFocus={() => setIsTextInputFocused(true)}
onBlur={() => setIsTextInputFocused(false)}
style={[
styles.textInput,
{
borderColor: isTextInputFocused
? ColorPalette.blueDark
: ColorPalette.blueLight,
},
]}
placeholder={imageUri ? 'Ask about the image…' : 'Your message'}
placeholderTextColor="#C1C6E5"
multiline
onChangeText={setUserInput}
/>

{userInput.trim() && !vlm.isGenerating && (
<TouchableOpacity
style={styles.sendChatTouchable}
onPress={sendMessage}
>
<SendIcon height={24} width={24} padding={4} margin={8} />
</TouchableOpacity>
)}
{vlm.isGenerating && (
<TouchableOpacity
style={styles.sendChatTouchable}
onPress={vlm.interrupt}
>
<PauseIcon height={24} width={24} padding={4} margin={8} />
</TouchableOpacity>
)}
</View>
</View>
</KeyboardAvoidingView>
</TouchableWithoutFeedback>
);
}

const styles = StyleSheet.create({
// Setup phase
setupContainer: {
flex: 1,
padding: 24,
backgroundColor: '#fff',
justifyContent: 'center',
},
setupTitle: {
fontSize: 20,
fontFamily: 'medium',
color: ColorPalette.primary,
marginBottom: 8,
},
setupHint: {
fontSize: 13,
fontFamily: 'regular',
color: ColorPalette.blueDark,
marginBottom: 32,
lineHeight: 18,
},
filePickerRow: {
flexDirection: 'row',
alignItems: 'center',
borderWidth: 1,
borderColor: ColorPalette.blueLight,
borderRadius: 10,
padding: 14,
marginBottom: 12,
backgroundColor: '#fafbff',
},
filePickerInfo: { flex: 1 },
filePickerLabel: {
fontSize: 12,
fontFamily: 'medium',
color: ColorPalette.blueDark,
marginBottom: 2,
},
filePickerValue: { fontSize: 14, fontFamily: 'regular' },
filePickerValueSet: { color: ColorPalette.primary },
filePickerValueEmpty: { color: ColorPalette.blueLight },
filePickerChevron: {
fontSize: 24,
color: ColorPalette.blueLight,
marginLeft: 8,
},
loadButton: {
marginTop: 16,
backgroundColor: ColorPalette.strongPrimary,
borderRadius: 10,
padding: 14,
alignItems: 'center',
},
loadButtonDisabled: { backgroundColor: ColorPalette.blueLight },
loadButtonText: { color: '#fff', fontFamily: 'medium', fontSize: 15 },

// Chat phase
container: { flex: 1 },
chatContainer: { flex: 10, width: '100%' },
helloMessageContainer: {
flex: 10,
width: '100%',
alignItems: 'center',
justifyContent: 'center',
},
helloText: {
fontFamily: 'medium',
fontSize: 30,
color: ColorPalette.primary,
},
bottomHelloText: {
fontFamily: 'regular',
fontSize: 20,
lineHeight: 28,
textAlign: 'center',
color: ColorPalette.primary,
paddingHorizontal: 24,
},
imageThumbnailContainer: {
flexDirection: 'row',
alignItems: 'center',
paddingHorizontal: 16,
paddingVertical: 6,
gap: 8,
},
imageThumbnail: {
width: 48,
height: 48,
borderRadius: 8,
borderWidth: 1,
borderColor: ColorPalette.blueLight,
},
imageThumbnailHint: {
fontSize: 12,
fontFamily: 'regular',
color: ColorPalette.blueDark,
},
bottomContainer: {
height: 100,
width: '100%',
flexDirection: 'row',
justifyContent: 'space-between',
alignItems: 'center',
paddingHorizontal: 16,
},
imageButton: {
width: 40,
height: 40,
justifyContent: 'center',
alignItems: 'center',
marginRight: 4,
},
imageButtonText: { fontSize: 22 },
textInput: {
flex: 1,
borderWidth: 1,
borderRadius: 8,
lineHeight: 19.6,
fontFamily: 'regular',
fontSize: 14,
color: ColorPalette.primary,
padding: 16,
},
sendChatTouchable: {
height: '100%',
width: 48,
justifyContent: 'center',
alignItems: 'flex-end',
},
});
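
The `sendMessage` handler in this file snapshots the picked image, clears it from state, and passes `{ imagePath }` to `vlm.sendMessage` only when an image was selected. A minimal sketch of that argument-building step as a standalone, testable helper follows. `MediaArg` and `buildMediaArg` are hypothetical names introduced for illustration; they are not part of this PR or of `react-native-executorch`.

```typescript
// Hypothetical helper (assumption, not in the PR): mirrors the inline
// ternary used when calling vlm.sendMessage(text, media?).
type MediaArg = { imagePath: string };

function buildMediaArg(imageUri: string | null): MediaArg | undefined {
  // A null or empty URI means a text-only turn, so no media object is sent.
  return imageUri ? { imagePath: imageUri } : undefined;
}
```

With such a helper the call site would read `await vlm.sendMessage(text, buildMediaArg(currentImageUri))`, which keeps the text-only and image paths on a single call.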