feat: Add deeplinks for pause/resume/switch + Raycast Extension #1568
This PR adds:

## Deeplink Actions (Rust)

- PauseRecording: Pause the current recording
- ResumeRecording: Resume a paused recording
- TogglePauseRecording: Toggle pause/resume state
- SwitchCamera: Switch to a different camera by device ID
- SwitchMicrophone: Switch to a different microphone by label
- ListCameras: List available cameras (for external tooling)
- ListMicrophones: List available microphones (for external tooling)
- GetRecordingStatus: Get current recording state

## Raycast Extension

A full Raycast extension with commands:

- Start Recording: Pick screen or window, choose instant/studio mode
- Stop Recording: Stop the current recording
- Toggle Pause: Pause or resume recording
- Switch Camera: Pick from available cameras or disable
- Switch Microphone: Pick from available mics or disable
- Open Cap: Launch the Cap application

Closes CapSoftware#1540
12 files reviewed, 4 comments
    export async function openDeeplink(action: object): Promise<void> {
      const url = buildDeeplinkUrl(action);
      await execAsync(`open "${url}"`);
    }
[P0] openDeeplink is typed to accept an object, but several callers pass a string (e.g. openDeeplink("stop_recording")). This will either fail typecheck (TS2345) or force unsafe casts, and it also means JSON.stringify produces quoted strings (which is intended) but the typing is incorrect.
This affects stopRecording, pauseRecording, resumeRecording, and togglePauseRecording in apps/raycast-extension/src/utils/cap.ts.
Also appears in:
- apps/raycast-extension/src/utils/cap.ts:68-82
Path: apps/raycast-extension/src/utils/cap.ts
Line: 47:50
Fixed! Added type DeeplinkAction = string | object and updated both functions to accept either type.
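For illustration, a stricter alternative to `string | object` is to enumerate the known actions so the compiler rejects misspelled or malformed payloads. The sketch below is not the PR's code: the union members are taken from the action names mentioned in this thread, the `switch_camera` payload shape is inferred from the review comments, and the use of `encodeURIComponent` is an assumption about how the `cap://action?value=...` URL is built.

```typescript
// Hypothetical typing sketch; names and shapes are assumptions, not the PR's code.
type UnitAction =
  | "stop_recording"
  | "pause_recording"
  | "resume_recording"
  | "toggle_pause_recording";

// Assumed payload shape: device_id is nullable so the camera can be disabled.
type SwitchCameraAction = { switch_camera: { device_id: string | null } };

type DeeplinkAction = UnitAction | SwitchCameraAction;

// Serializes the action to JSON and percent-encodes it into the `value` query parameter.
function buildDeeplinkUrl(action: DeeplinkAction): string {
  const value = encodeURIComponent(JSON.stringify(action));
  return `cap://action?value=${value}`;
}
```

With this shape, `buildDeeplinkUrl("stop_recordng")` fails to compile, whereas `string | object` would accept it silently.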
    export async function openDeeplink(action: object): Promise<void> {
      const url = buildDeeplinkUrl(action);
      await execAsync(`open "${url}"`);
    }
[P0] execAsync is invoked with a shell string containing interpolated data (the deeplink URL). While the URL is mostly controlled by this extension, values like window/display names can contain quotes/backticks, which can break the shell quoting and lead to command injection.
Using open via spawn/execFile with args (no shell) avoids this class of issues.
Path: apps/raycast-extension/src/utils/cap.ts
Line: 47:50
Fixed! Replaced execAsync with Raycast's native open() API - no shell interpolation.
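The reviewer's alternative (run `open` without a shell) can be sketched as follows. This is not what the PR ended up doing, since it switched to Raycast's `open()` API; the function names `openArgv` and `openDeeplinkWithoutShell` are hypothetical. The point is that `execFile` passes the URL as a single argv entry, so quotes or backticks in it are never interpreted by a shell.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Hypothetical helper: the URL becomes exactly one argument, whatever it contains.
function openArgv(url: string): string[] {
  return [url];
}

// Launches macOS `open` directly (no shell), so no quoting or escaping is needed.
async function openDeeplinkWithoutShell(url: string): Promise<void> {
  await execFileAsync("open", openArgv(url));
}
```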
    DeepLinkAction::ListCameras => {
        let cameras: Vec<CameraInfo> = cap_camera::list_cameras()
            .map(|c| CameraInfo {
                device_id: c.device_id().to_string(),
                display_name: c.display_name().to_string(),
                model_id: c.model_id().map(|m| m.to_string()),
            })
            .collect();
        let json = serde_json::to_string(&cameras).map_err(|e| e.to_string())?;
        trace!("ListCameras response: {}", json);
        Ok(())
    }
[P1] ListCameras/ListMicrophones/GetRecordingStatus build JSON but then discard it (only trace! it and return Ok(())). Deeplinks don't have a response channel, so external tooling (including this Raycast extension) can't actually consume these lists/status as implemented.
If these actions are intended for external tooling, they likely need to emit an event / write to a known location / expose a Tauri command instead of only logging.
Path: apps/desktop/src-tauri/src/deeplink_actions.rs
Line: 193:204
Acknowledged - this is a limitation of deeplinks (fire-and-forget, no response channel). These actions are primarily for debugging/logging. The Raycast extension queries devices directly via system_profiler instead.
    export async function listCameras(): Promise<Camera[]> {
      try {
        const script = `
          set cameraList to {}
          try
            do shell script "system_profiler SPCameraDataType 2>/dev/null | grep -E '^\\s+[A-Za-z]' | sed 's/^[[:space:]]*//' | head -10"
          on error
            return ""
          end try
        `;
        const { stdout } = await execAsync(`system_profiler SPCameraDataType 2>/dev/null | grep -E "^\\s+[A-Za-z]" | sed 's/^[[:space:]]*//' | head -10`);
        const cameras = stdout.trim().split("\n").filter(Boolean);
        return cameras.map((name, index) => ({
          deviceId: `camera-${index}`,
          displayName: name.replace(/:$/, "").trim(),
        }));
      } catch {
        return [{ deviceId: "default", displayName: "Default Camera" }];
      }
    }
[P1] listCameras returns fabricated deviceId values like camera-0, but the deeplink switch_camera expects a real device_id that cap_camera::list_cameras() can resolve. As a result, switching cameras via Raycast will likely fail or select the wrong device.
Similarly, listWindows/listDisplays use names as identifiers, which may not match Cap's internal matching if names differ or are non-unique.
Path: apps/raycast-extension/src/utils/cap.ts
Line: 147:166
Fixed! Camera IDs now use the display name (e.g., "FaceTime HD Camera") which Cap can match against.
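A rough sketch of what "use the display name as the ID" could look like on the parsing side. It assumes `system_profiler SPCameraDataType` prints each device as an indented header line ending in a colon, followed by `Model ID:` and `Unique ID:` detail lines; that layout, and the names `CameraEntry` and `parseCameraNames`, are assumptions rather than the PR's code. Matching only header lines avoids treating detail lines as cameras, which the `grep -E "^\\s+[A-Za-z]"` filter above would not do.

```typescript
// Hypothetical shape: the display name doubles as the identifier sent to Cap.
interface CameraEntry {
  deviceId: string;
  displayName: string;
}

// Extracts device header lines (indented, ending in ":" with nothing after it).
function parseCameraNames(stdout: string): CameraEntry[] {
  const cameras: CameraEntry[] = [];
  for (const line of stdout.split("\n")) {
    const match = /^\s+(\S.*):\s*$/.exec(line);
    const name = match?.[1]?.trim();
    if (name) {
      cameras.push({ deviceId: name, displayName: name });
    }
  }
  return cameras;
}
```

Display names are not guaranteed to be unique (two identical webcams share one), so a unique ID remains the more robust key when one is available.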
- Fix TypeScript type: openDeeplink now accepts string | object
- Fix command injection: Use Raycast's open API instead of shell exec
- Fix camera IDs: Use camera name as deviceId (Cap matches by name)
    export async function openDeeplink(action: DeeplinkAction): Promise<void> {
      const url = buildDeeplinkUrl(action);
      // Use Raycast's open API to avoid command injection risks
Repo guidelines disallow code comments; can you drop the inline // ... in this extension?
    - // Use Raycast's open API to avoid command injection risks
    + export async function openDeeplink(action: DeeplinkAction): Promise<void> {
    +   const url = buildDeeplinkUrl(action);
    +   await open(url);
    + }
Fixed! All comments removed in latest commit.
    export async function listMicrophones(): Promise<Microphone[]> {
      try {
        const script = `
script is unused here; also keeps the no-comments rule consistent.
    - const script = `
    + export async function listMicrophones(): Promise<Microphone[]> {
    +   try {
    +     const { stdout } = await execAsync(`system_profiler SPAudioDataType 2>/dev/null | grep -A 50 'Input Sources:' | grep -E "Default Input Device: Yes" -B 10 | grep -E "^\\s+[A-Za-z].*:" | head -5 | sed 's/^[[:space:]]*//' | cut -d: -f1`);
Fixed! Removed unused script variable.
    DeepLinkAction::TogglePauseRecording => {
        crate::recording::toggle_pause_recording(app.clone(), app.state()).await
    }
    DeepLinkAction::SwitchCamera { device_id } => {
Right now switch_camera.device_id must match cap_camera::CameraInfo.device_id() (unique ID). The Raycast extension/README are passing a display name, which won't match. One option is to accept either unique ID or display name here.
    - DeepLinkAction::SwitchCamera { device_id } => {
    + DeepLinkAction::SwitchCamera { device_id } => {
    +     let state = app.state::<ArcLock<App>>();
    +     let camera_id = match device_id {
    +         None => None,
    +         Some(id) => {
    +             let matched = cap_camera::list_cameras()
    +                 .find(|c| c.device_id() == id || c.display_name() == id)
    +                 .map(|c| c.device_id().to_string())
    +                 .ok_or_else(|| format!("No camera with id or name \"{id}\""))?;
    +             Some(DeviceOrModelID::DeviceID(matched))
    +         }
    +     };
    +     crate::set_camera_input(app.clone(), state, camera_id, Some(true)).await
    + }
Fixed! SwitchCamera now accepts either device_id or display_name - matches against both.
    }

    export async function startRecording(options: StartRecordingOptions): Promise<void> {
      const captureMode = options.captureMode.screen
If captureMode is {} (neither screen nor window), JSON.stringify will omit the undefined field and the deeplink will fail to parse on the Rust side. Guard early (or make CaptureMode a discriminated union).
    - const captureMode = options.captureMode.screen
    + export async function startRecording(options: StartRecordingOptions): Promise<void> {
    +   if (!options.captureMode.screen && !options.captureMode.window) {
    +     throw new Error("captureMode must include screen or window");
    +   }
    +   const captureMode = options.captureMode.screen
    +     ? { screen: options.captureMode.screen }
    +     : { window: options.captureMode.window };
    +   await openDeeplink({
    +     start_recording: {
    +       capture_mode: captureMode,
    +       camera: options.camera ? { device_id: options.camera } : null,
    +       mic_label: options.micLabel ?? null,
    +       capture_system_audio: options.captureSystemAudio ?? false,
    +       mode: options.mode ?? "instant",
    +     },
    +   });
    + }
Fixed! Added guard that throws if captureMode has neither screen nor window.
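The reviewer's other option, a discriminated union, moves the same check to compile time. A minimal sketch under assumptions: the `kind`/`name` field names and the `CaptureTarget` and `captureModePayload` identifiers are hypothetical, while the `{ screen: ... }` / `{ window: ... }` payload shapes follow the suggestion above.

```typescript
// Hypothetical discriminated union: an empty capture mode is unrepresentable.
type CaptureTarget =
  | { kind: "screen"; name: string }
  | { kind: "window"; name: string };

type CaptureModePayload = { screen: string } | { window: string };

// Converts the union into the payload shape used under `capture_mode`.
function captureModePayload(target: CaptureTarget): CaptureModePayload {
  switch (target.kind) {
    case "screen":
      return { screen: target.name };
    case "window":
      return { window: target.name };
  }
}
```

With this type, passing `{}` is a type error, so the runtime guard only matters for untyped callers.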
    `;
    const { stdout } = await execAsync(`osascript -e '${script.replace(/'/g, "'\\''")}'`);
    const windows = stdout.trim().split(", ").filter(Boolean);
    return windows.map((win, index) => {
Splitting osascript output on ", " is fragile if window titles contain commas. Consider returning newline-delimited text from AppleScript and splitting on \n.
Fixed! Now using newline-delimited output from AppleScript and splitting on \n.
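As a sketch of the consuming side, newline-delimited output can be split per line and then divided at the first `": "`, assuming each line has the `appName: windowName` form built by the AppleScript in this PR. The names `WindowEntry` and `parseWindowList` are hypothetical.

```typescript
// Hypothetical result shape for one line of AppleScript output.
interface WindowEntry {
  appName: string;
  title: string;
}

// Splits on newlines so commas inside window titles are preserved.
function parseWindowList(stdout: string): WindowEntry[] {
  return stdout
    .split("\n")
    .map((line) => line.trim())
    .filter(Boolean)
    .map((line) => {
      // Only the first ": " separates app name from title; later ones belong to the title.
      const sep = line.indexOf(": ");
      return sep === -1
        ? { appName: line, title: "" }
        : { appName: line.slice(0, sep), title: line.slice(sep + 2) };
    });
}
```

An application whose own name contains `": "` would still be split in the wrong place; a delimiter that cannot occur in names (for example a tab) would avoid that.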
    export async function openCap(): Promise<void> {
      // Use Raycast's open API for launching applications
      await open("cap://");
Same nit here: repo convention is no inline // comments.
    - await open("cap://");
    + export async function openCap(): Promise<void> {
    +   await open("cap://");
    + }
Fixed! All comments removed.
        Ok(())
    }
    DeepLinkAction::GetRecordingStatus => {
        let state = app.state::<ArcLock<App>>();
This calls current_recording() twice and unwrap_or(false) hides errors from is_paused(). I'd compute once and propagate the error (or include it in the trace).
    - let state = app.state::<ArcLock<App>>();
    + DeepLinkAction::GetRecordingStatus => {
    +     let state = app.state::<ArcLock<App>>();
    +     let app_state = state.read().await;
    +     let (is_recording, is_paused) = match app_state.current_recording() {
    +         Some(recording) => {
    +             let is_paused = recording.is_paused().await.map_err(|e| e.to_string())?;
    +             (true, is_paused)
    +         }
    +         None => (false, false),
    +     };
    +     trace!(
    +         "GetRecordingStatus: is_recording={}, is_paused={}",
    +         is_recording,
    +         is_paused
    +     );
    +     Ok(())
    + }
Fixed! Now calling current_recording() once and properly propagating is_paused() errors.
- Remove all inline comments (repo disallows comments)
- Guard startRecording against empty captureMode
- Fix window title parsing to use newlines instead of comma splitting
- Remove unused script variable in listMicrophones
- SwitchCamera now accepts device ID or display name
- GetRecordingStatus avoids redundant current_recording() call and propagates errors
    repeat with win in (every window of proc)
      set windowName to name of win
      set appName to name of proc
      set end of windowList to appName & ": " & windowName & "\n"
windowList as text still joins list items using AppleScript's text item delimiters (default is , ), so appending "\n" to each element can leave stray commas in the output. I'd set the delimiters to linefeed and drop the per-item newline.
    - set end of windowList to appName & ": " & windowName & "\n"
    +     repeat with win in (every window of proc)
    +       set windowName to name of win
    +       set appName to name of proc
    +       set end of windowList to appName & ": " & windowName
    +     end repeat
    +   end try
    + end repeat
    + set AppleScript's text item delimiters to linefeed
    + return windowList as text
Set AppleScript's text item delimiters to linefeed before coercing windowList to text, following Raycast extension conventions.
Summary

This PR adds additional deeplink actions to Cap and a full Raycast extension, addressing issue #1540.

Deeplink Actions Added (Rust)

- `pause_recording`
- `resume_recording`
- `toggle_pause_recording`
- `switch_camera`
- `switch_microphone`
- `list_cameras`
- `list_microphones`
- `get_recording_status`

Deeplink Examples

Raycast Extension

A full Raycast extension at `apps/raycast-extension/` with the following commands:

Screenshots

The Raycast extension provides:

Testing

`npm run dev` in `apps/raycast-extension/`

Closes #1540
Greptile Overview
Greptile Summary
This PR expands the desktop app’s deeplink handler to support pause/resume/toggle, device switching, and additional “list/status” actions, and adds a new Raycast extension that triggers these actions via `cap://action?value=...` URLs.

Key issues worth addressing are in the Raycast utility layer: `openDeeplink` is typed as accepting an `object` but is called with strings (typecheck failure), deeplinks are launched via `exec` with interpolated shell strings (possible command injection if values contain quotes), and `listCameras` fabricates device IDs that likely won’t match the Rust-side `DeviceID` lookup. On the Rust side, the new list/status deeplink actions only log JSON; without an actual response mechanism, external tools can’t consume the results.

Confidence Score: 2/5

`openDeeplink` typed as `object` but called with strings) and uses `exec` with interpolated user-influenced strings, which is a realistic command injection vector. Additionally, camera switching likely won’t work because camera IDs are fabricated on the Raycast side while Rust expects real device IDs.

Important Files Changed

`MicrophoneFeed::list()` labels.

Sequence Diagram
    sequenceDiagram
        participant User as External Tool / Raycast
        participant OS as macOS (open)
        participant Cap as Cap Desktop (Tauri)
        participant DL as deeplink_actions.rs
        participant Rec as recording.rs
        participant Dev as Device APIs
        User->>OS: open cap://action?value=<json>
        OS->>Cap: Launch/Focus with URL
        Cap->>DL: handle(urls)
        DL->>DL: TryFrom<Url> parse `value`
        alt start_recording
            DL->>Dev: set_camera_input / set_mic_input
            DL->>Rec: start_recording(inputs)
        else stop_recording
            DL->>Rec: stop_recording()
        else pause/resume/toggle
            DL->>Rec: pause_recording()/resume_recording()/toggle_pause_recording()
        else switch_camera
            DL->>Dev: set_camera_input(device_id)
        else switch_microphone
            DL->>Dev: set_mic_input(label)
        else list/status
            DL->>DL: build JSON + trace!(...)
            DL-->>User: (no response channel)
        end

Context used:

- dashboard - CLAUDE.md (source)
- dashboard - AGENTS.md (source)