@tvdavies commented Feb 1, 2026

Summary

This PR adds additional deeplink actions to Cap and a full Raycast extension, addressing issue #1540.

Deeplink Actions Added (Rust)

| Action | Description |
| --- | --- |
| pause_recording | Pause the current recording |
| resume_recording | Resume a paused recording |
| toggle_pause_recording | Toggle pause/resume state |
| switch_camera | Switch to a different camera by device ID (or null to disable) |
| switch_microphone | Switch to a different microphone by label (or null to disable) |
| list_cameras | List available cameras (for external tooling) |
| list_microphones | List available microphones (for external tooling) |
| get_recording_status | Get current recording state |

Deeplink Examples

# Pause recording
open "cap://action?value=%22pause_recording%22"

# Toggle pause
open "cap://action?value=%22toggle_pause_recording%22"

# Switch camera
open "cap://action?value=%7B%22switch_camera%22%3A%7B%22device_id%22%3A%22FaceTime%20HD%20Camera%22%7D%7D"

# Disable camera
open "cap://action?value=%7B%22switch_camera%22%3A%7B%22device_id%22%3Anull%7D%7D"
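
The percent-encoded values above are the JSON form of each action passed through URL encoding. A minimal sketch of that construction follows; `buildDeeplinkUrl` and `DeeplinkAction` are names that appear later in this review, but the function body here is an assumption, not the PR's actual implementation.

```typescript
// Assumed shape: unit actions are plain strings, parameterized actions are objects.
type DeeplinkAction = string | object;

function buildDeeplinkUrl(action: DeeplinkAction): string {
  // JSON.stringify keeps the surrounding quotes for string actions;
  // encodeURIComponent then percent-encodes quotes, braces, colons and spaces.
  return `cap://action?value=${encodeURIComponent(JSON.stringify(action))}`;
}

console.log(buildDeeplinkUrl("pause_recording"));
// cap://action?value=%22pause_recording%22

console.log(buildDeeplinkUrl({ switch_camera: { device_id: null } }));
// cap://action?value=%7B%22switch_camera%22%3A%7B%22device_id%22%3Anull%7D%7D
```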

Raycast Extension

A full Raycast extension at apps/raycast-extension/ with the following commands:

  • Start Recording - Pick a screen or window, choose instant/studio mode
  • Stop Recording - Stop the current recording
  • Toggle Pause - Pause or resume the current recording
  • Switch Camera - Pick from available cameras or disable camera
  • Switch Microphone - Pick from available microphones or disable microphone
  • Open Cap - Launch the Cap application

Screenshots

The Raycast extension provides:

  • Screen/window picker for starting recordings
  • Mode selector (Instant vs Studio)
  • Camera/microphone lists with disable option

Testing

  1. Build the desktop app with the new deeplink actions
  2. Test deeplinks via terminal using the examples above
  3. Install the Raycast extension via npm run dev in apps/raycast-extension/
  4. Test all Raycast commands

Closes #1540

Greptile Overview

Greptile Summary

This PR expands the desktop app’s deeplink handler to support pause/resume/toggle, device switching, and additional “list/status” actions, and adds a new Raycast extension that triggers these actions via cap://action?value=... URLs.

Key issues worth addressing are in the Raycast utility layer: openDeeplink is typed as accepting an object but is called with strings (typecheck failure), deeplinks are launched via exec with interpolated shell strings (possible command injection if values contain quotes), and listCameras fabricates device IDs that likely won’t match the Rust-side DeviceID lookup. On the Rust side, the new list/status deeplink actions only log JSON; without an actual response mechanism, external tools can’t consume the results.

Confidence Score: 2/5

  • This PR is not safe to merge as-is due to correctness and security concerns in the Raycast deeplink execution path.
  • The Raycast extension currently has a TypeScript type mismatch (openDeeplink typed as object but called with strings) and uses exec with interpolated user-influenced strings, which is a realistic command injection vector. Additionally, camera switching likely won’t work because camera IDs are fabricated on the Raycast side while Rust expects real device IDs.
  • Files needing the most attention: apps/raycast-extension/src/utils/cap.ts; apps/desktop/src-tauri/src/deeplink_actions.rs

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/desktop/src-tauri/src/deeplink_actions.rs | Adds deeplink actions for pause/resume/toggle, device switching, and list/status helpers; list/status actions currently only log JSON (no consumable response). |
| apps/raycast-extension/README.md | Adds README describing Raycast extension and deeplink usage; no functional issues spotted in doc content. |
| apps/raycast-extension/assets/extension-icon.png | Adds Raycast extension icon asset (binary). |
| apps/raycast-extension/package.json | Adds Raycast extension manifest (commands, deps, scripts); no obvious schema issues from review. |
| apps/raycast-extension/src/open-cap.tsx | Adds 'Open Cap' no-view command that launches Cap and shows HUD; straightforward. |
| apps/raycast-extension/src/start-recording.tsx | Adds interactive start recording command (display/window selection + mode); relies on name-based identifiers which may be non-unique/mismatch Cap. |
| apps/raycast-extension/src/stop-recording.tsx | Adds stop recording no-view command; depends on utils/cap openDeeplink typing/shell execution. |
| apps/raycast-extension/src/switch-camera.tsx | Adds camera switch UI; depends on utils/cap listCameras generating real device ids (currently fabricated). |
| apps/raycast-extension/src/switch-microphone.tsx | Adds microphone switch UI; works if mic labels match Cap’s MicrophoneFeed::list() labels. |
| apps/raycast-extension/src/toggle-pause.tsx | Adds toggle pause no-view command; depends on utils/cap openDeeplink typing/shell execution. |
| apps/raycast-extension/src/utils/cap.ts | Implements deeplink builder and OS queries via shell; has type mismatch for openDeeplink arg and potential command injection via exec string interpolation; device IDs for cameras are not real. |
| apps/raycast-extension/tsconfig.json | Adds TS config for the Raycast extension; standard strict settings. |

Sequence Diagram

sequenceDiagram
  participant User as External Tool / Raycast
  participant OS as macOS (open)
  participant Cap as Cap Desktop (Tauri)
  participant DL as deeplink_actions.rs
  participant Rec as recording.rs
  participant Dev as Device APIs

  User->>OS: open cap://action?value=<json>
  OS->>Cap: Launch/Focus with URL
  Cap->>DL: handle(urls)
  DL->>DL: TryFrom<Url> parse `value`
  alt start_recording
    DL->>Dev: set_camera_input / set_mic_input
    DL->>Rec: start_recording(inputs)
  else stop_recording
    DL->>Rec: stop_recording()
  else pause/resume/toggle
    DL->>Rec: pause_recording()/resume_recording()/toggle_pause_recording()
  else switch_camera
    DL->>Dev: set_camera_input(device_id)
  else switch_microphone
    DL->>Dev: set_mic_input(label)
  else list/status
    DL->>DL: build JSON + trace!(...)
    DL-->>User: (no response channel)
  end

Context used:

  • Context from dashboard - CLAUDE.md (source)
  • Context from dashboard - AGENTS.md (source)

This PR adds:

## Deeplink Actions (Rust)
- PauseRecording: Pause the current recording
- ResumeRecording: Resume a paused recording
- TogglePauseRecording: Toggle pause/resume state
- SwitchCamera: Switch to a different camera by device ID
- SwitchMicrophone: Switch to a different microphone by label
- ListCameras: List available cameras (for external tooling)
- ListMicrophones: List available microphones (for external tooling)
- GetRecordingStatus: Get current recording state

## Raycast Extension
A full Raycast extension with commands:
- Start Recording: Pick screen or window, choose instant/studio mode
- Stop Recording: Stop the current recording
- Toggle Pause: Pause or resume recording
- Switch Camera: Pick from available cameras or disable
- Switch Microphone: Pick from available mics or disable
- Open Cap: Launch the Cap application

Closes CapSoftware#1540

@greptile-apps greptile-apps bot left a comment

12 files reviewed, 4 comments

Comment on lines 47 to 50
export async function openDeeplink(action: object): Promise<void> {
  const url = buildDeeplinkUrl(action);
  await execAsync(`open "${url}"`);
}
Contributor

[P0] openDeeplink is typed to accept an object, but several callers pass a string (e.g. openDeeplink("stop_recording")). This fails typecheck (TS2345) unless callers add unsafe casts. The runtime behavior is as intended, since JSON.stringify on a string produces the quoted form the deeplink expects; only the typing is incorrect.

This affects stopRecording, pauseRecording, resumeRecording, and togglePauseRecording in apps/raycast-extension/src/utils/cap.ts.

Also appears in:

  • apps/raycast-extension/src/utils/cap.ts:68-82

Author

Fixed! Added type DeeplinkAction = string | object and updated both functions to accept either type.

Comment on lines 47 to 50
export async function openDeeplink(action: object): Promise<void> {
  const url = buildDeeplinkUrl(action);
  await execAsync(`open "${url}"`);
}
Contributor

[P0] execAsync is invoked with a shell string containing interpolated data (the deeplink URL). While the URL is mostly controlled by this extension, values like window/display names can contain quotes/backticks, which can break the shell quoting and lead to command injection.

Using open via spawn/execFile with args (no shell) avoids this class of issues.

Author

Fixed! Replaced execAsync with Raycast's native open() API - no shell interpolation.
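
For comparison, the reviewer's alternative (invoking open with an argument vector instead of a shell string) might look roughly like the sketch below. The `Run` type and `makeOpenDeeplink` are hypothetical names introduced for illustration; the point is only that the URL travels as one argv element, so no shell ever parses its contents.

```typescript
// Hypothetical runner signature: executes `file` with `args`, with no shell in between.
type Run = (file: string, args: readonly string[]) => Promise<void>;

// Returns an opener that hands the URL to macOS `open` as a single argv element.
function makeOpenDeeplink(run: Run): (url: string) => Promise<void> {
  return async (url) => {
    await run("open", [url]);
  };
}

// Demo with a recording runner so the example has no side effects.
const recorded: string[][] = [];
const recordingRun: Run = async (file, args) => {
  recorded.push([file, ...args]);
};

// Quotes and $(...) stay inert because they are never interpreted by a shell.
void makeOpenDeeplink(recordingRun)('cap://action?value="$(whoami)"');
console.log(recorded);
```

In a Node-based extension, `run` could wrap `execFile` from `node:child_process` (for example via `util.promisify`), which spawns the program directly rather than through a shell.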

Comment on lines +193 to +204
DeepLinkAction::ListCameras => {
    let cameras: Vec<CameraInfo> = cap_camera::list_cameras()
        .map(|c| CameraInfo {
            device_id: c.device_id().to_string(),
            display_name: c.display_name().to_string(),
            model_id: c.model_id().map(|m| m.to_string()),
        })
        .collect();
    let json = serde_json::to_string(&cameras).map_err(|e| e.to_string())?;
    trace!("ListCameras response: {}", json);
    Ok(())
}
Contributor

[P1] ListCameras/ListMicrophones/GetRecordingStatus build JSON but then discard it (only trace! it and return Ok(())). Deeplinks don't have a response channel, so external tooling (including this Raycast extension) can't actually consume these lists/status as implemented.

If these actions are intended for external tooling, they likely need to emit an event / write to a known location / expose a Tauri command instead of only logging.

Author

Acknowledged - this is a limitation of deeplinks (fire-and-forget, no response channel). These actions are primarily for debugging/logging. The Raycast extension queries devices directly via system_profiler instead.

Comment on lines 147 to 166
export async function listCameras(): Promise<Camera[]> {
  try {
    const script = `
      set cameraList to {}
      try
        do shell script "system_profiler SPCameraDataType 2>/dev/null | grep -E '^\\s+[A-Za-z]' | sed 's/^[[:space:]]*//' | head -10"
      on error
        return ""
      end try
    `;
    const { stdout } = await execAsync(`system_profiler SPCameraDataType 2>/dev/null | grep -E "^\\s+[A-Za-z]" | sed 's/^[[:space:]]*//' | head -10`);
    const cameras = stdout.trim().split("\n").filter(Boolean);
    return cameras.map((name, index) => ({
      deviceId: `camera-${index}`,
      displayName: name.replace(/:$/, "").trim(),
    }));
  } catch {
    return [{ deviceId: "default", displayName: "Default Camera" }];
  }
}
Contributor

[P1] listCameras returns fabricated deviceId values like camera-0, but the deeplink switch_camera expects a real device_id that cap_camera::list_cameras() can resolve. As a result, switching cameras via Raycast will likely fail or select the wrong device.

Similarly, listWindows/listDisplays use names as identifiers, which may not match Cap's internal matching if names differ or are non-unique.

Author

Fixed! Camera IDs now use the display name (e.g., "FaceTime HD Camera") which Cap can match against.
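
A rough sketch of what using the display name as the identifier could look like on the Raycast side is shown below. `parseCameraNames` is a hypothetical helper, and the sample input is made up to resemble the filtered system_profiler lines (names followed by a colon); the actual output format on a given machine may differ.

```typescript
interface Camera {
  deviceId: string;
  displayName: string;
}

// Turns already-filtered lines such as "FaceTime HD Camera:" into Camera entries,
// reusing the display name as the deviceId instead of a fabricated "camera-N".
function parseCameraNames(stdout: string): Camera[] {
  const cameras: Camera[] = [];
  for (const line of stdout.split("\n")) {
    const name = line.trim().replace(/:$/, "").trim();
    if (name.length > 0) {
      cameras.push({ deviceId: name, displayName: name });
    }
  }
  return cameras;
}

// Hypothetical sample input, not captured from a real machine.
console.log(parseCameraNames("FaceTime HD Camera:\nExample USB Camera:\n"));
```

Display names are not guaranteed to be unique, so two identical cameras would still be indistinguishable with this approach.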

- Fix TypeScript type: openDeeplink now accepts string | object
- Fix command injection: Use Raycast's open API instead of shell exec
- Fix camera IDs: Use camera name as deviceId (Cap matches by name)

export async function openDeeplink(action: DeeplinkAction): Promise<void> {
  const url = buildDeeplinkUrl(action);
  // Use Raycast's open API to avoid command injection risks

Repo guidelines disallow code comments; can you drop the inline // ... in this extension?

Suggested change
- // Use Raycast's open API to avoid command injection risks
+ export async function openDeeplink(action: DeeplinkAction): Promise<void> {
+   const url = buildDeeplinkUrl(action);
+   await open(url);
+ }

Author

Fixed! All comments removed in latest commit.


export async function listMicrophones(): Promise<Microphone[]> {
  try {
    const script = `

script is unused here; removing it also keeps the file consistent with the no-comments rule.

Suggested change
- const script = `
+ export async function listMicrophones(): Promise<Microphone[]> {
+   try {
+     const { stdout } = await execAsync(`system_profiler SPAudioDataType 2>/dev/null | grep -A 50 'Input Sources:' | grep -E "Default Input Device: Yes" -B 10 | grep -E "^\\s+[A-Za-z].*:" | head -5 | sed 's/^[[:space:]]*//' | cut -d: -f1`);

Author

Fixed! Removed unused script variable.

DeepLinkAction::TogglePauseRecording => {
crate::recording::toggle_pause_recording(app.clone(), app.state()).await
}
DeepLinkAction::SwitchCamera { device_id } => {

Right now switch_camera.device_id must match cap_camera::CameraInfo.device_id() (unique ID). The Raycast extension/README are passing a display name, which won't match. One option is to accept either unique ID or display name here.

Suggested change
- DeepLinkAction::SwitchCamera { device_id } => {
+ DeepLinkAction::SwitchCamera { device_id } => {
+     let state = app.state::<ArcLock<App>>();
+     let camera_id = match device_id {
+         None => None,
+         Some(id) => {
+             let matched = cap_camera::list_cameras()
+                 .find(|c| c.device_id() == id || c.display_name() == id)
+                 .map(|c| c.device_id().to_string())
+                 .ok_or_else(|| format!("No camera with id or name \"{id}\""))?;
+             Some(DeviceOrModelID::DeviceID(matched))
+         }
+     };
+     crate::set_camera_input(app.clone(), state, camera_id, Some(true)).await
+ }

Author

Fixed! SwitchCamera now accepts either device_id or display_name - matches against both.

}

export async function startRecording(options: StartRecordingOptions): Promise<void> {
const captureMode = options.captureMode.screen

If captureMode is {} (neither screen nor window), JSON.stringify will omit the undefined field and the deeplink will fail to parse on the Rust side. Guard early (or make CaptureMode a discriminated union).

Suggested change
- const captureMode = options.captureMode.screen
+ export async function startRecording(options: StartRecordingOptions): Promise<void> {
+   if (!options.captureMode.screen && !options.captureMode.window) {
+     throw new Error("captureMode must include screen or window");
+   }
+   const captureMode = options.captureMode.screen
+     ? { screen: options.captureMode.screen }
+     : { window: options.captureMode.window };
+   await openDeeplink({
+     start_recording: {
+       capture_mode: captureMode,
+       camera: options.camera ? { device_id: options.camera } : null,
+       mic_label: options.micLabel ?? null,
+       capture_system_audio: options.captureSystemAudio ?? false,
+       mode: options.mode ?? "instant",
+     },
+   });
+ }

Author

Fixed! Added guard that throws if captureMode has neither screen nor window.
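
The other option the reviewer mentioned, a discriminated union, would move the same check to compile time. The following is only a sketch under the assumption that screens and windows are identified by name strings; `CaptureMode` and `toCaptureModePayload` as written here are illustrative, not the PR's code.

```typescript
// Exactly one of the two shapes is allowed; an empty object {} does not typecheck.
type CaptureMode = { screen: string } | { window: string };

// Builds the capture_mode payload; the `in` check narrows the union safely.
function toCaptureModePayload(mode: CaptureMode): CaptureMode {
  return "screen" in mode ? { screen: mode.screen } : { window: mode.window };
}

console.log(JSON.stringify(toCaptureModePayload({ window: "Safari" })));
// {"window":"Safari"}
```

A runtime guard like the one added in the PR remains useful for values that arrive untyped, for example from form state.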

`;
const { stdout } = await execAsync(`osascript -e '${script.replace(/'/g, "'\\''")}'`);
const windows = stdout.trim().split(", ").filter(Boolean);
return windows.map((win, index) => {

Splitting osascript output on ", " is fragile if window titles contain commas. Consider returning newline-delimited text from AppleScript and splitting on \n.

Author

Fixed! Now using newline-delimited output from AppleScript and splitting on \n.
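
To illustrate why newline-delimited output is more robust, here is a small parsing sketch. It assumes each line has the form "appName: windowName", as built in the AppleScript snippet quoted in this review; `parseWindowLines` and `WindowEntry` are hypothetical names.

```typescript
interface WindowEntry {
  appName: string;
  title: string;
}

// Splits on newlines, then on the first ": " so that commas and further
// colons inside a window title are preserved.
function parseWindowLines(stdout: string): WindowEntry[] {
  const entries: WindowEntry[] = [];
  for (const line of stdout.split("\n")) {
    if (line.trim() === "") continue;
    const sep = line.indexOf(": ");
    if (sep === -1) continue; // skip lines that do not match the assumed format
    entries.push({ appName: line.slice(0, sep), title: line.slice(sep + 2) });
  }
  return entries;
}

console.log(parseWindowLines("Mail: Inbox, Drafts\nFinder: Downloads\n"));
```

An application name that itself contains ": " would still be split in the wrong place, so this remains a best-effort parse.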


export async function openCap(): Promise<void> {
  // Use Raycast's open API for launching applications
  await open("cap://");

Same nit here: repo convention is no inline // comments.

Suggested change
- await open("cap://");
+ export async function openCap(): Promise<void> {
+   await open("cap://");
+ }

Author

Fixed! All comments removed.

Ok(())
}
DeepLinkAction::GetRecordingStatus => {
let state = app.state::<ArcLock<App>>();

This calls current_recording() twice and unwrap_or(false) hides errors from is_paused(). I'd compute once and propagate the error (or include it in the trace).

Suggested change
- let state = app.state::<ArcLock<App>>();
+ DeepLinkAction::GetRecordingStatus => {
+     let state = app.state::<ArcLock<App>>();
+     let app_state = state.read().await;
+     let (is_recording, is_paused) = match app_state.current_recording() {
+         Some(recording) => {
+             let is_paused = recording.is_paused().await.map_err(|e| e.to_string())?;
+             (true, is_paused)
+         }
+         None => (false, false),
+     };
+     trace!(
+         "GetRecordingStatus: is_recording={}, is_paused={}",
+         is_recording,
+         is_paused
+     );
+     Ok(())
+ }

Author

Fixed! Now calling current_recording() once and properly propagating is_paused() errors.

tvdavies and others added 2 commits February 1, 2026 01:10
- Remove all inline comments (repo disallows comments)
- Guard startRecording against empty captureMode
- Fix window title parsing to use newlines instead of comma splitting
- Remove unused script variable in listMicrophones
- SwitchCamera now accepts device ID or display name
- GetRecordingStatus avoids redundant current_recording() call and propagates errors

repeat with win in (every window of proc)
  set windowName to name of win
  set appName to name of proc
  set end of windowList to appName & ": " & windowName & "\n"

windowList as text still joins list items using AppleScript's text item delimiters (default is , ), so appending "\n" to each element can leave stray commas in the output. I'd set the delimiters to linefeed and drop the per-item newline.

Suggested change
- set end of windowList to appName & ": " & windowName & "\n"
+     repeat with win in (every window of proc)
+       set windowName to name of win
+       set appName to name of proc
+       set end of windowList to appName & ": " & windowName
+     end repeat
+   end try
+ end repeat
+ set AppleScript's text item delimiters to linefeed
+ return windowList as text

Set AppleScript's text item delimiters to linefeed before coercing
windowList to text, following Raycast extension conventions.

Successfully merging this pull request may close these issues.

Bounty: Deeplinks support + Raycast Extension