feat: Support for Deep Links and Raycast Extension (#1540) #1701
yemsy26 wants to merge 2 commits into CapSoftware:main
# Commit & Git Workflow

## Commit Message (Conventional Commits)

```
feat(deeplink): add recording controls and raycast extension CapSoftware#1540

Extends the DeepLinkAction enum with PauseRecording, ResumeRecording,
TogglePauseRecording, SwitchMicrophone, and SwitchCamera variants.

Adds a companion Raycast extension with two commands: cap-control
(recording lifecycle) and switch-device (mic/camera switcher).

All Rust code uses the ? operator for error propagation (no .unwrap).
API key in Raycast is stored via LocalStorage, never hard-coded.

Closes CapSoftware#1540
```

---

## Git Commands to Submit the PR

### Step 1: Fork & clone (if not already done)

```bash
# Fork the repo on GitHub first, then:
git clone https://github.com/<YOUR-USERNAME>/Cap.git
cd Cap
git remote add upstream https://github.com/CapSoftware/Cap.git
```

### Step 2: Create a feature branch

```bash
git checkout -b feat/deeplink-raycast-1540
```

### Step 3: Copy the generated files

```bash
# Rust change
cp path/to/cap-bounty-1540/apps/desktop/src-tauri/src/deeplink_actions.rs \
   apps/desktop/src-tauri/src/deeplink_actions.rs

# Raycast extension
cp -r path/to/cap-bounty-1540/extensions/raycast extensions/raycast
```

### Step 4: Stage and commit

```bash
git add apps/desktop/src-tauri/src/deeplink_actions.rs
git add extensions/raycast/
git commit -m "feat(deeplink): add recording controls and raycast extension CapSoftware#1540

Extends the DeepLinkAction enum with PauseRecording, ResumeRecording,
TogglePauseRecording, SwitchMicrophone, and SwitchCamera variants.

Adds a companion Raycast extension with two commands: cap-control
(recording lifecycle) and switch-device (mic/camera switcher).

All Rust code uses the ? operator for error propagation (no .unwrap).
API key in Raycast is stored via LocalStorage, never hard-coded.

Closes CapSoftware#1540"
```

### Step 5: Push and open the PR

```bash
git push origin feat/deeplink-raycast-1540
# Then open GitHub and create the PR from feat/deeplink-raycast-1540 → main
# Paste the contents of PR_DESCRIPTION.md into the PR body.
```

---

## Pre-PR Checklist

```bash
# Rust syntax check (no full build needed for CI check)
cd apps/desktop
cargo check

# TypeScript lint (step back to the repository root first)
cd ../../extensions/raycast
npm install
npm run lint
npm run build
```
    #[derive(Debug, Deserialize, Serialize)]
    #[serde(rename_all = "camelCase", tag = "type")]
    pub enum CaptureMode {
        Screen(String),
        Window(String),
    }
Path: apps/desktop/src-tauri/src/deeplink_actions.rs
Line: 27-32
Comment:
**`CaptureMode` incompatible with `#[serde(tag = "type")]`**
`CaptureMode::Screen(String)` and `Window(String)` are newtype variants wrapping a primitive (`String`). Serde's internally tagged representation (`tag = "type"`) only supports newtype variants that wrap a **struct or map-like type**; it cannot be used with primitives because there is no way to merge `{"type": "screen"}` with a bare JSON string value.
With this attribute the enum still compiles (serde rejects only multi-field tuple variants at compile time), but deserialization fails at runtime for a `startRecording` deep link payload like:
```json
{"captureMode": {"screen": "Built-in Display"}}
```
That format is **externally-tagged** (the variant name is the JSON key), which requires **no** `tag` attribute. The fix is to remove `tag = "type"` from `CaptureMode`:
```suggestion
#[derive(Debug, Deserialize, Serialize)]
#[serde(rename_all = "camelCase")]
pub enum CaptureMode {
    Screen(String),
    Window(String),
}
```
Note: `rename_all = "camelCase"` still works here because `"Screen"` → `"screen"` and `"Window"` → `"window"` happen to be the same in both camelCase and snake_case for single-word variants.
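
To make the difference between the two representations concrete from the sending side, here is a minimal TypeScript sketch of what an externally tagged decoder accepts. `parseCaptureMode` is a hypothetical helper written for illustration only; it is not part of Cap or serde and merely mimics the shape check described in this comment.

```typescript
// Hypothetical illustration: mimics how an externally tagged enum is matched.
type CaptureMode = { screen: string } | { window: string };

function parseCaptureMode(value: unknown): CaptureMode {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    throw new Error("captureMode must be a JSON object");
  }
  const entries = Object.entries(value as Record<string, unknown>);
  // Externally tagged: exactly one key, and that key names the variant.
  if (entries.length !== 1) {
    throw new Error("expected exactly one variant key");
  }
  const [variant, payload] = entries[0];
  if (typeof payload !== "string") {
    throw new Error(`variant "${variant}" must wrap a string`);
  }
  if (variant === "screen") return { screen: payload };
  if (variant === "window") return { window: payload };
  throw new Error(`unknown variant "${variant}"`);
}

// Externally tagged payload: accepted.
const ok = parseCaptureMode(JSON.parse('{"screen": "Built-in Display"}'));

// Internally tagged payload: rejected, because "type" is not a variant name here.
let rejected = false;
try {
  parseCaptureMode(JSON.parse('{"type": "screen"}'));
} catch {
  rejected = true;
}
```

The point of the sketch is only that the variant name is the object key; a `type` discriminator has no place to carry the bare string.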
    DeepLinkAction::PauseRecording => {
        let state = app.state::<ArcLock<App>>();
        let app_lock = state.read().await;

        let recording = app_lock
            .current_recording()
            .ok_or_else(|| "No active recording to pause".to_string())?;

        recording
            .pause()
            .await
            .map_err(|e| format!("Failed to pause recording: {e}"))
    }

    // ----------------------------------------------------------------
    // Resume Recording
    // ----------------------------------------------------------------
    DeepLinkAction::ResumeRecording => {
        let state = app.state::<ArcLock<App>>();
        let app_lock = state.read().await;

        let recording = app_lock
            .current_recording()
            .ok_or_else(|| "No active recording to resume".to_string())?;

        let is_paused = recording
            .is_paused()
            .await
            .map_err(|e| format!("Failed to query pause state: {e}"))?;

        if !is_paused {
            return Err("Recording is not currently paused".to_string());
        }

        recording
            .resume()
            .await
            .map_err(|e| format!("Failed to resume recording: {e}"))
    }

    // ----------------------------------------------------------------
    // Toggle Pause / Resume
    // ----------------------------------------------------------------
    DeepLinkAction::TogglePauseRecording => {
        let state = app.state::<ArcLock<App>>();
        let app_lock = state.read().await;

        let recording = app_lock
            .current_recording()
            .ok_or_else(|| "No active recording".to_string())?;

        let is_paused = recording
            .is_paused()
            .await
            .map_err(|e| format!("Failed to query pause state: {e}"))?;

        if is_paused {
            recording
                .resume()
                .await
                .map_err(|e| format!("Failed to resume recording: {e}"))
        } else {
            recording
                .pause()
                .await
                .map_err(|e| format!("Failed to pause recording: {e}"))
        }
    }
Path: apps/desktop/src-tauri/src/deeplink_actions.rs
Line: 226-293
Comment:
**Missing `RecordingEvent` emissions after pause/resume/toggle**
The existing Tauri commands for these operations always emit a `RecordingEvent` so the UI can react to the state change:
```rust
// from recording.rs
RecordingEvent::Paused.emit(&app).ok(); // after pause
RecordingEvent::Resumed.emit(&app).ok(); // after resume
```
The three new deep link arms (`PauseRecording`, `ResumeRecording`, `TogglePauseRecording`) perform the operation but never emit these events. Any UI component or listener subscribed to `RecordingEvent` will not be notified when the action arrives via deep link, causing the displayed pause state to become stale.
Each arm should emit the corresponding event after a successful operation, e.g.:
```rust
DeepLinkAction::PauseRecording => {
    let state = app.state::<ArcLock<App>>();
    let app_lock = state.read().await;
    let recording = app_lock
        .current_recording()
        .ok_or_else(|| "No active recording to pause".to_string())?;
    recording
        .pause()
        .await
        .map_err(|e| format!("Failed to pause recording: {e}"))?;
    RecordingEvent::Paused.emit(app).ok();
    Ok(())
}
```
The same pattern applies to `ResumeRecording` (`RecordingEvent::Resumed`) and `TogglePauseRecording` (branch on `is_paused` to emit the correct event).
I've pushed the requested changes to address the feedback:

- Fixed Serde Serialization: Removed the incompatible tag = "type" from CaptureMode to ensure correct deserialization of screen/window payloads.
- UI State Sync: Added RecordingEvent emissions (Paused / Resumed) in all relevant deep link handlers. This ensures the recording overlay and UI components reflect the correct state immediately after a deep link action.

Everything is ready for a final review!

/claim #1540
PR: feat(deeplink): add recording controls and raycast extension #1540

Summary

This PR resolves Bounty #1540 by extending Cap's existing `cap-desktop://` deep link infrastructure with full recording lifecycle controls and building a companion Raycast extension.

Changes

🦀 Rust: `apps/desktop/src-tauri/src/deeplink_actions.rs`

Extended the `DeepLinkAction` enum with five new variants:

- `PauseRecording`: `cap-desktop://action?value={"type":"pauseRecording"}`
- `ResumeRecording`: `cap-desktop://action?value={"type":"resumeRecording"}`
- `TogglePauseRecording`: `cap-desktop://action?value={"type":"togglePauseRecording"}`
- `SwitchMicrophone { label }`: `cap-desktop://action?value={"type":"switchMicrophone","label":"<name>"}`
- `SwitchCamera { id }`: `cap-desktop://action?value={"type":"switchCamera","id":"<deviceId>"}`

Implementation notes:

- `execute()` arms use `app.state::<ArcLock<App>>().read().await`, with no blocking calls or `.unwrap()`.
- `TogglePauseRecording` queries `is_paused()` and branches cleanly with the `?` operator.
- `SwitchMicrophone` and `SwitchCamera` delegate to the existing `crate::set_mic_input()` / `crate::set_camera_input()` functions, keeping logic DRY.
- `#[serde(rename_all = "camelCase")]` and `#[serde(tag = "type")]`, consistent with the existing URL parser.

🔌 TypeScript: `extensions/raycast/`

New Raycast extension with two commands:

- `cap-control` (Recording Controls)
- `switch-device` (Switch Input Device)

- […] `LocalStorage`, never hard-coded).
- […] `https://api.cap.so/v1/devices/*`.
- […] `null` label/id to mute the input.

Testing Instructions

Deep Links (Rust)

After building the app (`pnpm turbo build --filter @cap/desktop`), test each action from Terminal:

Verified states to test:

Raycast Extension

Checklist

- […] `.unwrap()` calls in Rust
- […] `#[serde(rename_all = "camelCase")]`
- […] `LocalStorage`, never hard-coded
- […] `// Fix for Issue #1540 - Deep Links & Raycast Support`
- […] `cap-desktop://action?value=<JSON>` URL schema
- […] `StartRecording`, `StopRecording`, `OpenEditor`, `OpenSettings` variants unchanged

Closes #1540

/claim #1540
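
The deep link URLs in the PR description all share one shape: a JSON action passed in the `value` query parameter. As a rough sketch, a caller such as the Raycast extension could build them as follows. `buildDeepLink` and the `DeepLinkAction` type are illustrative names, and percent-encoding the JSON is an assumption about what the URL parser accepts, not something the PR states. `switchCamera` is left out here because its `id` format is disputed in the review.

```typescript
// Illustrative action shapes, mirroring variants listed in the PR description.
type DeepLinkAction =
  | { type: "pauseRecording" }
  | { type: "resumeRecording" }
  | { type: "togglePauseRecording" }
  | { type: "switchMicrophone"; label: string | null };

const SCHEME = "cap-desktop://action";

function buildDeepLink(action: DeepLinkAction): string {
  // Assumption: the receiver percent-decodes the query string before parsing JSON.
  return `${SCHEME}?value=${encodeURIComponent(JSON.stringify(action))}`;
}

const pauseUrl = buildDeepLink({ type: "pauseRecording" });
const micUrl = buildDeepLink({
  type: "switchMicrophone",
  label: "MacBook Pro Microphone", // example label, not taken from the PR
});
```

Typing the action as a discriminated union keeps a misspelled `type` value from compiling, which is cheaper than discovering it as a parse failure on the Rust side.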
Greptile Summary

This PR extends Cap's `cap-desktop://` deep link system with five new recording lifecycle actions (`PauseRecording`, `ResumeRecording`, `TogglePauseRecording`, `SwitchMicrophone`, `SwitchCamera`) and adds a companion Raycast extension with two commands. The overall architecture is solid (actions are well-structured, follow existing patterns, and avoid `.unwrap()` calls), but three functional bugs need to be fixed before this is ready to merge.

Key issues found:

- `CaptureMode` serde attribute mismatch (P1): The `#[serde(tag = "type")]` attribute added to `CaptureMode` is incompatible with the `Screen(String)` / `Window(String)` newtype variants. Serde's internally tagged mode cannot merge a `type` discriminator into a bare JSON string. The TypeScript format `{"screen": "Built-in Display"}` (externally tagged) won't deserialize with this attribute present, breaking every `startRecording` deep link. The fix is to remove `tag = "type"` from `CaptureMode` only.
- Missing `RecordingEvent` emissions (P1): The new `PauseRecording`, `ResumeRecording`, and `TogglePauseRecording` deep link arms perform the operation but never emit `RecordingEvent::Paused` / `RecordingEvent::Resumed`. The existing Tauri commands always emit these events so the UI can update its pause indicator. Omitting them here leaves the UI state stale after a deep-link-triggered pause or resume.
- `SwitchCamera` type mismatch (P1): The Raycast extension sends `id: cam.deviceId`, a plain `string`, but Rust's `DeviceOrModelID` is an externally tagged enum (`{"DeviceID": "..."}` or `{"ModelID": {...}}`). A bare string will fail deserialization, silently rejecting every camera switch request from the extension.
- Missing error handling in `handleDisableMic` / `handleDisableCam` (P2): These helpers call `open()` without `try/catch`, unlike the `sendSwitch` utility used for all other device actions.

Confidence Score: 4/5

Not safe to merge: three P1 functional bugs cause `startRecording` deep links to fail, UI state desync on pause/resume, and broken camera switching from Raycast.

Three separate P1 issues cause core functionality to fail at runtime: the `CaptureMode` serde tag breaks `startRecording` deserialization, missing `RecordingEvent` emissions desync the UI, and the camera-ID type mismatch silently rejects every `switchCamera` request from the extension. Each is a targeted, well-scoped fix, so the score remains 4 rather than lower; the overall structure is sound and no data loss or security risks are present.

`apps/desktop/src-tauri/src/deeplink_actions.rs` (`CaptureMode` tag + `RecordingEvent` emissions) and `extensions/raycast/src/switch-device.tsx` (`DeviceOrModelID` format).
Important Files Changed
Sequence Diagram

    sequenceDiagram
        participant User
        participant Raycast as Raycast Extension
        participant OS as macOS open()
        participant Cap as Cap Desktop (Tauri)
        participant UI as Cap UI

        Note over Raycast: cap-control.tsx
        User->>Raycast: Select "Pause Recording"
        Raycast->>OS: open(cap-desktop://action?value={"type":"pauseRecording"})
        OS->>Cap: deep link event
        Cap->>Cap: DeepLinkAction::try_from(&url)
        Cap->>Cap: PauseRecording.execute()
        Cap->>Cap: recording.pause().await
        Note over Cap,UI: ⚠️ RecordingEvent::Paused NOT emitted — UI stays stale

        Note over Raycast: switch-device.tsx
        User->>Raycast: Select camera "HD Webcam"
        Raycast->>OS: open(cap-desktop://action?value={"type":"switchCamera","id":"device-123"})
        OS->>Cap: deep link event
        Cap->>Cap: serde_json::from_str — id="device-123"
        Note over Cap: ⚠️ DeviceOrModelID expects {"DeviceID":"device-123"} — parse fails

        Note over Raycast: cap-control.tsx
        User->>Raycast: Select "Start Recording"
        Raycast->>OS: open(cap-desktop://action?value={"type":"startRecording","captureMode":{"screen":"Built-in Display"},...})
        OS->>Cap: deep link event
        Cap->>Cap: serde_json::from_str — CaptureMode {"screen":"..."}
        Note over Cap: ⚠️ tag="type" on CaptureMode breaks deserialization

Comments Outside Diff (2)
extensions/raycast/src/switch-device.tsx, line 739-743

`switchCamera` sends a plain string `id` but Rust expects a tagged `DeviceOrModelID`

`cam.deviceId` is a plain `string` from the API response, so the JSON sent over the deep link is:

    {"type": "switchCamera", "id": "some-device-id"}

On the Rust side, `SwitchCamera { id: Option<DeviceOrModelID> }` uses:

With default serde derive this is an externally tagged enum: it deserializes from `{"DeviceID": "some-device-id"}`, not from a bare string. Sending a bare string will cause serde to return a `ParseFailed` error and the camera switch will silently do nothing.

The TypeScript type and JSON payload need to match the Rust enum structure. Assuming device IDs from the API are always device IDs (not model IDs), the action should be:

And the TypeScript type for the `id` field should be updated accordingly:
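
A minimal sketch of a payload in the externally tagged shape follows. It assumes the variant name `DeviceID` quoted in this review and models only that variant; the `ModelID` fields are not shown anywhere in the review, so they are left out. `switchCameraAction` is a hypothetical helper name.

```typescript
// Assumed wire shape for the Rust `DeviceOrModelID` enum (externally tagged).
// Only the `DeviceID` variant is modelled; `ModelID` is omitted because its
// fields are not shown in the review.
type DeviceOrModelID = { DeviceID: string };

type SwitchCameraAction = {
  type: "switchCamera";
  id: DeviceOrModelID | null; // null disables the camera, per the PR description
};

// Hypothetical helper: wraps a plain device ID string in the tagged form.
function switchCameraAction(deviceId: string | null): SwitchCameraAction {
  return {
    type: "switchCamera",
    id: deviceId === null ? null : { DeviceID: deviceId },
  };
}

const payload = JSON.stringify(switchCameraAction("some-device-id"));
const disablePayload = JSON.stringify(switchCameraAction(null));
```

Keeping the wrapping in one function means the list UI can continue to pass around plain `deviceId` strings from the API response.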
extensions/raycast/src/switch-device.tsx, line 924-934

Missing error handling in `handleDisableMic` / `handleDisableCam`

Unlike `sendSwitch()`, which wraps `open()` in a `try/catch` and shows a failure toast on error, `handleDisableMic` and `handleDisableCam` call `open()` directly without any error handling. If the `open()` call rejects (e.g., Cap is not installed), the error will be an unhandled promise rejection.

Consider extracting a shared helper for the disable paths, or at minimum add a `try/catch`:

The same applies to `handleDisableCam`.
Reviews (1): Last reviewed commit: "<!-- Fix for Issue #1540 - Deep Links & ..."