202 changes: 191 additions & 11 deletions apps/desktop/src-tauri/src/deeplink_actions.rs
Original file line number Diff line number Diff line change
@@ -1,3 +1,15 @@
// Fix for Issue #1540 - Deep Links & Raycast Support
//
// Extends the existing DeepLinkAction enum with:
// - PauseRecording
// - ResumeRecording
// - TogglePauseRecording
// - SwitchMicrophone { label }
// - SwitchCamera { id }
//
// All new actions use idiomatic Rust error handling with `?`.
// No .unwrap() calls anywhere in this file.

use cap_recording::{
RecordingMode, feeds::camera::DeviceOrModelID, sources::screen_capture::ScreenCaptureTarget,
};
@@ -8,34 +20,73 @@ use tracing::trace;

use crate::{App, ArcLock, recording::StartRecordingInputs, windows::ShowCapWindow};

-#[derive(Debug, Serialize, Deserialize)]
-#[serde(rename_all = "snake_case")]
+// ---------------------------------------------------------------------------
+// CaptureMode helper
+// ---------------------------------------------------------------------------
+
+#[derive(Debug, Deserialize, Serialize)]
+#[serde(rename_all = "camelCase")]
pub enum CaptureMode {
Screen(String),
Window(String),
}
Comment on lines +27 to 32

P1 CaptureMode incompatible with #[serde(tag = "type")]

CaptureMode::Screen(String) and Window(String) are newtype variants wrapping a primitive (String). Serde's internally-tagged representation (tag = "type") requires newtype variants to wrap a struct/map-like type; it cannot be used with primitives because there is no way to merge {"type": "screen"} with a bare JSON string value.

With this attribute, serde accepts the enum at compile time (only multi-field tuple variants are rejected there), but serializing or deserializing Screen or Window returns an error at runtime, so a startRecording deep link payload like the following fails to parse:

{"captureMode": {"screen": "Built-in Display"}}

That format is externally-tagged (the variant name is the JSON key), which requires no tag attribute. The fix is to remove tag = "type" from CaptureMode:

Suggested change
-#[derive(Debug, Deserialize, Serialize)]
-#[serde(rename_all = "camelCase", tag = "type")]
-pub enum CaptureMode {
-Screen(String),
-Window(String),
-}
+#[derive(Debug, Deserialize, Serialize)]
+#[serde(rename_all = "camelCase")]
+pub enum CaptureMode {
+Screen(String),
+Window(String),
+}

Note: rename_all = "camelCase" still works here because "Screen" → "screen" and "Window" → "window" are the same in both camelCase and snake_case for single-word variants.


-#[derive(Debug, Serialize, Deserialize)]
-#[serde(rename_all = "snake_case")]
+// ---------------------------------------------------------------------------
+// The main action enum: all variants are (de)serializable from JSON so the
+// URL parser (`TryFrom<&Url>`) can hydrate them from ?value=<JSON>.
+// ---------------------------------------------------------------------------
+
+#[derive(Debug, Deserialize, Serialize)]
+#[serde(rename_all = "camelCase", tag = "type")]
pub enum DeepLinkAction {
/// Start a new recording session.
StartRecording {
capture_mode: CaptureMode,
camera: Option<DeviceOrModelID>,
mic_label: Option<String>,
capture_system_audio: bool,
mode: RecordingMode,
},

/// Stop the active recording session.
StopRecording,

/// Pause the active recording. Returns an error if no recording is active.
PauseRecording,

/// Resume a paused recording. Returns an error if recording is not paused.
ResumeRecording,

/// Toggle between paused and recording states.
TogglePauseRecording,

/// Switch the active microphone. Pass `None` to mute/disable the mic.
SwitchMicrophone {
label: Option<String>,
},

/// Switch the active camera. Pass `None` to disable the camera.
SwitchCamera {
id: Option<DeviceOrModelID>,
},

/// Open the Cap editor for a given project path.
OpenEditor {
project_path: PathBuf,
},

/// Navigate to a Settings page.
OpenSettings {
page: Option<String>,
},
}

// ---------------------------------------------------------------------------
// URL → Action parsing
// ---------------------------------------------------------------------------

pub fn handle(app_handle: &AppHandle, urls: Vec<Url>) {
-trace!("Handling deep actions for: {:?}", &urls);
+trace!("Handling deep link actions for: {:?}", &urls);

let actions: Vec<_> = urls
.into_iter()
@@ -49,7 +100,7 @@ pub fn handle(app_handle: &AppHandle, urls: Vec<Url>) {
ActionParseFromUrlError::Invalid => {
eprintln!("Invalid deep link format \"{}\"", &url)
}
-// Likely login action, not handled here.
+// Likely a login/auth action; handled elsewhere.
ActionParseFromUrlError::NotAction => {}
})
.ok()
@@ -70,6 +121,10 @@ pub fn handle(app_handle: &AppHandle, urls: Vec<Url>) {
});
}

// ---------------------------------------------------------------------------
// Parse error types
// ---------------------------------------------------------------------------

pub enum ActionParseFromUrlError {
ParseFailed(String),
Invalid,
@@ -80,6 +135,7 @@ impl TryFrom<&Url> for DeepLinkAction {
type Error = ActionParseFromUrlError;

fn try_from(url: &Url) -> Result<Self, Self::Error> {
// On macOS, a .cap file opened from Finder arrives as a file:// URL.
#[cfg(target_os = "macos")]
if url.scheme() == "file" {
return url
@@ -88,26 +144,38 @@ impl TryFrom<&Url> for DeepLinkAction {
.map_err(|_| ActionParseFromUrlError::Invalid);
}

// All programmatic deep links use the "action" domain:
// cap-desktop://action?value=<JSON>
match url.domain() {
-Some(v) if v != "action" => Err(ActionParseFromUrlError::NotAction),
-_ => Err(ActionParseFromUrlError::Invalid),
-}?;
+Some(v) if v != "action" => return Err(ActionParseFromUrlError::NotAction),
+_ => {}
+};

let params = url
.query_pairs()
.collect::<std::collections::HashMap<_, _>>();

let json_value = params
.get("value")
.ok_or(ActionParseFromUrlError::Invalid)?;

let action: Self = serde_json::from_str(json_value)
.map_err(|e| ActionParseFromUrlError::ParseFailed(e.to_string()))?;

Ok(action)
}
}

// ---------------------------------------------------------------------------
// Action execution
// ---------------------------------------------------------------------------

impl DeepLinkAction {
pub async fn execute(self, app: &AppHandle) -> Result<(), String> {
match self {
// ----------------------------------------------------------------
// Start Recording
// ----------------------------------------------------------------
DeepLinkAction::StartRecording {
capture_mode,
camera,
@@ -125,12 +193,12 @@ impl DeepLinkAction {
.into_iter()
.find(|(s, _)| s.name == name)
.map(|(s, _)| ScreenCaptureTarget::Display { id: s.id })
-.ok_or(format!("No screen with name \"{}\"", &name))?,
+.ok_or_else(|| format!("No screen with name \"{}\"", &name))?,
CaptureMode::Window(name) => cap_recording::screen_capture::list_windows()
.into_iter()
.find(|(w, _)| w.name == name)
.map(|(w, _)| ScreenCaptureTarget::Window { id: w.id })
-.ok_or(format!("No window with name \"{}\"", &name))?,
+.ok_or_else(|| format!("No window with name \"{}\"", &name))?,
};

let inputs = StartRecordingInputs {
@@ -144,12 +212,124 @@ impl DeepLinkAction {
.await
.map(|_| ())
}

// ----------------------------------------------------------------
// Stop Recording
// ----------------------------------------------------------------
DeepLinkAction::StopRecording => {
crate::recording::stop_recording(app.clone(), app.state()).await
}

// ----------------------------------------------------------------
// Pause Recording
// ----------------------------------------------------------------
DeepLinkAction::PauseRecording => {
let state = app.state::<ArcLock<App>>();
let app_lock = state.read().await;

let recording = app_lock
.current_recording()
.ok_or_else(|| "No active recording to pause".to_string())?;

recording
.pause()
.await
.map_err(|e| format!("Failed to pause recording: {e}"))?;

crate::recording::RecordingEvent::Paused.emit(app).ok();
Ok(())
}

// ----------------------------------------------------------------
// Resume Recording
// ----------------------------------------------------------------
DeepLinkAction::ResumeRecording => {
let state = app.state::<ArcLock<App>>();
let app_lock = state.read().await;

let recording = app_lock
.current_recording()
.ok_or_else(|| "No active recording to resume".to_string())?;

let is_paused = recording
.is_paused()
.await
.map_err(|e| format!("Failed to query pause state: {e}"))?;

if !is_paused {
return Err("Recording is not currently paused".to_string());
}

recording
.resume()
.await
.map_err(|e| format!("Failed to resume recording: {e}"))?;

crate::recording::RecordingEvent::Resumed.emit(app).ok();
Ok(())
}

// ----------------------------------------------------------------
// Toggle Pause / Resume
// ----------------------------------------------------------------
DeepLinkAction::TogglePauseRecording => {
let state = app.state::<ArcLock<App>>();
let app_lock = state.read().await;

let recording = app_lock
.current_recording()
.ok_or_else(|| "No active recording".to_string())?;

let is_paused = recording
.is_paused()
.await
.map_err(|e| format!("Failed to query pause state: {e}"))?;

if is_paused {
recording
.resume()
.await
.map_err(|e| format!("Failed to resume recording: {e}"))?;

crate::recording::RecordingEvent::Resumed.emit(app).ok();
Ok(())
} else {
recording
.pause()
.await
.map_err(|e| format!("Failed to pause recording: {e}"))?;

crate::recording::RecordingEvent::Paused.emit(app).ok();
Ok(())
}
}
Comment on lines +226 to +305

P1 Missing RecordingEvent emissions after pause/resume/toggle

The existing Tauri commands for these operations always emit a RecordingEvent so the UI can react to the state change:

// from recording.rs
RecordingEvent::Paused.emit(&app).ok();   // after pause
RecordingEvent::Resumed.emit(&app).ok();  // after resume

The three new deep link arms (PauseRecording, ResumeRecording, TogglePauseRecording) perform the operation but never emit these events. Any UI component or listener subscribed to RecordingEvent will not be notified when the action arrives via deep link, causing the displayed pause state to become stale.

Each arm should emit the corresponding event after a successful operation, e.g.:

DeepLinkAction::PauseRecording => {
    let state = app.state::<ArcLock<App>>();
    let app_lock = state.read().await;

    let recording = app_lock
        .current_recording()
        .ok_or_else(|| "No active recording to pause".to_string())?;

    recording
        .pause()
        .await
        .map_err(|e| format!("Failed to pause recording: {e}"))?;

    RecordingEvent::Paused.emit(app).ok();
    Ok(())
}

The same pattern applies to ResumeRecording (RecordingEvent::Resumed) and TogglePauseRecording (branch on is_paused to emit the correct event).
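
The pairing described above, one event per successful state change, can be sketched in isolation. This is a simplified, standard-library-only model under stated assumptions: `Event`, `Recording`, and `toggle_pause` are hypothetical stand-ins, not Cap's actual `RecordingEvent` or recording handle, and the `Vec<Event>` stands in for the Tauri event emitter.

```rust
/// Stand-in for Cap's `RecordingEvent` (hypothetical, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Event {
    Paused,
    Resumed,
}

/// Minimal model of a recording that only tracks its pause state.
struct Recording {
    paused: bool,
}

impl Recording {
    fn pause(&mut self) -> Result<(), String> {
        if self.paused {
            return Err("Recording is already paused".to_string());
        }
        self.paused = true;
        Ok(())
    }

    fn resume(&mut self) -> Result<(), String> {
        if !self.paused {
            return Err("Recording is not currently paused".to_string());
        }
        self.paused = false;
        Ok(())
    }
}

/// Toggle the pause state and record the matching event only on success.
/// If `pause`/`resume` fails, `?` returns early and nothing is emitted.
fn toggle_pause(recording: &mut Recording, emitted: &mut Vec<Event>) -> Result<(), String> {
    let event = if recording.paused {
        recording.resume()?;
        Event::Resumed
    } else {
        recording.pause()?;
        Event::Paused
    };
    emitted.push(event);
    Ok(())
}

fn main() -> Result<(), String> {
    let mut recording = Recording { paused: false };
    let mut emitted = Vec::new();
    toggle_pause(&mut recording, &mut emitted)?;
    toggle_pause(&mut recording, &mut emitted)?;
    println!("{emitted:?}");
    Ok(())
}
```

Choosing the event inside the same branch that performs the operation keeps the two from drifting apart, which is the failure mode this comment is about.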


// ----------------------------------------------------------------
// Switch Microphone
// ----------------------------------------------------------------
DeepLinkAction::SwitchMicrophone { label } => {
let state = app.state::<ArcLock<App>>();
crate::set_mic_input(state, label).await
}

// ----------------------------------------------------------------
// Switch Camera
// ----------------------------------------------------------------
DeepLinkAction::SwitchCamera { id } => {
let state = app.state::<ArcLock<App>>();
crate::set_camera_input(app.clone(), state, id, None).await
}

// ----------------------------------------------------------------
// Open Editor
// ----------------------------------------------------------------
DeepLinkAction::OpenEditor { project_path } => {
crate::open_project_from_path(Path::new(&project_path), app.clone())
}

// ----------------------------------------------------------------
// Open Settings
// ----------------------------------------------------------------
DeepLinkAction::OpenSettings { page } => {
crate::show_window(app.clone(), ShowCapWindow::Settings { page }).await
}
71 changes: 71 additions & 0 deletions extensions/raycast/README.md
@@ -0,0 +1,71 @@
# Cap - Raycast Extension
<!-- Fix for Issue #1540 - Deep Links & Raycast Support -->

Control your [Cap](https://cap.so) screen recording sessions directly from Raycast, no mouse required.

## Commands

| Command | Description |
|---|---|
| **Cap: Recording Controls** | Start, stop, pause, resume, or toggle your recording session |
| **Cap: Switch Input Device** | Switch the active microphone or camera |

## Requirements

- **Cap for macOS** installed from [cap.so](https://cap.so)
- A **Cap API key** (only required for the device switcher; find it in Cap Settings → Developer)

## How It Works

Both commands build a `cap-desktop://action?value=<JSON>` deep link and call `open()` to hand off control to the Cap desktop app. Cap handles all state transitions; this extension stays stateless.

### URL Schema

```
cap-desktop://action?value=<URL-encoded JSON>
```

| Action | JSON |
|---|---|
| Start Recording | `{"type":"startRecording","captureMode":{"screen":"Built-in Display"},...}` |
| Stop Recording | `{"type":"stopRecording"}` |
| Pause Recording | `{"type":"pauseRecording"}` |
| Resume Recording | `{"type":"resumeRecording"}` |
| Toggle Pause | `{"type":"togglePauseRecording"}` |
| Switch Microphone | `{"type":"switchMicrophone","label":"MacBook Pro Microphone"}` |
| Switch Camera | `{"type":"switchCamera","id":"<deviceId>"}` |
| Disable Microphone | `{"type":"switchMicrophone","label":null}` |
| Disable Camera | `{"type":"switchCamera","id":null}` |
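
To make the URL-encoding step concrete, here is a minimal Rust sketch that builds such a link using only the standard library. It is independent of the extension's own code: `percent_encode` and `build_deep_link` are hypothetical helper names, and in practice a built-in URL encoder would normally do this job.

```rust
/// Percent-encode every byte except the RFC 3986 unreserved characters.
/// Hypothetical helper for illustration; not part of Cap or the extension.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len() * 3);
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Wrap a JSON payload in the `cap-desktop://action?value=...` form.
fn build_deep_link(json: &str) -> String {
    format!("cap-desktop://action?value={}", percent_encode(json))
}

fn main() {
    let link = build_deep_link(r#"{"type":"pauseRecording"}"#);
    // prints cap-desktop://action?value=%7B%22type%22%3A%22pauseRecording%22%7D
    println!("{link}");
}
```

Encoding the whole JSON value matters because characters such as `"`, `{`, and `:` would otherwise be ambiguous inside a query string.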

## Setup

### Cap: Recording Controls
No setup required. Just invoke the command and select an action.

### Cap: Switch Input Device
1. Open the command in Raycast.
2. On first use you'll be prompted to enter your Cap API key.
3. The key is stored in Raycast's encrypted local storage and is never sent anywhere except the Cap API.
4. Select a microphone or camera to switch. Cap will activate the chosen device immediately.

## Security

- API keys are stored in **Raycast's `LocalStorage`** (encrypted, sandboxed per extension).
- No credentials are hard-coded or logged.
- Deep links only communicate with the locally running Cap app.

## Development

```bash
cd extensions/raycast
npm install
npm run dev # Hot-reload development mode
npm run build # Production build
npm run lint # ESLint check
```

## Related

- [Cap GitHub Repository](https://github.com/CapSoftware/Cap)
- [Issue #1540 - Bounty: Deeplinks support + Raycast Extension](https://github.com/CapSoftware/Cap/issues/1540)
- [Raycast Developer Documentation](https://developers.raycast.com)