
feat(cli): improve error messages for endpoint creation failures#230

Merged
blainekasten merged 5 commits into main from nikitha/mle-3108-improve-endpoint-cli-errors
Feb 9, 2026

Conversation

@atihkin atihkin commented Feb 2, 2026

  • Add specific error messages for hardware configuration issues:

    • Hardware not compatible with model
    • Hardware unavailable (no capacity)
    • Insufficient capacity for replicas
    • Hardware available but config not supported (suggests toggling speculative decoding)
  • Add client-side validation with clear errors:

    • min/max replicas: must be non-negative, min <= max
    • gpu-count: must be 1, 2, 4, or 8
    • availability-zone: validates against available zones
  • Improve API error handling in decorator:

    • Endpoint not found: suggests listing endpoints
    • Permission denied: clear access error message
    • Authentication errors: clear message

Fixes MLE-3108
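
The client-side checks listed above could be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code from this PR: the function name `validate_create_args` and its parameters are hypothetical, and the real command reportedly relies on Click types for part of this.

```python
from typing import List, Optional, Sequence

# GPU counts listed in the PR description.
ALLOWED_GPU_COUNTS = (1, 2, 4, 8)


def validate_create_args(
    min_replicas: int,
    max_replicas: int,
    gpu_count: int,
    availability_zone: Optional[str] = None,
    known_zones: Sequence[str] = (),
) -> List[str]:
    """Hypothetical helper: return readable problems; an empty list means valid."""
    problems: List[str] = []
    if min_replicas < 0 or max_replicas < 0:
        problems.append("min/max replicas must be non-negative")
    elif min_replicas > max_replicas:
        problems.append(
            f"min replicas ({min_replicas}) must be <= max replicas ({max_replicas})"
        )
    if gpu_count not in ALLOWED_GPU_COUNTS:
        problems.append(f"gpu-count must be one of {ALLOWED_GPU_COUNTS}, got {gpu_count}")
    # Only check the zone when a list of known zones is available.
    if availability_zone is not None and known_zones and availability_zone not in known_zones:
        problems.append(f"unknown availability zone '{availability_zone}'")
    return problems
```

Collecting every problem before reporting lets the CLI show all input mistakes in one run instead of one per attempt.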


Note

Medium Risk
Touches multiple CLI commands and changes error-handling/printing paths, which can affect user-facing output and exit behavior, but does not modify core API request logic or sensitive auth flows.

Overview
Improves the endpoints CLI UX by adding an endpoint-specific APIError handler that turns common failures (endpoint not found, permission denied, auth issues) into clearer actionable messages across all endpoints subcommands.
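
As a rough sketch of what such a handler might look like: the decorator name `handle_endpoint_api_errors` comes from this PR, but everything else here is an assumption for illustration. `APIError` is a stand-in class rather than the SDK's own, the keyword used to detect authentication failures is a guess, and the message wording is placeholder text.

```python
import functools
import sys
from typing import Any, Callable


class APIError(Exception):
    """Stand-in for the SDK's API error type (hypothetical)."""


def classify_endpoint_error(message: str) -> str:
    """Map an error message to a coarse category by keyword matching."""
    lower = message.lower()
    if "not found" in lower and "endpoint" in lower:
        return "not_found"
    if "permission" in lower or "forbidden" in lower or "unauthorized" in lower:
        return "permission"
    if "authentication" in lower:  # assumed keyword, not taken from the PR
        return "auth"
    return "generic"


# Placeholder wording; the messages in the PR may differ.
HINTS = {
    "not_found": "Endpoint not found. Use 'together endpoints list' to see your endpoints.",
    "permission": "You don't have permission to access this resource.",
    "auth": "Authentication failed.",
    "generic": "API request failed.",
}


def handle_endpoint_api_errors(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a command so API errors become short messages on stderr."""

    @functools.wraps(fn)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        try:
            return fn(*args, **kwargs)
        except APIError as exc:
            kind = classify_endpoint_error(str(exc))
            print(f"Failed: {HINTS[kind]}", file=sys.stderr)
            if kind == "generic":
                # Keep the raw message when no specific hint applies.
                print(str(exc), file=sys.stderr)
            raise SystemExit(1) from exc

    return wrapper
```

Keyword matching on message text is brittle; matching on status codes or structured error fields would be sturdier where the API provides them.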

Adds client-side validation to together endpoints create (non-negative replicas via IntRange, min_replicas <= max_replicas, and optional availability-zone validation) and provides more targeted creation failure messaging for invalid hardware and invalid/unsupported models.

Updates endpoint output formatting so autoscaling is shown as Min Replicas/Max Replicas, and suppresses autoscaling fields when listing endpoints unless --mine is used (also populating display/hardware/autoscaling from list item model_extra when available).
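
The listing behavior described above might look something like this sketch. It uses plain dictionaries in place of the SDK's response models, and the function name `endpoint_rows` and the exact key names (`display_name`, `hardware`, `autoscaling`) are assumptions based on the field names mentioned in this PR.

```python
from typing import Any, Dict, List


def endpoint_rows(item: Dict[str, Any], mine: bool) -> List[str]:
    """Hypothetical formatter: build display rows for one list item."""
    # In the SDK this would be the model's extra fields; a dict stands in here.
    extra = item.get("model_extra") or {}
    rows = [f"Name: {item.get('name', '')}"]
    if extra.get("display_name"):
        rows.append(f"Display Name: {extra['display_name']}")
    if extra.get("hardware"):
        rows.append(f"Hardware: {extra['hardware']}")
    autoscaling = extra.get("autoscaling")
    # Autoscaling is only shown when listing the caller's own endpoints.
    if mine and autoscaling:
        rows.append(f"Min Replicas: {autoscaling.get('min_replicas')}")
        rows.append(f"Max Replicas: {autoscaling.get('max_replicas')}")
    return rows
```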

Written by Cursor Bugbot for commit 80a569a.

@atihkin atihkin requested a review from blainekasten February 2, 2026 21:20
gpu: str,
gpu_count: int,
speculative_decoding_enabled: bool = False,
) -> None:
Contributor

I think all of these specific hardware errors should be implemented in the server code, not the SDK code. Then everyone on every version gets the benefits, as do other languages, curl users, etc.

Contributor

Would even be cool in cases where the hardware selection was invalid to maybe add a property in the response that lists valid options. Then the clients could just render that to the user directly
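
To make the suggestion concrete, here is a minimal sketch of how a client could render such a response. The payload shape is entirely hypothetical: no such `valid_options` field exists in the API as far as this thread shows, and the hardware IDs in the usage note are made up.

```python
import json
from typing import List


def render_hardware_error(body: str) -> List[str]:
    """Turn a hypothetical error payload into lines to show the user."""
    payload = json.loads(body)
    error = payload.get("error", {})
    lines = [error.get("message", "Request failed.")]
    # "valid_options" is an assumed field name, proposed only for illustration.
    options = error.get("valid_options") or []
    if options:
        lines.append("Valid hardware options:")
        lines.extend(f"  - {opt}" for opt in options)
    return lines
```

For example, a body of `{"error": {"message": "Invalid hardware.", "valid_options": ["hw-a", "hw-b"]}}` would yield the message followed by the two options as bullet lines, with no hardware knowledge baked into the client.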

Contributor Author

As discussed, until we add server-side validation, we are OK to keep this.

Comment thread src/together/lib/cli/api/endpoints/create.py Outdated
Comment thread src/together/lib/cli/api/endpoints/create.py Outdated
Comment thread src/together/lib/cli/api/endpoints/create.py
atihkin and others added 2 commits February 4, 2026 15:32
- Add specific error messages for hardware configuration issues:
  - Hardware not compatible with model
  - Hardware unavailable (no capacity)
  - Insufficient capacity for replicas
  - Hardware available but config not supported (suggests toggling speculative decoding)

- Add client-side validation with clear errors:
  - min/max replicas: must be non-negative, min <= max
  - gpu-count: must be 1, 2, 4, or 8
  - availability-zone: validates against available zones

- Improve API error handling in decorator:
  - Endpoint not found: suggests listing endpoints
  - Permission denied: clear access error message
  - Authentication errors: clear message

Fixes MLE-3108

Co-authored-by: Cursor <cursoragent@cursor.com>
- Display min_replicas and max_replicas as separate fields
- Only show autoscaling info when --mine flag is used
- Access autoscaling data from model_extra for list responses
- Show display_name and hardware for dedicated endpoints in list

Fixes MLE-2238

Co-authored-by: Cursor <cursoragent@cursor.com>
@atihkin atihkin force-pushed the nikitha/mle-3108-improve-endpoint-cli-errors branch from 5c3c901 to 6eb5bba Compare February 4, 2026 23:47
Comment thread src/together/lib/cli/api/_utils.py Outdated
Comment on lines +163 to +186
if "not found" in error_lower and "endpoint" in error_lower:
    endpoint_id = kwargs.get("endpoint_id", "")
    endpoint_display = f"'{endpoint_id}'" if endpoint_id else ""
    click.echo(prefix_styled + click.style("Failed", fg="red"))
    click.echo(
        prefix_styled + click.style(f"Endpoint {endpoint_display} not found.", fg="red")
    )
    click.echo(
        prefix_styled + "The endpoint may have been deleted or the ID may be incorrect.",
        err=True,
    )
    click.echo(
        prefix_styled + "Use 'together endpoints list' to see your endpoints.",
        err=True,
    )
elif "permission" in error_lower or "forbidden" in error_lower or "unauthorized" in error_lower:
    click.echo(prefix_styled + click.style("Failed", fg="red"))
    click.echo(
        prefix_styled + click.style("You don't have permission to access this resource.", fg="red")
    )
    click.echo(
        prefix_styled + "This may belong to another user or organization.",
        err=True,
    )
Contributor

This error catcher is meant to be very generalized. I don't think it's great to put endpoint-specific errors here. Is there any reason it has to be here instead of in the endpoints code?

Contributor Author

Fair, moved it to the endpoints code.

atihkin and others added 2 commits February 6, 2026 16:37
…points

Address PR feedback: keep handle_api_errors generic and move endpoint-specific
error handling (not found, permission, auth) into endpoints code.

- Simplify handle_api_errors in api/_utils.py to only show generic API errors
- Add endpoints/_utils.py with handle_endpoint_api_errors decorator
- Apply handle_endpoint_api_errors to all endpoint commands (create, list,
  retrieve, stop, start, delete, update, hardware, availability_zones)

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Comment thread src/together/lib/cli/api/endpoints/_utils.py Outdated
@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.


click.echo(
    prefix_styled + "Use 'together endpoints list' to see your endpoints.",
    err=True,
)

Error messages inconsistently split between stdout and stderr

Low Severity

The error handling in handle_endpoint_api_errors outputs the main error messages ("Failed", "Endpoint not found", "Permission denied", etc.) to stdout without err=True, while helper messages use err=True to output to stderr. This splits a single logical error across both streams. If a user redirects stdout, they'd see partial error info in the file and partial on terminal. This is inconsistent with the rest of the codebase where error messages in endpoint commands consistently use err=True.
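
One way to avoid the split described here is to route every line of an error through a single helper that always writes to the same stream. The sketch below uses only the standard library and a hypothetical helper name, `emit_error`. With Click, the equivalent would be passing `err=True` to each `click.echo` call.

```python
import sys
from typing import Iterable, Optional, TextIO


def emit_error(
    headline: str,
    hints: Iterable[str] = (),
    stream: Optional[TextIO] = None,
) -> None:
    """Hypothetical helper: write the headline and all hints to one stream."""
    # Resolve sys.stderr at call time so later redirection is respected.
    out = sys.stderr if stream is None else stream
    for line in (headline, *hints):
        print(line, file=out)
```

Because the headline and the hints share one code path, redirecting stdout can no longer separate them.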

Additional Locations (2)


@blainekasten blainekasten merged commit 0285a69 into main Feb 9, 2026
11 checks passed
@stainless-app stainless-app Bot mentioned this pull request Feb 9, 2026
@atihkin atihkin deleted the nikitha/mle-3108-improve-endpoint-cli-errors branch February 10, 2026 01:41
2 participants