Skip to content

feat: support branch in uri#6193

Draft
hamersaw wants to merge 2 commits intolance-format:mainfrom
hamersaw:feature/support-branch-in-uri
Draft

feat: support branch in uri#6193
hamersaw wants to merge 2 commits intolance-format:mainfrom
hamersaw:feature/support-branch-in-uri

Conversation

@hamersaw
Copy link
Contributor

No description provided.

Signed-off-by: Daniel Rammer <hamersaw@protonmail.com>
@hamersaw hamersaw changed the title feat: upport branch in uri feat: support branch in uri Mar 13, 2026
@github-actions github-actions bot added enhancement New feature or request python labels Mar 13, 2026
@github-actions
Copy link
Contributor

PR Review: feat: support branch in uri

Summary

Adds ?branch=<name> query parameter support to dataset URIs, splitting the internal physical URI (<root>/tree/<branch>) from a user-facing display URI (<root>?branch=<name>). The display URI round-trips through open/write correctly.

Issues

P0: unwrap_or_default() silently produces empty URI in commit path

In commit.rs line ~397 (the WriteDestination::Uri(_) arm):

clean_uri is Option<String> that is Some only when self.dest is WriteDestination::Uri. Since this code is inside the WriteDestination::Uri(_) match arm, it should always be Some. However, if by refactoring this invariant is accidentally broken, unwrap_or_default() would silently create a Dataset with an empty uri field, which would be very hard to debug. Consider using .expect("clean_uri must be Some for WriteDestination::Uri") or restructuring to avoid the Option entirely (e.g., compute clean_uri inside the match arm where it's needed, rather than above both arms).

P1: parse_branch_from_uri is called redundantly in insert.rs

In insert.rs resolve_context(), the branch is parsed/stripped from the URI to create object_store/base_path/commit_handler (lines ~341-358). But then immediately below (lines ~366-373), DatasetBuilder::from_uri(uri) is called with the original URI, which internally calls parse_branch_from_uri again and resolves the branch. When the dataset exists, the resolved dataset's store/path/handler then override the ones just computed above (lines ~376-380). This means the initial object store resolution (lines ~346-352) is wasted work for existing datasets. Consider restructuring to avoid the redundant parse and object store creation.

P1: No URL-encoding of branch names in make_display_uri

make_display_uri formats branch names directly into the URI without URL-encoding. Branch names with special characters (e.g., spaces, &, ?, =, #) will produce malformed URIs. While parse_branch_from_uri does handle URL-decoding on the Url::parse path (via query_pairs()), the fallback path for local URIs does not decode, creating an asymmetry. The display URI should URL-encode the branch name, or at minimum document that branch names must be URL-safe.

Minor notes

  • The #[allow(clippy::misnamed_getters)] on uri() is a reasonable pragmatic choice for API compat, with a good comment explaining the mismatch.
  • Good test coverage for the URI parsing, including edge cases (empty values, URL encoding, mixed params, local paths).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request python

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant