fix: handle multi-byte UTF-8 identifiers in NameGenerator::suggest_name #21793
Merged
A4-Tacks merged 1 commit into rust-lang:master on Mar 10, 2026
…ame` `split_numeric_suffix` used `rfind` to locate the last non-numeric character and then split at `pos + 1`. Since `rfind` returns a byte offset, this panics when the last non-numeric character is multi-byte (e.g. CJK identifiers like `日本語`). Use `str::ceil_char_boundary` to advance past the full character before splitting.
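The failure mode described above can be reproduced without triggering the panic. The following is a minimal sketch, not the rust-analyzer source: `naive_split_point` is a hypothetical helper, and the use of `char::is_numeric` as the "non-numeric" predicate is an assumption. It reports the byte offset that `rfind` returns and whether `pos + 1` is a valid split point.

```rust
/// Hypothetical helper (not part of rust-analyzer): returns the byte offset of
/// the last non-numeric character and whether `offset + 1` lies on a UTF-8
/// character boundary, i.e. whether `split_at(offset + 1)` would succeed.
fn naive_split_point(name: &str) -> Option<(usize, bool)> {
    // `rfind` yields the byte offset where the matching character starts.
    let pos = name.rfind(|c: char| !c.is_numeric())?;
    Some((pos, name.is_char_boundary(pos + 1)))
}

fn main() {
    for name in ["foo1", "日本語1", "café42"] {
        println!("{}: {:?}", name, naive_split_point(name));
    }
}
```

For `foo1` the last non-numeric character is one byte wide, so `pos + 1` is a boundary. For `日本語1` the character `語` starts at byte 6 and occupies three bytes, so byte 7 falls inside it and `split_at(7)` would panic. The same happens with `café42`, where `é` occupies two bytes.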
Pull request overview
This PR fixes a UTF-8 boundary bug in NameGenerator::split_numeric_suffix so that identifiers ending with multi-byte characters (e.g., CJK or accented Latin) no longer risk panicking when splitting off numeric suffixes.
Changes:
- Add doctests covering multi-byte UTF-8 identifiers with and without numeric suffixes.
- Fix `split_numeric_suffix` to split at a valid UTF-8 character boundary using `str::ceil_char_boundary`.
- Update the `split_numeric_suffix` doc example to reflect its actual return value.
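A boundary-safe split can be sketched as follows. This is an illustration under stated assumptions rather than the merged code: the PR uses `str::ceil_char_boundary`, which may not be available on older toolchains, so the sketch advances past the character with `char::len_utf8` instead. The `(&str, Option<usize>)` return type, the `char::is_numeric` predicate, and the handling of all-numeric input are also assumptions.

```rust
/// Illustrative stand-in for `split_numeric_suffix` (signature assumed).
/// Splits `name` into a prefix and an optional trailing number, always
/// cutting on a UTF-8 character boundary.
fn split_numeric_suffix(name: &str) -> (&str, Option<usize>) {
    // Walk backwards to the last non-numeric character, keeping its byte offset.
    match name.char_indices().rev().find(|(_, c)| !c.is_numeric()) {
        Some((pos, c)) => {
            // Advance by the character's full encoded width, not by one byte.
            let (prefix, suffix) = name.split_at(pos + c.len_utf8());
            // An empty or unparsable suffix yields `None`.
            (prefix, suffix.parse().ok())
        }
        // Assumption: an all-numeric (or empty) name has an empty prefix.
        None => ("", name.parse().ok()),
    }
}

fn main() {
    for name in ["foo1", "日本語1", "café42", "日本語"] {
        println!("{}: {:?}", name, split_numeric_suffix(name));
    }
}
```

Adding `len_utf8()` to the start offset lands on the same position that rounding `pos + 1` up to the next character boundary would, because both point to the first byte after the matched character.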
A4-Tacks approved these changes on Mar 10, 2026
`split_numeric_suffix` used `rfind` to locate the last non-numeric character and then split at `pos + 1`. Since `rfind` returns a byte offset, this panics when the last non-numeric character is multi-byte (e.g. CJK identifiers like `日本語`).
Use `str::ceil_char_boundary` to advance past the full character before splitting.
Added doctests covering CJK and accented Latin identifiers with numeric suffixes.