
feat(awk): add Unicode \u escape sequences #635

Merged
chaliy merged 1 commit into main from claude/issue-617-awk-unicode-escapes-TeSJP
Mar 15, 2026


@chaliy chaliy commented Mar 15, 2026

Summary

  • Add \u escape sequence support in awk string literals (gawk 5.3+ feature)
  • Supports 1-8 hex digits for BMP and supplementary plane characters (e.g., \u0041 → A, \u1F600 → 😀)
  • Invalid code points or bare \u gracefully fall back to literal text
  • Update builtins spec to document --csv/-k and \u escape support
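The escape handling summarized above can be illustrated with a minimal sketch. This is not the code merged in this PR: the function name `expand_unicode_escapes` is hypothetical, and the sketch assumes greedy matching of up to 8 hex digits and that any other backslash escape is passed through untouched for a separate stage to handle.

```rust
/// Hypothetical helper (not from the PR): expands `\u` escapes in the body
/// of an awk string literal. Other backslash escapes are copied through as-is.
fn expand_unicode_escapes(input: &str) -> String {
    let chars: Vec<char> = input.chars().collect();
    let mut out = String::with_capacity(input.len());
    let mut i = 0;

    while i < chars.len() {
        if chars[i] == '\\' && i + 1 < chars.len() {
            if chars[i + 1] == 'u' {
                // Greedily collect 1 to 8 hex digits after `\u` (assumption).
                let start = i + 2;
                let mut end = start;
                while end < chars.len()
                    && end - start < 8
                    && chars[end].is_ascii_hexdigit()
                {
                    end += 1;
                }
                let hex: String = chars[start..end].iter().collect();
                // An empty digit run fails to parse; surrogates and values
                // above U+10FFFF are rejected by `char::from_u32`.
                let decoded = u32::from_str_radix(&hex, 16)
                    .ok()
                    .and_then(char::from_u32);
                match decoded {
                    Some(c) => {
                        out.push(c);
                        i = end;
                    }
                    None => {
                        // Fall back to literal text: keep `\u` and let the
                        // following characters be copied unchanged.
                        out.push_str("\\u");
                        i = start;
                    }
                }
            } else {
                // Some other escape (e.g. `\\` or `\n`): copy both characters
                // so an escaped backslash is never mistaken for `\u`.
                out.push(chars[i]);
                out.push(chars[i + 1]);
                i += 2;
            }
        } else {
            out.push(chars[i]);
            i += 1;
        }
    }
    out
}
```

Under these assumptions, `expand_unicode_escapes("\\u0041")` yields `A`, while a bare `\u` or an invalid code point such as `\uD800` is left as literal text. Because matching in this sketch is greedy, a hex digit that directly follows an escape is absorbed into it.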

Test plan

  • 5 new unit tests for Unicode escapes (basic, multibyte, emoji, bare \u, mixed)
  • 2 new spec tests for Unicode escapes
  • All existing awk tests still pass
  • Full test suite passes
  • cargo clippy and cargo fmt clean

Closes #617

Support \u followed by 1-8 hex digits in awk string literals,
matching gawk 5.3+ behavior. Handles BMP and supplementary plane
characters. Invalid code points fall back to literal text.

Closes #617
@chaliy chaliy merged commit ab79808 into main Mar 15, 2026
23 checks passed
@chaliy chaliy deleted the claude/issue-617-awk-unicode-escapes-TeSJP branch March 15, 2026 01:23

Development

Successfully merging this pull request may close these issues.

feat(awk): Consider gawk 5.3+ features (CSV, Unicode escapes)
